Skip to content
Back to Blog
1 min read

Databricks SQL Analytics: BI on the Lakehouse

SQL Analytics is Databricks’ bid for the BI persona who never wanted to learn PySpark. A T-SQL endpoint over Delta Lake, a query editor that looks like every other web SQL UI, and dashboards that don’t require Power BI to render. For lakehouse architectures it closes the last awkward gap—data engineers in notebooks, analysts in SQL, both writing against the same Delta tables. Not a replacement for Power BI, but a credible answer to “I just need to write a query.”

SQL Endpoints

-- Create SQL endpoint (via UI or API)
-- SQL endpoints provide serverless compute for queries

-- Query Delta tables
SELECT
    product_category,
    SUM(revenue) as total_revenue,
    COUNT(DISTINCT customer_id) as unique_customers
FROM delta.`/mnt/data/sales`
WHERE order_date >= '2021-01-01'
GROUP BY product_category
ORDER BY total_revenue DESC

SQL Endpoint Configuration

{
    "name": "analytics-endpoint",
    "cluster_size": "Medium",
    "min_num_clusters": 1,
    "max_num_clusters": 10,
    "auto_stop_mins": 30,
    "enable_photon": true,
    "spot_instance_policy": "COST_OPTIMIZED",
    "channel": "CHANNEL_NAME_CURRENT"
}

Query History and Optimization

-- View query history
SELECT
    query_id,
    query_text,
    execution_time_ms,
    rows_produced,
    bytes_read
FROM system.query_history
WHERE start_time >= current_date() - INTERVAL 7 DAYS
ORDER BY execution_time_ms DESC
LIMIT 20

-- Analyze query plan
EXPLAIN FORMATTED
SELECT * FROM sales WHERE product_id = 'ABC123'

Photon Engine

Photon accelerates SQL queries on Delta Lake:

-- Photon automatically optimizes:
-- - Joins
-- - Aggregations
-- - Filters on Parquet/Delta

-- Check Photon usage
SELECT
    query_id,
    photon_enabled,
    photon_cpu_time_ms,
    total_time_ms
FROM system.query_history
WHERE start_time >= current_date()

Built-in Visualizations

-- Query results can be visualized directly
-- Supported chart types:
-- - Bar, Line, Area charts
-- - Pie, Scatter plots
-- - Maps
-- - Tables with formatting
-- - Counters and gauges

SELECT
    DATE_TRUNC('month', order_date) as month,
    SUM(revenue) as monthly_revenue
FROM sales
WHERE order_date >= '2020-01-01'
GROUP BY 1
ORDER BY 1
-- Visualize as Line Chart

Dashboards

# Create dashboard via API
import requests

dashboard = {
    "name": "Sales Dashboard",
    "tags": ["sales", "monthly"],
    "widgets": [
        {
            "visualization_id": "viz_123",
            "options": {
                "position": {"col": 0, "row": 0, "size_x": 6, "size_y": 4}
            }
        }
    ]
}

response = requests.post(
    f"{workspace_url}/api/2.0/preview/sql/dashboards",
    headers={"Authorization": f"Bearer {token}"},
    json=dashboard
)

Alerts

-- Create alert on query results
-- Alert when daily revenue drops below threshold

SELECT SUM(revenue) as daily_revenue
FROM sales
WHERE order_date = current_date()

-- Alert configuration:
-- Trigger when: daily_revenue < 10000
-- Frequency: Every 1 hour
-- Notify: email, Slack, PagerDuty

Connect BI Tools

Power BI

// Connection string
Server: adb-xxxxx.azuredatabricks.net
HTTP Path: /sql/1.0/endpoints/xxxxx
Authentication: Azure AD

Tableau

Server: adb-xxxxx.azuredatabricks.net
Port: 443
HTTP Path: /sql/1.0/endpoints/xxxxx
Authentication: Personal Access Token

Query Parameters

-- Parameterized queries
SELECT *
FROM sales
WHERE product_category = {{ category }}
  AND order_date BETWEEN {{ start_date }} AND {{ end_date }}
  AND revenue >= {{ min_revenue }}

-- Parameters appear as input fields in the UI

Access Control

-- Grant query access
GRANT SELECT ON TABLE sales TO `analysts@company.com`;
GRANT USAGE ON SCHEMA analytics TO `analysts@company.com`;

-- Grant dashboard access
GRANT VIEW ON DASHBOARD `Sales Dashboard` TO `managers@company.com`;

Cost Optimization

TipBenefit
Use Photon2-8x faster queries
Auto-stopNo idle costs
Spot instancesLower compute costs
Query cachingAvoid redundant scans

Databricks SQL: warehouse speed on lakehouse data.\n\n## Takeaways\n\nAdd a concise, personal takeaway and recommended next steps here.\n

Michael John Peña

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.