Back to Blog
4 min read

Power BI Hybrid Tables: Real-Time Meets Historical

Hybrid tables in Power BI combine the best of import and DirectQuery modes within a single table. Historical data is imported for performance while recent data uses DirectQuery for real-time updates.

Understanding Hybrid Tables

A hybrid table has:

  • Import partitions: Historical data with fast performance
  • DirectQuery partition: Current data with real-time updates

Setting Up Hybrid Tables

Configure incremental refresh with real-time data:

{
  "table": "Sales",
  "incrementalRefresh": {
    "pollingExpression": "DateTime.LocalNow()",
    "incrementalGranularity": "day",
    "refreshRanges": {
      "refresh": {
        "count": 30,
        "unit": "day"
      },
      "archive": {
        "count": 3,
        "unit": "year"
      }
    },
    "detectDataChanges": true,
    "onlyRefreshComplete": true,
    "realTimeData": {
      "enabled": true,
      "pollingInterval": "PT1M"
    }
  }
}

Power Query Setup

// Define RangeStart and RangeEnd parameters
let
    Source = Sql.Database("server.database.windows.net", "SalesDB"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],

    // Filter for incremental refresh
    FilteredRows = Table.SelectRows(Sales, each
        [OrderDate] >= RangeStart and
        [OrderDate] < RangeEnd
    )
in
    FilteredRows

XMLA Endpoint Configuration

For advanced partition management:

<Alter xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Sales Analytics</DatabaseID>
    <TableID>Sales</TableID>
  </Object>
  <ObjectDefinition>
    <Table>
      <Name>Sales</Name>
      <Partitions>
        <Partition>
          <Name>Sales_2022_01</Name>
          <Mode>Import</Mode>
          <Source xsi:type="M">
            <Expression>
              let
                  Source = Sales,
                  Filtered = Table.SelectRows(Source, each
                      [OrderDate] >= #date(2022,1,1) and
                      [OrderDate] < #date(2022,2,1))
              in
                  Filtered
            </Expression>
          </Source>
        </Partition>
        <Partition>
          <Name>Sales_DirectQuery</Name>
          <Mode>DirectQuery</Mode>
          <Source xsi:type="M">
            <Expression>
              let
                  Source = Sales,
                  Filtered = Table.SelectRows(Source, each
                      [OrderDate] >= DateTime.LocalNow() - #duration(1,0,0,0))
              in
                  Filtered
            </Expression>
          </Source>
        </Partition>
      </Partitions>
    </Table>
  </ObjectDefinition>
</Alter>

Tabular Editor Script

Manage hybrid tables programmatically:

// Tabular Editor C# script for hybrid table setup
var table = Model.Tables["Sales"];

// Create import partition for historical data
var historicalPartition = table.AddPartition("Sales_Historical");
historicalPartition.Mode = ModeType.Import;
historicalPartition.Expression = @"
let
    Source = Sql.Database(""server"", ""db""),
    Sales = Source{[Schema=""dbo"", Item=""Sales""]}[Data],
    Filtered = Table.SelectRows(Sales, each [OrderDate] < DateTime.LocalNow() - #duration(1,0,0,0))
in
    Filtered
";

// Create DirectQuery partition for real-time data
var realtimePartition = table.AddPartition("Sales_RealTime");
realtimePartition.Mode = ModeType.DirectQuery;
realtimePartition.Expression = @"
let
    Source = Sql.Database(""server"", ""db""),
    Sales = Source{[Schema=""dbo"", Item=""Sales""]}[Data],
    Filtered = Table.SelectRows(Sales, each [OrderDate] >= DateTime.LocalNow() - #duration(1,0,0,0))
in
    Filtered
";

DAX Considerations

DAX queries automatically span both partitions:

// This measure works across both import and DirectQuery partitions
Total Sales =
SUM(Sales[Amount])

// Time intelligence also works seamlessly
Sales YTD =
TOTALYTD([Total Sales], Dates[Date])

// Real-time comparison
Sales vs Yesterday =
VAR TodaySales = CALCULATE([Total Sales], Dates[Date] = TODAY())
VAR YesterdaySales = CALCULATE([Total Sales], Dates[Date] = TODAY() - 1)
RETURN
TodaySales - YesterdaySales

Performance Optimization

// Aggregation table for historical data
SalesAgg =
SUMMARIZE(
    FILTER(Sales, Sales[OrderDate] < TODAY() - 30),
    Dates[YearMonth],
    Products[Category],
    "TotalSales", SUM(Sales[Amount]),
    "TotalQuantity", SUM(Sales[Quantity])
)

Monitoring Hybrid Table Performance

import requests
from datetime import datetime

class HybridTableMonitor:
    def __init__(self, workspace_id: str, dataset_id: str, token: str):
        self.base_url = f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}"
        self.headers = {"Authorization": f"Bearer {token}"}

    def get_refresh_history(self):
        """Get refresh history to monitor partition status."""
        url = f"{self.base_url}/refreshes"
        response = requests.get(url, headers=self.headers)
        return response.json().get("value", [])

    def get_partition_info(self):
        """Get partition details via XMLA (requires Premium)."""
        # Use TMSL query to get partition information
        tmsl = {
            "query": """
                SELECT
                    [TABLE_NAME],
                    [PARTITION_NAME],
                    [MODIFIED_TIME],
                    [ROWS_COUNT]
                FROM $SYSTEM.TMSCHEMA_PARTITIONS
            """
        }
        # Execute via XMLA endpoint
        return tmsl

    def analyze_refresh(self, refresh_id: str):
        """Analyze specific refresh operation."""
        url = f"{self.base_url}/refreshes/{refresh_id}"
        response = requests.get(url, headers=self.headers)
        refresh = response.json()

        return {
            "status": refresh["status"],
            "startTime": refresh["startTime"],
            "endTime": refresh.get("endTime"),
            "duration": self._calculate_duration(refresh),
            "refreshType": refresh.get("refreshType")
        }

    def _calculate_duration(self, refresh):
        if refresh.get("endTime"):
            start = datetime.fromisoformat(refresh["startTime"].replace("Z", "+00:00"))
            end = datetime.fromisoformat(refresh["endTime"].replace("Z", "+00:00"))
            return (end - start).total_seconds()
        return None

Best Practices

  1. Choose partition boundary wisely - Recent 1-7 days as DirectQuery
  2. Use incremental refresh - Automate partition management
  3. Monitor both modes - Track import refresh and DirectQuery performance
  4. Optimize source - Index columns used in partition filters
  5. Test thoroughly - Ensure DAX works across partition boundaries

Hybrid tables provide the perfect balance between real-time data freshness and historical data performance.

Michael John Peña

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.