OneLake: The Foundation of Microsoft Fabric

I wrote “OneLake: The Foundation of Microsoft Fabric” to share practical, production-minded guidance on this topic.

What is OneLake?

OneLake is a single, unified, logical data lake for your entire organization. Think of it as “OneDrive for data” - automatically provisioned when you enable Fabric, with no storage accounts to create or manage.

# Traditional Azure Storage Model
# Multiple storage accounts, multiple configurations
storage_accounts = [
    "adlsrawdata",      # Raw data landing
    "adlscurated",      # Curated/transformed data
    "adlsserving",      # Serving layer
    "adlsml",           # ML artifacts
]

# OneLake Model
# One logical lake, organized by workspaces
onelake = {
    "organization": "Contoso",
    "endpoint": "onelake.dfs.fabric.microsoft.com",  # one shared endpoint, no per-account hostnames
    "workspaces": [
        "Sales Analytics",
        "Marketing Data",
        "Finance Reporting",
        "Data Science Lab"
    ]
}

OneLake Architecture

┌─────────────────────────────────────────────────────────────┐
│                        OneLake                               │
│                  (Organization Level)                        │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐         │
│  │  Workspace  │  │  Workspace  │  │  Workspace  │         │
│  │   Sales     │  │  Marketing  │  │   Finance   │         │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤         │
│  │ Lakehouse A │  │ Lakehouse C │  │ Warehouse E │         │
│  │ Lakehouse B │  │ Lakehouse D │  │ Lakehouse F │         │
│  └─────────────┘  └─────────────┘  └─────────────┘         │
├─────────────────────────────────────────────────────────────┤
│                    Delta Lake Format                         │
│                   (Parquet + Transaction Log)                │
└─────────────────────────────────────────────────────────────┘

Key OneLake Features

1. Automatic Provisioning

# No infrastructure code needed
# OneLake is automatically available when Fabric is enabled

# Access pattern for Spark
lakehouse_path = "abfss://workspace@onelake.dfs.fabric.microsoft.com/lakehouse.Lakehouse/Tables/sales"

# Read data - no storage account keys, no SAS tokens
df = spark.read.format("delta").load(lakehouse_path)
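
To avoid hand-typing those long URIs, a small helper can assemble them. This is a minimal sketch that follows the same pattern as the path above; the function name `onelake_path` and its parameters are my own invention, and I have not handled workspace or item names that contain spaces or special characters.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_path(workspace: str, item: str, relative_path: str,
                 item_type: str = "Lakehouse") -> str:
    """Build an abfss:// URI for a path inside a Fabric item.

    Hypothetical helper: it only formats a string and does not check
    that the workspace, item, or path actually exist.
    """
    if not workspace or not item:
        raise ValueError("workspace and item are required")
    # Normalise leading/trailing slashes so callers can pass either form
    rel = relative_path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{rel}"


sales_path = onelake_path("workspace", "lakehouse", "Tables/sales")
```

In a Fabric notebook the result can be passed straight to `spark.read.format("delta").load(sales_path)`.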

2. Delta Lake by Default

Tables in OneLake are stored in Delta Lake format (the Files area can still hold any format):

# Write data to OneLake as a managed Delta table
# (saveAsTable takes a table name, not a "Tables/..." path)
df.write \
    .format("delta") \
    .mode("overwrite") \
    .saveAsTable("customers")

# Benefits of Delta Lake:
# - ACID transactions
# - Schema enforcement
# - Time travel
# - Efficient updates (MERGE)
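
To make the MERGE benefit concrete, here is a rough sketch that generates an upsert statement in the Spark SQL `MERGE INTO` syntax that Delta supports. The helper name `build_merge_sql`, the `customer_updates` source table, and the column names are all hypothetical; the function only builds a string, so the statement still has to be run with `spark.sql(...)` in a notebook.

```python
def _check_identifier(name: str) -> str:
    """Reject anything that is not a plain (optionally dotted) identifier."""
    if not all(part.isidentifier() for part in name.split(".")):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_merge_sql(target: str, source: str, key: str, columns: list) -> str:
    """Return an upsert statement: update matching rows, insert new ones."""
    if not columns:
        raise ValueError("at least one non-key column is required")
    _check_identifier(target)
    _check_identifier(source)
    for col in [key, *columns]:
        _check_identifier(col)

    updates = ", ".join(f"t.{c} = s.{c}" for c in columns)
    all_cols = [key, *columns]
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


merge_sql = build_merge_sql(
    "customers", "customer_updates", "customer_id", ["name", "email"]
)
# In a Fabric notebook: spark.sql(merge_sql)
```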

3. Unified Security

# Security is managed through Fabric workspace roles
# No need to configure:
# - Storage account RBAC
# - ACLs on folders
# - SAS token policies

# Workspace roles map to data access:
workspace_roles = {
    "Admin": "Full control of workspace and all items",
    "Member": "Edit all items, share items",
    "Contributor": "Edit all items",
    "Viewer": "View all items, cannot edit"
}
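
One way to reason about these roles is as sets of allowed actions. The sketch below is a deliberately simplified model built only from the four descriptions above; the action names (`view`, `edit`, `share`, `manage`) are my own labels, not Fabric's actual permission matrix.

```python
# Simplified model of the role descriptions above (illustrative only)
ROLE_CAPABILITIES = {
    "Admin": {"view", "edit", "share", "manage"},
    "Member": {"view", "edit", "share"},
    "Contributor": {"view", "edit"},
    "Viewer": {"view"},
}


def can(role: str, action: str) -> bool:
    """Return True if the given workspace role allows the action."""
    try:
        return action in ROLE_CAPABILITIES[role]
    except KeyError:
        raise ValueError(f"unknown workspace role: {role!r}") from None


viewer_can_edit = can("Viewer", "edit")
member_can_share = can("Member", "share")
```

A check like this is handy in deployment scripts that should refuse to run when the caller only has Viewer access.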

4. Shortcuts

Shortcuts let you reference external data in place, without copying it into OneLake:

# Create a shortcut to existing ADLS Gen2 data
# This appears as a folder in your Lakehouse but data stays in place

shortcut_definition = {
    "name": "external_sales",
    "target": {
        "adlsGen2": {
            "location": "https://existingstorageaccount.dfs.core.windows.net/",
            "path": "/raw/sales/"
        }
    }
}

# After creating the shortcut, access it like native OneLake data
# (this assumes the target folder holds Delta data; use the matching reader for CSV or Parquet)
df = spark.read.format("delta").load("Files/external_sales/")

Shortcuts support:

  • Azure Data Lake Storage Gen2
  • Amazon S3
  • Google Cloud Storage (coming)
  • Dataverse
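
Shortcuts can also be created programmatically. The sketch below only builds the URL and JSON body for a create-shortcut call and sends nothing. The endpoint shape and the field names `subpath` and `connectionId` reflect my understanding of the Fabric REST API and are assumptions relative to the simplified definition above, so check them against the current API reference. The IDs in the example are placeholders.

```python
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_shortcut_request(workspace_id: str, item_id: str, name: str,
                           location: str, subpath: str, connection_id: str,
                           parent_path: str = "Files"):
    """Return (url, body) for creating an ADLS Gen2 shortcut.

    Assumed request shape; nothing is sent over the network here.
    """
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}/shortcuts"
    body = {
        "path": parent_path,  # where the shortcut appears inside the Lakehouse
        "name": name,
        "target": {
            "adlsGen2": {
                "location": location.rstrip("/"),
                "subpath": subpath,
                "connectionId": connection_id,
            }
        },
    }
    return url, body


url, body = build_shortcut_request(
    workspace_id="<workspace-guid>",
    item_id="<lakehouse-guid>",
    name="external_sales",
    location="https://existingstorageaccount.dfs.core.windows.net/",
    subpath="/raw/sales",
    connection_id="<connection-guid>",
)
```

The pair can then be sent with any HTTP client using a Microsoft Entra bearer token.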

OneLake File Structure

# Lakehouse structure in OneLake
workspace/
└── lakehouse.Lakehouse/
    ├── Tables/           # Managed Delta tables
    │   ├── customers/
    │   │   ├── _delta_log/
    │   │   └── *.parquet
    │   └── orders/
    │       ├── _delta_log/
    │       └── *.parquet
    └── Files/            # Unmanaged files (any format)
        ├── raw/
        │   └── data.csv
        └── staging/
            └── temp.json
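
The Tables/Files split is easy to encode when you need to reason about paths in tooling. Here is a minimal sketch, assuming paths are given relative to the Lakehouse root as in the tree above; the function name `classify` and the returned dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath


def classify(relative_path: str) -> dict:
    """Describe where a Lakehouse-relative path sits in the layout above."""
    parts = PurePosixPath(relative_path.strip("/")).parts
    if not parts or parts[0] not in ("Tables", "Files"):
        raise ValueError(f"expected a path under Tables/ or Files/: {relative_path!r}")

    if parts[0] == "Tables":
        # The first folder under Tables/ is the Delta table name
        table = parts[1] if len(parts) > 1 else None
        return {
            "area": "Tables",
            "table": table,
            "is_delta_log": "_delta_log" in parts[2:],
        }
    # Anything under Files/ is unmanaged and can be any format
    return {"area": "Files", "table": None, "is_delta_log": False}


info = classify("Tables/customers/_delta_log/00000000000000000000.json")
```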

Accessing OneLake

From Spark Notebooks

# Table name resolved against the notebook's default Lakehouse
df = spark.read.table("sales")

# Absolute OneLake paths
df = spark.read.format("delta").load(
    "abfss://workspace@onelake.dfs.fabric.microsoft.com/lakehouse.Lakehouse/Tables/sales"
)

From T-SQL (SQL Endpoint)

-- Lakehouse tables appear automatically in the SQL endpoint
SELECT * FROM lakehouse.dbo.sales;

-- Query across multiple Lakehouses
SELECT * FROM lakehouse1.dbo.customers c
JOIN lakehouse2.dbo.orders o ON c.customer_id = o.customer_id;

From Power BI

// Direct Lake mode - no import, no DirectQuery limitations
// Power BI reads directly from Delta tables in OneLake

From External Tools

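# Use Azure Storage SDK with OneLake endpoint
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Authenticate with Microsoft Entra ID; no account keys or SAS tokens
service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential()
)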

# Access workspace as container, lakehouse as directory
file_system = service.get_file_system_client("workspace-name")
directory = file_system.get_directory_client("lakehouse.Lakehouse/Files")
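
Once you have a file system client, listing files is a short loop over `get_paths`. The sketch below is written against any object that exposes `get_paths(path=..., recursive=...)` and yields entries with `name` and `is_directory`, which is how I understand the SDK's `FileSystemClient` to behave. The helper name `list_files` and the suffix filter are my own additions.

```python
def list_files(file_system, directory: str, suffix: str = "") -> list:
    """Return sorted file paths under `directory`, optionally filtered by suffix.

    `file_system` is expected to behave like the SDK's FileSystemClient;
    it is duck-typed so the function can be exercised with a stub as well.
    """
    names = []
    for entry in file_system.get_paths(path=directory, recursive=True):
        if entry.is_directory:
            continue  # skip folders, keep only files
        if entry.name.endswith(suffix):
            names.append(entry.name)
    return sorted(names)


# Usage with the client created above:
# csv_files = list_files(file_system, "lakehouse.Lakehouse/Files", suffix=".csv")
```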

Migration Considerations

If you are moving from ADLS Gen2 to OneLake:

# Option 1: Use shortcuts (no data movement)
# Best for: Large datasets, gradual migration

# Option 2: Copy data using pipelines
# Best for: Clean break, new governance

# Option 3: Hybrid approach
# Use shortcuts for historical data
# Write new data directly to OneLake
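
For the hybrid approach, the routing rule can be as simple as a cutover date. This is a minimal sketch under my own assumptions: the cutover date, the `Files/external_sales` shortcut location, and the `Tables/sales` table are all hypothetical examples.

```python
from datetime import date


def path_for(partition_date: date, cutover: date,
             shortcut_root: str = "Files/external_sales",
             native_table: str = "Tables/sales") -> str:
    """Pick the read location for a partition in a hybrid migration.

    Data before the cutover stays in ADLS Gen2 behind a shortcut;
    data on or after the cutover is written natively to OneLake.
    """
    return shortcut_root if partition_date < cutover else native_table


cutover = date(2023, 6, 1)  # hypothetical migration date
historical = path_for(date(2022, 12, 31), cutover)
current = path_for(date(2023, 6, 1), cutover)
```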

Best Practices

  1. Organize by Workspace: Each business domain gets its own workspace
  2. Use Tables for Structured Data: Leverage Delta table management
  3. Use Files for Landing Zones: Raw files before transformation
  4. Leverage Shortcuts: Avoid copying data when possible
  5. Plan for Cross-Workspace Access: Use workspace roles carefully

OneLake is the foundation that makes Fabric’s unified experience possible. Tomorrow, I will explore the Lakehouse - the primary artifact you will build on top of OneLake.

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.