Fabric Workspace Setup: Organization and Best Practices

A well-organized workspace structure is crucial for success with Microsoft Fabric. Today I’ll share workspace organization patterns and best practices learned from setting up Fabric environments.

Workspace Fundamentals

In Fabric, a workspace is:

  • A container for all your Fabric items
  • A security boundary
  • A capacity assignment unit
  • A collaboration space

# Workspace components
workspace_contains = {
    "data_engineering": ["Lakehouse", "Notebook", "Spark Job Definition"],
    "data_factory": ["Data Pipeline", "Dataflow Gen2", "Copy Job"],
    "data_warehouse": ["Warehouse"],
    "data_science": ["ML Model", "ML Experiment", "Notebook"],
    "real_time_analytics": ["KQL Database", "KQL Queryset", "Eventstream"],
    "power_bi": ["Report", "Dataset", "Dashboard"]
}

Workspace Naming Conventions

Establish a consistent naming pattern:

# Recommended naming pattern
# {organization}-{domain}-{environment}-{purpose}

workspace_examples = [
    "contoso-sales-dev-engineering",
    "contoso-sales-prod-analytics",
    "contoso-finance-dev-reporting",
    "contoso-shared-prod-governance"
]

# Avoid these patterns:
bad_examples = [
    "test",           # Too vague
    "john's workspace",  # Personal names
    "new workspace",  # Non-descriptive
    "copy of sales"   # Indicates poor organization
]
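
A naming convention is easier to keep when it can be checked automatically. The following is a minimal sketch of such a check, assuming four lowercase segments and an environment limited to `dev`, `test`, or `prod`; the allowed environments and the function name `parse_workspace_name` are my own illustrative choices, not something Fabric enforces.

```python
import re

# Assumed rule: four lowercase, hyphen-separated segments, with the
# environment restricted to a small allowed set (an illustrative choice).
ALLOWED_ENVIRONMENTS = ("dev", "test", "prod")
NAME_PATTERN = re.compile(
    r"^(?P<organization>[a-z0-9]+)-"
    r"(?P<domain>[a-z0-9]+)-"
    r"(?P<environment>" + "|".join(ALLOWED_ENVIRONMENTS) + r")-"
    r"(?P<purpose>[a-z0-9]+)$"
)


def parse_workspace_name(name):
    """Return the name's segments as a dict, or None if it does not match."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


for candidate in ["contoso-sales-dev-engineering", "john's workspace", "test"]:
    print(candidate, "->", parse_workspace_name(candidate))
```

Returning the parsed segments rather than a plain boolean means the same helper can later drive reporting by domain or environment.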

Workspace Organization Patterns

Pattern 1: Environment-Based

Separate workspaces by environment:

contoso-sales-dev/
    sales_lakehouse
    transform_notebook
    sales_pipeline

contoso-sales-test/
    sales_lakehouse
    transform_notebook
    sales_pipeline

contoso-sales-prod/
    sales_lakehouse
    transform_notebook
    sales_pipeline

# Benefits:
# - Clear separation of environments
# - Easy to implement CI/CD
# - Simple permission model

# Drawbacks:
# - Many workspaces to manage
# - Potential for drift between environments
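
Drift between environments can be caught early by comparing item inventories. Here is a rough sketch, assuming you already have a list of item names per workspace (how you collect that list is up to you); the `scratch_notebook` item and the `find_drift` helper are hypothetical.

```python
def find_drift(inventories):
    """Map each environment to the item names it lacks relative to the union."""
    all_items = set().union(*inventories.values())
    return {
        env: sorted(all_items - set(items))
        for env, items in inventories.items()
        if all_items - set(items)
    }


# Hypothetical inventories; in practice these would come from an export.
inventories = {
    "contoso-sales-dev": [
        "sales_lakehouse", "transform_notebook", "sales_pipeline", "scratch_notebook",
    ],
    "contoso-sales-test": ["sales_lakehouse", "transform_notebook", "sales_pipeline"],
    "contoso-sales-prod": ["sales_lakehouse", "transform_notebook"],
}
print(find_drift(inventories))
```

Environments that already contain every item are left out of the result, so an empty dict means the workspaces are aligned by name.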

Pattern 2: Layer-Based

Separate by data architecture layer:

contoso-raw-layer/
    raw_sales_lakehouse
    raw_inventory_lakehouse
    ingestion_pipelines

contoso-curated-layer/
    curated_lakehouse
    transformation_notebooks

contoso-serving-layer/
    warehouse
    power_bi_reports

# Benefits:
# - Aligns with medallion architecture
# - Clear data lineage
# - Easy to manage access by layer

# Drawbacks:
# - Cross-workspace operations more complex
# - May need shortcuts for data access
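
When a notebook in one layer has to read from another layer's workspace without a shortcut, it needs the full OneLake path. The sketch below builds that path using the OneLake ABFS URI format as I understand it; the `orders` table and the helper name are hypothetical, and workspace or item names containing spaces or special characters may need their GUIDs instead.

```python
def onelake_table_path(workspace, lakehouse, table):
    """Build an ABFS URI for a Lakehouse table in OneLake."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


raw_sales = onelake_table_path("contoso-raw-layer", "raw_sales_lakehouse", "orders")
print(raw_sales)
# In a Fabric notebook the path could then be read with Spark, for example:
# spark.read.format("delta").load(raw_sales)
```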

Pattern 3: Domain-Based

Separate by business domain:

sales-domain/
    sales_lakehouse
    sales_warehouse
    sales_reports

marketing-domain/
    marketing_lakehouse
    campaign_analytics

finance-domain/
    finance_warehouse
    financial_reports

# Benefits:
# - Aligns with data mesh principles
# - Domain teams have autonomy
# - Clear ownership

# Drawbacks:
# - Need governance for cross-domain data
# - Potential data duplication

Setting Up a Workspace

# Workspace configuration checklist
workspace_setup = {
    "basic_settings": {
        "name": "Follow naming convention",
        "description": "Include purpose and owner",
        "contact_list": "Team email or distribution list",
        "license_mode": "Fabric capacity or Premium"
    },
    "advanced_settings": {
        "onelake_files_default": True,
        "data_model_settings": "Large dataset storage format",
        "default_semantic_model": "Configure as needed"
    },
    "access_settings": {
        "member_permissions": "Define roles carefully",
        "workspace_admins": "Limit to necessary people",
        "app_permissions": "Configure for sharing"
    }
}
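
The basic settings can also be applied programmatically. This sketch prepares a request for the Fabric REST API's workspace creation endpoint; the endpoint and field names (`displayName`, `description`, `capacityId`) reflect my understanding of that API and should be checked against the current reference. The capacity ID, token, and email address are placeholders, and the request is built but deliberately not sent.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_workspace_request(name, description, capacity_id, token):
    """Prepare (but do not send) a POST request that creates a workspace."""
    payload = {
        "displayName": name,
        "description": description,
        "capacityId": capacity_id,
    }
    return urllib.request.Request(
        url=f"{FABRIC_API}/workspaces",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_workspace_request(
    name="contoso-sales-dev-engineering",
    description="Sales engineering (dev). Owner: sales-data@contoso.example",
    capacity_id="00000000-0000-0000-0000-000000000000",  # placeholder
    token="<access-token>",  # placeholder; acquire via Microsoft Entra ID
)
print(request.get_method(), request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request)
```

This covers only the name, description, and capacity; the remaining checklist items still need to be set in the workspace settings.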

Role-Based Access Control

Fabric workspaces have four roles:

workspace_roles = {
    "Admin": {
        "description": "Full control over workspace",
        "can_do": [
            "Manage workspace settings",
            "Add/remove members",
            "Delete workspace",
            "All Member and Contributor permissions"
        ],
        "cannot_do": [],
        "assign_to": "Platform team, workspace owners"
    },
    "Member": {
        "description": "Create and manage content",
        "can_do": [
            "Create items",
            "Edit items",
            "Share items"
        ],
        "cannot_do": ["Manage workspace settings"],
        "assign_to": "Data engineers, analysts"
    },
    "Contributor": {
        "description": "Create content, no sharing",
        "can_do": [
            "Create items",
            "Edit items"
        ],
        "cannot_do": ["Share items"],
        "assign_to": "Developers in training"
    },
    "Viewer": {
        "description": "View content only",
        "can_do": [
            "View items",
            "Run reports"
        ],
        "cannot_do": ["Edit items", "Create items"],
        "assign_to": "Business users, stakeholders"
    }
}

# Print role matrix
for role, details in workspace_roles.items():
    print(f"\n{role}:")
    print(f"  {details['description']}")
    print(f"  Assign to: {details['assign_to']}")
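
To keep assignments least-privileged, it helps to derive the role from what a person actually needs to do. The sketch below is a simplification of the role matrix above; the capability names and the `least_privileged_role` helper are my own shorthand, not Fabric identifiers.

```python
# Roles ordered from least to most privileged, following the matrix above.
ROLE_ORDER = ["Viewer", "Contributor", "Member", "Admin"]

# Simplified capability -> lowest role that has it (an illustrative mapping).
MINIMUM_ROLE = {
    "view_items": "Viewer",
    "create_items": "Contributor",
    "edit_items": "Contributor",
    "share_items": "Member",
    "manage_settings": "Admin",
}


def least_privileged_role(needed):
    """Return the lowest role that covers every capability in `needed`."""
    ranks = [ROLE_ORDER.index(MINIMUM_ROLE[capability]) for capability in needed]
    return ROLE_ORDER[max(ranks, default=0)]


print(least_privileged_role(["view_items"]))
print(least_privileged_role(["create_items", "share_items"]))
```

An unknown capability raises a `KeyError`, which is intentional here: a request for something outside the matrix should be reviewed by a person rather than silently mapped to a role.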

Cross-Workspace Data Access

Use shortcuts for cross-workspace data access:

# Creating a shortcut to another workspace's Lakehouse
# In your Lakehouse:
# 1. Right-click Tables or Files
# 2. Select "New shortcut"
# 3. Choose "Microsoft OneLake"
# 4. Select source workspace and Lakehouse
# 5. Choose Tables or Files to shortcut

# This enables:
# - No data duplication
# - Central governance
# - Cross-workspace queries
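
The portal steps above could also be scripted. As a sketch, this builds the request body for a OneLake shortcut as I understand the shortcuts REST API (a POST to `/v1/workspaces/{workspaceId}/items/{itemId}/shortcuts`); treat the field names as assumptions to verify against the official reference. The GUID placeholders and the helper name are hypothetical.

```python
import json


def build_onelake_shortcut_payload(name, source_workspace_id, source_item_id, source_path):
    """Build a request body for a shortcut that targets another OneLake item."""
    return {
        "path": "Tables",  # where the shortcut appears in the local Lakehouse
        "name": name,
        "target": {
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_item_id,
                "path": source_path,
            }
        },
    }


payload = build_onelake_shortcut_payload(
    name="product_catalog",
    source_workspace_id="<source-workspace-guid>",  # placeholder
    source_item_id="<source-lakehouse-guid>",  # placeholder
    source_path="Tables/product_catalog",
)
print(json.dumps(payload, indent=2))
```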

# Example: Query data across workspaces
"""
-- In SQL endpoint, shortcuts appear as regular tables
SELECT
    s.order_id,
    s.product_id,
    p.product_name,
    s.quantity
FROM sales_data s  -- Local table
JOIN product_catalog p  -- Shortcut from another workspace
    ON s.product_id = p.product_id
"""

Workspace Governance

Implement governance controls:

# Governance checklist
governance_controls = {
    "naming_standards": {
        "enforce": True,
        "pattern": "{prefix}-{domain}-{env}-{purpose}",
        "validation": "Manual review or automated policy"
    },
    "capacity_assignment": {
        "enforce": True,
        "policy": "Only approved capacities",
        "exceptions": "Document in governance wiki"
    },
    "data_classification": {
        "enforce": True,
        "levels": ["Public", "Internal", "Confidential", "Restricted"],
        "labeling": "Required for all datasets"
    },
    "access_reviews": {
        "frequency": "Quarterly",
        "reviewer": "Workspace admin",
        "documentation": "Required"
    }
}
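
Part of this checklist can be turned into an automated audit. Here is a minimal sketch that checks capacity assignment and data classification for one workspace record; the record layout, the capacity names, and the `audit_workspace` helper are all hypothetical and would need to match whatever inventory export you use.

```python
APPROVED_CAPACITIES = {"capacity-prod-01", "capacity-dev-01"}  # hypothetical names
CLASSIFICATION_LEVELS = {"Public", "Internal", "Confidential", "Restricted"}


def audit_workspace(workspace):
    """Return a list of governance findings for one workspace record."""
    findings = []
    if workspace.get("capacity") not in APPROVED_CAPACITIES:
        findings.append("capacity is not on the approved list")
    for dataset, label in workspace.get("dataset_labels", {}).items():
        if label not in CLASSIFICATION_LEVELS:
            findings.append(f"dataset '{dataset}' has no valid classification")
    return findings


workspace = {
    "name": "contoso-sales-prod-analytics",
    "capacity": "capacity-trial-99",
    "dataset_labels": {"sales_model": "Confidential", "adhoc_extract": None},
}
for finding in audit_workspace(workspace):
    print(f"{workspace['name']}: {finding}")
```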

Workspace Monitoring

Monitor workspace health and usage:

# Key metrics to track
monitoring_metrics = {
    "usage": {
        "items_created": "Track growth",
        "active_users": "Identify adoption",
        "query_volume": "Understand demand"
    },
    "performance": {
        "refresh_duration": "Pipeline and dataset",
        "query_latency": "SQL endpoint",
        "spark_job_duration": "Notebooks"
    },
    "capacity": {
        "cu_consumption": "By workspace",
        "storage_used": "OneLake consumption",
        "throttling_events": "Capacity pressure"
    }
}

# Access via Admin Portal > Usage metrics
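
Once the metrics are exported, a small script can roll them up by workspace. The sketch below assumes a list of records with `workspace`, `cu_seconds`, and an optional `throttling_events` field; that record shape is my own invention for illustration, not the format of any Fabric export.

```python
from collections import defaultdict


def summarize_capacity(records, throttle_threshold=1):
    """Total CU seconds per workspace and list workspaces with throttling."""
    cu_by_workspace = defaultdict(float)
    throttles = defaultdict(int)
    for record in records:
        cu_by_workspace[record["workspace"]] += record["cu_seconds"]
        throttles[record["workspace"]] += record.get("throttling_events", 0)
    under_pressure = sorted(
        name for name, count in throttles.items() if count >= throttle_threshold
    )
    return dict(cu_by_workspace), under_pressure


# Hypothetical sample records.
records = [
    {"workspace": "sales-prod", "cu_seconds": 1200.0, "throttling_events": 2},
    {"workspace": "sales-prod", "cu_seconds": 800.0},
    {"workspace": "marketing-prod", "cu_seconds": 300.0},
]
totals, under_pressure = summarize_capacity(records)
print(totals)
print(under_pressure)
```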

For most organizations, I recommend a hybrid approach:

# Platform workspaces (shared services)
platform-shared-governance/
platform-shared-reference-data/

# Domain workspaces (per team/domain)
sales-dev/
sales-prod/
marketing-dev/
marketing-prod/

# Sandbox workspaces (exploration)
sandbox-{user-name}/

This provides:

  • Clear separation of concerns
  • Domain autonomy with governance
  • Safe exploration space
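
If you provision workspaces from a script, the hybrid layout can be generated from a list of domains. A small sketch, assuming the `dev`/`prod` split shown above; the helper name and the `jdoe` sandbox user are hypothetical.

```python
def hybrid_workspaces(domains, environments=("dev", "prod"), sandbox_users=()):
    """List workspace names for the platform, domain, and sandbox tiers."""
    names = ["platform-shared-governance", "platform-shared-reference-data"]
    names += [f"{domain}-{env}" for domain in domains for env in environments]
    names += [f"sandbox-{user}" for user in sandbox_users]
    return names


for name in hybrid_workspaces(["sales", "marketing"], sandbox_users=["jdoe"]):
    print(name)
```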

Tomorrow we’ll explore OneLake explorer and how to navigate your data across workspaces.

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.