Fabric Workspace Setup: Organization and Best Practices
A well-organized workspace structure makes Microsoft Fabric easier to secure, deploy to, and govern. Today I’ll share the workspace organization patterns and best practices I’ve learned from setting up Fabric environments.
Workspace Fundamentals
In Fabric, a workspace is:
- A container for all your Fabric items
- A security boundary
- A capacity assignment unit
- A collaboration space
# Workspace components
workspace_contains = {
"data_engineering": ["Lakehouse", "Notebook", "Spark Job Definition"],
"data_factory": ["Data Pipeline", "Dataflow Gen2", "Copy Job"],
"data_warehouse": ["Warehouse"],
"data_science": ["ML Model", "ML Experiment", "Notebook"],
"real_time_analytics": ["KQL Database", "KQL Queryset", "Eventstream"],
"power_bi": ["Report", "Semantic model", "Dashboard"]  # "Dataset" has been renamed to "Semantic model"
}
Workspace Naming Conventions
Establish a consistent naming pattern:
# Recommended naming pattern
# {organization}-{domain}-{environment}-{purpose}
workspace_examples = [
"contoso-sales-dev-engineering",
"contoso-sales-prod-analytics",
"contoso-finance-dev-reporting",
"contoso-shared-prod-governance"
]
# Avoid these patterns:
bad_examples = [
"test", # Too vague
"john's workspace", # Personal names
"new workspace", # Non-descriptive
"copy of sales" # Indicates poor organization
]
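A convention is easier to keep when it can be checked automatically. Here is a minimal sketch of a validator for the pattern above. The regular expression, the allowed environment names (dev, test, prod) and the function name are my assumptions for illustration, so adjust them to your own standard.

```python
import re
from typing import Optional

# Assumed rule: four lowercase, hyphen-separated segments, with the third
# segment restricted to a fixed set of environment names.
WORKSPACE_NAME = re.compile(
    r"(?P<organization>[a-z0-9]+)-"
    r"(?P<domain>[a-z0-9]+)-"
    r"(?P<environment>dev|test|prod)-"
    r"(?P<purpose>[a-z0-9]+)"
)


def parse_workspace_name(name: str) -> Optional[dict]:
    """Return the named segments if the name follows the convention, else None."""
    match = WORKSPACE_NAME.fullmatch(name)
    return match.groupdict() if match else None


for candidate in ["contoso-sales-dev-engineering", "john's workspace", "test"]:
    status = "ok" if parse_workspace_name(candidate) else "rejected"
    print(f"{candidate}: {status}")
```

Returning the parsed segments rather than a plain boolean means the same function can feed reports grouped by domain or environment.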
Workspace Organization Patterns
Pattern 1: Environment-Based
Separate workspaces by environment:
contoso-sales-dev/
  sales_lakehouse
  transform_notebook
  sales_pipeline
contoso-sales-test/
  sales_lakehouse
  transform_notebook
  sales_pipeline
contoso-sales-prod/
  sales_lakehouse
  transform_notebook
  sales_pipeline
# Benefits:
# - Clear separation of environments
# - Easy to implement CI/CD
# - Simple permission model
# Drawbacks:
# - Many workspaces to manage
# - Potential for drift between environments
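Drift between environments is the drawback I see most often, and it is cheap to detect once you have an item inventory per workspace. The sketch below assumes you have already collected item names for each workspace (for example from an export or an API listing). The `find_drift` helper and the sample inventory are hypothetical.

```python
def find_drift(environments: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each environment, list items that exist elsewhere but are missing here."""
    all_items = set().union(*environments.values())
    return {
        env: all_items - items
        for env, items in environments.items()
        if all_items - items
    }


# Hypothetical inventory: dev has gained a notebook, prod is missing two items.
inventory = {
    "contoso-sales-dev": {
        "sales_lakehouse", "transform_notebook", "sales_pipeline", "scratch_notebook",
    },
    "contoso-sales-test": {"sales_lakehouse", "transform_notebook", "sales_pipeline"},
    "contoso-sales-prod": {"sales_lakehouse", "sales_pipeline"},
}

for env, missing in find_drift(inventory).items():
    print(f"{env} is missing: {sorted(missing)}")
```

This only compares item names. Comparing item definitions would need the content itself, which is better handled by source control or a deployment tool.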
Pattern 2: Layer-Based
Separate by data architecture layer:
contoso-raw-layer/
  raw_sales_lakehouse
  raw_inventory_lakehouse
  ingestion_pipelines
contoso-curated-layer/
  curated_lakehouse
  transformation_notebooks
contoso-serving-layer/
  warehouse
  power_bi_reports
# Benefits:
# - Aligns with medallion architecture
# - Clear data lineage
# - Easy to manage access by layer
# Drawbacks:
# - Cross-workspace operations more complex
# - May need shortcuts for data access
Pattern 3: Domain-Based
Separate by business domain:
sales-domain/
  sales_lakehouse
  sales_warehouse
  sales_reports
marketing-domain/
  marketing_lakehouse
  campaign_analytics
finance-domain/
  finance_warehouse
  financial_reports
# Benefits:
# - Aligns with data mesh principles
# - Domain teams have autonomy
# - Clear ownership
# Drawbacks:
# - Need governance for cross-domain data
# - Potential data duplication
Setting Up a Workspace
# Workspace configuration checklist
workspace_setup = {
"basic_settings": {
"name": "Follow naming convention",
"description": "Include purpose and owner",
"contact_list": "Team email or distribution list",
"license_mode": "Fabric capacity or Premium"
},
"advanced_settings": {
"onelake_files_default": True,
"data_model_settings": "Large dataset storage format",
"default_semantic_model": "Configure as needed"
},
"access_settings": {
"member_permissions": "Define roles carefully",
"workspace_admins": "Limit to necessary people",
"app_permissions": "Configure for sharing"
}
}
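If workspace requests come in through a form or ticket, the basic settings in the checklist can be verified before anyone creates anything. This is a small sketch under that assumption. The required keys mirror the checklist above, while the function name and the sample request are made up for illustration.

```python
# Keys taken from the "basic_settings" part of the checklist.
REQUIRED_BASIC_SETTINGS = ("name", "description", "contact_list", "license_mode")


def missing_basic_settings(settings: dict) -> list[str]:
    """Return required keys that are absent or blank in a workspace request."""
    return [
        key
        for key in REQUIRED_BASIC_SETTINGS
        if not str(settings.get(key) or "").strip()
    ]


# Hypothetical request with a blank description and no contact list.
request = {
    "name": "contoso-sales-dev-engineering",
    "description": "   ",
    "license_mode": "Fabric capacity",
}
print(missing_basic_settings(request))
```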
Role-Based Access Control
Fabric workspaces have four roles:
workspace_roles = {
"Admin": {
"description": "Full control over workspace",
"can_do": [
"Manage workspace settings",
"Add/remove members",
"Delete workspace",
"All Member permissions"
],
"assign_to": "Platform team, workspace owners"
},
"Member": {
"description": "Create and manage content",
"can_do": [
"Create items",
"Edit items",
"Share items",
"Cannot manage workspace settings"
],
"assign_to": "Data engineers, analysts"
},
"Contributor": {
"description": "Create content, limited sharing",
"can_do": [
"Create items",
"Edit items",
"Cannot share items"
],
"assign_to": "Developers in training"
},
"Viewer": {
"description": "View content only",
"can_do": [
"View items",
"Run reports",
"Cannot edit or create"
],
"assign_to": "Business users, stakeholders"
}
}
# Print role matrix
for role, details in workspace_roles.items():
    print(f"\n{role}:")
    print(f"    {details['description']}")
    print(f"    Assign to: {details['assign_to']}")
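When someone asks for access, I find it useful to start from what they need to do and pick the lowest role that covers it. Here is a rough sketch of that idea. The capability labels are simplifications I made up from the role descriptions above, not Fabric API identifiers, and the real permission matrix is more detailed.

```python
# Roles ordered from least to most privileged (dicts keep insertion order).
# Capability names are illustrative labels for this sketch only.
ROLE_CAPABILITIES = {
    "Viewer": {"view"},
    "Contributor": {"view", "create", "edit"},
    "Member": {"view", "create", "edit", "share"},
    "Admin": {"view", "create", "edit", "share", "manage_workspace"},
}


def least_privileged_role(needed: set[str]) -> str:
    """Return the first (lowest) role whose capabilities cover everything needed."""
    for role, capabilities in ROLE_CAPABILITIES.items():
        if needed <= capabilities:
            return role
    raise ValueError(f"No role grants: {sorted(needed)}")


print(least_privileged_role({"view"}))
print(least_privileged_role({"create", "edit"}))
print(least_privileged_role({"edit", "share"}))
```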
Cross-Workspace Data Access
Use shortcuts for cross-workspace data access:
# Creating a shortcut to another workspace's Lakehouse
# In your Lakehouse:
# 1. Right-click Tables or Files
# 2. Select "New shortcut"
# 3. Choose "Microsoft OneLake"
# 4. Select source workspace and Lakehouse
# 5. Choose Tables or Files to shortcut
# This enables:
# - No data duplication
# - Central governance
# - Cross-workspace queries
# Example: Query data across workspaces
"""
-- In SQL endpoint, shortcuts appear as regular tables
SELECT
s.order_id,
s.product_id,
p.product_name,
s.quantity
FROM sales_data s -- Local table
JOIN product_catalog p -- Shortcut from another workspace
ON s.product_id = p.product_id
"""
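The UI steps above can also be scripted. The sketch below only builds a request body and sends nothing. The body shape follows my understanding of the Fabric REST API's Create Shortcut operation, so treat the field names as assumptions and check them against the current API reference. The IDs are placeholders and the helper name is hypothetical.

```python
import json


def build_onelake_shortcut_body(
    name: str,
    source_workspace_id: str,
    source_item_id: str,
    source_path: str,
    destination_path: str = "Tables",
) -> dict:
    """Build a request body for a OneLake-to-OneLake shortcut (assumed shape)."""
    return {
        "path": destination_path,  # where the shortcut appears in your Lakehouse
        "name": name,              # the shortcut's display name
        "target": {
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_item_id,
                "path": source_path,  # table or folder in the source Lakehouse
            }
        },
    }


# Placeholder IDs: substitute the real workspace and Lakehouse GUIDs.
body = build_onelake_shortcut_body(
    name="product_catalog",
    source_workspace_id="<source-workspace-id>",
    source_item_id="<source-lakehouse-id>",
    source_path="Tables/product_catalog",
)
print(json.dumps(body, indent=2))
```

Keeping the body construction separate from the HTTP call makes it easy to review or test shortcut definitions before anything is created.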
Workspace Governance
Implement governance controls:
# Governance checklist
governance_controls = {
"naming_standards": {
"enforce": True,
"pattern": "{prefix}-{domain}-{env}-{purpose}",
"validation": "Manual review or automated policy"
},
"capacity_assignment": {
"enforce": True,
"policy": "Only approved capacities",
"exceptions": "Document in governance wiki"
},
"data_classification": {
"enforce": True,
"levels": ["Public", "Internal", "Confidential", "Restricted"],
"labeling": "Required for all datasets"
},
"access_reviews": {
"frequency": "Quarterly",
"reviewer": "Workspace admin",
"documentation": "Required"
}
}
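The "labeling required" control is a good candidate for an automated check. This is a minimal sketch, assuming you can export an item list with a label field. The record shape (`name`, `label`) and the function name are hypothetical.

```python
# Levels taken from the governance checklist above.
ALLOWED_LEVELS = ("Public", "Internal", "Confidential", "Restricted")


def unlabeled_items(items: list[dict]) -> list[str]:
    """Return names of items whose label is missing or not an allowed level."""
    return [item["name"] for item in items if item.get("label") not in ALLOWED_LEVELS]


# Hypothetical export of items with their classification labels.
exported = [
    {"name": "sales_lakehouse", "label": "Confidential"},
    {"name": "campaign_analytics", "label": None},
    {"name": "financial_reports", "label": "Secret"},
    {"name": "product_catalog"},
]
print(unlabeled_items(exported))
```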
Workspace Monitoring
Monitor workspace health and usage:
# Key metrics to track
monitoring_metrics = {
"usage": {
"items_created": "Track growth",
"active_users": "Identify adoption",
"query_volume": "Understand demand"
},
"performance": {
"refresh_duration": "Pipeline and dataset",
"query_latency": "SQL endpoint",
"spark_job_duration": "Notebooks"
},
"capacity": {
"cu_consumption": "By workspace",
"storage_used": "OneLake consumption",
"throttling_events": "Capacity pressure"
}
}
# Tenant-level usage: Admin portal > Usage metrics
# CU consumption and throttling: Microsoft Fabric Capacity Metrics app
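To track CU consumption by workspace, I usually reduce the raw numbers to each workspace's share of the total. The sketch below assumes you have exported consumption records into a list of dicts. The field names `workspace` and `cu_seconds` and the sample values are invented for illustration.

```python
from collections import defaultdict


def cu_share_by_workspace(records: list[dict]) -> dict[str, float]:
    """Sum CU seconds per workspace and return each workspace's share of the total."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["workspace"]] += record["cu_seconds"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {workspace: total / grand_total for workspace, total in totals.items()}


# Hypothetical export: two operations in sales-prod, one in marketing-prod.
sample = [
    {"workspace": "sales-prod", "cu_seconds": 300.0},
    {"workspace": "sales-prod", "cu_seconds": 100.0},
    {"workspace": "marketing-prod", "cu_seconds": 100.0},
]
for workspace, share in cu_share_by_workspace(sample).items():
    print(f"{workspace}: {share:.0%}")
```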
My Recommended Structure
For most organizations, I recommend a hybrid approach:
# Platform workspaces (shared services)
platform-shared-governance/
platform-shared-reference-data/
# Domain workspaces (per team/domain)
sales-dev/
sales-prod/
marketing-dev/
marketing-prod/
# Sandbox workspaces (exploration)
sandbox-{user-name}/
This provides:
- Clear separation of concerns
- Domain autonomy with governance
- Safe exploration space
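The hybrid layout is regular enough to generate from a list of domains, which helps when onboarding a new team. Here is a small sketch. The function name and its defaults are my own, and the two platform workspace names are copied from the structure above.

```python
def hybrid_workspace_names(
    domains: list[str],
    environments: tuple[str, ...] = ("dev", "prod"),
    sandbox_users: tuple[str, ...] = (),
) -> list[str]:
    """List platform, per-domain, and sandbox workspace names for the hybrid layout."""
    names = ["platform-shared-governance", "platform-shared-reference-data"]
    names += [f"{domain}-{env}" for domain in domains for env in environments]
    names += [f"sandbox-{user}" for user in sandbox_users]
    return names


for name in hybrid_workspace_names(["sales", "marketing"], sandbox_users=("jdoe",)):
    print(name)
```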
Tomorrow we’ll explore OneLake explorer and how to navigate your data across workspaces.