Azure Cosmos DB Partitioning Strategies
Partition key choice is the most critical Cosmos DB design decision. A good key lets a container scale out evenly; a poor one leads to hot partitions, throttling, and expensive cross-partition queries.
Partition Key Fundamentals
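- Each logical partition (all items sharing one partition key value) holds at most 20 GB
- A container's partition key path cannot be changed after creation, and an item's partition key value cannot be updated in place
- Queries should filter on the partition key whenever possible
- Cross-partition queries fan out across physical partitions, so they cost more RUs and add latency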
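To make the 20 GB limit concrete, here is a minimal sketch that estimates how much data each logical partition would hold for a sample of documents. The helper names (`logical_partition_sizes`, `partitions_near_limit`) and the 80% warning threshold are assumptions for illustration, and serialized JSON length is only a rough proxy for what Cosmos DB actually stores.

```python
import json
from collections import defaultdict

LOGICAL_PARTITION_LIMIT_BYTES = 20 * 1024**3  # 20 GB cap per logical partition


def logical_partition_sizes(docs, key_field):
    """Approximate bytes stored per logical partition (per partition key value)."""
    sizes = defaultdict(int)
    for doc in docs:
        # Serialized JSON length is only a rough stand-in for stored size.
        sizes[doc[key_field]] += len(json.dumps(doc).encode("utf-8"))
    return dict(sizes)


def partitions_near_limit(sizes, limit=LOGICAL_PARTITION_LIMIT_BYTES, threshold=0.8):
    """Return key values whose estimated size exceeds threshold * limit."""
    return sorted(k for k, size in sizes.items() if size > threshold * limit)


# Hypothetical sample data shaped like the order documents in this post.
orders = [
    {"id": "order-1", "customerId": "cust-456", "orderDate": "2020-10-12"},
    {"id": "order-2", "customerId": "cust-456", "orderDate": "2020-10-13"},
    {"id": "order-3", "customerId": "cust-789", "orderDate": "2020-10-12"},
]
sizes = logical_partition_sizes(orders, "customerId")
```

Running this over a representative export, and projecting growth forward, gives an early warning that a key value may outgrow its logical partition.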
Good Partition Key Characteristics
- High cardinality - Many distinct values
- Even distribution - No hot partitions
- Query aligned - Most queries filter by it
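One way to compare candidate keys before committing is to profile a sample of real documents. The sketch below is an assumption-laden illustration, not an official tool: `key_profile` is a hypothetical helper that reports the number of distinct values (cardinality) and the share of documents landing on the busiest value (skew).

```python
from collections import Counter


def key_profile(docs, key_field):
    """Summarize cardinality and skew for a candidate partition key."""
    counts = Counter(doc[key_field] for doc in docs)
    total = sum(counts.values())
    hottest_value, hottest_count = counts.most_common(1)[0]
    return {
        "distinct_values": len(counts),
        "hottest_value": hottest_value,
        "hottest_share": hottest_count / total,
    }


# Hypothetical sample: four orders from four customers, two statuses.
sample = [
    {"customerId": "cust-1", "status": "shipped"},
    {"customerId": "cust-2", "status": "shipped"},
    {"customerId": "cust-3", "status": "shipped"},
    {"customerId": "cust-4", "status": "pending"},
]
by_customer = key_profile(sample, "customerId")
by_status = key_profile(sample, "status")
```

In this sample, `customerId` spreads documents evenly across four values, while `status` puts three quarters of them on a single value, which is the low-cardinality, hot-partition shape to avoid.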
Common Patterns
E-Commerce Orders
// Good: customerId (queries are per customer)
{
"id": "order-123",
"customerId": "cust-456", // Partition key
"orderDate": "2020-10-12",
"items": [...]
}
// Bad: orderDate (hot partition on current date)
IoT Telemetry
// Good: deviceId (queries are per device)
{
"id": "reading-789",
"deviceId": "sensor-001", // Partition key
"timestamp": "2020-10-12T10:30:00Z",
"temperature": 72.5
}
Multi-Tenant SaaS
// Good: tenantId (isolation and queries per tenant)
{
"id": "doc-123",
"tenantId": "tenant-abc", // Partition key
"type": "invoice",
"data": {...}
}
Synthetic Partition Keys
Combine fields when a single property would create oversized or hot logical partitions.
// Combine userId + year for time-series data
{
"id": "activity-123",
"partitionKey": "user-456_2020", // Synthetic key
"userId": "user-456",
"timestamp": "2020-10-12T10:30:00Z",
"action": "login"
}
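The synthetic key has to be computed by the application before each write. A minimal sketch of that step, assuming ISO 8601 timestamps like the one in the document above; the function names are hypothetical:

```python
from datetime import datetime


def synthetic_partition_key(user_id, timestamp):
    """Build a '<userId>_<year>' key from an ISO 8601 timestamp."""
    # fromisoformat() rejects a trailing 'Z' before Python 3.11, so normalize it.
    year = datetime.fromisoformat(timestamp.replace("Z", "+00:00")).year
    return f"{user_id}_{year}"


def with_partition_key(doc):
    """Return a copy of the document with the synthetic key filled in."""
    key = synthetic_partition_key(doc["userId"], doc["timestamp"])
    return {**doc, "partitionKey": key}


activity = with_partition_key({
    "id": "activity-123",
    "userId": "user-456",
    "timestamp": "2020-10-12T10:30:00Z",
    "action": "login",
})
```

Readers need the same function to reconstruct the key when querying, so it is worth keeping it in one shared module.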
Hierarchical Partition Keys (Preview)
// Define multiple levels
containerProperties.PartitionKeyPaths = new List<string> {
"/tenantId",
"/userId",
"/sessionId"
};
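With a hierarchical key, the useful question for any query is how many leading levels of the hierarchy its filters pin down. The sketch below only models that reasoning in plain Python. It does not call the SDK, `matched_prefix` and `routing` are hypothetical names, and it assumes that a query naming a leading prefix of the hierarchy can be routed to a subset of partitions while one that skips the first level cannot.

```python
PARTITION_KEY_PATHS = ["/tenantId", "/userId", "/sessionId"]


def matched_prefix(filters, paths=PARTITION_KEY_PATHS):
    """Return the leading key levels that a set of equality filters pins down."""
    prefix = []
    for path in paths:
        field = path.lstrip("/")
        if field not in filters:
            break  # a gap ends the usable prefix
        prefix.append(filters[field])
    return prefix


def routing(filters, paths=PARTITION_KEY_PATHS):
    """Classify a query as 'single-partition', 'prefix', or 'fan-out'."""
    depth = len(matched_prefix(filters, paths))
    if depth == len(paths):
        return "single-partition"
    return "prefix" if depth > 0 else "fan-out"
```

For example, filtering on `tenantId` alone counts as a prefix query, while filtering on `sessionId` alone falls back to a fan-out, so the order of the levels should follow how queries are usually scoped.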
Querying with Partition Keys
-- Efficient: single partition
SELECT * FROM c
WHERE c.customerId = 'cust-456'
-- Less efficient: cross-partition (uses fan-out)
SELECT * FROM c
WHERE c.orderDate > '2020-01-01'
-- Add partition key for efficiency
SELECT * FROM c
WHERE c.customerId = 'cust-456'
AND c.orderDate > '2020-01-01'
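From application code, the same distinction shows up in the arguments passed to the query call. The sketch below builds keyword arguments for `query_items()` in the `azure-cosmos` Python SDK, assuming a container partitioned on `/customerId`; `order_query` is a hypothetical helper, and the parameterized query simply mirrors the SQL above.

```python
def order_query(customer_id=None, since=None):
    """Build keyword arguments for ContainerProxy.query_items()."""
    clauses, parameters = [], []
    if customer_id is not None:
        clauses.append("c.customerId = @customerId")
        parameters.append({"name": "@customerId", "value": customer_id})
    if since is not None:
        clauses.append("c.orderDate > @since")
        parameters.append({"name": "@since", "value": since})

    query = "SELECT * FROM c"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)

    kwargs = {"query": query, "parameters": parameters}
    if customer_id is not None:
        kwargs["partition_key"] = customer_id  # scope to one logical partition
    else:
        kwargs["enable_cross_partition_query"] = True  # explicit fan-out
    return kwargs
```

Usage would look like `container.query_items(**order_query("cust-456", "2020-01-01"))`, where `container` is an existing container client. Making the fan-out case an explicit opt-in keeps accidental cross-partition queries visible in code review.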
Monitoring Partition Usage
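# Show the container's partition key definition (path and kind).
# This does not report usage; per-partition throughput and storage
# appear in Azure Monitor metrics for the account.
# "myresourcegroup" is a placeholder; --resource-group is required.
az cosmosdb sql container show \
--resource-group myresourcegroup \
--account-name myaccount \
--database-name mydb \
--name mycontainer \
--query "resource.partitionKey"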
Anti-Patterns to Avoid
| Anti-Pattern | Problem |
|---|---|
| Low cardinality key (e.g., status) | Few partitions, can’t scale |
| Timestamp as partition key | Hot partition on current time |
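| Random GUID partition key | Writes spread evenly, but any query that does not filter on the GUID fans out |
| Same partition key as id | One item per logical partition: fine for point reads, but no grouping for queries or transactions |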
Choose your partition key carefully: it’s the foundation of Cosmos DB performance.