AKS Cost Optimization: Strategies for Reducing Kubernetes Spending
This post shares practical, production-minded guidance for reducing what you spend on Azure Kubernetes Service (AKS).
Understanding AKS Costs
AKS costs come from several components:
- Virtual Machine nodes
- Storage (managed disks, Azure Files)
- Networking (load balancers, bandwidth)
- Container Registry
- Log Analytics
Right-Sizing Your Nodes
Use the Kubernetes Vertical Pod Autoscaler (VPA) to understand actual resource usage:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off" # Just recommend, don't auto-apply
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
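Once the VPA has observed the workload for a while, its status carries a per-container `target` recommendation (visible with `kubectl describe vpa my-app-vpa`). The following is a minimal sketch of how you might compare that target with the requests you currently set. The quantities in `current` and `target` are hypothetical, and `parse_quantity` is an illustrative helper that handles only the common suffixes, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal

# Multipliers for the common Kubernetes quantity suffixes.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(quantity: str) -> Decimal:
    """Convert '500m', '2' or '512Mi' to a plain number (cores or bytes)."""
    quantity = quantity.strip()
    if quantity.endswith("m"):  # millicores, e.g. 500m = 0.5 cores
        return Decimal(quantity[:-1]) / 1000
    # Check two-character suffixes before one-character ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(quantity)


def overprovision_ratio(requested: str, recommended: str) -> float:
    """How many times larger the current request is than the VPA target."""
    return float(parse_quantity(requested) / parse_quantity(recommended))


# Hypothetical values: what the Deployment requests today versus the
# VPA "target" recommendation for the same container.
current = {"cpu": "500m", "memory": "512Mi"}
target = {"cpu": "125m", "memory": "256Mi"}

ratios = {res: overprovision_ratio(current[res], target[res]) for res in current}
print(ratios)
```

A ratio well above 1 suggests the request can be lowered toward the target; a ratio below 1 suggests the container is under-requested and may be throttled or evicted under pressure.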
Cluster Autoscaler Configuration
Configure the cluster autoscaler for cost efficiency:
agentPoolProfiles: [
  {
    name: 'userpool'
    count: 2
    vmSize: 'Standard_D4s_v3'
    enableAutoScaling: true
    minCount: 1
    maxCount: 20
    scaleDownMode: 'Delete'
    // scaleSetEvictionPolicy is omitted: it applies only to Spot node pools
  }
]
Set appropriate scale-down settings:
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --cluster-autoscaler-profile \
    scale-down-delay-after-add=10m \
    scale-down-unneeded-time=10m \
    scale-down-utilization-threshold=0.5
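To make the threshold concrete, here is a rough sketch of the check it controls. As I understand the upstream cluster autoscaler, a node's utilization is the larger of its CPU and memory request ratios (sum of pod requests divided by allocatable), and a node becomes a removal candidate only when that value stays below the threshold for the `scale-down-unneeded-time` window. The node names and resource figures below are hypothetical, and the real autoscaler applies further checks (for example, whether the pods can be rescheduled elsewhere).

```python
def node_utilization(cpu_requested: float, cpu_allocatable: float,
                     mem_requested: float, mem_allocatable: float) -> float:
    """Utilization as the larger of the CPU and memory request ratios."""
    return max(cpu_requested / cpu_allocatable, mem_requested / mem_allocatable)


def is_scale_down_candidate(utilization: float, threshold: float = 0.5) -> bool:
    """A node is a candidate only while utilization is below the threshold."""
    return utilization < threshold


# Hypothetical nodes with 3.8 allocatable cores and 12 GiB allocatable memory.
nodes = {
    "node-a": node_utilization(1.5, 3.8, 4.0, 12.0),  # CPU-bound, about 39%
    "node-b": node_utilization(1.0, 3.8, 8.0, 12.0),  # memory-bound, about 67%
}

candidates = [name for name, util in nodes.items() if is_scale_down_candidate(util)]
print(candidates)
```

Because the calculation uses requests rather than live usage, inflated requests keep nodes above the threshold and block scale-down, which is one more reason to right-size first.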
Using Spot Node Pools
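Spot VMs can cost up to 90% less than pay-as-you-go VMs, but Azure can evict them whenever it needs the capacity back, so use them only for workloads that tolerate interruption: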
resource spotPool 'Microsoft.ContainerService/managedClusters/agentPools@2021-10-01' = {
  parent: aksCluster
  name: 'spotpool'
  properties: {
    count: 3
    vmSize: 'Standard_D4s_v3'
    scaleSetPriority: 'Spot'
    scaleSetEvictionPolicy: 'Delete'
    spotMaxPrice: -1 // Pay the current spot price, capped at the on-demand price
    enableAutoScaling: true
    minCount: 0
    maxCount: 50
    // AKS adds the kubernetes.azure.com/scalesetpriority=spot:NoSchedule taint
    // and the kubernetes.azure.com/scalesetpriority=spot label to Spot pools
    // automatically. The kubernetes.azure.com/ prefix is reserved, so they
    // are not declared here.
  }
}
Schedule workloads on spot nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-job
spec:
  selector: # Required: must match the pod template labels
    matchLabels:
      app: batch-job
  template:
    metadata:
      labels:
        app: batch-job
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values:
                - spot
      tolerations:
      - key: kubernetes.azure.com/scalesetpriority
        operator: Equal
        value: spot
        effect: NoSchedule
      containers:
      - name: batch
        image: myregistry.azurecr.io/batch:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2"
            memory: "2Gi"
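To estimate what moving part of a cluster to spot nodes is worth, a back-of-the-envelope calculation is usually enough. The sketch below assumes a hypothetical on-demand price of $0.20 per node-hour and a hypothetical 70% spot discount; real spot prices vary by region and VM size and change over time, so substitute your own figures.

```python
def monthly_cost(on_demand_nodes: int, spot_nodes: int,
                 on_demand_hourly: float, spot_discount: float,
                 hours: int = 730) -> float:
    """Estimated monthly compute cost for a mix of on-demand and spot nodes."""
    spot_hourly = on_demand_hourly * (1 - spot_discount)
    return hours * (on_demand_nodes * on_demand_hourly + spot_nodes * spot_hourly)


# Hypothetical inputs: 10 nodes in total, $0.20/hour on demand, 70% spot discount.
all_on_demand = monthly_cost(10, 0, on_demand_hourly=0.20, spot_discount=0.70)
mixed = monthly_cost(4, 6, on_demand_hourly=0.20, spot_discount=0.70)
savings = 1 - mixed / all_on_demand

print(f"{all_on_demand:.2f} -> {mixed:.2f} ({savings:.0%} saved)")
```

The overall saving is always smaller than the headline spot discount, because only the fault-tolerant share of the cluster can move to spot nodes.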
Reserved Instances for Base Capacity
For predictable workloads, use reserved instances:
# Calculate your base capacity needs first
# Then purchase reservations for consistent savings
az reservations reservation-order purchase \
  --sku Standard_D4s_v3 \
  --location eastus \
  --reserved-resource-type VirtualMachines \
  --billing-scope-id /subscriptions/xxx \
  --term P1Y \
  --quantity 5 \
  --applied-scope-type Single \
  --applied-scopes /subscriptions/xxx/resourceGroups/myResourceGroup
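One way to approach the "calculate your base capacity" step is to look at historical node counts and reserve only what is in use almost all of the time. The sketch below is an assumption about how you might do that, not a prescribed method: `base_capacity` and the hourly samples are hypothetical, and in practice you would feed in weeks of data from your monitoring system.

```python
def base_capacity(node_counts: list[int], coverage: float = 0.95) -> int:
    """Largest node count that was in use for at least `coverage` of the samples."""
    ordered = sorted(node_counts)
    # Skip the lowest (1 - coverage) share of samples, then take the next one.
    index = int(len(ordered) * (1 - coverage))
    return ordered[index]


# Hypothetical hourly node counts for one pool.
hourly_nodes = [5, 5, 6, 7, 8, 10, 12, 9, 6, 5, 5, 5]

print(base_capacity(hourly_nodes))
```

Reserving this floor and leaving the peaks to on-demand or spot capacity avoids paying for reservations that sit idle.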
Implementing Pod Disruption Budgets
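Pod Disruption Budgets (PDBs) limit how many pods a voluntary disruption, such as a node drain during autoscaler scale-down, can evict at once, so critical workloads stay available while nodes are removed: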
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
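A budget that is too strict can work against cost optimization: with `minAvailable: 2` and only two replicas, no pod may be evicted, so the nodes hosting them cannot be drained. The arithmetic is simple enough to sketch; `allowed_disruptions` is an illustrative helper that simplifies what the Kubernetes disruption controller computes.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that may be evicted voluntarily at this moment."""
    return max(0, healthy_pods - min_available)


# How the budget behaves with minAvailable: 2 at different replica counts.
for replicas in (2, 3, 5):
    print(replicas, allowed_disruptions(replicas, min_available=2))
```

Running at least one replica more than `minAvailable` keeps the workload protected while still letting the autoscaler reclaim underused nodes.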
Cost Monitoring with Kubecost
Deploy Kubecost to see costs broken down by namespace, workload, and label:
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="xxx"
Summary
Key cost optimization strategies:
- Right-size resources using VPA recommendations
- Tune the cluster autoscaler so underused nodes are removed promptly
- Leverage Spot VMs for fault-tolerant workloads
- Purchase reserved instances for baseline capacity
- Monitor and analyze costs continuously
- Implement resource quotas and limit ranges
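Applied together, these strategies can reduce the cost of an over-provisioned AKS cluster by 50-70% without sacrificing performance; the actual saving depends on your workload mix and how much slack you start with.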