
Kubernetes Cost Optimization on AKS: Lessons from 2025

Kubernetes cost management became a critical skill in 2025 as organizations scaled their AKS deployments. Here are the strategies that consistently reduced costs by 40-60% without sacrificing reliability.

Right-Sizing Workloads

The biggest cost savings come from accurate resource requests:

# Before optimization - over-provisioned
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
      - name: api
        image: myregistry/api:latest  # placeholder image
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"
          limits:
            memory: "4Gi"
            cpu: "2000m"

# After optimization - based on actual metrics
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
      - name: api
        image: myregistry/api:latest  # placeholder image
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"

Implementing KEDA for Scale-to-Zero

Event-driven workloads don't need to run when there is no work. KEDA can scale a deployment down to zero replicas while its queue is empty and back up as messages arrive:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-processor-scaler
spec:
  scaleTargetRef:
    name: queue-processor
  minReplicaCount: 0  # Scale to zero during off-hours
  maxReplicaCount: 20
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: orders
      messageCount: "5"
      connectionFromEnv: SERVICEBUS_CONNECTION
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 50
            periodSeconds: 60
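
This assumes KEDA is already running in the cluster and that the queue-processor pods expose the Service Bus connection string in a SERVICEBUS_CONNECTION environment variable, since connectionFromEnv resolves against the target workload's environment. On AKS, one option is the managed KEDA add-on (a sketch, assuming a recent Azure CLI that supports the add-on):

# Enable the AKS-managed KEDA add-on on an existing cluster
az aks update \
  --resource-group myRG \
  --name myAKS \
  --enable-keda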

Spot Instances for Non-Critical Workloads

# Create spot node pool for batch processing
az aks nodepool add \
  --resource-group myRG \
  --cluster-name myAKS \
  --name spotnodes \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --labels workload-type=batch

# Schedule batch jobs on spot nodes
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing
spec:
  template:
    spec:
      nodeSelector:
        workload-type: batch
      tolerations:
      - key: "kubernetes.azure.com/scalesetpriority"
        operator: "Equal"
        value: "spot"
        effect: "NoSchedule"
      containers:
      - name: processor
        image: myregistry/processor:latest
      restartPolicy: Never  # required for Job pod templates
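
A quick way to confirm the pool and the scheduling behave as intended, using the workload-type label defined on the node pool above:

# Spot nodes should appear with the custom label
kubectl get nodes -l workload-type=batch

# The job's pods should land on those nodes (check the NODE column)
kubectl get pods -l job-name=data-processing -o wide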

Cost Monitoring Dashboard

A Log Analytics query against the Container insights data can surface containers that sit mostly idle and are candidates for right-sizing:

// KQL query to find under-utilized containers over the past week
ContainerInventory
| where TimeGenerated > ago(7d)
| summarize
    AvgCPU = avg(CPUPercentage),
    AvgMemory = avg(MemoryUsedPercentage),
    TotalGB = sum(ImageSize) / 1024 / 1024 / 1024
    by ContainerName, bin(TimeGenerated, 1h)
| where AvgCPU < 10 or AvgMemory < 20
| project ContainerName, AvgCPU, AvgMemory, TotalGB
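
The same query can be run from the command line to feed a report or alert, for example via the Azure CLI's log-analytics extension (a sketch; the workspace GUID is a placeholder you would replace with your own):

# Run the utilization query against a Log Analytics workspace
az monitor log-analytics query \
  --workspace "00000000-0000-0000-0000-000000000000" \
  --analytics-query "ContainerInventory | where TimeGenerated > ago(7d) | summarize AvgCPU = avg(CPUPercentage) by ContainerName | where AvgCPU < 10" \
  --output table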

Key Metrics to Track

  1. CPU/Memory utilization vs requests - Target 60-70% of requested resources actually in use
  2. Node count vs workload - Minimize idle nodes (see the quick check after this list)
  3. Spot vs on-demand ratio - Aim for 40% spot for suitable workloads
  4. Egress costs - Often overlooked but significant
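
For the node-level metrics, a quick check that needs nothing beyond kubectl (metrics-server is enabled by default on AKS):

# Live CPU/memory usage per node - consistently low numbers suggest consolidation
kubectl top nodes

# Requested (allocated) resources per node, to compare against actual usage
kubectl describe nodes | grep -A 5 "Allocated resources"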

Kubernetes cost optimization is an ongoing process, not a one-off exercise. Review these metrics weekly and adjust based on actual usage patterns.

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.