Running Azure Cognitive Services in Containers
Azure Cognitive Services can run in containers, enabling offline scenarios, reduced latency, and data sovereignty. This is crucial for edge computing and regulated industries. Let’s explore how to deploy them.
Why Containerized Cognitive Services?
- Latency: Process locally instead of round-trips to Azure
- Offline capability: Work without internet connectivity
- Data residency: Keep data on-premises
- Cost optimization: Reduce API call costs for high-volume scenarios
- Air-gapped environments: Deploy in secure networks
Available Container Images
# Text Analytics
mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment
mcr.microsoft.com/azure-cognitive-services/textanalytics/language
mcr.microsoft.com/azure-cognitive-services/textanalytics/keyphrase
# Computer Vision
mcr.microsoft.com/azure-cognitive-services/vision/read
# Speech
mcr.microsoft.com/azure-cognitive-services/speechservices/speech-to-text
mcr.microsoft.com/azure-cognitive-services/speechservices/text-to-speech
# Form Recognizer
mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout
mcr.microsoft.com/azure-cognitive-services/form-recognizer/invoice
Setting Up Sentiment Analysis Container
1. Create Azure Resource for Billing
Even containerized services require an Azure resource for billing:
# Create resource group
az group create --name rg-cognitive-services --location australiaeast

# Create Text Analytics resource
az cognitiveservices account create \
  --name my-text-analytics \
  --resource-group rg-cognitive-services \
  --kind TextAnalytics \
  --sku S \
  --location australiaeast

# Get keys and endpoint
az cognitiveservices account keys list \
  --name my-text-analytics \
  --resource-group rg-cognitive-services

az cognitiveservices account show \
  --name my-text-analytics \
  --resource-group rg-cognitive-services \
  --query "properties.endpoint"
2. Run the Container
docker run -d \
  --name sentiment \
  -p 5000:5000 \
  -e Eula=accept \
  -e Billing=https://australiaeast.api.cognitive.microsoft.com/ \
  -e ApiKey=your-api-key \
  mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
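The container can take a minute or two to start while it loads its models, so it's worth waiting before sending traffic. A minimal sketch of a readiness wait, assuming the container is published on localhost:5000 as above and exposes the standard /status endpoint (the helper name and parameters are illustrative, not part of any SDK):

```python
import time

import requests


def wait_until_ready(base_url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll the container's /status endpoint until it returns HTTP 200,
    giving up after timeout_s seconds. Returns True once the container is up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if requests.get(f"{base_url}/status", timeout=5).status_code == 200:
                return True
        except requests.exceptions.RequestException:
            pass  # container still starting; retry after a short pause
        time.sleep(interval_s)
    return False
```

Call `wait_until_ready("http://localhost:5000")` after `docker run` and before making your first request.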
3. Call the Local API
import requests

def analyze_sentiment_local(texts: list[str]) -> dict:
    """Analyze sentiment using the local container."""
    url = "http://localhost:5000/text/analytics/v3.0/sentiment"
    documents = [
        {"id": str(i), "text": text, "language": "en"}
        for i, text in enumerate(texts)
    ]
    response = requests.post(
        url,
        headers={"Content-Type": "application/json"},
        json={"documents": documents}
    )
    response.raise_for_status()
    return response.json()

# Usage
texts = [
    "Azure is fantastic for enterprise workloads!",
    "The deployment failed again, this is frustrating.",
    "The weather is nice today."
]
results = analyze_sentiment_local(texts)
for doc in results["documents"]:
    print(f"Text {doc['id']}: {doc['sentiment']} (confidence: {doc['confidenceScores']})")
Docker Compose for Multiple Services
version: '3.8'

services:
  sentiment:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
    ports:
      - "5000:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G

  keyphrase:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/keyphrase:latest
    ports:
      - "5001:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G

  language-detection:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest
    ports:
      - "5002:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 2G

  ocr:
    image: mcr.microsoft.com/azure-cognitive-services/vision/read:3.2
    ports:
      - "5003:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
Kubernetes Deployment
The Deployment reads its billing endpoint and API key from a Secret, so create the namespace and Secret first:
kubectl create namespace cognitive-services
kubectl create secret generic cognitive-services-secret \
  --namespace cognitive-services \
  --from-literal=endpoint=https://australiaeast.api.cognitive.microsoft.com/ \
  --from-literal=key=your-api-key
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-analysis
  namespace: cognitive-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sentiment-analysis
  template:
    metadata:
      labels:
        app: sentiment-analysis
    spec:
      containers:
        - name: sentiment
          image: mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
          ports:
            - containerPort: 5000
          env:
            - name: Eula
              value: "accept"
            - name: Billing
              valueFrom:
                secretKeyRef:
                  name: cognitive-services-secret
                  key: endpoint
            - name: ApiKey
              valueFrom:
                secretKeyRef:
                  name: cognitive-services-secret
                  key: key
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi"
              cpu: "2"
          readinessProbe:
            httpGet:
              path: /ready
              port: 5000
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /status
              port: 5000
            initialDelaySeconds: 60
            periodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: sentiment-analysis-service
  namespace: cognitive-services
spec:
  selector:
    app: sentiment-analysis
  ports:
    - port: 80
      targetPort: 5000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cognitive-services-ingress
  namespace: cognitive-services
  annotations:
    kubernetes.io/ingress.class: nginx
    # Strip the /sentiment prefix so the container receives /text/analytics/... paths
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
    - host: cognitive.internal.company.com
      http:
        paths:
          - path: /sentiment(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: sentiment-analysis-service
                port:
                  number: 80
Python Client for Container Services
from dataclasses import dataclass
import logging
import time

import requests


@dataclass
class CognitiveServicesConfig:
    sentiment_url: str = "http://localhost:5000"
    keyphrase_url: str = "http://localhost:5001"
    language_url: str = "http://localhost:5002"
    ocr_url: str = "http://localhost:5003"


class LocalCognitiveServices:
    def __init__(self, config: CognitiveServicesConfig):
        self.config = config
        self.logger = logging.getLogger(__name__)

    def _call_api(self, base_url: str, endpoint: str, documents: list) -> dict:
        """Make an API call to a container."""
        url = f"{base_url}/text/analytics/v3.0/{endpoint}"
        try:
            response = requests.post(
                url,
                headers={"Content-Type": "application/json"},
                json={"documents": documents},
                timeout=30
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            self.logger.error(f"API call failed: {e}")
            raise

    def analyze_sentiment(self, texts: list[str], language: str = "en") -> list[dict]:
        """Analyze sentiment of texts."""
        documents = [
            {"id": str(i), "text": text, "language": language}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.sentiment_url, "sentiment", documents)
        return result.get("documents", [])

    def extract_keyphrases(self, texts: list[str], language: str = "en") -> list[dict]:
        """Extract key phrases from texts."""
        documents = [
            {"id": str(i), "text": text, "language": language}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.keyphrase_url, "keyPhrases", documents)
        return result.get("documents", [])

    def detect_language(self, texts: list[str]) -> list[dict]:
        """Detect the language of texts."""
        documents = [
            {"id": str(i), "text": text}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.language_url, "languages", documents)
        return result.get("documents", [])

    def ocr_image(self, image_path: str) -> dict:
        """Extract text from an image using OCR."""
        url = f"{self.config.ocr_url}/vision/v3.2/read/analyze"
        with open(image_path, "rb") as f:
            response = requests.post(
                url,
                headers={"Content-Type": "application/octet-stream"},
                data=f.read(),
                timeout=30
            )
        response.raise_for_status()
        # The Read API is asynchronous - poll the Operation-Location URL for results
        operation_url = response.headers["Operation-Location"]
        while True:
            result = requests.get(operation_url, timeout=30).json()
            if result["status"] == "succeeded":
                return result["analyzeResult"]
            if result["status"] == "failed":
                raise RuntimeError("OCR failed")
            time.sleep(1)


# Usage
config = CognitiveServicesConfig()
cognitive = LocalCognitiveServices(config)

# Analyze text
texts = ["Azure containers are great for edge computing!"]
sentiment = cognitive.analyze_sentiment(texts)
keyphrases = cognitive.extract_keyphrases(texts)
print(f"Sentiment: {sentiment[0]['sentiment']}")
print(f"Key Phrases: {keyphrases[0]['keyPhrases']}")
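The analyzeResult that ocr_image returns nests text page by page. Assuming the Read v3.2 result shape (readResults → lines → text), a small helper (illustrative, not part of any SDK) flattens it into plain lines:

```python
def extract_lines(analyze_result: dict) -> list[str]:
    """Flatten a Read v3.2 analyzeResult into a flat list of text lines."""
    return [
        line["text"]
        for page in analyze_result.get("readResults", [])
        for line in page.get("lines", [])
    ]


# Minimal stand-in for a real Read response
sample = {"readResults": [{"lines": [{"text": "Invoice #1234"}, {"text": "Total: $99.00"}]}]}
print(extract_lines(sample))  # ['Invoice #1234', 'Total: $99.00']
```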
Monitoring and Health Checks
from dataclasses import dataclass
from datetime import datetime

import requests


@dataclass
class ContainerHealth:
    name: str
    url: str
    healthy: bool
    response_time_ms: float
    last_check: datetime


class ContainerMonitor:
    def __init__(self, containers: dict[str, str]):
        self.containers = containers

    def check_health(self, name: str, url: str) -> ContainerHealth:
        """Check the health of a single container."""
        start = datetime.now()
        try:
            response = requests.get(f"{url}/status", timeout=5)
            healthy = response.status_code == 200
        except requests.exceptions.RequestException:
            healthy = False
        elapsed = (datetime.now() - start).total_seconds() * 1000
        return ContainerHealth(
            name=name,
            url=url,
            healthy=healthy,
            response_time_ms=elapsed,
            last_check=datetime.now()
        )

    def check_all(self) -> list[ContainerHealth]:
        """Check the health of all containers."""
        return [
            self.check_health(name, url)
            for name, url in self.containers.items()
        ]


# Usage
monitor = ContainerMonitor({
    "sentiment": "http://localhost:5000",
    "keyphrase": "http://localhost:5001",
    "language": "http://localhost:5002"
})
for health in monitor.check_all():
    status = "OK" if health.healthy else "FAILED"
    print(f"{health.name}: {status} ({health.response_time_ms:.0f}ms)")
Cost Comparison
| Dimension | Cloud API | Container |
|---|---|---|
| 1M transactions/month | ~$1,000 | ~$200 (compute) |
| Latency | 100-500ms | 10-50ms |
| Offline | No | Yes |
| Data residency | Cloud region | On-premises |
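The cost figures above are indicative, so check them against your own pricing. As a back-of-envelope sketch (both numbers below are hypothetical, not quoted Azure prices), the break-even volume where a flat-cost container beats per-transaction cloud pricing is:

```python
import math


def container_breakeven_volume(cloud_price_per_1k: float, monthly_compute_cost: float) -> int:
    """Smallest monthly transaction count at which a self-hosted container
    (flat compute cost) becomes cheaper than per-transaction cloud pricing."""
    return math.ceil(monthly_compute_cost / cloud_price_per_1k * 1000)


# Hypothetical: $1.00 per 1,000 transactions vs $200/month of compute
print(container_breakeven_volume(1.00, 200.0))  # 200000
```

Below that volume the cloud API's pay-per-call model is cheaper; above it, the container's fixed compute cost wins.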
Containerized Cognitive Services make sense when you need low latency, high volume, or data sovereignty. The setup requires more infrastructure work, but the benefits are significant for the right use cases.