Running Azure Cognitive Services in Containers
Azure Cognitive Services can run in containers, enabling offline scenarios, reduced latency, and data sovereignty. This is crucial for edge computing and regulated industries. Let’s explore how to deploy them.
Why Containerized Cognitive Services?
- Latency: Process locally instead of round-trips to Azure
- Offline capability: Work without internet connectivity
- Data residency: Keep data on-premises
- Cost optimization: Reduce API call costs for high-volume scenarios
- Air-gapped environments: Deploy in secure networks
Available Container Images
# Text Analytics
mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment
mcr.microsoft.com/azure-cognitive-services/textanalytics/language
mcr.microsoft.com/azure-cognitive-services/textanalytics/keyphrase
# Computer Vision
mcr.microsoft.com/azure-cognitive-services/vision/read
# Speech
mcr.microsoft.com/azure-cognitive-services/speechservices/speech-to-text
mcr.microsoft.com/azure-cognitive-services/speechservices/text-to-speech
# Form Recognizer
mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout
mcr.microsoft.com/azure-cognitive-services/form-recognizer/invoice
Setting Up Sentiment Analysis Container
1. Create Azure Resource for Billing
Even containerized services require an Azure resource for billing:
# Create resource group
az group create --name rg-cognitive-services --location australiaeast

# Create Text Analytics resource
az cognitiveservices account create \
  --name my-text-analytics \
  --resource-group rg-cognitive-services \
  --kind TextAnalytics \
  --sku S \
  --location australiaeast

# Get keys and endpoint
az cognitiveservices account keys list \
  --name my-text-analytics \
  --resource-group rg-cognitive-services

az cognitiveservices account show \
  --name my-text-analytics \
  --resource-group rg-cognitive-services \
  --query "properties.endpoint"
2. Run the Container
docker run -d \
  --name sentiment \
  -p 5000:5000 \
  -e Eula=accept \
  -e Billing=https://australiaeast.api.cognitive.microsoft.com/ \
  -e ApiKey=your-api-key \
  mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
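The container can take a minute or two to start while it loads its models, so it's worth waiting before sending traffic. A minimal sketch of a readiness wait, assuming the container is published on localhost:5000 as above and exposes the standard /status endpoint (the helper name and parameters are illustrative, not part of any SDK):

```python
import time

import requests


def wait_until_ready(base_url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll the container's /status endpoint until it returns HTTP 200,
    giving up after timeout_s seconds. Returns True once the container is up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if requests.get(f"{base_url}/status", timeout=5).status_code == 200:
                return True
        except requests.exceptions.RequestException:
            pass  # container still starting; retry after a short pause
        time.sleep(interval_s)
    return False
```

Call `wait_until_ready("http://localhost:5000")` after `docker run` and before making your first request.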
3. Call the Local API
import requests

def analyze_sentiment_local(texts: list[str]) -> dict:
    """Analyze sentiment using the local container."""
    url = "http://localhost:5000/text/analytics/v3.0/sentiment"
    documents = [
        {"id": str(i), "text": text, "language": "en"}
        for i, text in enumerate(texts)
    ]
    response = requests.post(
        url,
        headers={"Content-Type": "application/json"},
        json={"documents": documents}
    )
    response.raise_for_status()
    return response.json()

# Usage
texts = [
    "Azure is fantastic for enterprise workloads!",
    "The deployment failed again, this is frustrating.",
    "The weather is nice today."
]
results = analyze_sentiment_local(texts)
for doc in results["documents"]:
    print(f"Text {doc['id']}: {doc['sentiment']} (confidence: {doc['confidenceScores']})")
Docker Compose for Multiple Services
version: '3.8'

services:
  sentiment:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
    ports:
      - "5000:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G

  keyphrase:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/keyphrase:latest
    ports:
      - "5001:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G

  language-detection:
    image: mcr.microsoft.com/azure-cognitive-services/textanalytics/language:latest
    ports:
      - "5002:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 2G

  ocr:
    image: mcr.microsoft.com/azure-cognitive-services/vision/read:3.2
    ports:
      - "5003:5000"
    environment:
      - Eula=accept
      - Billing=${COGNITIVE_SERVICES_ENDPOINT}
      - ApiKey=${COGNITIVE_SERVICES_KEY}
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
Kubernetes Deployment
The Deployment reads its billing endpoint and API key from a Secret, so create the namespace and Secret first:
kubectl create namespace cognitive-services
kubectl create secret generic cognitive-services-secret \
  --namespace cognitive-services \
  --from-literal=endpoint=https://australiaeast.api.cognitive.microsoft.com/ \
  --from-literal=key=your-api-key
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-analysis
  namespace: cognitive-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sentiment-analysis
  template:
    metadata:
      labels:
        app: sentiment-analysis
    spec:
      containers:
        - name: sentiment
          image: mcr.microsoft.com/azure-cognitive-services/textanalytics/sentiment:latest
          ports:
            - containerPort: 5000
          env:
            - name: Eula
              value: "accept"
            - name: Billing
              valueFrom:
                secretKeyRef:
                  name: cognitive-services-secret
                  key: endpoint
            - name: ApiKey
              valueFrom:
                secretKeyRef:
                  name: cognitive-services-secret
                  key: key
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi"
              cpu: "2"
          readinessProbe:
            httpGet:
              path: /ready
              port: 5000
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /status
              port: 5000
            initialDelaySeconds: 60
            periodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: sentiment-analysis-service
  namespace: cognitive-services
spec:
  selector:
    app: sentiment-analysis
  ports:
    - port: 80
      targetPort: 5000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cognitive-services-ingress
  namespace: cognitive-services
  annotations:
    kubernetes.io/ingress.class: nginx
    # Strip the /sentiment prefix so the container receives /text/analytics/... paths
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
    - host: cognitive.internal.company.com
      http:
        paths:
          - path: /sentiment(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: sentiment-analysis-service
                port:
                  number: 80
Python Client for Container Services
from dataclasses import dataclass
import logging
import time

import requests


@dataclass
class CognitiveServicesConfig:
    sentiment_url: str = "http://localhost:5000"
    keyphrase_url: str = "http://localhost:5001"
    language_url: str = "http://localhost:5002"
    ocr_url: str = "http://localhost:5003"


class LocalCognitiveServices:
    def __init__(self, config: CognitiveServicesConfig):
        self.config = config
        self.logger = logging.getLogger(__name__)

    def _call_api(self, base_url: str, endpoint: str, documents: list) -> dict:
        """Make an API call to a container."""
        url = f"{base_url}/text/analytics/v3.0/{endpoint}"
        try:
            response = requests.post(
                url,
                headers={"Content-Type": "application/json"},
                json={"documents": documents},
                timeout=30
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            self.logger.error(f"API call failed: {e}")
            raise

    def analyze_sentiment(self, texts: list[str], language: str = "en") -> list[dict]:
        """Analyze sentiment of texts."""
        documents = [
            {"id": str(i), "text": text, "language": language}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.sentiment_url, "sentiment", documents)
        return result.get("documents", [])

    def extract_keyphrases(self, texts: list[str], language: str = "en") -> list[dict]:
        """Extract key phrases from texts."""
        documents = [
            {"id": str(i), "text": text, "language": language}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.keyphrase_url, "keyPhrases", documents)
        return result.get("documents", [])

    def detect_language(self, texts: list[str]) -> list[dict]:
        """Detect the language of texts."""
        documents = [
            {"id": str(i), "text": text}
            for i, text in enumerate(texts)
        ]
        result = self._call_api(self.config.language_url, "languages", documents)
        return result.get("documents", [])

    def ocr_image(self, image_path: str) -> dict:
        """Extract text from an image using OCR."""
        url = f"{self.config.ocr_url}/vision/v3.2/read/analyze"
        with open(image_path, "rb") as f:
            response = requests.post(
                url,
                headers={"Content-Type": "application/octet-stream"},
                data=f.read(),
                timeout=30
            )
        response.raise_for_status()
        # The Read API is asynchronous - poll the Operation-Location URL for results
        operation_url = response.headers["Operation-Location"]
        while True:
            result = requests.get(operation_url, timeout=30).json()
            if result["status"] == "succeeded":
                return result["analyzeResult"]
            if result["status"] == "failed":
                raise RuntimeError("OCR failed")
            time.sleep(1)


# Usage
config = CognitiveServicesConfig()
cognitive = LocalCognitiveServices(config)

# Analyze text
texts = ["Azure containers are great for edge computing!"]
sentiment = cognitive.analyze_sentiment(texts)
keyphrases = cognitive.extract_keyphrases(texts)
print(f"Sentiment: {sentiment[0]['sentiment']}")
print(f"Key Phrases: {keyphrases[0]['keyPhrases']}")
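The analyzeResult that ocr_image returns nests text page by page. Assuming the Read v3.2 result shape (readResults → lines → text), a small helper (illustrative, not part of any SDK) flattens it into plain lines:

```python
def extract_lines(analyze_result: dict) -> list[str]:
    """Flatten a Read v3.2 analyzeResult into a flat list of text lines."""
    return [
        line["text"]
        for page in analyze_result.get("readResults", [])
        for line in page.get("lines", [])
    ]


# Minimal stand-in for a real Read response
sample = {"readResults": [{"lines": [{"text": "Invoice #1234"}, {"text": "Total: $99.00"}]}]}
print(extract_lines(sample))  # ['Invoice #1234', 'Total: $99.00']
```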
Monitoring and Health Checks
from dataclasses import dataclass
from datetime import datetime

import requests


@dataclass
class ContainerHealth:
    name: str
    url: str
    healthy: bool
    response_time_ms: float
    last_check: datetime


class ContainerMonitor:
    def __init__(self, containers: dict[str, str]):
        self.containers = containers

    def check_health(self, name: str, url: str) -> ContainerHealth:
        """Check the health of a single container."""
        start = datetime.now()
        try:
            response = requests.get(f"{url}/status", timeout=5)
            healthy = response.status_code == 200
        except requests.exceptions.RequestException:
            healthy = False
        elapsed = (datetime.now() - start).total_seconds() * 1000
        return ContainerHealth(
            name=name,
            url=url,
            healthy=healthy,
            response_time_ms=elapsed,
            last_check=datetime.now()
        )

    def check_all(self) -> list[ContainerHealth]:
        """Check the health of all containers."""
        return [
            self.check_health(name, url)
            for name, url in self.containers.items()
        ]


# Usage
monitor = ContainerMonitor({
    "sentiment": "http://localhost:5000",
    "keyphrase": "http://localhost:5001",
    "language": "http://localhost:5002"
})
for health in monitor.check_all():
    status = "OK" if health.healthy else "FAILED"
    print(f"{health.name}: {status} ({health.response_time_ms:.0f}ms)")
Cost Comparison
| Dimension | Cloud API | Container |
|---|---|---|
| 1M transactions/month | ~$1,000 | ~$200 (compute) |
| Latency | 100-500ms | 10-50ms |
| Offline | No | Yes |
| Data residency | Cloud region | On-premises |
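The cost figures above are indicative, so check them against your own pricing. As a back-of-envelope sketch (both numbers below are hypothetical, not quoted Azure prices), the break-even volume where a flat-cost container beats per-transaction cloud pricing is:

```python
import math


def container_breakeven_volume(cloud_price_per_1k: float, monthly_compute_cost: float) -> int:
    """Smallest monthly transaction count at which a self-hosted container
    (flat compute cost) becomes cheaper than per-transaction cloud pricing."""
    return math.ceil(monthly_compute_cost / cloud_price_per_1k * 1000)


# Hypothetical: $1.00 per 1,000 transactions vs $200/month of compute
print(container_breakeven_volume(1.00, 200.0))  # 200000
```

Below that volume the cloud API's pay-per-call model is cheaper; above it, the container's fixed compute cost wins.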
Containerized Cognitive Services make sense when you need low latency, high volume, or data sovereignty. The setup requires more infrastructure work, but the benefits are significant for the right use cases.