May 6, 2024

Deploying GPT-4o on Azure OpenAI Service

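GPT-4o is available on Azure OpenAI Service with the enterprise features covered in this post: private networking, managed identity access, content filtering, monitoring, and cost controls. Here’s how to deploy and configure it.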

Creating a GPT-4o Deployment

Azure Portal

1. Navigate to your Azure OpenAI resource
2. Go to Model deployments > Deploy model
3. Select gpt-4o from the model list
4. Configure deployment settings
5. Deploy

Azure CLI

# Create Azure OpenAI resource (if needed)
az cognitiveservices account create \
    --name myopenai \
    --resource-group mygroup \
    --kind OpenAI \
    --sku S0 \
    --location eastus

# Deploy GPT-4o model
az cognitiveservices account deployment create \
    --name myopenai \
    --resource-group mygroup \
    --deployment-name gpt4o-deployment \
    --model-name gpt-4o \
    --model-version "2024-05-13" \
    --model-format OpenAI \
    --sku-capacity 10 \
    --sku-name Standard
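
The `--sku-capacity 10` value above is expressed in capacity units, where one unit of a Standard deployment corresponds to 1,000 tokens per minute (TPM). The sketch below converts a capacity value into approximate rate limits. The requests-per-minute ratio (6 RPM per 1,000 TPM) is an assumption that holds for many Azure OpenAI models but should be checked against the quota page for your model; `limits_for_capacity` is a hypothetical helper, not part of any SDK.

```python
from dataclasses import dataclass

TOKENS_PER_CAPACITY_UNIT = 1_000  # one Standard capacity unit = 1,000 TPM
ASSUMED_RPM_PER_UNIT = 6          # assumption: verify the ratio for your model


@dataclass(frozen=True)
class RateLimits:
    tokens_per_minute: int
    requests_per_minute: int


def limits_for_capacity(capacity: int) -> RateLimits:
    """Translate a deployment's sku capacity into approximate rate limits."""
    if capacity <= 0:
        raise ValueError("capacity must be a positive integer")
    return RateLimits(
        tokens_per_minute=capacity * TOKENS_PER_CAPACITY_UNIT,
        requests_per_minute=capacity * ASSUMED_RPM_PER_UNIT,
    )


if __name__ == "__main__":
    # Capacity 10, as used in the CLI example
    print(limits_for_capacity(10))
```

Under these assumptions, a capacity of 10 gives a deployment roughly 10,000 TPM to work with.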

Bicep/ARM Template

resource openaiAccount 'Microsoft.CognitiveServices/accounts@2023-10-01-preview' = {
  name: 'myopenai'
  location: 'eastus'
  kind: 'OpenAI'
  sku: {
    name: 'S0'
  }
  properties: {
    customSubDomainName: 'myopenai'
  }
}

resource gpt4oDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview' = {
  parent: openaiAccount
  name: 'gpt4o'
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4o'
      version: '2024-05-13'
    }
  }
  sku: {
    name: 'Standard'
    capacity: 10
  }
}

Regional Availability

As of May 2024, GPT-4o is available in:

Region	Status
East US	GA
East US 2	GA
West US	GA
West US 3	GA
North Central US	GA
South Central US	GA
West Europe	GA
Sweden Central	GA
UK South	Preview
Australia East	Preview

Quota and Rate Limits

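GPT-4o quota is allocated separately from GPT-4 Turbo quota, so capacity assigned to one model does not reduce what is available to the other. The script below lists each deployment and the capacity assigned to it:

# Inspect the capacity assigned to each deployment
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

# Assumes the subscription ID is exported as AZURE_SUBSCRIPTION_ID
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]

credential = DefaultAzureCredential()
client = CognitiveServicesManagementClient(credential, subscription_id)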

# List deployments and their quotas
deployments = client.deployments.list(
    resource_group_name="mygroup",
    account_name="myopenai"
)

for deployment in deployments:
    print(f"Deployment: {deployment.name}")
    print(f"  Model: {deployment.properties.model.name}")
    # Capacity is reported in units of 1,000 tokens per minute
    print(f"  Capacity: {deployment.sku.capacity}K TPM")
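
When a deployment exceeds its assigned rate limit, the service rejects requests with HTTP 429. A common way to cope is to retry with exponential backoff and jitter. The following is a minimal sketch of that pattern using only the standard library; `RateLimited` is a hypothetical stand-in for the SDK's rate-limit exception, and `call_with_backoff` is an illustrative helper rather than an Azure or OpenAI API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's HTTP 429 exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: tuple = (RateLimited,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on rate-limit errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay ceiling doubles each attempt, capped at max_delay
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the ceiling
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

With the `openai` package you would pass `retry_on=(openai.RateLimitError,)` and wrap the request in a lambda. The client also accepts a `max_retries` option, which may be enough on its own for simple cases.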

Configuring the Client

from openai import AzureOpenAI
import os

# Using API key authentication
client = AzureOpenAI(
    api_version="2024-05-01-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"]
)

# Using Microsoft Entra ID (formerly Azure AD) authentication (recommended)
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(
    credential,
    "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    api_version="2024-05-01-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_ad_token_provider=token_provider
)
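
If the same code has to run both locally with a key and in Azure with an identity, the choice between the two clients above can be made from the environment. This is a sketch under that assumption: `client_kwargs` is a hypothetical helper that returns the keyword arguments for `AzureOpenAI`, using the API key only when `AZURE_OPENAI_KEY` is set and falling back to a token provider otherwise.

```python
import os
from typing import Any, Mapping

API_VERSION = "2024-05-01-preview"
TOKEN_SCOPE = "https://cognitiveservices.azure.com/.default"


def client_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Build AzureOpenAI constructor arguments from environment settings."""
    kwargs: dict[str, Any] = {
        "api_version": API_VERSION,
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],  # required
    }
    api_key = env.get("AZURE_OPENAI_KEY")
    if api_key:
        # Key auth: intended for local development only
        kwargs["api_key"] = api_key
    else:
        # Identity-based auth; imported lazily so key-only setups
        # do not need the azure-identity package installed
        from azure.identity import (
            DefaultAzureCredential,
            get_bearer_token_provider,
        )

        kwargs["azure_ad_token_provider"] = get_bearer_token_provider(
            DefaultAzureCredential(), TOKEN_SCOPE
        )
    return kwargs
```

Usage would be `client = AzureOpenAI(**client_kwargs())`. Leaving `AZURE_OPENAI_KEY` unset in production keeps deployed workloads on identity-based authentication.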

Network Security Configuration

Private Endpoint

resource privateEndpoint 'Microsoft.Network/privateEndpoints@2023-05-01' = {
  name: 'openai-pe'
  location: 'eastus'
  properties: {
    subnet: {
      id: subnetId
    }
    privateLinkServiceConnections: [
      {
        name: 'openai-connection'
        properties: {
          privateLinkServiceId: openaiAccount.id
          groupIds: ['account']
        }
      }
    ]
  }
}

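// For name resolution to work, this zone must also be linked to the VNet
// (virtualNetworkLinks) and attached to the endpoint (privateDnsZoneGroups)
resource privateDnsZone 'Microsoft.Network/privateDnsZones@2020-06-01' = {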
  name: 'privatelink.openai.azure.com'
  location: 'global'
}
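
Once the private endpoint and DNS zone are in place, it can be worth confirming from inside the virtual network that the endpoint hostname resolves to a private address. The sketch below does this with the standard library only; `resolve_addresses` and `all_private` are hypothetical helper names, and the check assumes it runs on a machine that uses the VNet's DNS.

```python
import ipaddress
import socket
from typing import Iterable
from urllib.parse import urlparse


def resolve_addresses(endpoint: str) -> list[str]:
    """Resolve the hostname of an endpoint URL to its IP addresses."""
    host = urlparse(endpoint).hostname
    if host is None:
        raise ValueError(f"not a valid endpoint URL: {endpoint!r}")
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    # Each entry's sockaddr tuple starts with the IP address string
    return sorted({info[4][0] for info in infos})


def all_private(addresses: Iterable[str]) -> bool:
    """True if there is at least one address and every one is private."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(a).is_private for a in addresses
    )


if __name__ == "__main__":
    import os

    endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
    if endpoint:
        ips = resolve_addresses(endpoint)
        print(ips, "private" if all_private(ips) else "PUBLIC")
```

A public address in the output usually means the private DNS zone is not linked to the network the check ran from.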

Managed Identity Access

resource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(openaiAccount.id, functionApp.id, 'Cognitive Services OpenAI User')
  scope: openaiAccount
  properties: {
    roleDefinitionId: subscriptionResourceId(
      'Microsoft.Authorization/roleDefinitions',
      '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd' // Cognitive Services OpenAI User
    )
    principalId: functionApp.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

Content Filtering

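Azure OpenAI applies content filtering to prompts and completions by default. A prompt that trips the filter is rejected with an HTTP 400 error:

import openai

try:
    response = client.chat.completions.create(
        model="gpt4o",  # deployment name, not the underlying model name
        messages=[{"role": "user", "content": user_input}]
    )
except openai.BadRequestError as e:
    if "content_filter" in str(e):
        # The prompt tripped the content filter; handle it gracefully
        print("Content was filtered due to policy")
    else:
        # Any other 400 is a genuine request error; don't swallow it
        raise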
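
Matching on the error string works, but it is brittle. A slightly more structured check is sketched below. It assumes the exception exposes a `code` attribute set to `"content_filter"` for filtered prompts, and that a filtered completion is reported through `finish_reason == "content_filter"` on the affected choice; both helpers are hypothetical names and use duck typing, so they can be tried without calling the service.

```python
from types import SimpleNamespace
from typing import Any


def is_content_filter_error(exc: Exception) -> bool:
    """True if the error looks like a content-filter rejection of the prompt.

    Assumes the SDK exception carries the service error code in `.code`.
    """
    return getattr(exc, "code", None) == "content_filter"


def filtered_choices(response: Any) -> list[int]:
    """Indexes of choices whose output was cut off by the content filter."""
    return [
        choice.index
        for choice in response.choices
        if choice.finish_reason == "content_filter"
    ]


if __name__ == "__main__":
    # Fake response object standing in for a chat completion
    fake = SimpleNamespace(
        choices=[
            SimpleNamespace(index=0, finish_reason="stop"),
            SimpleNamespace(index=1, finish_reason="content_filter"),
        ]
    )
    print(filtered_choices(fake))
```

Checking `finish_reason` matters because a filtered completion arrives as a normal response rather than as an exception.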

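To use a custom content filter, create the filter policy first, then reference it by name when creating the deployment:

# Create a deployment that uses an existing custom content filter policy
# (check the flag name with: az cognitiveservices account deployment create --help)
az cognitiveservices account deployment create \
    --name myopenai \
    --resource-group mygroup \
    --deployment-name gpt4o-filtered \
    --model-name gpt-4o \
    --model-version "2024-05-13" \
    --model-format OpenAI \
    --sku-capacity 10 \
    --sku-name Standard \
    --rai-policy-name "custom-filter"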

Monitoring and Diagnostics

Enable diagnostic settings:

resource diagnostics 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: 'openai-diagnostics'
  scope: openaiAccount
  properties: {
    workspaceId: logAnalyticsWorkspace.id
    logs: [
      {
        category: 'Audit'
        enabled: true
      }
      {
        category: 'RequestResponse'
        enabled: true
      }
    ]
    metrics: [
      {
        category: 'AllMetrics'
        enabled: true
      }
    ]
  }
}

Query usage metrics:

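// Azure Monitor query for GPT-4o usage
// Column names such as model_s and total_tokens_s depend on your workspace's log schema; adjust to match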
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| where model_s == "gpt-4o"
| summarize
    TotalRequests = count(),
    TotalTokens = sum(toint(total_tokens_s)),
    AvgLatency = avg(DurationMs)
    by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
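
As a client-side cross-check on the query above, token counts can also be accumulated from the `usage` field that each chat completion response carries. The sketch below is one way to do that; `UsageTracker` is a hypothetical helper, and it assumes responses expose `usage.prompt_tokens` and `usage.completion_tokens` (the field may be absent, for example on streamed responses).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any


@dataclass
class UsageTotals:
    requests: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


class UsageTracker:
    """Accumulates request and token counts per deployment name."""

    def __init__(self) -> None:
        self._totals: dict[str, UsageTotals] = defaultdict(UsageTotals)

    def record(self, deployment: str, response: Any) -> None:
        """Add one response's usage to the running totals."""
        totals = self._totals[deployment]
        totals.requests += 1
        usage = getattr(response, "usage", None)
        if usage is not None:  # usage can be missing on some responses
            totals.prompt_tokens += usage.prompt_tokens
            totals.completion_tokens += usage.completion_tokens

    def summary(self, deployment: str) -> UsageTotals:
        return self._totals[deployment]
```

Calling `tracker.record("gpt4o", response)` after each request gives totals that can be compared with what the diagnostic logs report.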

Cost Management

Set up budgets and alerts:

resource budget 'Microsoft.Consumption/budgets@2023-05-01' = {
  name: 'openai-budget'
  properties: {
    category: 'Cost'
    amount: 1000
    timeGrain: 'Monthly'
    timePeriod: {
      startDate: '2024-05-01'
    }
    filter: {
      dimensions: {
        name: 'ResourceId'
        values: [openaiAccount.id]
      }
    }
    notifications: {
      forecastAlert: {
        enabled: true
        threshold: 80
        operator: 'GreaterThan'
        // Without thresholdType the alert defaults to actual spend
        thresholdType: 'Forecasted'
        contactEmails: ['admin@company.com']
      }
    }
  }
}
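
The `threshold` in a budget notification is a percentage of `amount`, so the budget above (amount 1000, threshold 80, operator `GreaterThan`) notifies once the evaluated cost goes above 800 in the billing currency. The arithmetic is sketched below; both function names are hypothetical and exist only to make the rule explicit.

```python
def alert_amount(budget: float, threshold_percent: float) -> float:
    """Cost level at which a notification with this threshold triggers."""
    return budget * threshold_percent / 100


def should_alert(cost: float, budget: float, threshold_percent: float) -> bool:
    """Mirror of the 'GreaterThan' operator: strictly above the threshold."""
    return cost > alert_amount(budget, threshold_percent)


if __name__ == "__main__":
    # Values taken from the budget definition: amount 1000, threshold 80
    print(alert_amount(1000, 80))
    print(should_alert(850, 1000, 80))
```

Whether `cost` means spend to date or the projected month-end figure depends on whether the notification evaluates actual or forecasted cost.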

Best Practices

1. Use Azure AD authentication: more secure than API keys
2. Enable private endpoints: keep traffic on the Azure backbone
3. Set up diagnostic logging: monitor usage and costs
4. Configure content filters: meet compliance requirements
5. Use managed identities: avoid credential management

What’s Next

Tomorrow I’ll cover cost reduction strategies when using GPT-4o.