Azure

Overview

Azure is a cloud provider offering access to OpenAI and Anthropic models through the Azure OpenAI Service. DeepIntShield performs conversions including:

Deployment mapping - Model identifiers mapped to Azure deployment IDs with version handling
Authentication modes - API key, Entra ID (Service Principal), or Managed Identity (DefaultAzureCredential) with automatic environment detection
Model routing - Automatic provider detection (OpenAI vs Anthropic) based on deployment
API versioning - Configurable API versions with preview support for Responses API
Custom endpoints - Full control over Azure endpoint configuration
Multi-model support - Unified interface for OpenAI, Anthropic (via Azure), and Gemini models
Request/response pass-through - Support for raw request/response bodies for advanced use cases

Supported Operations

Operation	Non-Streaming	Streaming	Endpoint
Chat Completions	✅	✅	`/openai/v1/chat/completions`
Responses API	✅	✅	`/openai/v1/responses`
Embeddings	✅	-	`/openai/v1/embeddings`
Files	✅	-	`/openai/v1/files`
List Models	✅	-	`/openai/v1/models`
Image Generation	✅	✅	`/openai/v1/images/generations`
Image Edit	✅	✅	`/openai/v1/images/edits`
Video Generation	✅	-	`/openai/v1/videos`
Image Variation	❌	❌	-
Batch	❌	❌	-
Text Completions	❌	❌	-
Speech (TTS)	❌	❌	-

1. Chat Completions

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`model`	Mapped to `deployment_id`	Supports version matching and base model matching
`max_completion_tokens`	Direct pass-through	OpenAI models only
`temperature`, `top_p`	Direct pass-through	Same across all models
All other params	Model-specific conversion	Converted per underlying provider (OpenAI/Anthropic)

Authentication Configuration

Azure uses custom endpoint and deployment configuration:

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/gpt-4-deployment",
    "messages": [{"role": "user", "content": "Hello"}],
    "deployment": "my-gpt4-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ChatCompletionRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldChatRequest{
    Provider: schemas.Azure,
    Model:    "gpt-4",
    Input:    messages,
    Params: &schemas.ChatParameters{
        ExtraParams: map[string]interface{}{
            "deployment": "my-gpt4-deployment",
            "endpoint": "https://my-org.openai.azure.com",
        },
    },
})

Key Configuration

Azure supports three authentication methods: Managed Identity (DefaultAzureCredential), Entra ID (Service Principal), and Direct (API Key). Precedence: Entra ID (if configured) → API key (if value set) → DefaultAzureCredential.

Managed Identity / DefaultAzureCredential

If no API key and no Entra ID credentials are provided, DeepIntShield automatically uses DefaultAzureCredential, which detects the auth environment.

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment"
    }
  }
}

Azure Entra ID (Service Principal)

If you set client_id, client_secret, and tenant_id, Azure Entra ID authentication will be used with priority over API key authentication.

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "client_id": "your-client-id",
    "client_secret": "your-client-secret",
    "tenant_id": "your-tenant-id",
    "scopes": ["https://cognitiveservices.azure.com/.default"],
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Required Azure Roles:

For OpenAI models: Cognitive Services OpenAI User
For Anthropic models: Cognitive Services AI Services User

Direct Authentication (API Key)

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Configuration Details:

endpoint - Azure OpenAI resource endpoint (required)
client_id - Azure Entra ID client ID (optional, for Service Principal auth)
client_secret - Azure Entra ID client secret (optional, for Service Principal auth)
tenant_id - Azure Entra ID tenant ID (optional, for Service Principal auth)
scopes - OAuth scopes for token requests (default: ["https://cognitiveservices.azure.com/.default"])
api_version - API version to use (default: 2024-10-21)
deployments - Map of model names to deployment IDs (optional, can be provided per-request)
allowed_models - List of allowed models to use from this key (optional)

Deployment Selection

Deployments can be specified at three levels (in order of precedence):

Per-request (highest priority)
```
{"deployment": "custom-deployment"}
```

Key configuration

{"deployments": {"gpt-4": "my-gpt4-deployment"}}

Model name (lowest priority, if no deployment specified) Model name is used as deployment ID directly

OpenAI Models

When using OpenAI models (GPT-4, GPT-4 Turbo, GPT-3.5-Turbo, etc.), DeepIntShield passes through OpenAI-compatible parameters directly.

Parameter Mapping for OpenAI

All OpenAI-standard parameters are supported. Refer to OpenAI documentation for detailed conversion details.

Anthropic Models

When using Anthropic models through Azure (Claude 3 family), DeepIntShield converts requests to Anthropic format.

Parameter Mapping for Anthropic

All Anthropic-standard parameters are supported with special handling:

Reasoning/Thinking: reasoning parameters converted to Anthropic’s thinking structure
System messages: Extracted and placed in separate system field
Tool message grouping: Consecutive tool messages merged

Refer to Anthropic documentation for detailed conversion details.

Special Notes for Azure + Anthropic

API version automatically set to 2023-06-01 for Anthropic models
Endpoints use /anthropic/v1/ paths internally
Authentication uses x-api-key header for Anthropic models
Minimum reasoning budget: 1024 tokens

API Versioning

Default version: 2024-10-21 (supports latest OpenAI features)
Preview version: preview (used for Responses API)
Custom version: Set via api_version in key config

Different API versions may have different feature support. DeepIntShield automatically adjusts endpoint paths and parameters based on the configured version.

Streaming

Streaming uses OpenAI or Anthropic format depending on model type:

OpenAI models: Standard OpenAI streaming with chat.completion.chunk events
Anthropic models: Anthropic streaming format with content blocks

2. Responses API

The Responses API is available for both OpenAI and Anthropic models on Azure and uses the preview API version.

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`instructions`	Becomes system message	Model-specific conversion
`input`	Converted to user message(s)	String or array support
`max_output_tokens`	Model-specific field mapping	OpenAI vs Anthropic conversion
All other params	Model-specific conversion	Converted per underlying provider

OpenAI Models

For OpenAI models (GPT-4, etc.), conversion follows OpenAI’s Responses API format.

Anthropic Models

For Anthropic models (Claude, etc.), conversion follows Anthropic’s message format:

instructions becomes system message
reasoning mapped to thinking structure

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/claude-3-sonnet",
    "input": "Hello, how are you?",
    "instructions": "You are a helpful assistant",
    "deployment": "my-claude-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ResponsesRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldResponsesRequest{
    Provider: schemas.Azure,
    Model:    "claude-3-sonnet",
    Input:    messages,
    Params: &schemas.ResponsesParameters{
        Instructions: schemas.Ptr("You are a helpful assistant"),
    },
})

Special Handling

Uses /openai/v1/responses endpoint with preview API version
All request body conversions handled automatically
Supports raw request body passthrough for advanced cases

OpenAI Models - gpt-oss Special Message Handling: For OpenAI models through Azure, see OpenAI Responses API documentation for details on special gpt-oss model handling regarding reasoning conversion (summaries vs. content blocks).

Anthropic Models: Refer to Anthropic Responses API for parameter details.

3. Embeddings

Embeddings are supported for OpenAI models only (not available for Anthropic models on Azure).

Request Parameters

Parameter	Azure Handling
`input`	Direct pass-through
`model`	Mapped to deployment
`dimensions`	Direct pass-through (when supported)

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": ["text to embed"],
    "deployment": "my-embedding-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.EmbeddingRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldEmbeddingRequest{
    Provider: schemas.Azure,
    Model:    "text-embedding-3-small",
    Input: &schemas.EmbeddingInput{
        Texts: []string{"text to embed"},
    },
})

Response Conversion

Embeddings response is passed through directly from Azure OpenAI with standard format:

{
  "data": [
    {
      "object": "embedding",
      "embedding": [0.1234, -0.5678, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

4. Files API

Files operations are supported for OpenAI models only.

Supported Operations

Operation	Support
Upload	✅
List	✅
Retrieve	✅
Delete	✅
Get Content	✅

Files are stored in Azure and can be used with batch operations.

5. Image Generation

Image Generation is supported for OpenAI models on Azure and uses the OpenAI-compatible format.

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`model`	Mapped to `deployment_id`	Deployment ID must be configured
`prompt`	Direct pass-through	Prompt text for image generation
All other params	Direct pass-through	Uses OpenAI format

Azure uses the same conversion as OpenAI (see OpenAI Image Generation):

Model & Prompt: bifrostReq.Model → req.Model (mapped to deployment), bifrostReq.Prompt → req.Prompt
Parameters: All other fields from bifrostReq are embedded directly into the request struct via struct embedding

Configuration

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/dall-e-3",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 1,
    "deployment": "my-image-gen-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ImageGenerationRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldImageGenerationRequest{
    Provider: schemas.Azure,
    Model:    "dall-e-3",
    Input: &schemas.ImageGenerationInput{
        Prompt: "A sunset over the mountains",
    },
    Params: &schemas.ImageGenerationParameters{
        Size:    schemas.Ptr("1024x1024"),
        N:       schemas.Ptr(1),
    },
})

Response Conversion

Non-streaming: Azure responses are unmarshaled directly into DeepIntShieldImageGenerationResponse since DeepIntShield’s response schema is a superset of OpenAI’s format. All fields are passed through as-is.
Streaming: Azure streaming responses use Server-Sent Events (SSE) format with the same event types as OpenAI (see OpenAI Image Generation Streaming).

Streaming

Image generation streaming is supported and uses OpenAI’s streaming format with Server-Sent Events (SSE).

6. Image Edit

Image Edit is supported for OpenAI models on Azure and uses the OpenAI-compatible format.

Azure uses the same conversion as OpenAI (see OpenAI Image Edit):

Request Conversion: Uses openai.HandleOpenAIImageEditRequest with Azure-specific URL construction
URL Format: {endpoint}/openai/deployments/{deployment}/images/edits?api-version={apiVersion}
Authentication: Azure API key or OAuth bearer token (via getAzureAuthHeaders)
Deployment Mapping: Model identifier mapped to Azure deployment ID
Response Conversion: Same as OpenAI - responses unmarshaled directly into DeepIntShieldImageGenerationResponse
Streaming: Supported via openai.HandleOpenAIImageEditStreamRequest with Azure-specific URL and authentication

Endpoint: /openai/deployments/{deployment}/images/edits?api-version={apiVersion}

7. List Models

Request Parameters

None required.

Response Conversion

Lists available models/deployments configured in the Azure key. Response includes model metadata, capabilities, and lifecycle status.

{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882411,
      "status": "active",
      "lifecycle_status": "stable",
      "capabilities": {
        "chat_completion": true,
        "embeddings": false
      }
    }
  ]
}

Caveats

Deployment ID Required

Severity: High Behavior: Model names must map to Azure deployment IDs Impact: Request fails without valid deployment mapping Code: azure.go:145-200

Model Provider Detection

Severity: Medium Behavior: Automatic detection of OpenAI vs Anthropic based on model name Impact: Different conversion logic applied transparently Code: azure.go:92-114

Responses API Preview Version (including Anthropic)

Severity: Medium Behavior: Responses API automatically uses preview API version, which differs from Chat Completions API version. For Anthropic models, Responses API specifically uses preview API version. Impact: Different API version for Responses vs Chat Completions. Automatic version override for Responses requests. Code: azure.go:92-114, azure.go:109-113, azure.go:694

Version Matching for Deployments

Severity: Low Behavior: Model version differences ignored when matching to deployments Impact: gpt-4 and gpt-4-turbo can map to same deployment Code: models.go:13-58

8. Video Generation

Azure routes video generation to OpenAI’s Sora models via the Azure OpenAI-compatible endpoint. All parameters are identical to OpenAI Video Generation.

Supported Operations

Operation	Supported	Notes
Generate	✅	`POST /v1/videos`
Retrieve	✅	`GET /v1/videos/{id}`
Download	✅	`GET /v1/videos/{id}/content`
Delete	✅	`DELETE /v1/videos/{id}`
List	✅	`GET /v1/videos`
Remix	❌	Not supported

Configuration

HTTP Settings: API Version 2024-10-21 (configurable) | Max Connections 5000 | Max Idle 60 seconds

Endpoint Format: https://{resource-name}.openai.azure.com/openai/v1/{path}?api-version={version}

Note: DeepIntShield automatically constructs URLs using the endpoint from key configuration and the configured API version.

Setup & Configuration

Azure requires endpoint URLs, deployment mappings, and API version configuration. For detailed instructions on setting up Azure authentication, see the quickstart guides:

Gateway
Go SDK

See Provider-Specific Authentication - Azure in the Gateway Quickstart for configuration steps using Web UI, API, or config.json.