Routing
Overview
DeepIntShield’s governance-based routing gives you granular control over how requests are directed to AI models and providers through Virtual Key configuration. By configuring routing rules on a Virtual Key, you can enforce which providers and models are accessible, implement weighted load balancing, create automatic fallbacks, and restrict access to specific provider API keys.
This enables use cases such as:
- Resilience & Failover: Automatically fall back to a secondary provider if the primary one fails.
- Environment Separation: Dedicate specific virtual keys to development, testing, and production environments with different provider and key access.
- Cost Management: Route traffic to cheaper models or providers based on weights to optimize costs.
- Fine-grained Access Control: Ensure that different teams or applications use only the models and API keys they are explicitly permitted to use.
Provider/Model Restrictions
Virtual Keys can be restricted to specific providers and models. When provider/model restrictions are configured, the VK can access only the designated providers and models, giving fine-grained control over what different users or applications can use.
How It Works:
- No Restrictions (default): VK can use any available provider/models based on global configuration
- With Restrictions: VK limited to only the specified provider/models with weighted load balancing
Model Validation: When you configure provider restrictions on a Virtual Key, DeepIntShield validates that the requested model is allowed for the selected provider:
- Explicit allowed_models: If you specify models in the provider config, only those models are permitted
- Empty allowed_models: DeepIntShield uses the Model Catalog (populated from pricing data + list models API) to determine which models the provider supports
- Model Catalog Sync: On startup and provider updates, DeepIntShield calls each provider’s list models API. If this fails, you’ll see a warning:
{"level":"warn","message":"failed to list models for provider <name>: failed to execute HTTP request to provider API"}
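The validation rules above can be sketched in a few lines of Python. This is a minimal illustration, not DeepIntShield's implementation: `MODEL_CATALOG` and `is_model_allowed` are hypothetical names, and the catalog contents are sample data.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for the Model Catalog: provider name -> known models.
MODEL_CATALOG: Dict[str, Set[str]] = {
    "openai": {"gpt-4o", "gpt-4o-mini"},
    "anthropic": {"claude-3-sonnet"},
}


def is_model_allowed(provider: str, allowed_models: List[str], model: str) -> bool:
    """Return True if `model` may be served by `provider` under this config."""
    if allowed_models:
        # Explicit allowed_models: only the listed models are permitted.
        return model in allowed_models
    # Empty allowed_models: defer to the catalog's view of the provider.
    return model in MODEL_CATALOG.get(provider, set())
```

With an explicit list, the catalog is never consulted, which is why an explicit list keeps working even when catalog sync fails.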
Weighted Load Balancing
When you configure multiple providers on a Virtual Key, DeepIntShield automatically implements weighted load balancing. Each provider is assigned a weight, and requests are distributed proportionally.
Example Configuration:
Virtual Key: vk-prod-main
├── OpenAI
│   ├── Allowed Models: [gpt-4o, gpt-4o-mini] ← Explicit whitelist
│   └── Weight: 0.2 (20% of traffic)
└── Azure
    ├── Allowed Models: [gpt-4o] ← Explicit whitelist
    └── Weight: 0.8 (80% of traffic)
Load Balancing Behavior:
- For gpt-4o: 80% Azure, 20% OpenAI (both providers have it in allowed_models)
- For gpt-4o-mini: 100% OpenAI (only OpenAI has it in allowed_models)
- For claude-3-sonnet: ❌ Rejected (neither provider has it in allowed_models)
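One way to reason about these percentages is to filter providers by the requested model and then normalize the remaining weights. The sketch below assumes that behavior. The dict layout mirrors the vk-prod-main example, while `traffic_shares` and `pick_provider` are hypothetical helpers rather than DeepIntShield functions.

```python
import random
from typing import Dict, List, Optional

# Mirrors the vk-prod-main example; the dict layout itself is an assumption.
PROVIDER_CONFIGS: List[Dict] = [
    {"provider": "openai", "allowed_models": ["gpt-4o", "gpt-4o-mini"], "weight": 0.2},
    {"provider": "azure", "allowed_models": ["gpt-4o"], "weight": 0.8},
]


def traffic_shares(model: str, configs: List[Dict]) -> Dict[str, float]:
    """Normalized share of traffic per provider that allows `model`."""
    eligible = [c for c in configs if model in c["allowed_models"]]
    total = sum(c["weight"] for c in eligible)
    if total <= 0:
        return {}  # no eligible provider: the request would be rejected
    return {c["provider"]: c["weight"] / total for c in eligible}


def pick_provider(model: str, configs: List[Dict], rng: random.Random) -> Optional[str]:
    """Draw one provider at random, in proportion to its normalized share."""
    shares = traffic_shares(model, configs)
    if not shares:
        return None
    names = list(shares)
    return rng.choices(names, weights=[shares[n] for n in names], k=1)[0]
```

Under this assumption, gpt-4o-mini always resolves to OpenAI because its weight is the only one left after filtering, and claude-3-sonnet yields no candidate at all.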
Usage: To trigger weighted load balancing, send requests with just the model name:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
To bypass load balancing and target a specific provider:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
Example with Empty allowed_models (each provider uses the Model Catalog):
{
  "provider_configs": [
    {
      "provider": "openai",
      "allowed_models": [],
      "weight": 0.5
    },
    {
      "provider": "anthropic",
      "allowed_models": [],
      "weight": 0.5
    }
  ]
}
With this configuration:
- Request for gpt-4o → Routed to OpenAI (Model Catalog shows OpenAI supports this)
- Request for claude-3-sonnet → Routed to Anthropic (Model Catalog shows Anthropic supports this)
- Request for gpt-4o will NOT route to Anthropic (Model Catalog shows Anthropic doesn’t support OpenAI models)
Automatic Fallbacks
When multiple providers are configured on a Virtual Key, DeepIntShield automatically creates fallback chains, so requests fail over to another provider without manual intervention.
How It Works:
- Only activated when: Your request has no existing fallbacks array in the request body
- Fallback creation: Providers are sorted by weight (highest first) and added as fallbacks
- Respects existing fallbacks: If you manually specify fallbacks, they are preserved
Example Request Flow:
- Primary request goes to weighted-selected provider (e.g., Azure with 80% weight)
- If Azure fails, automatically retry with OpenAI
- Continue until success or all providers exhausted
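The fallback rules can be sketched as follows. This is an illustration under stated assumptions: `build_fallbacks` is a hypothetical helper, the selected primary is assumed to be excluded from its own chain, only providers that allow the requested model are considered, and a request counts as having fallbacks whenever the key is present.

```python
from typing import Dict, List


def build_fallbacks(request: Dict, primary: str, configs: List[Dict]) -> List[str]:
    """Return the fallback list for `request`, in 'provider/model' form."""
    if "fallbacks" in request:
        # Respect fallbacks the caller set explicitly.
        return list(request["fallbacks"])
    model = request["model"]
    # Assumption: skip the primary and any provider that does not allow the model.
    others = [
        c for c in configs
        if c["provider"] != primary and model in c["allowed_models"]
    ]
    # Highest weight first, as described above.
    others.sort(key=lambda c: c["weight"], reverse=True)
    return [f'{c["provider"]}/{model}' for c in others]
```

For the vk-prod-main example, a gpt-4o request whose primary is Azure would get OpenAI as its only generated fallback, while a request that already carries a fallbacks array is returned unchanged.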
Request with automatic fallbacks:
# This request will get automatic fallbacks
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
Request with manual fallbacks (no automatic fallbacks added):
# This request keeps your specified fallbacks
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "fallbacks": ["anthropic/claude-3-sonnet-20240229"]
  }'
Setting Provider/Model Routing
- Go to Virtual Keys
- Create/Edit virtual key

- In Provider Configurations section, add the provider you want to restrict the VK to
- Allowed Models:
  - Specify models: Enter specific models (e.g., ["gpt-4o", "gpt-4o-mini"]) to explicitly whitelist only those models
  - Leave blank: Uses the Model Catalog to determine which models this provider supports (populated from pricing data and the provider’s list models API)
- Add the weight you want to give to this provider
- Click on the Save button
The same configuration via the API:
curl -X PUT http://localhost:8080/api/governance/virtual-keys/{vk_id} \
  -H "Content-Type: application/json" \
  -d '{
    "provider_configs": [
      {
        "provider": "openai",
        "allowed_models": ["gpt-4o", "gpt-4o-mini"],
        "weight": 0.2
      },
      {
        "provider": "azure",
        "allowed_models": ["gpt-4o"],
        "weight": 0.8
      }
    ]
  }'
Or in the configuration file:
{
  "governance": {
    "virtual_keys": [
      {
        "id": "vk-prod-main",
        "provider_configs": [
          {
            "provider": "openai",
            "allowed_models": ["gpt-4o", "gpt-4o-mini"],
            "weight": 0.2
          },
          {
            "provider": "azure",
            "allowed_models": ["gpt-4o"],
            "weight": 0.8
          }
        ]
      }
    ]
  }
}
API Key Restrictions
Virtual Keys can be restricted to use only specific provider API keys. When key restrictions are configured, the VK can access only the designated keys, giving fine-grained control over which API keys different users or applications can use.
How It Works:
- No Restrictions (default): VK can use any available provider keys based on load balancing
- With Restrictions: VK limited to only the specified key IDs, regardless of other available keys
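The two cases above amount to a simple filter over the available keys. The sketch below is illustrative only: `usable_keys` is a hypothetical helper, and an empty restriction list is assumed to mean "no restrictions".

```python
from typing import List


def usable_keys(available_key_ids: List[str], restricted_key_ids: List[str]) -> List[str]:
    """Return the key IDs a Virtual Key may use, preserving availability order."""
    if not restricted_key_ids:
        # No restrictions (default): every available key is a candidate.
        return list(available_key_ids)
    allowed = set(restricted_key_ids)
    # With restrictions: only the listed key IDs, regardless of other keys.
    return [k for k in available_key_ids if k in allowed]
```

Load balancing across keys would then operate on the returned list, so a VK restricted to a single key always uses that key.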
Example Scenario:
Available Provider Keys:
├── key-prod-001 → sk-prod-key... (Production OpenAI key)
├── key-dev-002 → sk-dev-key... (Development OpenAI key)
└── key-test-003 → sk-test-key... (Testing OpenAI key)
Virtual Key Restrictions:
├── vk-prod-main
│   ├── Allowed Models: [gpt-4o]
│   └── Restricted Keys: [key-prod-001] ← ONLY production key
├── vk-dev-main
│   ├── Allowed Models: [gpt-4o-mini]
│   └── Restricted Keys: [key-dev-002, key-test-003] ← Dev + test keys
└── vk-unrestricted
    ├── Allowed Models: [all models]
    └── Restricted Keys: [] ← Can use ANY available key
Request Behavior:
# Production VK - will ONLY use key-prod-001
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
# Development VK - will load balance between key-dev-002 and key-test-003
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-dev-main" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'
# VK with no key restrictions - can use any available OpenAI key
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-unrestricted" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'
Setting API Key Restrictions:
- Go to Virtual Keys
- Create/Edit virtual key

- In Allowed Keys section, select the API key you want to restrict the VK to
- Click on the Save button
The same restriction via the API:
curl -X PUT http://localhost:8080/api/governance/virtual-keys/{vk_id} \
  -H "Content-Type: application/json" \
  -d '{
    "key_ids": ["key-prod-001"]
  }'
Or in the configuration file:
{
  "governance": {
    "virtual_keys": [
      {
        "id": "vk-prod-main",
        "provider_configs": [
          {
            "provider": "openai",
            "allowed_keys": ["key-prod-001"]
          }
        ]
      }
    ]
  }
}
Use Cases:
- Environment Separation - Production VKs use production keys, dev VKs use dev keys
- Cost Control - Different teams use keys with different billing accounts
- Access Control - Restrict sensitive keys to specific VKs only
- Compliance - Ensure certain workloads only use compliant/audited keys
Troubleshooting
Model Catalog Sync Failures
If you see warnings like this in your DeepIntShield logs during startup or provider updates:
{"level":"warn","time":"2026-01-13T14:18:53+05:30","message":"failed to list models for provider ollama: failed to execute HTTP request to provider API"}
What this means:
- DeepIntShield attempted to call the provider’s list models API to populate the Model Catalog
- The request failed (network issue, provider unavailable, incorrect credentials, etc.)
- If your Virtual Key has allowed_models: [] (empty) for this provider, model validation will fall back to the pricing data only
How to fix:
- Check that the provider is correctly configured and accessible
- Verify network connectivity to the provider’s API
- Ensure API credentials are valid
- Consider using explicit allowed_models instead of relying on the Model Catalog for critical providers
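To find which providers are affected, you can scan the logs for this warning. The sketch below assumes one JSON object per log line with the level and message fields shown in the sample warning above. `failed_catalog_providers` is a hypothetical helper, not part of DeepIntShield.

```python
import json
from typing import Iterable, List

# Message prefix taken from the sample warning; the provider name follows it.
PREFIX = "failed to list models for provider "


def failed_catalog_providers(log_lines: Iterable[str]) -> List[str]:
    """Return provider names with a list-models warning, in first-seen order."""
    providers: List[str] = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        message = entry.get("message", "")
        if entry.get("level") != "warn" or not isinstance(message, str):
            continue
        if message.startswith(PREFIX):
            # The provider name runs up to the first colon after the prefix.
            name = message[len(PREFIX):].split(":", 1)[0].strip()
            if name and name not in providers:
                providers.append(name)
    return providers
```

Any provider this returns is a candidate for an explicit allowed_models list on Virtual Keys that currently leave it empty.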