Provider Routing
Understand how routing rules fit into the complete routing pipeline
Routing Rules provide dynamic, expression-based control over request routing. They execute before governance provider selection and can override it, allowing you to make sophisticated routing decisions based on request context, headers, parameters, capacity metrics, and organizational hierarchy.
Unlike governance routing, which uses static provider weights, routing rules use Common Expression Language (CEL) expressions to evaluate conditions at runtime and route each request dynamically.
Routing rules are organized by scope with first-match-wins evaluation:
    VirtualKey Scope (Highest Priority)
        ↓
    Team Scope
        ↓
    Customer Scope
        ↓
    Global Scope (Lowest Priority, applies to all)

How it works:

Example:

    VirtualKey (vk-123) is attached to Team (team-456),
    which belongs to Customer (cust-789)

    Evaluation order:
    1. Check Virtual Key scope rules (vk-123)
    2. Check Team scope rules (team-456)
    3. Check Customer scope rules (cust-789)
    4. Check Global scope rules

    First match → Decision

Routing rules evaluate CEL expressions with these available variables:
    model                    // Requested model (string)
    provider                 // Current provider (string)
    request_type             // Request type (chat_completion, embedding, batch, image_generation, moderation, transcription, translation)
    headers["header-name"]   // Request header (case-insensitive key lookup)
    params["param-name"]     // Query parameter

Header Examples:

    headers["x-tier"] == "premium"
    headers["x-api-version"] == "v2"
    headers["user-agent"].contains("mobile")

    virtual_key_id     // VK ID (string, empty if no VK)
    virtual_key_name   // VK name (string)
    team_id            // Team ID (string, empty if not in team)
    team_name          // Team name (string)
    customer_id        // Customer ID (string)
    customer_name      // Customer name (string)

Organization Examples:

    team_name == "ml-research"
    customer_id == "acme-corp"
    virtual_key_name.startsWith("prod-")

    budget_used   // Budget usage percentage for provider/model (0.0 to 100.0)
    tokens_used   // Token rate limit usage percentage (0.0 to 100.0)
    request       // Request rate limit usage percentage (0.0 to 100.0)

Capacity Examples:
    budget_used > 80   // Route to fallback when 80%+ of budget used
    tokens_used < 50   // Route to fast provider when below 50% token limit
    request > 90       // Switch providers when request limit near max

    ==   // Equal
    !=   // Not equal
    >    // Greater than
    <    // Less than
    >=   // Greater or equal
    <=   // Less or equal

    &&   // AND
    ||   // OR
    !    // NOT

    .startsWith("prefix")      // Check string prefix
    .endsWith("suffix")        // Check string suffix
    .contains("substring")     // Check substring
    .matches("regex")          // Regex match

    "value" in ["item1", "item2", "item3"]   // Check membership

    // Route based on header value
    headers["x-tier"] == "premium"

    // Route based on team
    team_name == "research"

    // Route based on model
    model == "gpt-4o"

    // Route based on request type
    request_type == "embedding"

    // Route to fallback when budget high
    budget_used > 80

    // Premium tier research team
    headers["x-tier"] == "premium" && team_name == "research"

    // High capacity or premium
    budget_used > 90 || headers["x-priority"] == "high"

    // Specific team and model
    team_name == "ml-ops" && model.startsWith("claude-")

    // Region-based with capacity check
    headers["x-region"] == "us-east" && tokens_used < 75

    // Route embeddings to cheaper provider
    request_type == "embedding" && budget_used > 50

    // Match models starting with prefix
    model.startsWith("gpt-4")

    // Match custom headers
    headers["x-environment"] in ["staging", "testing"]

    // Email domain matching
    headers["x-user-email"].contains("@company.com")

    // Regex patterns
    headers["x-app-version"].matches("[0-9]+\\.[0-9]+\\.[0-9]+")

true/false for explicit behavior)

Access routing rules from the dashboard:
Routing Rules Dashboard

Features:
Create/Edit Rule Sheet

Fields:
The dashboard includes a visual query builder for CEL expressions:
    GET /api/governance/routing-rules

    # Optional query parameters:
    ?scope=global&scope_id=<id>&from_memory=true

Response:

    {
      "rules": [
        {
          "id": "rule-uuid-123",
          "name": "Premium Tier Route",
          "description": "Route premium users to fast provider",
          "enabled": true,
          "cel_expression": "headers[\"x-tier\"] == \"premium\"",
          "targets": [
            { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
            { "provider": "azure", "model": "gpt-4o", "weight": 0.3 }
          ],
          "fallbacks": ["groq/gpt-3.5-turbo"],
          "scope": "global",
          "scope_id": null,
          "priority": 10,
          "created_at": "2024-01-15T10:30:00Z",
          "updated_at": "2024-01-15T10:30:00Z"
        }
      ],
      "count": 1
    }

    GET /api/governance/routing-rules/{rule_id}

    POST /api/governance/routing-rules
    Content-Type: application/json

Request Body:

    {
      "name": "Budget Overflow Route",
      "description": "When budget is high, route to cheaper provider",
      "enabled": true,
      "cel_expression": "budget_used > 85",
      "targets": [
        { "provider": "groq", "weight": 1 }
      ],
      "fallbacks": ["openai/gpt-4o"],
      "scope": "team",
      "scope_id": "team-uuid-456",
      "priority": 5
    }

Response: 201 Created

    {
      "message": "Routing rule created successfully",
      "rule": { /* rule object */ }
    }

    PUT /api/governance/routing-rules/{rule_id}
    Content-Type: application/json

Request Body (all fields optional):

    {
      "name": "Updated Rule Name",
      "enabled": false,
      "cel_expression": "budget_used > 90",
      "priority": 20
    }

    DELETE /api/governance/routing-rules/{rule_id}

Response: 200 OK

    {
      "message": "Routing rule deleted successfully"
    }

Define routing rules in your config.json file under the governance configuration:
Structure:
    {
      "governance": {
        "routing_rules": [
          {
            "id": "rule-uuid-123",
            "name": "Premium Tier Route",
            "description": "Route premium users to fast provider",
            "enabled": true,
            "cel_expression": "headers[\"x-tier\"] == \"premium\"",
            "targets": [
              { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
              { "provider": "azure", "model": "gpt-4o", "weight": 0.3 }
            ],
            "fallbacks": ["groq/gpt-3.5-turbo"],
            "scope": "global",
            "scope_id": null,
            "priority": 10
          },
          {
            "id": "rule-uuid-456",
            "name": "Budget Overflow Route",
            "description": "Route to cheaper provider when budget is high",
            "enabled": true,
            "cel_expression": "budget_used > 85",
            "targets": [
              { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
            ],
            "fallbacks": [],
            "scope": "team",
            "scope_id": "team-ml-ops",
            "priority": 5
          }
        ]
      }
    }

Fields:

provider (string, optional): Target provider. Omit to use the incoming request provider.
model (string, optional): Target model. Omit to use the incoming request model.
key_id (string, optional): UUID of the API key to pin. Requires provider to be present; omit for load-balanced key selection.
weight (number, required): Probability weight. All weights in a rule must sum to 1 (e.g. 0.7 + 0.3 = 1.0).

Loading from config.json:
Routing rules are loaded automatically at startup from the governance section of config.json. Changes to the file take effect only after the application restarts.
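Because config.json is only read at startup, it can help to check the rules before restarting. The following is a minimal sketch, not part of the product: `validate_rule` and `validate_config` are hypothetical helper names, and the only constraints checked are the ones stated in the field list (weight is required, weights sum to 1, key_id requires provider).

```python
import json
import math


def validate_rule(rule):
    """Return a list of human-readable problems found in one routing rule."""
    problems = []
    targets = rule.get("targets") or []
    if not targets:
        problems.append("no targets defined")
    for i, target in enumerate(targets):
        if not isinstance(target.get("weight"), (int, float)):
            problems.append(f"target {i}: weight is required")
        if target.get("key_id") and not target.get("provider"):
            problems.append(f"target {i}: key_id requires provider")
    # Sum only the numeric weights; a tolerance absorbs float rounding (0.7 + 0.3).
    total = sum(t["weight"] for t in targets
                if isinstance(t.get("weight"), (int, float)))
    if targets and not math.isclose(total, 1.0, abs_tol=1e-6):
        problems.append(f"weights sum to {total:g}, expected 1")
    return problems


def validate_config(config):
    """Map each rule id (or name) to its list of problems."""
    rules = config.get("governance", {}).get("routing_rules", [])
    return {r.get("id", r.get("name", "?")): validate_rule(r) for r in rules}


# Illustrative config: the second rule is deliberately broken.
sample = json.loads("""
{
  "governance": {
    "routing_rules": [
      {"id": "ok-rule", "cel_expression": "budget_used > 85",
       "targets": [{"provider": "openai", "model": "gpt-4o", "weight": 0.7},
                   {"provider": "azure", "model": "gpt-4o", "weight": 0.3}]},
      {"id": "bad-rule", "cel_expression": "true",
       "targets": [{"key_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "weight": 0.5}]}
    ]
  }
}
""")

report = validate_config(sample)
for rule_id, problems in report.items():
    print(rule_id, "->", problems or "ok")
```

To check a real file, the same functions could be fed the result of `json.load` on config.json.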
Example with Multiple Rules:
    {
      "governance": {
        "routing_rules": [
          {
            "id": "tier-based",
            "name": "Premium Tier Fast Track",
            "enabled": true,
            "cel_expression": "headers[\"x-tier\"] == \"premium\"",
            "targets": [
              { "provider": "openai", "model": "gpt-4o", "weight": 1 }
            ],
            "fallbacks": ["azure/gpt-4o"],
            "scope": "global",
            "priority": 0
          },
          {
            "id": "capacity-failover",
            "name": "Budget Exhaustion Fallback",
            "enabled": true,
            "cel_expression": "budget_used > 90",
            "targets": [
              { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
            ],
            "fallbacks": [],
            "scope": "global",
            "priority": 5
          },
          {
            "id": "team-preference",
            "name": "ML Team Anthropic Route",
            "enabled": true,
            "cel_expression": "team_name == \"ml-research\"",
            "targets": [
              { "provider": "anthropic", "model": "claude-3-opus-20240229", "weight": 1 }
            ],
            "fallbacks": ["bedrock/claude-3-opus"],
            "scope": "team",
            "scope_id": "team-ml-research",
            "priority": 0
          }
        ]
      }
    }

When to use Routing Rules:
Route requests based on customer tier using headers:
    {
      "name": "Premium Tier Fast Track",
      "cel_expression": "headers[\"x-tier\"] == \"premium\"",
      "targets": [
        { "provider": "openai", "model": "gpt-4o", "weight": 1 }
      ],
      "fallbacks": ["azure/gpt-4o"],
      "scope": "global",
      "priority": 10
    }

Route to cheaper provider when budget is exhausted:

    {
      "name": "Budget Exhaustion Fallback",
      "cel_expression": "budget_used > 90",
      "targets": [
        { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
      ],
      "fallbacks": [],
      "scope": "global",
      "priority": 5
    }

Route team-specific requests to their preferred provider:

    {
      "name": "ML Team Anthropic Preference",
      "cel_expression": "team_name == \"ml-research\"",
      "targets": [
        { "provider": "anthropic", "model": "claude-3-opus-20240229", "weight": 1 }
      ],
      "fallbacks": ["bedrock/claude-3-opus"],
      "scope": "team",
      "scope_id": "team-ml-research-uuid",
      "priority": 0
    }

Combine multiple criteria for sophisticated routing:
    {
      "name": "Production Premium Route",
      "cel_expression": "headers[\"x-environment\"] == \"production\" && headers[\"x-priority\"] == \"high\" && tokens_used < 75",
      "targets": [
        { "provider": "openai", "model": "gpt-4o", "weight": 1 }
      ],
      "fallbacks": ["azure/gpt-4o"],
      "scope": "global",
      "priority": 5
    }

Split traffic across providers or models by weight for canary deployments or cost optimization:

    {
      "name": "Split Traffic OpenAI vs Groq",
      "cel_expression": "true",
      "targets": [
        { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
        { "provider": "groq", "model": "llama-3.1-70b", "weight": 0.3 }
      ],
      "scope": "global",
      "priority": 15
    }

Each request matching this rule has a 70% chance of going to OpenAI and a 30% chance of going to Groq. Weights must always sum to 1.
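The probabilistic split can be illustrated with a short simulation. This is a sketch of weighted random selection in general, not the gateway's actual implementation; `pick_target` is a hypothetical name, and the seed is there only to make the demo repeatable.

```python
import random
from collections import Counter


def pick_target(targets, rng=random):
    """Pick one target with probability proportional to its weight."""
    weights = [t["weight"] for t in targets]
    # random.choices does weighted sampling; k=1 returns a one-item list.
    return rng.choices(targets, weights=weights, k=1)[0]


targets = [
    {"provider": "openai", "model": "gpt-4o", "weight": 0.7},
    {"provider": "groq", "model": "llama-3.1-70b", "weight": 0.3},
]

rng = random.Random(42)  # seeded so repeated runs give the same counts
counts = Counter(pick_target(targets, rng)["provider"] for _ in range(10_000))
for provider, n in counts.items():
    print(f"{provider}: {n / 10_000:.1%}")
```

Over many requests the observed shares should settle close to 70% and 30%, while any individual request can land on either target.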
Route based on region headers:
    {
      "name": "EU Data Residency",
      "cel_expression": "headers[\"x-region\"] == \"eu\"",
      "targets": [
        { "provider": "azure", "model": "gpt-4o", "weight": 1 }
      ],
      "fallbacks": [],
      "scope": "global",
      "priority": 0
    }

Routing Rules run BEFORE governance provider selection and can override it:
If a routing rule matches:
    1. Routing Rules → CEL expression evaluation (first-match-wins)
    2. Rule matches → target selected probabilistically from targets array
    3. provider/model/key_id/fallbacks overridden from selected target
    4. Governance provider_configs → SKIPPED
    5. Load Balancing → selects best key (unless key_id was pinned)

If no routing rule matches:

    1. Routing Rules → CEL expression evaluation
    2. No match → continue
    3. Governance routing → provider/model selection (weighted random)
    4. Load Balancing → selects best key

Example:

    budget_used > 85 → groq

Routing rules determine provider BEFORE adaptive load balancing runs:

    1. Routing Rules evaluate first → determine provider (if matched)
       OR
    2. Governance selects provider (if no routing rule matched)
       ↓
    3. Load Balancing Level 1 → skipped (provider already determined by routing rules or governance)
    4. Load Balancing Level 2 → key selection (performance-based within selected provider)

Key Insight: Load balancing Level 2 (key selection) always runs regardless of whether the provider was determined by routing rules or governance. This means you get automatic key-level optimization in all cases.
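The ordering described above can be sketched as a single function. Everything here is a simplified stand-in under stated assumptions: `resolve_route`, `match_rule`, `governance_select`, and `select_key` are hypothetical names, the rule matcher is a plain Python function rather than a CEL evaluator, and key selection is a stub.

```python
import random


def resolve_route(request, match_rule, governance_select, select_key, rng=random):
    """Return the final routing decision for one request."""
    rule = match_rule(request)  # routing rules are consulted first
    if rule is not None:
        targets = rule["targets"]
        # Weighted pick among the matched rule's targets.
        target = rng.choices(targets, weights=[t["weight"] for t in targets], k=1)[0]
        decision = {
            # Omitted provider/model fall back to the incoming request values.
            "provider": target.get("provider", request["provider"]),
            "model": target.get("model", request["model"]),
            "key_id": target.get("key_id"),
            "fallbacks": rule.get("fallbacks", []),
            "decided_by": "routing_rule",  # governance selection is skipped
        }
    else:
        provider, model = governance_select(request)
        decision = {"provider": provider, "model": model, "key_id": None,
                    "fallbacks": [], "decided_by": "governance"}
    if decision["key_id"] is None:
        # Key-level selection runs unless the rule pinned a key.
        decision["key_id"] = select_key(decision["provider"])
    return decision


# Hypothetical stand-ins for the real components.
budget_rule = {"targets": [{"provider": "groq", "weight": 1}],
               "fallbacks": ["openai/gpt-4o"]}


def match_rule(request):
    return budget_rule if request["budget_used"] > 85 else None


def governance_select(request):
    return "azure", request["model"]


def select_key(provider):
    return f"{provider}-key-1"


hot = resolve_route({"provider": "openai", "model": "gpt-4o", "budget_used": 90},
                    match_rule, governance_select, select_key)
cold = resolve_route({"provider": "openai", "model": "gpt-4o", "budget_used": 10},
                     match_rule, governance_select, select_key)
print(hot["decided_by"], hot["provider"], hot["key_id"])
print(cold["decided_by"], cold["provider"], cold["key_id"])
```

In both branches the key is chosen last, which mirrors the point that key-level selection applies however the provider was determined.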
Routing rules can define fallbacks that flow into load balancing:
    {
      "provider": "openai",
      "fallbacks": ["azure/gpt-4o", "groq/gpt-3.5-turbo"]
    }

If OpenAI fails:
Rules within the same scope are evaluated in ascending priority order:
    Priority 0 → Priority 5 → Priority 10 → Priority 100 (first match wins)

Best Practice: Use priority 0-10 for critical rules, 100+ for fallbacks.
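Scope order and priority order combine into one lookup: walk the scope chain from most to least specific, and within each scope try rules in ascending priority. The sketch below assumes the scope names `virtual_key`, `team`, `customer`, and `global`, and replaces CEL expressions with plain Python callables under a hypothetical `matches` key; it is an illustration, not the engine's code.

```python
def build_scope_chain(virtual_key_id=None, team_id=None, customer_id=None):
    """Most specific scope first; global is always last."""
    chain = []
    if virtual_key_id:
        chain.append(("virtual_key", virtual_key_id))
    if team_id:
        chain.append(("team", team_id))
    if customer_id:
        chain.append(("customer", customer_id))
    chain.append(("global", None))
    return chain


def first_matching_rule(rules, chain, context):
    """Return the first enabled rule that matches, or None."""
    for scope, scope_id in chain:
        in_scope = [r for r in rules
                    if r["enabled"] and r["scope"] == scope
                    and r.get("scope_id") == scope_id]
        # Lower priority numbers are evaluated first within a scope.
        for rule in sorted(in_scope, key=lambda r: r["priority"]):
            if rule["matches"](context):  # stand-in for CEL evaluation
                return rule
    return None


rules = [
    {"name": "global-default", "enabled": True, "scope": "global",
     "priority": 0, "matches": lambda ctx: True},
    {"name": "team-budget", "enabled": True, "scope": "team",
     "scope_id": "team-456", "priority": 10,
     "matches": lambda ctx: ctx["budget_used"] > 80},
    {"name": "team-claude", "enabled": True, "scope": "team",
     "scope_id": "team-456", "priority": 5,
     "matches": lambda ctx: ctx["model"].startswith("claude-")},
]

chain = build_scope_chain("vk-123", "team-456", "cust-789")
a = first_matching_rule(rules, chain, {"model": "gpt-4o", "budget_used": 90})
b = first_matching_rule(rules, chain, {"model": "claude-3-opus", "budget_used": 90})
c = first_matching_rule(rules, build_scope_chain(), {"model": "gpt-4o", "budget_used": 90})
print(a["name"], b["name"], c["name"])
```

Note that the global rule has priority 0 but still loses to both team rules, because scope is resolved before priority.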
== over .matches() with regex

✅ Good names:

❌ Bad names:

✅ Safe patterns:

    headers["x-tier"] == "premium"                // Exact match
    headers["x-region"] in ["us", "eu", "asia"]   // Membership
    team_name.startsWith("prod-")                 // Prefix check
    budget_used > 80                              // Numeric comparison

❌ Risky patterns:

    headers["x-tier"].matches(".*premium.*")   // Complex regex
    headers["x-config"].contains("json")       // Fragile
    model.length() > 5 && ...                  // Undocumented behavior

✅ Good scope design:

❌ Avoid:

✅ Validate before deployment:

from_memory=true to verify in-memory state

❌ Don’t:

✅ Track rule usage:

[RoutingEngine])

❌ Don’t forget:
Symptom: Rule expression is correct but doesn’t match
Diagnosis:
enabled: true)
from_memory=true to debug

Solutions:

    # Get current routing rules in memory
    GET /api/governance/routing-rules?from_memory=true

    # Check if your variables are present
    # Example: Is team_name actually set?
    # Verify headers are lowercase in CEL

Symptom: “Failed to compile rule: invalid CEL syntax”
Common causes:
headers["x-tier (missing closing quote)
headers["x"] ?? (not standard CEL)
headers["x-\type"] (incorrect escape)

Solutions:

(A && B) || (C && D)

Symptom: Request routed to unexpected provider
Diagnosis:
Solutions:
[RoutingEngine] Rule matched! Decision: provider=...

Symptom: “no such key” error in CEL evaluation

This is normal! DeepIntShield treats missing headers as non-matches:

    headers["x-optional"] == "value"   // Returns false if header missing

To check whether a header exists:

    headers["x-optional"] != ""   // True only if present and non-empty

Enable debug logging to see routing rule evaluation:
    [RoutingEngine] Starting rule evaluation for provider=openai, model=gpt-4o
    [RoutingEngine] Scope chain: [virtual_key(vk-123) team(team-456) customer(cust-789) global]
    [RoutingEngine] Evaluating scope=virtual_key, scopeID=vk-123, ruleCount=2
    [RoutingEngine] Evaluating rule: id=rule-1, name=Premium Route, expression=headers["x-tier"]=="premium"
    [RoutingEngine] Rule rule-1 evaluation result: matched=false
    [RoutingEngine] Evaluating rule: id=rule-2, name=Budget Fallback, expression=budget_used>80
    [RoutingEngine] Rule rule-2 evaluation result: matched=true
    [RoutingEngine] Rule matched! Selected target: provider=groq, model=gpt-3.5-turbo (weight=1), fallbacks=[azure/gpt-4o]

    curl -X POST http://localhost:8080/api/governance/routing-rules \
      -H "Content-Type: application/json" \
      -d '{
        "name": "High Budget Fallback",
        "description": "Switch to cheaper provider when budget >85%",
        "enabled": true,
        "cel_expression": "budget_used > 85",
        "targets": [
          { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
        ],
        "fallbacks": ["openai/gpt-3.5-turbo"],
        "scope": "global",
        "priority": 10
      }'

    curl -X POST http://localhost:8080/api/governance/routing-rules \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Premium Tier Split",
        "cel_expression": "headers[\"x-tier\"] == \"premium\"",
        "targets": [
          { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
          { "provider": "azure", "model": "gpt-4o", "weight": 0.3 }
        ],
        "scope": "global",
        "priority": 5
      }'

    curl -X POST http://localhost:8080/api/governance/routing-rules \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Pin Production Key for Premium Tier",
        "description": "Always use the dedicated production key for premium requests",
        "enabled": true,
        "cel_expression": "headers[\"x-tier\"] == \"premium\"",
        "targets": [
          { "provider": "openai", "model": "gpt-4o", "key_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "weight": 1 }
        ],
        "scope": "global",
        "priority": 5
      }'

    curl http://localhost:8080/api/governance/routing-rules \
      -H "Authorization: Bearer your-token" \
      -G \
      --data-urlencode "scope=team" \
      --data-urlencode "scope_id=team-uuid-123"

    # The URL is quoted so the shell does not interpret the "?"
    curl "http://localhost:8080/api/governance/routing-rules?from_memory=true" \
      -H "Authorization: Bearer your-token"

Provider Routing
Understand how routing rules fit into the complete routing pipeline
Virtual Keys
Configure Virtual Keys that scope routing rules
Governance
Learn about the governance layer, which selects the provider when no routing rule matches
CEL Language Spec
Complete CEL expression language documentation