Routing Rules

Overview

Routing Rules provide dynamic, expression-based control over request routing. They execute before governance provider selection and can override it, allowing you to make sophisticated routing decisions based on request context, headers, parameters, capacity metrics, and organizational hierarchy.

Unlike governance routing, which uses static provider weights, routing rules use Common Expression Language (CEL) expressions to evaluate conditions at runtime and route each request dynamically.

How It Works

Request Flow

Scope Hierarchy & Precedence

Routing rules are organized by scope with first-match-wins evaluation:

VirtualKey Scope (Highest Priority)
    ↓
Team Scope
    ↓
Customer Scope
    ↓
Global Scope (Lowest Priority, applies to all)

How it works:

When a request arrives with a Virtual Key, DeepIntShield builds a scope chain
Rules are evaluated in scope order (highest to lowest)
The first matching rule wins - no further rules are evaluated
Within each scope, rules are sorted by priority (ascending: 0 evaluates before 10)
If no rule matches, the incoming provider/model is used unchanged

Example:

VirtualKey (vk-123) is attached to Team (team-456),
which belongs to Customer (cust-789)

Evaluation order:
1. Check Virtual Key scope rules (vk-123)
2. Check Team scope rules (team-456)
3. Check Customer scope rules (cust-789)
4. Check Global scope rules

First match → Decision
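The evaluation order above can be sketched in a few lines of Python. This is only an illustration of the first-match-wins idea, not DeepIntShield's implementation: the `Rule` class, the `first_match` function, and the use of plain Python callables in place of compiled CEL expressions are all assumptions made for the example, as is the choice to skip disabled rules.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    scope: str                       # "virtual_key", "team", "customer", or "global"
    scope_id: Optional[str]          # None for global rules
    priority: int                    # lower value is evaluated first
    matches: Callable[[dict], bool]  # stand-in for a compiled CEL expression
    enabled: bool = True


def first_match(rules, scope_chain, ctx):
    """Walk scopes from most to least specific and return the first matching rule."""
    for scope, scope_id in scope_chain:
        in_scope = [r for r in rules
                    if r.enabled and r.scope == scope and r.scope_id == scope_id]
        for rule in sorted(in_scope, key=lambda r: r.priority):
            if rule.matches(ctx):
                return rule
    return None  # no match: the incoming provider/model is left unchanged


rules = [
    Rule("global-budget", "global", None, 0, lambda c: c["budget_used"] > 80),
    Rule("team-late", "team", "team-456", 10, lambda c: True),
    Rule("team-early", "team", "team-456", 0, lambda c: c["model"].startswith("gpt-4")),
]
chain = [("virtual_key", "vk-123"), ("team", "team-456"),
         ("customer", "cust-789"), ("global", None)]

winner = first_match(rules, chain, {"model": "gpt-4o", "budget_used": 95.0})
print(winner.name)  # team-early
```

The global rule also matches this request, but it is never evaluated because the team scope is checked first and already produced a match.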

CEL Expression Guide

Available Variables

Routing rules evaluate CEL expressions with these available variables:

Request Context

model          // Requested model (string)
provider       // Current provider (string)
request_type   // Request type (chat_completion, embedding, batch, image_generation, moderation, transcription, translation)

Headers & Parameters

headers["header-name"]     // Request header (case-insensitive key lookup)
params["param-name"]        // Query parameter

Header Examples:

headers["x-tier"] == "premium"
headers["x-api-version"] == "v2"
headers["user-agent"].contains("mobile")

Organization Context

virtual_key_id             // VK ID (string, empty if no VK)
virtual_key_name           // VK name (string)
team_id                    // Team ID (string, empty if not in team)
team_name                  // Team name (string)
customer_id                // Customer ID (string)
customer_name              // Customer name (string)

Organization Examples:

team_name == "ml-research"
customer_id == "acme-corp"
virtual_key_name.startsWith("prod-")

Capacity Metrics (as percentages: 0-100)

budget_used      // Budget usage percentage for provider/model (0.0 to 100.0)
tokens_used      // Token rate limit usage percentage (0.0 to 100.0)
request          // Request rate limit usage percentage (0.0 to 100.0)

Capacity Examples:

budget_used > 80           // Route to fallback when 80%+ of budget used
tokens_used < 50           // Route to fast provider when below 50% token limit
request > 90               // Switch providers when request limit near max

CEL Operators & Functions

Comparison Operators

==    // Equal
!=    // Not equal
>     // Greater than
<     // Less than
>=    // Greater or equal
<=    // Less or equal

Logical Operators

&&    // AND
||    // OR
!     // NOT

String Functions

.startsWith("prefix")      // Check string prefix
.endsWith("suffix")        // Check string suffix
.contains("substring")     // Check substring
.matches("regex")          // Regex match

Collections

"value" in ["item1", "item2", "item3"]    // Check membership

Expression Examples

Simple Conditions

// Route based on header value
headers["x-tier"] == "premium"

// Route based on team
team_name == "research"

// Route based on model
model == "gpt-4o"

// Route based on request type
request_type == "embedding"

// Route to fallback when budget high
budget_used > 80

Complex Conditions (Multiple Criteria)

// Premium tier research team
headers["x-tier"] == "premium" && team_name == "research"

// High budget usage or high-priority request
budget_used > 90 || headers["x-priority"] == "high"

// Specific team and model
team_name == "ml-ops" && model.startsWith("claude-")

// Region-based with capacity check
headers["x-region"] == "us-east" && tokens_used < 75

// Route embeddings to cheaper provider
request_type == "embedding" && budget_used > 50

Pattern Matching

// Match models starting with prefix
model.startsWith("gpt-4")

// Match custom headers
headers["x-environment"] in ["staging", "testing"]

// Email domain matching (endsWith avoids matching "@company.com" in the middle of an address)
headers["x-user-email"].endsWith("@company.com")

// Regex patterns
headers["x-app-version"].matches("[0-9]+\\.[0-9]+\\.[0-9]+")
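One detail worth checking when writing regex rules: in standard CEL, `matches()` succeeds if the pattern is found anywhere in the string, so the version pattern above also matches values such as "build-1.2.3-beta". The Python sketch below approximates that behavior with `re.search` and shows the effect of adding `^` and `$` anchors. The helper name `cel_like_matches` is hypothetical, and Python's `re` syntax differs from CEL's RE2 syntax outside the simple subset used here.

```python
import re

UNANCHORED = r"[0-9]+\.[0-9]+\.[0-9]+"
ANCHORED = r"^[0-9]+\.[0-9]+\.[0-9]+$"


def cel_like_matches(value: str, pattern: str) -> bool:
    """Approximate CEL's matches(): true if the pattern is found anywhere in value."""
    return re.search(pattern, value) is not None


print(cel_like_matches("1.2.3", UNANCHORED))             # True
print(cel_like_matches("build-1.2.3-beta", UNANCHORED))  # True
print(cel_like_matches("build-1.2.3-beta", ANCHORED))    # False
print(cel_like_matches("1.2.3", ANCHORED))               # True
```

If a rule should only match an exact version string, anchoring the pattern is the safer choice.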

Validation & Error Handling

Invalid CEL syntax → A warning is logged, the rule is skipped, and evaluation continues
Missing header/parameter → Expression returns false (graceful no-match)
Type mismatches → Logged as warning, rule skipped
Empty expression → Rule always matches (use true/false for explicit behavior)
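The four behaviors listed above can be pictured as a defensive wrapper around rule evaluation. The following Python sketch is an assumption-laden illustration, not the actual engine: `evaluate_rule` is a hypothetical name, a Python callable stands in for the compiled CEL program, and `None` stands in for an expression that failed to compile.

```python
from typing import Callable, Optional


def evaluate_rule(expression: str,
                  program: Optional[Callable[[dict], bool]],
                  ctx: dict) -> bool:
    """Return True only when the rule should be treated as a match."""
    if expression.strip() == "":
        return True   # empty expression: the rule always matches
    if program is None:
        return False  # stands in for invalid syntax: rule skipped
    try:
        return bool(program(ctx))
    except KeyError:
        return False  # missing header/parameter: graceful no-match
    except TypeError:
        return False  # type mismatch: rule skipped


ctx = {"headers": {}, "budget_used": 91.0}

missing_header = evaluate_rule('headers["x-tier"] == "premium"',
                               lambda c: c["headers"]["x-tier"] == "premium", ctx)
type_mismatch = evaluate_rule('budget_used > "80"',
                              lambda c: c["budget_used"] > "80", ctx)
empty = evaluate_rule("", None, ctx)

print(missing_header, type_mismatch, empty)  # False False True
```

Because an empty expression matches every request, writing `true` explicitly makes a catch-all rule easier to recognize during review.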

Configuration

Access routing rules from the dashboard:

Routing Rules Dashboard

Features:

List all rules with scope, priority, and enabled status
Filter by scope or scope_id
Create/Edit/Delete rules
View rule expressions and targets
Enable/disable rules without deletion
Drag to reorder priority

Create/Edit Rule Sheet

Fields:

Name (required): Unique rule identifier
Description (optional): Internal notes
Enabled: Toggle rule on/off
CEL Expression: Visual or manual expression builder
Targets (required): One or more weighted routing targets. Each has Provider (optional), Model (optional), API Key (optional, requires Provider to be set), and Weight (a fraction between 0 and 1). Weights must sum to 1. When multiple targets are defined, one is selected probabilistically at request time.
Fallbacks (optional): Array of fallback targets in provider/model format
Scope: Where rule applies (global, customer, team, virtual_key)
Scope ID: Required if scope is not global
Priority: Lower = evaluated first (default: 0)

Visual CEL Builder

The dashboard includes a visual query builder for CEL expressions:

Condition Builder: Select field, operator, value
Logical Operators: Combine conditions with AND/OR
Manual Mode: Switch to edit CEL directly
Validation: Real-time syntax validation
Conversion: Auto-converts visual rules to CEL

List Routing Rules

GET /api/governance/routing-rules

# Optional query parameters (scope_id applies to non-global scopes):
?scope=team&scope_id=<id>&from_memory=true

Response:

{
  "rules": [
    {
      "id": "rule-uuid-123",
      "name": "Premium Tier Route",
      "description": "Route premium users to fast provider",
      "enabled": true,
      "cel_expression": "headers[\"x-tier\"] == \"premium\"",
      "targets": [
        { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
        { "provider": "azure",  "model": "gpt-4o", "weight": 0.3 }
      ],
      "fallbacks": ["groq/gpt-3.5-turbo"],
      "scope": "global",
      "scope_id": null,
      "priority": 10,
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-01-15T10:30:00Z"
    }
  ],
  "count": 1
}

Get Single Rule

GET /api/governance/routing-rules/{rule_id}

Create Rule

POST /api/governance/routing-rules

Content-Type: application/json

Request Body:

{
  "name": "Budget Overflow Route",
  "description": "When budget is high, route to cheaper provider",
  "enabled": true,
  "cel_expression": "budget_used > 85",
  "targets": [
    { "provider": "groq", "weight": 1 }
  ],
  "fallbacks": ["openai/gpt-4o"],
  "scope": "team",
  "scope_id": "team-uuid-456",
  "priority": 5
}

Response: 201 Created

{
  "message": "Routing rule created successfully",
  "rule": { /* rule object */ }
}

Update Rule

PUT /api/governance/routing-rules/{rule_id}

Content-Type: application/json

Request Body (all fields optional):

{
  "name": "Updated Rule Name",
  "enabled": false,
  "cel_expression": "budget_used > 90",
  "priority": 20
}

Delete Rule

DELETE /api/governance/routing-rules/{rule_id}

Response: 200 OK

{
  "message": "Routing rule deleted successfully"
}
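For scripting against these endpoints, the requests can be assembled with Python's standard library. The sketch below only builds the request objects and does not send them. The base URL is taken from the curl examples in this guide, `build_request` is a hypothetical helper, and whether a bearer token is required depends on your deployment.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: same host/port as the curl examples


def build_request(method, path, body=None, query=None, token=None):
    """Build (but do not send) a request for the routing-rules API."""
    url = BASE_URL + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if body is not None:
        req.add_header("Content-Type", "application/json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


list_req = build_request("GET", "/api/governance/routing-rules",
                         query={"scope": "team", "scope_id": "team-uuid-456"})
create_req = build_request("POST", "/api/governance/routing-rules", body={
    "name": "Budget Overflow Route",
    "enabled": True,
    "cel_expression": "budget_used > 85",
    "targets": [{"provider": "groq", "weight": 1}],
    "fallbacks": ["openai/gpt-4o"],
    "scope": "team",
    "scope_id": "team-uuid-456",
    "priority": 5,
})
update_req = build_request("PUT", "/api/governance/routing-rules/rule-uuid-123",
                           body={"enabled": False})
delete_req = build_request("DELETE", "/api/governance/routing-rules/rule-uuid-123")

print(list_req.get_method(), list_req.full_url)
```

To actually send one, pass it to `urllib.request.urlopen(create_req)` and read the JSON response.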

Define routing rules in your config.json file under the governance configuration:

Structure:

{
  "governance": {
    "routing_rules": [
      {
        "id": "rule-uuid-123",
        "name": "Premium Tier Route",
        "description": "Route premium users to fast provider",
        "enabled": true,
        "cel_expression": "headers[\"x-tier\"] == \"premium\"",
        "targets": [
          { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
          { "provider": "azure",  "model": "gpt-4o", "weight": 0.3 }
        ],
        "fallbacks": ["groq/gpt-3.5-turbo"],
        "scope": "global",
        "scope_id": null,
        "priority": 10
      },
      {
        "id": "rule-uuid-456",
        "name": "Budget Overflow Route",
        "description": "Route to cheaper provider when budget is high",
        "enabled": true,
        "cel_expression": "budget_used > 85",
        "targets": [
          { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
        ],
        "fallbacks": [],
        "scope": "team",
        "scope_id": "team-ml-ops",
        "priority": 5
      }
    ]
  }
}

Fields:

id (string, auto-generated): Unique rule identifier (UUID)
name (string, required): Rule name (must be unique within scope)
description (string, optional): Internal documentation
enabled (boolean): Whether rule is active
cel_expression (string): CEL expression for rule matching
targets (array, required): One or more routing targets. Each target has:
- provider (string, optional): Target provider — omit to use the incoming request provider
- model (string, optional): Target model — omit to use the incoming request model
- key_id (string, optional): UUID of the API key to pin — requires provider to be present; omit for load-balanced key selection
- weight (number, required): Probability weight — all weights in a rule must sum to 1 (e.g. 0.7 + 0.3 = 1.0)
fallbacks (array[string]): Fallback providers in "provider/model" format
scope (string): Scope level - "global", "customer", "team", or "virtual_key"
scope_id (string, optional): ID of scoped entity (null for global scope)
priority (number): Rule evaluation order within scope (lower = evaluated first)

Loading from config.json: Rules are automatically loaded on startup from the governance section of config.json. Changes to the file take effect only after an application restart.
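Since config.json changes only take effect after a restart, it can be worth checking a file against the field constraints listed above before deploying it. The Python sketch below is one possible pre-flight check, not a tool shipped with DeepIntShield. `validate_rule` is a hypothetical helper and it enforces only the constraints stated in this guide: weights sum to 1, `key_id` requires `provider`, `scope_id` is required for non-global scopes, and fallbacks use provider/model format.

```python
import json
import math

VALID_SCOPES = {"global", "customer", "team", "virtual_key"}


def validate_rule(rule: dict) -> list:
    """Return a list of human-readable problems found in one routing rule."""
    errors = []
    if not rule.get("name"):
        errors.append("name is required")
    targets = rule.get("targets") or []
    if not targets:
        errors.append("at least one target is required")
    else:
        total = sum(t.get("weight", 0) for t in targets)
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            errors.append(f"target weights sum to {total}, expected 1")
    for target in targets:
        if target.get("key_id") and not target.get("provider"):
            errors.append("target key_id requires provider")
    scope = rule.get("scope")
    if scope not in VALID_SCOPES:
        errors.append(f"unknown scope {scope!r}")
    elif scope != "global" and not rule.get("scope_id"):
        errors.append("scope_id is required for non-global scopes")
    for fallback in rule.get("fallbacks") or []:
        if "/" not in fallback:
            errors.append(f"fallback {fallback!r} is not in provider/model format")
    return errors


config = json.loads("""
{
  "governance": {
    "routing_rules": [
      {"name": "ok", "cel_expression": "true", "scope": "global",
       "targets": [{"provider": "openai", "weight": 0.7},
                   {"provider": "azure", "weight": 0.3}]},
      {"name": "broken", "cel_expression": "true", "scope": "team",
       "targets": [{"key_id": "k1", "weight": 0.5}]}
    ]
  }
}
""")

report = {r["name"]: validate_rule(r) for r in config["governance"]["routing_rules"]}
for name, problems in report.items():
    print(name, problems)
```

The weight check uses `math.isclose` because sums such as 0.7 + 0.3 are not exactly 1.0 in floating point.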

Example with Multiple Rules:

{
  "governance": {
    "routing_rules": [
      {
        "id": "tier-based",
        "name": "Premium Tier Fast Track",
        "enabled": true,
        "cel_expression": "headers[\"x-tier\"] == \"premium\"",
        "targets": [
          { "provider": "openai", "model": "gpt-4o", "weight": 1 }
        ],
        "fallbacks": ["azure/gpt-4o"],
        "scope": "global",
        "priority": 0
      },
      {
        "id": "capacity-failover",
        "name": "Budget Exhaustion Fallback",
        "enabled": true,
        "cel_expression": "budget_used > 90",
        "targets": [
          { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
        ],
        "fallbacks": [],
        "scope": "global",
        "priority": 5
      },
      {
        "id": "team-preference",
        "name": "ML Team Anthropic Route",
        "enabled": true,
        "cel_expression": "team_name == \"ml-research\"",
        "targets": [
          { "provider": "anthropic", "model": "claude-3-opus-20240229", "weight": 1 }
        ],
        "fallbacks": ["bedrock/claude-3-opus"],
        "scope": "team",
        "scope_id": "team-ml-research",
        "priority": 0
      }
    ]
  }
}

Real-World Use Cases

When to use Routing Rules:

Dynamic routing based on request headers or parameters
Capacity-based routing (route to fallback when budget/rate limit is high)
Organization-based routing (different rules for different teams/customers)
A/B testing or canary deployments
Conditional provider override based on complex logic

Use Case 1: Tier-Based Routing

Route requests based on customer tier using headers:

{
  "name": "Premium Tier Fast Track",
  "cel_expression": "headers[\"x-tier\"] == \"premium\"",
  "targets": [
    { "provider": "openai", "model": "gpt-4o", "weight": 1 }
  ],
  "fallbacks": ["azure/gpt-4o"],
  "scope": "global",
  "priority": 10
}

Use Case 2: Capacity-Based Failover

Route to a cheaper provider when the budget is nearly exhausted (above 90% used):

{
  "name": "Budget Exhaustion Fallback",
  "cel_expression": "budget_used > 90",
  "targets": [
    { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
  ],
  "fallbacks": [],
  "scope": "global",
  "priority": 5
}

Use Case 3: Team-Specific Routing

Route team-specific requests to their preferred provider:

{
  "name": "ML Team Anthropic Preference",
  "cel_expression": "team_name == \"ml-research\"",
  "targets": [
    { "provider": "anthropic", "model": "claude-3-opus-20240229", "weight": 1 }
  ],
  "fallbacks": ["bedrock/claude-3-opus"],
  "scope": "team",
  "scope_id": "team-ml-research-uuid",
  "priority": 0
}

Use Case 4: Complex Multi-Condition Routing

Combine multiple criteria for sophisticated routing:

{
  "name": "Production Premium Route",
  "cel_expression": "headers[\"x-environment\"] == \"production\" && headers[\"x-priority\"] == \"high\" && tokens_used < 75",
  "targets": [
    { "provider": "openai", "model": "gpt-4o", "weight": 1 }
  ],
  "fallbacks": ["azure/gpt-4o"],
  "scope": "global",
  "priority": 5
}

Use Case 5: Probabilistic A/B Testing

Split traffic across providers or models by weight for canary deployments or cost optimization:

{
  "name": "Split Traffic OpenAI vs Groq",
  "cel_expression": "true",
  "targets": [
    { "provider": "openai", "model": "gpt-4o",        "weight": 0.7 },
    { "provider": "groq",   "model": "llama-3.1-70b", "weight": 0.3 }
  ],
  "scope": "global",
  "priority": 15
}

Each request matching this rule has a 70% chance of going to OpenAI and a 30% chance of going to Groq. Weights must always sum to 1.
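To get a feel for how a 0.7/0.3 split behaves over many requests, the selection can be simulated with Python's `random.choices`. This is a rough model of weighted selection under the assumption that each request is drawn independently. The `pick_target` helper is hypothetical and the actual selection algorithm inside DeepIntShield may differ.

```python
import random
from collections import Counter

targets = [
    {"provider": "openai", "model": "gpt-4o", "weight": 0.7},
    {"provider": "groq", "model": "llama-3.1-70b", "weight": 0.3},
]


def pick_target(targets, rng):
    """Pick one target at random, proportionally to its weight."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded so the simulation is repeatable
total = 10_000
counts = Counter(pick_target(targets, rng)["provider"] for _ in range(total))
share = counts["openai"] / total
print(f"openai share over {total} simulated requests: {share:.2f}")
```

Over a small number of requests the observed split can deviate noticeably from the configured weights, which matters when reading early canary results.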

Use Case 6: Regional Routing

Route based on region headers:

{
  "name": "EU Data Residency",
  "cel_expression": "headers[\"x-region\"] == \"eu\"",
  "targets": [
    { "provider": "azure", "model": "gpt-4o", "weight": 1 }
  ],
  "fallbacks": [],
  "scope": "global",
  "priority": 0
}

Integration with Governance & Load Balancing

Interaction with Governance Routing

Routing Rules run BEFORE governance provider selection and can override it:

If a routing rule matches:

1. Routing Rules → CEL expression evaluation (first-match-wins)
2. Rule matches → target selected probabilistically from targets array
3. provider/model/key_id/fallbacks overridden from selected target
4. Governance provider_configs → SKIPPED
5. Load Balancing → selects best key (unless key_id was pinned)

If no routing rule matches:

1. Routing Rules → CEL expression evaluation
2. No match → continue
3. Governance routing → provider/model selection (weighted random)
4. Load Balancing → selects best key

Example:

Governance configures: 70% Azure, 30% OpenAI
Routing rule exists: budget_used > 85 → groq
Request arrives with budget_used = 90%
Result: Groq selected by routing rule, governance provider_configs ignored
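The two flows above reduce to a simple decision: use the first matching rule if there is one, otherwise fall through to governance weights. The Python sketch below illustrates that ordering with simplified, single-target rules. `select_provider` and the rule dictionaries are assumptions for the example and do not reflect the real internal data structures.

```python
import random


def select_provider(ctx, rules, governance_weights, rng):
    """Return (provider, source). Rules are assumed pre-sorted by scope and priority."""
    for rule in rules:
        if rule["matches"](ctx):
            # A matching rule overrides governance provider selection entirely.
            return rule["provider"], f"routing_rule:{rule['name']}"
    providers = list(governance_weights)
    weights = [governance_weights[p] for p in providers]
    return rng.choices(providers, weights=weights, k=1)[0], "governance"


rules = [{"name": "budget-overflow", "provider": "groq",
          "matches": lambda ctx: ctx["budget_used"] > 85}]
governance = {"azure": 0.7, "openai": 0.3}
rng = random.Random(0)

print(select_provider({"budget_used": 90.0}, rules, governance, rng))
# ('groq', 'routing_rule:budget-overflow')

provider, source = select_provider({"budget_used": 40.0}, rules, governance, rng)
print(provider, source)
```

In both branches, key selection by load balancing would still run afterwards unless a rule pinned a specific key.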

Interaction with Load Balancing

Routing rules determine provider BEFORE adaptive load balancing runs:

1. Routing Rules evaluate first → determine provider (if matched)
   OR
2. Governance selects provider (if no routing rule matched)
   ↓
3. Load Balancing Level 1 → skipped (provider already determined by routing rules or governance)
4. Load Balancing Level 2 → key selection (performance-based within selected provider)

Key Insight: Load balancing Level 2 (key selection) always runs regardless of whether the provider was determined by routing rules or governance. This means you get automatic key-level optimization in all cases.

Fallback Chain

Routing rules can define fallbacks that flow into load balancing:

{
  "provider": "openai",
  "fallbacks": ["azure/gpt-4o", "groq/gpt-3.5-turbo"]
}

If OpenAI fails:

Level 2 load balancing evaluates Azure keys
If all Azure keys fail, tries Groq
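The chain above amounts to trying each provider/model pair in order until one succeeds. A minimal Python sketch of that loop follows. The `send` callback, the simulated outage, and the primary model name `gpt-4o` are assumptions for illustration. Key-level retries within each provider are left out.

```python
def parse_fallback(entry: str):
    """Split a 'provider/model' string on the first slash."""
    provider, _, model = entry.partition("/")
    if not provider or not model:
        raise ValueError(f"expected provider/model, got {entry!r}")
    return provider, model


def call_with_fallbacks(primary, fallbacks, send):
    """Try the primary target, then each fallback in order."""
    attempts = [primary] + [parse_fallback(f) for f in fallbacks]
    errors = []
    for provider, model in attempts:
        try:
            return send(provider, model)
        except RuntimeError as exc:
            errors.append(f"{provider}/{model}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_send(provider, model):
    """Stand-in for a real provider call: OpenAI and Azure are 'down'."""
    if provider in {"openai", "azure"}:
        raise RuntimeError("simulated outage")
    return f"served by {provider}/{model}"


result = call_with_fallbacks(("openai", "gpt-4o"),
                             ["azure/gpt-4o", "groq/gpt-3.5-turbo"],
                             fake_send)
print(result)  # served by groq/gpt-3.5-turbo
```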

Execution & Performance

CEL Compilation

First evaluation: CEL expression is parsed, type-checked, and compiled into an executable program
Subsequent evaluations: Program is cached and reused
Performance: Cached program evaluation is very fast (microseconds)
Memory: Compiled programs cached in memory until DeepIntShield restart
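The compile-once, evaluate-many pattern described above can be illustrated with a small cache keyed by expression text. In this Python sketch, `compile_expression` is a deliberately tiny stand-in that only understands expressions of the form `<variable> > <number>`. It is not a CEL compiler, and the cache layout is an assumption.

```python
from typing import Callable, Dict

compile_count = 0
_program_cache: Dict[str, Callable[[dict], bool]] = {}


def compile_expression(expression: str) -> Callable[[dict], bool]:
    """Hypothetical compile step: handles only '<variable> > <number>'."""
    global compile_count
    compile_count += 1
    name, op, threshold = expression.split()
    if op != ">":
        raise ValueError(f"unsupported operator {op!r}")
    limit = float(threshold)
    return lambda ctx: ctx[name] > limit


def get_program(expression: str) -> Callable[[dict], bool]:
    """Compile on first use, then reuse the cached program."""
    program = _program_cache.get(expression)
    if program is None:
        program = compile_expression(expression)
        _program_cache[expression] = program
    return program


for used in (50.0, 86.0, 99.0):
    print(get_program("budget_used > 85")({"budget_used": used}))
print("compilations:", compile_count)  # compilations: 1
```

Three evaluations trigger a single compilation, which is why the per-request cost stays low after the first request hits a rule.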

Priority & Ordering

Rules within the same scope are evaluated in ascending priority order:

Priority 0 → Priority 5 → Priority 10 → Priority 100 (first match wins)

Best Practice: Use priority 0-10 for critical rules, 100+ for fallbacks.

Optimization Tips

Order rules by likelihood: Put frequently matching rules first
Use specific scopes: Avoid global scope when possible (narrower = faster)
Avoid expensive string operations: Prefer == over .matches() with regex
Keep expressions simple: Complex conditions increase evaluation time
Use reasonable priorities: Gaps in priorities (0, 10, 20) make reordering easy

Best Practices

Rule Naming

✅ Good names:

“Premium Tier Fast Track”
“Budget Exhaustion Fallback”
“ML Team Anthropic Route”
“Production High Priority Route”

❌ Bad names:

“Rule 1”
“Fix”
“Temp”
“TODO”

CEL Expression Safety

✅ Safe patterns:

headers["x-tier"] == "premium"                    // Exact match
headers["x-region"] in ["us", "eu", "asia"]      // Membership
team_name.startsWith("prod-")                    // Prefix check
budget_used > 80                                 // Numeric comparison

❌ Risky patterns:

headers["x-tier"].matches(".*premium.*")         // Complex regex
headers["x-config"].contains("json")             // Fragile
model.length() > 5 && ...                        // length() is not standard CEL; use size(model)

Scope Management

✅ Good scope design:

Global rules for organization-wide policies
Customer scope for compliance (EU, data residency)
Team scope for team preferences
Virtual Key scope for specific integrations

❌ Avoid:

Too many virtual key-level rules (maintenance nightmare)
Conflicting rules across scopes
Rules that duplicate governance routing

Testing & Validation

✅ Validate before deployment:

Test CEL expression with expected headers
Verify provider/model exist in your setup
Check fallbacks are valid providers
Confirm scope_id matches actual entity
Test with from_memory=true to verify in-memory state

❌ Don’t:

Deploy rules without testing
Use nonexistent providers
Create circular fallback chains

Monitoring

✅ Track rule usage:

Log which rules match (logged in DeepIntShield logs as [RoutingEngine])
Monitor routing decisions by scope
Alert on unexpected provider selection patterns
Review priority order occasionally

❌ Don’t forget:

Disabling unused rules (instead of deleting)
Updating documentation when rules change
Testing failover chains

Troubleshooting

Rule Not Matching

Symptom: Rule expression is correct but doesn’t match

Diagnosis:

Check if rule is enabled (enabled: true)
Verify scope matches (check VirtualKey’s team/customer hierarchy)
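Check rule priority vs other rules in scope (a lower priority value evaluates first)
Verify the rule is loaded and its variables are populated: use from_memory=true to inspect the in-memory rules, and debug logs to see evaluation results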

Solutions:

# Get current routing rules in memory
GET /api/governance/routing-rules?from_memory=true

# Check if your variables are present
# Example: Is team_name actually set?
# Verify headers are lowercase in CEL

Expression Compilation Error

Symptom: “Failed to compile rule: invalid CEL syntax”

Common causes:

Unclosed quotes: headers["x-tier (missing closing quote)
Invalid operators: headers["x"] ?? (not standard CEL)
String escaping: headers["x-\type"] (incorrect escape)

Solutions:

Use the visual CEL builder to avoid syntax errors
Test expressions incrementally
Check CEL operator documentation above
Wrap complex expressions in parentheses: (A && B) || (C && D)

Wrong Provider Selected

Symptom: Request routed to unexpected provider

Diagnosis:

Multiple rules matching? (first-match-wins means earlier rules take precedence)
No routing rule matched, so governance routing selected the provider? (check the scope hierarchy and rule conditions)
Load balancing changed key? (rule sets provider, LB sets key)

Solutions:

Raise the priority value of rules that match too early (a higher value is evaluated later)
Verify scope precedence (VirtualKey > Team > Customer > Global)
Check if another rule has a lower priority value and matches first
Review logs: [RoutingEngine] Rule matched! Decision: provider=...

Header/Parameter Not Found

Symptom: “no such key” error in CEL evaluation

This is normal! DeepIntShield treats missing headers as non-matches:

headers["x-optional"] == "value"  // Returns false if header missing

If you need to check if header exists:

headers["x-optional"] != ""  // True only if present and non-empty

Debugging with Logs

Enable debug logging to see routing rule evaluation:

[RoutingEngine] Starting rule evaluation for provider=openai, model=gpt-4o
[RoutingEngine] Scope chain: [virtual_key(vk-123) team(team-456) customer(cust-789) global]
[RoutingEngine] Evaluating scope=virtual_key, scopeID=vk-123, ruleCount=2
[RoutingEngine] Evaluating rule: id=rule-1, name=Premium Route, expression=headers["x-tier"]=="premium"
[RoutingEngine] Rule rule-1 evaluation result: matched=false
[RoutingEngine] Evaluating rule: id=rule-2, name=Budget Fallback, expression=budget_used>80
[RoutingEngine] Rule rule-2 evaluation result: matched=true
[RoutingEngine] Rule matched! Selected target: provider=groq, model=gpt-3.5-turbo (weight=1), fallbacks=[azure/gpt-4o]
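When sifting through a large debug log, a short script can summarize which rules matched. The Python sketch below assumes the log lines follow exactly the format shown in the sample above. The regular expression and the `summarize` helper are hypothetical and would need adjusting if the real log format differs.

```python
import re
from collections import Counter

RESULT_RE = re.compile(
    r"\[RoutingEngine\] Rule (?P<rule_id>\S+) evaluation result: "
    r"matched=(?P<matched>true|false)"
)


def summarize(lines):
    """Count evaluation results per (rule_id, matched) pair."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            counts[(m.group("rule_id"), m.group("matched") == "true")] += 1
    return counts


sample = [
    '[RoutingEngine] Evaluating rule: id=rule-1, name=Premium Route, expression=headers["x-tier"]=="premium"',
    "[RoutingEngine] Rule rule-1 evaluation result: matched=false",
    "[RoutingEngine] Evaluating rule: id=rule-2, name=Budget Fallback, expression=budget_used>80",
    "[RoutingEngine] Rule rule-2 evaluation result: matched=true",
]

summary = summarize(sample)
matched_rules = [rule_id for (rule_id, matched), _ in summary.items() if matched]
print(matched_rules)  # ['rule-2']
```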

API Reference

Request/Response Examples

Create Capacity-Based Rule

curl -X POST http://localhost:8080/api/governance/routing-rules \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High Budget Fallback",
    "description": "Switch to cheaper provider when budget >85%",
    "enabled": true,
    "cel_expression": "budget_used > 85",
    "targets": [
      { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
    ],
    "fallbacks": ["openai/gpt-3.5-turbo"],
    "scope": "global",
    "priority": 10
  }'

Create Probabilistic Split Rule

curl -X POST http://localhost:8080/api/governance/routing-rules \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Premium Tier Split",
    "cel_expression": "headers[\"x-tier\"] == \"premium\"",
    "targets": [
      { "provider": "openai", "model": "gpt-4o",   "weight": 0.7 },
      { "provider": "azure",  "model": "gpt-4o",   "weight": 0.3 }
    ],
    "scope": "global",
    "priority": 5
  }'

Create Rule with Pinned API Key

curl -X POST http://localhost:8080/api/governance/routing-rules \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Pin Production Key for Premium Tier",
    "description": "Always use the dedicated production key for premium requests",
    "enabled": true,
    "cel_expression": "headers[\"x-tier\"] == \"premium\"",
    "targets": [
      {
        "provider": "openai",
        "model": "gpt-4o",
        "key_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "weight": 1
      }
    ],
    "scope": "global",
    "priority": 5
  }'

List Rules by Team Scope

curl http://localhost:8080/api/governance/routing-rules \
  -H "Authorization: Bearer your-token" \
  -G \
  --data-urlencode "scope=team" \
  --data-urlencode "scope_id=team-uuid-123"

Get In-Memory Rules (Debug)

curl "http://localhost:8080/api/governance/routing-rules?from_memory=true" \
  -H "Authorization: Bearer your-token"

Additional Resources

Provider Routing

Understand how routing rules fit into the complete routing pipeline

Virtual Keys

Configure Virtual Keys that scope routing rules

Governance

Learn about the governance layer, which selects the provider when no routing rule matches

CEL Language Spec

Complete CEL expression language documentation
