Embedded guard runtime

The guard runtime can run in-process inside the gateway, which eliminates the gRPC/HTTP round trip to a separate deepintshield_guard service. In single-binary deployments this saves roughly 20–300 ms per request, depending on network topology.

Default behavior

The embedded runtime is on by default. When neither DEEPINTSHIELD_GUARD_URL nor DEEPINTSHIELD_GUARD_GRPC_TARGET is configured, the gateway auto-selects the embedded runtime. When either is configured, the gateway uses the remote runtime it points to, unless you also set DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true, which forces embedded mode.

# Force embedded mode (overrides any URL/GRPC target).
DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true

# Switch back to a remote runtime (do not also set DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true).
DEEPINTSHIELD_GUARD_URL="https://guard.internal:8443"
# or
DEEPINTSHIELD_GUARD_GRPC_TARGET="guard.internal:9443"
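
The selection rules above can be sketched as a small decision function. This is an illustrative sketch, not the gateway's actual code: the function name select_runtime_mode is hypothetical, and the choice of gRPC over HTTP when both variables are set is an assumption based on the "gRPC preferred when available" hint in the startup log.

```python
from typing import Mapping


def select_runtime_mode(env: Mapping[str, str]) -> str:
    """Return 'embedded', 'grpc', or 'http' from the documented env variables."""
    force_embedded = (
        env.get("DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME", "").strip().lower() == "true"
    )
    grpc_target = env.get("DEEPINTSHIELD_GUARD_GRPC_TARGET", "").strip()
    http_url = env.get("DEEPINTSHIELD_GUARD_URL", "").strip()

    # Embedded wins when it is forced, or when no remote endpoint is configured.
    if force_embedded or not (grpc_target or http_url):
        return "embedded"

    # Assumption: gRPC takes precedence over HTTP when both are configured.
    if grpc_target:
        return "grpc"
    return "http"
```

Calling select_runtime_mode(os.environ) in a pre-deploy check is one way to confirm which mode a given environment should resolve to before the gateway starts.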

How to tell which mode is active

The gateway logs the chosen mode at startup:

[Guardrails] runtime mode=embedded (overhead-min path: in-process, no RPC hop)
[Guardrails] runtime mode=grpc     (overhead-min path: remote gRPC — set DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true to skip the hop in single-binary deploys)
[Guardrails] runtime mode=http     (overhead-min path: remote HTTP — gRPC preferred when available)
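
If you want to check the mode automatically (for example in a smoke test), a log scan along these lines may help. It is a minimal sketch that assumes the startup line keeps the "[Guardrails] runtime mode=<mode>" prefix shown above. The helper name detect_runtime_mode is hypothetical.

```python
import re
from typing import Optional

# Matches the startup line format shown above; the trailing text is ignored.
_MODE_RE = re.compile(r"\[Guardrails\] runtime mode=(embedded|grpc|http)\b")


def detect_runtime_mode(log_text: str) -> Optional[str]:
    """Return the most recently logged runtime mode, or None if none was logged."""
    mode = None
    for line in log_text.splitlines():
        match = _MODE_RE.search(line)
        if match:
            # Keep the last match so a restart with a new mode is reflected.
            mode = match.group(1)
    return mode
```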

When to use a remote runtime

Embedded is the right default for almost every deployment. Consider the remote runtime only when:

- You need to scale guard evaluation independently of the gateway (e.g., a dedicated GPU pool for heavy hallucination classifiers).
- You’re running multiple gateway replicas that must share a single guard tenant cache.
- A security boundary (PCI/HIPAA) requires guard evaluation in a separate network zone.

Tunable embedded knobs

You can pass tuning options to the embedded runtime through the guardrails plugin config:

{
  "name": "guardrails",
  "enabled": true,
  "config": {
    "embedded_adapter_timeout_ms": 1500,
    "embedded_rag_chunk_parallelism": 8,
    "embedded_timeouts_by_category": {
      "pii": 150,
      "toxicity": 600,
      "jailbreak": 1200
    }
  }
}

See Per-category timeouts for the embedded_timeouts_by_category map.
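
To illustrate how the two timeout settings might interact, here is a minimal sketch of a lookup. It assumes that a category listed in embedded_timeouts_by_category uses its own value and that any other category falls back to embedded_adapter_timeout_ms. That fallback rule is an assumption, not something this page states, and the helper name resolve_timeout_ms is hypothetical.

```python
from typing import Any, Mapping, Optional


def resolve_timeout_ms(config: Mapping[str, Any], category: str) -> Optional[int]:
    """Pick the timeout (in ms) for a guard category from the plugin config."""
    per_category = config.get("embedded_timeouts_by_category") or {}
    # Assumption: categories without an override use the adapter-wide timeout.
    default = config.get("embedded_adapter_timeout_ms")
    return per_category.get(category, default)


# The "config" object from the example above.
example_config = {
    "embedded_adapter_timeout_ms": 1500,
    "embedded_rag_chunk_parallelism": 8,
    "embedded_timeouts_by_category": {"pii": 150, "toxicity": 600, "jailbreak": 1200},
}
```

Under that assumption, resolve_timeout_ms(example_config, "pii") gives the per-category value, while a category with no entry gets the adapter-wide value.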