Embedded guard runtime
The guard runtime can run in-process inside the gateway, eliminating the
gRPC/HTTP round-trip to a separate deepintshield_guard service. In
single-binary deployments this saves ~20–300ms per request depending on
network topology.
Default behavior
The embedded runtime is on by default. When neither DEEPINTSHIELD_GUARD_URL nor
DEEPINTSHIELD_GUARD_GRPC_TARGET is configured, the gateway auto-selects the
embedded runtime. When either is configured, the gateway uses the remote
runtime unless you also set DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true.
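The precedence described above can be sketched as a small function. This is an illustrative model, not the gateway's actual code: `select_guard_mode` is a hypothetical helper, and two details are assumptions rather than documented behavior: that only the string "true" enables the override, and that gRPC wins when both a URL and a gRPC target are set.

```python
import os
from typing import Mapping


def select_guard_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "embedded", "grpc", or "http" for the given environment."""
    # Assumption: only the literal value "true" (any case) enables the override.
    force_embedded = (
        env.get("DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME", "").strip().lower()
        == "true"
    )
    grpc_target = env.get("DEEPINTSHIELD_GUARD_GRPC_TARGET", "").strip()
    http_url = env.get("DEEPINTSHIELD_GUARD_URL", "").strip()

    # The explicit override beats any configured remote endpoint.
    if force_embedded:
        return "embedded"
    # Assumption: gRPC is preferred over HTTP when both are configured.
    if grpc_target:
        return "grpc"
    if http_url:
        return "http"
    # Nothing configured: the embedded runtime is the default.
    return "embedded"
```

Passing a plain dict instead of the process environment makes it easy to check how a given combination of variables would resolve before deploying it.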
    # Force embedded mode (overrides any URL/GRPC target).
    DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true
    # Switch back to a remote runtime.
    DEEPINTSHIELD_GUARD_URL="https://guard.internal:8443"
    # or
    DEEPINTSHIELD_GUARD_GRPC_TARGET="guard.internal:9443"
How to tell which mode is active
The gateway logs the chosen mode at startup:
    [Guardrails] runtime mode=embedded (overhead-min path: in-process, no RPC hop)
    [Guardrails] runtime mode=grpc (overhead-min path: remote gRPC — set DEEPINTSHIELD_GUARD_USE_EMBEDDED_RUNTIME=true to skip the hop in single-binary deploys)
    [Guardrails] runtime mode=http (overhead-min path: remote HTTP — gRPC preferred when available)
When to use a remote runtime
Embedded is the right default for almost every deployment. Consider the remote runtime only when:
- You need to scale guard evaluation independently of the gateway (e.g. a dedicated GPU pool for heavy hallucination classifiers).
- You’re running multiple gateway replicas that must share a single guard tenant cache.
- A security boundary (PCI/HIPAA) requires guard evaluation in a separate network zone.
Tunable embedded knobs
You can tune the embedded runtime via the plugin config:
    {
      "name": "guardrails",
      "enabled": true,
      "config": {
        "embedded_adapter_timeout_ms": 1500,
        "embedded_rag_chunk_parallelism": 8,
        "embedded_timeouts_by_category": {
          "pii": 150,
          "toxicity": 600,
          "jailbreak": 1200
        }
      }
    }
See Per-category timeouts for the embedded_timeouts_by_category map.
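To make the relationship between the two timeout settings concrete, here is a minimal sketch of how a per-category lookup might work. It rests on two assumptions the text above does not confirm: that the per-category values are milliseconds, and that a category without its own entry falls back to `embedded_adapter_timeout_ms`. `timeout_ms_for` is a hypothetical helper, not part of the gateway.

```python
import json

# The plugin config from the example above, parsed from JSON.
PLUGIN_CONFIG = json.loads("""
{
  "name": "guardrails",
  "enabled": true,
  "config": {
    "embedded_adapter_timeout_ms": 1500,
    "embedded_rag_chunk_parallelism": 8,
    "embedded_timeouts_by_category": {
      "pii": 150,
      "toxicity": 600,
      "jailbreak": 1200
    }
  }
}
""")


def timeout_ms_for(plugin: dict, category: str) -> int:
    """Resolve the timeout (assumed to be in ms) for one guard category."""
    cfg = plugin.get("config", {})
    by_category = cfg.get("embedded_timeouts_by_category", {})
    # Assumption: categories with no explicit entry use the adapter-wide timeout.
    return by_category.get(category, cfg["embedded_adapter_timeout_ms"])
```

Under these assumptions, `timeout_ms_for(PLUGIN_CONFIG, "pii")` resolves to the category-specific 150, while a category absent from the map, such as a hypothetical "hallucination", resolves to the adapter-wide 1500. Check the Per-category timeouts page for the real fallback rules.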