Datadog

Datadog LLM Observability dashboard

The Datadog plugin provides native integration with the Datadog observability platform, offering three pillars of observability for your LLM operations:

  • APM Traces - Distributed tracing via dd-trace-go v2 with W3C Trace Context support for end-to-end request visibility
  • LLM Observability - Native Datadog LLM Obs integration for AI/ML-specific monitoring
  • Metrics - Operational metrics via DogStatsD or the Metrics API

Unlike the OTel plugin, which sends generic OpenTelemetry data, the Datadog plugin uses Datadog’s native SDKs for richer integration with Datadog-specific features such as LLM Observability dashboards and ML App grouping.


The plugin supports two deployment modes:

| Mode | Description | Requirements | Best For |
| --- | --- | --- | --- |
| Agent (default) | Sends data through a local Datadog Agent | Datadog Agent running on host | Production deployments with existing agent infrastructure |
| Agentless | Sends data directly to Datadog APIs | API key only | Serverless, containers, or simplified deployments |

In agent mode, the plugin communicates with a locally running Datadog Agent:

  • APM Traces → Agent at localhost:8126
  • Metrics → DogStatsD at localhost:8125

The agent handles batching and retries and provides lower latency. This is the recommended mode for production deployments where the Datadog Agent is already installed.

In agentless mode, the plugin sends data directly to Datadog’s intake APIs:

  • APM Traces → https://trace.agent.{site}
  • LLM Observability → Direct API submission
  • Metrics → Datadog Metrics API

This mode requires an API key but simplifies deployment by eliminating the need for a local agent. It is well suited to serverless environments, Kubernetes pods, or quick testing.


| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| service_name | string | No | deepintshield | Service name displayed in Datadog APM |
| ml_app | string | No | (uses service_name) | ML application name for LLM Observability grouping |
| agent_addr | string | No | localhost:8126 | Datadog Agent address (agent mode only) |
| dogstatsd_addr | string | No | localhost:8125 | DogStatsD server address (agent mode only) |
| env | string | No | - | Environment tag (e.g., production, staging) |
| version | string | No | - | Service version tag |
| custom_tags | object | No | - | Additional tags for all traces and metrics |
| enable_metrics | bool | No | true | Enable metrics emission |
| enable_traces | bool | No | true | Enable APM traces |
| enable_llm_obs | bool | No | true | Enable LLM Observability |
| agentless | bool | No | false | Use agentless mode (direct API) |
| api_key | EnvVar | Agentless only | - | Datadog API key (supports env.VAR_NAME) |
| site | string | No | datadoghq.com | Datadog site/region |
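
As a rough illustration of how these fields interact, the following Python sketch fills in the documented defaults and flags the one conditional requirement (api_key in agentless mode). The helpers `apply_defaults` and `validate` are hypothetical names for a pre-deployment check; they are not part of DeepIntShield, and the plugin's own validation may differ.

```python
from __future__ import annotations

# Defaults copied from the configuration table.
DEFAULTS = {
    "service_name": "deepintshield",
    "agent_addr": "localhost:8126",
    "dogstatsd_addr": "localhost:8125",
    "enable_metrics": True,
    "enable_traces": True,
    "enable_llm_obs": True,
    "agentless": False,
    "site": "datadoghq.com",
}


def apply_defaults(config: dict) -> dict:
    """Return a copy of config with the documented defaults filled in."""
    merged = {**DEFAULTS, **config}
    # ml_app falls back to service_name when it is not set.
    merged.setdefault("ml_app", merged["service_name"])
    return merged


def validate(config: dict) -> list[str]:
    """Return a list of problems found in config (empty if none)."""
    problems = []
    if config.get("agentless") and not config.get("api_key"):
        problems.append("api_key is required when agentless is true")
    return problems
```

For example, `validate(apply_defaults({"agentless": True}))` reports the missing API key, while a config that only sets service_name passes and inherits that name as its ML App.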

The api_key and custom_tags fields support environment variable substitution using the env. prefix:

{
  "api_key": "env.DD_API_KEY",
  "custom_tags": {
    "team": "env.TEAM_NAME",
    "cost_center": "env.COST_CENTER"
  }
}
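
To make the env. convention concrete, here is a minimal Python sketch of how such a substitution could be resolved. It is an assumption about the behavior, not the plugin's actual implementation: `resolve_env` is a hypothetical helper, and because the handling of an unset variable is not described here, the sketch simply raises an error in that case.

```python
import os

ENV_PREFIX = "env."


def resolve_env(value, environ=os.environ):
    """Recursively replace "env.NAME" strings with the value of NAME."""
    if isinstance(value, dict):
        return {key: resolve_env(item, environ) for key, item in value.items()}
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        if name not in environ:
            # Assumption: fail loudly rather than send an empty key or tag.
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]
    return value  # literal values pass through unchanged
```

Applied to the api_key and custom_tags values from the example above, the function would return the contents of DD_API_KEY, TEAM_NAME, and COST_CENTER, and leave literal tag values untouched.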

Configure the Datadog plugin through the DeepIntShield UI:

  1. Navigate to Settings → Plugins
  2. Enable the Datadog plugin
  3. Configure the required fields based on your deployment mode

The plugin supports all Datadog regional sites. Set the site field to match your Datadog account region:

| Site | Region | Value |
| --- | --- | --- |
| US1 (default) | United States | datadoghq.com |
| US3 | United States | us3.datadoghq.com |
| US5 | United States | us5.datadoghq.com |
| EU1 | Europe | datadoghq.eu |
| AP1 | Asia Pacific (Japan) | ap1.datadoghq.com |
| AP2 | Asia Pacific (Australia) | ap2.datadoghq.com |
| US1-FED | US Government | ddog-gov.com |
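
Because the site value determines which Datadog endpoints are contacted in agentless mode, a small helper can catch a mistyped region early. The Python sketch below is illustrative only: `trace_intake_url` follows the https://trace.agent.{site} pattern given for agentless traces, while `api_validate_url` assumes the key-validation endpoint lives at api.{site}, mirroring the api.datadoghq.com URL used for US1. All three function names are hypothetical.

```python
# Site values listed in the regional sites table.
KNOWN_SITES = {
    "datadoghq.com",
    "us3.datadoghq.com",
    "us5.datadoghq.com",
    "datadoghq.eu",
    "ap1.datadoghq.com",
    "ap2.datadoghq.com",
    "ddog-gov.com",
}


def check_site(site: str) -> str:
    """Return site unchanged if it is a documented value, else raise."""
    if site not in KNOWN_SITES:
        raise ValueError(f"unknown Datadog site: {site!r}")
    return site


def trace_intake_url(site: str) -> str:
    """Agentless APM intake URL, following https://trace.agent.{site}."""
    return f"https://trace.agent.{check_site(site)}"


def api_validate_url(site: str) -> str:
    """API key validation URL (assumes the api.{site} host pattern)."""
    return f"https://api.{check_site(site)}/api/v1/validate"
```

A typo such as "datadoghq.co" raises a ValueError immediately instead of surfacing later as a failed submission.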

The Datadog plugin integrates with Datadog LLM Observability to provide AI/ML-specific monitoring capabilities.

LLM traces are grouped under an ML App in Datadog. By default, this uses your service_name, but you can specify a dedicated ML App name:

{
  "service_name": "deepintshield-gateway",
  "ml_app": "customer-support-ai"
}

This allows you to:

  • Group related LLM operations across multiple services
  • Track costs and performance by application
  • Apply ML-specific alerts and dashboards

The plugin supports session tracking via the x-bf-session-id header. Include this header in your requests to group related LLM calls into a conversation session:

Terminal window
curl -X POST https://your-deepintshield-gateway/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-bf-session-id: user-123-session-456" \
  -d '{...}'

Sessions appear in Datadog LLM Observability, allowing you to trace entire conversation flows.
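
The same header can be attached from application code. The following Python sketch uses only the standard library and prepares a request for one conversation. The helper names, the session ID format, and the added Content-Type header are illustrative assumptions; the gateway URL is the same placeholder used in the curl example.

```python
import json
import urllib.request
import uuid

# Placeholder host, as in the curl example.
GATEWAY_URL = "https://your-deepintshield-gateway/v1/chat/completions"


def new_session_id(user_id: str) -> str:
    """Build a session ID. This format is only an example."""
    return f"{user_id}-session-{uuid.uuid4().hex[:8]}"


def build_request(api_key: str, session_id: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a chat request tagged with a session."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Every call that belongs to one conversation reuses this value.
            "x-bf-session-id": session_id,
        },
        method="POST",
    )
```

Create the session ID once per conversation, pass it to every `build_request` call for that conversation, and send each request with `urllib.request.urlopen`.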

The plugin supports W3C Trace Context for distributed tracing across services. When your upstream service sends a traceparent header, DeepIntShield automatically links its spans as children of the parent trace.

Terminal window
curl -X POST https://your-deepintshield-gateway/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" \
  -d '{...}'

This enables:

  • End-to-end visibility - See LLM calls in the context of your full application trace
  • Cross-service correlation - Link frontend requests → backend services → DeepIntShield → LLM providers
  • Latency attribution - Understand how LLM latency contributes to overall request time

The traceparent header format follows the W3C standard:

traceparent: {version}-{trace-id}-{parent-id}-{trace-flags}

All Datadog APM spans created by DeepIntShield will be linked to the parent span, appearing as children in the Datadog trace view.
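
For manual testing without an instrumented upstream service, a traceparent value can be generated and checked by hand. The Python sketch below follows the four-field layout shown above (version 00, a 32-character hex trace ID, a 16-character hex parent ID, and 2-character hex flags). The function names are hypothetical, and in normal operation your tracing library would produce this header for you.

```python
import re
import secrets

_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(sampled: bool = True) -> str:
    """Create a version-00 traceparent value with random IDs."""
    trace_id = secrets.token_hex(16)   # 16 bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(header: str) -> dict:
    """Split a traceparent value into its fields, rejecting invalid ones."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The W3C spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError(f"invalid traceparent: {header!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

Parsing the header from the curl example yields trace ID 0af7651916cd43dd8448eb211c80319c and parent ID b7ad6b7169203331 with the sampled flag set.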

For each LLM operation, the plugin sends to LLM Observability:

  • Input/Output Messages - Full conversation history with role attribution
  • Token Usage - Input, output, and total token counts
  • Cost - Calculated cost in USD based on model pricing
  • Latency - Request duration and time-to-first-token for streaming
  • Model Info - Provider, model name, and request parameters
  • Tool Calls - Function/tool call details for agentic workflows

The plugin emits the following metrics to Datadog:

| Metric | Type | Description | Tags |
| --- | --- | --- | --- |
| deepintshield.requests.total | Counter | Total LLM requests | provider, model, request_type |
| deepintshield.success.total | Counter | Successful requests | provider, model, request_type |
| deepintshield.errors.total | Counter | Failed requests | provider, model, request_type, reason |
| deepintshield.latency.seconds | Histogram | Request latency distribution | provider, model, request_type |
| deepintshield.tokens.input | Counter | Input/prompt tokens consumed | provider, model |
| deepintshield.tokens.output | Counter | Output/completion tokens generated | provider, model |
| deepintshield.tokens.total | Counter | Total tokens (input + output) | provider, model |
| deepintshield.cost.usd | Gauge | Request cost in USD | provider, model |
| deepintshield.cache.hits | Counter | Cache hits | provider, model, cache_type |
| deepintshield.stream.first_token_latency | Histogram | Time to first token (streaming) | provider, model |
| deepintshield.stream.inter_token_latency | Histogram | Inter-token latency (streaming) | provider, model |

All metrics include your configured custom_tags plus automatic tags for:

  • provider - LLM provider (openai, anthropic, etc.)
  • model - Model name
  • request_type - Type of request (chat, embedding, etc.)
  • env - Environment from configuration

Each APM trace includes comprehensive LLM operation metadata:

  • Span Name - Based on request type (genai.chat, genai.embedding, etc.)
  • Service Info - service.name, service.version, env
  • Provider & Model - gen_ai.provider.name, gen_ai.request.model
  • Temperature, max_tokens, top_p, stop sequences
  • Presence/frequency penalties
  • Tool configurations and parallel tool calls
  • Custom parameters via ExtraParams
  • Complete chat history with role-based messages
  • Prompt text for completions
  • Response content with role attribution
  • Tool calls and results
  • Reasoning and refusal content (when present)
  • Token usage (prompt, completion, total)
  • Cost calculations in USD
  • Latency and timing (start/end timestamps)
  • Time to first token (streaming)
  • Error details with status codes
  • Virtual key ID and name
  • Selected key ID and name
  • Team ID and name
  • Customer ID and name
  • Retry count and fallback index

The Datadog plugin captures all DeepIntShield request types:

| Request Type | Span Name | LLM Obs Type |
| --- | --- | --- |
| Chat Completion | genai.chat | LLM Span |
| Chat Completion (streaming) | genai.chat | LLM Span |
| Text Completion | genai.text | LLM Span |
| Text Completion (streaming) | genai.text | LLM Span |
| Embeddings | genai.embedding | Embedding Span |
| Speech Generation | genai.speech | Task Span |
| Speech Generation (streaming) | genai.speech | Task Span |
| Transcription | genai.transcription | Task Span |
| Transcription (streaming) | genai.transcription | Task Span |
| Responses API | genai.responses | LLM Span |
| Responses API (streaming) | genai.responses | LLM Span |

Choose the Datadog plugin when you:

  • Use Datadog as your primary observability platform
  • Want native LLM Observability integration with ML App grouping
  • Need seamless correlation with existing Datadog APM traces via W3C distributed tracing
  • Require Datadog-specific features like notebooks and dashboards
  • Want session tracking for conversation flows

Use the OTel plugin when you:

  • Need multi-vendor observability (send to multiple backends)
  • Are using Datadog via an OpenTelemetry Collector
  • Want vendor flexibility to switch backends without code changes
  • Prefer standardized OpenTelemetry semantic conventions

Use Built-in Observability for:

  • Local development and testing
  • Simple self-hosted deployments
  • Deployments with no external dependencies
  • Direct database access to logs

If the plugin cannot reach the Datadog Agent in agent mode, verify that the agent is running and accessible:

Terminal window
# Check agent status
datadog-agent status
# Test APM endpoint
curl -v http://localhost:8126/info
# Test DogStatsD (should accept UDP packets)
echo "test.metric:1|c" | nc -u -w1 localhost 8125

If authentication fails in agentless mode:

  1. Verify your API key is valid:
Terminal window
curl -X GET "https://api.datadoghq.com/api/v1/validate" \
  -H "DD-API-KEY: $DD_API_KEY"
  2. Ensure the site matches your API key’s region
  3. Check that the API key environment variable is set:
Terminal window
echo $DD_API_KEY

If traces do not appear in Datadog:

  1. Enable debug logging in DeepIntShield:
Terminal window
deepintshield-http --log-level debug
  2. Verify traces are enabled in your configuration:
{
  "enable_traces": true,
  "enable_llm_obs": true
}
  3. Check for errors in the DeepIntShield logs related to the Datadog plugin

If metrics do not appear in Datadog:

  1. Verify DogStatsD is running (agent mode):
Terminal window
datadog-agent status | grep DogStatsD
  2. Ensure metrics are enabled:
{
  "enable_metrics": true
}
  3. For agentless mode, verify your API key has metrics submission permissions

If LLM Observability data does not appear:

  1. Confirm that enable_llm_obs is true (the default)
  2. Verify your Datadog plan includes LLM Observability
  3. Check the ML App name in Datadog under LLM Observability → Applications