Anthropic
Overview
Anthropic has significant structural differences from OpenAI’s format. DeepIntShield performs extensive conversion, including:
- System message extraction - Removed from the messages array and placed in a separate `system` field
- Tool message grouping - Consecutive tool messages merged into a single user message
- Thinking block transformation - `reasoning` parameters mapped to Anthropic’s `thinking` structure
- Parameter renaming - e.g., `max_completion_tokens` → `max_tokens`, `stop` → `stop_sequences`
- Content format conversion - Images, files, and other content types adapted to Anthropic’s schema
Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/messages |
| Responses API | ✅ | ✅ | /v1/messages |
| Text Completions | ✅ | ❌ | /v1/complete |
| Embeddings | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
| Transcriptions (STT) | ❌ | ❌ | - |
| Image Generation | ❌ | ❌ | - |
| Files | ✅ | - | /v1/files |
| Batch | ✅ | - | /v1/messages/batches |
| List Models | ✅ | - | /v1/models |
1. Chat Completions
Request Parameters
Parameter Mapping
| Parameter | Transformation |
|---|---|
| `max_completion_tokens` | Renamed to `max_tokens` |
| `temperature`, `top_p` | Direct pass-through |
| `stop` | Renamed to `stop_sequences` |
| `response_format` | Converted to `output_format` |
| `tools` | Schema restructured (see Tool Conversion) |
| `tool_choice` | Type mapped (see Tool Conversion) |
| `reasoning` | Mapped to `thinking` (see Reasoning / Thinking) |
| `user` | Wrapped in `metadata.user_id` |
| `top_k` | Via `extra_params` (Anthropic-specific) |
Dropped Parameters
The following parameters are silently ignored: `frequency_penalty`, `presence_penalty`, `logit_bias`, `logprobs`, `top_logprobs`, `seed`, `parallel_tool_calls`, `service_tier`
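The renames and drops above can be pictured as a single pass over the top-level request fields. The following is a minimal sketch, not DeepIntShield's actual code: `mapChatParams` and the two lookup tables are hypothetical names, and the structured conversions (`tools`, `tool_choice`, `reasoning`, `response_format`) are deliberately left out and pass through unchanged here.

```go
package main

// droppedChatParams lists the fields the documentation says are silently ignored.
var droppedChatParams = map[string]bool{
	"frequency_penalty": true, "presence_penalty": true, "logit_bias": true,
	"logprobs": true, "top_logprobs": true, "seed": true,
	"parallel_tool_calls": true, "service_tier": true,
}

// renamedChatParams covers the simple one-to-one renames from the mapping table.
var renamedChatParams = map[string]string{
	"max_completion_tokens": "max_tokens",
	"stop":                  "stop_sequences",
}

// mapChatParams is a hypothetical helper: it renames, wraps, or drops
// top-level fields and copies everything else through untouched.
func mapChatParams(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	for key, value := range in {
		switch {
		case droppedChatParams[key]:
			continue // silently ignored, no error is raised
		case key == "user":
			// user is wrapped in metadata.user_id
			out["metadata"] = map[string]any{"user_id": value}
		default:
			if renamed, ok := renamedChatParams[key]; ok {
				key = renamed
			}
			out[key] = value
		}
	}
	return out
}
```

Because dropped fields produce no error, a caller that relies on `seed` or `logprobs` gets no signal that the setting had no effect.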
Extra Parameters
Use `extra_params` (SDK) or pass directly in request body (Gateway) for Anthropic-specific fields:

Gateway:

    curl -X POST http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [{"role": "user", "content": "Hello"}],
        "top_k": 40
      }'

Go SDK:

    resp, err := client.ChatCompletionRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet",
        Input:    messages,
        Params: &schemas.ChatParameters{
            // Anthropic-specific fields go through ExtraParams.
            ExtraParams: map[string]interface{}{
                "top_k": 40,
            },
        },
    })

Anthropic also accepts a top-level `"cache_control": {"type": "ephemeral"}` object on `/anthropic/v1/messages` requests to enable automatic prompt caching, and DeepIntShield now forwards that directive through unchanged.
Cache Control
Cache directives can be added to system messages, user messages, and tool definitions to enable prompt caching:

Gateway:

    curl -X POST http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "This is cached context",
                "cache_control": {"type": "ephemeral"}
              }
            ]
          }
        ],
        "system": [
          {
            "type": "text",
            "text": "You are a helpful assistant",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      }'

Go SDK:

    resp, err := client.ChatCompletionRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet",
        Input: []schemas.ChatMessage{
            {
                Role: schemas.ChatMessageRoleUser,
                Content: &schemas.ChatMessageContent{
                    ContentBlocks: []schemas.ChatContentBlock{
                        {
                            Text: schemas.Ptr("This is cached context"),
                            CacheControl: &schemas.CacheControl{
                                Type: schemas.Ptr("ephemeral"),
                            },
                        },
                    },
                },
            },
        },
        SystemMessages: []schemas.ChatMessage{
            {
                Role: schemas.ChatMessageRoleSystem,
                Content: &schemas.ChatMessageContent{
                    ContentBlocks: []schemas.ChatContentBlock{
                        {
                            Text: schemas.Ptr("You are a helpful assistant"),
                            CacheControl: &schemas.CacheControl{
                                Type: schemas.Ptr("ephemeral"),
                            },
                        },
                    },
                },
            },
        },
    })

Reasoning / Thinking
Documentation: See DeepIntShield Reasoning Reference
Parameter Mapping
- `reasoning.effort` → `thinking.type` (always mapped to `"enabled"`)
- `reasoning.max_tokens` → `thinking.budget_tokens` (token budget for thinking)
Critical Constraints
- Minimum budget: At least 1024 tokens are required; requests with a lower value fail with an error
- Dynamic budget: `-1` is converted to `1024` automatically
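The two constraints above interact: `-1` is normalized first, then the minimum is enforced. A minimal sketch of that order of operations follows. The type and function names (`Reasoning`, `Thinking`, `toThinking`) are illustrative rather than the SDK's, and the sketch assumes `max_tokens` is always supplied; how a request with only `effort` is budgeted is not covered here.

```go
package main

import "fmt"

// minThinkingBudget is the minimum budget stated in the constraints above.
const minThinkingBudget = 1024

// Reasoning mirrors the OpenAI-style reasoning object (illustrative type).
type Reasoning struct {
	Effort    string
	MaxTokens int
}

// Thinking mirrors Anthropic's thinking object.
type Thinking struct {
	Type         string `json:"type"`
	BudgetTokens int    `json:"budget_tokens"`
}

// toThinking is a hypothetical helper that applies the documented rules.
func toThinking(r Reasoning) (Thinking, error) {
	budget := r.MaxTokens
	if budget == -1 {
		// Dynamic budgeting is not supported, so -1 becomes the minimum.
		budget = minThinkingBudget
	}
	if budget < minThinkingBudget {
		return Thinking{}, fmt.Errorf("reasoning.max_tokens must be >= %d, got %d", minThinkingBudget, budget)
	}
	// thinking.type is always "enabled", regardless of the effort level.
	return Thinking{Type: "enabled", BudgetTokens: budget}, nil
}
```

Note that `Effort` is accepted but does not influence the result in this sketch, which matches the "always mapped to enabled" rule.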
Example
    // Request
    {"reasoning": {"effort": "high", "max_tokens": 2048}}

    // Anthropic conversion
    {"thinking": {"type": "enabled", "budget_tokens": 2048}}

Message Conversion
Section titled “Message Conversion”Critical Caveats
- System message extraction: System messages are removed from the messages array and placed in a separate `system` field. Multiple system messages become separate text blocks in the system array.
- Tool message grouping: Consecutive tool messages are merged into a single user message with `tool_result` content blocks.
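Both caveats change the shape of the message array, which is easiest to see in code. The sketch below is an illustration under simplifying assumptions: messages carry plain text only, and `Message`, `Block`, `AnthropicMessage`, and `convertMessages` are hypothetical names, not the SDK's types.

```go
package main

// Message is a simplified OpenAI-style chat message (text content only).
type Message struct {
	Role       string
	Content    string
	ToolCallID string // set on role "tool" messages
}

// Block is a simplified Anthropic content block.
type Block struct {
	Type      string // "text" or "tool_result"
	Text      string
	ToolUseID string
}

// AnthropicMessage is one message after conversion.
type AnthropicMessage struct {
	Role    string
	Content []Block
}

// convertMessages pulls system messages into their own slice and merges
// runs of tool messages into a single user message.
func convertMessages(in []Message) (system []Block, out []AnthropicMessage) {
	for _, m := range in {
		switch m.Role {
		case "system":
			// Each system message becomes its own text block.
			system = append(system, Block{Type: "text", Text: m.Content})
		case "tool":
			result := Block{Type: "tool_result", ToolUseID: m.ToolCallID, Text: m.Content}
			// Extend the previous message if it is already a tool_result group.
			if n := len(out); n > 0 && isToolResultGroup(out[n-1]) {
				out[n-1].Content = append(out[n-1].Content, result)
				continue
			}
			out = append(out, AnthropicMessage{Role: "user", Content: []Block{result}})
		default:
			out = append(out, AnthropicMessage{
				Role:    m.Role,
				Content: []Block{{Type: "text", Text: m.Content}},
			})
		}
	}
	return system, out
}

// isToolResultGroup reports whether m is a user message holding tool results.
func isToolResultGroup(m AnthropicMessage) bool {
	return m.Role == "user" && len(m.Content) > 0 && m.Content[0].Type == "tool_result"
}
```

With one system message, one user turn, one assistant turn, and two tool results as input, the output holds one system block and three messages, so indexes into the original array no longer line up.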
Image Conversion
- URL images: `{"type": "image_url", "image_url": {}}` → `{"type": "image", "source": {"type": "url", ...}}`
- Base64 images: Data URL → `{"type": "image", "source": {"type": "base64", "media_type": "image/png", ...}}`
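The base64 case requires splitting a data URL into its media type and payload. The sketch below shows one way to do that; `toImageSource` is a hypothetical helper, and the `data` and `url` field names are taken from Anthropic's public Messages API rather than from this page, so treat them as an assumption about the elided parts of the mapping.

```go
package main

import (
	"errors"
	"strings"
)

// ImageSource models the "source" object of an Anthropic image block.
type ImageSource struct {
	Type      string `json:"type"`
	MediaType string `json:"media_type,omitempty"`
	Data      string `json:"data,omitempty"`
	URL       string `json:"url,omitempty"`
}

// toImageSource converts an OpenAI-style image URL (remote or data URL).
func toImageSource(imageURL string) (ImageSource, error) {
	if !strings.HasPrefix(imageURL, "data:") {
		// Remote images are passed by reference.
		return ImageSource{Type: "url", URL: imageURL}, nil
	}
	// A data URL looks like data:<media type>;base64,<payload>.
	header, data, found := strings.Cut(strings.TrimPrefix(imageURL, "data:"), ",")
	if !found {
		return ImageSource{}, errors.New("malformed data URL: missing comma")
	}
	mediaType, isBase64 := strings.CutSuffix(header, ";base64")
	if !isBase64 {
		return ImageSource{}, errors.New("unsupported data URL: expected base64 encoding")
	}
	return ImageSource{Type: "base64", MediaType: mediaType, Data: data}, nil
}
```

The payload is forwarded as-is; this sketch does not decode or validate the base64 content.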
Cache Control Locations
Cache directives supported on: system content blocks, user message content blocks, tool definitions (see Cache Control examples above)
Tool Conversion
Tool definitions are restructured: `function.name` → `name`, `function.parameters` → `input_schema`, `function.strict` is dropped.
Tool choice mapping: `"auto"` → `auto` | `"none"` → `none` | `"required"` → `any` | Specific tool → `{"type": "tool", "name": "X"}`
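The two mappings above can be sketched as a pair of small functions. The names are hypothetical, and two details are assumptions not stated on this page: the description is carried over unchanged, and an unrecognized choice falls back to `auto`.

```go
package main

// OpenAIFunction is the "function" object of an OpenAI-style tool definition.
type OpenAIFunction struct {
	Name        string
	Description string
	Parameters  map[string]any
	Strict      *bool
}

// AnthropicTool is the flattened Anthropic tool definition.
type AnthropicTool struct {
	Name        string         `json:"name"`
	Description string         `json:"description,omitempty"`
	InputSchema map[string]any `json:"input_schema"`
}

// toAnthropicTool restructures the definition; Strict has no counterpart
// and is dropped.
func toAnthropicTool(f OpenAIFunction) AnthropicTool {
	return AnthropicTool{
		Name:        f.Name,
		Description: f.Description, // assumption: carried over as-is
		InputSchema: f.Parameters,
	}
}

// toAnthropicToolChoice maps a string choice, or a specific function name
// when one is given, to Anthropic's tool_choice object.
func toAnthropicToolChoice(choice, functionName string) map[string]string {
	if functionName != "" {
		return map[string]string{"type": "tool", "name": functionName}
	}
	switch choice {
	case "none":
		return map[string]string{"type": "none"}
	case "required":
		return map[string]string{"type": "any"}
	default:
		// "auto" and, by assumption, anything unrecognized.
		return map[string]string{"type": "auto"}
	}
}
```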
Response Conversion
Field Mapping
- `stop_reason` → `finish_reason`: `end_turn`/`stop_sequence` → `stop`, `max_tokens` → `length`, `tool_use` → `tool_calls`
- `input_tokens + cache_read_input_tokens + cache_creation_input_tokens` → `prompt_tokens` (all cache counts rolled into the total)
- Cache token breakdown surfaced in `prompt_tokens_details`:
  - `cache_read_input_tokens` → `prompt_tokens_details.cached_read_tokens`
  - `cache_creation_input_tokens` → `prompt_tokens_details.cached_write_tokens`
- `output_tokens` → `completion_tokens`
- `thinking` blocks → `reasoning_details` with index, type, text, and signature fields
- Tool call arguments converted from JSON object → JSON string
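The field mapping above combines a lookup, a sum, and a serialization step. A minimal sketch of each follows; all type and function names are hypothetical, and passing an unrecognized `stop_reason` through unchanged is an assumption.

```go
package main

import "encoding/json"

// AnthropicUsage holds the usage counters named in the field mapping.
type AnthropicUsage struct {
	InputTokens              int
	OutputTokens             int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
}

// Usage is a simplified OpenAI-style usage object.
type Usage struct {
	PromptTokens      int
	CompletionTokens  int
	CachedReadTokens  int // prompt_tokens_details.cached_read_tokens
	CachedWriteTokens int // prompt_tokens_details.cached_write_tokens
}

// finishReason maps Anthropic's stop_reason to finish_reason.
func finishReason(stopReason string) string {
	switch stopReason {
	case "end_turn", "stop_sequence":
		return "stop"
	case "max_tokens":
		return "length"
	case "tool_use":
		return "tool_calls"
	default:
		return stopReason // assumption: unknown values pass through
	}
}

// toUsage rolls both cache counters into prompt_tokens and keeps the breakdown.
func toUsage(u AnthropicUsage) Usage {
	return Usage{
		PromptTokens:      u.InputTokens + u.CacheReadInputTokens + u.CacheCreationInputTokens,
		CompletionTokens:  u.OutputTokens,
		CachedReadTokens:  u.CacheReadInputTokens,
		CachedWriteTokens: u.CacheCreationInputTokens,
	}
}

// argumentsString serializes a tool_use input object into the JSON string
// that the arguments field expects.
func argumentsString(input map[string]any) (string, error) {
	encoded, err := json.Marshal(input)
	if err != nil {
		return "", err
	}
	return string(encoded), nil
}
```

Because cache counts are folded into `prompt_tokens`, the reported value can exceed the raw `input_tokens` Anthropic returned; the details fields recover the split.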
Streaming
Event sequence: message_start → content_block_start → content_block_delta → content_block_stop → message_delta → message_stop
Delta types: text_delta → content | input_json_delta → tool arguments | thinking_delta → reasoning text | signature_delta → reasoning signature
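Routing by delta type can be sketched as a switch over the parsed event. This is an illustration only: `Chunk` and `chunkFromEvent` are hypothetical, and the JSON field names inside each delta (`text`, `partial_json`, `thinking`, `signature`) come from Anthropic's public streaming format, not from this page.

```go
package main

import "encoding/json"

// delta is the payload of a content_block_delta event.
type delta struct {
	Type        string `json:"type"`
	Text        string `json:"text"`
	PartialJSON string `json:"partial_json"`
	Thinking    string `json:"thinking"`
	Signature   string `json:"signature"`
}

// contentBlockDelta is the subset of the event this sketch reads.
type contentBlockDelta struct {
	Type  string `json:"type"`
	Index int    `json:"index"`
	Delta delta  `json:"delta"`
}

// Chunk is a simplified streamed piece in the converted response.
type Chunk struct {
	Content            string
	ToolArguments      string
	ReasoningText      string
	ReasoningSignature string
}

// chunkFromEvent returns ok=false for events that carry no delta payload
// (message_start, content_block_stop, and so on).
func chunkFromEvent(raw []byte) (Chunk, bool, error) {
	var ev contentBlockDelta
	if err := json.Unmarshal(raw, &ev); err != nil {
		return Chunk{}, false, err
	}
	if ev.Type != "content_block_delta" {
		return Chunk{}, false, nil
	}
	var c Chunk
	switch ev.Delta.Type {
	case "text_delta":
		c.Content = ev.Delta.Text
	case "input_json_delta":
		c.ToolArguments = ev.Delta.PartialJSON
	case "thinking_delta":
		c.ReasoningText = ev.Delta.Thinking
	case "signature_delta":
		c.ReasoningSignature = ev.Delta.Signature
	default:
		return Chunk{}, false, nil
	}
	return c, true, nil
}
```

Tool arguments arrive as fragments of a JSON document, so a consumer has to concatenate them before parsing.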
Caveats
System Message Extraction
Severity: High
Behavior: System messages removed from array, placed in separate system field
Impact: Message array structure differs from input
Code: chat.go:145-167
Tool Message Grouping
Severity: High
Behavior: Consecutive tool messages merged into single user message
Impact: Message count and structure changes
Code: chat.go:169-216
Minimum Reasoning Budget
Severity: High
Behavior: reasoning.max_tokens must be >= 1024
Impact: Requests with lower values fail with error
Code: chat.go:113-115
Dynamic Budget Conversion
Severity: Medium
Behavior: reasoning.max_tokens = -1 converted to 1024
Impact: Dynamic budgeting not supported
Code: chat.go:107-111
Strict Tool Mode Dropped
Severity: Medium
Behavior: strict: true in tool definitions silently dropped
Impact: No schema validation enforcement
Code: chat.go:43-72
Arguments Serialization
Severity: Low
Behavior: Tool call input (object) serialized to arguments (JSON string)
Code: chat.go:341-350
2. Responses API
The Responses API uses the same underlying `/v1/messages` endpoint but converts between OpenAI’s Responses format and Anthropic’s Messages format.
Request Parameters
Parameter Mapping
| Parameter | Transformation |
|---|---|
| `max_output_tokens` | Renamed to `max_tokens` |
| `temperature`, `top_p` | Direct pass-through |
| `instructions` | Becomes system message |
| `tools` | Schema restructured (see Chat Completions) |
| `tool_choice` | Type mapped (see Chat Completions) |
| `reasoning` | Mapped to `thinking` (see Reasoning / Thinking) |
| `user` | Wrapped in `metadata.user_id` |
| `text` | Converted to `output_format` |
| `include` | Via `extra_params` (Anthropic-specific) |
| `stop` | Via `extra_params`, renamed to `stop_sequences` |
| `top_k` | Via `extra_params` (Anthropic-specific) |
| `truncation` | Auto-set to `"auto"` for computer tools |
Extra Parameters
Use `extra_params` (SDK) or pass directly in request body (Gateway):

Gateway:

    curl -X POST http://localhost:8080/v1/responses \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "input": "Hello, how are you?",
        "top_k": 40
      }'

Go SDK:

    resp, err := client.ResponsesRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldResponsesRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet",
        Input:    messages,
        Params: &schemas.ResponsesParameters{
            // Anthropic-specific fields go through ExtraParams.
            ExtraParams: map[string]interface{}{
                "top_k": 40,
            },
        },
    })

Cache Control
Cache directives can be added to instructions (system) and input messages to enable prompt caching:

Gateway:

    curl -X POST http://localhost:8080/v1/responses \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "instructions": "You are a helpful assistant. This instruction is cached.",
        "instructions_cache_control": {"type": "ephemeral"},
        "input": [
          {
            "type": "text",
            "text": "Answer this question",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      }'

Go SDK:

    resp, err := client.ResponsesRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), &schemas.DeepIntShieldResponsesRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet",
        Input: []schemas.ChatMessage{
            {
                Role: schemas.ChatMessageRoleUser,
                Content: &schemas.ChatMessageContent{
                    ContentBlocks: []schemas.ChatContentBlock{
                        {
                            Text: schemas.Ptr("Answer this question"),
                            CacheControl: &schemas.CacheControl{
                                Type: schemas.Ptr("ephemeral"),
                            },
                        },
                    },
                },
            },
        },
        Params: &schemas.ResponsesParameters{
            Instructions: schemas.Ptr("You are a helpful assistant. This instruction is cached."),
            InstructionsCacheControl: &schemas.CacheControl{
                Type: schemas.Ptr("ephemeral"),
            },
        },
    })

Input & Instructions
- Input: A string is wrapped as a single user message; an array is converted to messages
- Instructions: Becomes the system message (same extraction as Chat Completions)
Tool Support
Supported types: `function`, `computer_use_preview`, `web_search`, `mcp`
Tool conversions are the same as for Chat Completions, with two additions: MCP tools are mapped to `mcp_servers` (`server_label` → `name`, `server_url` → `url`), and computer tools automatically set `truncation: "auto"`
Cache control supported on instructions and input blocks (see Cache Control examples)
Response Conversion
- `stop_reason` → `status`: `end_turn`/`stop_sequence` → `completed`, `max_tokens` → `incomplete`
- Top-level `input_tokens` and `output_tokens` are rollups that include cache-related usage; they map as `input_tokens` → `input_tokens` | `output_tokens` → `output_tokens`.
- Cache-specific counts are exposed in details: `cache_read_input_tokens` → `input_tokens_details.cached_read_tokens` | `cache_creation_input_tokens` → `input_tokens_details.cached_write_tokens`
- Output items: `text` → `message` | `tool_use` → `function_call` | `thinking` → `reasoning`
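The status and output item mappings above are plain lookups. The sketch below uses hypothetical function names and returns `false` for values the list does not mention, since the handling of other stop reasons or block types is not specified here.

```go
package main

// responseStatus maps Anthropic's stop_reason to a Responses API status.
// ok is false for stop reasons the mapping above does not cover.
func responseStatus(stopReason string) (status string, ok bool) {
	switch stopReason {
	case "end_turn", "stop_sequence":
		return "completed", true
	case "max_tokens":
		return "incomplete", true
	default:
		return "", false
	}
}

// outputItemType maps an Anthropic content block type to a Responses
// output item type.
func outputItemType(blockType string) (itemType string, ok bool) {
	switch blockType {
	case "text":
		return "message", true
	case "tool_use":
		return "function_call", true
	case "thinking":
		return "reasoning", true
	default:
		return "", false
	}
}
```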
Streaming
Event sequence: message_start → content_block_start → content_block_delta → content_block_stop → message_delta → message_stop
Special handling: Computer tool arguments accumulated across chunks (emitted on content_block_stop), synthetic content_part.added events emitted for text/reasoning, MCP calls use mcp_call_arguments_delta, item IDs generated as msg_{messageID}_item_{outputIndex}
3. Text Completions (Legacy)
Request: `prompt` auto-wrapped as `\n\nHuman: {prompt}\n\nAssistant:` | `max_tokens` → `max_tokens_to_sample` | `temperature`, `top_p` direct pass-through | `top_k`, `stop` via `extra_params` (→ `stop_sequences`)
Response: `completion` → `choices[0].text` | `stop_reason` → `finish_reason`
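The prompt wrapping is a fixed template around the caller's text. A minimal sketch, with hypothetical type and function names:

```go
package main

import "fmt"

// AnthropicCompleteRequest is the subset of the legacy request body
// this sketch fills in.
type AnthropicCompleteRequest struct {
	Prompt            string `json:"prompt"`
	MaxTokensToSample int    `json:"max_tokens_to_sample"`
}

// toCompleteRequest wraps the prompt in the Human/Assistant template and
// renames max_tokens to max_tokens_to_sample.
func toCompleteRequest(prompt string, maxTokens int) AnthropicCompleteRequest {
	return AnthropicCompleteRequest{
		Prompt:            fmt.Sprintf("\n\nHuman: %s\n\nAssistant:", prompt),
		MaxTokensToSample: maxTokens,
	}
}
```

Since the wrapping is automatic, a caller that already includes `Human:` and `Assistant:` markers in the prompt would end up with them nested inside the template.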
4. Batch API
Request formats: `requests` array (`CustomID` + `Params`) or `input_file_id`
Pagination: Cursor-based with `after_id`, `before_id`, `limit`
Endpoints:
- POST `/v1/messages/batches` - Create
- GET `/v1/messages/batches` - List
- GET `/v1/messages/batches/{batch_id}` - Retrieve
- POST `/v1/messages/batches/{batch_id}/cancel` - Cancel
Response: JSONL format with {custom_id, result: {type, message}}
Status mapping: in_progress → InProgress, canceling → Cancelling, ended → Ended
Note: RFC3339Nano timestamps converted to Unix, multi-key retry supported
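The status mapping and timestamp conversion noted above can be sketched with the standard library alone. The function names are hypothetical, and statuses outside the three listed are reported as unmapped because their handling is not described here.

```go
package main

import "time"

// batchStatus maps Anthropic's batch processing status to the names
// listed in the status mapping above.
func batchStatus(status string) (mapped string, ok bool) {
	switch status {
	case "in_progress":
		return "InProgress", true
	case "canceling":
		return "Cancelling", true
	case "ended":
		return "Ended", true
	default:
		return "", false
	}
}

// toUnix parses an RFC3339Nano timestamp and returns Unix seconds.
// Sub-second precision is discarded by Unix().
func toUnix(timestamp string) (int64, error) {
	parsed, err := time.Parse(time.RFC3339Nano, timestamp)
	if err != nil {
		return 0, err
	}
	return parsed.Unix(), nil
}
```

Note the spelling difference between Anthropic's `canceling` and the mapped `Cancelling`, which is easy to miss when comparing strings.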
5. Files API
Upload: Multipart/form-data with `file` (required) and `filename` (optional)
Field mapping: id | filename | size_bytes → bytes | created_at (Unix) | mime_type → content_type
Endpoints: POST /v1/files, GET /v1/files (cursor pagination), GET /v1/files/{file_id}, DELETE /v1/files/{file_id}, GET /v1/files/{file_id}/content
Note: File purpose always "batch", status always "processed"
6. List Models
Request: GET `/v1/models?limit={defaultPageSize}` (no body)
Field mapping: id (prefixed anthropic/) | display_name → name | created_at (Unix timestamp)
Pagination: Token-based with NextPageToken, FirstID, LastID
Multi-key support: Results aggregated from all keys, filtered by allowed_models if configured
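Aggregation across keys, the `anthropic/` prefix, and the `allowed_models` filter can be combined in one loop. The sketch below makes several assumptions not stated above: the allow list holds unprefixed model IDs, an empty list means no filtering, and a model returned by more than one key is listed once. All names are hypothetical.

```go
package main

// AnthropicModel is the subset of a model entry this sketch reads.
type AnthropicModel struct {
	ID          string
	DisplayName string
}

// Model is the converted entry.
type Model struct {
	ID   string
	Name string
}

// aggregateModels merges per-key results, applies the optional allow list,
// and prefixes each ID with the provider name.
func aggregateModels(perKey [][]AnthropicModel, allowed []string) []Model {
	allow := make(map[string]bool, len(allowed))
	for _, id := range allowed {
		allow[id] = true
	}
	seen := make(map[string]bool)
	var out []Model
	for _, models := range perKey {
		for _, m := range models {
			if len(allow) > 0 && !allow[m.ID] {
				continue // filtered out by allowed_models
			}
			if seen[m.ID] {
				continue // assumption: duplicates across keys collapse
			}
			seen[m.ID] = true
			out = append(out, Model{ID: "anthropic/" + m.ID, Name: m.DisplayName})
		}
	}
	return out
}
```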