Reasoning
Overview
Reasoning (also called “thinking” in some providers) allows AI models to show their step-by-step thought process before providing a final answer. This feature is available across multiple providers with different implementations.
Provider Support Matrix
| Provider | Request Field | Response Field | Min Budget | Effort Levels | Streaming |
|---|---|---|---|---|---|
| OpenAI | reasoning | reasoning_details | None | minimal, low, medium, high | ✅ |
| Anthropic | thinking | Content blocks | 1024 tokens | enabled only | ✅ |
| Bedrock (Anthropic) | thinking | Content blocks | 1024 tokens | enabled only | ✅ |
| Gemini 2.5+ | thinking_config | thought parts | 1024 | Budget-only | ✅ |
| Gemini 3.0+ | thinking_config | thought parts | 1024 | minimal, low, medium, high + Budget | ✅ |
Request Configuration
Chat Completions API

    {
      "model": "provider/model-name",
      "messages": [...],
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096
      }
    }

    package main

    import "github.com/maximhq/deepintshield/core/schemas"

    func main() {
        chatReq := &schemas.DeepIntShieldChatRequest{
            Provider: schemas.OpenAI,
            Model:    "gpt-4o",
            Input: []schemas.ChatMessage{
                {
                    Role: schemas.ChatMessageRoleUser,
                    Content: &schemas.ChatMessageContent{
                        ContentStr: schemas.Ptr("Explain quantum computing"),
                    },
                },
            },
            Params: &schemas.ChatParameters{
                MaxCompletionTokens: schemas.Ptr(4096),
                Reasoning: &schemas.ChatReasoning{
                    Effort:    schemas.Ptr("high"),
                    MaxTokens: schemas.Ptr(4096),
                },
            },
        }
        _ = chatReq // pass chatReq to your DeepIntShield client
    }

Responses API

    {
      "model": "provider/model-name",
      "input": [...],
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096,
        "summary": "detailed"
      }
    }

    package main

    import "github.com/maximhq/deepintshield/core/schemas"

    func main() {
        responsesReq := &schemas.DeepIntShieldResponsesRequest{
            Provider: schemas.Anthropic,
            Model:    "claude-3-5-sonnet-20241022",
            Input: []schemas.ResponsesMessage{
                {
                    Role: schemas.Ptr(schemas.ResponsesInputMessageRoleUser),
                    Content: &schemas.ResponsesMessageContent{
                        ContentStr: schemas.Ptr("Explain quantum computing"),
                    },
                },
            },
            Params: &schemas.ResponsesParameters{
                MaxOutputTokens: schemas.Ptr(4096),
                Reasoning: &schemas.ResponsesParametersReasoning{
                    Effort:    schemas.Ptr("high"),
                    MaxTokens: schemas.Ptr(4096),
                    Summary:   schemas.Ptr("detailed"),
                },
            },
        }
        _ = responsesReq // pass responsesReq to your DeepIntShield client
    }

Parameter Reference
Chat Completions API Parameters
| Parameter | Type | Description |
|---|---|---|
| effort | string | Reasoning intensity level |
| max_tokens | int | Maximum tokens for reasoning (budget) |
Responses API Parameters
| Parameter | Type | Description |
|---|---|---|
| effort | string | Reasoning intensity level |
| max_tokens | int | Maximum tokens for reasoning (budget) |
| summary | string | Summary level: brief, detailed, or json |
Provider-Specific Conversions
OpenAI
OpenAI uses effort-based reasoning only. DeepIntShield applies the following priority logic:
- If reasoning.effort is provided → use it directly
- Else if reasoning.max_tokens is provided → estimate effort from it
- The max_tokens field is cleared before sending to OpenAI
Conversion Examples:
    // DeepIntShield Request (with effort)
    {
      "reasoning": { "effort": "high" }
    }

    // OpenAI Request Sent
    {
      "reasoning": { "effort": "high" }
    }

    // DeepIntShield request with effort (native field)
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.OpenAI,
        Model:    "gpt-4o",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                Effort: schemas.Ptr("high"),
            },
        },
    }
    // OpenAI receives effort directly, max_tokens is cleared

    // DeepIntShield Request (with max_tokens only)
    {
      "max_completion_tokens": 4096,
      "reasoning": { "max_tokens": 3000 }
    }

    // Estimation: ratio = 3000/4096 ≈ 0.73 → "high"
    // OpenAI Request Sent
    {
      "reasoning": { "effort": "high" }
    }

    // DeepIntShield request with max_tokens only
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.OpenAI,
        Model:    "gpt-4o",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(3000),
            },
        },
    }
    // DeepIntShield estimates effort from max_tokens
    // ratio = 3000/4096 ≈ 0.73 → effort = "high"
    // OpenAI receives effort, max_tokens cleared

Supported Effort Levels: minimal, low, medium, high
Anthropic
Anthropic uses a thinking parameter with a different structure.

    // DeepIntShield Request
    {
      "reasoning": { "effort": "high", "max_tokens": 4096 }
    }

    // Anthropic Request
    {
      "thinking": { "type": "enabled", "budget_tokens": 4096 }
    }

    // Using DeepIntShield Go SDK
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet-20241022",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(4096), // Anthropic native field
            },
        },
    }
    // DeepIntShield converts to Anthropic format:
    // {
    //   "thinking": {
    //     "type": "enabled",
    //     "budget_tokens": 4096
    //   }
    // }

    // Anthropic Response (content blocks)
    {
      "content": [
        {
          "type": "thinking",
          "thinking": "Let me analyze this step by step...",
          "signature": "EqoBCkgIAR..."
        },
        { "type": "text", "text": "The answer is 42." }
      ]
    }

    // DeepIntShield Response
    {
      "choices": [{
        "message": {
          "content": "The answer is 42.",
          "reasoning": "Let me analyze this step by step...",
          "reasoning_details": [{
            "index": 0,
            "type": "text",
            "text": "Let me analyze this step by step...",
            "signature": "EqoBCkgIAR..."
          }]
        }
      }]
    }

    // After calling DeepIntShield Chat Completions with reasoning
    resp, err := client.ChatCompletionRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), chatReq)
    if err != nil {
        log.Fatal(err)
    }

    // Extract reasoning from response
    choice := resp.Choices[0]
    message := choice.Message

    // Access combined reasoning text
    reasoningText := message.Reasoning

    // Access detailed reasoning blocks
    for i, details := range message.ReasoningDetails {
        fmt.Printf("Block %d: %s\n", i, details.Text)
        if details.Signature != "" {
            fmt.Printf("  Signature: %s\n", details.Signature)
        }
    }

Conversion Rules:
| DeepIntShield | Anthropic | Notes |
|---|---|---|
| reasoning.effort | thinking.type | Always mapped to "enabled" |
| reasoning.max_tokens | thinking.budget_tokens | Token budget for reasoning |
Dynamic Budget Handling:
| Input Value | Converted To |
|---|---|
| -1 (dynamic) | 1024 (minimum default) |
| < 1024 | Error |
| >= 1024 | Pass-through |
Code Reference: core/providers/anthropic/chat.go:104-134
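The dynamic-budget table above amounts to a small validation step. The following is a minimal sketch of that rule only, not the code at the referenced path: `normalizeAnthropicBudget` and `minAnthropicBudget` are hypothetical names, and the error text is borrowed from the Troubleshooting section.

```go
package main

import (
	"errors"
	"fmt"
)

// minAnthropicBudget mirrors the 1024-token minimum listed in the table.
const minAnthropicBudget = 1024

// normalizeAnthropicBudget is a hypothetical helper illustrating the table:
// -1 becomes the minimum, values below the minimum fail, the rest pass through.
func normalizeAnthropicBudget(maxTokens int) (int, error) {
	switch {
	case maxTokens == -1:
		return minAnthropicBudget, nil
	case maxTokens < minAnthropicBudget:
		return 0, errors.New("reasoning.max_tokens must be >= 1024")
	default:
		return maxTokens, nil
	}
}

func main() {
	for _, in := range []int{-1, 500, 1024, 4096} {
		budget, err := normalizeAnthropicBudget(in)
		fmt.Printf("input=%d budget=%d err=%v\n", in, budget, err)
	}
}
```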
Bedrock (Anthropic Models)
Bedrock uses the same structure as Anthropic for Claude models.

    // DeepIntShield Request
    {
      "reasoning": { "effort": "high", "max_tokens": 4096 }
    }

    // Bedrock Request (for Anthropic/Claude models)
    {
      "additionalModelRequestFields": {
        "reasoning_config": { "type": "enabled", "budget_tokens": 4096 }
      }
    }

    // Using DeepIntShield Go SDK with Bedrock provider
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Bedrock,
        Model:    "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(4096), // Bedrock Anthropic native field
            },
        },
    }
    // DeepIntShield converts to Bedrock format with reasoning_config

Code Reference: core/providers/bedrock/utils.go:34-47
Bedrock (Nova Models)
Bedrock Nova models use an effort-based approach similar to OpenAI.

    // DeepIntShield Request
    {
      "reasoning": { "effort": "high", "max_tokens": 4096 }
    }

    // Bedrock Request (for Nova models)
    {
      "additionalModelRequestFields": {
        "reasoningConfig": { "type": "enabled", "maxReasoningEffort": "high" }
      }
    }

    // Using DeepIntShield Go SDK with Bedrock Nova
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Bedrock,
        Model:    "us.amazon.nova-pro-v1:0",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                Effort: schemas.Ptr("high"), // Nova native field
            },
        },
    }
    // DeepIntShield converts to Bedrock Nova format:
    // reasoningConfig: {
    //   type: "enabled",
    //   maxReasoningEffort: "high"
    // }

| DeepIntShield Effort | Nova Effort | Configuration |
|---|---|---|
| minimal, low | "low" | Normal parameters allowed |
| medium | "medium" | Normal parameters allowed |
| high | "high" | Clears maxTokens, temperature, topP |
Key Differences from Anthropic:
- No minimum token budget constraint
- Uses effort levels instead of token budgets
- High effort mode automatically clears conflicting parameters
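The Nova effort mapping above can be sketched as a lookup that also reports when conflicting parameters should be cleared. This is illustrative only: `novaReasoning` and `toNovaReasoning` are hypothetical names, and rejecting unknown effort values is an assumption rather than documented behavior.

```go
package main

import "fmt"

// novaReasoning is a hypothetical result type for this sketch.
type novaReasoning struct {
	Effort              string // value sent as maxReasoningEffort
	ClearSamplingParams bool   // true when maxTokens, temperature, topP are cleared
}

// toNovaReasoning maps a DeepIntShield effort level to the Nova level
// listed in the table. Unknown values return an error (an assumption).
func toNovaReasoning(effort string) (novaReasoning, error) {
	switch effort {
	case "minimal", "low":
		return novaReasoning{Effort: "low"}, nil
	case "medium":
		return novaReasoning{Effort: "medium"}, nil
	case "high":
		return novaReasoning{Effort: "high", ClearSamplingParams: true}, nil
	default:
		return novaReasoning{}, fmt.Errorf("unsupported effort %q", effort)
	}
}

func main() {
	for _, e := range []string{"minimal", "low", "medium", "high"} {
		r, err := toNovaReasoning(e)
		fmt.Printf("%s -> %+v (err=%v)\n", e, r, err)
	}
}
```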
Code Reference: core/providers/bedrock/utils.go:48-89
Gemini
Gemini uses thinking_config with dual support for both token budgets and effort levels, depending on the model version.
Model Version Support
| Gemini Version | thinkingBudget | thinkingLevel | Notes |
|---|---|---|---|
| 2.5+ | ✅ | ❌ | Budget-only models |
| 3.0+ | ✅ | ✅ | Support both budget and level |
Priority Rules
When both reasoning.max_tokens and reasoning.effort are present:

    1. If max_tokens is provided → USE thinkingBudget (ignores effort)
    2. Else if effort is provided:
       - Gemini 3.0+ → USE thinkingLevel (more native)
       - Gemini 2.5 → CONVERT effort to thinkingBudget
    3. Else → disable reasoning

    // DeepIntShield Request - Both fields provided
    {
      "model": "gemini-3.0-flash",
      "reasoning": {
        "effort": "high",     // Ignored
        "max_tokens": 4096    // Takes priority
      }
    }

    // Gemini 3.0+ Request - Only budget sent
    {
      "generation_config": {
        "thinking_config": { "include_thoughts": true, "thinking_budget": 4096 }
      }
    }

    // DeepIntShield Request - Effort only
    {
      "model": "gemini-3.0-flash",
      "reasoning": { "effort": "high" }
    }

    // Gemini 3.0+ Request - Converted to level
    {
      "generation_config": {
        "thinking_config": { "include_thoughts": true, "thinking_level": "high" }
      }
    }

    // DeepIntShield Request - Effort only
    {
      "model": "gemini-2.5-flash",
      "max_completion_tokens": 4096,
      "reasoning": { "effort": "high" }
    }

    // Gemini 2.5 Request - Converted to budget
    // Calculation: 1024 + (0.80 × (4096 - 1024)) = 3482
    {
      "generation_config": {
        "thinking_config": { "include_thoughts": true, "thinking_budget": 3482 }
      }
    }

Model-Specific Level Conversions
Gemini Pro models have stricter constraints on thinking levels:
| DeepIntShield Effort | Non-Pro Models | Pro Models | Notes |
|---|---|---|---|
| "none" | Empty string | Empty string | Disables thinking |
| "minimal" | "minimal" | "low" | Pro doesn’t support minimal |
| "low" | "low" | "low" | Supported on all |
| "medium" | "medium" | "high" | Pro doesn’t support medium |
| "high" | "high" | "high" | Supported on all |
Example:
    // For "gemini-3.0-flash-thinking-exp" (non-Pro)
    effort: "medium" → thinkingLevel: "medium"

    // For "gemini-3.0-pro" (Pro model)
    effort: "medium" → thinkingLevel: "high"  // Converted up

Special Values
| Value | Field | Behavior | Use Case |
|---|---|---|---|
| 0 | max_tokens | thinking_budget: 0, include_thoughts: false | Explicitly disable reasoning |
| -1 | max_tokens | thinking_budget: -1 | Dynamic budget (Gemini decides) |
| "none" | effort | thinking_budget: 0, include_thoughts: false | Disable reasoning |
    // DeepIntShield Request - Dynamic budget
    {
      "reasoning": { "max_tokens": -1 }
    }

    // Gemini Request - Sent as-is
    {
      "generation_config": {
        "thinking_config": { "include_thoughts": true, "thinking_budget": -1 }
      }
    }

    // DeepIntShield Request - Method 1
    {
      "reasoning": { "max_tokens": 0 }
    }

    // DeepIntShield Request - Method 2
    {
      "reasoning": { "effort": "none" }
    }

    // Gemini Request - Both become
    {
      "generation_config": {
        "thinking_config": { "include_thoughts": false, "thinking_budget": 0 }
      }
    }

    // Using DeepIntShield Go SDK with Gemini
    // Example 1: Dynamic budget
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Gemini,
        Model:    "gemini-2.0-flash-thinking-exp-1219",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(-1), // Let Gemini decide
            },
        },
    }

    // Example 2: Effort-based for Gemini 3.0+
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Gemini,
        Model:    "gemini-3.0-flash",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                Effort: schemas.Ptr("high"), // Converts to thinkingLevel
            },
        },
    }

    // Example 3: Budget-based (all versions)
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Gemini,
        Model:    "gemini-2.5-flash",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(3000), // Direct budget
            },
        },
    }

Response Conversion
    // Gemini Response
    {
      "candidates": [{
        "content": {
          "parts": [
            { "thought": true, "text": "Analyzing the problem..." },
            { "text": "The answer is 42." }
          ]
        }
      }]
    }

    // DeepIntShield Response
    {
      "choices": [{
        "message": {
          "content": "The answer is 42.",
          "reasoning": "Analyzing the problem...",
          "reasoning_details": [{
            "index": 0,
            "type": "text",
            "text": "Analyzing the problem..."
          }]
        }
      }]
    }

    // After calling DeepIntShield Chat Completions with Gemini
    resp, err := client.ChatCompletionRequest(schemas.NewDeepIntShieldContext(ctx, schemas.NoDeadline), chatReq)
    if err != nil {
        log.Fatal(err)
    }

    // Extract reasoning from response
    choice := resp.Choices[0]
    message := choice.Message

    // Access combined reasoning text
    fmt.Printf("Reasoning: %s\n", message.Reasoning)

    // Access detailed reasoning blocks
    for i, details := range message.ReasoningDetails {
        if details.Type == "text" {
            fmt.Printf("Thinking block %d:\n%s\n", i, details.Text)
        }
    }

    // Access final answer
    fmt.Printf("Answer:\n%s\n", message.Content)

Conversion Summary
DeepIntShield → Gemini (Request):
| Input | Gemini 2.5 | Gemini 3.0+ | Note |
|---|---|---|---|
| max_tokens: 4096 | thinking_budget: 4096 | thinking_budget: 4096 | Direct pass-through |
| max_tokens: -1 | thinking_budget: -1 | thinking_budget: -1 | Dynamic budget |
| max_tokens: 0 | thinking_budget: 0 | thinking_budget: 0 | Disabled |
| effort: "high" only | thinking_budget: 3482* | thinking_level: "high" | Estimated or native |
| effort: "medium" only | thinking_budget: 2330* | thinking_level: "medium" or "high"** | Estimated or native |
| Both effort + max_tokens | Uses max_tokens | Uses max_tokens | Priority rule |
* Assumes max_completion_tokens: 4096 (as in the Gemini 2.5 example above) and the estimation formula
** Pro models convert "medium" to "high"
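The request table above combines three rules: max_tokens wins when present, Gemini 3.0+ takes effort as a native level (with the Pro remapping), and Gemini 2.5 estimates a budget from effort. The sketch below strings them together under stated assumptions: `thinkingConfig` and `geminiThinking` are hypothetical names, the caller supplies the version and Pro flags, a nil result stands for "reasoning disabled", and the estimate is rounded to the nearest token.

```go
package main

import (
	"fmt"
	"math"
)

// thinkingConfig mirrors the fields shown in the Gemini request examples.
type thinkingConfig struct {
	IncludeThoughts bool
	ThinkingBudget  *int   // nil when a level is sent instead
	ThinkingLevel   string // empty when a budget is sent instead
}

// geminiThinking is a hypothetical helper. supportsLevel is true for
// Gemini 3.0+, isPro for Pro models; effort "" means "not provided".
func geminiThinking(effort string, maxTokens *int, supportsLevel, isPro bool, maxCompletion int) *thinkingConfig {
	budget := func(n int) *thinkingConfig {
		// A budget of 0 disables thoughts; -1 (dynamic) is passed as-is.
		return &thinkingConfig{IncludeThoughts: n != 0, ThinkingBudget: &n}
	}
	switch {
	case maxTokens != nil: // max_tokens always takes priority over effort
		return budget(*maxTokens)
	case effort == "":
		return nil // neither field set: reasoning stays disabled
	case effort == "none":
		return budget(0)
	case supportsLevel:
		level := effort
		if isPro && effort == "minimal" {
			level = "low"
		}
		if isPro && effort == "medium" {
			level = "high"
		}
		return &thinkingConfig{IncludeThoughts: true, ThinkingLevel: level}
	}
	// Gemini 2.5: estimate a budget from effort using the documented ratios.
	ratios := map[string]float64{"minimal": 0.025, "low": 0.15, "medium": 0.425, "high": 0.80}
	ratio, ok := ratios[effort]
	if !ok {
		return nil // unknown effort: treated as disabled in this sketch
	}
	const minBudget = 1024
	est := minBudget + ratio*float64(maxCompletion-minBudget)
	return budget(int(math.Round(est)))
}

func main() {
	n := 4096
	both := geminiThinking("high", &n, true, false, 8192)
	fmt.Println("both fields -> budget", *both.ThinkingBudget)
	fmt.Println("3.0 Pro, medium -> level", geminiThinking("medium", nil, true, true, 8192).ThinkingLevel)
	fmt.Println("2.5, high -> budget", *geminiThinking("high", nil, false, false, 4096).ThinkingBudget)
}
```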
Gemini → DeepIntShield (Response):
| Gemini Field | DeepIntShield Field | Conversion |
|---|---|---|
| thinking_budget | reasoning.max_tokens | Direct mapping |
| thinking_level | reasoning.effort | Level → effort mapping |
| thought: true parts | reasoning_details[] | Array of reasoning blocks |
Code References:
- core/providers/gemini/utils.go (Chat Completions)
- core/providers/gemini/responses.go (Responses API)
- core/providers/gemini/types.go (Constants)
Two Reasoning Methods: Effort vs. Max Tokens
DeepIntShield supports two distinct reasoning models across different providers:
Reasoning Model Types
| Model | Providers | Request Field | Native Format |
|---|---|---|---|
| Effort-Based | OpenAI, AWS Bedrock Nova | reasoning.effort | reasoning_effort (Chat) / effort (Responses) |
| Max-Tokens-Based | Anthropic, Cohere, Gemini | reasoning.max_tokens | thinking.budget_tokens |
Important: Both effort and max_tokens can be specified in a single request. DeepIntShield uses a priority hierarchy to determine which field is used.
Priority Logic: Native vs. Estimated
When both effort and max_tokens are present in a request, DeepIntShield prioritizes the native compatible field for the target provider:
For Max-Tokens-Based Providers (Anthropic, Cohere, Gemini)

    1. If reasoning.max_tokens is provided → USE IT (native field)
    2. Else if reasoning.effort is provided → ESTIMATE max_tokens from effort
    3. Else → disable reasoning

Example (Cohere):

    // Request with both fields
    {
      "reasoning": { "effort": "high", "max_tokens": 2000 }
    }

Result: Uses max_tokens: 2000 directly, ignores effort
For Effort-Based Providers (OpenAI, AWS Bedrock Nova)

    1. If reasoning.effort is provided → USE IT (native field)
    2. Else if reasoning.max_tokens is provided → ESTIMATE effort from max_tokens
    3. Else → disable reasoning

Example (OpenAI Chat Completions):

    // Request with both fields
    {
      "reasoning": { "effort": "high", "max_tokens": 2000 }
    }

Result: Uses effort: "high" directly, strips max_tokens from JSON
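The two priority lists above differ only in which field counts as native. A minimal sketch of that decision follows; the `reasoning` struct and `resolve` function are hypothetical names used for illustration, and the returned strings simply label which branch applies.

```go
package main

import "fmt"

// reasoning holds the two optional request fields; nil means "not provided".
type reasoning struct {
	Effort    *string
	MaxTokens *int
}

// resolve reports which path applies for a provider family.
// effortBased is true for OpenAI and Bedrock Nova, false for
// Anthropic, Cohere, and Gemini.
func resolve(r reasoning, effortBased bool) string {
	hasEffort, hasBudget := r.Effort != nil, r.MaxTokens != nil
	switch {
	case effortBased && hasEffort:
		return "use effort"
	case effortBased && hasBudget:
		return "estimate effort from max_tokens"
	case !effortBased && hasBudget:
		return "use max_tokens"
	case !effortBased && hasEffort:
		return "estimate max_tokens from effort"
	default:
		return "disable reasoning"
	}
}

func main() {
	effort, budget := "high", 2000
	both := reasoning{Effort: &effort, MaxTokens: &budget}
	fmt.Println(resolve(both, false))       // max-tokens-based provider
	fmt.Println(resolve(both, true))        // effort-based provider
	fmt.Println(resolve(reasoning{}, true)) // neither field set
}
```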
Why Priority Matters
Reason 1: Accuracy - Native fields provide direct control without estimation loss
Reason 2: Consistency - Using native fields ensures the exact user intent is preserved
Reason 3: Performance - Avoids unnecessary conversions when native field is already provided
Estimator Functions
DeepIntShield provides two estimator functions to convert between reasoning methods. These are used when the native field is not available.
Function 1: Effort → Max Tokens
Function: GetBudgetTokensFromReasoningEffort()
File: core/providers/utils/utils.go:1350-1387
Signature:

    func GetBudgetTokensFromReasoningEffort(
        effort string,        // "minimal", "low", "medium", "high"
        minBudgetTokens int,  // Provider-specific minimum (e.g., 1024 for Anthropic)
        maxTokens int,        // Total completion tokens available
    ) (int, error)

Algorithm:

    1. Define ratio for effort level:
       - "minimal" → 2.5% (0.025)
       - "low" → 15% (0.15)
       - "medium" → 42.5% (0.425)
       - "high" → 80% (0.80)

    2. Calculate budget:
       budget = minBudgetTokens + (ratio × (maxTokens - minBudgetTokens))

    3. Clamp to valid range:
       if budget < minBudgetTokens → budget = minBudgetTokens
       if budget > maxTokens → budget = maxTokens

Conversion Examples (with minBudgetTokens=1024, maxTokens=4096):
| Effort | Ratio | Calculation | Result |
|---|---|---|---|
| minimal | 2.5% | 1024 + 0.025 × 3072 | 1101 → 1024* |
| low | 15% | 1024 + 0.15 × 3072 | 1485 |
| medium | 42.5% | 1024 + 0.425 × 3072 | 2330 |
| high | 80% | 1024 + 0.80 × 3072 | 3482 |
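The three algorithm steps can be sketched as a standalone function. This is not the source of GetBudgetTokensFromReasoningEffort: `budgetFromEffort` is a hypothetical name, rounding to the nearest token is an assumption chosen because it reproduces the low, medium, and high rows of the table, and returning an error for an unknown effort is also assumed. The starred minimal row is not modelled.

```go
package main

import (
	"fmt"
	"math"
)

// budgetFromEffort is a hypothetical re-implementation of the documented
// algorithm: pick a ratio, interpolate between min and max, then clamp.
func budgetFromEffort(effort string, minBudgetTokens, maxTokens int) (int, error) {
	ratios := map[string]float64{
		"minimal": 0.025,
		"low":     0.15,
		"medium":  0.425,
		"high":    0.80,
	}
	ratio, ok := ratios[effort]
	if !ok {
		return 0, fmt.Errorf("unknown effort %q", effort)
	}
	if minBudgetTokens > maxTokens {
		return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
	}
	raw := float64(minBudgetTokens) + ratio*float64(maxTokens-minBudgetTokens)
	budget := int(math.Round(raw)) // rounding mode is an assumption
	if budget < minBudgetTokens {
		budget = minBudgetTokens
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	return budget, nil
}

func main() {
	for _, effort := range []string{"low", "medium", "high"} {
		budget, err := budgetFromEffort(effort, 1024, 4096)
		fmt.Printf("%s -> %d (err=%v)\n", effort, budget, err)
	}
}
```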
Error Handling:
    if minBudgetTokens > maxTokens {
        return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
    }

Code Example:

    // Cohere: Convert effort to token budget
    budgetTokens, err := providerUtils.GetBudgetTokensFromReasoningEffort(
        "high", // effort
        1,      // Cohere min
        4096,   // max completion tokens
    )
    // Returns: 3277 tokens

Function 2: Max Tokens → Effort
Function: GetReasoningEffortFromBudgetTokens()
File: core/providers/utils/utils.go:1308-1345
Signature:

    func GetReasoningEffortFromBudgetTokens(
        budgetTokens int,     // Reasoning token budget
        minBudgetTokens int,  // Provider-specific minimum
        maxTokens int,        // Total completion tokens available
    ) string // Returns: "low", "medium", "high"

Algorithm:

    1. Normalize budget to valid range:
       if budget < min → budget = min
       if budget > max → budget = max

    2. Calculate ratio:
       ratio = (budgetTokens - minBudgetTokens) / (maxTokens - minBudgetTokens)

    3. Map ratio to effort level:
       if ratio ≤ 0.25 → "low"
       if ratio ≤ 0.60 → "medium"
       if ratio > 0.60 → "high"

Conversion Examples (with minBudgetTokens=1024, maxTokens=4096):
| Budget Tokens | Ratio | Effort |
|---|---|---|
| 1024 | 0% | low |
| 1101 | 2.5% | low |
| 1500 | 15.5% | low |
| 1900 | 28.5% | medium |
| 2500 | 48.0% | medium |
| 3000 | 64.3% | high |
| 3400 | 77.3% | high |
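The reverse direction can be sketched the same way. `effortFromBudget` below is a hypothetical stand-in for GetReasoningEffortFromBudgetTokens, written from the documented algorithm and the defensive defaults this section lists; the real function may differ in detail.

```go
package main

import "fmt"

// effortFromBudget is a hypothetical re-implementation of the documented
// mapping from a token budget to an effort level.
func effortFromBudget(budgetTokens, minBudgetTokens, maxTokens int) string {
	// Defensive defaults, as listed in this section.
	if budgetTokens <= 0 {
		return "none"
	}
	if maxTokens <= 0 {
		return "medium"
	}
	if maxTokens <= minBudgetTokens {
		return "high" // ratio cannot be calculated
	}
	// Step 1: normalize the budget into [min, max].
	if budgetTokens < minBudgetTokens {
		budgetTokens = minBudgetTokens
	}
	if budgetTokens > maxTokens {
		budgetTokens = maxTokens
	}
	// Step 2: position of the budget within the range.
	ratio := float64(budgetTokens-minBudgetTokens) / float64(maxTokens-minBudgetTokens)
	// Step 3: map the ratio to an effort level.
	switch {
	case ratio <= 0.25:
		return "low"
	case ratio <= 0.60:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	for _, b := range []int{1024, 1500, 1900, 2500, 3000, 3400} {
		fmt.Printf("%d -> %s\n", b, effortFromBudget(b, 1024, 4096))
	}
}
```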
Defensive Defaults:
    if budgetTokens <= 0 {
        return "none"
    }
    if maxTokens <= 0 {
        return "medium" // Safe default
    }
    if maxTokens <= minBudgetTokens {
        return "high" // Can't calculate ratio
    }

Code Example:

    // Convert Anthropic budget back to effort for display
    effort := providerUtils.GetReasoningEffortFromBudgetTokens(
        3000, // budget tokens from Anthropic response
        1024, // Anthropic minimum
        4096, // max tokens
    )
    // Returns: "high"

Provider-Specific Constants
Different providers have different constraints on reasoning budget:
Min Budget Constants
| Provider | File | MinBudgetTokens | Reason |
|---|---|---|---|
| Anthropic | core/providers/anthropic/types.go | 1024 | Anthropic API requirement |
| Bedrock Anthropic | core/providers/bedrock/types.go | 1024 | Same as Anthropic |
| Bedrock Nova | core/providers/bedrock/types.go | 1 | More flexible |
| Cohere | core/providers/cohere/types.go | 1 | Flexible |
| Gemini | core/providers/gemini/types.go | 1024 | Default minimum for conversions |
Default Completion Tokens (for ratio calculation)
When max_completion_tokens is not provided, these defaults are used for ratio calculations:
| Provider | Default | File |
|---|---|---|
| OpenAI, Anthropic, Cohere, Bedrock | 4096 | core/providers/*/types.go |
| Gemini | 8192 | core/providers/gemini/types.go |
Effort-to-Token Conversion Examples
Example 1: Estimate tokens from effort (Anthropic)
Input:

    {
      "model": "anthropic/claude-3-5-sonnet",
      "max_completion_tokens": 2000,
      "reasoning": { "effort": "high" }
    }

Conversion Process:
- effort = "high" → ratio = 0.80
- minBudgetTokens = 1024 (Anthropic)
- maxCompletionTokens = 2000
- budget = 1024 + (0.80 × (2000 - 1024))
- budget = 1024 + (0.80 × 976)
- budget = 1024 + 780
- Result: 1804 tokens
Anthropic Request Generated:

    {
      "thinking": { "type": "enabled", "budget_tokens": 1804 }
    }

    import (
        "github.com/maximhq/deepintshield/core/providers/utils"
        "github.com/maximhq/deepintshield/core/schemas"
    )

    // Using DeepIntShield Go SDK
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet-20241022",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(2000),
            Reasoning: &schemas.ChatReasoning{
                Effort: schemas.Ptr("high"), // Effort provided, max_tokens not set
            },
        },
    }

    // DeepIntShield automatically converts effort to budget tokens:
    // 1. Get ratio for "high": 0.80
    // 2. Calculate: 1024 + (0.80 × (2000 - 1024)) = 1804
    // 3. Send to Anthropic with budget_tokens: 1804

    // Alternatively, manually call the estimator function:
    budgetTokens, _ := utils.GetBudgetTokensFromReasoningEffort(
        "high", // effort
        1024,   // Anthropic minimum
        2000,   // max completion tokens
    )
    // Returns: 1804

Example 2: Estimate effort from tokens (Bedrock Nova)
Input:

    {
      "model": "bedrock/us.amazon.nova-pro-v1:0",
      "max_completion_tokens": 4096,
      "reasoning": { "max_tokens": 2000 }
    }

Conversion Process:
- budgetTokens = 2000
- minBudgetTokens = 1 (Nova)
- maxCompletionTokens = 4096
- ratio = (2000 - 1) / (4096 - 1)
- ratio = 1999 / 4095
- ratio = 0.488 (48.8%)
- Since 0.25 < 0.488 ≤ 0.60 → Result: “medium”
Bedrock Nova Request Generated:

    {
      "reasoningConfig": { "type": "enabled", "maxReasoningEffort": "medium" }
    }

    import (
        "github.com/maximhq/deepintshield/core/providers/utils"
        "github.com/maximhq/deepintshield/core/schemas"
    )

    // Using DeepIntShield Go SDK with max_tokens (not effort)
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Bedrock,
        Model:    "us.amazon.nova-pro-v1:0",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                MaxTokens: schemas.Ptr(2000), // Max tokens provided, effort not set
            },
        },
    }

    // DeepIntShield automatically estimates effort from max_tokens:
    // 1. Calculate ratio: (2000 - 1) / (4096 - 1) = 0.488
    // 2. Since 0.25 < 0.488 ≤ 0.60 → "medium"
    // 3. Send to Bedrock Nova with effort: "medium"

    // Alternatively, manually call the estimator function:
    effort := utils.GetReasoningEffortFromBudgetTokens(
        2000, // budget tokens
        1,    // Nova minimum
        4096, // max completion tokens
    )
    // Returns: "medium"

Example 3: Both fields provided (priority used)
Input:

    {
      "model": "anthropic/claude-3-5-sonnet",
      "max_completion_tokens": 4096,
      "reasoning": { "effort": "medium", "max_tokens": 2500 }
    }

Logic for Max-Tokens-Based Provider:
- Check: Is max_tokens provided? → YES
- Use max_tokens directly (ignore effort)
- Validate: 2500 >= 1024? → YES
Anthropic Request Generated:

    {
      "thinking": { "type": "enabled", "budget_tokens": 2500 }
    }

Note: The effort: "medium" is completely ignored because max_tokens takes priority.

    import "github.com/maximhq/deepintshield/core/schemas"

    // Using DeepIntShield Go SDK with BOTH effort and max_tokens
    chatReq := &schemas.DeepIntShieldChatRequest{
        Provider: schemas.Anthropic,
        Model:    "claude-3-5-sonnet-20241022",
        Input:    messages,
        Params: &schemas.ChatParameters{
            MaxCompletionTokens: schemas.Ptr(4096),
            Reasoning: &schemas.ChatReasoning{
                Effort:    schemas.Ptr("medium"), // Provided but ignored
                MaxTokens: schemas.Ptr(2500),     // This takes priority
            },
        },
    }

    // DeepIntShield Priority Logic:
    // 1. For max-tokens-based providers (Anthropic):
    //    → Check if max_tokens is provided? YES
    //    → Use it directly: 2500
    //    → Ignore effort: "medium"
    //    → Validate: 2500 >= 1024? YES ✓
    // 2. Send to Anthropic with budget_tokens: 2500

    // Result: effort is completely ignored, max_tokens is used

Response Format
DeepIntShield Standard Response
All providers return reasoning in a normalized reasoning_details array:

    {
      "choices": [{
        "message": {
          "role": "assistant",
          "content": "Final response text",
          "reasoning_details": [
            {
              "index": 0,
              "type": "text",
              "text": "Step-by-step reasoning content...",
              "signature": "optional_signature_for_verification"
            }
          ]
        }
      }]
    }

Reasoning Details Fields
| Field | Type | Description | Present In |
|---|---|---|---|
| index | int | Position in reasoning sequence | All |
| type | string | Content type (text, encrypted, summary) | All |
| text | string | Reasoning content | Chat Completions |
| summary | string | Reasoning summary | Responses API |
| signature | string | Cryptographic signature for verification | Anthropic, Bedrock |
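For clients that read the raw JSON rather than the Go SDK types, the normalized shape can be decoded with the standard library. The sketch below is illustrative: the struct names and `parseReasoning` are hypothetical, and only the JSON keys are taken from the response shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reasoningDetail mirrors the fields in the Reasoning Details Fields table.
type reasoningDetail struct {
	Index     int    `json:"index"`
	Type      string `json:"type"`
	Text      string `json:"text,omitempty"`
	Summary   string `json:"summary,omitempty"`
	Signature string `json:"signature,omitempty"`
}

// chatResponse covers only the parts of the response used in this sketch.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content          string            `json:"content"`
			ReasoningDetails []reasoningDetail `json:"reasoning_details"`
		} `json:"message"`
	} `json:"choices"`
}

// parseReasoning returns the reasoning blocks of the first choice, if any.
func parseReasoning(raw []byte) ([]reasoningDetail, error) {
	var resp chatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	if len(resp.Choices) == 0 {
		return nil, nil
	}
	return resp.Choices[0].Message.ReasoningDetails, nil
}

const sample = `{"choices":[{"message":{"role":"assistant","content":"Final response text","reasoning_details":[{"index":0,"type":"text","text":"Step-by-step reasoning content...","signature":"sig"}]}}]}`

func main() {
	details, err := parseReasoning([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, d := range details {
		fmt.Printf("block %d (%s): %s\n", d.Index, d.Type, d.Text)
	}
}
```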
Type Mappings
Section titled “Type Mappings”| Reasoning Type | When Used | Source |
|---|---|---|
reasoning.text | Direct thinking/reasoning content | Anthropic, Gemini, Bedrock |
reasoning.encrypted | Signature-verified reasoning | Anthropic, Bedrock Nova |
reasoning.summary | Summarized reasoning (Responses API) | All providers |
Streaming
Stream Event Types
| Provider | Reasoning Event | Signature Event |
|---|---|---|
| OpenAI | reasoning (top-level) | N/A |
| Anthropic | thinking_delta | signature_delta |
| Bedrock | thinking_delta | signature_delta |
| Gemini | thought (in content) | thought_signature |
Anthropic Streaming Example
    // Stream events
    event: content_block_start
    data: {"type": "content_block_start", "content_block": {"type": "thinking"}}

    event: content_block_delta
    data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Let me"}}

    event: content_block_delta
    data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": " analyze..."}}

    event: content_block_delta
    data: {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "EqoB..."}}

    event: content_block_stop
    data: {"type": "content_block_stop"}

DeepIntShield Stream Response

    // Thinking delta
    {
      "choices": [{
        "delta": {
          "reasoning_details": [{ "index": 0, "type": "text", "text": "Let me analyze..." }]
        }
      }]
    }

    // Signature delta
    {
      "choices": [{
        "delta": {
          "reasoning_details": [{ "index": 0, "signature": "EqoB..." }]
        }
      }]
    }

Caveats Summary
Minimum Budget (Anthropic/Bedrock)
Severity: High
Behavior: reasoning.max_tokens must be >= 1024
Impact: Requests with lower values fail with error
Workaround: Always set max_tokens >= 1024 for Anthropic/Bedrock
Dynamic Budget Not Supported
Severity: Medium
Behavior: reasoning.max_tokens = -1 converted to 1024
Impact: Dynamic budgeting not available on Anthropic/Bedrock
Workaround: Set explicit token budget
Effort Level Normalization
Severity: Low
Behavior: OpenAI’s minimal converted to low when routing to other providers
Impact: Slightly different reasoning behavior
Signature Field Provider-Specific
Severity: Low
Behavior: signature field only present in Anthropic/Bedrock responses
Impact: Signature-based verification only available for these providers
Thinking Type Always Enabled
Severity: Low
Behavior: Anthropic’s thinking.type always set to "enabled" regardless of effort
Impact: Cannot disable thinking once reasoning param is present
Gemini: Only One Parameter Sent
Severity: Medium
Behavior: When both effort and max_tokens are provided, only thinkingBudget is sent to Gemini (effort is dropped)
Impact: Effort value is completely ignored when max_tokens is present
Workaround: Provide only the parameter you want to use
Gemini: Model Version Differences
Severity: Medium
Behavior: Gemini 2.5 only supports thinkingBudget, while 3.0+ supports both thinkingBudget and thinkingLevel
Impact: Effort-only requests on 2.5 are converted to budget; on 3.0+ they use native levels
Note: DeepIntShield automatically detects version and uses appropriate conversion
Gemini Pro: Limited Level Support
Severity: Low
Behavior: Pro models only support “low” and “high” thinking levels
Impact: "minimal" → "low", "medium" → "high" for Pro models
Note: Non-Pro models support all four levels: minimal, low, medium, high
Complete Provider Comparison
Reasoning Model
| Provider | Model Type | Budget Type | Min Budget | Signature Support |
|---|---|---|---|---|
| OpenAI | Effort-based | Effort-based | None | ❌ |
| Anthropic | Thinking blocks | Token budget | 1024 | ✅ |
| Bedrock (Anthropic) | Reasoning config | Token budget | 1024 | ✅ |
| Bedrock (Nova) | Reasoning config | Effort-based | None | ❌ |
| Gemini 2.5+ | Thinking config | Token budget | 1024 | ✅ |
| Gemini 3.0+ | Thinking config | Dual (budget + level) | 1024 | ✅ |
Parameter Support
| Provider | effort | max_tokens | summary | Streaming |
|---|---|---|---|---|
| OpenAI | ✅ (4 levels) | ✅ | ❌ | ✅ |
| Anthropic | ❌ (binary) | ✅ | ✅ | ✅ |
| Bedrock (Anthropic) | ❌ (binary) | ✅ | ✅ | ✅ |
| Bedrock (Nova) | ✅ (3 levels) | ⚠️ (ignored) | ❌ | ✅ |
| Gemini 2.5+ | ⚠️ (converts to budget) | ✅ | ❌ | ✅ |
| Gemini 3.0+ | ✅ (4 levels) | ✅ | ❌ | ✅ |
Troubleshooting
Anthropic: “reasoning.max_tokens must be >= 1024”
Cause: Attempting to use reasoning with max_tokens < 1024
Solution: Ensure reasoning.max_tokens >= 1024 for Anthropic/Bedrock Anthropic models

    // ❌ Invalid
    {"reasoning": {"effort": "high", "max_tokens": 500}}

    // ✅ Valid
    {"reasoning": {"effort": "high", "max_tokens": 1024}}

OpenAI: Model doesn’t support reasoning
Cause: Using an older model that doesn’t support reasoning (e.g., gpt-4-turbo)
Solution: Use models with reasoning support: gpt-4o, gpt-4o-mini (o1 series with native reasoning)
Bedrock Nova: max_tokens parameter being ignored
Expected Behavior: Bedrock Nova uses effort-based reasoning only
Solution: Provide effort parameter instead of max_tokens for Nova models

    // ✅ Correct for Nova
    {"reasoning": {"effort": "high"}}