Reasoning

Reasoning (also called “thinking” in some providers) allows AI models to show their step-by-step thought process before providing a final answer. This feature is available across multiple providers with different implementations.


| Provider | Request Field | Response Field | Min Budget | Effort Levels | Streaming |
|---|---|---|---|---|---|
| OpenAI | reasoning | reasoning_details | None | minimal, low, medium, high | |
| Anthropic | thinking | Content blocks | 1024 tokens | enabled only | |
| Bedrock (Anthropic) | thinking | Content blocks | 1024 tokens | enabled only | |
| Gemini 2.5+ | thinking_config | thought parts | 1024 | Budget-only | |
| Gemini 3.0+ | thinking_config | thought parts | 1024 | minimal, low, medium, high + Budget | |

// Chat Completions request
{
  "model": "provider/model-name",
  "messages": [...],
  "reasoning": {
    "effort": "high",
    "max_tokens": 4096
  }
}
// Responses API request
{
  "model": "provider/model-name",
  "input": [...],
  "reasoning": {
    "effort": "high",
    "max_tokens": 4096,
    "summary": "detailed"
  }
}

Chat Completions parameters:

| Parameter | Type | Description |
|---|---|---|
| effort | string | Reasoning intensity level |
| max_tokens | int | Maximum tokens for reasoning (budget) |

Responses API parameters:

| Parameter | Type | Description |
|---|---|---|
| effort | string | Reasoning intensity level |
| max_tokens | int | Maximum tokens for reasoning (budget) |
| summary | string | Summary level: brief, detailed, or json |

OpenAI uses effort-based reasoning only. DeepIntShield applies priority logic:

  1. If reasoning.effort is provided → use it directly
  2. Else if reasoning.max_tokens is provided → estimate effort from it
  3. The max_tokens field is cleared before sending to OpenAI

Conversion Examples:

// DeepIntShield Request (with effort)
{
"reasoning": {
"effort": "high"
}
}
// OpenAI Request Sent
{
"reasoning": {
"effort": "high"
}
}

Supported Effort Levels: minimal, low, medium, high


Anthropic uses a thinking parameter with a different structure.

// DeepIntShield Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Anthropic Request
{
"thinking": {
"type": "enabled",
"budget_tokens": 4096
}
}

Conversion Rules:

| DeepIntShield | Anthropic | Notes |
|---|---|---|
| reasoning.effort | thinking.type | Always mapped to "enabled" |
| reasoning.max_tokens | thinking.budget_tokens | Token budget for reasoning |

Dynamic Budget Handling:

| Input Value | Converted To |
|---|---|
| -1 (dynamic) | 1024 (minimum default) |
| < 1024 | Error |
| >= 1024 | Pass-through |

Code Reference: core/providers/anthropic/chat.go:104-134
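
The budget handling table above can be sketched as a small helper. This is an illustration only: `normalizeAnthropicBudget` and `anthropicMinBudget` are hypothetical names, not the code at the referenced location, and the error text is assumed.

```go
package main

import "fmt"

// anthropicMinBudget mirrors the 1024-token minimum from the table above.
const anthropicMinBudget = 1024

// normalizeAnthropicBudget is a hypothetical helper that applies the three
// table rows in order: dynamic, too small, pass-through.
func normalizeAnthropicBudget(maxTokens int) (int, error) {
	switch {
	case maxTokens == -1:
		// Dynamic budgets are not supported, so fall back to the minimum.
		return anthropicMinBudget, nil
	case maxTokens < anthropicMinBudget:
		return 0, fmt.Errorf("reasoning.max_tokens must be >= %d, got %d", anthropicMinBudget, maxTokens)
	default:
		return maxTokens, nil
	}
}
```

Under these assumptions, `normalizeAnthropicBudget(-1)` yields 1024, `normalizeAnthropicBudget(500)` yields an error, and `normalizeAnthropicBudget(4096)` is returned unchanged.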


Bedrock uses the same structure as Anthropic for Claude models.

// DeepIntShield Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Bedrock Request (for Anthropic/Claude models)
{
"additionalModelRequestFields": {
"reasoning_config": {
"type": "enabled",
"budget_tokens": 4096
}
}
}

Code Reference: core/providers/bedrock/utils.go:34-47


Bedrock Nova models use an effort-based approach similar to OpenAI.

// DeepIntShield Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Bedrock Request (for Nova models)
{
"additionalModelRequestFields": {
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "high"
}
}
}

Key Differences from Anthropic:

  • No minimum token budget constraint
  • Uses effort levels instead of token budgets
  • High effort mode automatically clears conflicting parameters

Code Reference: core/providers/bedrock/utils.go:48-89


Gemini uses thinking_config with dual support for both token budgets and effort levels, depending on the model version.

| Gemini Version | thinkingBudget | thinkingLevel | Notes |
|---|---|---|---|
| 2.5+ | Yes | No | Budget-only models |
| 3.0+ | Yes | Yes | Support both budget and level |

When both reasoning.max_tokens and reasoning.effort are present:

1. If max_tokens is provided → USE thinkingBudget (ignores effort)
2. Else if effort is provided:
   - Gemini 3.0+ → USE thinkingLevel (more native)
   - Gemini 2.5 → CONVERT effort to thinkingBudget
3. Else → disable reasoning
// DeepIntShield Request - Both fields provided
{
"model": "gemini-3.0-flash",
"reasoning": {
"effort": "high", // Ignored
"max_tokens": 4096 // Takes priority
}
}
// Gemini 3.0+ Request - Only budget sent
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": 4096
}
}
}
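
A minimal sketch of the Gemini priority rules, assuming the request fields arrive as optional values. `Reasoning`, `ThinkingConfig`, `geminiThinkingConfig`, and the `estimate` callback are illustrative names, not DeepIntShield types; the Pro-model level conversion is left out here.

```go
package main

// Reasoning is a hypothetical stand-in for the request's "reasoning" object.
// Pointers distinguish "not provided" from values such as max_tokens: 0.
type Reasoning struct {
	Effort    *string
	MaxTokens *int
}

// ThinkingConfig holds whichever single field would be sent to Gemini.
type ThinkingConfig struct {
	ThinkingBudget *int
	ThinkingLevel  string
}

// geminiThinkingConfig applies the priority order: budget first, then effort.
// supportsLevel is true for Gemini 3.0+; estimate converts an effort level to
// a token budget for Gemini 2.5. A nil result means reasoning is disabled.
func geminiThinkingConfig(r Reasoning, supportsLevel bool, estimate func(effort string) int) *ThinkingConfig {
	if r.MaxTokens != nil {
		// max_tokens always wins; effort is ignored.
		return &ThinkingConfig{ThinkingBudget: r.MaxTokens}
	}
	if r.Effort != nil {
		if supportsLevel {
			return &ThinkingConfig{ThinkingLevel: *r.Effort}
		}
		budget := estimate(*r.Effort)
		return &ThinkingConfig{ThinkingBudget: &budget}
	}
	return nil
}
```

Using pointers for both fields is a design choice of this sketch: it keeps an explicit `max_tokens: 0` (disable) distinguishable from an omitted field.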

Gemini Pro models have stricter constraints on thinking levels:

| DeepIntShield Effort | Non-Pro Models | Pro Models | Notes |
|---|---|---|---|
| "none" | Empty string | Empty string | Disables thinking |
| "minimal" | "minimal" | "low" | Pro doesn’t support minimal |
| "low" | "low" | "low" | Supported on all |
| "medium" | "medium" | "high" | Pro doesn’t support medium |
| "high" | "high" | "high" | Supported on all |

Example:

// For "gemini-3.0-flash-thinking-exp" (non-Pro)
effort: "medium" → thinkingLevel: "medium"
// For "gemini-3.0-pro" (Pro model)
effort: "medium" → thinkingLevel: "high" // Converted up
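
The effort-to-level mapping for Pro and non-Pro models can be written as a lookup. The function name `geminiThinkingLevel` and the handling of unknown values are assumptions of this sketch, not taken from the provider code.

```go
package main

// geminiThinkingLevel is a hypothetical helper that reproduces the mapping
// table: Pro models accept only "low" and "high", so "minimal" and "medium"
// are converted. An empty string means thinking is disabled.
func geminiThinkingLevel(effort string, isPro bool) string {
	switch effort {
	case "none":
		return ""
	case "minimal":
		if isPro {
			return "low"
		}
		return "minimal"
	case "medium":
		if isPro {
			return "high"
		}
		return "medium"
	case "low", "high":
		return effort
	default:
		// Unknown values are treated as disabled in this sketch (an assumption).
		return ""
	}
}
```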

| Value | Field | Behavior | Use Case |
|---|---|---|---|
| 0 | max_tokens | thinking_budget: 0, include_thoughts: false | Explicitly disable reasoning |
| -1 | max_tokens | thinking_budget: -1 | Dynamic budget (Gemini decides) |
| "none" | effort | thinking_budget: 0, include_thoughts: false | Disable reasoning |

// DeepIntShield Request - Dynamic budget
{
"reasoning": {
"max_tokens": -1
}
}
// Gemini Request - Sent as-is
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": -1
}
}
}
// Gemini Response
{
"candidates": [{
"content": {
"parts": [
{
"thought": true,
"text": "Analyzing the problem..."
},
{
"text": "The answer is 42."
}
]
}
}]
}
// DeepIntShield Response
{
"choices": [{
"message": {
"content": "The answer is 42.",
"reasoning": "Analyzing the problem...",
"reasoning_details": [{
"index": 0,
"type": "text",
"text": "Analyzing the problem..."
}]
}
}]
}
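
The response conversion above amounts to splitting Gemini parts by their `thought` flag. A rough sketch, with hypothetical type and function names; joining multiple thought parts by plain concatenation is an assumption:

```go
package main

import "strings"

// GeminiPart models one entry of candidates[].content.parts.
type GeminiPart struct {
	Thought bool
	Text    string
}

// ReasoningDetail models one entry of reasoning_details.
type ReasoningDetail struct {
	Index int
	Type  string
	Text  string
}

// Message is a simplified view of the normalized response message.
type Message struct {
	Content          string
	Reasoning        string
	ReasoningDetails []ReasoningDetail
}

// normalizeGeminiParts routes thought parts to the reasoning fields and all
// other parts to the visible content.
func normalizeGeminiParts(parts []GeminiPart) Message {
	var content, reasoning strings.Builder
	var details []ReasoningDetail
	for _, p := range parts {
		if p.Thought {
			reasoning.WriteString(p.Text)
			details = append(details, ReasoningDetail{Index: len(details), Type: "text", Text: p.Text})
			continue
		}
		content.WriteString(p.Text)
	}
	return Message{
		Content:          content.String(),
		Reasoning:        reasoning.String(),
		ReasoningDetails: details,
	}
}
```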

DeepIntShield → Gemini (Request):

| Input | Gemini 2.5 | Gemini 3.0+ | Note |
|---|---|---|---|
| max_tokens: 4096 | thinking_budget: 4096 | thinking_budget: 4096 | Direct pass-through |
| max_tokens: -1 | thinking_budget: -1 | thinking_budget: -1 | Dynamic budget |
| max_tokens: 0 | thinking_budget: 0 | thinking_budget: 0 | Disabled |
| effort: "high" only | thinking_budget: 3482* | thinking_level: "high" | Estimated or native |
| effort: "medium" only | thinking_budget: 2330* | thinking_level: "medium" or "high"** | Estimated or native |
| Both effort + max_tokens | Uses max_tokens | Uses max_tokens | Priority rule |

* Estimated with the effort-to-budget formula. The values shown correspond to a minimum budget of 1024 and max_completion_tokens: 4096; with Gemini's default of 8192, the same formula gives roughly 6758 ("high") and 4070 ("medium").
** Pro models convert "medium" to "high"

Gemini → DeepIntShield (Response):

| Gemini Field | DeepIntShield Field | Conversion |
|---|---|---|
| thinking_budget | reasoning.max_tokens | Direct mapping |
| thinking_level | reasoning.effort | Level → effort mapping |
| thought: true parts | reasoning_details[] | Array of reasoning blocks |

Code References:

  • core/providers/gemini/utils.go (Chat Completions)
  • core/providers/gemini/responses.go (Responses API)
  • core/providers/gemini/types.go (Constants)

Two Reasoning Methods: Effort vs. Max Tokens

DeepIntShield supports two distinct reasoning methods across different providers:

| Method | Providers | Request Field | Native Format |
|---|---|---|---|
| Effort-Based | OpenAI, AWS Bedrock Nova | reasoning.effort | reasoning_effort (Chat) / effort (Responses) |
| Max-Tokens-Based | Anthropic, Cohere, Gemini | reasoning.max_tokens | thinking.budget_tokens |

Important: Both effort and max_tokens can be specified in a single request. DeepIntShield uses a priority hierarchy to determine which field is used.

When both effort and max_tokens are present in a request, DeepIntShield prioritizes the field that is native to the target provider:

For Max-Tokens-Based Providers (Anthropic, Cohere, Gemini)

1. If reasoning.max_tokens is provided → USE IT (native field)
2. Else if reasoning.effort is provided → ESTIMATE max_tokens from effort
3. Else → disable reasoning

Example (Cohere):

// Request with both fields
{
"reasoning": {
"effort": "high",
"max_tokens": 2000
}
}

Result: Uses max_tokens: 2000 directly, ignores effort

For Effort-Based Providers (OpenAI, AWS Bedrock Nova)

1. If reasoning.effort is provided → USE IT (native field)
2. Else if reasoning.max_tokens is provided → ESTIMATE effort from max_tokens
3. Else → disable reasoning

Example (OpenAI Chat Completions):

// Request with both fields
{
"reasoning": {
"effort": "high",
"max_tokens": 2000
}
}

Result: Uses effort: "high" directly, strips max_tokens from JSON
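
One way to picture the effort-based rules, including the stripping of max_tokens, is the sketch below. `Reasoning`, `effortBasedPayload`, and the `estimate` callback are illustrative names rather than DeepIntShield identifiers; the sketch relies on `omitempty` to keep max_tokens out of the serialized JSON.

```go
package main

import "encoding/json"

// Reasoning is a hypothetical stand-in for the request's "reasoning" object.
type Reasoning struct {
	Effort    string `json:"effort,omitempty"`
	MaxTokens *int   `json:"max_tokens,omitempty"`
}

// effortBasedPayload returns the JSON sent to an effort-based provider.
// estimate converts a token budget to an effort level and is only called
// when effort is missing. A nil result means reasoning is disabled.
func effortBasedPayload(r Reasoning, estimate func(budget int) string) ([]byte, error) {
	// Copy only the effort; MaxTokens stays nil so it is omitted from JSON.
	out := Reasoning{Effort: r.Effort}
	if out.Effort == "" && r.MaxTokens != nil {
		out.Effort = estimate(*r.MaxTokens)
	}
	if out.Effort == "" {
		return nil, nil
	}
	return json.Marshal(out)
}
```

For the request above (effort "high", max_tokens 2000), this sketch serializes to `{"effort":"high"}`.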

Why Priority Matters

Reason 1: Accuracy - Native fields provide direct control without estimation loss

Reason 2: Consistency - Using native fields ensures the exact user intent is preserved

Reason 3: Performance - Avoids unnecessary conversions when native field is already provided


DeepIntShield provides two estimator functions to convert between reasoning methods. These are used when the native field is not available.

Function: GetBudgetTokensFromReasoningEffort()

File: core/providers/utils/utils.go:1350-1387

Signature:

func GetBudgetTokensFromReasoningEffort(
	effort string,       // "minimal", "low", "medium", "high"
	minBudgetTokens int, // Provider-specific minimum (e.g., 1024 for Anthropic)
	maxTokens int,       // Total completion tokens available
) (int, error)

Algorithm:

1. Define ratio for effort level:
   - "minimal" → 2.5% (0.025)
   - "low" → 15% (0.15)
   - "medium" → 42.5% (0.425)
   - "high" → 80% (0.80)
2. Calculate budget:
   budget = minBudgetTokens + (ratio × (maxTokens - minBudgetTokens))
3. Clamp to valid range:
   if budget < minBudgetTokens → budget = minBudgetTokens
   if budget > maxTokens → budget = maxTokens

Conversion Examples (with minBudgetTokens=1024, maxTokens=4096):

| Effort | Ratio | Calculation | Result |
|---|---|---|---|
| minimal | 2.5% | 1024 + 0.025 × 3072 | 1101 → 1024* |
| low | 15% | 1024 + 0.15 × 3072 | 1485 |
| medium | 42.5% | 1024 + 0.425 × 3072 | 2330 |
| high | 80% | 1024 + 0.80 × 3072 | 3482 |

Error Handling:

if minBudgetTokens > maxTokens {
	return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
}

Code Example:

// Cohere: Convert effort to token budget
budgetTokens, err := providerUtils.GetBudgetTokensFromReasoningEffort(
	"high", // effort
	1,      // Cohere min
	4096,   // max completion tokens
)
// Returns: 3277 tokens
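
For readers who want to experiment with the algorithm outside the codebase, here is a standalone sketch. It is not the implementation in core/providers/utils/utils.go: `budgetFromEffort` and `effortRatios` are hypothetical names, and rounding to the nearest integer is an assumption chosen because it reproduces the 1485, 2330, and 3482 values in the conversion table.

```go
package main

import (
	"fmt"
	"math"
)

// effortRatios holds the ratio documented for each effort level.
var effortRatios = map[string]float64{
	"minimal": 0.025,
	"low":     0.15,
	"medium":  0.425,
	"high":    0.80,
}

// budgetFromEffort follows the three documented steps: look up the ratio,
// interpolate between the minimum and maximum, then clamp.
func budgetFromEffort(effort string, minBudgetTokens, maxTokens int) (int, error) {
	if minBudgetTokens > maxTokens {
		return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
	}
	ratio, ok := effortRatios[effort]
	if !ok {
		return 0, fmt.Errorf("unknown effort level %q", effort)
	}
	// Rounding mode is an assumption of this sketch.
	budget := minBudgetTokens + int(math.Round(ratio*float64(maxTokens-minBudgetTokens)))
	if budget < minBudgetTokens {
		budget = minBudgetTokens
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	return budget, nil
}
```

With these assumptions, `budgetFromEffort("high", 1, 4096)` gives 3277, matching the Cohere example, and `budgetFromEffort("high", 1024, 4096)` gives 3482.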

Function: GetReasoningEffortFromBudgetTokens()

File: core/providers/utils/utils.go:1308-1345

Signature:

func GetReasoningEffortFromBudgetTokens(
	budgetTokens int,    // Reasoning token budget
	minBudgetTokens int, // Provider-specific minimum
	maxTokens int,       // Total completion tokens available
) string // Returns: "none", "low", "medium", or "high"

Algorithm:

1. Normalize budget to valid range:
   if budget < min → budget = min
   if budget > max → budget = max
2. Calculate ratio:
   ratio = (budgetTokens - minBudgetTokens) / (maxTokens - minBudgetTokens)
3. Map ratio to effort level:
   if ratio ≤ 0.25 → "low"
   if ratio ≤ 0.60 → "medium"
   if ratio > 0.60 → "high"

Conversion Examples (with minBudgetTokens=1024, maxTokens=4096):

| Budget Tokens | Ratio | Effort |
|---|---|---|
| 1024 | 0% | low |
| 1101 | 2.5% | low |
| 1500 | 15.5% | low |
| 1900 | 28.5% | medium |
| 2500 | 48.0% | medium |
| 3000 | 64.3% | high |
| 3400 | 77.3% | high |

Defensive Defaults:

if budgetTokens <= 0 {
	return "none"
}
if maxTokens <= 0 {
	return "medium" // Safe default
}
if maxTokens <= minBudgetTokens {
	return "high" // Can't calculate ratio
}

Code Example:

// Convert Anthropic budget back to effort for display
effort := providerUtils.GetReasoningEffortFromBudgetTokens(
	3000, // budget tokens from Anthropic response
	1024, // Anthropic minimum
	4096, // max tokens
)
// Returns: "high"
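
The reverse conversion, including the defensive defaults, can be sketched in a few lines. `effortFromBudget` is a hypothetical name; the thresholds and defaults are the ones listed above, but the ordering of the checks is an assumption.

```go
package main

// effortFromBudget maps a token budget to an effort level using the
// documented thresholds (0.25 and 0.60) and defensive defaults.
func effortFromBudget(budgetTokens, minBudgetTokens, maxTokens int) string {
	if budgetTokens <= 0 {
		return "none"
	}
	if maxTokens <= 0 {
		return "medium" // safe default
	}
	if maxTokens <= minBudgetTokens {
		return "high" // ratio cannot be calculated
	}
	// Normalize the budget into [minBudgetTokens, maxTokens].
	if budgetTokens < minBudgetTokens {
		budgetTokens = minBudgetTokens
	}
	if budgetTokens > maxTokens {
		budgetTokens = maxTokens
	}
	ratio := float64(budgetTokens-minBudgetTokens) / float64(maxTokens-minBudgetTokens)
	switch {
	case ratio <= 0.25:
		return "low"
	case ratio <= 0.60:
		return "medium"
	default:
		return "high"
	}
}
```

For instance, `effortFromBudget(3000, 1024, 4096)` returns "high" because 1976 / 3072 is about 0.64, which is above the 0.60 threshold.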

Different providers have different constraints on reasoning budget:

| Provider | File | MinBudgetTokens | Reason |
|---|---|---|---|
| Anthropic | core/providers/anthropic/types.go | 1024 | Anthropic API requirement |
| Bedrock Anthropic | core/providers/bedrock/types.go | 1024 | Same as Anthropic |
| Bedrock Nova | core/providers/bedrock/types.go | 1 | More flexible |
| Cohere | core/providers/cohere/types.go | 1 | Flexible |
| Gemini | core/providers/gemini/types.go | 1024 | Default minimum for conversions |

Default Completion Tokens (for ratio calculation)

When max_completion_tokens is not provided, these defaults are used for ratio calculations:

| Provider | Default | File |
|---|---|---|
| OpenAI, Anthropic, Cohere, Bedrock | 4096 | core/providers/*/types.go |
| Gemini | 8192 | core/providers/gemini/types.go |

Example 1: Estimate tokens from effort (Anthropic)

Input:

{
"model": "anthropic/claude-3-5-sonnet",
"max_completion_tokens": 2000,
"reasoning": {
"effort": "high"
}
}

Conversion Process:

  1. effort = "high" → ratio = 0.80
  2. minBudgetTokens = 1024 (Anthropic)
  3. maxCompletionTokens = 2000
  4. budget = 1024 + (0.80 × (2000 - 1024))
  5. budget = 1024 + (0.80 × 976)
  6. budget = 1024 + 780
  7. Result: 1804 tokens

Anthropic Request Generated:

{
"thinking": {
"type": "enabled",
"budget_tokens": 1804
}
}

Example 2: Estimate effort from tokens (Bedrock Nova)

Input:

{
"model": "bedrock/us.amazon.nova-pro-v1:0",
"max_completion_tokens": 4096,
"reasoning": {
"max_tokens": 2000
}
}

Conversion Process:

  1. budgetTokens = 2000
  2. minBudgetTokens = 1 (Nova)
  3. maxCompletionTokens = 4096
  4. ratio = (2000 - 1) / (4096 - 1)
  5. ratio = 1999 / 4095
  6. ratio = 0.488 (48.8%)
  7. Since 0.25 < 0.488 ≤ 0.60 → Result: "medium"

Bedrock Nova Request Generated:

{
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "medium"
}
}

Example 3: Both fields provided (priority used)

Input:

{
"model": "anthropic/claude-3-5-sonnet",
"max_completion_tokens": 4096,
"reasoning": {
"effort": "medium",
"max_tokens": 2500
}
}

Logic for Max-Tokens-Based Provider:

  1. Check: Is max_tokens provided? → YES
  2. Use max_tokens directly (ignore effort)
  3. Validate: 2500 >= 1024? → YES

Anthropic Request Generated:

{
"thinking": {
"type": "enabled",
"budget_tokens": 2500
}
}

Note: The effort: "medium" is completely ignored because max_tokens takes priority.


All providers return reasoning in a normalized reasoning_details array:

{
"choices": [{
"message": {
"role": "assistant",
"content": "Final response text",
"reasoning_details": [
{
"index": 0,
"type": "text",
"text": "Step-by-step reasoning content...",
"signature": "optional_signature_for_verification"
}
]
}
}]
}

| Field | Type | Description | Present In |
|---|---|---|---|
| index | int | Position in reasoning sequence | All |
| type | string | Content type (text, encrypted, summary) | All |
| text | string | Reasoning content | Chat Completions |
| summary | string | Reasoning summary | Responses API |
| signature | string | Cryptographic signature for verification | Anthropic, Bedrock |

| Reasoning Type | When Used | Source |
|---|---|---|
| reasoning.text | Direct thinking/reasoning content | Anthropic, Gemini, Bedrock |
| reasoning.encrypted | Signature-verified reasoning | Anthropic, Bedrock Nova |
| reasoning.summary | Summarized reasoning (Responses API) | All providers |

| Provider | Reasoning Event | Signature Event |
|---|---|---|
| OpenAI | reasoning (top-level) | N/A |
| Anthropic | thinking_delta | signature_delta |
| Bedrock | thinking_delta | signature_delta |
| Gemini | thought (in content) | thought_signature |

// Stream events
event: content_block_start
data: {"type": "content_block_start", "content_block": {"type": "thinking"}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Let me"}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": " analyze..."}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "EqoB..."}}
event: content_block_stop
data: {"type": "content_block_stop"}
// Thinking delta
{
"choices": [{
"delta": {
"reasoning_details": [{
"index": 0,
"type": "text",
"text": "Let me analyze..."
}]
}
}]
}
// Signature delta
{
"choices": [{
"delta": {
"reasoning_details": [{
"index": 0,
"signature": "EqoB..."
}]
}
}]
}
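
On the client side, the streamed deltas shown above have to be merged back into complete reasoning blocks. The sketch below keys blocks by `index` and appends fragments in arrival order; `ReasoningDelta`, `ReasoningBlock`, and `Accumulator` are hypothetical names, and treating signature fragments as appendable is an assumption.

```go
package main

// ReasoningDelta models one reasoning_details entry from a stream chunk.
type ReasoningDelta struct {
	Index     int
	Type      string
	Text      string
	Signature string
}

// ReasoningBlock is the merged result for one index.
type ReasoningBlock struct {
	Text      string
	Signature string
}

// Accumulator merges deltas that share the same index.
type Accumulator struct {
	blocks map[int]*ReasoningBlock
}

// Add appends the delta's text and signature fragments to its block.
func (a *Accumulator) Add(d ReasoningDelta) {
	if a.blocks == nil {
		a.blocks = make(map[int]*ReasoningBlock)
	}
	b, ok := a.blocks[d.Index]
	if !ok {
		b = &ReasoningBlock{}
		a.blocks[d.Index] = b
	}
	b.Text += d.Text
	b.Signature += d.Signature
}

// Block returns a copy of the merged block for index, if any delta was seen.
func (a *Accumulator) Block(index int) (ReasoningBlock, bool) {
	b, ok := a.blocks[index]
	if !ok {
		return ReasoningBlock{}, false
	}
	return *b, true
}
```

Feeding the two thinking deltas and the signature delta from the stream above yields one block with text "Let me analyze..." and signature "EqoB...".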

Minimum Budget (Anthropic/Bedrock)

Severity: High
Behavior: reasoning.max_tokens must be >= 1024
Impact: Requests with lower values fail with an error
Workaround: Always set max_tokens >= 1024 for Anthropic and Bedrock Anthropic models

Dynamic Budget Not Supported

Severity: Medium
Behavior: reasoning.max_tokens = -1 is converted to 1024
Impact: Dynamic budgeting is not available on Anthropic/Bedrock
Workaround: Set an explicit token budget

Effort Level Normalization

Severity: Low
Behavior: OpenAI’s minimal is converted to low when routing to other providers
Impact: Slightly different reasoning behavior

Signature Field Provider-Specific

Severity: Low
Behavior: The signature field is only present in Anthropic/Bedrock responses
Impact: Signature-based verification is only available for these providers

Thinking Type Always Enabled

Severity: Low
Behavior: Anthropic’s thinking.type is always set to "enabled" regardless of effort
Impact: Thinking cannot be disabled once the reasoning param is present

Gemini: Only One Parameter Sent

Severity: Medium
Behavior: When both effort and max_tokens are provided, only thinkingBudget is sent to Gemini (effort is dropped)
Impact: The effort value is completely ignored when max_tokens is present
Workaround: Provide only the parameter you want to use

Gemini: Model Version Differences

Severity: Medium
Behavior: Gemini 2.5 only supports thinkingBudget, while 3.0+ supports both thinkingBudget and thinkingLevel
Impact: Effort-only requests on 2.5 are converted to a budget; on 3.0+ they use native levels
Note: DeepIntShield automatically detects the version and uses the appropriate conversion

Gemini Pro: Limited Level Support

Severity: Low
Behavior: Pro models only support "low" and "high" thinking levels
Impact: "minimal" → "low", "medium" → "high" for Pro models
Note: Non-Pro models support all four levels: minimal, low, medium, high


| Provider | Model Type | Budget Type | Min Budget | Signature Support |
|---|---|---|---|---|
| OpenAI | Effort-based | Effort-based | None | |
| Anthropic | Thinking blocks | Token budget | 1024 | |
| Bedrock (Anthropic) | Reasoning config | Token budget | 1024 | |
| Bedrock (Nova) | Reasoning config | Effort-based | None | |
| Gemini 2.5+ | Thinking config | Token budget | 1024 | |
| Gemini 3.0+ | Thinking config | Dual (budget + level) | 1024 | |

| Provider | effort | max_tokens | summary | Streaming |
|---|---|---|---|---|
| OpenAI | ✅ (4 levels) | | | |
| Anthropic | ❌ (binary) | | | |
| Bedrock (Anthropic) | ❌ (binary) | | | |
| Bedrock (Nova) | ✅ (3 levels) | ⚠️ (ignored) | | |
| Gemini 2.5+ | ⚠️ (converts to budget) | | | |
| Gemini 3.0+ | ✅ (4 levels) | | | |

Anthropic: “reasoning.max_tokens must be >= 1024”

Cause: Attempting to use reasoning with max_tokens < 1024

Solution: Ensure reasoning.max_tokens >= 1024 for Anthropic/Bedrock Anthropic models

// ❌ Invalid
{"reasoning": {"effort": "high", "max_tokens": 500}}
// ✅ Valid
{"reasoning": {"effort": "high", "max_tokens": 1024}}

Cause (OpenAI): Using an older model that doesn’t support reasoning (e.g., gpt-4-turbo)

Solution: Use a model with native reasoning support, such as the o1 series

Bedrock Nova: max_tokens parameter being ignored

Expected Behavior: Bedrock Nova uses effort-based reasoning only

Solution: Provide effort parameter instead of max_tokens for Nova models

// ✅ Correct for Nova
{"reasoning": {"effort": "high"}}