
Code Mode

Code Mode is a transformative approach to using MCP that solves a critical problem at scale:

The Problem: When you connect 8-10 MCP servers (150+ tools), every single request includes all tool definitions in the context. The LLM spends most of its budget reading tool catalogs instead of doing actual work.

The Solution: Instead of exposing 150 tools directly, Code Mode exposes just four generic tools. The LLM uses those tools to write Python code (Starlark) that orchestrates everything else in a sandbox.

Compare a workflow across 5 MCP servers with ~100 tools:

Classic MCP Flow:

  • 6 LLM turns
  • 100 tools in context every turn (600 tool definitions sent in total)
  • All intermediate results flow through the model

Code Mode Flow:

  • 3-4 LLM turns
  • Only 4 tools + definitions on-demand
  • Intermediate results processed in sandbox

Result: ~50% cost reduction + 30-40% faster execution
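
The comparison above comes down to simple counting. A minimal sketch, assuming every turn resends the full set of exposed tool definitions and ignoring the stubs that Code Mode loads on demand (both are simplifications for illustration, not measurements):

```python
def definitions_sent(turns: int, tools_in_context: int) -> int:
    """Total tool definitions transmitted across all turns of one workflow."""
    return turns * tools_in_context


# Classic MCP: 6 turns, all 100 tools in context each time.
classic = definitions_sent(turns=6, tools_in_context=100)

# Code Mode: up to 4 turns, only the 4 meta-tools in context each time.
code_mode = definitions_sent(turns=4, tools_in_context=4)

print(classic)    # 600
print(code_mode)  # 16
```

The real saving depends on how large each definition is and how many stubs the model chooses to read, so treat these counts as an upper-level intuition only.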

Code Mode provides four meta-tools to the AI:

  1. listToolFiles - Discover available MCP servers
  2. readToolFile - Load Python stub signatures on-demand
  3. getToolDocs - Get detailed documentation for a specific tool
  4. executeToolCode - Execute Python code with full tool bindings

Enable Code Mode if you have:

  • ✅ 3+ MCP servers connected
  • ✅ Complex multi-step workflows
  • ✅ Concerns about token costs or latency
  • ✅ Tools that need to interact with each other

Keep Classic MCP if you have:

  • ✅ Only 1-2 small MCP servers
  • ✅ Simple, direct tool calls
  • ✅ Very latency-sensitive use cases (though Code Mode is usually faster)

You can mix both: Enable Code Mode for “heavy” servers (web, documents, databases) and keep small utilities as direct tools.


Instead of seeing 150+ tool definitions, the model sees four generic tools:

graph LR
LLM["<b>LLM Context</b><br/><i>Compact & Efficient</i>"]
List["<b>listToolFiles</b><br/>Discover servers"]
Read["<b>readToolFile</b><br/>Load signatures"]
Docs["<b>getToolDocs</b><br/>Get detailed docs"]
Execute["<b>executeToolCode</b><br/>Run code with bindings"]
Hidden["<i>All other MCP servers<br/>hidden behind these 4 tools</i>"]
LLM --> List
LLM --> Read
LLM --> Docs
LLM --> Execute
List -.-> Hidden
Read -.-> Hidden
Docs -.-> Hidden
Execute -.-> Hidden
style LLM fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
style List fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
style Read fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
style Docs fill:#E1F5FE,stroke:#0288D1,stroke-width:2.5px,color:#1A1A1A
style Execute fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
style Hidden fill:#EEEEEE,stroke:#424242,stroke-width:1.5px,stroke-dasharray: 5 5,color:#1A1A1A

The flow for a single request:

graph LR
User["<b>1. User Request</b><br/>Search YouTube<br/>& save to file"]
Discover["<b>2. Discover Tools</b><br/>listToolFiles()"]
GetDefs["<b>3. Load Definitions</b><br/>readToolFile()"]
Write["<b>4. Write Code</b><br/>Python<br/>in sandbox"]
Execute["<b>5. Execute</b><br/>Real MCP calls<br/>contained in VM"]
Result["<b>6. Compact Result</b><br/>{saved:10}"]
Response["<b>7. Final Response</b><br/>Found & saved<br/>10 videos"]
User --> Discover
Discover --> GetDefs
GetDefs --> Write
Write --> Execute
Execute --> Result
Result --> Response
style User fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
style Discover fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
style GetDefs fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
style Write fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
style Execute fill:#E8F5E9,stroke:#1B5E20,stroke-width:3px,color:#1A1A1A
style Result fill:#FFFDE7,stroke:#F57F17,stroke-width:2.5px,color:#1A1A1A
style Response fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A

Key insight: All the complex orchestration happens inside the sandbox. The LLM only receives the final, compact result—not every intermediate step.
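
As an illustration of that point, here is roughly what the model might submit to executeToolCode for the "search YouTube and save to file" request. The youtube and filesystem objects below are stand-in stubs so the sketch runs as plain Python. The tool names (search, write_file), their parameters, and the response shape are assumptions, not the real servers' schemas.

```python
# Stand-in stubs (hypothetical): in Code Mode the sandbox provides these globals.
class _FakeYouTube:
    def search(self, query, maxResults=5):
        items = [{"snippet": {"title": "%s video %d" % (query, i)}} for i in range(maxResults)]
        return {"items": items}


class _FakeFilesystem:
    def __init__(self):
        self.files = {}

    def write_file(self, path, content):
        self.files[path] = content
        return {"ok": True}


youtube = _FakeYouTube()
filesystem = _FakeFilesystem()

# --- The part the model would write (no imports, no classes) ---
response = youtube.search(query="AI news", maxResults=10)
titles = [item["snippet"]["title"] for item in response.get("items", [])]
filesystem.write_file(path="videos.txt", content="\n".join(titles))
result = {"saved": len(titles)}  # only this compact value goes back to the LLM
print(result)
```

The ten search results and the file contents never enter the model's context. Only the small result dict does.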


Classic MCP Flow:

Turn 1: Prompt + search query + [100 tool definitions]
Turn 2: Prompt + search result + [100 tool definitions]
Turn 3: Prompt + channel list + [100 tool definitions]
Turn 4: Prompt + video list + [100 tool definitions]
Turn 5: Prompt + summaries + [100 tool definitions]
Turn 6: Prompt + doc result + [100 tool definitions]
Total: 6 LLM calls, 600 tool definitions sent (100 per turn)

Code Mode Flow:

Turn 1: Prompt + 4 tools (listToolFiles, readToolFile, getToolDocs, executeToolCode)
Turn 2: Prompt + server list + 4 tools
Turn 3: Prompt + selected definitions + 4 tools + [EXECUTES CODE]
        [YouTube search, channel list, videos, summaries, doc creation all happen in sandbox]
Turn 4: Prompt + final result + 4 tools
Total: 3-4 LLM calls, 4 tool definitions per turn plus the definitions loaded on demand

Result: ~50% cost reduction and roughly half as many LLM round trips (3-4 instead of 6)

Code Mode must be enabled per MCP client. Once enabled, that client’s tools are accessed through the four meta-tools rather than exposed directly.

Best practice: Enable Code Mode for 3+ servers or any “heavy” server (web search, documents, databases).

  1. Navigate to MCP Gateway in the sidebar
  2. Click on a client row to open the configuration sheet
  3. In the Basic Information section, toggle Code Mode Client to enabled
  4. Click Save Changes

Once enabled:

  • This client’s tools are no longer in the default tool list
  • They become accessible through listToolFiles() and readToolFile()
  • The AI can write code using executeToolCode() to call them

The equivalent configuration in Go:

mcpConfig := &schemas.MCPConfig{
	ClientConfigs: []schemas.MCPClientConfig{
		{
			Name:             "youtube",
			ConnectionType:   schemas.MCPConnectionTypeHTTP,
			ConnectionString: deepintshield.Ptr("http://localhost:3001/mcp"),
			ToolsToExecute:   []string{"*"},
			IsCodeModeClient: true, // Enable code mode
		},
		{
			Name:           "filesystem",
			ConnectionType: schemas.MCPConnectionTypeSTDIO,
			StdioConfig: &schemas.MCPStdioConfig{
				Command: "npx",
				Args:    []string{"-y", "@anthropic/mcp-filesystem"},
			},
			ToolsToExecute:   []string{"*"},
			IsCodeModeClient: true, // Enable code mode
		},
	},
}

When Code Mode clients are connected, DeepIntShield automatically adds four meta-tools to every request:

listToolFiles

Lists all available virtual .pyi stub files for connected code mode servers.

Example output (Server-level binding):

servers/
  youtube.pyi
  filesystem.pyi

Example output (Tool-level binding):

servers/
  youtube/
    search.pyi
    get_video.pyi
  filesystem/
    read_file.pyi
    write_file.pyi

readToolFile

Reads a virtual .pyi file to get compact Python function signatures for tools.

Parameters:

  • fileName (required): Path like servers/youtube.pyi or servers/youtube/search.pyi
  • startLine (optional): 1-based starting line for partial reads
  • endLine (optional): 1-based ending line for partial reads

Example output:

# youtube server tools
# Usage: youtube.tool_name(param=value)
# For detailed docs: use getToolDocs(server="youtube", tool="tool_name")
def search(query: str, maxResults: int = None) -> dict: # Search for videos
def get_video(id: str) -> dict: # Get video details
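
The startLine and endLine parameters let the model read only part of a large stub file. A minimal sketch of how such a partial read could behave, assuming the range is inclusive at both ends and that out-of-range values are clamped. The source only states that both parameters are 1-based, so both of those choices and the read_lines helper itself are hypothetical:

```python
def read_lines(text, start_line=None, end_line=None):
    """Return lines start_line..end_line of text (1-based, end inclusive by assumption)."""
    lines = text.splitlines()
    start = 1 if start_line is None else max(start_line, 1)
    end = len(lines) if end_line is None else min(end_line, len(lines))
    return "\n".join(lines[start - 1:end])


stub = "\n".join([
    "# youtube server tools",
    "# Usage: youtube.tool_name(param=value)",
    "def search(query: str, maxResults: int = None) -> dict:  # Search for videos",
    "def get_video(id: str) -> dict:  # Get video details",
])

# Read only the third line of the stub file.
print(read_lines(stub, start_line=3, end_line=3))
```

Omitting both parameters returns the whole file, which matches the default readToolFile behavior described above.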

getToolDocs

Get detailed documentation for a specific tool when the compact signature from readToolFile is not sufficient.

Parameters:

  • server (required): The server name (e.g., "youtube")
  • tool (required): The tool name (e.g., "search")

Example output:

# ============================================================================
# Documentation for youtube.search tool
# ============================================================================
#
# USAGE INSTRUCTIONS:
# Call tools using: result = youtube.tool_name(param=value)
# No async/await needed - calls are synchronous.
#
# CRITICAL - HANDLING RESPONSES:
# Tool responses are dicts. To avoid runtime errors:
# 1. Use print(result) to inspect the response structure first
# 2. Access dict values with brackets: result["key"] NOT result.key
# 3. Use .get() for safe access: result.get("key", default)
# ============================================================================
def search(query: str, maxResults: int = None) -> dict:
    """
    Search for videos on YouTube.

    Args:
        query (str): Search query (required)
        maxResults (int): Max results to return (optional)

    Returns:
        dict: Response from the tool. Structure varies by tool.
        Use print(result) to inspect the actual structure.

    Example:
        result = youtube.search(query="...")
        print(result) # Always inspect response first!
        value = result.get("key", default) # Safe access
    """
    ...

executeToolCode

Executes Python code in a sandboxed Starlark interpreter with access to all code mode server tools.

Parameters:

  • code (required): Python code to execute

Execution Environment:

  • Python code runs in a Starlark interpreter (Python subset)
  • All code mode servers are exposed as global objects (e.g., youtube, filesystem)
  • Tool calls are synchronous - no async/await needed
  • Use print() for logging (output captured in logs)
  • Assign to result variable to return a value
  • Tool execution timeout applies (default 30s)

Syntax notes:

  • Use keyword arguments: server.tool(param="value") NOT server.tool({"param": "value"})
  • Access dict values with brackets: result["key"] NOT result.key
  • List comprehensions work: [x for x in items if x["active"]]

Example code:

# Search YouTube and return formatted results
results = youtube.search(query="AI news", maxResults=5)
titles = [item["snippet"]["title"] for item in results["items"]]
print("Found", len(titles), "videos")
result = {"titles": titles, "count": len(titles)}

Code Mode supports two binding levels that control how tools are organized in the virtual file system:

Server-level binding

All tools from a server are grouped into a single .pyi file.

servers/
  youtube.pyi       ← Contains all youtube tools
  filesystem.pyi    ← Contains all filesystem tools

Best for:

  • Servers with few tools
  • When you want to see all tools at once
  • Simpler discovery workflow

Tool-level binding

Each tool gets its own .pyi file.

servers/
  youtube/
    search.pyi
    get_video.pyi
    get_channel.pyi
  filesystem/
    read_file.pyi
    write_file.pyi
    list_directory.pyi

Best for:

  • Servers with many tools
  • When tools have large/complex schemas
  • More focused documentation per tool

Binding level is a global setting that controls how Code Mode’s virtual file system is organized. It affects how the AI discovers and loads tool definitions.

Binding level can be viewed in the MCP configuration overview:

  • Server-level (default): One .pyi file per MCP server
    • Use when: 5-20 tools per server, want simple discovery
    • Example: servers/youtube.pyi contains all YouTube tools
  • Tool-level: One .pyi file per individual tool
    • Use when: 30+ tools per server, want minimal context bloat
    • Example: servers/youtube/search.pyi, servers/youtube/list_channels.pyi

Both modes use the same four-tool interface (listToolFiles, readToolFile, getToolDocs, executeToolCode). The choice is purely about context efficiency per read operation.
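
To make the difference concrete, the following sketch builds the virtual paths that listToolFiles might return under each binding level. The list_tool_files helper and the "server" / "tool" setting values are hypothetical names chosen for illustration. Only the path layout is taken from the examples above.

```python
def list_tool_files(servers, binding_level="server"):
    """Build virtual .pyi paths for a {server: [tool, ...]} mapping."""
    paths = []
    for server in sorted(servers):
        if binding_level == "server":
            # One stub file per server, containing every tool.
            paths.append("servers/%s.pyi" % server)
        elif binding_level == "tool":
            # One stub file per tool, grouped in a per-server directory.
            for tool in servers[server]:
                paths.append("servers/%s/%s.pyi" % (server, tool))
        else:
            raise ValueError("unknown binding level: %r" % binding_level)
    return paths


servers = {
    "youtube": ["search", "get_video"],
    "filesystem": ["read_file", "write_file"],
}

print(list_tool_files(servers, "server"))
print(list_tool_files(servers, "tool"))
```

With server-level binding a single read loads every signature for a server. With tool-level binding the model pays only for the tools it actually opens, at the cost of more read calls.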


Code Mode tools can be auto-executed in Agent Mode, but with additional validation:

  1. The listToolFiles and readToolFile tools are always auto-executable (they’re read-only)
  2. The executeToolCode tool is auto-executable only if all tool calls within the code are allowed

When executeToolCode is called in agent mode:

  1. DeepIntShield parses the Python code
  2. Extracts all serverName.toolName() calls
  3. Checks each call against tools_to_auto_execute for that server
  4. If ALL calls are allowed → auto-execute
  5. If ANY call is not allowed → return to user for approval

Example:

{
  "name": "youtube",
  "tools_to_execute": ["*"],
  "tools_to_auto_execute": ["search"],
  "is_code_mode_client": true
}

# This code WILL auto-execute (only uses search)
results = youtube.search(query="AI")
result = results

# This code will NOT auto-execute (uses delete_video, which is not in the auto-execute list)
youtube.delete_video(id="abc123")
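
The validation steps can be sketched in a few lines. This is an illustration of the idea, not DeepIntShield's implementation: it uses Python's ast module as a stand-in parser (the submitted code is Python-syntax, so it parses), and the function names are hypothetical. It also assumes that code containing no tool calls at all is safe to auto-execute.

```python
import ast


def extract_tool_calls(code, server_names):
    """Return the set of (server, tool) pairs called as server.tool(...) in code."""
    calls = set()
    for node in ast.walk(ast.parse(code)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in server_names  # skip method calls such as results.get(...)
        ):
            calls.add((func.value.id, func.attr))
    return calls


def can_auto_execute(code, server_names, auto_execute):
    """True only if every server.tool() call is listed for its server."""
    calls = extract_tool_calls(code, server_names)
    return all(tool in auto_execute.get(server, []) for server, tool in calls)


servers = {"youtube", "filesystem"}
auto_execute = {"youtube": ["search"]}  # mirrors tools_to_auto_execute in the example

print(can_auto_execute('results = youtube.search(query="AI")\nresult = results', servers, auto_execute))  # True
print(can_auto_execute('youtube.delete_video(id="abc123")', servers, auto_execute))  # False
```

A single disallowed call anywhere in the script is enough to send the whole script back for approval, which matches the all-or-nothing rule in the steps above.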

| Available              | Not Available        |
| ---------------------- | -------------------- |
| Python-like syntax     | import statements    |
| Synchronous tool calls | Classes (use dicts)  |
| print() for logging    | File I/O             |
| Dict/List operations   | Network access       |
| List comprehensions    | random, time modules |

Engine: Starlark interpreter (Python subset)

Tool Exposure: Tools from code mode clients are exposed as global objects:

# If you have a 'youtube' code mode client with a 'search' tool
results = youtube.search(query="AI news")

Code Processing:

  1. Code is validated for syntax errors
  2. Tool calls are extracted and validated
  3. Code executes in isolated Starlark context
  4. Result variable is automatically serialized to JSON
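
The contract described by these steps (validate, run, capture print() output, serialize result) can be mimicked in plain Python. This sketch is only a model of that contract: Python's exec is NOT a sandbox and is used here purely as a stand-in for the Starlark interpreter, the tool-call validation step is omitted, and run_snippet is a hypothetical name.

```python
import ast
import contextlib
import io
import json


def run_snippet(code, tool_globals):
    """Validate, run, and serialize a snippet; returns (result_json, logs)."""
    ast.parse(code)  # raises SyntaxError before anything runs if the code is invalid
    scope = dict(tool_globals)  # fresh namespace for each execution
    logs = io.StringIO()
    with contextlib.redirect_stdout(logs):
        exec(code, scope)  # stand-in only: exec provides no isolation
    return json.dumps(scope.get("result")), logs.getvalue()


snippet = 'items = [1, 2, 3]\nprint("Found", len(items))\nresult = {"count": len(items)}'
result_json, logs = run_snippet(snippet, {})

print(result_json)    # {"count": 3}
print(logs, end="")   # Found 3
```

A snippet that never assigns result serializes to null under this sketch. How the real interpreter treats that case is not stated in this document.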

Execution Limits:

  • Default timeout: 30 seconds per tool execution
  • Memory isolation: Each execution gets its own context
  • No access to host file system or network
  • Logs captured from print() calls

DeepIntShield provides detailed error messages with hints:

# Error: youtube is not defined
# Hints:
# - Variable or identifier 'youtube' is not defined
# - Available server keys: youtubeAPI, filesystem
# - Use one of the available server keys as the object name

Timeouts:

  • Default: 30 seconds per tool call
  • Configure via tool_execution_timeout in tool_manager_config
  • Long-running operations are interrupted with a timeout error

Scenario: E-commerce Assistant with Multiple Services


Setup:

  • 10 MCP servers (product catalog, inventory, payments, shipping, chat, analytics, docs, images, calendar, notifications)
  • Average 15 tools per server = 150 total tools
  • Complex multi-step task: “Find matching products, check inventory, compare prices, get shipping estimate, create quote”

Classic MCP:

| Metric              | Value           |
| ------------------- | --------------- |
| LLM Turns           | 8-10            |
| Tokens in Tool Defs | ~2,400 per turn |
| Avg Request Tokens  | 4,000-5,000     |
| Avg Total Cost      | $3.20-4.00      |
| Latency             | 18-25 seconds   |

Problem: Most context goes to tool definitions. Model makes redundant tool calls. Every intermediate result travels back through the LLM.

Code Mode:

| Metric              | Value             |
| ------------------- | ----------------- |
| LLM Turns           | 3-4               |
| Tokens in Tool Defs | ~100-300 per turn |
| Avg Request Tokens  | 1,500-2,000       |
| Avg Total Cost      | $1.20-1.80        |
| Latency             | 8-12 seconds      |

Benefit: Model writes one Python script. All orchestration happens in sandbox. Only compact result returned to LLM.
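
The size of the improvement can be read off the two sets of figures. A small worked calculation, assuming the midpoint of each quoted range is a fair summary (an assumption made here for illustration. The source gives only ranges):

```python
def midpoint(low, high):
    return (low + high) / 2


def reduction_pct(before, after):
    """Percentage decrease from before to after, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)


classic = {
    "cost_usd": midpoint(3.20, 4.00),
    "latency_s": midpoint(18, 25),
    "request_tokens": midpoint(4000, 5000),
}
code_mode = {
    "cost_usd": midpoint(1.20, 1.80),
    "latency_s": midpoint(8, 12),
    "request_tokens": midpoint(1500, 2000),
}

for metric in classic:
    print(metric, reduction_pct(classic[metric], code_mode[metric]))
# cost_usd 58.3
# latency_s 53.5
# request_tokens 61.1
```

On these midpoints the scenario shows reductions somewhat above the roughly 50% headline figure, which is plausible for a workload with 150 tools rather than 100.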

