Token-Efficient Multi-Tool Orchestration with Progressive Disclosure
Most MCP (Model Context Protocol) clients share a costly inefficiency: they load every tool definition into the context window upfront. With 63 tools across 9 servers, that schema overhead is paid before any work starts, and every intermediate tool result also passes through the context.
Instead of loading 63 tool definitions upfront, expose one primary tool to the LLM, execute_code, plus three small discovery helpers. The LLM writes code that discovers and calls the underlying tools on demand.
Key Benefits:
┌─────────────────────────────────────────────────────────────────────────┐
│ CLAUDE CODE CLI / API │
│ │
│ Context Window: 200K tokens │
│ MCP Config: ~/.claude/mcp.json │
│ Loads: code-executor ONLY (4 tools, ~100 tokens) │
└────────────────────────────────┬────────────────────────────────────────┘
│ stdio
│ docker exec -i mcp-code-executor npx tsx mcp-server.ts
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ MCP CODE-EXECUTOR SERVER │
│ │
│ Container: mcp-code-executor │
│ Exposes 4 MCP Tools: │
│ • execute_code(code, timeout) ← PRIMARY INTERFACE │
│ • search_tools(query, server, detail) ← Progressive discovery │
│ • get_tool_info(server, tool, detail) ← On-demand metadata │
│ • list_mcp_tools() ← Tool inventory │
│ │
│ MCP Server (stdio): /app/mcp-server.ts │
└────────────────────────────────┬────────────────────────────────────────┘
│ HTTP (container port 3000, published on the host as 9091)
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ CODE EXECUTION ENGINE │
│ │
│ Fastify HTTP API: localhost:9091 │
│ Sandbox: Node 20 + Python 3 │
│ Workspace: /workspace/servers/ (63 TypeScript wrappers) │
│ │
│ POST /execute │
│ - Wraps code in async IIFE if needed │
│ - Executes in /tmp/executions (tmpfs, noexec) │
│ - Returns: output, executionTime, metrics │
│ │
│ GET /tools/search?query=&server=&detail= │
│ - Progressive disclosure API │
│ - name: 245 tokens (97% savings) │
│ - description: 1,181 tokens (85% savings) │
│ - full: ~7,685 tokens (only when needed) │
└────────────────────────────────┬────────────────────────────────────────┘
│
┌──────────────┴──────────────┐
│ │
▼ ▼
┌──────────────────────────────┐ ┌─────────────────────────────────────┐
│ TYPESCRIPT TOOL WRAPPERS │ │ MCP HTTP CLIENT │
│ (Generated) │ │ │
│ │ │ /app/client.ts │
│ /workspace/servers/ │ │ callMCPTool(server, tool, args) │
│ ├── filesystem/ │ │ │
│ │ ├── read_file.ts │ │ Connects to MCP Proxy via HTTP │
│ │ ├── write_file.ts │ └──────────────┬──────────────────────┘
│ │ └── ...9 tools │ │ HTTP
│ ├── postgres/ │ ▼
│ │ └── execute_sql.ts │ ┌─────────────────────────────────────┐
│ ├── timescaledb/ │ │ TBXark MCP PROXY │
│ ├── minio/ │ │ │
│ ├── playwright/ │ │ Container: mcp-proxy │
│ ├── memory/ │ │ Port: 9090 │
│ ├── n8n/ │ │ Transport: Streamable HTTP │
│ ├── ib/ │ │ Config: /config/config.json │
│ └── arangodb/ │ │ │
│ └── ...63 total │ │ Routes: │
└──────────────────────────────┘ │ /filesystem/mcp │
│ /postgres/mcp │
│ /timescaledb/mcp │
│ /... (9 servers total) │
└──────────────┬──────────────────────┘
│ stdio
┌──────────────────────────────┴────────────────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ MCP SERVERS │ │ MCP SERVERS │ │ MCP SERVERS │
│ │ │ │ │ │
│ Filesystem │ │ PostgreSQL │ │ Playwright │
│ (9 tools) │ │ (1 tool) │ │ (6 tools) │
│ │ │ │ │ │
│ Memory │ │ MinIO │ │ N8N │
│ (9 tools) │ │ (9 tools) │ │ (6 tools) │
│ │ │ │ │ │
│ TimescaleDB │ │ IB Trading │ │ ArangoDB │
│ (6 tools) │ │ (10 tools) │ │ (7 tools) │
└──────────────────┘ └──────────────────┘ └──────────────────┘
User: "Query database, filter results, upload to S3"
1. Load 63 tool schemas (~800 tokens)
2. Call postgres.execute_sql
→ Returns 10K rows (50K tokens!)
3. Claude filters in context window
→ Processing 50K tokens
4. Call minio.upload_object
→ Upload filtered data (20K tokens)
Total: ~70K tokens consumed
User: "Query database, filter results, upload to S3"
1. Load code-executor (~100 tokens)
2. Claude writes TypeScript code:
→ Import timescaledb wrapper
→ Import minio wrapper
→ Execute query IN sandbox
→ Filter data IN sandbox
→ Upload to S3 IN sandbox
→ Return: "Uploaded 145 rows"
Total: ~2K tokens consumed
Savings: 97.1%
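The savings figure follows directly from the two totals above. A minimal sketch of the arithmetic, where `savingsPercent` is an illustrative helper name and not part of the project:

```typescript
// Hypothetical helper: percentage of tokens saved when moving from the old
// pattern to the new one, rounded to one decimal place.
function savingsPercent(oldTokens: number, newTokens: number): number {
  if (oldTokens <= 0) throw new RangeError('oldTokens must be positive');
  return Math.round(((oldTokens - newTokens) / oldTokens) * 1000) / 10;
}

// Totals taken from the walkthrough above: ~70K tokens vs ~2K tokens.
console.log(savingsPercent(70_000, 2_000)); // 97.1
```

The inputs are the approximate totals quoted in the comparison, so the result is only as precise as those estimates.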
File: /home/administrator/projects/mcp/code-executor/mcp-server.ts
Purpose: Exposes code execution as an MCP server for Claude Code CLI integration.
| Tool Name | Description | Parameters |
|---|---|---|
| execute_code | Execute TypeScript/JavaScript code with access to all MCP tools | code: string, timeout?: number |
| search_tools | Search available MCP tools with progressive disclosure | query?: string, server?: string, detail?: 'name' \| 'description' \| 'full' |
| get_tool_info | Get detailed information about a specific MCP tool | server: string, tool: string, detail?: string |
| list_mcp_tools | List all available MCP tools across all servers | (none) |
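// mcp-server.ts - Simplified excerpt
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js';

// Base URL of the HTTP execution engine (same container, so localhost)
const EXECUTOR_URL = process.env.CODE_EXECUTOR_URL ?? 'http://localhost:3000';

const server = new Server(
  { name: 'code-executor', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  switch (request.params.name) {
    case 'execute_code': {
      const { code, timeout } = request.params.arguments as { code: string; timeout?: number };
      // Forward to HTTP execution engine
      const response = await fetch(`${EXECUTOR_URL}/execute`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ code, timeout })
      });
      if (!response.ok) {
        throw new Error(`Execution engine returned HTTP ${response.status}`);
      }
      // MCP tool results are returned as a content array
      const result = await response.json();
      return { content: [{ type: 'text', text: JSON.stringify(result) }] };
    }
    // ... search_tools, get_tool_info, list_mcp_tools
    default:
      throw new Error(`Unknown tool: ${request.params.name}`);
  }
});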
File: /home/administrator/projects/mcp/code-executor/executor.ts
Purpose: Sandboxed execution environment with security controls.
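// executor.ts - Auto-wrapping logic
import { exec } from 'node:child_process';
import { randomUUID } from 'node:crypto';
import { writeFile } from 'node:fs/promises';
import { promisify } from 'node:util';

const execAsync = promisify(exec);
const MAX_OUTPUT_SIZE = 10 * 1024 * 1024; // assumed limit; the real value is not shown in this excerpt

async function executeTypeScript(code: string, timeout: number) {
  const startTime = Date.now();
  let wrappedCode = code;
  // Check if code needs async wrapping
  const needsAsyncWrap = /\bawait\s/.test(code) &&
    !/^\s*\(?\s*async\s*\(/.test(code);
  if (needsAsyncWrap) {
    // Split imports from code (matches single-line import statements only)
    const imports = code.match(/^import .+$/gm) || [];
    const rest = code.replace(/^import .+$/gm, '').trim();
    // Wrap non-import code in async IIFE
    wrappedCode = `${imports.join('\n')}\n\n` +
      `(async () => {\n${rest}\n})()` +
      `.catch(err => console.error(err));`;
  }
  // Write the code to the tmpfs execution directory (file naming is an assumption)
  const tempFile = `/tmp/executions/${randomUUID()}.ts`;
  await writeFile(tempFile, wrappedCode);
  // Execute with tsx
  const result = await execAsync(
    `tsx ${tempFile}`,
    { timeout, maxBuffer: MAX_OUTPUT_SIZE }
  );
  return {
    output: result.stdout,
    executionTime: Date.now() - startTime,
    metrics: { outputBytes: Buffer.byteLength(result.stdout) }
  };
}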
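To make the wrapping rule easier to follow, here is a standalone sketch of just the string transformation, with no file I/O and no tsx call. The function name `wrapInAsyncIife` and the sample path are illustrative assumptions; the regular expressions are the ones from the excerpt above.

```typescript
// Illustrative, standalone version of the wrapping step.
function wrapInAsyncIife(code: string): string {
  const usesAwait = /\bawait\s/.test(code);
  const alreadyWrapped = /^\s*\(?\s*async\s*\(/.test(code);
  if (!usesAwait || alreadyWrapped) return code;

  // Hoist single-line static imports: they are only legal at module top level.
  const importLine = /^import .+$/gm;
  const imports = code.match(importLine) ?? [];
  const body = code.replace(importLine, '').trim();

  return `${imports.join('\n')}\n\n(async () => {\n${body}\n})().catch(err => console.error(err));`;
}

// Hypothetical snippet of the kind the LLM might submit.
const snippet = [
  "import { read_file } from '/workspace/servers/filesystem/read_file.js';",
  "const text = await read_file({ path: '/workspace/notes.txt' });",
  'console.log(text);',
].join('\n');

console.log(wrapInAsyncIife(snippet));
```

Code without `await`, or code that already starts with an async function expression, is returned unchanged. Multi-line import statements are not hoisted by this pattern, which is a limitation worth keeping in mind.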
File: /home/administrator/projects/mcp/code-executor/generate-wrappers.ts
Purpose: Auto-generates type-safe TypeScript wrappers for all 63 MCP tools.
// /workspace/servers/filesystem/read_file.ts (auto-generated)
import { callMCPTool } from '/app/client.js';

export type ReadFileInput = {
  path: string;
};

export type ReadFileResponse = any;

/** Read complete contents of a file. Only works within allowed directories. */
export async function read_file(input: ReadFileInput): Promise<ReadFileResponse> {
  return callMCPTool<ReadFileResponse>('filesystem', 'read_file', input);
}
/workspace/servers/
├── filesystem/ (9 tools: read_file, write_file, list_directory, ...)
├── postgres/ (1 tool: execute_sql)
├── timescaledb/ (6 tools: execute_query, list_databases, ...)
├── minio/ (9 tools: list_buckets, upload_object, ...)
├── playwright/ (6 tools: navigate, screenshot, click, ...)
├── memory/ (9 tools: entities, relations, search, ...)
├── n8n/ (6 tools: workflows, execute, ...)
├── ib/ (10 tools: get_historical_data, positions, ...)
├── arangodb/ (7 tools: query, insert, update, ...)
└── discovery.ts (utility: listServers, listTools)
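The generator itself is not shown here, so the following is only a sketch of how a wrapper like read_file.ts could be produced from a tool definition. The `ToolDefinition` shape, the helper names, and the loose `Record<string, unknown>` input type are assumptions; a real generator would presumably derive the input type from the tool's schema.

```typescript
// Hypothetical shape of a tool definition; the real generator may use more fields.
interface ToolDefinition {
  name: string; // e.g. "read_file"
  description: string;
}

// snake_case -> PascalCase, e.g. "read_file" -> "ReadFile".
function toPascalCase(name: string): string {
  return name
    .split('_')
    .filter(Boolean)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join('');
}

// Build the TypeScript source text for one wrapper file.
function generateWrapperSource(server: string, tool: ToolDefinition): string {
  const typeName = toPascalCase(tool.name);
  return [
    `import { callMCPTool } from '/app/client.js';`,
    ``,
    `export type ${typeName}Input = Record<string, unknown>;`,
    `export type ${typeName}Response = any;`,
    ``,
    `/** ${tool.description} */`,
    `export async function ${tool.name}(input: ${typeName}Input): Promise<${typeName}Response> {`,
    `  return callMCPTool<${typeName}Response>('${server}', '${tool.name}', input);`,
    `}`,
    ``,
  ].join('\n');
}

const source = generateWrapperSource('filesystem', {
  name: 'read_file',
  description: 'Read complete contents of a file.',
});
console.log(source);
```

Writing each result to `/workspace/servers/<server>/<tool>.ts` would reproduce the directory layout listed above.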
File: /home/administrator/projects/mcp/code-executor/client.ts
Purpose: Communicates with TBXark MCP Proxy via HTTP to call actual MCP tools.
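// client.ts - Simplified excerpt
const MCP_PROXY_BASE_URL = 'http://mcp-proxy:9090';

export async function callMCPTool<T>(
  server: string,
  tool: string,
  args: any
): Promise<T> {
  const response = await fetch(`${MCP_PROXY_BASE_URL}/${server}/mcp`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: Date.now(),
      method: 'tools/call',
      params: { name: tool, arguments: args }
    })
  });
  if (!response.ok) {
    throw new Error(`MCP proxy returned HTTP ${response.status} for ${server}/${tool}`);
  }
  const result = await response.json();
  // JSON-RPC failures arrive in an "error" member instead of "result"
  if (result.error) {
    throw new Error(`${server}/${tool} failed: ${result.error.message}`);
  }
  // Return the text of the first content item
  return result.result.content[0].text;
}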
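The request and response handling in the client can be exercised without a running proxy by separating it into pure functions. The sketch below assumes the JSON-RPC `tools/call` shape used above; the helper names (`buildToolCall`, `extractText`, `parseToolText`) are illustrative, and the JSON-parsing fallback is an assumption about how structured results such as `result.rows` might be recovered from text content.

```typescript
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

interface JsonRpcResponse {
  result?: { content?: Array<{ type: string; text?: string }>; isError?: boolean };
  error?: { code: number; message: string };
}

// Build the request body sent to <proxy>/<server>/mcp.
function buildToolCall(id: number, tool: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name: tool, arguments: args } };
}

// Pull the first text item out of a response, surfacing both kinds of failure.
function extractText(response: JsonRpcResponse): string {
  if (response.error) {
    throw new Error(`MCP error ${response.error.code}: ${response.error.message}`);
  }
  const first = response.result?.content?.[0];
  if (!first || typeof first.text !== 'string') {
    throw new Error('MCP response contained no text content');
  }
  if (response.result?.isError) {
    throw new Error(`Tool reported an error: ${first.text}`);
  }
  return first.text;
}

// Assumption: some tools encode structured data as JSON text; fall back to the raw string.
function parseToolText(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return text;
  }
}

console.log(JSON.stringify(buildToolCall(1, 'read_file', { path: '/workspace/config.json' })));
```

Keeping these steps pure makes it straightforward to unit-test error handling separately from the network call.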
Image: ghcr.io/tbxark/mcp-proxy:latest
Config: /home/administrator/projects/mcp/proxy/config.json
Purpose: HTTP gateway that translates HTTP requests to stdio MCP protocol.
{
  "mcpProxy": {
    "addr": ":9090",
    "type": "streamable-http"
  },
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem@0.6.2", "/workspace"]
    },
    "postgres": {
      "command": "/wrappers/postgres-wrapper.sh"
    },
    "timescaledb": {
      "command": "/wrappers/timescaledb-wrapper.sh"
    }
    // ... 6 more servers
  }
}
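Each key under `mcpServers` becomes a route on the proxy, which is what the client relies on when it posts to `/<server>/mcp`. A small sketch of that mapping, assuming the route pattern shown in the architecture diagram (the `ProxyConfig` type and `proxyRoutes` function are illustrative names):

```typescript
interface ProxyConfig {
  mcpProxy: { addr: string; type: string };
  mcpServers: Record<string, { command: string; args?: string[] }>;
}

// Derive one endpoint URL per configured server.
function proxyRoutes(config: ProxyConfig, baseUrl: string): Record<string, string> {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const routes: Record<string, string> = {};
  for (const name of Object.keys(config.mcpServers)) {
    routes[name] = `${base}/${name}/mcp`;
  }
  return routes;
}

// Trimmed copy of the config above (two of the nine servers).
const sampleConfig: ProxyConfig = {
  mcpProxy: { addr: ':9090', type: 'streamable-http' },
  mcpServers: {
    filesystem: {
      command: 'npx',
      args: ['-y', '@modelcontextprotocol/server-filesystem@0.6.2', '/workspace'],
    },
    postgres: { command: '/wrappers/postgres-wrapper.sh' },
  },
};

console.log(proxyRoutes(sampleConfig, 'http://mcp-proxy:9090'));
```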
File: ~/.claude/mcp.json
{
  "mcpServers": {
    "code-executor": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "exec", "-i", "mcp-code-executor",
        "npx", "tsx", "/app/mcp-server.ts"
      ],
      "env": {
        "CODE_EXECUTOR_URL": "http://localhost:3000"
      }
    }
  }
}
Result: Claude Code loads ONLY the code-executor server (4 tools, ~100 tokens). All 63 actual tools are discovered on-demand via code execution.
{
  "mcpServers": {
    "filesystem": { "type": "sse", "url": "http://localhost:9073/sse" },
    "postgres": { "type": "stdio", "command": "..." },
    "timescaledb": { "type": "sse", "url": "http://localhost:9074/sse" },
    "minio": { "type": "sse", "url": "http://localhost:9075/sse" },
    "playwright": { "type": "sse", "url": "http://localhost:9076/sse" },
    "memory": { "type": "sse", "url": "http://localhost:9077/sse" },
    "n8n": { "type": "sse", "url": "http://localhost:9078/sse" },
    "ib": { "type": "sse", "url": "http://localhost:9079/sse" },
    "arangodb": { "type": "sse", "url": "http://localhost:9080/sse" }
  }
}
Problem: Loads all 9 servers upfront (~800-1200 tokens baseline). Every tool call passes through context. Intermediate results bloat context. This is the inefficient pattern we replaced.
File: /home/administrator/projects/mcp/code-executor/docker-compose.yml
services:
  mcp-code-executor:
    build: .
    container_name: mcp-code-executor
    restart: unless-stopped
    networks:
      - mcp-net
    ports:
      - "9091:3000" # Internal API (not exposed to Claude directly)
    environment:
      - PORT=3000
      - EXECUTION_TIMEOUT=300000 # 5 minutes
    # Security: tmpfs with proper permissions
    tmpfs:
      - /workspace:size=500m,uid=1000,gid=1000
      - /tmp/executions:size=100m,noexec,uid=1000,gid=1000
    # Resource limits
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
    # Security hardening
    security_opt:
      - no-new-privileges:true
    read_only: true
networks:
  mcp-net:
    external: true
docker network create mcp-net
# Create project directory
mkdir -p ~/projects/mcp/proxy
cd ~/projects/mcp/proxy
# Create config.json (see "Configuration Files" section above)
vim config.json
# Create docker-compose.yml
cat > docker-compose.yml << 'EOF'
services:
  mcp-proxy:
    image: ghcr.io/tbxark/mcp-proxy:latest
    container_name: mcp-proxy
    restart: unless-stopped
    networks:
      - mcp-net
    ports:
      - "9090:9090"
    volumes:
      - ./config.json:/config/config.json:ro
    command: ["/app/mcp-proxy", "-c", "/config/config.json"]
networks:
  mcp-net:
    external: true
EOF
# Start proxy
docker compose up -d
# Create project directory
mkdir -p ~/projects/mcp/code-executor
cd ~/projects/mcp/code-executor
# Clone or create files (see GitHub/documentation)
# Files needed:
# - Dockerfile
# - docker-compose.yml
# - package.json
# - tsconfig.json
# - mcp-server.ts
# - executor.ts
# - client.ts
# - generate-wrappers.ts
# Build and start
docker compose build
docker compose up -d
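# Fix tmpfs permissions (only needed when the tmpfs mounts lack uid/gid options)
docker exec -u root mcp-code-executor chown -R node:node /workspace /tmp/executions
# Generate tool wrappers (/workspace is tmpfs, so they may need regenerating after a container restart)
docker exec mcp-code-executor npm run generate-wrappers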
# Verify
curl http://localhost:9091/health | jq
# Edit Claude Code MCP configuration
mkdir -p ~/.claude
cat > ~/.claude/mcp.json << 'EOF'
{
  "mcpServers": {
    "code-executor": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "exec", "-i", "mcp-code-executor",
        "npx", "tsx", "/app/mcp-server.ts"
      ],
      "env": {
        "CODE_EXECUTOR_URL": "http://localhost:3000"
      }
    }
  }
}
EOF
# Remove any old MCP server configs
# Edit ~/.claude.json and remove "mcpServers" sections if present
# Restart Claude Code
# Exit current session, start new one
# In Claude Code session:
# Test 1: Check /mcp shows only code-executor
# Type: /mcp
# Expected: 1 server (code-executor)
# Test 2: List available tools
User: "Use mcp__code-executor__list_mcp_tools"
# Expected: Returns 63 tools across 9 servers
# Test 3: Execute simple code
User: "Execute this code: console.log('Hello MCP!')"
# Expected: Output: "Hello MCP!"
# Test 4: Use filesystem tool
User: "Use code execution to list files in /workspace"
# Expected: Claude writes code using filesystem wrapper, executes, returns results
# Check token usage at bottom of Claude Code responses
# Before: 800-1200 tokens baseline (with 9 servers)
# After: ~100 tokens baseline (with code-executor only)
# Multi-tool workflow example:
User: "Query timescaledb for last 100 metrics, filter values > 50, upload to minio as CSV"
# Claude should:
# 1. Write TypeScript code importing timescaledb and minio wrappers
# 2. Execute query in sandbox
# 3. Filter data in sandbox
# 4. Generate CSV in sandbox
# 5. Upload to minio in sandbox
# 6. Return: "Uploaded 73 metrics to minio bucket 'exports'"
# Token usage should be ~2K instead of ~70K (old pattern)
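The filter and CSV steps of that workflow are plain data processing, which is why they can stay inside the sandbox. A sketch of those two steps with made-up sample rows standing in for the query result; the `Metric` shape and `toCsv` helper are assumptions, and the query and upload calls are omitted because their wrapper signatures are not shown here.

```typescript
interface Metric {
  time: string;
  name: string;
  value: number;
}

// Serialize rows as CSV, quoting fields that contain commas, quotes, or newlines.
function toCsv(rows: Metric[]): string {
  const escape = (field: string | number): string => {
    const text = String(field);
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = 'time,name,value';
  const lines = rows.map((r) => [r.time, r.name, r.value].map(escape).join(','));
  return [header, ...lines].join('\n');
}

// Made-up sample data in place of the timescaledb query result.
const metrics: Metric[] = [
  { time: '2024-01-01T00:00:00Z', name: 'cpu', value: 72 },
  { time: '2024-01-01T00:01:00Z', name: 'cpu', value: 31 },
  { time: '2024-01-01T00:02:00Z', name: 'disk, /var', value: 88 },
];

const filtered = metrics.filter((m) => m.value > 50);
const csv = toCsv(filtered);

// Only this one-line summary would be returned to the model.
console.log(`Prepared ${filtered.length} rows for upload`);
```

The full CSV never needs to be printed: passing it straight to the upload wrapper keeps the row data out of the context window.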
User: "Calculate the Fibonacci sequence up to 10 terms"
Claude writes:
const fib = [0, 1];
for (let i = 2; i < 10; i++) {
  fib[i] = fib[i-1] + fib[i-2];
}
console.log(fib);
Output: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
User: "What MCP tools are available for working with files?"
Claude uses: search_tools({ query: "file", detail: "description" })
Returns:
{
  "matches": [
    {
      "server": "filesystem",
      "tool": "read_file",
      "description": "Read complete contents of a file..."
    },
    {
      "server": "filesystem",
      "tool": "write_file",
      "description": "Create a new file or overwrite..."
    }
    // ... more matches
  ],
  "tokenCost": 245
}
User: "Read the config.json file from /workspace"
Claude writes:
const { read_file } = await import('/workspace/servers/filesystem/read_file.js');
const content = await read_file({ path: '/workspace/config.json' });
console.log(content);
Output: { "mcpProxy": { "addr": ":9090", ... } }
User: "Get yesterday's stock prices from Interactive Brokers, filter for gainers > 5%, and save to ArangoDB"
Claude writes:
const { get_historical_data } = await import('/workspace/servers/ib/get_historical_data.js');
const { arango_insert } = await import('/workspace/servers/arangodb/arango_insert.js');
// Get historical data for multiple symbols
const symbols = ['AAPL', 'GOOGL', 'MSFT', 'AMZN'];
const yesterday = new Date(Date.now() - 86400000).toISOString().split('T')[0];
const results = [];
for (const symbol of symbols) {
  const data = await get_historical_data({
    symbol,
    startDate: yesterday,
    endDate: yesterday,
    barSize: '1 day'
  });
  // Calculate gain percentage
  const gain = ((data.close - data.open) / data.open) * 100;
  // Filter gainers > 5%
  if (gain > 5) {
    results.push({
      symbol,
      date: yesterday,
      open: data.open,
      close: data.close,
      gain: gain.toFixed(2)
    });
  }
}
// Save to ArangoDB
await arango_insert({
  collection: 'stock_gainers',
  document: {
    date: yesterday,
    gainers: results,
    count: results.length
  }
});
console.log(`Found ${results.length} gainers > 5%`);
console.log(JSON.stringify(results, null, 2));
Output: Found 2 gainers > 5%
[
{ "symbol": "AAPL", "gain": "6.23", ... },
{ "symbol": "GOOGL", "gain": "5.87", ... }
]
Token Impact:
- Old pattern: Load IB tools (10) + ArangoDB tools (7) = ~2K tokens upfront
Historical data returned to context = ~30K tokens
Total: ~32K tokens
- New pattern: Load code-executor = ~100 tokens
Historical data stays in sandbox
Final output: ~500 tokens
Total: ~600 tokens
Savings: 98.1%
User: "Query postgres for user records and count how many have verified emails"
Claude writes:
const { execute_sql } = await import('/workspace/servers/postgres/execute_sql.js');
// Query returns user records with email addresses (PII!)
const result = await execute_sql({
  sql: "SELECT email, email_verified FROM users"
});
// Filter in execution environment - PII never enters context
const verifiedCount = result.rows.filter(r => r.email_verified).length;
const totalCount = result.rows.length;
console.log(`Verified: ${verifiedCount}/${totalCount} (${(verifiedCount/totalCount*100).toFixed(1)}%)`);
Output: Verified: 847/1203 (70.4%)
Privacy Win: 1203 email addresses never entered the LLM context window!
| Detail Level | All 63 Tools | Per Tool | Savings vs Full | Use Case |
|---|---|---|---|---|
| name | 245 tokens | ~4 tokens | 97% | Initial discovery |
| description | 1,181 tokens | ~19 tokens | 85% | Understanding capabilities |
| full | ~7,685 tokens | ~122 tokens | 0% (baseline) | Implementation details |
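The three detail levels amount to projecting the same registry entries onto progressively larger shapes. The sketch below illustrates that idea over an in-memory registry; it is not the project's search implementation, and the `ToolEntry` shape, the default of `name`, the substring matching, and the sample entries are all assumptions.

```typescript
type Detail = 'name' | 'description' | 'full';

interface ToolEntry {
  server: string;
  tool: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

// Filter by optional server and query, then return only as much as `detail` asks for.
function searchTools(
  registry: ToolEntry[],
  opts: { query?: string; server?: string; detail?: Detail } = {},
) {
  const { query = '', server, detail = 'name' } = opts;
  const needle = query.toLowerCase();
  return registry
    .filter((t) => (server ? t.server === server : true))
    .filter((t) => `${t.tool} ${t.description}`.toLowerCase().includes(needle))
    .map((t) => {
      if (detail === 'name') return { server: t.server, tool: t.tool };
      if (detail === 'description') {
        return { server: t.server, tool: t.tool, description: t.description };
      }
      return t; // 'full': includes the input schema
    });
}

// Made-up sample registry with three of the 63 tools.
const pathSchema = { type: 'object', properties: { path: { type: 'string' } }, required: ['path'] };
const sampleRegistry: ToolEntry[] = [
  { server: 'filesystem', tool: 'read_file', description: 'Read complete contents of a file.', inputSchema: pathSchema },
  { server: 'filesystem', tool: 'write_file', description: 'Create a new file or overwrite...', inputSchema: pathSchema },
  { server: 'postgres', tool: 'execute_sql', description: 'Run a SQL statement.', inputSchema: { type: 'object' } },
];

console.log(searchTools(sampleRegistry, { query: 'file' }));
console.log(searchTools(sampleRegistry, { server: 'postgres', detail: 'full' }));
```

Starting at `name` and requesting `full` only for the one or two tools actually needed is what produces the savings in the table.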
Old Pattern (Direct MCP Loading):
New Pattern (Code Execution):
Savings: 99.1% (75,150 tokens saved)
Typing /mcp in Claude Code shows 9 servers instead of just code-executor.
Multiple configuration files loading simultaneously:
# 1. Check ALL config locations
jq '(.mcpServers // {}) | keys' ~/.claude/mcp.json
jq '(.mcpServers // {}) | keys' ~/.claude.json
jq '(.projects // {}) | to_entries[] | {path: .key, servers: ((.value.mcpServers // {}) | keys)}' ~/.claude.json
# 2. Remove MCP servers from all locations except mcp.json
# Edit ~/.claude.json
# Set "mcpServers": {} at all levels
# 3. Verify only code-executor remains
cat ~/.claude/mcp.json
# Should show ONLY: {"mcpServers": {"code-executor": {...}}}
# 4. Full restart required
# Exit Claude Code completely, start new session
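The same check can be scripted. The sketch below scans an already-parsed config object for servers other than code-executor at both the global and per-project level; the `ClaudeConfig` shape is inferred from the jq queries above and the function name is illustrative.

```typescript
interface ClaudeConfig {
  mcpServers?: Record<string, unknown>;
  projects?: Record<string, { mcpServers?: Record<string, unknown> }>;
}

// Return a "scope: server" entry for every configured server that is not allowed.
function findStrayServers(config: ClaudeConfig, allowed: string[] = ['code-executor']): string[] {
  const stray: string[] = [];
  const collect = (scope: string, servers?: Record<string, unknown>): void => {
    for (const name of Object.keys(servers ?? {})) {
      if (!allowed.includes(name)) stray.push(`${scope}: ${name}`);
    }
  };
  collect('global', config.mcpServers);
  for (const [path, project] of Object.entries(config.projects ?? {})) {
    collect(`project ${path}`, project.mcpServers);
  }
  return stray;
}

// Made-up example resembling a ~/.claude.json with leftover entries.
const example: ClaudeConfig = {
  mcpServers: { 'code-executor': {}, filesystem: {} },
  projects: {
    '/home/user/app': { mcpServers: { postgres: {} } },
    '/home/user/other': {},
  },
};

console.log(findStrayServers(example));
```

An empty result means only code-executor is configured in the file that was checked; the file would be loaded with `JSON.parse` before calling the function.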
Error: Cannot find module '/workspace/servers/filesystem/read_file.js'
The tool wrappers have not been generated yet, or the container was restarted and the tmpfs-backed /workspace came back empty.
# Generate wrappers
docker exec mcp-code-executor npm run generate-wrappers
# Verify generation
docker exec mcp-code-executor ls -la /workspace/servers/
# Should show 9 directories (arangodb, filesystem, ib, memory, minio, n8n, playwright, postgres, timescaledb)
Error: EACCES: permission denied, open '/workspace/...'
Docker mounts the tmpfs volumes as root (UID 0), but the code runs as the node user (UID 1000).
# Fix ownership (workaround)
docker exec -u root mcp-code-executor chown -R node:node /workspace /tmp/executions
# Permanent fix: Update docker-compose.yml
# Add uid/gid to tmpfs mount (requires Docker 20.10+)
tmpfs:
  - /workspace:size=500m,uid=1000,gid=1000
  - /tmp/executions:size=100m,noexec,uid=1000,gid=1000
Token usage still high (~800 tokens baseline) even with code-executor.
Claude Code loading multiple MCP servers from hidden config locations.
# Check debug logs for MCP server initialization
ls -lt ~/.claude/debug/*.txt | head -1 | awk '{print $NF}' | xargs grep "MCP server" | grep -i "initializ"
# Expected: ONLY "MCP server 'code-executor': Initializing..."
# If you see filesystem, postgres, etc.: Config cleanup incomplete
See "Multiple MCP Servers Showing" troubleshooting above.
# Check container health
docker ps | grep mcp-code-executor
docker logs mcp-code-executor --tail 50
# Test execution engine directly
curl -X POST http://localhost:9091/execute \
-H 'Content-Type: application/json' \
-d '{"code":"console.log(2+2)"}'
# List available tools (name-level detail)
curl 'http://localhost:9091/tools/search?detail=name' | jq
# Test MCP server (from Claude Code)
# Uses mcp__code-executor__execute_code tool
# Verify wrapper generation
docker exec mcp-code-executor find /workspace/servers -name "*.ts" | wc -l
# Expected: 73 files (63 tools + 9 indexes + discovery.ts)