I have verified this feature I'm about to request hasn't been suggested before
Problem
When an AI agent calls an MCP tool with a large payload (e.g., updating a Confluence page with 300KB+ ADF JSON), the agent must emit the entire payload as output tokens. This hits the model's `max_output_tokens` limit.
Quantified impact:
- A single Confluence page update: ~160K tokens as tool arguments (300KB of ADF JSON at roughly 2 bytes per token)
- Anthropic Claude: 32K max output tokens (Sonnet), 64K (Opus)
- Result: any MCP tool call with a >32K-token payload is physically impossible in a single turn
This is the output-side counterpart to the input bloat that `experimental.mcp_lazy` (#8771) solved. Lazy-load reduced system prompt tokens; this proposal reduces output tokens.
Proposed Solution
Allow the agent to reference a local file instead of inlining the full payload. The client resolves the reference before sending to the MCP server.
Agent outputs (~50 bytes instead of 160K tokens):

```json
{
  "tool": "confluence_update_page",
  "arguments": {
    "page_id": "123456",
    "content": { "$file": "/tmp/adf-payload.json" }
  }
}
```

Client intercepts and resolves:

```typescript
import * as fs from 'node:fs';

function resolveFileRefs(args: Record<string, unknown>): Record<string, unknown> {
  for (const [key, value] of Object.entries(args)) {
    if (isFileRef(value)) {
      // Replace the { "$file": path } reference with the file's contents
      args[key] = fs.readFileSync(value.$file, 'utf-8');
    }
  }
  return args;
}

function isFileRef(v: unknown): v is { $file: string } {
  return typeof v === 'object' && v !== null && '$file' in v && typeof v.$file === 'string';
}
```
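For illustration, a minimal sketch of where this resolution step could sit in the client's dispatch path. The `McpClient` interface and `dispatchToolCall` function are hypothetical names for this sketch, not an existing API:

```typescript
// Hypothetical client interface; the real client API may differ.
interface McpClient {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

// Resolve $file references just before forwarding the call to the MCP server,
// so the server sees a fully inlined payload.
async function dispatchToolCall(
  client: McpClient,
  tool: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  const resolved = resolveFileRefs(args); // inlines file contents in place
  return client.callTool(tool, resolved);
}
```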
Workflow:
1. Agent writes large content to a temp file using existing Write/Bash tools
2. Agent calls the MCP tool with `{"$file": "/tmp/payload.json"}` as the argument
3. Client detects the `$file` reference, reads the file, and injects the content into the argument
4. MCP server receives the full payload as normal; no protocol change is needed
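Configuration (opt-in): the exact schema is open for discussion; below is a hypothetical sketch of what enabling this could look like, mirroring the existing `experimental` namespace. All key names here are illustrative, not existing options:

```json
{
  "experimental": {
    "mcp_file_refs": {
      "enabled": true,
      "allowed_paths": ["/tmp", "${workspace}"]
    }
  }
}
```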
Industry precedent: the Claude Code community validated this pattern via PreToolUse hooks (anthropics/claude-code#45770), and the MCP spec has SEP-2356 (file input for tools) in draft.
Security Considerations
- Only resolve files under allowed paths (e.g., temp dirs, the workspace); a sketch of such a check follows
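A minimal sketch of that guard, assuming a fixed allowlist of root directories; `ALLOWED_ROOTS` and `assertAllowedPath` are illustrative names, and a real implementation would derive the allowlist from configuration:

```typescript
import * as path from 'node:path';
import * as os from 'node:os';

// Illustrative allowlist: the OS temp dir and the current workspace.
const ALLOWED_ROOTS = [os.tmpdir(), process.cwd()];

function assertAllowedPath(filePath: string): void {
  // Normalize first so "../" traversal can't escape the allowed roots.
  const resolved = path.resolve(filePath);
  const allowed = ALLOWED_ROOTS.some(
    (root) => resolved === root || resolved.startsWith(root + path.sep),
  );
  if (!allowed) {
    throw new Error(`$file reference outside allowed paths: ${resolved}`);
  }
}
```

`resolveFileRefs` would call `assertAllowedPath(value.$file)` before reading. Note this sketch does not defend against symlink escapes; resolving with `fs.realpathSync` first would harden it.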