Server-Side Tool Gating: How the `_tool_gating` Convention Lets MCP Servers Filter Their Own Tools

Developer Divan Visagie published a technical proposal on March 15, 2026, introducing a "server-side tool gating" pattern for the Model Context Protocol (MCP) ecosystem. The proposal targets a well-documented problem in production agent deployments: LLM tool selection accuracy collapses when agents are exposed to large numbers of tools simultaneously, with research showing naive all-tools-loaded baselines achieving as low as 14% accuracy beyond roughly 20 tools. As MCP adoption grows and multi-server configurations become common (some individual MCP servers contribute over 20,000 tokens' worth of tool definitions), the token-efficiency and routing-accuracy problems compound.

Visagie's solution centers on a single well-known tool named `_tool_gating` that any MCP server can expose alongside its regular tools. A gating-aware client detects this tool at connection time and calls it before every LLM turn, passing the user's raw request. The server evaluates the message and returns one of three verdicts for each of its tools: exclude (remove the tool from the model's context entirely), claim (bypass the model and resolve the request directly with a deterministic tool call), or include (the default, which requires no explicit declaration). In Visagie's own test environment spanning 4 MCP servers and 33 tools, the pattern eliminated 4 irrelevant tools per read-only request, saving approximately 318 tokens per turn. The claim verdict is where the pattern earns its keep: slash commands that previously required a full LLM round-trip now resolve in milliseconds through direct tool invocation.
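To make the three verdicts concrete, here is a minimal client-side sketch in Python. It is not Visagie's implementation (his client, chell, is written in Rust), and the response shape is an assumption made for illustration: a mapping from tool name to an entry with a "verdict" key and optional "arguments". The tool names (list_projects, delete_project, show_status) are hypothetical, as is the choice to hide the gating tool itself from the model.

```python
GATING_TOOL = "_tool_gating"


def apply_verdicts(tool_names, verdicts):
    """Split one server's tools into (visible_to_model, claimed_calls).

    `verdicts` is an ASSUMED shape, not a documented wire format:
        {"tool_name": {"verdict": "exclude" | "claim" | "include",
                       "arguments": {...}}}
    Tools missing from the mapping default to "include".
    """
    visible, claimed = [], []
    for name in tool_names:
        if name == GATING_TOOL:
            # Assumption: the gating tool is plumbing, so the model never sees it.
            continue
        entry = verdicts.get(name, {})
        kind = entry.get("verdict", "include")
        if kind == "exclude":
            # Drop the tool definition from the model's context for this turn.
            continue
        if kind == "claim":
            # The server wants this call made directly, without the model.
            claimed.append((name, entry.get("arguments", {})))
            continue
        visible.append(name)
    return visible, claimed


tools = ["_tool_gating", "list_projects", "delete_project", "show_status"]

# Read-only request: the server excludes its destructive tool.
visible, claimed = apply_verdicts(tools, {"delete_project": {"verdict": "exclude"}})

# Slash command: the server claims the request outright.
visible2, claimed2 = apply_verdicts(
    tools, {"show_status": {"verdict": "claim", "arguments": {}}}
)
```

In this sketch a client would check `claimed` first: if it is non-empty, it invokes those tools directly and skips the LLM turn; otherwise it sends only the `visible` tool definitions to the model.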

The proposal explicitly positions server-side gating against existing client-side alternatives. OpenAI's Agents SDK offers a tool_filter hook, Google's Agent Development Kit has its own filtering layer, Portkey built an embedding-based filter, and the STRAP pattern consolidates tools into fewer "megatools." All of those alternatives place filtering intelligence outside the server that owns and understands the tools. Visagie's approach inverts this relationship, letting server authors encode domain-specific routing logic (keyword matching, intent detection, pattern-matched commands) closer to the source. The implementation was validated in a Python MCP server (pman-mcp, using FastMCP) and Visagie's Rust agent client (chell v0.2.0), with the gating logic intentionally kept to simple keyword matching rather than ML inference.
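The kind of keyword-based gating logic described above might look roughly like the following on the server side. This is a sketch under stated assumptions, not the code in pman-mcp: the keyword list, the tool names (create_task, delete_task, show_status), the /status command, and the returned dictionary shape are all hypothetical. It is written as a plain function so it runs without any MCP library; in a real server it would be the body of the tool registered as `_tool_gating`.

```python
# Hypothetical routing tables; a real server would derive these from its own tools.
WRITE_KEYWORDS = {"create", "add", "delete", "remove", "update", "rename"}
WRITE_TOOLS = ["create_task", "delete_task"]
SLASH_COMMANDS = {"/status": "show_status"}


def gate(message: str) -> dict:
    """Return per-tool verdicts for one user message.

    Tools that are not mentioned in the result are implicitly included.
    The {"verdict": ..., "arguments": ...} shape is an assumption.
    """
    parts = message.strip().split()
    command = parts[0].lower() if parts else ""

    # Pattern-matched command: claim it so the client can skip the model.
    if command in SLASH_COMMANDS:
        return {SLASH_COMMANDS[command]: {"verdict": "claim", "arguments": {}}}

    # Simple keyword matching: if nothing suggests a write, hide the write tools.
    words = {word.strip(".,!?").lower() for word in parts}
    if words & WRITE_KEYWORDS:
        return {}  # leave every tool at the default "include"
    return {name: {"verdict": "exclude"} for name in WRITE_TOOLS}


status_verdicts = gate("/status")
read_verdicts = gate("What tasks are open?")
write_verdicts = gate("Please delete the old task.")
```

Keeping the gate to set lookups and string checks means it adds negligible latency before each turn, which matches the proposal's preference for keyword matching over ML inference.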

The pattern requires no MCP specification changes and no new capability flags, relying entirely on existing MCP primitives. Clients that do not implement gating-aware behavior can still discover and optionally call the `_tool_gating` tool like any other, which preserves backward compatibility. Visagie notes the approach was inspired by a plugin system he built for Telegram bots years before MCP existed, where each capability could declare its own relevance before the model was consulted, a layer-capability pattern he formalized at the time. The zero-spec-change requirement lowers the barrier to experimentation, and Visagie has already shipped working code; what the pattern needs now is uptake from server authors and client framework maintainers willing to treat `_tool_gating` as a de facto convention.