{"id":1002,"date":"2026-05-30T11:11:59","date_gmt":"2026-05-30T03:11:59","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=1002"},"modified":"2026-05-30T11:11:59","modified_gmt":"2026-05-30T03:11:59","slug":"hermes-agent-ships-tool-search-for-mcp-anthropic-evals-show-49-to-74-accuracy-gain-on-opus-4","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=1002","title":{"rendered":"Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4"},"content":{"rendered":"<p class=\"wp-block-paragraph\"><strong>Nous Research\u2019s open-source Hermes Agent now ships a Tool Search feature.<\/strong> It directly addresses a growing bottleneck in AI agent systems: too many MCP tools filling up the context window. This explainer breaks down what Tool Search does, how it works, and when to use it.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Problem: MCP Tools Are Eating Your Context Window<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">When you connect multiple MCP (Model Context Protocol) servers to an AI agent, every tool\u2019s JSON schema gets sent to the model on every turn. This happens even if the model only needs one or two tools for a given task.<\/p>\n<p class=\"wp-block-paragraph\">Real-world deployments feel this immediately. A Hermes deployment with five MCP servers and 34 tools shows average prompt sizes of 45,000 tokens per turn. Roughly 22,000 of those tokens \u2014 around 50% \u2014 are tool schema overhead alone.<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.anthropic.com\/engineering\/advanced-tool-use\">Anthropic\u2019s own engineering data<\/a> shows tool definitions can consume 134,000 tokens before optimization. 
<a href=\"https:\/\/arxiv.org\/abs\/2604.21816\" target=\"_blank\" rel=\"noreferrer noopener\">Tool Attention<\/a> measures the \u201cMCP Tools Tax\u201d at <strong>15,000\u201360,000 tokens per turn<\/strong> for typical multi-server deployments.<\/p>\n<p class=\"wp-block-paragraph\">This creates two distinct problems:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Cost<\/strong>: Cache-miss generations at session start can cost $0.07\u2013$0.10 per turn.<\/li>\n<li><strong>Accuracy loss<\/strong>: Decision paralysis sets in when the model sees hundreds of irrelevant tool options simultaneously.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"960\" data-attachment-id=\"80205\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/05\/29\/hermes-agent-ships-tool-search-for-mcp-anthropic-evals-show-49-to-74-accuracy-gain-on-opus-4\/screenshot-2026-05-29-at-7-56-04-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-29-at-7.56.04-PM-1.png\" data-orig-size=\"1400,960\" data-comments-opened=\"0\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;,&quot;alt&quot;:&quot;&quot;}\" data-image-title=\"Screenshot 2026-05-29 at 7.56.04\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-29-at-7.56.04-PM-1-1024x702.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-29-at-7.56.04-PM-1.png\" alt=\"\" 
class=\"wp-image-80205\" \/><figcaption class=\"wp-element-caption\">Source: hermes-agent.nousresearch.com\/docs \u00b7 Nous Research 2026<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>What is Tool Search?<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Tool Search is Hermes Agent\u2019s opt-in progressive-disclosure layer for MCP and non-core plugin tools. Instead of loading every tool schema upfront, the model loads only what it needs \u2014 on demand, per turn.<\/p>\n<p class=\"wp-block-paragraph\"><strong>When Tool Search activates, MCP and plugin tools are replaced in the model-visible tools array by three bridge tools:<\/strong><\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">tool_search(query, limit?)   
\u2014 search the deferred-tool catalog\ntool_describe(name)          \u2014 load the full schema for one tool\ntool_call(name, arguments)   \u2014 invoke a deferred tool<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\"><strong>A typical interaction looks like this:<\/strong><\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">Model: tool_search(\"create a github issue\")\n  \u2192 { matches: [{ name: \"mcp_github_create_issue\", ... }] }\nModel: tool_describe(\"mcp_github_create_issue\")\n  \u2192 { parameters: { type: \"object\", properties: { ... } } }\nModel: tool_call(\"mcp_github_create_issue\", { title: \"...\", body: \"...\" })\n  \u2192 { ok: true, issue_number: 42 }<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">The model searches for what it needs, loads the schema, then calls the tool. All hooks, guardrails, and approval prompts run against the real underlying tool name \u2014 not against the bridge.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Accuracy Numbers<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">This is not just a token-saving feature. 
Tool Search also <strong>improves model accuracy<\/strong> on MCP evaluations.<\/p>\n<p class=\"wp-block-paragraph\"><strong>According to <a href=\"https:\/\/www.anthropic.com\/engineering\/advanced-tool-use\">Anthropic\u2019s internal MCP<\/a> evals:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Claude Opus 4<\/strong>: accuracy improved from <strong>49% \u2192 74%<\/strong> with Tool Search enabled<\/li>\n<li><strong>Claude Opus 4.5<\/strong>: accuracy improved from <strong>79.5% \u2192 88.1%<\/strong> with Tool Search enabled<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">Large tool catalogs create \u201cdecision paralysis\u201d \u2014 the model gets confused choosing among many irrelevant options. Removing those options from the context window reduces false positives. <a href=\"https:\/\/www.anthropic.com\/engineering\/advanced-tool-use\">Anthropic\u2019s data<\/a> also shows an <strong>85% reduction in tool-definition token usage<\/strong> while maintaining access to the full tool library.<\/p>\n<h2 class=\"wp-block-heading\"><strong>How the Retrieval Works: BM25 + Fallback<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Under the hood, Hermes uses <strong>BM25<\/strong> \u2014 a classic information retrieval algorithm \u2014 to match the model\u2019s query against a catalog of tool names, descriptions, and parameter names.<\/p>\n<p class=\"wp-block-paragraph\">If BM25 returns no positive-score hits, the system falls back to a literal substring match on the tool name. This protects against zero-IDF degenerate cases, such as searching for <code>\"github\"<\/code> in a catalog where every tool name contains \u201cgithub.\u201d<\/p>\n<p class=\"wp-block-paragraph\">The catalog is <strong>stateless across turns<\/strong>. It rebuilds from the current tool-defs list on every assembly. 
This prevents drift bugs where a stored catalog goes out of sync with the live tool registry.<\/p>\n<h2 class=\"wp-block-heading\"><strong>When Does Tool Search Activate?<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">By default, Tool Search runs in <code>auto<\/code> mode. It activates only when the deferrable tool schemas would consume <strong>at least 10% of the active model\u2019s context window<\/strong>.<\/p>\n<p class=\"wp-block-paragraph\">Below that threshold, the tools-array assembly is a pure pass-through. You pay no overhead.<\/p>\n<p class=\"wp-block-paragraph\"><strong>This decision is re-evaluated on every turn:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>A session with just a few MCP tools and a long-context model may never activate Tool Search.<\/li>\n<li>A session with many MCP servers attached (15+ tools typically) starts activating it.<\/li>\n<li>Removing servers mid-session correctly returns to direct tool exposure on the next assembly.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<h2 class=\"wp-block-heading\"><strong>Configuration Reference<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Add this to your <code>hermes.yaml<\/code> to control the behavior:<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">tools:\n  tool_search:\n    enabled: auto        # auto (default), on, or off\n    threshold_pct: 10    # % of 
context at which auto mode kicks in\n    search_default_limit: 5\n    max_search_limit: 20<\/code><\/pre>\n<\/div>\n<\/div>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Key<\/th>\n<th>Default<\/th>\n<th>Meaning<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><code>enabled<\/code><\/td>\n<td><code>auto<\/code><\/td>\n<td><code>auto<\/code> activates above threshold; <code>on<\/code> always activates if there\u2019s at least one deferrable tool; <code>off<\/code> disables entirely<\/td>\n<\/tr>\n<tr>\n<td><code>threshold_pct<\/code><\/td>\n<td><code>10<\/code><\/td>\n<td>Percentage of context length at which <code>auto<\/code> kicks in. Range: 0\u2013100<\/td>\n<\/tr>\n<tr>\n<td><code>search_default_limit<\/code><\/td>\n<td><code>5<\/code><\/td>\n<td>Hits returned when the model calls <code>tool_search<\/code> without a <code>limit<\/code><\/td>\n<\/tr>\n<tr>\n<td><code>max_search_limit<\/code><\/td>\n<td><code>20<\/code><\/td>\n<td>Hard upper bound the model can request via <code>limit<\/code>. 
Range: 1\u201350<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p class=\"wp-block-paragraph\">You can also use a simple boolean shorthand:<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">tools:\n  tool_search: true   # equivalent to {enabled: auto}<\/code><\/pre>\n<\/div>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Marktechpost\u2019s Visual Explainer<\/strong><\/h2>\n<div>\n<div>\n<div>\n<p><!-- SLIDE 1 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">Nous Research \u2014 Hermes Agent<\/span><br \/>\n<span class=\"hts-num\">01 \/ 07<\/span><\/p>\n<div class=\"hts-title\">Tool Search: Solving the MCP Context Window Problem<\/div>\n<div class=\"hts-body\">When multiple MCP servers connect to an agent, every tool\u2019s JSON schema loads into the model\u2019s context on every turn \u2014 even when only one tool is needed. 
Hermes Agent\u2019s Tool Search fixes this with progressive schema disclosure.<\/div>\n<div class=\"hts-stat-row\">\n<div class=\"hts-stat\">\n<span class=\"hts-stat-val\">~22K<\/span><br \/>\n<span class=\"hts-stat-label\">tokens\/turn overhead<br \/>in a 5-server, 34-tool setup<\/span>\n<\/div>\n<div class=\"hts-stat\">\n<span class=\"hts-stat-val\">85%<\/span><br \/>\n<span class=\"hts-stat-label\">reduction in tool-definition<br \/>token usage (Anthropic data)<\/span>\n<\/div>\n<div class=\"hts-stat\">\n<span class=\"hts-stat-val\">134K<\/span><br \/>\n<span class=\"hts-stat-label\">tokens consumed by tool defs<br \/>before optimization (Anthropic)<\/span>\n<\/div>\n<\/div>\n<\/div>\n<p><!-- SLIDE 2 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">The Problem<\/span><br \/>\n<span class=\"hts-num\">02 \/ 07<\/span><\/p>\n<div class=\"hts-title\">The MCP Tools Tax<\/div>\n<div class=\"hts-body\">Every connected MCP server dumps its full JSON schema into context upfront. With multiple servers, this crowds out the actual conversation and forces the model to choose from hundreds of irrelevant tools, causing decision paralysis.<\/div>\n<div class=\"hts-body\">Research paper arXiv 2604.21816 (\u201cTool Attention\u201d) measures the MCP Tools Tax at <strong>15,000\u201360,000 tokens per turn<\/strong>. 
Cache-miss sessions can cost <strong>$0.07\u2013$0.10 per turn<\/strong> in API spend.<\/div>\n<div class=\"hts-pill-row\">\n<span class=\"hts-pill\">GitHub: 35 tools \u2014 ~26K tokens<\/span><br \/>\n<span class=\"hts-pill\">Slack: 11 tools \u2014 ~21K tokens<\/span><br \/>\n<span class=\"hts-pill\">Jira: ~17K tokens alone<\/span>\n<\/div>\n<div class=\"hts-caption\">A five-server setup approaches 100K+ token overhead before the conversation starts.<\/div>\n<\/div>\n<p><!-- SLIDE 3 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">What Is It<\/span><br \/>\n<span class=\"hts-num\">03 \/ 07<\/span><\/p>\n<div class=\"hts-title\">Tool Search: A Progressive-Disclosure Layer<\/div>\n<div class=\"hts-body\">Tool Search is Hermes Agent\u2019s opt-in feature that replaces all MCP tool schemas in the model-visible tools array with just three lightweight bridge tools. The model loads each tool\u2019s schema on demand \u2014 only when it actually needs it.<\/div>\n<div class=\"hts-pill-row\">\n<span class=\"hts-pill hts-pill-green\">tool_search(query, limit?)<\/span><br \/>\n<span class=\"hts-pill hts-pill-green\">tool_describe(name)<\/span><br \/>\n<span class=\"hts-pill hts-pill-green\">tool_call(name, arguments)<\/span>\n<\/div>\n<div class=\"hts-rule\"><\/div>\n<div class=\"hts-body\">All hooks, guardrails, and approval prompts still run \u2014 against the real underlying tool name, not the bridge. 
The CLI activity feed also unwraps to show the real tool, not the bridge.<\/div>\n<\/div>\n<p><!-- SLIDE 4 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">How It Works<\/span><br \/>\n<span class=\"hts-num\">04 \/ 07<\/span><\/p>\n<div class=\"hts-title\">The Three-Step Retrieval Sequence<\/div>\n<div class=\"hts-step-row\">\n<div class=\"hts-step\">\n<div class=\"hts-step-line\"><\/div>\n<div class=\"hts-step-circle\">1<\/div>\n<p><span class=\"hts-step-name\">tool_search<\/span><br \/>\n<span class=\"hts-step-desc\">BM25 query against tool name, description and params<\/span>\n<\/p><\/div>\n<div class=\"hts-step\">\n<div class=\"hts-step-line\"><\/div>\n<div class=\"hts-step-circle\">2<\/div>\n<p><span class=\"hts-step-name\">tool_describe<\/span><br \/>\n<span class=\"hts-step-desc\">Loads full JSON schema for the matched tool into context<\/span>\n<\/p><\/div>\n<div class=\"hts-step\">\n<div class=\"hts-step-circle\">3<\/div>\n<p><span class=\"hts-step-name\">tool_call<\/span><br \/>\n<span class=\"hts-step-desc\">Bridge unwraps \u2014 real tool executes with full guardrails<\/span>\n<\/p><\/div>\n<\/div>\n<div class=\"hts-code\">Model: tool_search(\u201ccreate a github issue\u201d)<br \/>\n  \u2192 { matches: [{ name: \u201cmcp_github_create_issue\u201d }] }<br \/>\nModel: tool_describe(\u201cmcp_github_create_issue\u201d)<br \/>\n  \u2192 { parameters: { type: \u201cobject\u201d, properties: {\u2026} } }<br \/>\nModel: tool_call(\u201cmcp_github_create_issue\u201d, { title: \u201c\u2026\u201d })<br \/>\n  \u2192 { ok: true, issue_number: 42 }<\/div>\n<\/div>\n<p><!-- SLIDE 5 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">Accuracy Results<\/span><br \/>\n<span class=\"hts-num\">05 \/ 07<\/span><\/p>\n<div class=\"hts-title\">Anthropic MCP Evals Show Major Accuracy Gains<\/div>\n<div class=\"hts-body\">Large tool catalogs cause decision paralysis. 
Removing irrelevant schemas from context reduces false positives. Anthropic\u2019s internal MCP evaluations show significant accuracy improvements with Tool Search enabled.<\/div>\n<div class=\"hts-stat-row\">\n<div class=\"hts-stat\">\n<span class=\"hts-stat-val\">49% \u2192 74%<\/span><br \/>\n<span class=\"hts-stat-label\">Claude Opus 4<br \/>accuracy on MCP evals<\/span>\n<\/div>\n<div class=\"hts-stat\">\n<span class=\"hts-stat-val\">79.5% \u2192 88.1%<\/span><br \/>\n<span class=\"hts-stat-label\">Claude Opus 4.5<br \/>accuracy on MCP evals<\/span>\n<\/div>\n<\/div>\n<div class=\"hts-rule\"><\/div>\n<div class=\"hts-body\">Note: ~26 percentage points of accuracy is still retrieval failure on Opus 4. Smaller models perform less reliably on query formulation. Tool Search assumes the model can write a reasonable search query.<\/div>\n<\/div>\n<p><!-- SLIDE 6 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">Configuration<\/span><br \/>\n<span class=\"hts-num\">06 \/ 07<\/span><\/p>\n<div class=\"hts-title\">Setting Up Tool Search in hermes.yaml<\/div>\n<div class=\"hts-code\">tools:<br \/>\n  tool_search:<br \/>\n    enabled: auto       # auto (default), on, or off<br \/>\n    threshold_pct: 10   # % of context \u2014 auto mode only<br \/>\n    search_default_limit: 5<br \/>\n    max_search_limit: 20\n<p># Shorthand:<br \/>\ntools:<br \/>\n  tool_search: true     # equivalent to {enabled: auto}<\/p><\/div>\n<div class=\"hts-table-wrap\">\n<table class=\"hts-table\">\n<thead>\n<tr>\n<th>Key<\/th>\n<th>Default<\/th>\n<th>Meaning<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>enabled<\/td>\n<td>auto<\/td>\n<td>auto activates above threshold; on always activates; off disables<\/td>\n<\/tr>\n<tr>\n<td>threshold_pct<\/td>\n<td>10<\/td>\n<td>% of context length at which auto mode kicks in. 
Range: 0\u2013100<\/td>\n<\/tr>\n<tr>\n<td>search_default_limit<\/td>\n<td>5<\/td>\n<td>Hits returned when model calls tool_search without a limit<\/td>\n<\/tr>\n<tr>\n<td>max_search_limit<\/td>\n<td>20<\/td>\n<td>Hard upper bound the model can request via limit. Range: 1\u201350<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- SLIDE 7 --><\/p>\n<div class=\"hts-slide\">\n<div class=\"hts-accent\"><\/div>\n<p><span class=\"hts-tag\">Key Takeaways<\/span><br \/>\n<span class=\"hts-num\">07 \/ 07<\/span><\/p>\n<div class=\"hts-title\">When to Use It \u2014 and When Not To<\/div>\n<div class=\"hts-pill-row\">\n<span class=\"hts-pill hts-pill-green\">\u2713 15+ tools attached<\/span><br \/>\n<span class=\"hts-pill hts-pill-green\">\u2713 Few tools used per turn<\/span><br \/>\n<span class=\"hts-pill hts-pill-green\">\u2713 Multiple MCP servers<\/span><br \/>\n<span class=\"hts-pill hts-pill-amber\">\u26a0 Small toolsets \u2014 net overhead<\/span><br \/>\n<span class=\"hts-pill hts-pill-amber\">\u26a0 All tools used every turn<\/span>\n<\/div>\n<div class=\"hts-rule\"><\/div>\n<ul class=\"hts-bullets\">\n<li>Bridge tools cost ~300 tokens + at least one extra round trip per cold tool<\/li>\n<li>Deferred schemas get no system-prompt cache prefix benefit<\/li>\n<li>Catalog is stateless \u2014 rebuilds every turn, preventing drift bugs<\/li>\n<li>Security-scoped: bridge cannot access tools outside the session\u2019s granted toolsets<\/li>\n<li>Core Hermes tools (terminal, read_file, web_search, send_message\u2026) are never deferred<\/li>\n<\/ul>\n<div class=\"hts-caption\">Source: hermes-agent.nousresearch.com\/docs \u2014 Anthropic engineering blog \u2014 Nous Research 2026<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div>\n<button class=\"hts-btn\" disabled>\u2190 Prev<\/button>\n<div class=\"hts-dots\"><\/div>\n<p><span class=\"hts-slide-count\">1 \/ 7<\/span><br \/>\n<button class=\"hts-btn\">Next \u2192<\/button>\n<\/p><\/div>\n<\/div>\n<h2 
class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h2>\n<ul class=\"wp-block-list\">\n<li>Tool Search defers MCP tool schemas until the model actually needs them \u2014 using a <code>tool_search<\/code> \/ <code>tool_describe<\/code> \/ <code>tool_call<\/code> bridge.<\/li>\n<li><a href=\"https:\/\/www.anthropic.com\/engineering\/advanced-tool-use\">Anthropic<\/a>\u2019s evals show accuracy gains from 49% \u2192 74% on Claude Opus 4 with large tool catalogs.<\/li>\n<li>BM25 retrieval over tool name + description + parameter names powers the search, with substring fallback for zero-IDF edge cases.<\/li>\n<li><code>auto<\/code> mode (default) is self-tuning \u2014 activates only when tool schemas exceed 10% of the context window.<\/li>\n<li>Core Hermes tools are never deferred; only MCP and non-core plugin tools are eligible.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<\/p><p class=\"wp-block-paragraph\">\n<\/p><p class=\"wp-block-paragraph\">Check out\u00a0the\u00a0<strong><a href=\"https:\/\/hermes-agent.nousresearch.com\/docs\/user-guide\/features\/tool-search\" target=\"_blank\" rel=\"noreferrer noopener\">Hermes Agent Tool Search Documentation<\/a><\/strong> and <strong><a href=\"https:\/\/www.anthropic.com\/engineering\/advanced-tool-use\" target=\"_blank\" rel=\"noreferrer noopener\">Anthropic Advanced Tool Use<\/a>.<\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/05\/29\/hermes-agent-ships-tool-search-for-mcp-anthropic-evals-show-49-to-74-accuracy-gain-on-opus-4\/\">Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Nous Research\u2019s open-source He&hellip;<\/p>\n","protected":false},"author":1,"featured_media":1003,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1002","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1002","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1002"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1002\/revisions"}],"wp:featuredme
dia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/1003"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1002"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1002"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1002"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}