Anthropic Introduces Three New Beta Features for Claude: Discovering, Learning, and Using Tools

12/01 2025

Future AI agents will let models work seamlessly across a multitude of tools. Consider an IDE assistant that integrates Git operations, file management, package managers, testing frameworks, and deployment pipelines, or an operations coordinator that connects to Slack, GitHub, Google Drive, Jira, company databases, and numerous MCP servers at the same time.

To build effective agents, a model must be able to draw on a large tool library without loading every definition into its context up front. Anthropic previously explained, in an article on code execution with MCP, that tool definitions and results can consume over 50,000 tokens before an agent even reads a request. Ideally, agents should discover and load tools on demand, keeping only those relevant to the current task.

Agents also need to be able to call tools from code, adapting flexibly to the task at hand. And they need to learn correct tool usage from examples, not just from schema definitions.

Just yesterday, Anthropic unveiled three beta features that facilitate these capabilities.

Tool Search: Enables Claude to access a multitude of tools via a search mechanism without cluttering the context window.

Programmatic Tool Invocation: Allows Claude to invoke tools within a code execution environment, minimizing the impact on the model's context window.

Tool Usage Examples: Provides a standardized method for demonstrating the effective use of specific tools.

As more MCP servers are connected, the tokens consumed by their tool definitions add up.

Fifty-eight tools can consume approximately 55,000 tokens before a conversation even starts. Anthropic has seen tool definitions consume 134,000 tokens prior to optimization.

Instead of preloading every tool definition, Tool Search discovers tools on demand, so Claude sees only the tools relevant to the current task.
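The on-demand discovery described above can be illustrated with a small, self-contained sketch. It is not Anthropic's implementation: the catalog, the tool names, and the `search_tools` and `load_definitions` helpers are all hypothetical, and a simple regular-expression match stands in for whatever search mechanism the real feature uses. The point is only that full definitions stay out of the context until a search selects them.

```python
import re

# Hypothetical tool catalog; names, descriptions and schemas are illustrative only.
CATALOG = {
    "github_create_pull_request": {
        "description": "Open a pull request on GitHub",
        "input_schema": {"type": "object", "properties": {"title": {"type": "string"}}},
    },
    "slack_send_message": {
        "description": "Post a message to a Slack channel",
        "input_schema": {"type": "object", "properties": {"text": {"type": "string"}}},
    },
    "jira_create_issue": {
        "description": "Create an issue in a Jira project",
        "input_schema": {"type": "object", "properties": {"summary": {"type": "string"}}},
    },
    "drive_search_files": {
        "description": "Search files in Google Drive",
        "input_schema": {"type": "object", "properties": {"query": {"type": "string"}}},
    },
}


def search_tools(pattern: str, limit: int = 3) -> list[str]:
    """Return the names of tools whose name or description matches the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = [
        name
        for name, spec in CATALOG.items()
        if rx.search(name) or rx.search(spec["description"])
    ]
    return hits[:limit]


def load_definitions(names: list[str]) -> list[dict]:
    """Expand only the selected tools into full definitions for the context."""
    return [{"name": name, **CATALOG[name]} for name in names]


# Only the matching definition is loaded; the other three never enter the context.
matches = search_tools(r"pull.request")
loaded = load_definitions(matches)
print(matches)
```

In this toy version the search step costs one extra lookup before the tool can be used, which mirrors the latency trade-off the article goes on to describe.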

Traditionally, all tool definitions are preloaded, and conversation history and system prompts vie for the remaining space. This results in a total context consumption of roughly 77K tokens before any work commences.

With Tool Search, only the Tool Search mechanism itself is preloaded, and tools are discovered on demand. The total context consumption is reduced to merely about 8.7K tokens, preserving 95% of the context window.
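The two totals above can be checked with a few lines of arithmetic. The article does not state the size of the context window, so the 200,000-token window below is an assumption; with it, the quoted figure of about 95% of the window left free follows directly from the 8.7K total.

```python
# Figures quoted in the text above.
TRADITIONAL_TOKENS = 77_000
TOOL_SEARCH_TOKENS = 8_700

# Assumption: a 200,000-token context window (not stated in this article).
CONTEXT_WINDOW = 200_000


def free_fraction(used: int, window: int = CONTEXT_WINDOW) -> float:
    """Fraction of the context window still available after `used` tokens."""
    return 1 - used / window


free_before = free_fraction(TRADITIONAL_TOKENS)
free_after = free_fraction(TOOL_SEARCH_TOKENS)
# Relative drop in upfront token consumption between the two approaches.
reduction = 1 - TOOL_SEARCH_TOKENS / TRADITIONAL_TOKENS

print(f"free before: {free_before:.1%}")
print(f"free after:  {free_after:.1%}")
print(f"reduction:   {reduction:.1%}")
```

Under that assumption, a little over 60% of the window is free with preloading and a little over 95% with Tool Search, while upfront consumption falls by roughly 89%.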

Anthropic's internal tests on MCP evaluations indicate that enabling Tool Search raises Opus 4's accuracy from 49% to 74% and Opus 4.5's accuracy from 79.5% to 88.1%.

Enabling Tool Search involves a trade-off. The feature adds a search step before tool invocation, so it pays off when the context savings and accuracy gains outweigh the added latency.

It is therefore most useful in the following scenarios:

Tool definitions consume more than 10,000 tokens.

The system is built on MCP with multiple servers.

More than 10 tools are available.

As workflows grow more complex, traditional tool invocation runs into two fundamental problems: first, intermediate results pollute the context; second, every tool invocation requires a full model inference pass.

Programmatic Tool Invocation empowers Claude to orchestrate tools through code rather than separate API calls. Claude can invoke multiple tools by writing code and control which information ultimately enters its context window.

This feature enhances efficiency in the following ways:

Token savings: Average usage decreases from 43,588 tokens to 27,297 tokens, reducing token consumption by 37% for complex research tasks.

Reduced latency: When Claude coordinates 20 or more tool invocations in a single code block, 19 or more inference passes are eliminated.

Improved accuracy: Internal knowledge retrieval accuracy increases from 25.6% to 28.5%; GIA benchmark accuracy improves from 46.5% to 51.2%.

Programmatic Tool Invocation is ideally suited for processing large datasets, executing multi-step workflows, managing intermediate data, and performing operations on multiple items.
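The orchestration pattern described in this section can be sketched as follows. Everything here is a hypothetical stand-in: the `get_team_members` and `get_expenses` tools, the expense data, and the budget limit are invented for illustration, and plain `asyncio` plays the role of the code execution environment. In the real feature Claude writes code of this kind itself; the sketch only shows why intermediate results need not reach the model's context.

```python
import asyncio

# Hypothetical data behind the stand-in tools below.
EXPENSES = {
    "alice": [1200.0, 800.0, 450.0],
    "bob": [300.0, 150.0],
    "carol": [2500.0, 900.0],
}
BUDGET_LIMIT = 2000.0


async def get_team_members(department: str) -> list[str]:
    """Stand-in tool: list the members of a department."""
    return sorted(EXPENSES)


async def get_expenses(member: str) -> list[float]:
    """Stand-in tool: return the raw expense line items for one member."""
    return EXPENSES[member]


async def find_over_budget(department: str) -> list[dict]:
    """Call several tools from code and return only a compact summary."""
    members = await get_team_members(department)
    # One code block issues all the expense lookups concurrently.
    all_expenses = await asyncio.gather(*(get_expenses(m) for m in members))
    over = []
    for member, items in zip(members, all_expenses):
        total = sum(items)
        if total > BUDGET_LIMIT:
            over.append({"member": member, "total": total})
    # The raw line items are discarded here; only the summary is returned.
    return over


summary = asyncio.run(find_over_budget("engineering"))
print(summary)
```

Only `summary` would be handed back to the model, so the per-member line items never occupy the context window, and the four tool calls complete without a model inference pass between them.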

While JSON Schema excels at defining structures, it falls short in expressing usage patterns, such as when to include optional parameters or which combinations are logical.

Tool Usage Examples allow sample tool invocations to be included directly in the tool definition. This shows Claude concrete usage patterns rather than relying on the schema alone.

From these examples, Claude can learn:

Format conventions: Dates use the YYYY-MM-DD format, user IDs follow the USR-XXXXX pattern, and tags are written in kebab-case.

Nested structure patterns: How to construct the reporter object with its nested contact object.

Optional parameter correlations: A critical issue, for instance, includes full contact information and an escalation policy with a strict service level agreement (SLA).
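A tool definition carrying such examples might look like the sketch below. The `create_ticket` tool and all of its fields are hypothetical, and the `input_examples` key is an assumption about how the beta names this field; check the API documentation before relying on it. The small checker at the end is not part of any API. It simply verifies that the examples follow the date, user ID, and kebab-case conventions listed above, assuming that XXXXX means five digits.

```python
import re

# Hypothetical tool definition. The "input_examples" key is an assumption;
# consult the API documentation for the exact field the beta expects.
create_ticket_tool = {
    "name": "create_ticket",
    "description": "Create a support ticket",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due_date": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
            "reporter": {
                "type": "object",
                "properties": {
                    "id": {"type": "string"},
                    "contact": {
                        "type": "object",
                        "properties": {"email": {"type": "string"}},
                    },
                },
            },
            "escalation": {
                "type": "object",
                "properties": {"sla_hours": {"type": "integer"}},
            },
        },
        "required": ["title"],
    },
    "input_examples": [
        {   # Critical issue: full contact details plus escalation with a strict SLA.
            "title": "Login page returns 500 error",
            "due_date": "2025-12-05",
            "tags": ["login-failure", "production"],
            "reporter": {"id": "USR-12345", "contact": {"email": "jane@example.com"}},
            "escalation": {"sla_hours": 4},
        },
        {   # Feature request: reporter only, no contact or escalation.
            "title": "Add dark mode support",
            "tags": ["feature-request"],
            "reporter": {"id": "USR-67890"},
        },
        {"title": "Update API documentation"},  # Minimal call: title only.
    ],
}

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")          # YYYY-MM-DD
USER_ID_RE = re.compile(r"USR-\d{5}")               # USR-XXXXX, assuming five digits
KEBAB_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")  # kebab-case


def follows_conventions(example: dict) -> bool:
    """Check one example against the format conventions listed above."""
    if "due_date" in example and not DATE_RE.fullmatch(example["due_date"]):
        return False
    reporter = example.get("reporter", {})
    if "id" in reporter and not USER_ID_RE.fullmatch(reporter["id"]):
        return False
    return all(KEBAB_RE.fullmatch(tag) for tag in example.get("tags", []))


results = [follows_conventions(e) for e in create_ticket_tool["input_examples"]]
print(results)
```

Note that the schema alone would accept a date such as 12/05/2025 or a tag such as Not_Kebab, since both are valid strings; the examples are what convey the intended conventions.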

In Anthropic's internal tests, Tool Usage Examples raised accuracy on complex parameter handling from 72% to 90%.

Tool Usage Examples are well suited to complex nested structures, tools with many optional parameters, and tools whose domain-specific conventions are not captured by their schemas.

References:

https://www.anthropic.com/engineering/advanced-tool-use
