Claude Advanced Tool Use: MCP Optimization Guide
Optimize Claude with Advanced Tool Use: 85% token reduction via the Tool Search Tool, plus a complete MCP pattern guide for up to 98% efficiency gains.
Key Takeaways
Beta Header: anthropic-beta: advanced-tool-use-2025-11-20
Anthropic released Advanced Tool Use features for Claude on November 20, 2025, addressing one of the most significant challenges in AI agent development: context pollution from large tool ecosystems. As applications integrate more MCP (Model Context Protocol) servers—databases, APIs, file systems, web scrapers—the token cost of loading all tool definitions into every request becomes prohibitive. A typical application with 50 tools might consume 150K tokens per request just describing available tools, burning through 75% of Claude's 200K context window before any actual work begins.
Advanced Tool Use introduces three features that transform tool management: the Tool Search Tool with defer_loading for dynamic tool discovery, Programmatic Tool Calling for code-based orchestration, and Tool Use Examples for schema-validated inputs. Combined with code-first MCP patterns that auto-generate schemas from TypeScript or Python type annotations, these features achieve 85-98% efficiency improvements while maintaining full tool access.
Advanced Tool Use: Three Features That Change Everything
Tool Search Tool: dynamic tool discovery with defer_loading
- 85% token reduction
- BM25 + regex search
- 3-5 tools per search
Programmatic Tool Calling: code-based tool orchestration
- Python execution
- Reduced context pollution
- Loops and conditionals
Tool Use Examples: schema-validated input patterns
- input_examples field
- Complex nested objects
- Format-sensitive inputs
Tool Search Tool: Dynamic Discovery with defer_loading
The Tool Search Tool fundamentally changes how Claude interacts with large tool sets by introducing a two-phase execution model. In standard tool use, you provide all tool definitions upfront in the initial API request. Claude analyzes the user's query, selects appropriate tools from those provided, and executes them. This works well for applications with 3-5 tools but becomes inefficient with 20+ tools, as tool definitions consume significant prompt tokens that count against context limits and increase API costs.
{
  "name": "query_database",
  "description": "Execute SQL queries against the database. Use for retrieving, filtering, and aggregating data from any table.",
  "defer_loading": true,
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The SQL query to execute"
      },
      "limit": {
        "type": "integer",
        "description": "Maximum rows to return (default: 100)"
      }
    },
    "required": ["query"]
  }
}

With defer_loading: true, tools aren't loaded into Claude's context initially—they're only loaded when Claude discovers them via Tool Search. The Tool Search Tool contains lightweight metadata about all available tools (names, brief descriptions, categories), consuming approximately 2K tokens regardless of how many tools you have. When Claude receives a user query, it first calls the Tool Search Tool with a natural language description of what it needs.
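The two-phase model can be sketched as a toy registry: lightweight metadata stays resident for search, while full definitions load only on demand. This is a conceptual illustration, not Anthropic's implementation; all class and method names here are hypothetical.

```python
class DeferredToolRegistry:
    def __init__(self):
        self._metadata = {}     # always "in context": name -> short description
        self._definitions = {}  # full schemas, loaded only when discovered

    def register(self, name, short_description, full_definition):
        self._metadata[name] = short_description
        self._definitions[name] = full_definition

    def search(self, query):
        """Phase 1: match the query against lightweight metadata only."""
        terms = set(query.lower().split())
        return [
            name for name, desc in self._metadata.items()
            if terms & set(desc.lower().split())
        ]

    def load(self, name):
        """Phase 2: pull the full definition for a discovered tool."""
        return self._definitions[name]

registry = DeferredToolRegistry()
registry.register(
    "query_database",
    "execute sql queries against the database",
    {"name": "query_database", "input_schema": {"type": "object"}},
)
matches = registry.search("run a sql query")
definition = registry.load(matches[0])
```

Only the short descriptions are ever scanned during search; the full `input_schema` payloads stay out of the context window until a tool is actually selected.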
Search Variants: BM25 vs Regex
| Search Variant | How It Works | Best For |
|---|---|---|
| BM25 | Natural language semantic matching using the BM25 algorithm | Most use cases (default) |
| Regex | Pattern-based exact matching with regex patterns | Known tool names, debugging, specific lookups |
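The distinction can be made concrete with a small, self-contained sketch of both variants running over tool metadata. Anthropic's actual scorer runs server-side; the corpus, function names, and parameter defaults below are illustrative only.

```python
import math
import re

TOOLS = {
    "query_database": "execute sql queries against the database tables",
    "send_email": "send an email message to one or more recipients",
    "fetch_url": "fetch the contents of a url over http",
}

def bm25_search(query, docs, k1=1.5, b=0.75):
    """Rank tool descriptions against a natural-language query with BM25."""
    tokenized = {name: desc.split() for name, desc in docs.items()}
    avg_len = sum(len(t) for t in tokenized.values()) / len(tokenized)
    n = len(tokenized)
    scores = {}
    for term in query.lower().split():
        df = sum(term in toks for toks in tokenized.values())
        if df == 0:
            continue
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        for name, toks in tokenized.items():
            tf = toks.count(term)
            denom = tf + k1 * (1 - b + b * len(toks) / avg_len)
            scores[name] = scores.get(name, 0.0) + idf * tf * (k1 + 1) / denom
    return sorted(scores, key=scores.get, reverse=True)

def regex_search(pattern, docs):
    """Exact pattern lookup against tool names."""
    return [name for name in docs if re.search(pattern, name)]

ranked = bm25_search("run a sql query against the database", TOOLS)
exact = regex_search(r"^query_", TOOLS)
```

BM25 tolerates paraphrase ("run a sql query" still ranks `query_database` first), while regex only pays off when you already know the tool's name.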
Token Savings Breakdown
Before (standard tool use): 50 tools × 3K tokens each
= 150,000 tokens per request
75% of 200K context consumed
After (Tool Search): Tool Search index (2K) + 5 loaded tools (15K)
= 17,000 tokens per request
8.5% of 200K context consumed
89% Token Reduction: 150K → 17K tokens per request
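The arithmetic behind these figures, using the article's illustrative numbers:

```python
# Token cost of loading tool definitions into context.
def context_cost(tool_count, tokens_per_tool):
    return tool_count * tokens_per_tool

standard = context_cost(50, 3_000)         # all 50 definitions upfront
deferred = 2_000 + context_cost(5, 3_000)  # 2K search index + 5 loaded tools
reduction = 1 - deferred / standard

print(standard, deferred, round(reduction * 100))  # 150000 17000 89
```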
Programmatic Tool Calling: Code-Based Orchestration
Programmatic Tool Calling enables Claude to orchestrate tools through Python code rather than individual API round-trips. Instead of Claude requesting tools one at a time with each result being returned to its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information enters its context window. This eliminates context pollution from intermediate results that consume token budgets without providing value.
# Claude writes this code to orchestrate multiple tools
def process_sales_pipeline():
    # Fetch 1,000 records - only code sees all of them
    raw_leads = fetch_database_records(
        query="SELECT * FROM leads WHERE status = 'active'",
        limit=1000
    )
    # Process in code - filter, transform, aggregate
    qualified = [
        lead for lead in raw_leads
        if lead['score'] > 80 and lead['last_contact_days'] < 30
    ]
    # Enrich only the top candidates
    for lead in qualified[:10]:
        lead['company_info'] = fetch_company_data(lead['domain'])
    # Only the final 10 enriched leads enter Claude's context
    return {
        'total_processed': len(raw_leads),
        'qualified_count': len(qualified),
        'top_leads': qualified[:10]
    }

Tool Use Examples: Schema-Validated Input Patterns
For tools with complex inputs, nested objects, or format-sensitive parameters, the input_examples field provides schema-validated examples that help Claude understand how to use your tools effectively. While JSON schemas define what's structurally valid, they can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.
{
  "name": "create_report",
  "description": "Generate a formatted report from data sources",
  "input_schema": {
    "type": "object",
    "properties": {
      "data_source": { "type": "string" },
      "format": { "enum": ["pdf", "xlsx", "html"] },
      "filters": { "type": "object" },
      "include_charts": { "type": "boolean" }
    },
    "required": ["data_source", "format"]
  },
  "input_examples": [
    {
      "data_source": "sales_2025",
      "format": "pdf",
      "filters": { "region": "EMEA", "quarter": "Q4" },
      "include_charts": true
    },
    {
      "data_source": "inventory_current",
      "format": "xlsx",
      "filters": { "warehouse": "EU-West", "status": "low_stock" }
    }
  ]
}

MCP vs Function Calling: When to Use Each
Function calling and MCP serve complementary purposes in AI agent development. Understanding when to use each approach helps you make the right architectural decisions for your application's needs.
| Aspect | Function Calling | MCP (Model Context Protocol) |
|---|---|---|
| Architecture | Embedded in API requests | Client-server separation |
| State | Stateless (each call independent) | Persistent connections |
| Reusability | Per-application | Cross-application |
| Tool Updates | Requires code deployment | Runtime dynamic updates |
| Setup Complexity | Low | Medium-High |
| Best For | Simple integrations, prototypes | Large tool sets, enterprise, multi-app |
Use function calling when:
- Building quick prototypes
- Using 5 or fewer tools
- Single-application use case
- No cross-platform needs
Use MCP when:
- Managing 10+ tools
- Reusing tools across applications
- Need dynamic tool updates at runtime
- Enterprise deployments
Code-First MCP Pattern Implementation
Code-first MCP patterns eliminate the traditional separation between code implementation and schema definition by generating schemas automatically from typed function signatures. Instead of writing a tool function and then separately maintaining a JSON schema that describes its parameters, you write a normal TypeScript or Python function with type annotations, and the MCP SDK automatically generates the schema Claude needs.
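The core mechanism can be sketched with plain Pydantic, independent of any SDK helper: the JSON schema Claude needs is derived from the typed model, so code and schema cannot drift apart. `model_json_schema()` is standard Pydantic v2; the model itself mirrors the example used throughout this guide.

```python
from pydantic import BaseModel, Field

class QueryDatabaseParams(BaseModel):
    query: str = Field(description="The SQL query to execute")
    limit: int = Field(default=100, description="Maximum rows to return")

# One source of truth: the schema is generated, never hand-maintained.
schema = QueryDatabaseParams.model_json_schema()
# schema["required"] is ["query"] -- limit has a default, so it stays optional
```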
TypeScript Code-First Pattern
import { tool } from '@anthropic-ai/sdk';

interface QueryDatabaseParams {
  /** The SQL query to execute against the database */
  query: string;
  /** Maximum rows to return (default: 100) */
  limit?: number;
  /** Timeout in milliseconds */
  timeout?: number;
}

// Schema generated automatically from types and JSDoc
export const queryDatabase = tool<QueryDatabaseParams>({
  name: 'query_database',
  description: 'Execute SQL queries against the production database',
  handler: async ({ query, limit = 100, timeout = 5000 }) => {
    const results = await db.query(query, { limit, timeout });
    return { success: true, rowCount: results.length, rows: results };
  }
});

Python Code-First Pattern
from anthropic import tool
from pydantic import BaseModel, Field

class QueryDatabaseParams(BaseModel):
    query: str = Field(description="The SQL query to execute")
    limit: int = Field(default=100, description="Maximum rows to return")
    timeout: int = Field(default=5000, description="Timeout in milliseconds")

@tool
async def query_database(params: QueryDatabaseParams) -> dict:
    """Execute SQL queries against the production database.

    Use this tool when you need to retrieve, filter, or aggregate
    data from any database table. Supports SELECT queries only.
    """
    results = await db.query(params.query, limit=params.limit)
    return {"success": True, "row_count": len(results), "rows": results}

Model Compatibility: Opus 4.5 vs Sonnet 4.5
| Feature | Claude Opus 4.5 | Claude Sonnet 4.5 |
|---|---|---|
| Tool Search Tool | ✓ Supported | ✓ Supported |
| Programmatic Tool Calling | ✓ Supported | ✓ Supported |
| Tool Use Examples | ✓ Supported | ✓ Supported |
| Accuracy with Tool Search | 74% (improved from 49%) | Not published |
| Context Window | 200K tokens | 200K (500K Enterprise) |
| Input Pricing | $5.00 per 1M tokens | $3.00 per 1M tokens |
| Output Pricing | $25.00 per 1M tokens | $15.00 per 1M tokens |
Choose Opus 4.5 when:
- Maximum tool selection accuracy needed
- Complex multi-step tool orchestration
- Budget allows for premium pricing
- Mission-critical tool decisions
Choose Sonnet 4.5 when:
- Cost optimization is priority
- High-volume tool usage
- Standard accuracy is sufficient
- Production workloads at scale
Cost Optimization: Pricing and Savings Analysis
| Model | Input (per 1M) | Output (per 1M) | Cached Input | Batch API |
|---|---|---|---|---|
| Claude Opus 4.5 | $5.00 | $25.00 | $0.50 (90% off) | $2.50 (50% off) |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 (90% off) | $1.50 (50% off) |
Before: Standard Tool Use
50 tools × 3K tokens = 150K tokens/request
1,000 requests/day × 30 days = 30,000 requests
150K tokens × 30K requests × $3/1M input tokens
= $13,500/month (tool definitions only)
After: Tool Search Tool
Tool Search (2K) + 5 tools (15K) = 17K tokens/request
17K tokens × 30K requests × $3/1M input tokens
= $1,530/month (89% reduction)
Monthly Savings: $11,970 | Annual Savings: $143,640
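Recomputing the figures above (Sonnet 4.5 input pricing from the table, the example's request volume):

```python
# Monthly input-token cost of tool definitions alone.
def monthly_tool_cost(tokens_per_request, requests_per_day, price_per_million, days=30):
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens / 1_000_000 * price_per_million

before = monthly_tool_cost(150_000, 1_000, 3.00)  # all 50 tools loaded upfront
after = monthly_tool_cost(17_000, 1_000, 3.00)    # tool search + 5 loaded tools

print(before, after, before - after)  # 13500.0 1530.0 11970.0
```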
Stacked Savings Potential
- 85% token reduction from Tool Search with defer_loading
- 90% off input costs on cached portions (prompt caching)
- 50% off non-urgent tasks (Batch API)
- 97% maximum combined potential
MCP Security: June 2025 Specification Requirements
The June 2025 MCP specification introduced significant security requirements that affect all production deployments. These changes align MCP with enterprise security standards and address vulnerabilities identified in early implementations.
| Requirement | Specification | Status |
|---|---|---|
| OAuth 2.1 | MCP servers as resource servers only (separate auth servers) | Mandatory |
| PKCE | Proof Key for Code Exchange for all public clients | Mandatory |
| RFC 9728 | Protected Resource Metadata with WWW-Authenticate | Mandatory |
| RFC 8707 | Resource Indicators for audience-scoped tokens | Recommended |
| DPoP / mTLS | Token binding for enhanced security | Recommended |
| Session Security | Secure random IDs, no session-based auth | Mandatory |
MCP servers now act as OAuth 2.1 resource servers only. Authentication flows through your existing identity provider (Okta, Azure AD, etc.) rather than the MCP server itself.
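The RFC 9728 handshake a compliant resource server performs can be sketched in a few lines: an unauthenticated request gets a 401 challenge pointing at a metadata document, which in turn names the separate authorization server. The field names and well-known path come from RFC 9728; the URLs are illustrative placeholders.

```python
def unauthorized_response(resource_url):
    """Build the 401 challenge that advertises the metadata document."""
    metadata_url = f"{resource_url}/.well-known/oauth-protected-resource"
    return {
        "status": 401,
        "headers": {
            "WWW-Authenticate": f'Bearer resource_metadata="{metadata_url}"'
        },
    }

def protected_resource_metadata(resource_url, auth_server):
    """Metadata document the client fetches to discover the auth server."""
    return {
        "resource": resource_url,
        "authorization_servers": [auth_server],  # separate from the MCP server
        "bearer_methods_supported": ["header"],
    }

resp = unauthorized_response("https://mcp.example.com")
meta = protected_resource_metadata("https://mcp.example.com",
                                   "https://auth.example.com")
```

Note the separation: the MCP server only validates tokens and serves this metadata; issuing tokens remains the identity provider's job.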
Mandate human approval for actions with financial, security, or reputational impact. Automatically flag high-risk tool calls for human review.
When NOT to Use Advanced Tool Use: Honest Guidance
Skip Advanced Tool Use when:
- Small tool sets (under 5 tools) — Overhead exceeds savings
- Latency-critical applications — Adds 50-100ms per search
- Consistent tool usage patterns — Same 2-3 tools every time
- Rapid prototyping — Get working first, optimize later
Skip MCP when:
- Single-function integrations — Function calling is simpler
- One-off scripts — Direct API calls more efficient
- No tool reuse potential — MCP shines with reusability
- Resource-constrained environments — MCP servers have overhead
Common Mistakes: Lessons from Production Deployments
Mistake 1: Tool sprawl
The Error: Creating 200+ micro-tools when 30 well-designed tools would suffice. Each tool adds metadata overhead and search noise.
The Impact: Lower tool selection accuracy (more similar tools to distinguish), higher maintenance burden, slower Tool Search responses.
The Fix: Consolidate related operations into logical tool groups. Aim for 20-50 tools maximum with clear, non-overlapping responsibilities.
Mistake 2: Poor tool descriptions
The Error: Using auto-generated or minimal descriptions like "Query the database" instead of comprehensive 3-4 sentence descriptions.
The Impact: 15-20% lower tool selection accuracy. Claude can't distinguish between similar tools or understand edge case behavior.
The Fix: Invest 30 minutes per tool in description writing. Explain WHAT it does, WHEN to use it, WHEN NOT to use it, and all parameters.
Mistake 3: Skipping the June 2025 security requirements
The Error: Deploying MCP servers without implementing OAuth 2.1, PKCE, and RFC 9728 requirements introduced in June 2025.
The Impact: Security vulnerabilities, non-compliance with enterprise requirements, potential data exposure through confused deputy attacks.
The Fix: Review and implement June 2025 spec requirements before production deployment. Use authorization servers separate from MCP servers.
Mistake 4: No monitoring
The Error: Launching Tool Search Tool without monitoring tool selection accuracy, latency, or usage patterns.
The Impact: Can't identify optimization opportunities, no visibility into the 6% failure cases, no data for improving descriptions.
The Fix: Track from day one: tools searched, tools selected, tool success rate, latency. Use failures to improve tool descriptions and add category filtering.
Mistake 5: Only testing happy paths
The Error: Only testing happy paths where Tool Search returns the exact tool needed. Not testing the 6% failure cases.
The Impact: Production failures when queries don't match any tool, when multiple tools partially match, or when the wrong tool is selected.
The Fix: Test: no matches, multiple partial matches, wrong tool selection, retry behavior, fallback to manual tool specification.
Migration Strategy for Existing Applications
Migrating existing Claude applications to Advanced Tool Use can happen incrementally without disrupting production systems. Start by auditing your current tool usage patterns—identify which tools get used most frequently and which are rarely called.
Phase 1: Prepare
- Audit current tool usage patterns
- Identify high/low frequency tools
- Implement Tool Search Tool in parallel
- Set up monitoring infrastructure
Phase 2: Validate
- A/B test with 10% traffic
- Monitor selection accuracy
- Measure latency impact
- Validate token reduction
Phase 3: Expand
- Migrate low-risk tool categories
- Gradually increase traffic
Phase 4: Complete
- Migrate business-critical tools
- Sunset legacy approach
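The 10% A/B split can be sketched as a deterministic hash-based router, so each caller always lands in the same cohort and metrics stay comparable across the rollout. Function and ID names here are illustrative.

```python
import hashlib

def use_tool_search(user_id: str, rollout_percent: int = 10) -> bool:
    """Route a stable ~rollout_percent of users to the Tool Search path."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = digest[0] * 256 + digest[1]  # 0..65535, stable per user
    return bucket % 100 < rollout_percent

# Roughly 10% of a 1,000-user population routes to the new path.
cohort_a = [uid for uid in (f"user-{i}" for i in range(1000))
            if use_tool_search(uid)]
```

Because the split is derived from the ID rather than random at request time, increasing `rollout_percent` in later phases only adds users to the cohort; nobody flips back and forth.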
Conclusion
Claude Advanced Tool Use with the Tool Search Tool, Programmatic Tool Calling, and code-first MCP patterns represents a fundamental improvement in how AI applications manage tool ecosystems. The 85% token reduction from Tool Search with defer_loading eliminates the linear relationship between tool count and API costs, enabling applications to integrate dozens or hundreds of tools without prohibitive overhead.
Combined with the June 2025 MCP security specification (OAuth 2.1, PKCE, RFC 9728), these patterns provide a production-ready foundation for enterprise AI deployments. For teams building serious AI applications, the investment in Advanced Tool Use pays for itself within 1-2 months through API cost savings alone—while actually improving tool selection accuracy from 49% to 74% on complex MCP evaluations.