
Claude Advanced Tool Use: MCP Optimization Guide

Optimize Claude with Advanced Tool Use: the Tool Search Tool cuts prompt tokens by up to 85%, and code-first MCP patterns deliver up to 98.7% efficiency gains. A complete implementation guide.

Digital Applied Team
November 20, 2025 • Updated December 13, 2025
9 min read

Key Takeaways

85% Token Reduction with Tool Search Tool: Advanced Tool Use with defer_loading reduces prompt token usage by 85% or more while maintaining full tool access; in the 50-tool example below, per-request tool overhead drops from 150K to 17K tokens.
Programmatic Tool Calling for Complex Orchestration: Claude writes Python code to orchestrate multiple tools, reducing context pollution and enabling precise control flow with loops, conditionals, and error handling.
Code-First MCP Patterns: Auto-generate JSON schemas from TypeScript/Python type annotations, achieving 98.7% efficiency gains and eliminating manual schema maintenance.
June 2025 Security Spec Compliance: MCP deployments require OAuth 2.1, mandatory PKCE, and RFC 9728 Protected Resource Metadata for production readiness.
Claude Advanced Tool Use Technical Specifications
November 2025 Release - Beta Features
Release: November 20, 2025
Status: Beta
Models: Claude Opus 4.5, Sonnet 4.5
Token Savings: Up to 85%
Accuracy: 49% → 74% (Opus 4 with Tool Search)
Protocol: JSON-RPC 2.0
Context: 200K / 500K (Enterprise) / 1M (Beta)
Transports: stdio, HTTP, SSE

Beta Header: anthropic-beta: advanced-tool-use-2025-11-20

Anthropic released Advanced Tool Use features for Claude on November 20, 2025, addressing one of the most significant challenges in AI agent development: context pollution from large tool ecosystems. As applications integrate more MCP (Model Context Protocol) servers—databases, APIs, file systems, web scrapers—the token cost of loading all tool definitions into every request becomes prohibitive. A typical application with 50 tools might consume 150K tokens per request just describing available tools, burning through 75% of Claude's 200K context window before any actual work begins.

Advanced Tool Use introduces three features that transform tool management: the Tool Search Tool with defer_loading for dynamic tool discovery, Programmatic Tool Calling for code-based orchestration, and Tool Use Examples for schema-validated inputs. Combined with code-first MCP patterns that auto-generate schemas from TypeScript or Python type annotations, these features achieve 85-98% efficiency improvements while maintaining full tool access.

Advanced Tool Use: Three Features That Change Everything

Tool Search Tool

Dynamic tool discovery with defer_loading

  • 85% token reduction
  • BM25 + regex search
  • 3-5 tools per search
Programmatic Calling

Code-based tool orchestration

  • Python execution
  • Reduced context pollution
  • Loops and conditionals
Tool Use Examples

Schema-validated input patterns

  • input_examples field
  • Complex nested objects
  • Format-sensitive inputs

Tool Search Tool: Dynamic Discovery with defer_loading

The Tool Search Tool fundamentally changes how Claude interacts with large tool sets by introducing a two-phase execution model. In standard tool use, you provide all tool definitions upfront in the initial API request. Claude analyzes the user's query, selects appropriate tools from those provided, and executes them. This works well for applications with 3-5 tools but becomes inefficient with 20+ tools, as tool definitions consume significant prompt tokens that count against context limits and increase API costs.

Tool definition with defer_loading
{
  "name": "query_database",
  "description": "Execute SQL queries against the database. Use for retrieving, filtering, and aggregating data from any table.",
  "defer_loading": true,
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The SQL query to execute"
      },
      "limit": {
        "type": "integer",
        "description": "Maximum rows to return (default: 100)"
      }
    },
    "required": ["query"]
  }
}

With defer_loading: true, tools aren't loaded into Claude's context initially—they're only loaded when Claude discovers them via Tool Search. The Tool Search Tool contains lightweight metadata about all available tools (names, brief descriptions, categories), consuming approximately 2K tokens regardless of how many tools you have. When Claude receives a user query, it first calls the Tool Search Tool with a natural language description of what it needs.

Search Variants: BM25 vs Regex

Search Variant | How It Works | Best For
BM25 | Keyword relevance ranking using the BM25 algorithm | Most use cases (default)
Regex | Exact pattern matching against tool names | Known tool names, debugging, specific lookups
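Both variants can be mimicked in a few lines to make the two-phase model concrete: a lightweight index stays in context at all times, and full definitions load only for discovered tools. The tool names, descriptions, and keyword-overlap scoring below are illustrative stand-ins, not Anthropic's actual BM25 implementation:

```python
import re

# Lightweight metadata kept in context for every tool (~2K tokens total).
# Tool names and descriptions here are illustrative.
TOOL_INDEX = {
    "query_database": "execute sql queries to retrieve filter aggregate data from tables",
    "send_email": "send transactional email messages to users",
    "fetch_company_data": "look up firmographic data for a company domain",
}

def search_tools(query, variant="bm25"):
    """Toy stand-in for the Tool Search Tool: keyword overlap approximates
    the BM25 variant; re.search implements the regex variant."""
    if variant == "regex":
        return [name for name in TOOL_INDEX if re.search(query, name)]
    terms = set(query.lower().split())
    scored = sorted(
        ((len(terms & set(desc.split())), name) for name, desc in TOOL_INDEX.items()),
        reverse=True,
    )
    return [name for score, name in scored if score > 0]

def load_deferred(names):
    """Phase two: only discovered tools get their full definitions loaded."""
    return [{"name": n, "description": TOOL_INDEX[n]} for n in names]

hits = search_tools("aggregate data from sql tables")
print(hits)                                       # ['query_database', 'fetch_company_data']
print(search_tools(r"^query_", variant="regex"))  # ['query_database']
```

The real implementation returns 3-5 ranked matches per search; the point is that context cost scales with matches, not with the total tool count.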

Token Savings Breakdown

Before: Standard Tool Use

50 tools × 3K tokens each

= 150,000 tokens per request

75% of 200K context consumed

After: Tool Search Tool

Tool Search (2K) + 5 tools (15K)

= 17,000 tokens per request

8.5% of 200K context consumed

89% Token Reduction: 150K → 17K tokens per request
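A quick sanity check of the arithmetic above:

```python
tools, tokens_per_tool = 50, 3_000
before = tools * tokens_per_tool      # standard tool use: all definitions loaded
after = 2_000 + 5 * tokens_per_tool   # search index (~2K) + 5 discovered tools
reduction = 1 - after / before
print(f"{before:,} -> {after:,} tokens ({reduction:.0%} reduction)")
# 150,000 -> 17,000 tokens (89% reduction)
```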

Programmatic Tool Calling: Code-Based Orchestration

Programmatic Tool Calling enables Claude to orchestrate tools through Python code rather than individual API round-trips. Instead of Claude requesting tools one at a time with each result being returned to its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information enters its context window. This eliminates context pollution from intermediate results that consume token budgets without providing value.

Precise Control Flow
Loops, conditionals, and error handling are explicit in code rather than implicit in Claude's reasoning.
Reduced Context Pollution
Process 1,000 records in code but only pass the top 10 results into Claude's context window.
Fewer API Round-trips
Multiple tool calls execute in a single code block instead of sequential API requests.
Programmatic Tool Calling Example (Python)
# Claude writes this code to orchestrate multiple tools
def process_sales_pipeline():
    # Fetch 1,000 records - only code sees all of them
    raw_leads = fetch_database_records(
        query="SELECT * FROM leads WHERE status = 'active'",
        limit=1000
    )

    # Process in code - filter, transform, aggregate
    qualified = [
        lead for lead in raw_leads
        if lead['score'] > 80 and lead['last_contact_days'] < 30
    ]

    # Enrich only the top candidates
    for lead in qualified[:10]:
        lead['company_info'] = fetch_company_data(lead['domain'])

    # Only the final 10 enriched leads enter Claude's context
    return {
        'total_processed': len(raw_leads),
        'qualified_count': len(qualified),
        'top_leads': qualified[:10]
    }

Tool Use Examples: Schema-Validated Input Patterns

For tools with complex inputs, nested objects, or format-sensitive parameters, the input_examples field provides schema-validated examples that help Claude understand how to use your tools effectively. While JSON schemas define what's structurally valid, they can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.

Tool definition with input_examples
{
  "name": "create_report",
  "description": "Generate a formatted report from data sources",
  "input_schema": {
    "type": "object",
    "properties": {
      "data_source": { "type": "string" },
      "format": { "enum": ["pdf", "xlsx", "html"] },
      "filters": { "type": "object" },
      "include_charts": { "type": "boolean" }
    },
    "required": ["data_source", "format"]
  },
  "input_examples": [
    {
      "data_source": "sales_2025",
      "format": "pdf",
      "filters": { "region": "EMEA", "quarter": "Q4" },
      "include_charts": true
    },
    {
      "data_source": "inventory_current",
      "format": "xlsx",
      "filters": { "warehouse": "EU-West", "status": "low_stock" }
    }
  ]
}
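Because input_examples must themselves validate against the schema, it is worth checking them in CI before shipping a tool definition. A minimal hand-rolled check (covering only required keys and enum values; a full validator such as the jsonschema package also covers types and nesting) might look like:

```python
# Subset of the create_report schema from the article, plus its two examples.
schema = {
    "required": ["data_source", "format"],
    "properties": {"format": {"enum": ["pdf", "xlsx", "html"]}},
}
examples = [
    {"data_source": "sales_2025", "format": "pdf",
     "filters": {"region": "EMEA", "quarter": "Q4"}, "include_charts": True},
    {"data_source": "inventory_current", "format": "xlsx",
     "filters": {"warehouse": "EU-West", "status": "low_stock"}},
]

def check_example(example, schema):
    """Return True if the example has all required keys and valid enum values."""
    missing = [k for k in schema["required"] if k not in example]
    bad_enum = [
        k for k, spec in schema["properties"].items()
        if "enum" in spec and k in example and example[k] not in spec["enum"]
    ]
    return not missing and not bad_enum

assert all(check_example(e, schema) for e in examples)
```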

MCP vs Function Calling: When to Use Each

Function calling and MCP serve complementary purposes in AI agent development. Understanding when to use each approach helps you make the right architectural decisions for your application's needs.

Aspect | Function Calling | MCP (Model Context Protocol)
Architecture | Embedded in API requests | Client-server separation
State | Stateless (each call independent) | Persistent connections
Reusability | Per-application | Cross-application
Tool Updates | Requires code deployment | Runtime dynamic updates
Setup Complexity | Low | Medium-High
Best For | Simple integrations, prototypes | Large tool sets, enterprise, multi-app
Choose Function Calling When
  • Building quick prototypes
  • Using 5 or fewer tools
  • Single-application use case
  • No cross-platform needs
Choose MCP When
  • Managing 10+ tools
  • Reusing tools across applications
  • Need dynamic tool updates at runtime
  • Enterprise deployments

Code-First MCP Pattern Implementation

Code-first MCP patterns eliminate the traditional separation between code implementation and schema definition by generating schemas automatically from typed function signatures. Instead of writing a tool function and then separately maintaining a JSON schema that describes its parameters, you write a normal TypeScript or Python function with type annotations, and the MCP SDK automatically generates the schema Claude needs.

TypeScript Code-First Pattern

TypeScript with @anthropic-ai/sdk
import { tool } from '@anthropic-ai/sdk';

interface QueryDatabaseParams {
  /** The SQL query to execute against the database */
  query: string;
  /** Maximum rows to return (default: 100) */
  limit?: number;
  /** Timeout in milliseconds */
  timeout?: number;
}

// Schema generated automatically from types and JSDoc
// (illustrative `tool` helper; assumes a `db` client is in scope)
export const queryDatabase = tool<QueryDatabaseParams>({
  name: 'query_database',
  description: 'Execute SQL queries against the production database',
  handler: async ({ query, limit = 100, timeout = 5000 }) => {
    const results = await db.query(query, { limit, timeout });
    return { success: true, rowCount: results.length, rows: results };
  }
});

Python Code-First Pattern

Python with Pydantic and anthropic package
from anthropic import tool
from pydantic import BaseModel, Field

class QueryDatabaseParams(BaseModel):
    query: str = Field(description="The SQL query to execute")
    limit: int = Field(default=100, description="Maximum rows to return")
    timeout: int = Field(default=5000, description="Timeout in milliseconds")

@tool  # illustrative decorator; assumes a `db` client is in scope
async def query_database(params: QueryDatabaseParams) -> dict:
    """Execute SQL queries against the production database.

    Use this tool when you need to retrieve, filter, or aggregate
    data from any database table. Supports SELECT queries only.
    """
    results = await db.query(params.query, limit=params.limit)
    return {"success": True, "row_count": len(results), "rows": results}
1. Auto-Sync Schemas
Schemas stay synchronized with code automatically. No risk of schema drift where documentation doesn't match implementation.
2. Compile-Time Validation
TypeScript/Python type checking catches parameter validation errors during development, not at runtime.
3. Safe Refactoring
Rename parameters using IDE refactoring tools; both implementation and schema update together.
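The core trick behind both SDK patterns (deriving a schema from type annotations) can be demonstrated with nothing but the standard library. This is a simplified sketch of the idea, not the SDK's actual generator:

```python
import typing
from dataclasses import MISSING, dataclass, fields

# Python type -> JSON schema type (only the primitives used here).
TYPE_MAP = {str: "string", int: "integer", bool: "boolean", float: "number"}

@dataclass
class QueryDatabaseParams:
    query: str           # the SQL query to execute
    limit: int = 100     # maximum rows to return
    timeout: int = 5000  # timeout in milliseconds

def generate_schema(cls):
    """Derive an input_schema fragment from dataclass type annotations:
    fields without defaults become required; types map via TYPE_MAP."""
    hints = typing.get_type_hints(cls)
    return {
        "type": "object",
        "properties": {name: {"type": TYPE_MAP[tp]} for name, tp in hints.items()},
        "required": [f.name for f in fields(cls) if f.default is MISSING],
    }

schema = generate_schema(QueryDatabaseParams)
print(schema["required"])  # ['query']
```

Renaming `limit` in the dataclass automatically renames it in the generated schema, which is exactly the drift-prevention property the SDK patterns provide.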

Model Compatibility: Opus 4.5 vs Sonnet 4.5

Feature | Claude Opus 4.5 | Claude Sonnet 4.5
Tool Search Tool | ✓ Supported | ✓ Supported
Programmatic Tool Calling | ✓ Supported | ✓ Supported
Tool Use Examples | ✓ Supported | ✓ Supported
Accuracy with Tool Search | 74% (improved from 49%) | Not published
Context Window | 200K tokens | 200K (500K Enterprise)
Input Pricing | $5.00 per 1M tokens | $3.00 per 1M tokens
Output Pricing | $25.00 per 1M tokens | $15.00 per 1M tokens
Choose Opus 4.5 When
  • Maximum tool selection accuracy needed
  • Complex multi-step tool orchestration
  • Budget allows for premium pricing
  • Mission-critical tool decisions
Choose Sonnet 4.5 When
  • Cost optimization is priority
  • High-volume tool usage
  • Standard accuracy is sufficient
  • Production workloads at scale

Cost Optimization: Pricing and Savings Analysis

Model | Input (per 1M) | Output (per 1M) | Cached Input | Batch API
Claude Opus 4.5 | $5.00 | $25.00 | $0.50 (90% off) | $2.50 (50% off)
Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 (90% off) | $1.50 (50% off)
Monthly Savings Example: 50-Tool Application
Using Claude Sonnet 4.5 with 1,000 requests/day

Before: Standard Tool Use

50 tools × 3K tokens = 150K tokens/request

1,000 requests/day × 30 days = 30,000 requests

150K tokens × 30K requests × $3/1M input tokens

= $13,500/month (tool definitions only)

After: Tool Search Tool

Tool Search (2K) + 5 tools (15K) = 17K tokens/request

17K tokens × 30K requests × $3/1M input tokens

= $1,530/month (89% reduction)

Monthly Savings: $11,970 | Annual Savings: $143,640
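The dollar figures are easy to verify; the $13,500 baseline implies 30,000 requests per month at Sonnet's $3/1M input rate:

```python
requests = 30_000               # implied by the $13,500 baseline at $3 / 1M tokens
price_per_token = 3 / 1_000_000 # Sonnet 4.5 input pricing
before = 150_000 * requests * price_per_token  # standard tool use
after = 17_000 * requests * price_per_token    # Tool Search Tool
print(f"${before:,.0f} -> ${after:,.0f}, saving ${(before - after) * 12:,.0f}/year")
# $13,500 -> $1,530, saving $143,640/year
```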

Stacked Savings Potential

1. Tool Search: 85% token reduction
2. Prompt Caching: 90% off on cached portions
3. Batch API: 50% off for non-urgent tasks
Combined: up to 97% maximum potential savings

MCP Security: June 2025 Specification Requirements

The June 2025 MCP specification introduced significant security requirements that affect all production deployments. These changes align MCP with enterprise security standards and address vulnerabilities identified in early implementations.

Requirement | Specification | Status
OAuth 2.1 | MCP servers as resource servers only (separate auth servers) | Mandatory
PKCE | Proof Key for Code Exchange for all public clients | Mandatory
RFC 9728 | Protected Resource Metadata with WWW-Authenticate | Mandatory
RFC 8707 | Resource Indicators for audience-scoped tokens | Recommended
DPoP / mTLS | Token binding for enhanced security | Recommended
Session Security | Secure random IDs, no session-based auth | Mandatory
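Of these requirements, PKCE is the most mechanical to implement: the client derives a SHA-256 challenge from a random verifier, sends the challenge with the authorization request, and proves possession of the raw verifier at token exchange. A minimal RFC 7636 sketch (the OAuth endpoints themselves are out of scope here):

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Generate an RFC 7636 code_verifier and S256 code_challenge."""
    verifier = secrets.token_urlsafe(64)[:128]  # 43-128 unreserved characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = make_pkce_pair()
# Authorization request carries code_challenge (+ code_challenge_method=S256);
# the later token request carries the raw code_verifier.
print(len(challenge))  # 43 (unpadded base64url of a 32-byte digest)
```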
Authorization Server Separation

MCP servers now act as OAuth 2.1 resource servers only. Authentication flows through your existing identity provider (Okta, Azure AD, etc.) rather than the MCP server itself.

Human-in-the-Loop Controls

Mandate human approval for actions with financial, security, or reputational impact. Automatically flag high-risk tool calls for human review.

When NOT to Use Advanced Tool Use: Honest Guidance

Don't Use Tool Search Tool For
  • Small tool sets (under 5 tools) — Overhead exceeds savings
  • Latency-critical applications — Adds 50-100ms per search
  • Consistent tool usage patterns — Same 2-3 tools every time
  • Rapid prototyping — Get working first, optimize later
Use Simpler Approaches When
  • Single-function integrations — Function calling is simpler
  • One-off scripts — Direct API calls more efficient
  • No tool reuse potential — MCP shines with reusability
  • Resource-constrained environments — MCP servers have overhead
Quick Decision Framework
1-5 tools → Load all tools directly (no Tool Search needed)
5-20 tools → Hybrid approach (keep top 3-5 loaded, defer rest)
20+ tools → Full Tool Search Tool implementation
50+ tools → Tool Search + categorical filtering + usage analytics
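The framework reduces to a few thresholds, which can be encoded so the decision stays consistent across a team (the cutoffs are this article's guidance, not official limits):

```python
def tool_loading_strategy(tool_count):
    """Map tool count to a loading strategy per the framework above."""
    if tool_count <= 5:
        return "load all tools directly"
    if tool_count <= 20:
        return "hybrid: keep top 3-5 loaded, defer the rest"
    if tool_count < 50:
        return "full Tool Search Tool implementation"
    return "Tool Search + categorical filtering + usage analytics"

print(tool_loading_strategy(50))  # Tool Search + categorical filtering + usage analytics
```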

Common Mistakes: Lessons from Production Deployments

Mistake #1: Over-Granular Tool Definitions

The Error: Creating 200+ micro-tools when 30 well-designed tools would suffice. Each tool adds metadata overhead and search noise.

The Impact: Lower tool selection accuracy (more similar tools to distinguish), higher maintenance burden, slower Tool Search responses.

The Fix: Consolidate related operations into logical tool groups. Aim for 20-50 tools maximum with clear, non-overlapping responsibilities.

Mistake #2: Skipping Tool Descriptions

The Error: Using auto-generated or minimal descriptions like "Query the database" instead of comprehensive 3-4 sentence descriptions.

The Impact: 15-20% lower tool selection accuracy. Claude can't distinguish between similar tools or understand edge case behavior.

The Fix: Invest 30 minutes per tool in description writing. Explain WHAT it does, WHEN to use it, WHEN NOT to use it, and all parameters.

Mistake #3: Ignoring the June 2025 Security Spec

The Error: Deploying MCP servers without implementing OAuth 2.1, PKCE, and RFC 9728 requirements introduced in June 2025.

The Impact: Security vulnerabilities, non-compliance with enterprise requirements, potential data exposure through confused deputy attacks.

The Fix: Review and implement June 2025 spec requirements before production deployment. Use authorization servers separate from MCP servers.

Mistake #4: Deploying Without Metrics

The Error: Launching Tool Search Tool without monitoring tool selection accuracy, latency, or usage patterns.

The Impact: Can't identify optimization opportunities, no visibility into the 6% failure cases, no data for improving descriptions.

The Fix: Track from day one: tools searched, tools selected, tool success rate, latency. Use failures to improve tool descriptions and add category filtering.

Mistake #5: Not Testing Edge Cases

The Error: Only testing happy paths where Tool Search returns the exact tool needed. Not testing the 6% failure cases.

The Impact: Production failures when queries don't match any tool, when multiple tools partially match, or when the wrong tool is selected.

The Fix: Test: no matches, multiple partial matches, wrong tool selection, retry behavior, fallback to manual tool specification.

Migration Strategy for Existing Applications

Migrating existing Claude applications to Advanced Tool Use can happen incrementally without disrupting production systems. Start by auditing your current tool usage patterns—identify which tools get used most frequently and which are rarely called.

Week 1-4: Audit & Plan
  • Audit current tool usage patterns
  • Identify high/low frequency tools
  • Implement Tool Search Tool in parallel
  • Set up monitoring infrastructure
Week 5-8: Validate
  • A/B test with 10% traffic
  • Monitor selection accuracy
  • Measure latency impact
  • Validate token reduction
Week 9-12: Roll Out
  • Migrate low-risk tool categories
  • Gradually increase traffic
  • Migrate business-critical tools
  • Sunset legacy approach

Conclusion

Claude Advanced Tool Use with the Tool Search Tool, Programmatic Tool Calling, and code-first MCP patterns represents a fundamental improvement in how AI applications manage tool ecosystems. The 85% token reduction from Tool Search with defer_loading eliminates the linear relationship between tool count and API costs, enabling applications to integrate dozens or hundreds of tools without prohibitive overhead.

Combined with the June 2025 MCP security specification (OAuth 2.1, PKCE, RFC 9728), these patterns provide a production-ready foundation for enterprise AI deployments. For teams building serious AI applications, the investment in Advanced Tool Use pays for itself within 1-2 months through API cost savings alone—while actually improving tool selection accuracy from 49% to 74% on complex MCP evaluations.
