AI Development

AI Function Calling Guide: OpenAI, Anthropic, Google

Technical guide to AI function calling across OpenAI, Anthropic, and Google APIs. Syntax comparison, implementation patterns, and best practices.

Digital Applied Team
April 1, 2026
13 min read
3

Major Providers Compared

98.7%

GPT-5.4 Thinking TAU2 Score

8.4

Anthropic Reliability Score

Mid-2026

OpenAI Assistants API Sunset

Key Takeaways

All three providers support function calling but with distinct API formats: OpenAI uses a tools array with type 'function', Anthropic uses tool definitions with input_schema, and Google uses functionDeclarations. The core loop is identical across all three: define schemas, detect tool calls, execute functions, return results. The surface-level syntax differences are what create migration friction.
Anthropic leads on tool-calling reliability at 8.4 out of 10: Google scores 7.9 and OpenAI 6.3 on the same tool-use reliability metrics as of Q1 2026. Claude's content-block architecture separates tool calls from text responses cleanly, and strict mode ensures schema compliance. For agentic systems with complex tool chains, reliability differences compound across multiple turns.
MCP is becoming the cross-provider standard for tool integration: Model Context Protocol, originally introduced by Anthropic in November 2024, has been adopted by OpenAI and Google. OpenAI deprecated its Assistants API in favor of MCP with a mid-2026 sunset. Building on MCP now means less migration work as the ecosystem converges.
TypeScript and Python implementations follow the same logical structure: Despite different SDK syntax in each language, the four-step pattern (define tool, call model, execute function, return result) is identical. This guide provides working code for both languages across all three providers, with copy-paste-ready examples for common use cases.

Function calling is the mechanism that turns a language model from a text generator into a tool-using agent. Instead of generating a natural language answer to “What is the weather in Tokyo?”, the model generates a structured JSON call to a get_weather function with the argument {"location": "Tokyo"}. Your code executes the function, returns the result, and the model incorporates the real data into its response.

All three major providers — OpenAI, Anthropic, and Google — support this pattern, but each uses different API formats, response structures, and SDK conventions. This guide walks through the concrete syntax for each provider, provides working TypeScript and Python code, and identifies the practical differences that affect production systems. For teams building broader AI and digital transformation pipelines, understanding these differences is essential for choosing the right provider and designing tool integrations that remain maintainable as APIs evolve.

What Is Function Calling and Why It Matters

Function calling bridges the gap between a language model's general knowledge and your application's specific capabilities. Without it, the model can only produce text. With it, the model can query databases, call APIs, update records, trigger workflows, and interact with any system you expose through a function definition.

Without Function Calling

Model generates a text approximation: “The weather in Tokyo is probably around 15-20 degrees Celsius in April.” No real data, no API calls, hallucination risk on factual questions.

With Function Calling

Model generates a structured call: get_weather({location: "Tokyo"}). Your code executes it against a real API, returns 18.2 degrees, and the model uses the actual data.

The core loop is the same across all providers. First, you define available functions using JSON schema that describes parameters, types, and descriptions. Second, you send a user message plus the tool definitions to the model. Third, the model returns either a regular text response or one or more function calls with structured arguments. Fourth, you execute the function, send the results back, and the model generates a final response incorporating the real data.

Step 1

Define Tools

JSON schema describing functions, parameters, and types

Step 2

Send Request

User message plus tool definitions sent to the model API

Step 3

Detect Calls

Model returns function calls with structured JSON arguments

Step 4

Execute & Return

Run the function, send results back, model generates answer

OpenAI: Tools Array and Responses API

OpenAI introduced function calling in June 2023 and has evolved the format through several iterations. The current API uses a tools parameter with objects of type "function". The older functions and function_call parameters were deprecated with the 2023-12-01 API version. The Responses API now also supports tool namespaces for organizing large tool sets.

OpenAI Tool Definition (TypeScript)
import OpenAI from "openai";

const client = new OpenAI();

const tools: OpenAI.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City and state, e.g. San Francisco, CA",
          },
          unit: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
          },
        },
        required: ["location"],
        additionalProperties: false,
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
  tools,
  tool_choice: "auto",
});
OpenAI Tool Definition (Python)
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g. SF, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

When the model decides to call a function, the response includes a tool_calls array on the assistant message. Each tool call has an id, the function name, and a JSON string of arguments. You parse the arguments, execute your function, and return the result as a message with role "tool" and the matching tool_call_id.
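That handling step can be sketched in Python. This is an illustrative sketch, not the official SDK surface: the assistant message is mocked as a plain dict shaped like a Chat Completions response, and the `get_weather` implementation is a stand-in.

```python
import json

def handle_tool_calls(message: dict, functions: dict) -> list:
    """Execute each tool call on an assistant message and build the
    role-"tool" result messages (arguments arrive as a JSON string)."""
    results = []
    for call in message["tool_calls"]:
        fn = functions[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # string -> dict
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the call's id
            "content": fn(args),
        })
    return results

# Mocked assistant message, shaped like a Chat Completions tool call
mock_message = {
    "tool_calls": [
        {
            "id": "call_123",
            "function": {
                "name": "get_weather",
                "arguments": '{"location": "Tokyo"}',
            },
        }
    ]
}
functions = {
    "get_weather": lambda args: json.dumps(
        {"temp": 18.2, "unit": "celsius", "location": args["location"]}
    )
}
tool_messages = handle_tool_calls(mock_message, functions)
```

The resulting messages are appended to the conversation history before the next API call, which lets the model see the real data.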

Anthropic: Tool Use with input_schema

Anthropic calls this feature “tool use” and has a distinctly different response format from OpenAI. Claude uses a content-block architecture where tool calls and text appear as separate blocks within the assistant's response. This architectural decision means the model can interleave reasoning text with tool calls naturally, which is useful for agentic systems where the model needs to explain its decisions while acting. For a deeper look at Claude's advanced tool use and MCP integration, our dedicated guide covers the extended capabilities.

Anthropic Tool Definition (TypeScript)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description: "Get current weather for a location",
    input_schema: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "City and state, e.g. San Francisco, CA",
        },
        unit: {
          type: "string",
          enum: ["celsius", "fahrenheit"],
        },
      },
      required: ["location"],
    },
  },
];

const response = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
});
Anthropic Tool Definition (Python)
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. SF, CA",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                },
            },
            "required": ["location"],
        },
    }
]

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
)

The response from Claude contains a content array with blocks of type "text" and "tool_use". Each tool_use block has an id, the tool name, and an input object containing the arguments already parsed into an object (a dict in Python), not a JSON string as in OpenAI's API. You return results as a message with role "user" containing a tool_result content block.
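The content-block handling can be sketched as follows. The content array is mocked as plain dicts mirroring the Messages API shapes, so no live call is made, and `get_weather` is a stand-in implementation.

```python
import json

def handle_tool_use(content_blocks: list, functions: dict) -> dict:
    """Collect tool_use blocks from Claude's content array, execute them,
    and build the follow-up user message of tool_result blocks."""
    results = []
    for block in content_blocks:
        if block["type"] == "tool_use":
            fn = functions[block["name"]]
            result = fn(block["input"])  # input is already a parsed dict
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": result,
            })
    return {"role": "user", "content": results}

# Mocked content array: text and tool_use blocks can be interleaved
mock_content = [
    {"type": "text", "text": "Let me check the weather."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"location": "Tokyo"}},
]
functions = {"get_weather": lambda args: json.dumps({"temp": 18.2})}
follow_up = handle_tool_use(mock_content, functions)
```

Note that results go back in a user message, not a dedicated tool role as in OpenAI's format.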

Strict Mode

Add strict: true to tool definitions to ensure Claude's tool calls always match your schema exactly. Prevents malformed arguments from reaching your execution layer.

Tool Search

Anthropic introduced tool search where the model pulls tool definitions on demand rather than loading everything at once. Keeps context windows cleaner for large tool sets exceeding 20-30 definitions.

Google: Function Declarations in Gemini

Google's Gemini API uses FunctionDeclaration objects wrapped in a Tool object. The format is structurally similar to OpenAI's approach but with different property names and nesting. As of March 2026, Gemini 3 model APIs generate a unique id for every function call, and passing the matching id in your functionResponse is now recommended for reliable multi-turn conversations.

Google Gemini Function Declaration (TypeScript)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_AI_KEY });

const getWeatherDeclaration = {
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "City and state, e.g. San Francisco, CA",
      },
      unit: {
        type: "string",
        enum: ["celsius", "fahrenheit"],
      },
    },
    required: ["location"],
  },
};

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: [{ role: "user", parts: [{ text: "Weather in Tokyo?" }] }],
  config: {
    tools: [{ functionDeclarations: [getWeatherDeclaration] }],
  },
});
Google Gemini Function Declaration (Python)
import os

from google import genai

client = genai.Client(api_key=os.environ["GOOGLE_AI_KEY"])

get_weather = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and state, e.g. SF, CA",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
            },
        },
        "required": ["location"],
    },
}

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Weather in Tokyo?",
    config={
        "tools": [{"function_declarations": [get_weather]}],
    },
)

Gemini's response contains functionCall parts with the function name and arguments as a parsed object (like Anthropic, not a JSON string like OpenAI). You send results back as a functionResponse part. Google also supports automatic function execution via the automatic_function_calling configuration, where the SDK handles the call-execute-return loop for you — useful for prototyping but not recommended for production where you need control over execution and error handling.
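The functionCall-to-functionResponse round trip can be sketched like this. The parts are mocked as plain dicts shaped like the REST payload rather than the SDK's typed objects, and `get_weather` is a stand-in implementation.

```python
def handle_function_calls(parts: list, functions: dict) -> list:
    """Execute Gemini functionCall parts and build the functionResponse
    parts to send back (args is already a parsed dict)."""
    responses = []
    for part in parts:
        call = part.get("functionCall")
        if call is None:
            continue  # skip plain text parts
        fn = functions[call["name"]]
        responses.append({
            "functionResponse": {
                "name": call["name"],          # echo the function name back
                "response": fn(call["args"]),
            }
        })
    return responses

# Mocked candidate parts, shaped like the REST response
mock_parts = [{"functionCall": {"name": "get_weather",
                                "args": {"location": "Tokyo"}}}]
functions = {"get_weather": lambda args: {"temp": 18.2, "unit": "celsius"}}
response_parts = handle_function_calls(mock_parts, functions)
```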

Side-by-Side Syntax Comparison

The following table maps the equivalent concepts across all three providers. Despite different naming conventions, the underlying structure is remarkably similar — each provider needs a function name, description, parameter schema, and a way to return results.

API Concept Mapping
| Concept | OpenAI | Anthropic | Google |
|---|---|---|---|
| Tool definition wrapper | tools[] | tools[] | tools[].functionDeclarations[] |
| Parameter schema key | parameters | input_schema | parameters |
| Call in response | tool_calls[] | tool_use block | functionCall part |
| Arguments format | JSON string | Parsed object | Parsed object |
| Result return role | role: "tool" | tool_result in user msg | functionResponse part |
| Strict mode | strict: true | strict: true | allowed_function_names |
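The arguments-format difference is the most common source of migration bugs: code written for OpenAI's JSON strings breaks on Anthropic's and Gemini's parsed objects. A small normalizer (a hypothetical helper, not part of any SDK) absorbs the difference at one boundary:

```python
import json

def normalize_args(raw) -> dict:
    """Return tool-call arguments as a dict, whether the provider sent
    a JSON string (OpenAI) or an already-parsed object (Anthropic, Gemini)."""
    if isinstance(raw, str):
        return json.loads(raw)
    return dict(raw)

# Both provider shapes normalize to the same dict
openai_style = normalize_args('{"location": "Tokyo"}')
parsed_style = normalize_args({"location": "Tokyo"})
```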

Implementation Patterns: TypeScript and Python

The complete function-calling loop — send request, detect tool calls, execute functions, return results — follows the same structure in every language and provider. The following pattern shows a production-ready TypeScript implementation that handles the full multi-turn loop with proper error handling.

Full Tool Loop (TypeScript / OpenAI)
// Define your function implementations
const functions: Record<string, (args: unknown) => Promise<string>> = {
  get_weather: async (args) => {
    const { location } = args as { location: string };
    // Call your actual weather API here
    return JSON.stringify({ temp: 18.2, unit: "celsius", location });
  },
};

// The tool loop: keeps calling until the model stops requesting tools
async function runToolLoop(userMessage: string) {
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.chat.completions.create({
      model: "gpt-5.4",
      messages,
      tools,
    });

    const choice = response.choices[0];
    messages.push(choice.message);

    // If no tool calls, return the final text response
    if (!choice.message.tool_calls?.length) {
      return choice.message.content;
    }

    // Execute each tool call and add results
    for (const toolCall of choice.message.tool_calls) {
      const fn = functions[toolCall.function.name];
      const args = JSON.parse(toolCall.function.arguments);
      const result = await fn(args);

      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: result,
      });
    }
  }
}

The equivalent Anthropic loop differs mainly in how tool calls are detected (content blocks with type tool_use) and how results are returned (as tool_result blocks in a user message). The logical structure is identical.

Full Tool Loop (TypeScript / Anthropic)
async function runToolLoop(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });

    // Add assistant response to history
    messages.push({ role: "assistant", content: response.content });

    // If model stopped naturally (not waiting for tool results)
    if (response.stop_reason !== "tool_use") {
      const textBlock = response.content.find((b) => b.type === "text");
      return textBlock?.text ?? "";
    }

    // Execute tool calls and build result blocks
    const toolResults = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const fn = functions[block.name];
        const result = await fn(block.input);
        toolResults.push({
          type: "tool_result" as const,
          tool_use_id: block.id,
          content: result,
        });
      }
    }

    messages.push({ role: "user", content: toolResults });
  }
}

Both implementations follow a while loop that continues until the model produces a response without requesting additional tool calls. This pattern handles chained tool calls naturally — the model can call one tool, use the result to decide which tool to call next, and repeat until it has enough information to answer the original question.

MCP: The Emerging Cross-Provider Standard

Model Context Protocol (MCP) is the most significant development in function calling since OpenAI introduced the feature in 2023. Originally launched by Anthropic in November 2024, MCP defines a standard protocol for connecting AI models to external tools and data sources. Both OpenAI and Google have adopted MCP, and OpenAI deprecated its Assistants API in favor of MCP with a mid-2026 sunset. For a comprehensive overview of how MCP fits alongside other agent protocols, see our AI agent protocol ecosystem map covering MCP, A2A, ACP, and UCP.

Write Once

MCP tool servers expose tools through a standard protocol. Build the integration once and it works with Claude, GPT, Gemini, and any MCP-compatible client without per-provider adapter code.

Server Architecture

MCP servers run as separate processes that expose tools, resources, and prompts. Clients (AI models) connect via stdio or HTTP. This separation means tool servers can be shared across applications and teams.

Dynamic Discovery

Models can discover available tools at runtime through MCP rather than receiving static definitions in every API call. This reduces context window consumption for applications with large tool sets.

The practical recommendation is straightforward: if you are building new tool integrations today, build them as MCP servers. The upfront investment is comparable to building a provider-specific integration, and the result works across all three major providers. For existing integrations using provider-specific formats, the migration priority depends on whether you need multi-provider support. Single-provider applications work fine with native tool definitions.

Error Handling and Schema Validation

Production tool-calling systems need robust error handling at three layers: schema validation (preventing malformed arguments from reaching your code), execution errors (handling failures in the functions themselves), and result formatting (returning errors in a way the model can understand and recover from).

Error Handling Pattern (TypeScript)
import { z } from "zod";

// Define schemas for validation
const weatherSchema = z.object({
  location: z.string().min(1),
  unit: z.enum(["celsius", "fahrenheit"]).optional(),
});

async function executeToolCall(
  name: string,
  args: unknown
): Promise<{ result?: string; error?: string }> {
  try {
    // 1. Validate arguments against schema
    if (name === "get_weather") {
      const parsed = weatherSchema.safeParse(args);
      if (!parsed.success) {
        return {
          error: `Invalid arguments: ${parsed.error.message}`,
        };
      }
      // 2. Execute the function
      const data = await fetchWeather(parsed.data.location);
      return { result: JSON.stringify(data) };
    }
    return { error: `Unknown function: ${name}` };
  } catch (err) {
    // 3. Return execution errors to the model
    return {
      error: `Function execution failed: ${
        err instanceof Error ? err.message : "Unknown error"
      }`,
    };
  }
}

Always validate before execution

Even with strict mode enabled, validate arguments with Zod or a JSON schema validator before passing them to your function. Defense in depth prevents edge cases from reaching your business logic.

Return structured errors, not exceptions

When a function call fails, return the error as a structured result to the model. This lets the model self-correct, retry with different arguments, or explain the failure to the user. Throwing an unhandled exception crashes the loop.

Set maximum loop iterations

The while-true tool loop should have a maximum iteration count (typically 5-10 rounds) to prevent infinite loops where the model keeps calling tools without converging on an answer. This is especially important for agentic applications.
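The bound can be sketched as a provider-agnostic loop. The `call_model` and `execute_tools` callables are hypothetical stand-ins for the provider-specific calls shown earlier; here they are mocked so the control flow and the iteration guard are visible without a live API.

```python
MAX_ROUNDS = 8  # typical ceiling: 5-10 rounds

def run_tool_loop(call_model, execute_tools, user_message: str) -> str:
    """Bounded tool loop: returns the model's final text, or raises
    instead of spinning forever if the model never stops calling tools."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(MAX_ROUNDS):
        reply = call_model(messages)            # provider API call goes here
        messages.append(reply)
        if not reply.get("tool_calls"):         # no tool calls: final answer
            return reply["content"]
        messages.extend(execute_tools(reply["tool_calls"]))
    raise RuntimeError(f"Tool loop did not converge in {MAX_ROUNDS} rounds")

# Mock model: requests one tool call, then answers with the result
state = {"round": 0}

def fake_model(messages):
    if state["round"] == 0:
        state["round"] += 1
        return {"role": "assistant", "content": None,
                "tool_calls": [{"id": "c1", "name": "get_weather"}]}
    return {"role": "assistant", "content": "18.2 degrees in Tokyo"}

def fake_execute(tool_calls):
    return [{"role": "tool", "tool_call_id": c["id"], "content": "18.2"}
            for c in tool_calls]

answer = run_tool_loop(fake_model, fake_execute, "Weather in Tokyo?")
```

Raising after the ceiling is one choice; production systems may instead return a partial answer or an apology message to the user.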

Provider Selection: Accuracy, Latency, Cost

Choosing a provider for function-calling workloads involves three dimensions: tool-call accuracy (how reliably the model generates valid function calls), latency (time to first tool call in the response), and cost (per-token pricing at scale). The right choice depends on your application's priority ordering across these dimensions.

Anthropic (Claude Opus 4.6)

Highest reliability score (8.4), best for long-horizon autonomous agents

Higher latency on initial response, premium pricing. Excels when accuracy and reliability across multi-turn interactions matter more than speed.

OpenAI (GPT-5.4 Thinking)

Highest single-turn accuracy (98.7% TAU2), largest ecosystem

Thinking variant adds latency. Standard variant is faster but less reliable on complex tool chains. Best for teams already invested in the OpenAI ecosystem.

Google (Gemini 3.1 Pro)

Best cross-MCP coordination (69.2% MCP-Atlas), lowest latency tier with Flash-Lite

Flash-Lite is the cheapest option for high-throughput tool calls. Pro variant is competitive with frontier models. Best for teams using Google Cloud or needing multimodal tool inputs.

Conclusion

Function calling is the foundational capability that transforms language models from text generators into tool-using agents. The syntax differs across OpenAI, Anthropic, and Google, but the four-step pattern is universal: define tools, detect calls, execute functions, return results. The key differences that matter in practice are Anthropic's higher reliability scores for complex tool chains, OpenAI's strongest single-turn accuracy, Google's lowest-latency option with Flash-Lite, and the convergence toward MCP as a cross-provider standard.

For new projects starting today, the recommended approach is to build tool integrations using MCP servers that work across all providers, use an abstraction layer like the Vercel AI SDK to normalize the provider-specific differences, and implement schema validation with Zod at the execution boundary. This combination provides the flexibility to benchmark and swap providers as capabilities and pricing evolve, without rewriting your tool infrastructure. The MCP versus A2A versus ACP comparison for business leaders provides additional strategic context for teams deciding which protocol standards to invest in.

Ready to Build AI-Powered Integrations?

Function calling transforms what AI can do for your business. Our team helps companies design and implement tool-calling architectures that work across providers and scale with your needs.

Free consultation
Expert guidance
Tailored solutions
