AI Function Calling Guide: OpenAI, Anthropic, Google
Technical guide to AI function calling across OpenAI, Anthropic, and Google APIs. Syntax comparison, implementation patterns, and best practices.
Function calling is the mechanism that turns a language model from a text generator into a tool-using agent. Instead of generating a natural language answer to “What is the weather in Tokyo?”, the model generates a structured JSON call to a get_weather function with the argument {"location": "Tokyo"}. Your code executes the function, returns the result, and the model incorporates the real data into its response.
All three major providers — OpenAI, Anthropic, and Google — support this pattern, but each uses different API formats, response structures, and SDK conventions. This guide walks through the concrete syntax for each provider, provides working TypeScript and Python code, and identifies the practical differences that affect production systems. For teams building broader AI and digital transformation pipelines, understanding these differences is essential for choosing the right provider and designing tool integrations that remain maintainable as APIs evolve.
What Is Function Calling and Why It Matters
Function calling bridges the gap between a language model's general knowledge and your application's specific capabilities. Without it, the model can only produce text. With it, the model can query databases, call APIs, update records, trigger workflows, and interact with any system you expose through a function definition.
Without function calling, the model generates a text approximation: “The weather in Tokyo is probably around 15-20 degrees Celsius in April.” No real data, no API calls, and hallucination risk on factual questions.
With function calling, the model generates a structured call: get_weather({location: "Tokyo"}). Your code executes it against a real API, returns 18.2 degrees, and the model uses the actual data.
The core loop is the same across all providers. First, you define available functions using JSON schema that describes parameters, types, and descriptions. Second, you send a user message plus the tool definitions to the model. Third, the model returns either a regular text response or one or more function calls with structured arguments. Fourth, you execute the function, send the results back, and the model generates a final response incorporating the real data.
1. Define tools: JSON schema describing functions, parameters, and types.
2. Send request: user message plus tool definitions sent to the model API.
3. Detect calls: model returns function calls with structured JSON arguments.
4. Execute and return: run the function, send results back, and the model generates the final answer.
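The four steps above can be sketched end to end without any SDK. Everything here is illustrative: fake_model is a hypothetical stub standing in for a provider API, returning a tool call on the first turn and a final answer once it sees a tool result.

```python
import json

# Step 1: define tools as JSON schema
TOOLS = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

def fake_model(messages, tools):
    """Stand-in for a real provider API: requests a tool call,
    then answers once a tool result is present in the history."""
    if any(m["role"] == "tool" for m in messages):
        temp = json.loads(messages[-1]["content"])["temp"]
        return {"type": "text", "text": f"It is {temp} C in Tokyo."}
    return {"type": "tool_call", "name": "get_weather",
            "arguments": {"location": "Tokyo"}}

def get_weather(location):
    return {"temp": 18.2, "unit": "celsius", "location": location}

def run(user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = fake_model(messages, TOOLS)         # step 2: send request
        if reply["type"] == "text":                 # step 3: detect calls
            return reply["text"]
        result = get_weather(**reply["arguments"])  # step 4: execute
        messages.append({"role": "tool", "content": json.dumps(result)})

print(run("Weather in Tokyo?"))  # -> It is 18.2 C in Tokyo.
```

The real providers differ only in the message shapes they use at steps 2-4; the loop itself is identical.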
OpenAI: Tools Array and Responses API
OpenAI introduced function calling in June 2023 and has evolved the format through several iterations. The current API uses a tools parameter with objects of type "function"; the older functions and function_call parameters have been deprecated since late 2023. The Responses API now also supports tool namespaces for organizing large tool sets.
import OpenAI from "openai";

const client = new OpenAI();

const tools: OpenAI.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City and state, e.g. San Francisco, CA",
          },
          unit: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
          },
        },
        required: ["location"],
        additionalProperties: false,
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
  tools,
  tool_choice: "auto",
});

from openai import OpenAI
client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g. SF, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

When the model decides to call a function, the response includes a tool_calls array on the assistant message. Each tool call has an id, the function name, and a JSON string of arguments. You parse the arguments, execute your function, and return the result as a message with role "tool" and the matching tool_call_id.
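That exchange can be shown with plain dictionaries. The message below is a mock of the Chat Completions response shape, not a real API call; the key detail is that arguments arrives as a JSON string and must be parsed before execution.

```python
import json

# Mock of an assistant message containing a tool call, mirroring the
# tool_calls shape described above (arguments is a JSON *string*)
assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": '{"location": "Tokyo"}'},
    }],
}

def get_weather(location, unit="celsius"):
    return {"temp": 18.2, "unit": unit, "location": location}

tool_messages = []
for call in assistant_message["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # string -> dict
    result = get_weather(**args)
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],  # must match the call's id
        "content": json.dumps(result),
    })

print(tool_messages[0]["tool_call_id"])  # -> call_abc123
```

Each tool result message is appended to the conversation history before the next model call.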
Key detail: OpenAI's newer Responses API supports tool namespaces — grouping related tools under a namespace object with a name, description, and tools array. This is useful when exposing more than 10-15 tools, as it helps the model select the right tool without overloading the context.
Anthropic: Tool Use with input_schema
Anthropic calls this feature “tool use” and has a distinctly different response format from OpenAI. Claude uses a content-block architecture where tool calls and text appear as separate blocks within the assistant's response. This architectural decision means the model can interleave reasoning text with tool calls naturally, which is useful for agentic systems where the model needs to explain its decisions while acting. For a deeper look at Claude's advanced tool use and MCP integration, our dedicated guide covers the extended capabilities.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description: "Get current weather for a location",
    input_schema: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "City and state, e.g. San Francisco, CA",
        },
        unit: {
          type: "string",
          enum: ["celsius", "fahrenheit"],
        },
      },
      required: ["location"],
    },
  },
];

const response = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
});

import anthropic
client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. SF, CA",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                },
            },
            "required": ["location"],
        },
    }
]

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
)

The response from Claude contains a content array with blocks of type "text" and "tool_use". Each tool_use block has an id, the tool name, and an input object containing the parsed arguments — already a JavaScript object, not a JSON string as with OpenAI. You return results as a message with role "user" containing a tool_result content block.
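The content-block handling looks like this with mocked data (no API call; the block shapes mirror the tool_use and tool_result structures described above):

```python
# Mock of a Claude response: content is a list of typed blocks, and
# tool_use inputs arrive already parsed (no json.loads needed)
response_content = [
    {"type": "text", "text": "I'll check the weather."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"location": "Tokyo"}},
]

def get_weather(location, unit="celsius"):
    return {"temp": 18.2, "unit": unit, "location": location}

# Results go back as tool_result blocks inside a *user* message
tool_results = [
    {"type": "tool_result", "tool_use_id": block["id"],
     "content": str(get_weather(**block["input"]))}
    for block in response_content if block["type"] == "tool_use"
]
follow_up = {"role": "user", "content": tool_results}

print(follow_up["content"][0]["tool_use_id"])  # -> toolu_01
```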
Strict mode: add strict: true to tool definitions to ensure Claude's tool calls always match your schema exactly, preventing malformed arguments from reaching your execution layer.
Tool search: Anthropic also introduced tool search, where the model pulls tool definitions on demand rather than loading everything at once. This keeps context windows cleaner for tool sets exceeding 20-30 definitions.
Google: Function Declarations in Gemini
Google's Gemini API uses FunctionDeclaration objects wrapped in a Tool object. The format is structurally similar to OpenAI's approach but with different property names and nesting. As of March 2026, Gemini 3 model APIs generate a unique id for every function call, and passing the matching id in your functionResponse is now recommended for reliable multi-turn conversations.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_AI_KEY });

const getWeatherDeclaration = {
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "City and state, e.g. San Francisco, CA",
      },
      unit: {
        type: "string",
        enum: ["celsius", "fahrenheit"],
      },
    },
    required: ["location"],
  },
};

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: [{ role: "user", parts: [{ text: "Weather in Tokyo?" }] }],
  config: {
    tools: [{ functionDeclarations: [getWeatherDeclaration] }],
  },
});

from google import genai
import os

client = genai.Client(api_key=os.environ["GOOGLE_AI_KEY"])

get_weather = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and state, e.g. SF, CA",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
            },
        },
        "required": ["location"],
    },
}

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Weather in Tokyo?",
    config={
        "tools": [{"function_declarations": [get_weather]}],
    },
)

Gemini's response contains functionCall parts with the function name and arguments as a parsed object (like Anthropic, not a JSON string like OpenAI). You send results back as a functionResponse part. Google also supports automatic function execution via the automatic_function_calling configuration, where the SDK handles the call-execute-return loop for you — useful for prototyping but not recommended for production where you need control over execution and error handling.
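The functionCall-to-functionResponse round trip can be mocked with plain dictionaries (illustrative shapes only, no API call; the fc_001 id is invented for the example):

```python
# Mock of a Gemini candidate part: functionCall args are parsed objects
function_call_part = {
    "functionCall": {"id": "fc_001", "name": "get_weather",
                     "args": {"location": "Tokyo"}},
}

def get_weather(location, unit="celsius"):
    return {"temp": 18.2, "unit": unit, "location": location}

call = function_call_part["functionCall"]
result = get_weather(**call["args"])

# Results go back as a functionResponse part; echoing the call's id
# ties the response to the call in multi-turn conversations
function_response_part = {
    "functionResponse": {
        "id": call["id"],
        "name": call["name"],
        "response": {"result": result},
    },
}

print(function_response_part["functionResponse"]["id"])  # -> fc_001
```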
Side-by-Side Syntax Comparison
The following table maps the equivalent concepts across all three providers. Despite different naming conventions, the underlying structure is remarkably similar — each provider needs a function name, description, parameter schema, and a way to return results.
| Concept | OpenAI | Anthropic | Google |
|---|---|---|---|
| Tool definition wrapper | tools[] | tools[] | tools[].functionDeclarations[] |
| Parameter schema key | parameters | input_schema | parameters |
| Call in response | tool_calls[] | tool_use block | functionCall part |
| Arguments format | JSON string | Parsed object | Parsed object |
| Result return role | role: "tool" | tool_result in user msg | functionResponse part |
| Strict mode | strict: true | strict: true | allowed_function_names |
Migration note: The biggest practical difference when migrating between providers is the arguments format. OpenAI returns arguments as a JSON string that requires JSON.parse(), while Anthropic and Google return parsed objects. Failing to account for this is the most common source of migration bugs.
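A small normalization helper absorbs that difference: parse when the provider hands you a string, pass through when it is already an object.

```python
import json

def normalize_arguments(raw):
    """Accept tool-call arguments from any provider: OpenAI sends a
    JSON string, Anthropic and Google send an already-parsed object."""
    if isinstance(raw, str):
        return json.loads(raw)
    return dict(raw)

# OpenAI-style (string) and Anthropic/Gemini-style (dict) both work
assert normalize_arguments('{"location": "Tokyo"}') == {"location": "Tokyo"}
assert normalize_arguments({"location": "Tokyo"}) == {"location": "Tokyo"}
```

Routing every tool call through one helper like this keeps the parsing difference out of your business logic.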
Implementation Patterns: TypeScript and Python
The complete function-calling loop — send request, detect tool calls, execute functions, return results — follows the same structure in every language and provider. The following pattern shows a production-ready TypeScript implementation that handles the full multi-turn loop with proper error handling.
// Define your function implementations
const functions: Record<string, (args: unknown) => Promise<string>> = {
  get_weather: async (args) => {
    const { location } = args as { location: string };
    // Call your actual weather API here
    return JSON.stringify({ temp: 18.2, unit: "celsius", location });
  },
};

// The tool loop: keeps calling until the model stops requesting tools
async function runToolLoop(userMessage: string) {
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    { role: "user", content: userMessage },
  ];
  while (true) {
    const response = await client.chat.completions.create({
      model: "gpt-5.4",
      messages,
      tools,
    });
    const choice = response.choices[0];
    messages.push(choice.message);
    // If no tool calls, return the final text response
    if (!choice.message.tool_calls?.length) {
      return choice.message.content;
    }
    // Execute each tool call and add results
    for (const toolCall of choice.message.tool_calls) {
      const fn = functions[toolCall.function.name];
      const args = JSON.parse(toolCall.function.arguments);
      const result = await fn(args);
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: result,
      });
    }
  }
}

The equivalent Anthropic loop differs mainly in how tool calls are detected (content blocks with type tool_use) and how results are returned (as tool_result blocks in a user message). The logical structure is identical.
async function runToolLoop(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];
  while (true) {
    const response = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });
    // Add assistant response to history
    messages.push({ role: "assistant", content: response.content });
    // If model stopped naturally (not waiting for tool results)
    if (response.stop_reason !== "tool_use") {
      const textBlock = response.content.find((b) => b.type === "text");
      return textBlock?.text ?? "";
    }
    // Execute tool calls and build result blocks
    const toolResults = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const fn = functions[block.name];
        const result = await fn(block.input);
        toolResults.push({
          type: "tool_result" as const,
          tool_use_id: block.id,
          content: result,
        });
      }
    }
    messages.push({ role: "user", content: toolResults });
  }
}

Both implementations follow a while loop that continues until the model produces a response without requesting additional tool calls. This pattern handles chained tool calls naturally — the model can call one tool, use the result to decide which tool to call next, and repeat until it has enough information to answer the original question.
MCP: The Emerging Cross-Provider Standard
Model Context Protocol (MCP) is the most significant development in function calling since OpenAI introduced the feature in 2023. Originally launched by Anthropic in November 2024, MCP defines a standard protocol for connecting AI models to external tools and data sources. Both OpenAI and Google have adopted MCP, and OpenAI has deprecated its Assistants API in favor of the MCP-capable Responses API, with a mid-2026 sunset. For a comprehensive overview of how MCP fits alongside other agent protocols, see our AI agent protocol ecosystem map covering MCP, A2A, ACP, and UCP.
MCP tool servers expose tools through a standard protocol. Build the integration once and it works with Claude, GPT, Gemini, and any MCP-compatible client without per-provider adapter code.
MCP servers run as separate processes that expose tools, resources, and prompts. Clients (AI models) connect via stdio or HTTP. This separation means tool servers can be shared across applications and teams.
Models can discover available tools at runtime through MCP rather than receiving static definitions in every API call. This reduces context window consumption for applications with large tool sets.
The practical recommendation is straightforward: if you are building new tool integrations today, build them as MCP servers. The upfront investment is comparable to building a provider-specific integration, and the result works across all three major providers. For existing integrations using provider-specific formats, the migration priority depends on whether you need multi-provider support. Single-provider applications work fine with native tool definitions.
Error Handling and Schema Validation
Production tool-calling systems need robust error handling at three layers: schema validation (preventing malformed arguments from reaching your code), execution errors (handling failures in the functions themselves), and result formatting (returning errors in a way the model can understand and recover from).
import { z } from "zod";

// Define schemas for validation
const weatherSchema = z.object({
  location: z.string().min(1),
  unit: z.enum(["celsius", "fahrenheit"]).optional(),
});

async function executeToolCall(
  name: string,
  args: unknown
): Promise<{ result?: string; error?: string }> {
  try {
    // 1. Validate arguments against schema
    if (name === "get_weather") {
      const parsed = weatherSchema.safeParse(args);
      if (!parsed.success) {
        return {
          error: `Invalid arguments: ${parsed.error.message}`,
        };
      }
      // 2. Execute the function
      const data = await fetchWeather(parsed.data.location);
      return { result: JSON.stringify(data) };
    }
    return { error: `Unknown function: ${name}` };
  } catch (err) {
    // 3. Return execution errors to the model
    return {
      error: `Function execution failed: ${
        err instanceof Error ? err.message : "Unknown error"
      }`,
    };
  }
}

Always validate before execution
Even with strict mode enabled, validate arguments with Zod or a JSON schema validator before passing them to your function. Defense in depth prevents edge cases from reaching your business logic.
Return structured errors, not exceptions
When a function call fails, return the error as a structured result to the model. This lets the model self-correct, retry with different arguments, or explain the failure to the user. Throwing an unhandled exception crashes the loop.
Set maximum loop iterations
The while-true tool loop should have a maximum iteration count (typically 5-10 rounds) to prevent infinite loops where the model keeps calling tools without converging on an answer. This is especially important for agentic applications.
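A bounded loop can be sketched generically. This is an illustrative skeleton, not any provider's SDK: call_model and execute_tool are hypothetical callables you would wire to a real API, and the cap of 8 rounds is an arbitrary example value.

```python
MAX_TOOL_ROUNDS = 8  # illustrative cap; tune per application

class ToolLoopError(RuntimeError):
    pass

def run_tool_loop(call_model, execute_tool, user_message,
                  max_rounds=MAX_TOOL_ROUNDS):
    """Bounded tool loop: call_model returns either
    ("text", answer) or ("tool", name, args)."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_rounds):
        reply = call_model(messages)
        if reply[0] == "text":
            return reply[1]
        _, name, args = reply
        messages.append({"role": "tool",
                         "content": execute_tool(name, args)})
    raise ToolLoopError(f"no final answer after {max_rounds} tool rounds")

# A model stub that always requests another tool call never converges,
# so the guard raises instead of looping forever
try:
    run_tool_loop(lambda m: ("tool", "noop", {}),
                  lambda name, args: "{}", "hi")
except ToolLoopError as e:
    print(e)  # -> no final answer after 8 tool rounds
```

Raising a distinct exception type lets the caller decide whether to surface the failure to the user or retry with a fresh conversation.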
Provider Selection: Accuracy, Latency, Cost
Choosing a provider for function-calling workloads involves three dimensions: tool-call accuracy (how reliably the model generates valid function calls), latency (time to first tool call in the response), and cost (per-token pricing at scale). The right choice depends on your application's priority ordering across these dimensions.
Anthropic (Claude Opus 4.6)
Highest reliability score (8.4), best for long-horizon autonomous agents
Higher latency on initial response, premium pricing. Excels when accuracy and reliability across multi-turn interactions matter more than speed.
OpenAI (GPT-5.4 Thinking)
Highest single-turn accuracy (98.7% TAU2), largest ecosystem
Thinking variant adds latency. Standard variant is faster but less reliable on complex tool chains. Best for teams already invested in the OpenAI ecosystem.
Google (Gemini 3.1 Pro)
Best cross-MCP coordination (69.2% MCP-Atlas), lowest latency tier with Flash-Lite
Flash-Lite is the cheapest option for high-throughput tool calls. Pro variant is competitive with frontier models. Best for teams using Google Cloud or needing multimodal tool inputs.
Abstraction layer recommendation: Use the Vercel AI SDK or a similar abstraction to normalize tool definitions across providers. This lets you benchmark the same tool-calling workload across all three providers with a single codebase change, and swap providers in production without rewriting integration code. See our guide on building TypeScript AI agents with MCP servers for implementation details.
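The comparison table's mapping can be mechanized. The converters below are a hypothetical sketch (not the Vercel AI SDK's actual API): one neutral tool definition is rewritten into each provider's wrapper shape from the table.

```python
# Hypothetical neutral format: name, description, and a JSON schema
def to_openai(tool):
    return {"type": "function", "function": {
        "name": tool["name"], "description": tool["description"],
        "parameters": tool["schema"]}}

def to_anthropic(tool):
    return {"name": tool["name"], "description": tool["description"],
            "input_schema": tool["schema"]}

def to_gemini(tool):
    return {"functionDeclarations": [{
        "name": tool["name"], "description": tool["description"],
        "parameters": tool["schema"]}]}

weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "schema": {"type": "object",
               "properties": {"location": {"type": "string"}},
               "required": ["location"]},
}

# The same definition, three provider-specific shapes
assert to_openai(weather_tool)["function"]["parameters"]["required"] == ["location"]
assert to_anthropic(weather_tool)["input_schema"]["required"] == ["location"]
assert to_gemini(weather_tool)["functionDeclarations"][0]["name"] == "get_weather"
```

Because the schema itself is plain JSON schema in all three formats, only the wrapper differs, which is exactly what abstraction layers exploit.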
Conclusion
Function calling is the foundational capability that transforms language models from text generators into tool-using agents. The syntax differs across OpenAI, Anthropic, and Google, but the four-step pattern is universal: define tools, detect calls, execute functions, return results. The key differences that matter in practice are Anthropic's higher reliability scores for complex tool chains, OpenAI's strongest single-turn accuracy, Google's lowest-latency option with Flash-Lite, and the convergence toward MCP as a cross-provider standard.
For new projects starting today, the recommended approach is to build tool integrations using MCP servers that work across all providers, use an abstraction layer like the Vercel AI SDK to normalize the provider-specific differences, and implement schema validation with Zod at the execution boundary. This combination provides the flexibility to benchmark and swap providers as capabilities and pricing evolve, without rewriting your tool infrastructure. The MCP versus A2A versus ACP comparison for business leaders provides additional strategic context for teams deciding which protocol standards to invest in.
Ready to Build AI-Powered Integrations?
Function calling transforms what AI can do for your business. Our team helps companies design and implement tool-calling architectures that work across providers and scale with your needs.