LangChain Agent Tutorial: Build Production AI Agents with Python 2025
Master LangChain AI agents with the ReAct pattern. Learn how to build Python agents that autonomously reason, execute tools, and manage complex workflows from development through production deployment. This tutorial covers LangGraph implementation, tool calling, agent memory, and patterns drawn from 50+ production deployments, with complete code examples.
Key Takeaways
- ReAct Pattern Foundation: Agents iterate through Reasoning (analyze task) → Action (use tool) → Observation (process results) cycles, enabling autonomous problem-solving across multiple steps.
- LangGraph Over LangChain: LangGraph is now recommended for production agents, offering a stateful graph architecture, human-in-the-loop interrupts, and time-travel debugging.
- Tool Creation Essentials: Define tools with the @tool decorator, a clear name, a docstring, and type hints. Agents use docstrings to decide when and how to invoke tools autonomously.
- Memory with MemorySaver: MemorySaver persists conversation state across turns. Use SqliteSaver or PostgresSaver in production—MemorySaver is in-memory only and suited to tutorials.
- Production-Ready Architecture: Implement streaming responses, error handling, observability with LangSmith, and stateful workflows with checkpointing for enterprise-scale agent deployments.
Understanding the ReAct Pattern: The Agent Execution Loop
The ReAct (Reasoning and Acting) pattern is the foundation of modern LangChain agents. It combines chain-of-thought reasoning with the ability to take actions through tools, enabling AI systems to solve complex problems autonomously.
How ReAct Pattern Works
ReAct agents operate in iterative cycles, each consisting of three phases:
- Reasoning: The agent analyzes the user's request and current context to determine what action to take next. It reasons about which tools might help and what information is still needed.
- Action: Based on its reasoning, the agent selects and invokes a specific tool with appropriate arguments. This could be searching the web, querying a database, performing calculations, or calling an API.
- Observation: The agent receives and processes the tool's output, incorporating new information into its understanding. It then decides whether to take another action or provide a final answer.
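The cycle can be pictured as a simple loop. The sketch below is illustrative only, not LangChain's internals; react_loop, llm_decide_next_step, and the tools dictionary are hypothetical placeholders.
def llm_decide_next_step(history: list[str]) -> dict:
    # Stand-in for an LLM call: a real implementation would prompt the model with
    # the history and parse either a tool choice or a final answer from its reply.
    if any(line.startswith("Observation:") for line in history):
        return {"final_answer": "Answer assembled from the observations above."}
    return {"tool": "search", "args": {"query": history[0]}}
def react_loop(user_request: str, tools: dict, max_steps: int = 10) -> str:
    history = [f"Task: {user_request}"]
    for _ in range(max_steps):
        # Reasoning: inspect the history and propose the next step
        step = llm_decide_next_step(history)
        if "final_answer" in step:
            return step["final_answer"]
        # Action: invoke the chosen tool with the proposed arguments
        observation = tools[step["tool"]](**step["args"])
        # Observation: feed the result back into the next reasoning cycle
        history.append(f"Observation: {observation}")
    return "Stopped after reaching the step limit."
tools = {"search": lambda query: f"Top search results for: {query}"}
print(react_loop("What is the ReAct pattern?", tools))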
Why ReAct Pattern Matters
Traditional language models can only generate text based on their training data. ReAct agents extend this capability by enabling models to:
- Access real-time information through search engines, APIs, and databases
- Perform accurate computations using calculators and code execution environments
- Interact with external systems to retrieve or modify data autonomously
- Break down complex problems into multi-step workflows with explicit reasoning
According to LangChain's official documentation, agents built with the ReAct pattern demonstrate significantly better performance on complex queries compared to simple chain-based approaches, particularly when tasks require multiple information sources or computational steps.
LangGraph vs LangChain: Choosing the Right Framework
As of 2025, LangChain's development team recommends using LangGraph for all new agent implementations. While LangChain agents continue to be supported, LangGraph offers a more flexible and production-ready architecture for complex workflows.
Architecture Differences
| Feature | LangChain | LangGraph |
|---|---|---|
| Architecture | Linear, chain-based (DAGs) | Graph-based with loops |
| State Management | Pass through chain | First-class persistent state |
| Human-in-the-Loop | Limited support | Built-in interrupts and approvals |
| Debugging | Standard logging | Time-travel debugging |
| Best For | Simple, linear workflows | Complex, stateful multi-agent systems |
| Production Ready | Good for prototypes | Enterprise-scale deployments |
When to Use Each Framework
Use LangChain when:
- Building quick prototypes or proof-of-concept projects
- Working with simple, linear workflows (RAG pipelines, basic chatbots)
- Learning agent concepts before moving to production systems
- You don't need complex state management or backtracking
Use LangGraph when:
- Building production agents that require robust error handling
- You need persistent state across multiple conversation turns
- Implementing human-in-the-loop approval workflows or interrupts
- Working with multi-agent systems with branching logic
- You require streaming responses and real-time state updates
In its blog post "How to think about agent frameworks," LangChain positions LangGraph to power "the next wave of AI agent adoption" in 2025, with planned features around production monitoring, collaboration tools, and fine-grained control of agent behavior.
Building Tools for LangChain Agents: Tool Calling and Function Execution
Tools are the interface between your agent and the external world. LangChain provides a simple decorator-based approach for creating tools that agents can discover and use autonomously.
Creating Your First Tool
Tools in LangChain are created using the @tool decorator. The function's docstring becomes the tool's description that the agent uses to determine when to invoke it:
from langchain_core.tools import tool
from typing import Optional
@tool
def search_company_info(company_name: str, info_type: Optional[str] = "overview") -> str:
"""
Search for company information from various sources.
Args:
company_name: The name of the company to search for
info_type: Type of information to retrieve (overview, financials, news)
Returns:
Relevant company information based on the query
Use this tool when the user asks about:
- Company details, background, or general information
- Financial data, revenue, or market position
- Recent news or developments about a company
"""
# Implementation would connect to real data sources
if info_type == "financials":
return f"Financial data for {company_name}: Revenue $10M, Growth 25%"
elif info_type == "news":
return f"Latest news for {company_name}: Expanded to new markets"
else:
return f"{company_name} is a technology company founded in 2020"
@tool
def calculate_roi(investment: float, return_value: float, time_period: int) -> dict:
"""
Calculate Return on Investment (ROI) for financial analysis.
Args:
investment: Initial investment amount in dollars
return_value: Final value or returns in dollars
time_period: Investment period in years
Returns:
Dictionary with ROI percentage and annualized return
Use this tool when the user needs to:
- Calculate investment returns or profitability
- Compare different investment options
- Analyze financial performance metrics
"""
total_return = return_value - investment
roi_percentage = (total_return / investment) * 100
annualized_return = ((return_value / investment) ** (1 / time_period) - 1) * 100
return {
"roi_percentage": round(roi_percentage, 2),
"annualized_return": round(annualized_return, 2),
"total_return": round(total_return, 2)
}
Tool Design Best Practices
1. Clear, Descriptive Names
Tool names should be self-explanatory. Use verb-noun patterns like search_database, calculate_price, fetch_user_data. Avoid vague names like helper or utility.
2. Comprehensive Docstrings
The docstring is critical—it's how agents decide when to use your tool. Include: (1) What the tool does, (2) When to use it, (3) Parameter descriptions, (4) Return value format, (5) Example use cases.
3. Type Hints Required
Always include type hints for parameters and return values. LangChain uses these to validate tool calls and generate proper schemas for the agent.
4. Error Handling
Tools should handle errors gracefully and return informative error messages. The agent needs to understand what went wrong to decide whether to retry, use a different tool, or ask for clarification.
Advanced Tool Example: API Integration
from langchain_core.tools import tool
import requests
from typing import Dict, Optional
@tool
def fetch_weather_data(
city: str,
country_code: Optional[str] = None,
units: str = "metric"
) -> Dict:
"""
Fetch current weather data for a specified city.
Args:
city: Name of the city to get weather for
country_code: Optional 2-letter country code (e.g., 'US', 'GB')
units: Temperature units - 'metric' (Celsius) or 'imperial' (Fahrenheit)
Returns:
Dictionary containing temperature, conditions, humidity, and wind speed
Use this tool when the user asks about:
- Current weather conditions in a location
- Temperature or weather forecasts
- Climate information for trip planning
Examples:
- "What's the weather in London?"
- "Is it raining in Tokyo today?"
- "Temperature in New York?"
"""
try:
# Build API query
location = f"{city},{country_code}" if country_code else city
# This is a simplified example - use real API in production
response = requests.get(
"https://api.openweathermap.org/data/2.5/weather",
params={
"q": location,
"units": units,
"appid": "YOUR_API_KEY" # Use environment variable in production
},
timeout=5
)
if response.status_code != 200:
return {
"error": f"Unable to fetch weather data. Status: {response.status_code}",
"city": city
}
data = response.json()
return {
"city": data["name"],
"country": data["sys"]["country"],
"temperature": data["main"]["temp"],
"feels_like": data["main"]["feels_like"],
"conditions": data["weather"][0]["description"],
"humidity": data["main"]["humidity"],
"wind_speed": data["wind"]["speed"],
"units": units
}
except requests.RequestException as e:
return {
"error": f"Network error while fetching weather data: {str(e)}",
"city": city
}
except Exception as e:
return {
"error": f"Error processing weather data: {str(e)}",
"city": city
}
This weather tool demonstrates production-ready patterns: comprehensive error handling, timeout configuration, informative return values, and detailed docstrings that guide agent decision-making.
Agent Memory Management with MemorySaver and Checkpointing
Memory enables agents to maintain context across multiple conversation turns. LangGraph's checkpointing system, powered by MemorySaver and its production variants, provides persistent state management for conversational agents.
Understanding MemorySaver
MemorySaver is LangGraph's in-memory checkpointer designed for tutorials and development. It stores conversation state in RAM, making it fast but non-persistent across restarts. For production applications, use SqliteSaver or PostgresSaver instead.
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_openai import ChatOpenAI
# Initialize the language model
model = ChatOpenAI(model="gpt-4-turbo-preview")
# Define the state graph workflow
workflow = StateGraph(state_schema=MessagesState)
def call_model(state: MessagesState):
"""Process messages and generate response."""
response = model.invoke(state["messages"])
return {"messages": response}
# Add nodes and edges
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")
# Add memory checkpointing
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
# Use the agent with conversation threading
config = {"configurable": {"thread_id": "conversation-1"}}
# First turn
response1 = app.invoke(
{"messages": [{"role": "user", "content": "My name is Alice"}]},
config=config
)
# Second turn - agent remembers context
response2 = app.invoke(
{"messages": [{"role": "user", "content": "What's my name?"}]},
config=config
)
print(response2["messages"][-1].content)  # Output: "Your name is Alice"
Production Memory with SqliteSaver
For production deployments, SqliteSaver provides persistent storage that survives application restarts. It's ideal for single-server deployments or development environments.
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_openai import ChatOpenAI
import os
import sqlite3
# Initialize model
model = ChatOpenAI(model="gpt-4-turbo-preview")
# Define workflow
workflow = StateGraph(state_schema=MessagesState)
def call_model(state: MessagesState):
response = model.invoke(state["messages"])
return {"messages": response}
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")
# Create persistent SQLite checkpointer (from_conn_string is a context manager in
# recent langgraph releases, so build the saver from a sqlite3 connection directly)
db_path = os.path.join(os.getcwd(), "checkpoints.db")
memory = SqliteSaver(sqlite3.connect(db_path, check_same_thread=False))
app = workflow.compile(checkpointer=memory)
# Use with unique thread IDs per user/conversation
user_id = "user_123"
conversation_id = "conv_456"
thread_id = f"{user_id}:{conversation_id}"
config = {"configurable": {"thread_id": thread_id}}
# Conversation persists across application restarts
response = app.invoke(
{"messages": [{"role": "user", "content": "Remember this: Project deadline is Dec 15"}]},
config=config
)
# Days later, after server restart
response = app.invoke(
{"messages": [{"role": "user", "content": "When is the project deadline?"}]},
config=config
)
print(response["messages"][-1].content)  # Output: "December 15"
Enterprise Memory with PostgresSaver
For multi-server deployments and enterprise applications, PostgresSaver provides scalable, distributed memory storage with transaction support and backup capabilities.
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg import Connection
import os
# Connect to PostgreSQL database
db_url = os.getenv("DATABASE_URL")
connection = Connection.connect(db_url)
# Initialize PostgreSQL checkpointer
memory = PostgresSaver(connection)
# Create checkpoints table (run once during setup)
memory.setup()
# Use with your workflow
app = workflow.compile(checkpointer=memory)
# Same API as other checkpointers
config = {"configurable": {"thread_id": "production-thread-1"}}
response = app.invoke(
{"messages": [{"role": "user", "content": "Store this customer ID: CUS-2025-001"}]},
config=config
)
Best practices for production memory management:
- Thread ID Strategy: Use meaningful, hierarchical thread IDs like user:conversation:session for easier debugging
- State Size Management: Implement context window trimming for long conversations to prevent token limits
- Cleanup Policies: Set up automated cleanup for old checkpoints to manage storage costs
- Backup Strategy: Regularly backup checkpoint databases for production systems
Complete Agent Implementation with LangGraph
Let's build a production-ready ReAct agent using LangGraph with tools, memory, and proper error handling. This example demonstrates the recommended 2025 approach for building LangChain agents.
Full ReAct Agent Example
from typing import Annotated
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from langgraph.graph import MessagesState
# Define tools for the agent
@tool
def search_database(query: str) -> str:
"""
Search internal database for customer information.
Args:
query: Search query string
Use when user asks about customer data, orders, or account information.
"""
# Simulated database search
results = {
"query": query,
"results": [
{"customer": "John Doe", "orders": 5, "total": 2500},
{"customer": "Jane Smith", "orders": 3, "total": 1800}
]
}
return f"Found {len(results['results'])} customers matching '{query}'"
@tool
def calculate_metrics(metric_type: str, values: list[float]) -> dict:
"""
Calculate statistical metrics for business analysis.
Args:
metric_type: Type of metric (mean, median, sum, growth)
values: List of numerical values to analyze
Use for financial calculations, performance analysis, or data aggregation.
"""
if metric_type == "mean":
result = sum(values) / len(values)
return {"metric": "mean", "value": round(result, 2)}
elif metric_type == "sum":
result = sum(values)
return {"metric": "sum", "value": result}
elif metric_type == "growth":
if len(values) < 2:
return {"error": "Need at least 2 values for growth calculation"}
growth = ((values[-1] - values[0]) / values[0]) * 100
return {"metric": "growth_percentage", "value": round(growth, 2)}
else:
return {"error": f"Unknown metric type: {metric_type}"}
@tool
def send_notification(recipient: str, message: str, priority: str = "normal") -> dict:
"""
Send notifications to team members.
Args:
recipient: Email or username of recipient
message: Notification message content
priority: Priority level (normal, high, urgent)
Use when user requests to notify team members or send alerts.
"""
# Simulated notification service
return {
"status": "sent",
"recipient": recipient,
"message": message,
"priority": priority,
"timestamp": "2025-10-22T10:30:00Z"
}
# Initialize the language model
model = ChatOpenAI(
model="gpt-4-turbo-preview",
temperature=0 # Lower temperature for more consistent reasoning
)
# Combine all tools
tools = [search_database, calculate_metrics, send_notification]
# Create memory checkpointer
memory = MemorySaver()
# Create the ReAct agent
agent = create_react_agent(
model=model,
tools=tools,
checkpointer=memory,
state_modifier="You are a helpful business intelligence assistant. "
"Use the available tools to help users analyze data, "
"search information, and notify team members. "
"Always explain your reasoning before taking actions."
)
# Example usage: Multi-turn conversation
def run_agent_conversation():
# Create unique thread ID for this conversation
config = {"configurable": {"thread_id": "demo-thread-1"}}
# Turn 1: Search for data
print("=== Turn 1: Data Search ===")
response1 = agent.invoke(
{"messages": [{"role": "user", "content": "Search for customers named John"}]},
config=config
)
print(response1["messages"][-1].content)
# Turn 2: Calculate metrics (agent remembers context)
print("\n=== Turn 2: Metric Calculation ===")
response2 = agent.invoke(
{"messages": [{"role": "user", "content": "Calculate the average of these values: 100, 150, 200, 175"}]},
config=config
)
print(response2["messages"][-1].content)
# Turn 3: Send notification based on analysis
print("\n=== Turn 3: Notification ===")
response3 = agent.invoke(
{"messages": [{"role": "user", "content": "Send a high priority notification to manager@company.com about the average calculation"}]},
config=config
)
print(response3["messages"][-1].content)
# Turn 4: Reference earlier context
print("\n=== Turn 4: Context Recall ===")
response4 = agent.invoke(
{"messages": [{"role": "user", "content": "What was the customer search I asked about earlier?"}]},
config=config
)
print(response4["messages"][-1].content)
if __name__ == "__main__":
run_agent_conversation()
Agent Execution Flow Explained
When you invoke the agent, here's what happens internally:
1. Message intake: The agent receives the user message and loads conversation history from the checkpointer using the thread_id.
2. Reasoning: The model analyzes the request, reviews the available tools and their descriptions, and decides which tool(s) to use.
3. Tool execution: The agent calls the selected tool with appropriate arguments and receives the results.
4. Observation: The agent processes the tool results and decides whether to provide a final answer or take another action (repeating steps 2-4).
5. Checkpointing: The conversation state (messages, tool calls, results) is saved to the checkpointer for future turns.
This iterative process continues until the agent determines it has enough information to provide a final answer to the user's question.
Production-Ready Agent Deployment Patterns and Best Practices
Moving from development to production requires implementing robust patterns for error handling, observability, and scalability. Here are the essential production patterns for LangChain agents in 2025.
1. Streaming Responses
Stream agent responses for better user experience, especially for long-running operations:
# Enable streaming for real-time response updates
config = {"configurable": {"thread_id": "stream-demo"}}
async def stream_agent_response(user_message: str):
"""Stream agent responses token by token."""
async for event in agent.astream_events(
{"messages": [{"role": "user", "content": user_message}]},
config=config,
version="v2"
):
kind = event["event"]
if kind == "on_chat_model_stream":
content = event["data"]["chunk"].content
if content:
print(content, end="", flush=True)
elif kind == "on_tool_start":
print(f"\n[Using tool: {event['name']}]", flush=True)
elif kind == "on_tool_end":
print(f"[Tool completed]\n", flush=True)
# Usage
import asyncio
asyncio.run(stream_agent_response("Search for high-value customers"))
2. Error Handling & Retries
from tenacity import retry, stop_after_attempt, wait_exponential
from langchain_core.runnables import RunnableConfig
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=2, max=10)
)
def invoke_agent_with_retry(agent, messages, config):
"""Invoke agent with exponential backoff retry logic."""
try:
return agent.invoke(messages, config)
except Exception as e:
print(f"Agent invocation failed: {e}")
raise
# Use in production
try:
response = invoke_agent_with_retry(
agent,
{"messages": [{"role": "user", "content": "Process this request"}]},
config
)
except Exception as e:
# Log error and provide fallback response
print(f"Agent failed after retries: {e}")
response = {"messages": [{"role": "assistant", "content": "I encountered an error. Please try again later."}]}
3. Observability with LangSmith
LangSmith provides essential observability for production agents, including trajectory tracking, tool call monitoring, and performance analytics:
import os
from langsmith import Client
# Configure LangSmith (set these in environment variables)
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "production-agents"
# LangSmith automatically traces all agent operations
response = agent.invoke(
{"messages": [{"role": "user", "content": "Analyze Q4 sales"}]},
config=config
)
# Access trace data programmatically
client = Client()
runs = client.list_runs(
project_name="production-agents",
execution_order=1,
error=False
)
for run in runs:
print(f"Run: {run.name}")
print(f"Duration: {run.total_tokens} tokens")
print(f"Cost: ${run.total_cost}")
print(f"Tools used: {[tool.name for tool in run.child_runs]}")4. Rate Limiting & Resource Management
from functools import wraps
import time
from collections import defaultdict
from threading import Lock
class RateLimiter:
"""Simple in-memory rate limiter for agent API calls."""
def __init__(self, max_requests: int = 100, window_seconds: int = 60):
self.max_requests = max_requests
self.window_seconds = window_seconds
self.requests = defaultdict(list)
self.lock = Lock()
def allow_request(self, user_id: str) -> bool:
"""Check if request is allowed under rate limit."""
with self.lock:
now = time.time()
window_start = now - self.window_seconds
# Remove old requests outside window
self.requests[user_id] = [
req_time for req_time in self.requests[user_id]
if req_time > window_start
]
# Check limit
if len(self.requests[user_id]) >= self.max_requests:
return False
self.requests[user_id].append(now)
return True
# Apply rate limiting
rate_limiter = RateLimiter(max_requests=50, window_seconds=60)
def invoke_with_rate_limit(user_id: str, message: str):
"""Invoke agent with rate limiting by user."""
if not rate_limiter.allow_request(user_id):
return {
"error": "Rate limit exceeded. Please try again in a minute.",
"status": 429
}
config = {"configurable": {"thread_id": f"user-{user_id}"}}
return agent.invoke({"messages": [{"role": "user", "content": message}]}, config)
# Usage
response = invoke_with_rate_limit("user_123", "Search customer database")5. Human-in-the-Loop Workflows
LangGraph's interrupt feature enables human approval before critical operations:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END
from typing import TypedDict
class AgentState(TypedDict):
messages: list
pending_approval: bool
approved_action: str | None
workflow = StateGraph(AgentState)
def process_request(state: AgentState):
"""Process user request and determine if approval needed."""
last_message = state["messages"][-1]["content"]
# Check if action requires approval
requires_approval = any(
keyword in last_message.lower()
for keyword in ["delete", "purchase", "transfer", "modify"]
)
return {
"pending_approval": requires_approval,
"approved_action": last_message if requires_approval else None
}
def execute_action(state: AgentState):
"""Execute action after approval."""
if state.get("approved_action"):
# Execute the approved action
result = f"Executed: {state['approved_action']}"
return {"messages": [{"role": "assistant", "content": result}]}
return {"messages": [{"role": "assistant", "content": "Action cancelled"}]}
workflow.add_node("process", process_request)
workflow.add_node("execute", execute_action)
# Add conditional interrupt for approval
workflow.add_edge("process", "execute")
workflow.set_entry_point("process")
workflow.add_edge("execute", END)
memory = MemorySaver()
app = workflow.compile(
checkpointer=memory,
interrupt_before=["execute"] # Pause before execution
)
# Usage: Submit request
config = {"configurable": {"thread_id": "approval-demo"}}
state = app.invoke(
{"messages": [{"role": "user", "content": "Delete customer record ID 123"}]},
config=config
)
# At this point, execution pauses. Check pending approval:
if state.get("pending_approval"):
print(f"Action requires approval: {state['approved_action']}")
# After human review, either resume or cancel
user_approves = input("Approve? (yes/no): ").lower() == "yes"
if user_approves:
# Resume execution
final_state = app.invoke(None, config=config)
print(final_state["messages"][-1])
else:
# Cancel by updating state
app.update_state(config, {"approved_action": None})
print("Action cancelled by user")- ✓ Implement streaming for better UX
- ✓ Add retry logic with exponential backoff
- ✓ Enable LangSmith tracing for observability
- ✓ Set up rate limiting per user/tenant
- ✓ Configure human-in-the-loop for sensitive actions
- ✓ Use PostgresSaver for distributed deployments
- ✓ Implement comprehensive error handling
- ✓ Monitor costs and token usage
- ✓ Set up automated cleanup for old checkpoints
Advanced Agent Techniques
Once you've mastered basic agents, these advanced techniques enable more sophisticated behaviors and better performance at scale.
Multi-Agent Orchestration
Complex tasks often benefit from multiple specialized agents working together. LangGraph supports three main orchestration patterns:
- Supervisor: A supervisor agent routes tasks to specialized worker agents (research agent, code agent, writing agent) and synthesizes results. Best for tasks requiring different expertise domains. A minimal sketch follows this list.
- Network: Multiple agents communicate peer-to-peer, sharing information and collaborating on subtasks autonomously. Best for parallel processing and distributed problem-solving.
- Sequential: Agents execute in a defined sequence, each processing the output of the previous agent (pipeline architecture). Best for multi-stage workflows with clear dependencies.
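As a concrete illustration of the supervisor pattern, here is a minimal sketch. The routing heuristic and worker bodies are placeholders: in practice the supervisor node would usually be an LLM call, and the workers would be compiled ReAct agents like the one built earlier.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
class SupervisorState(TypedDict):
    messages: list
    next: str
def supervisor(state: SupervisorState):
    # Placeholder heuristic: route research-style requests to the research worker
    text = state["messages"][-1]["content"].lower()
    return {"next": "research" if "search" in text or "find" in text else "write"}
def research_worker(state: SupervisorState):
    reply = {"role": "assistant", "content": "Research findings go here."}
    return {"messages": state["messages"] + [reply]}
def writing_worker(state: SupervisorState):
    reply = {"role": "assistant", "content": "Drafted text goes here."}
    return {"messages": state["messages"] + [reply]}
graph = StateGraph(SupervisorState)
graph.add_node("supervisor", supervisor)
graph.add_node("research", research_worker)
graph.add_node("write", writing_worker)
graph.add_edge(START, "supervisor")
graph.add_conditional_edges("supervisor", lambda s: s["next"], {"research": "research", "write": "write"})
graph.add_edge("research", END)
graph.add_edge("write", END)
supervisor_app = graph.compile()
result = supervisor_app.invoke({"messages": [{"role": "user", "content": "Find recent AI funding news"}], "next": ""})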
Context Window Management
Long conversations can exceed model context limits. Implement intelligent truncation:
from langchain_core.messages import trim_messages
def create_agent_with_context_management(max_tokens: int = 4000):
"""Create agent with automatic context window management."""
# Create trimmer that keeps system message and recent history
trimmer = trim_messages(
max_tokens=max_tokens,
strategy="last", # Keep most recent messages
token_counter=len, # counts messages rather than true tokens; use a model-aware counter in production
include_system=True, # Always keep system message
allow_partial=False # Don't split messages
)
# Modify agent to use trimmer
workflow = StateGraph(state_schema=MessagesState)
def call_model_with_trimming(state: MessagesState):
# Trim messages before invoking model
trimmed_messages = trimmer.invoke(state["messages"])
response = model.invoke(trimmed_messages)
return {"messages": [response]}
workflow.add_node("model", call_model_with_trimming)
workflow.add_edge(START, "model")
return workflow.compile(checkpointer=memory)
# Agent now handles long conversations gracefully
agent = create_agent_with_context_management(max_tokens=4000)
Custom Agent Reasoning
Override default ReAct prompting for domain-specific reasoning patterns:
# Custom system prompt for financial analysis agent
financial_agent_prompt = """You are a financial analysis agent specialized in investment research.
When analyzing requests:
1. REASONING: Always start by breaking down the financial question
2. DATA GATHERING: Identify what data sources you need
3. CALCULATION: Use tools to retrieve and calculate metrics
4. RISK ASSESSMENT: Consider potential risks and limitations
5. RECOMMENDATION: Provide actionable insights with confidence levels
Available tools:
- fetch_stock_data: Get historical stock prices
- calculate_metrics: Compute financial ratios
- search_financial_news: Find relevant market news
Always cite data sources and explain your analytical reasoning."""
agent = create_react_agent(
model=model,
tools=tools,
checkpointer=memory,
state_modifier=financial_agent_prompt # Custom reasoning instructions
)
Tool Result Validation
Add validation layers to ensure tool outputs are safe and expected before agents process them:
from pydantic import BaseModel, Field, ValidationError
from typing import Any
from functools import wraps
import requests
class ToolResultValidator(BaseModel):
"""Validate tool outputs before agent processes them."""
success: bool = Field(description="Whether tool executed successfully")
data: Any = Field(description="Tool output data")
error: str | None = Field(default=None, description="Error message if failed")
class Config:
extra = "forbid" # Reject unexpected fields
def validated_tool(func):
"""Decorator to add validation to tool results."""
@wraps(func)
def wrapper(*args, **kwargs):
try:
result = func(*args, **kwargs)
# Validate result structure
validated = ToolResultValidator(
success=True,
data=result,
error=None
)
return validated.model_dump()
except ValidationError as e:
return {
"success": False,
"data": None,
"error": f"Tool output validation failed: {str(e)}"
}
except Exception as e:
return {
"success": False,
"data": None,
"error": f"Tool execution failed: {str(e)}"
}
return wrapper
@tool
@validated_tool
def fetch_api_data(endpoint: str) -> dict:
"""Fetch data from external API with validation."""
response = requests.get(endpoint)
response.raise_for_status()
return response.json()
# Agent receives validated, structured tool results
These advanced techniques transform basic agents into robust, production-ready systems capable of handling complex enterprise workflows.
Frequently Asked Questions
Should I use LangChain or LangGraph for new projects in 2025?
Use LangGraph for new projects. While LangChain agents continue to be supported, LangGraph is recommended by the LangChain team for production deployments. It offers superior state management, human-in-the-loop workflows, and debugging capabilities essential for enterprise applications.
What's the difference between MemorySaver, SqliteSaver, and PostgresSaver?
MemorySaver stores conversation state in RAM—fast but non-persistent across restarts, ideal for tutorials. SqliteSaver persists to disk, suitable for single-server deployments. PostgresSaver offers distributed storage with transaction support, required for multi-server production deployments. Choose based on your persistence and scale needs.
How do I prevent agents from infinite loops?
Set maximum iteration limits in your agent configuration. Use recursion_limit in LangGraph to cap the number of reasoning cycles. Implement timeout logic at the application level. Monitor agent trajectories with LangSmith to identify and debug looping behavior. Design tools with clear success criteria to help agents know when to stop.
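For example, LangGraph reads the recursion limit from the run config. A short sketch, assuming the agent built earlier in this tutorial (the thread ID is illustrative):
from langgraph.errors import GraphRecursionError
# Cap the number of graph steps per invocation; LangGraph raises GraphRecursionError when the limit is exceeded
config = {"recursion_limit": 25, "configurable": {"thread_id": "loop-guard-demo"}}
try:
    response = agent.invoke(
        {"messages": [{"role": "user", "content": "Analyze this quarter's numbers"}]},
        config=config,
    )
except GraphRecursionError:
    response = {"messages": [{"role": "assistant", "content": "I couldn't finish within the step limit."}]}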
What's the recommended way to handle tool errors?
Tools should return structured error messages that agents can understand, not raise exceptions. Use try-except blocks inside tools and return dictionaries with "error" fields. This allows agents to reason about failures and try alternative approaches. Implement retry logic for transient errors at the tool level, not agent level.
How do I monitor agent performance and costs?
Enable LangSmith tracing by setting LANGCHAIN_TRACING_V2=true. LangSmith automatically tracks token usage, tool calls, latency, and costs per conversation. Use the LangSmith dashboard to identify expensive patterns, optimize tool usage, and set budget alerts. Export traces programmatically for custom analytics.
Can I use multiple LLM models in one agent?
Yes. Use faster, cheaper models (GPT-4o-mini, Gemini 1.5 Flash) for simple tool-calling decisions and reserve premium models (GPT-5 Pro, Claude Opus 4) for complex reasoning steps. Implement model routing in your call_model function based on task complexity or conversation state. This can reduce costs by 60-80% while maintaining quality.
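A minimal routing sketch, assuming the MessagesState workflow from earlier sections; the length/keyword heuristic below is a placeholder for whatever signal (classifier, state flag, task type) fits your application:
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState
cheap_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
premium_model = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0)
def call_model(state: MessagesState):
    # Placeholder heuristic: long or analysis-heavy requests go to the premium model
    last = state["messages"][-1].content
    needs_deep_reasoning = len(last) > 500 or "analyze" in last.lower()
    model = premium_model if needs_deep_reasoning else cheap_model
    response = model.invoke(state["messages"])
    return {"messages": [response]}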
How do I migrate from LangChain agents to LangGraph?
Replace the legacy create_react_agent from langchain.agents with create_react_agent from langgraph.prebuilt. The APIs are similar: the LangGraph version takes the model and tools (plus an optional checkpointer) and returns a compiled graph instead of a chain. Update your imports and you're mostly done; existing tools work without modification.
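A minimal before/after sketch of the swap, reusing the model, tools, and memory objects defined earlier and assuming the legacy agent used the standard ReAct prompt from the LangChain Hub (the import alias just avoids a name clash):
# Before: legacy LangChain ReAct agent wrapped in an AgentExecutor
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent as create_legacy_react_agent
react_prompt = hub.pull("hwchase17/react")
legacy_agent = create_legacy_react_agent(model, tools, react_prompt)
executor = AgentExecutor(agent=legacy_agent, tools=tools)
# After: LangGraph prebuilt ReAct agent, returned as a compiled graph
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(model=model, tools=tools, checkpointer=memory)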
What's the maximum number of tools an agent can handle effectively?
Most models handle 10-20 tools effectively. Beyond that, tool selection accuracy degrades. For larger tool collections, implement tool retrieval: embed tool descriptions in a vector database and dynamically select relevant tools based on user queries. This enables agents to work with hundreds of tools by only loading relevant ones per request.
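One way to implement that retrieval, sketched under the assumption of OpenAI embeddings and LangChain's in-memory vector store (all_tools stands in for your full tool list):
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings
def build_tool_index(all_tools):
    # Index each tool's description once, keyed by tool name
    docs = [Document(page_content=t.description, metadata={"name": t.name}) for t in all_tools]
    store = InMemoryVectorStore(OpenAIEmbeddings())
    store.add_documents(docs)
    return store
def select_tools(store, all_tools, query: str, k: int = 5):
    # Pick the k tools whose descriptions best match the user query
    hits = store.similarity_search(query, k=k)
    selected = {hit.metadata["name"] for hit in hits}
    return [t for t in all_tools if t.name in selected]
# Per request, bind only the relevant tools:
# relevant = select_tools(store, all_tools, user_query)
# agent = create_react_agent(model=model, tools=relevant, checkpointer=memory)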
Ready to Build Production AI Agents?
You now have a comprehensive understanding of LangChain AI agents, from the ReAct pattern fundamentals to production deployment strategies. Start building with LangGraph for maximum flexibility and production readiness.
Need expert help implementing AI agents for your business? Digital Applied specializes in production AI systems and can accelerate your development timeline with proven patterns and best practices.
Related Articles
Master Claude Agent Skills: organized instructions, dynamic loading, domain-specific agents. Complete guide with code examples and production patterns.
Complete OpenAI AgentKit tutorial: Learn step-by-step how to build AI agents with drag-and-drop builder, ChatKit, and MCP connectors. Includes pricing, examples, and deployment guide for 2025.
Master the entire Qwen3 model family - flagship Max-Preview, Coder-480B, Thinking models, and deployment strategies for every use case.