LangChain Agent Tutorial: Build Production AI Agents with Python 2025
Master LangChain AI agents with the ReAct pattern. Learn how to build Python agents that autonomously reason, execute tools, and manage complex workflows from development through production deployment. This tutorial covers LangGraph implementation, tool calling, agent memory, and patterns drawn from 50+ production deployments, with complete code examples.
Key Takeaways
- ReAct Pattern Foundation: Agents iterate through Reasoning (analyze task) → Action (use tool) → Observation (process results) cycles, enabling autonomous problem-solving across multiple steps.
- LangGraph Over LangChain: LangGraph is now recommended for production agents, offering a stateful graph architecture, human-in-the-loop interrupts, and time-travel debugging.
- Tool Creation Essentials: Define tools with the @tool decorator, a clear name, a docstring, and type hints. Agents use docstrings to decide when and how to invoke tools autonomously.
- Memory with MemorySaver: MemorySaver persists conversation state across turns. Use SqliteSaver or PostgresSaver in production—MemorySaver is in-memory only and suited to tutorials.
- Production-Ready Architecture: Implement streaming responses, error handling, observability with LangSmith, and stateful workflows with checkpointing for enterprise-scale agent deployments.
Understanding the ReAct Pattern: The Agent Execution Loop
The ReAct (Reasoning and Acting) pattern is the foundation of modern LangChain agents. It combines chain-of-thought reasoning with the ability to take actions through tools, enabling AI systems to solve complex problems autonomously.
How ReAct Pattern Works
ReAct agents operate in iterative cycles, each consisting of three phases:
- Reasoning: The agent analyzes the user's request and current context to determine what action to take next. It reasons about which tools might help and what information is still needed.
- Action: Based on its reasoning, the agent selects and invokes a specific tool with appropriate arguments. This could be searching the web, querying a database, performing calculations, or calling an API.
- Observation: The agent receives and processes the tool's output, incorporating new information into its understanding. It then decides whether to take another action or provide a final answer.
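The cycle can be pictured as a simple loop. The sketch below is illustrative only, not LangChain's internals; react_loop, llm_decide_next_step, and the tools dictionary are hypothetical placeholders.
def llm_decide_next_step(history: list[str]) -> dict:
    # Stand-in for an LLM call: a real implementation would prompt the model with
    # the history and parse either a tool choice or a final answer from its reply.
    if any(line.startswith("Observation:") for line in history):
        return {"final_answer": "Answer assembled from the observations above."}
    return {"tool": "search", "args": {"query": history[0]}}
def react_loop(user_request: str, tools: dict, max_steps: int = 10) -> str:
    history = [f"Task: {user_request}"]
    for _ in range(max_steps):
        # Reasoning: inspect the history and propose the next step
        step = llm_decide_next_step(history)
        if "final_answer" in step:
            return step["final_answer"]
        # Action: invoke the chosen tool with the proposed arguments
        observation = tools[step["tool"]](**step["args"])
        # Observation: feed the result back into the next reasoning cycle
        history.append(f"Observation: {observation}")
    return "Stopped after reaching the step limit."
tools = {"search": lambda query: f"Top search results for: {query}"}
print(react_loop("What is the ReAct pattern?", tools))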
Why ReAct Pattern Matters
Traditional language models can only generate text based on their training data. ReAct agents extend this capability by enabling models to:
- Access real-time information through search engines, APIs, and databases
- Perform accurate computations using calculators and code execution environments
- Interact with external systems to retrieve or modify data autonomously
- Break down complex problems into multi-step workflows with explicit reasoning
According to LangChain's official documentation, agents built with the ReAct pattern demonstrate significantly better performance on complex queries compared to simple chain-based approaches, particularly when tasks require multiple information sources or computational steps.
LangGraph vs LangChain: Choosing the Right Framework
As of 2025, LangChain's development team recommends using LangGraph for all new agent implementations. While LangChain agents continue to be supported, LangGraph offers a more flexible and production-ready architecture for complex workflows.
Architecture Differences
| Feature | LangChain | LangGraph |
|---|---|---|
| Architecture | Linear, chain-based (DAGs) | Graph-based with loops |
| State Management | Pass through chain | First-class persistent state |
| Human-in-the-Loop | Limited support | Built-in interrupts and approvals |
| Debugging | Standard logging | Time-travel debugging |
| Best For | Simple, linear workflows | Complex, stateful multi-agent systems |
| Production Ready | Good for prototypes | Enterprise-scale deployments |
When to Use Each Framework
Use LangChain when:
- Building quick prototypes or proof-of-concept projects
- Working with simple, linear workflows (RAG pipelines, basic chatbots)
- Learning agent concepts before moving to production systems
- You don't need complex state management or backtracking
Use LangGraph when:
- Building production agents that require robust error handling
- You need persistent state across multiple conversation turns
- Implementing human-in-the-loop approval workflows or interrupts
- Working with multi-agent systems with branching logic
- You require streaming responses and real-time state updates
In its blog post "How to think about agent frameworks," LangChain positions LangGraph to power "the next wave of AI agent adoption" in 2025, with planned features around production monitoring, collaboration tools, and fine-grained control of agent behavior.
Building Tools for LangChain Agents: Tool Calling and Function Execution
Tools are the interface between your agent and the external world. LangChain provides a simple decorator-based approach for creating tools that agents can discover and use autonomously.
Creating Your First Tool
Tools in LangChain are created using the @tool decorator. The function's docstring becomes the tool's description that the agent uses to determine when to invoke it:
from langchain_core.tools import tool
from typing import Optional
@tool
def search_company_info(company_name: str, info_type: Optional[str] = "overview") -> str:
"""
Search for company information from various sources.
Args:
company_name: The name of the company to search for
info_type: Type of information to retrieve (overview, financials, news)
Returns:
Relevant company information based on the query
Use this tool when the user asks about:
- Company details, background, or general information
- Financial data, revenue, or market position
- Recent news or developments about a company
"""
# Implementation would connect to real data sources
if info_type == "financials":
return f"Financial data for {company_name}: Revenue $10M, Growth 25%"
elif info_type == "news":
return f"Latest news for {company_name}: Expanded to new markets"
else:
return f"{company_name} is a technology company founded in 2020"
@tool
def calculate_roi(investment: float, return_value: float, time_period: int) -> dict:
"""
Calculate Return on Investment (ROI) for financial analysis.
Args:
investment: Initial investment amount in dollars
return_value: Final value or returns in dollars
time_period: Investment period in years
Returns:
Dictionary with ROI percentage and annualized return
Use this tool when the user needs to:
- Calculate investment returns or profitability
- Compare different investment options
- Analyze financial performance metrics
"""
total_return = return_value - investment
roi_percentage = (total_return / investment) * 100
annualized_return = ((return_value / investment) ** (1 / time_period) - 1) * 100
return {
"roi_percentage": round(roi_percentage, 2),
"annualized_return": round(annualized_return, 2),
"total_return": round(total_return, 2)
}
Tool Design Best Practices
1. Clear, Descriptive Names
Tool names should be self-explanatory. Use verb-noun patterns like search_database, calculate_price, fetch_user_data. Avoid vague names like helper or utility.
2. Comprehensive Docstrings
The docstring is critical—it's how agents decide when to use your tool. Include: (1) What the tool does, (2) When to use it, (3) Parameter descriptions, (4) Return value format, (5) Example use cases.
3. Type Hints Required
Always include type hints for parameters and return values. LangChain uses these to validate tool calls and generate proper schemas for the agent.
4. Error Handling
Tools should handle errors gracefully and return informative error messages. The agent needs to understand what went wrong to decide whether to retry, use a different tool, or ask for clarification.
Advanced Tool Example: API Integration
from langchain_core.tools import tool
import requests
from typing import Dict, Optional
@tool
def fetch_weather_data(
city: str,
country_code: Optional[str] = None,
units: str = "metric"
) -> Dict:
"""
Fetch current weather data for a specified city.
Args:
city: Name of the city to get weather for
country_code: Optional 2-letter country code (e.g., 'US', 'GB')
units: Temperature units - 'metric' (Celsius) or 'imperial' (Fahrenheit)
Returns:
Dictionary containing temperature, conditions, humidity, and wind speed
Use this tool when the user asks about:
- Current weather conditions in a location
- Temperature or weather forecasts
- Climate information for trip planning
Examples:
- "What's the weather in London?"
- "Is it raining in Tokyo today?"
- "Temperature in New York?"
"""
try:
# Build API query
location = f"{city},{country_code}" if country_code else city
# This is a simplified example - use real API in production
response = requests.get(
"https://api.openweathermap.org/data/2.5/weather",
params={
"q": location,
"units": units,
"appid": "YOUR_API_KEY" # Use environment variable in production
},
timeout=5
)
if response.status_code != 200:
return {
"error": f"Unable to fetch weather data. Status: {response.status_code}",
"city": city
}
data = response.json()
return {
"city": data["name"],
"country": data["sys"]["country"],
"temperature": data["main"]["temp"],
"feels_like": data["main"]["feels_like"],
"conditions": data["weather"][0]["description"],
"humidity": data["main"]["humidity"],
"wind_speed": data["wind"]["speed"],
"units": units
}
except requests.RequestException as e:
return {
"error": f"Network error while fetching weather data: {str(e)}",
"city": city
}
except Exception as e:
return {
"error": f"Error processing weather data: {str(e)}",
"city": city
}
This weather tool demonstrates production-ready patterns: comprehensive error handling, timeout configuration, informative return values, and detailed docstrings that guide agent decision-making.
Agent Memory Management with MemorySaver and Checkpointing
Memory enables agents to maintain context across multiple conversation turns. LangGraph's checkpointing system, powered by MemorySaver and its production variants, provides persistent state management for conversational agents.
Understanding MemorySaver
MemorySaver is LangGraph's in-memory checkpointer designed for tutorials and development. It stores conversation state in RAM, making it fast but non-persistent across restarts. For production applications, use SqliteSaver or PostgresSaver instead.
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_openai import ChatOpenAI
# Initialize the language model
model = ChatOpenAI(model="gpt-4-turbo-preview")
# Define the state graph workflow
workflow = StateGraph(state_schema=MessagesState)
def call_model(state: MessagesState):
"""Process messages and generate response."""
response = model.invoke(state["messages"])
return {"messages": response}
# Add nodes and edges
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")
# Add memory checkpointing
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
# Use the agent with conversation threading
config = {"configurable": {"thread_id": "conversation-1"}}
# First turn
response1 = app.invoke(
{"messages": [{"role": "user", "content": "My name is Alice"}]},
config=config
)
# Second turn - agent remembers context
response2 = app.invoke(
{"messages": [{"role": "user", "content": "What's my name?"}]},
config=config
)
print(response2["messages"][-1].content)  # Output: "Your name is Alice"
Production Memory with SqliteSaver
For production deployments, SqliteSaver provides persistent storage that survives application restarts. It's ideal for single-server deployments or development environments.
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_openai import ChatOpenAI
import os
import sqlite3
# Initialize model
model = ChatOpenAI(model="gpt-4-turbo-preview")
# Define workflow
workflow = StateGraph(state_schema=MessagesState)
def call_model(state: MessagesState):
response = model.invoke(state["messages"])
return {"messages": response}
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")
# Create persistent SQLite checkpointer (from_conn_string is a context manager in
# recent langgraph releases, so build the saver from a sqlite3 connection directly)
db_path = os.path.join(os.getcwd(), "checkpoints.db")
memory = SqliteSaver(sqlite3.connect(db_path, check_same_thread=False))
app = workflow.compile(checkpointer=memory)
# Use with unique thread IDs per user/conversation
user_id = "user_123"
conversation_id = "conv_456"
thread_id = f"{user_id}:{conversation_id}"
config = {"configurable": {"thread_id": thread_id}}
# Conversation persists across application restarts
response = app.invoke(
{"messages": [{"role": "user", "content": "Remember this: Project deadline is Dec 15"}]},
config=config
)
# Days later, after server restart
response = app.invoke(
{"messages": [{"role": "user", "content": "When is the project deadline?"}]},
config=config
)
print(response["messages"][-1].content)  # Output: "December 15"
Enterprise Memory with PostgresSaver
For multi-server deployments and enterprise applications, PostgresSaver provides scalable, distributed memory storage with transaction support and backup capabilities.
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg import Connection
import os
# Connect to PostgreSQL database
db_url = os.getenv("DATABASE_URL")
connection = Connection.connect(db_url)
# Initialize PostgreSQL checkpointer
memory = PostgresSaver(connection)
# Create checkpoints table (run once during setup)
memory.setup()
# Use with your workflow
app = workflow.compile(checkpointer=memory)
# Same API as other checkpointers
config = {"configurable": {"thread_id": "production-thread-1"}}
response = app.invoke(
{"messages": [{"role": "user", "content": "Store this customer ID: CUS-2025-001"}]},
config=config
)
Best practices for production memory management:
- Thread ID Strategy: Use meaningful, hierarchical thread IDs like user:conversation:session for easier debugging
- State Size Management: Implement context window trimming for long conversations to prevent token limits
- Cleanup Policies: Set up automated cleanup for old checkpoints to manage storage costs
- Backup Strategy: Regularly backup checkpoint databases for production systems
Complete Agent Implementation with LangGraph
Let's build a production-ready ReAct agent using LangGraph with tools, memory, and proper error handling. This example demonstrates the recommended 2025 approach for building LangChain agents.
Full ReAct Agent Example
from typing import Annotated
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from langgraph.graph import MessagesState
# Define tools for the agent
@tool
def search_database(query: str) -> str:
"""
Search internal database for customer information.
Args:
query: Search query string
Use when user asks about customer data, orders, or account information.
"""
# Simulated database search
results = {
"query": query,
"results": [
{"customer": "John Doe", "orders": 5, "total": 2500},
{"customer": "Jane Smith", "orders": 3, "total": 1800}
]
}
return f"Found {len(results['results'])} customers matching '{query}'"
@tool
def calculate_metrics(metric_type: str, values: list[float]) -> dict:
"""
Calculate statistical metrics for business analysis.
Args:
metric_type: Type of metric (mean, median, sum, growth)
values: List of numerical values to analyze
Use for financial calculations, performance analysis, or data aggregation.
"""
if metric_type == "mean":
result = sum(values) / len(values)
return {"metric": "mean", "value": round(result, 2)}
elif metric_type == "sum":
result = sum(values)
return {"metric": "sum", "value": result}
elif metric_type == "growth":
if len(values) < 2:
return {"error": "Need at least 2 values for growth calculation"}
growth = ((values[-1] - values[0]) / values[0]) * 100
return {"metric": "growth_percentage", "value": round(growth, 2)}
else:
return {"error": f"Unknown metric type: {metric_type}"}
@tool
def send_notification(recipient: str, message: str, priority: str = "normal") -> dict:
"""
Send notifications to team members.
Args:
recipient: Email or username of recipient
message: Notification message content
priority: Priority level (normal, high, urgent)
Use when user requests to notify team members or send alerts.
"""
# Simulated notification service
return {
"status": "sent",
"recipient": recipient,
"message": message,
"priority": priority,
"timestamp": "2025-10-22T10:30:00Z"
}
# Initialize the language model
model = ChatOpenAI(
model="gpt-4-turbo-preview",
temperature=0 # Lower temperature for more consistent reasoning
)
# Combine all tools
tools = [search_database, calculate_metrics, send_notification]
# Create memory checkpointer
memory = MemorySaver()
# Create the ReAct agent
agent = create_react_agent(
model=model,
tools=tools,
checkpointer=memory,
state_modifier="You are a helpful business intelligence assistant. "
"Use the available tools to help users analyze data, "
"search information, and notify team members. "
"Always explain your reasoning before taking actions."
)
# Example usage: Multi-turn conversation
def run_agent_conversation():
# Create unique thread ID for this conversation
config = {"configurable": {"thread_id": "demo-thread-1"}}
# Turn 1: Search for data
print("=== Turn 1: Data Search ===")
response1 = agent.invoke(
{"messages": [{"role": "user", "content": "Search for customers named John"}]},
config=config
)
print(response1["messages"][-1].content)
# Turn 2: Calculate metrics (agent remembers context)
print("\n=== Turn 2: Metric Calculation ===")
response2 = agent.invoke(
{"messages": [{"role": "user", "content": "Calculate the average of these values: 100, 150, 200, 175"}]},
config=config
)
print(response2["messages"][-1].content)
# Turn 3: Send notification based on analysis
print("\n=== Turn 3: Notification ===")
response3 = agent.invoke(
{"messages": [{"role": "user", "content": "Send a high priority notification to manager@company.com about the average calculation"}]},
config=config
)
print(response3["messages"][-1].content)
# Turn 4: Reference earlier context
print("\n=== Turn 4: Context Recall ===")
response4 = agent.invoke(
{"messages": [{"role": "user", "content": "What was the customer search I asked about earlier?"}]},
config=config
)
print(response4["messages"][-1].content)
if __name__ == "__main__":
run_agent_conversation()
Agent Execution Flow Explained
When you invoke the agent, here's what happens internally:
1. Message intake: The agent receives the user message and loads conversation history from the checkpointer using the thread_id.
2. Reasoning: The model analyzes the request, reviews the available tools and their descriptions, and decides which tool(s) to use.
3. Tool execution: The agent calls the selected tool with appropriate arguments and receives the results.
4. Observation: The agent processes the tool results and decides whether to provide a final answer or take another action (repeating steps 2-4).
5. Checkpointing: The conversation state (messages, tool calls, results) is saved to the checkpointer for future turns.
This iterative process continues until the agent determines it has enough information to provide a final answer to the user's question.
Production-Ready Agent Deployment Patterns and Best Practices
Moving from development to production requires implementing robust patterns for error handling, observability, and scalability. Here are the essential production patterns for LangChain agents in 2025.
1. Streaming Responses
Stream agent responses for better user experience, especially for long-running operations:
# Enable streaming for real-time response updates
config = {"configurable": {"thread_id": "stream-demo"}}
async def stream_agent_response(user_message: str):
"""Stream agent responses token by token."""
async for event in agent.astream_events(
{"messages": [{"role": "user", "content": user_message}]},
config=config,
version="v2"
):
kind = event["event"]
if kind == "on_chat_model_stream":
content = event["data"]["chunk"].content
if content:
print(content, end="", flush=True)
elif kind == "on_tool_start":
print(f"\n[Using tool: {event['name']}]", flush=True)
elif kind == "on_tool_end":
print(f"[Tool completed]\n", flush=True)
# Usage
import asyncio
asyncio.run(stream_agent_response("Search for high-value customers"))
2. Error Handling & Retries
from tenacity import retry, stop_after_attempt, wait_exponential
from langchain_core.runnables import RunnableConfig
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=2, max=10)
)
def invoke_agent_with_retry(agent, messages, config):
"""Invoke agent with exponential backoff retry logic."""
try:
return agent.invoke(messages, config)
except Exception as e:
print(f"Agent invocation failed: {e}")
raise
# Use in production
try:
response = invoke_agent_with_retry(
agent,
{"messages": [{"role": "user", "content": "Process this request"}]},
config
)
except Exception as e:
# Log error and provide fallback response
print(f"Agent failed after retries: {e}")
response = {"messages": [{"role": "assistant", "content": "I encountered an error. Please try again later."}]}
3. Observability with LangSmith
LangSmith provides essential observability for production agents, including trajectory tracking, tool call monitoring, and performance analytics:
import os
from langsmith import Client
# Configure LangSmith (set these in environment variables)
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "production-agents"
# LangSmith automatically traces all agent operations
response = agent.invoke(
{"messages": [{"role": "user", "content": "Analyze Q4 sales"}]},
config=config
)
# Access trace data programmatically
client = Client()
runs = client.list_runs(
project_name="production-agents",
execution_order=1,
error=False
)
for run in runs:
print(f"Run: {run.name}")
print(f"Duration: {run.total_tokens} tokens")
print(f"Cost: ${run.total_cost}")
print(f"Tools used: {[tool.name for tool in run.child_runs]}")4. Rate Limiting & Resource Management
from functools import wraps
import time
from collections import defaultdict
from threading import Lock
class RateLimiter:
"""Simple in-memory rate limiter for agent API calls."""
def __init__(self, max_requests: int = 100, window_seconds: int = 60):
self.max_requests = max_requests
self.window_seconds = window_seconds
self.requests = defaultdict(list)
self.lock = Lock()
def allow_request(self, user_id: str) -> bool:
"""Check if request is allowed under rate limit."""
with self.lock:
now = time.time()
window_start = now - self.window_seconds
# Remove old requests outside window
self.requests[user_id] = [
req_time for req_time in self.requests[user_id]
if req_time > window_start
]
# Check limit
if len(self.requests[user_id]) >= self.max_requests:
return False
self.requests[user_id].append(now)
return True
# Apply rate limiting
rate_limiter = RateLimiter(max_requests=50, window_seconds=60)
def invoke_with_rate_limit(user_id: str, message: str):
"""Invoke agent with rate limiting by user."""
if not rate_limiter.allow_request(user_id):
return {
"error": "Rate limit exceeded. Please try again in a minute.",
"status": 429
}
config = {"configurable": {"thread_id": f"user-{user_id}"}}
return agent.invoke({"messages": [{"role": "user", "content": message}]}, config)
# Usage
response = invoke_with_rate_limit("user_123", "Search customer database")5. Human-in-the-Loop Workflows
LangGraph's interrupt feature enables human approval before critical operations:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END
from typing import TypedDict
class AgentState(TypedDict):
messages: list
pending_approval: bool
approved_action: str | None
workflow = StateGraph(AgentState)
def process_request(state: AgentState):
"""Process user request and determine if approval needed."""
last_message = state["messages"][-1]["content"]
# Check if action requires approval
requires_approval = any(
keyword in last_message.lower()
for keyword in ["delete", "purchase", "transfer", "modify"]
)
return {
"pending_approval": requires_approval,
"approved_action": last_message if requires_approval else None
}
def execute_action(state: AgentState):
"""Execute action after approval."""
if state.get("approved_action"):
# Execute the approved action
result = f"Executed: {state['approved_action']}"
return {"messages": [{"role": "assistant", "content": result}]}
return {"messages": [{"role": "assistant", "content": "Action cancelled"}]}
workflow.add_node("process", process_request)
workflow.add_node("execute", execute_action)
# Add conditional interrupt for approval
workflow.add_edge("process", "execute")
workflow.set_entry_point("process")
workflow.add_edge("execute", END)
memory = MemorySaver()
app = workflow.compile(
checkpointer=memory,
interrupt_before=["execute"] # Pause before execution
)
# Usage: Submit request
config = {"configurable": {"thread_id": "approval-demo"}}
state = app.invoke(
{"messages": [{"role": "user", "content": "Delete customer record ID 123"}]},
config=config
)
# At this point, execution pauses. Check pending approval:
if state.get("pending_approval"):
print(f"Action requires approval: {state['approved_action']}")
# After human review, either resume or cancel
user_approves = input("Approve? (yes/no): ").lower() == "yes"
if user_approves:
# Resume execution
final_state = app.invoke(None, config=config)
print(final_state["messages"][-1])
else:
# Cancel by updating state
app.update_state(config, {"approved_action": None})
print("Action cancelled by user")- ✓ Implement streaming for better UX
- ✓ Add retry logic with exponential backoff
- ✓ Enable LangSmith tracing for observability
- ✓ Set up rate limiting per user/tenant
- ✓ Configure human-in-the-loop for sensitive actions
- ✓ Use PostgresSaver for distributed deployments
- ✓ Implement comprehensive error handling
- ✓ Monitor costs and token usage
- ✓ Set up automated cleanup for old checkpoints
Advanced Agent Techniques
Once you've mastered basic agents, these advanced techniques enable more sophisticated behaviors and better performance at scale.
Multi-Agent Orchestration
Complex tasks often benefit from multiple specialized agents working together. LangGraph supports three main orchestration patterns:
- Supervisor: A supervisor agent routes tasks to specialized worker agents (research agent, code agent, writing agent) and synthesizes results. Best for tasks requiring different expertise domains. A minimal sketch follows this list.
- Network: Multiple agents communicate peer-to-peer, sharing information and collaborating on subtasks autonomously. Best for parallel processing and distributed problem-solving.
- Sequential: Agents execute in a defined sequence, each processing the output of the previous agent (pipeline architecture). Best for multi-stage workflows with clear dependencies.
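As a concrete illustration of the supervisor pattern, here is a minimal sketch. The routing heuristic and worker bodies are placeholders: in practice the supervisor node would usually be an LLM call, and the workers would be compiled ReAct agents like the one built earlier.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
class SupervisorState(TypedDict):
    messages: list
    next: str
def supervisor(state: SupervisorState):
    # Placeholder heuristic: route research-style requests to the research worker
    text = state["messages"][-1]["content"].lower()
    return {"next": "research" if "search" in text or "find" in text else "write"}
def research_worker(state: SupervisorState):
    reply = {"role": "assistant", "content": "Research findings go here."}
    return {"messages": state["messages"] + [reply]}
def writing_worker(state: SupervisorState):
    reply = {"role": "assistant", "content": "Drafted text goes here."}
    return {"messages": state["messages"] + [reply]}
graph = StateGraph(SupervisorState)
graph.add_node("supervisor", supervisor)
graph.add_node("research", research_worker)
graph.add_node("write", writing_worker)
graph.add_edge(START, "supervisor")
graph.add_conditional_edges("supervisor", lambda s: s["next"], {"research": "research", "write": "write"})
graph.add_edge("research", END)
graph.add_edge("write", END)
supervisor_app = graph.compile()
result = supervisor_app.invoke({"messages": [{"role": "user", "content": "Find recent AI funding news"}], "next": ""})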
Context Window Management
Long conversations can exceed model context limits. Implement intelligent truncation:
from langchain_core.messages import trim_messages
def create_agent_with_context_management(max_tokens: int = 4000):
"""Create agent with automatic context window management."""
# Create trimmer that keeps system message and recent history
trimmer = trim_messages(
max_tokens=max_tokens,
strategy="last", # Keep most recent messages
token_counter=len, # counts messages rather than true tokens; use a model-aware counter in production
include_system=True, # Always keep system message
allow_partial=False # Don't split messages
)
# Modify agent to use trimmer
workflow = StateGraph(state_schema=MessagesState)
def call_model_with_trimming(state: MessagesState):
# Trim messages before invoking model
trimmed_messages = trimmer.invoke(state["messages"])
response = model.invoke(trimmed_messages)
return {"messages": [response]}
workflow.add_node("model", call_model_with_trimming)
workflow.add_edge(START, "model")
return workflow.compile(checkpointer=memory)
# Agent now handles long conversations gracefully
agent = create_agent_with_context_management(max_tokens=4000)
Custom Agent Reasoning
Override default ReAct prompting for domain-specific reasoning patterns:
# Custom system prompt for financial analysis agent
financial_agent_prompt = """You are a financial analysis agent specialized in investment research.
When analyzing requests:
1. REASONING: Always start by breaking down the financial question
2. DATA GATHERING: Identify what data sources you need
3. CALCULATION: Use tools to retrieve and calculate metrics
4. RISK ASSESSMENT: Consider potential risks and limitations
5. RECOMMENDATION: Provide actionable insights with confidence levels
Available tools:
- fetch_stock_data: Get historical stock prices
- calculate_metrics: Compute financial ratios
- search_financial_news: Find relevant market news
Always cite data sources and explain your analytical reasoning."""
agent = create_react_agent(
model=model,
tools=tools,
checkpointer=memory,
state_modifier=financial_agent_prompt # Custom reasoning instructions
)
Tool Result Validation
Add validation layers to ensure tool outputs are safe and expected before agents process them:
from pydantic import BaseModel, Field, ValidationError
from typing import Any
from functools import wraps
import requests
class ToolResultValidator(BaseModel):
"""Validate tool outputs before agent processes them."""
success: bool = Field(description="Whether tool executed successfully")
data: Any = Field(description="Tool output data")
error: str | None = Field(default=None, description="Error message if failed")
class Config:
extra = "forbid" # Reject unexpected fields
def validated_tool(func):
"""Decorator to add validation to tool results."""
@wraps(func)
def wrapper(*args, **kwargs):
try:
result = func(*args, **kwargs)
# Validate result structure
validated = ToolResultValidator(
success=True,
data=result,
error=None
)
return validated.model_dump()
except ValidationError as e:
return {
"success": False,
"data": None,
"error": f"Tool output validation failed: {str(e)}"
}
except Exception as e:
return {
"success": False,
"data": None,
"error": f"Tool execution failed: {str(e)}"
}
return wrapper
@tool
@validated_tool
def fetch_api_data(endpoint: str) -> dict:
"""Fetch data from external API with validation."""
response = requests.get(endpoint)
response.raise_for_status()
return response.json()
# Agent receives validated, structured tool results
These advanced techniques transform basic agents into robust, production-ready systems capable of handling complex enterprise workflows.
Frequently Asked Questions
Should I use LangChain or LangGraph for new projects in 2025?
Use LangGraph for new projects. While LangChain agents continue to be supported, LangGraph is recommended by the LangChain team for production deployments. It offers superior state management, human-in-the-loop workflows, and debugging capabilities essential for enterprise applications.
What's the difference between MemorySaver, SqliteSaver, and PostgresSaver?
MemorySaver stores conversation state in RAM—fast but non-persistent across restarts, ideal for tutorials. SqliteSaver persists to disk, suitable for single-server deployments. PostgresSaver offers distributed storage with transaction support, required for multi-server production deployments. Choose based on your persistence and scale needs.
How do I prevent agents from infinite loops?
Set maximum iteration limits in your agent configuration. Use recursion_limit in LangGraph to cap the number of reasoning cycles. Implement timeout logic at the application level. Monitor agent trajectories with LangSmith to identify and debug looping behavior. Design tools with clear success criteria to help agents know when to stop.
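For example, LangGraph reads the recursion limit from the run config. A short sketch, assuming the agent built earlier in this tutorial (the thread ID is illustrative):
from langgraph.errors import GraphRecursionError
# Cap the number of graph steps per invocation; LangGraph raises GraphRecursionError when the limit is exceeded
config = {"recursion_limit": 25, "configurable": {"thread_id": "loop-guard-demo"}}
try:
    response = agent.invoke(
        {"messages": [{"role": "user", "content": "Analyze this quarter's numbers"}]},
        config=config,
    )
except GraphRecursionError:
    response = {"messages": [{"role": "assistant", "content": "I couldn't finish within the step limit."}]}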
What's the recommended way to handle tool errors?
Tools should return structured error messages that agents can understand, not raise exceptions. Use try-except blocks inside tools and return dictionaries with "error" fields. This allows agents to reason about failures and try alternative approaches. Implement retry logic for transient errors at the tool level, not agent level.
How do I monitor agent performance and costs?
Enable LangSmith tracing by setting LANGCHAIN_TRACING_V2=true. LangSmith automatically tracks token usage, tool calls, latency, and costs per conversation. Use the LangSmith dashboard to identify expensive patterns, optimize tool usage, and set budget alerts. Export traces programmatically for custom analytics.
Can I use multiple LLM models in one agent?
Yes. Use faster, cheaper models (GPT-4o-mini, Gemini 1.5 Flash) for simple tool-calling decisions and reserve premium models (GPT-5 Pro, Claude Opus 4) for complex reasoning steps. Implement model routing in your call_model function based on task complexity or conversation state. This can reduce costs by 60-80% while maintaining quality.
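A minimal routing sketch, assuming the MessagesState workflow from earlier sections; the length/keyword heuristic below is a placeholder for whatever signal (classifier, state flag, task type) fits your application:
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState
cheap_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
premium_model = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0)
def call_model(state: MessagesState):
    # Placeholder heuristic: long or analysis-heavy requests go to the premium model
    last = state["messages"][-1].content
    needs_deep_reasoning = len(last) > 500 or "analyze" in last.lower()
    model = premium_model if needs_deep_reasoning else cheap_model
    response = model.invoke(state["messages"])
    return {"messages": [response]}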
How do I migrate from LangChain agents to LangGraph?
Replace the legacy create_react_agent from langchain.agents with create_react_agent from langgraph.prebuilt. The APIs are similar: the LangGraph version takes the model and tools (plus an optional checkpointer) and returns a compiled graph instead of a chain. Update your imports and you're mostly done; existing tools work without modification.
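A minimal before/after sketch of the swap, reusing the model, tools, and memory objects defined earlier and assuming the legacy agent used the standard ReAct prompt from the LangChain Hub (the import alias just avoids a name clash):
# Before: legacy LangChain ReAct agent wrapped in an AgentExecutor
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent as create_legacy_react_agent
react_prompt = hub.pull("hwchase17/react")
legacy_agent = create_legacy_react_agent(model, tools, react_prompt)
executor = AgentExecutor(agent=legacy_agent, tools=tools)
# After: LangGraph prebuilt ReAct agent, returned as a compiled graph
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(model=model, tools=tools, checkpointer=memory)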
What's the maximum number of tools an agent can handle effectively?
Most models handle 10-20 tools effectively. Beyond that, tool selection accuracy degrades. For larger tool collections, implement tool retrieval: embed tool descriptions in a vector database and dynamically select relevant tools based on user queries. This enables agents to work with hundreds of tools by only loading relevant ones per request.
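One way to implement that retrieval, sketched under the assumption of OpenAI embeddings and LangChain's in-memory vector store (all_tools stands in for your full tool list):
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings
def build_tool_index(all_tools):
    # Index each tool's description once, keyed by tool name
    docs = [Document(page_content=t.description, metadata={"name": t.name}) for t in all_tools]
    store = InMemoryVectorStore(OpenAIEmbeddings())
    store.add_documents(docs)
    return store
def select_tools(store, all_tools, query: str, k: int = 5):
    # Pick the k tools whose descriptions best match the user query
    hits = store.similarity_search(query, k=k)
    selected = {hit.metadata["name"] for hit in hits}
    return [t for t in all_tools if t.name in selected]
# Per request, bind only the relevant tools:
# relevant = select_tools(store, all_tools, user_query)
# agent = create_react_agent(model=model, tools=relevant, checkpointer=memory)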
Ready to Build Production AI Agents?
You now have a comprehensive understanding of LangChain AI agents, from the ReAct pattern fundamentals to production deployment strategies. Start building with LangGraph for maximum flexibility and production readiness.
Need expert help implementing AI agents for your business? Digital Applied specializes in production AI systems and can accelerate your development timeline with proven patterns and best practices.
Related Articles
Master Claude Agent Skills: organized instructions, dynamic loading, domain-specific agents. Complete guide with code examples and production patterns.
Complete OpenAI AgentKit tutorial: Learn step-by-step how to build AI agents with drag-and-drop builder, ChatKit, and MCP connectors. Includes pricing, examples, and deployment guide for 2025.
Master the entire Qwen3 model family - flagship Max-Preview, Coder-480B, Thinking models, and deployment strategies for every use case.