
Building AI Agents with Python: A Practical Guide

Harshit Rathod

AI agents are programs that can reason, plan, and take actions autonomously. Unlike simple chatbots that just respond to prompts, agents can use tools, maintain memory, and execute multi-step workflows. In this guide, we’ll build one from scratch.


“The best way to predict the future is to invent it.” — Alan Kay

What You’ll Learn

  1. What AI agents are and how they differ from chatbots
  2. Setting up LangChain with OpenAI
  3. Building a tool-using agent
  4. Adding memory and context
  5. Deployment considerations

Prerequisites

Before we start, make sure you have:

| Tool | Version | Purpose |
| --- | --- | --- |
| Python | 3.11+ | Runtime |
| LangChain | 0.1.x | Agent framework |
| OpenAI SDK | 1.x | LLM provider |
| FastAPI | 0.100+ | API serving |
| Redis | 7.x | Memory store |

Setting Up the Project

First, create a new project and install dependencies:

mkdir ai-agent && cd ai-agent
python -m venv .venv
source .venv/bin/activate
pip install langchain langchain-openai fastapi uvicorn redis

Create the project structure:

ai-agent/
├── agent/
│   ├── __init__.py
│   ├── core.py          # Agent logic
│   ├── tools.py         # Custom tools
│   └── memory.py        # Memory management
├── api/
│   └── server.py        # FastAPI endpoints
├── tests/
│   └── test_agent.py
├── .env
└── requirements.txt
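If you'd rather not create these files by hand, the same layout can be scaffolded in one go (run from inside ai-agent/):

```shell
mkdir -p agent api tests
touch agent/__init__.py agent/core.py agent/tools.py agent/memory.py
touch api/server.py tests/test_agent.py .env requirements.txt
```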

Building the Agent Core

Here’s the heart of our agent — the core module that ties together the LLM, tools, and memory:

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferWindowMemory

class AIAgent:
    """An autonomous AI agent with tool use and memory."""

    def __init__(self, model: str = "gpt-4", temperature: float = 0.1):
        self.llm = ChatOpenAI(model=model, temperature=temperature)
        self.memory = ConversationBufferWindowMemory(
            memory_key="chat_history",
            return_messages=True,
            output_key="output",  # which result key memory should store
            k=10  # Keep last 10 exchanges
        )
        self.tools = self._load_tools()
        self.agent = self._build_agent()

    def _load_tools(self) -> list:
        """Load all available tools for the agent."""
        from agent.tools import web_search, code_executor, file_reader
        return [web_search, code_executor, file_reader]

    def _build_agent(self) -> AgentExecutor:
        """Construct the agent with prompt, tools, and memory."""
        prompt = ChatPromptTemplate.from_messages([
            ("system", "You are a helpful AI assistant. Use tools when needed."),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
            MessagesPlaceholder("agent_scratchpad"),
        ])

        agent = create_openai_tools_agent(self.llm, self.tools, prompt)

        return AgentExecutor(
            agent=agent,
            tools=self.tools,
            memory=self.memory,
            verbose=True,
            max_iterations=5,
            return_intermediate_steps=True,  # expose which tools ran
        )

    async def run(self, query: str) -> str:
        """Execute the agent with a user query."""
        result = await self.agent.ainvoke({"input": query})
        # Record which tools ran, for the API layer and the tests
        self.last_tools_used = [
            action.tool for action, _ in result.get("intermediate_steps", [])
        ]
        return result["output"]

Key Design Decisions

A few choices here are deliberate: the low temperature keeps tool calls consistent, k=10 bounds the prompt size, and max_iterations caps runaway loops.

Warning: Always set max_iterations when building agents. Without it, a confused agent can enter an infinite loop, burning through your API credits.

Creating Custom Tools

Tools are what give agents their power. Here’s how to create a web search tool:

from langchain.tools import tool
import httpx

@tool
async def web_search(query: str) -> str:
    """Search the web for current information.

    Args:
        query: The search query string

    Returns:
        Search results as formatted text
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.search.example.com/search",
            params={"q": query, "limit": 5}
        )
        results = response.json()

    formatted = []
    for r in results["items"]:
        formatted.append(f"**{r['title']}**\n{r['snippet']}\n{r['url']}")

    return "\n\n".join(formatted)
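The result-formatting step at the end is easy to test in isolation. Here is the same logic as a standalone function, fed with hypothetical mock data rather than a live API response:

```python
def format_results(items: list[dict]) -> str:
    """Format search hits as Markdown-style blocks: title, snippet, URL."""
    formatted = [f"**{r['title']}**\n{r['snippet']}\n{r['url']}" for r in items]
    return "\n\n".join(formatted)

# Mock data standing in for the API's JSON "items" array:
items = [
    {"title": "Python 3.12 released", "snippet": "What's new in 3.12...",
     "url": "https://www.python.org"},
]
print(format_results(items))
```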

And a code execution tool. A subprocess with a timeout gives basic isolation, but it is not a true sandbox; in production, run untrusted code inside a container:

@tool
def code_executor(code: str, language: str = "python") -> str:
    """Execute code in a sandboxed environment.

    Args:
        code: The code to execute
        language: Programming language (python, javascript)

    Returns:
        Execution output or error message
    """
    import subprocess
    import tempfile

    interpreters = {"python": "python3", "javascript": "node"}
    if language not in interpreters:
        return f"Unsupported language: {language}"

    with tempfile.NamedTemporaryFile(
        mode="w", suffix=f".{language}", delete=True
    ) as f:
        f.write(code)
        f.flush()

        try:
            result = subprocess.run(
                [interpreters[language], f.name],
                capture_output=True,
                text=True,
                timeout=30,  # Kill after 30 seconds
            )
        except subprocess.TimeoutExpired:
            return "Error: execution timed out after 30 seconds"

    if result.returncode != 0:
        return f"Error:\n{result.stderr}"

    return result.stdout
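To try the execution path on its own, here is a stripped-down sketch of the same subprocess pattern without the LangChain decorator (a simplified stand-in, not the tool above verbatim):

```python
import os
import subprocess
import sys
import tempfile

def run_python(code: str, timeout: int = 30) -> str:
    """Write code to a temp file, run it with the current interpreter,
    and return stdout (or an error message)."""
    with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out"
    finally:
        os.unlink(path)  # clean up the temp file

    if result.returncode != 0:
        return f"Error:\n{result.stderr}"
    return result.stdout

print(run_python("print(2 + 2)"))  # → 4
```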

The Agent Loop Explained

Here’s how the agent processes a request — visualized as a flowchart:

flowchart TD
    A([User Query]) --> B[LLM Reasoning\nGPT-4 + Chat History]
    B --> C{Use a tool?}
    C -- No --> D([Return Response])
    C -- Yes --> E[Tool Execution\nsearch, code, file read]
    E --> F[Tool Result]
    F --> B
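Stripped of framework code, that loop can be sketched in a few lines of plain Python. The decide function and tool registry below are illustrative stand-ins, not LangChain's API:

```python
def run_agent(query: str, tools: dict, decide, max_iterations: int = 5) -> str:
    """Minimal reason-act loop: ask the model what to do, run the chosen
    tool, feed the observation back, and stop on a final answer."""
    observations: list[str] = []
    for _ in range(max_iterations):
        action = decide(query, observations)             # LLM reasoning step
        if action["type"] == "final":
            return action["answer"]                      # no tool needed
        result = tools[action["tool"]](action["input"])  # tool execution
        observations.append(result)                      # result goes back to the LLM
    return "Stopped: hit max_iterations"                 # iteration guard

# Toy stand-in for the LLM: search once, then answer from the result.
def decide(query, observations):
    if not observations:
        return {"type": "tool", "tool": "search", "input": query}
    return {"type": "final", "answer": f"Based on: {observations[-1]}"}

tools = {"search": lambda q: f"results for '{q}'"}
print(run_agent("bitcoin price", tools, decide))
# → Based on: results for 'bitcoin price'
```

The last line is exactly the failure mode from the warning above: a model that never emits a final answer is cut off by the iteration guard instead of looping forever.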


Agent Architecture Overview

Here’s the high-level architecture of our agent system:

graph LR
    U([User\nAPI / CLI]) --> AC

    subgraph AC[Agent Core]
        PB[Prompt Builder]
        TR[Tool Router]
        OP[Output Parser]
        IG[Iteration Guard]
    end

    AC --> LLM[LLM\nGPT-4 / Claude]
    AC --> Tools

    subgraph Tools
        WS[Web Search]
        CE[Code Executor]
        FR[File Reader]
    end

    AC -.-> Mem[(Memory\nRedis / Buffer)]

Adding the API Layer

Wrap the agent in a FastAPI server:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from agent.core import AIAgent

app = FastAPI(title="AI Agent API")
agent = AIAgent()

class QueryRequest(BaseModel):
    query: str
    session_id: str | None = None

class QueryResponse(BaseModel):
    response: str
    tools_used: list[str]

@app.post("/chat", response_model=QueryResponse)
async def chat(request: QueryRequest):
    try:
        result = await agent.run(request.query)
        return QueryResponse(
            response=result,
            tools_used=getattr(agent, "last_tools_used", []),
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

Run it:

uvicorn api.server:app --reload --port 8000

Test with curl:

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the current price of Bitcoin?"}'

Performance Comparison

Here’s how different LLM backends compare for agent tasks:

| Model | Tool Accuracy | Latency (p50) | Cost/1K tokens |
| --- | --- | --- | --- |
| GPT-4 | 95% | 2.1s | $0.03 |
| GPT-3.5 Turbo | 78% | 0.8s | $0.002 |
| Claude 3 Opus | 93% | 2.5s | $0.015 |
| Llama 3 70B | 71% | 1.2s | $0.001 |

GPT-4 remains the gold standard for complex tool use, but Claude 3 is closing the gap fast — especially for code-related tasks.
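To make the cost column concrete, a rough back-of-the-envelope estimate (assuming the flat per-1K-token rates above; real pricing bills input and output tokens separately):

```python
# Per-1K-token rates from the comparison table above
PRICE_PER_1K = {
    "gpt-4": 0.03,
    "gpt-3.5-turbo": 0.002,
    "claude-3-opus": 0.015,
    "llama-3-70b": 0.001,
}

def estimate_cost(model: str, tokens: int) -> float:
    """Rough cost estimate: tokens / 1000 * per-1K rate."""
    return tokens / 1000 * PRICE_PER_1K[model]

# A 5-step agent run that touches ~2,000 tokens per step:
print(f"${estimate_cost('gpt-4', 5 * 2000):.2f}")          # → $0.30
print(f"${estimate_cost('gpt-3.5-turbo', 5 * 2000):.2f}")  # → $0.02
```

Multi-step agents multiply token usage fast, which is why the per-token price gap matters more here than it does for single-shot chat.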

Testing Your Agent

Always test agents with deterministic inputs:

# Async tests require the pytest-asyncio plugin
import pytest
from agent.core import AIAgent

@pytest.fixture
def agent():
    return AIAgent(model="gpt-3.5-turbo", temperature=0)

@pytest.mark.asyncio
async def test_basic_query(agent):
    result = await agent.run("What is 2 + 2?")
    assert "4" in result

@pytest.mark.asyncio
async def test_tool_use(agent):
    result = await agent.run("Search for the latest Python release")
    assert len(result) > 0
    assert agent.last_tools_used == ["web_search"]

@pytest.mark.asyncio
async def test_memory_persistence(agent):
    await agent.run("My name is Harshit")
    result = await agent.run("What is my name?")
    assert "Harshit" in result

Common Pitfalls

Here are mistakes I’ve made so you don’t have to:

  1. Not setting token limits — Agents can generate massive prompts
  2. Skipping tool descriptions — The LLM reads these to decide when to use tools
  3. No error handling in tools — A crashing tool breaks the entire agent loop
  4. Trusting agent output blindly — Always validate before executing actions
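Pitfall 1 is usually handled with a sliding-window trim before each call, the same idea behind k=10 in the memory config. A stdlib sketch using a crude four-characters-per-token estimate (an assumption for illustration; use a real tokenizer in production):

```python
def trim_history(messages: list[str], max_tokens: int = 2000) -> list[str]:
    """Keep the most recent messages whose estimated token count fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):   # walk newest-first
        est = len(msg) // 4 + 1      # rough chars-to-tokens estimate
        if total + est > max_tokens:
            break
        kept.append(msg)
        total += est
    return list(reversed(kept))      # restore chronological order

history = [f"message {i}: " + "x" * 400 for i in range(100)]
print(len(trim_history(history, max_tokens=2000)))  # far fewer than 100
```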

The Fix Pattern

import requests
from langchain.tools import tool

# ❌ Bad: No error handling, unbounded output
@tool
def dangerous_tool(url: str) -> str:
    return requests.get(url).text

# ✅ Good: Defensive tool design
@tool
def safe_tool(url: str) -> str:
    """Fetch content from a URL safely."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.text[:5000]  # Limit output size
    except requests.RequestException as e:
        return f"Error fetching URL: {e}"


What’s Next?

That wraps up part one. In the next post, we'll build on this foundation with more advanced agent patterns.

Found this useful? Subscribe to my newsletter for more AI engineering tutorials.

