Building a Conversational Agent with LangGraph, Twilio, Modal and FastAPI

Imagine you’re an inventory manager for a retail business and you need to quickly verify stock levels for an unexpected large order while away from your computer. Instead of logging into a dashboard, navigating through menus, and analyzing spreadsheets, you simply send a text message:

“Do we have enough iPhones to fulfill an order for 100 units?”

Within seconds, you receive a reply:

“You have 157 iPhones currently in stock, so you can fulfill this order. Based on your current sales rate of 5.3 units/day, you should consider reordering within 2 weeks to maintain optimal inventory levels.”

This is the power of conversational AI agents that integrate with real business data – they provide immediate, actionable insights through natural language interactions, accessible anywhere you have your phone.

In this tutorial, we’ll build this kind of agent. LangGraph will provide the foundation for creating a stateful, intelligent conversational system. Twilio will enable communication through familiar SMS channels, while Modal will handle the deployment infrastructure.

By the end of this tutorial, you’ll have created an inventory management assistant that can:

  • Answer complex questions about current stock levels
  • Provide recommendations on what products to reorder
  • Calculate when products will run out based on current sales rates
  • Identify trends in product performance
  • Deliver insights in a concise format optimized for SMS

This approach represents a significant evolution beyond traditional interfaces, allowing AI to connect with users through the most natural form of interaction – conversation. Whether you’re building for inventory management, customer service, sales support, or any other business function, the architecture we’ll explore provides a flexible foundation you can adapt to your specific needs. You can download the completed project on Github.

Prerequisites

To follow this tutorial, you will need the following:

  • A Modal account (free tier is sufficient)
  • A Twilio account with credits for SMS and voice capabilities
  • An OpenAI API key or Anthropic API key
  • Python 3.10+ installed

Setting up Modal

Go to www.modal.com and create an account. After creating an account, set up the Modal client library on your machine:

pip install modal
python3 -m modal setup

After going through the setup, create a new directory to hold your project, and create a file for your server code.

mkdir conversational-agent && cd conversational-agent
touch server.py

Open server.py in your editor of choice and paste in Modal’s hello world code. We’ll run this to make sure everything is set up correctly.

import modal

app = modal.App("example-get-started")


@app.function()
def square(x):
    print("This code is running on a remote worker!")
    return x**2


@app.local_entrypoint()
def main():
    print("the square is", square.remote(42))

Run it with:

modal run server.py

When it runs successfully, you should see console output printing the value of the square (1764), and in Modal you should see evidence of the run under the Apps -> Stopped Apps section. If everything checks out, you can clear out the server.py file.

While we’re in Modal, let’s create a secrets container. Go to the Secrets tab along the top and create a new secret. For the type of secret, select Custom, and give it a name. It won’t let you save until you add at least one key, so now is a good time to add your OpenAI or Anthropic API key (name it OPENAI_API_KEY if you’re using OpenAI, since that’s the environment variable the client library looks for).

Back in your editor, add a requirements.txt to hold our dependencies. Here are the dependencies we will need:

modal
fastapi
python-dotenv

twilio

langgraph
langchain
langchain-openai
langchain-core==0.3.33
langgraph-supervisor

Install the dependencies with pip3 install -r requirements.txt

With that, our dependencies are installed and our Modal project is ready to go. Next, we’ll add the application code.

Project Scaffolding

We will now add the application code that forms the scaffolding of our project. The code in server.py configures the image that will run our application and sets up FastAPI. Create a router.py file alongside server.py; it contains the stubbed-out routes that will eventually handle our text and voice messages. These are the webhooks that Twilio will call whenever we receive a text or a voice call.

# server.py

import os
import modal

app = modal.App(name="conversational-agent")

image = (
    modal.Image.debian_slim(python_version="3.12")
    .pip_install_from_requirements("requirements.txt")
    .apt_install("curl")  # curl is needed for the Node.js setup script below
    .run_commands([
        "curl -fsSL https://deb.nodesource.com/setup_20.x | bash -",
        "apt-get install -y nodejs",
        "rm -rf /var/lib/apt/lists/*",
    ])
    .add_local_python_source("models")
    .add_local_file("router.py", "/root/router.py")
)


@app.cls(
    image=image,
    container_idle_timeout=900,
    allow_concurrent_inputs=50
)
class ConversationalAgent:
    @modal.asgi_app()
    def fastapi_app(self):
        from fastapi import FastAPI
        from router import router
        from fastapi.middleware.cors import CORSMiddleware

        web_app = FastAPI()

        # Add CORS middleware
        web_app.add_middleware(
            CORSMiddleware,
            allow_origins=[],
            allow_credentials=True,
            allow_methods=["*"],
            allow_headers=["*"],
        )

        web_app.include_router(router)
        return web_app
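
One thing the listing above doesn’t show is how the secret we created in Modal gets into the container. Environment variables such as your OpenAI/Anthropic key and TWILIO_AUTH_TOKEN (used later for webhook validation) are only available if the secret is attached to the class. A minimal way to do that, assuming you named your secret conversational-agent-secrets, is to pass it to the decorator:

@app.cls(
    image=image,
    secrets=[modal.Secret.from_name("conversational-agent-secrets")],  # must match the name of the secret you created in the Modal dashboard
    container_idle_timeout=900,
    allow_concurrent_inputs=50
)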

# router.py

from fastapi import Request
from fastapi import APIRouter
from fastapi.responses import PlainTextResponse

router = APIRouter()

@router.post("/twilio/text", response_class=PlainTextResponse)
async def handle_sms_webhook(request: Request):
    pass

@router.post("/twilio/voice", response_class=PlainTextResponse)
async def handle_voice_webhook(request: Request):
    pass
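
Deploy the app in development mode so you can test it as you go. Modal’s serve command watches your files and redeploys whenever you save:

modal serve server.py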

Notice how when you save the file, Modal automatically redeploys your application. Pretty neat! Check the output to make sure nothing is broken, and then let’s move on to Twilio.

Twilio

Before we can communicate with users, we need to establish communication channels. Twilio provides a straightforward way to handle both SMS and voice interactions through their platform. In this section, we’ll walk through the process of acquiring a phone number that will serve as the entry point for all communications with our agent, creating a TwiML application to handle the incoming messages and calls, creating a messaging service and connecting these components by configuring our endpoints. This setup will create the bridge between users’ devices and our Modal-deployed LangGraph agent.

Create a Twilio account. In the Develop tab on the left, you may see “Phone Numbers” listed; if not, click “Explore Products” and search for “Phone Numbers”. When you find it, hit the pin icon to pin it to your sidebar for easy access. Do the same for “Messaging”.

There are three things we need to do here: first, create a TwiML app; then create a messaging service; and finally, provision a phone number.

Click on “TwiML apps” under the “Manage” section in the sidebar. In the top right corner, click “Create new TwiML app”. Give it a friendly name and then click Create. Notice the sections for Voice Configuration and Messaging Configuration. We don’t have these URLs yet but this is what we will configure later once we’re ready to hook Twilio up to the Modal application.

After creating your TwiML app, create the messaging service that we will use with our phone number. Accept the default settings here.

After creating your messaging service, buy a phone number. Make sure you find a number that supports Voice, SMS, and MMS. In the Voice Configuration and Messaging Configuration sections, select “TwiML App” in the “Configure with” dropdown, then select the TwiML app you just created in the next dropdown. In the Messaging Configuration section, also select the messaging service you just created.

Now that Twilio is (mostly) set up and you’ve got a phone number provisioned, hop back over to Modal and add the required Twilio-related secrets to the secret you created earlier. At a minimum, add your Twilio auth token as TWILIO_AUTH_TOKEN, since we’ll use it shortly to validate incoming webhooks.

Integrating Twilio 

Now that our Twilio account is set up, it’s time to integrate it with our application. Since we have the (stubbed-out) endpoints, we can go back to Twilio and configure our TwiML application to use them. In the Twilio console, navigate to the TwiML apps section. There are two URLs you need to set here: one for text and one for voice. To get the root of the URL, look in the console where you’re running Modal (it prints the URL of your deployed app) and append the /twilio/text and /twilio/voice routes onto the end.

In router.py, update your /text route to the following:

@router.post("/twilio/text", response_class=PlainTextResponse)
async def handle_sms_webhook(request: Request):
    form_data = await request.form()
    body = form_data.get("Body")

    twilio_signature = request.headers.get("X-Twilio-Signature")
    if not twilio_signature:
        raise HTTPException(status_code=400, detail="Missing Twilio signature")
    if not twilio_validator.validate(str(request.url), dict(form_data), twilio_signature):
        raise HTTPException(status_code=403, detail="Invalid Twilio signature")

    try:
        if body.lower() == "start":
            response_message = f"Hello, we received your message to enroll in our conversational agent service. How can I help you today?"
        elif body.lower() == "stop":
            response_message = f"You have successfully been unsubscribed. You will not receive any more messages from this number. Reply START to resubscribe."
        else:
            # This is where we will handle passing the message to the agent
            response_message = "Hello"
        return response_message

    except Exception as e:
        raise HTTPException(status_code=500, detail="Error processing webhook")

At the top of the file, add the missing imports (os, HTTPException, and Twilio’s RequestValidator) and, around where you create the router, add the twilio_validator object:

import os

from fastapi import Request, APIRouter, HTTPException
from fastapi.responses import PlainTextResponse
from twilio.request_validator import RequestValidator

router = APIRouter()
twilio_validator = RequestValidator(os.environ['TWILIO_AUTH_TOKEN'])

At this point, you should be able to text your Twilio phone number and it should respond with “Hello”. Now it’s time to build our agent.

What is LangGraph

You may already be familiar with LangChain. LangChain emerged as one of the first frameworks that allowed developers to create chains of operations with language models, connecting various components like prompts, models, and tools into sequential workflows. While powerful, these chains were primarily linear in nature, which limited their ability to handle complex, non-linear conversations.

LangGraph builds upon LangChain’s foundation by introducing stateful, cyclical graph-based workflows. This is a fundamental shift that enables true agent behavior. Rather than forcing interactions into a predetermined linear path, LangGraph allows the conversation to flow naturally between different states based on the context and needs of the interaction.

Think of LangChain as creating a flowchart with a clear beginning and end, while LangGraph creates something more akin to a state machine that can transition between different modes of operation as needed, maintain memory of previous states, and even loop back to earlier points in the conversation.
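To make the distinction concrete, here is a minimal, self-contained sketch of a LangGraph state machine. It’s illustrative only (the node names like classify and route are ours, not part of the tutorial’s code): a question flows into a classification node, then a conditional edge sends it down one of two branches depending on the state — the kind of branching, stateful flow a linear chain can’t express as naturally.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    answer: str

def classify(state: State) -> dict:
    # A real node might call an LLM here; we just pass the state through.
    return {}

def route(state: State) -> str:
    # Conditional edge: pick the next node based on the current state.
    return "inventory" if "stock" in state["question"].lower() else "general"

builder = StateGraph(State)
builder.add_node("classify", classify)
builder.add_node("inventory", lambda s: {"answer": "Handled by the inventory branch."})
builder.add_node("general", lambda s: {"answer": "Handled by the general branch."})
builder.add_edge(START, "classify")
builder.add_conditional_edges("classify", route, {"inventory": "inventory", "general": "general"})
builder.add_edge("inventory", END)
builder.add_edge("general", END)

graph = builder.compile()
print(graph.invoke({"question": "How much stock do we have?"}))

The supervisor system we build later is conceptually the same thing, just with agents as nodes and an LLM deciding the routing.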

What are AI Agents

At its core, an AI agent is a system that can perceive its environment (through user inputs or external data), make decisions based on that information, and take actions to achieve specific goals. What distinguishes agents from simple chatbots or Q&A systems is their ability to:

  • Maintain persistent state across multiple interactions
  • Make autonomous decisions about what actions to take next
  • Utilize tools and external systems to accomplish tasks
  • Adapt their behavior based on feedback and changing circumstances

LangGraph provides the infrastructure needed to create these capabilities through its state management, tool integration, and flexible graph-based workflows. By maintaining context and allowing dynamic transitions between different processing nodes, it enables the creation of agents that feel more like assistants with agency rather than simple query-response systems.

The Supervisor Architecture

The supervisor architecture represents one of the most effective patterns for building complex AI agent systems. In this approach, a central “supervisor” agent coordinates the activities of specialized sub-agents, each focused on specific tasks or domains of knowledge.

This architecture offers several key advantages:

  • Separation of concerns: Each sub-agent can be optimized for its specific task without being burdened by the complexity of the entire system.
  • Improved reasoning: The supervisor can make high-level decisions about which specialist to engage, while specialists can perform deep reasoning within their domains.
  • Flexible composition: New capabilities can be added by introducing new specialist agents without redesigning the entire system.
  • Enhanced reliability: If one sub-agent fails or provides low-confidence results, the supervisor can redirect to alternative approaches.

As we move into implementing our example, keep in mind that this architecture mimics how human organizations often work—with generalists who coordinate and specialists who execute.

Our Agent

Our architecture follows the supervisor model, with a primary coordinator agent that handles incoming requests, maintains conversation context, and delegates specialized tasks to expert sub-agents. For this tutorial, we’ll implement an Inventory Management Agent as our first specialist, capable of analyzing stock levels, identifying trends, and making recommendations to optimize inventory positions.

While we focus on inventory management for this demonstration, the architecture we’re creating is designed to scale. You could extend this system by adding specialized agents for financial analysis, marketing performance, customer engagement, or any other business function—all coordinated by the same supervisor framework. This modular approach allows the system to grow alongside your business without requiring a fundamental redesign.

What makes this approach powerful is how it transforms complex business intelligence processes into simple conversations. Rather than requiring business owners to navigate multiple dashboards, run reports, or analyze spreadsheets, they can simply ask questions in plain language and receive insights tailored to their specific context and needs. The agent will not only provide data but interpret it and suggest concrete actions based on business priorities and constraints.

Start by adding a graphs folder with some empty files to contain our agent code. Your folder structure should look like this:

graphs/
    __init__.py
    supervisor.py
    agent_system.py
    inventory_agent/
        __init__.py
        inventory_agent.py
        tools.py

Let’s walk through what these files will contain.

agent_system.py: The main orchestrator that sets up the agent system, creating an inventory agent and a supervisor agent, and providing a message processing interface.

supervisor.py: Contains the supervisor agent implementation that manages and routes conversations to specialized agents (currently the inventory agent), with specific routing rules and conversation management logic.

inventory_agent/inventory_agent.py: Implements a specialized inventory management agent that provides expertise in supply chain optimization, demand forecasting, and inventory control, with detailed capabilities and communication guidelines.

inventory_agent/tools.py: This will contain the specific tools that the inventory agent can use to perform its tasks, such as inventory analysis, forecasting, and optimization functions.

Add the following code to supervisor.py:

from typing import Any, List
from langchain_openai import ChatOpenAI
from langgraph_supervisor import create_supervisor

def create_supervisor_agent(agents: List[Any], model: ChatOpenAI):
    supervisor = create_supervisor(
        agents,
        model=model,
        supervisor_name="supervisor",
        output_mode="last_message",
        prompt="""
        You are a business intelligence assistant that provides cross-functional insights to business owners.

        You have access to one expert:
        - Inventory Management Expert: For all inventory-related questions and analyses

        ROUTING RULES:
        - Route ALL inventory-related questions to the Inventory Management Expert (stock levels, trends, recommendations, forecasts)
        - For questions outside of inventory, politely explain those capabilities aren't available yet

        RESPONSE GUIDELINES:
        - Keep ALL responses brief and concise since they will be delivered via text message
        - Focus on the most important insight or recommendation first
        - Use specific numbers when possible
        - Maintain conversation context when switching between topics
        """
    )
    return supervisor

We’re making use of a library called LangGraph Supervisor to create a supervisor node. This small library just creates an agent, and gives it tools with which to call other agents. It simplifies the task of implementing the supervisor architecture. In the prompt, we tell the LLM what kind of agent it is, and give it guidelines on how to route certain requests and create responses.

In inventory_agent.py, add the following code:

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from .tools import create_inventory_tools

def create_inventory_agent(model: ChatOpenAI):
    tools = create_inventory_tools()
    return create_react_agent(
        name="inventory_agent",
        model=model,
        tools=tools,
        prompt="""
        You are an Inventory Management Analyst helping business owners understand their inventory and make better decisions.

        CAPABILITIES:
        - Analyze current inventory levels and identify potential issues
        - Calculate optimal reorder points and quantities
        - Recommend strategies for inventory optimization
        - Provide insights on seasonal trends and demand patterns

        RESPONSE GUIDELINES:
        - Keep ALL responses brief and concise for text message delivery
        - Begin with the key insight or recommendation
        - Use specific numbers and clear action items
        - Prioritize practical advice that can be implemented immediately

       Focus on providing high-value insights in as few words as possible while still maintaining clarity.

        When responding to questions, consider important business factors like:
        - Cash flow constraints
        - Storage limitations
        - Product shelf life
        - Sales patterns and seasonal demands

        Example Response to "Do we have enough inventory of product X?":
        "Product X will stockout in 18 days. Order 150 units this week (+15% from usual). Recent sales up 23% in this category."
        """
    )

This follows a very similar pattern to the supervisor, which reinforces the point that a supervisor is just a regular agent whose job is to delegate tasks to other agents. In this file, we use a library-provided function called create_react_agent, whereas in the previous file we used create_supervisor.

In agent_system.py, add the following:

from langchain_openai import ChatOpenAI
from graphs.inventory_agent.inventory_agent import create_inventory_agent
from graphs.supervisor import create_supervisor_agent
from typing import Dict, Any
from langchain.schema import HumanMessage
from langgraph.store.memory import InMemoryStore
from langgraph.checkpoint.memory import MemorySaver

model = ChatOpenAI(model="gpt-4o")
in_memory_store = InMemoryStore()
checkpointer = MemorySaver()

def create_agent_system():
    inventory_agent = create_inventory_agent(model)

    workflow = create_supervisor_agent(
        agents=[inventory_agent],
        model=model
    )

    return workflow.compile(store=in_memory_store, checkpointer=checkpointer)

graph = create_agent_system()

async def process_message(message: str, thread_id: str = None) -> Dict[str, Any]:
    """Process an incoming message through the agent system"""

    config_dict = {
        "configurable": {
            "thread_id": thread_id
        }
    }

    human_message = HumanMessage(content=message)
    return await graph.ainvoke({"messages": [human_message]}, config=config_dict)

Here, we define a function to create the agent system. We create the inventory agent first, pass it into the constructor for our supervisor agent, and then call compile. We also define a function that invokes our graph whenever a new message comes in.

Note that we are using the InMemoryStore and MemorySaver to store our conversation state. You can think of this as a non-persistent cache of conversations that will be cleared whenever the server restarts. For a production application, you’d want to swap these out for a store and checkpointer that use durable storage; see the LangGraph persistence documentation for more details.
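
Before wiring this into the webhook, you can sanity-check the graph locally. This is a quick, illustrative snippet rather than part of the project files (you’ll need your OpenAI key in your local environment for it to run); reusing the same thread_id is what lets the checkpointer carry context across turns:

import asyncio
from graphs.agent_system import process_message

async def main():
    # Two turns on the same thread: the second question relies on the stored state of the first.
    first = await process_message("Hi, what can you help me with?", thread_id="local-test-1")
    print(first["messages"][-1].content)

    second = await process_message("Which products are running low?", thread_id="local-test-1")
    print(second["messages"][-1].content)

asyncio.run(main())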

Now, we just have to wire up our graph to the rest of the application. Open router.py and add this utility function:

# add this import near the top of router.py
from graphs.agent_system import process_message

async def chat(message: str, thread_id: str):
    result = await process_message(
        message=message,
        thread_id=thread_id
    )

    # Get the last message's content (will always be an AI message)
    messages = result.get("messages", [])
    if not messages:
        return ("", "")

    last_message = messages[-1]
    content = last_message.get("content", "") if isinstance(last_message, dict) else getattr(last_message, "content", "")
    return (content, getattr(last_message, "id", ""))

Also in this file, replace the comment # This is where we will handle passing the message to the agent with the following: 

agent_response = await chat(body, from_number)
response_message = agent_response[0]

With these pieces in place, you can interact with the agent. Say “hello” and you will be greeted by the supervisor. Tell the supervisor that you want to check stock levels for your most popular product, and the inventory agent will respond that it doesn’t have enough information. That’s because we haven’t given it any tools!

Adding Tools

Tools are the building blocks that give LLM agents their ability to interact with the outside world. Think of them as the “hands” of your AI agent – they’re the functions that allow the agent to actually do things rather than just talk about them. For example, an agent might use tools to query your company’s database for real-time inventory levels, search the web for market trends, or analyze sales data to generate forecasts. It could send automated emails to suppliers when stock is low, create visualizations of inventory trends, or even execute complex calculations to optimize reorder points.

To wrap up this tutorial, let’s give the inventory agent some tools to demonstrate how LLMs can navigate complex data. Rather than listing the entire implementation here (which is quite lengthy), you can find the complete tools.py file in our GitHub repository.

Here’s a brief overview of what this file contains:

INVENTORY_DATA: A dictionary containing sample inventory items, warehouse details, and business constraints that simulates data you might retrieve from a database in a real application.
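The full INVENTORY_DATA dictionary lives in the GitHub repository. To give a rough idea of its shape, here is a trimmed-down sketch containing just the fields the sample tool below reads; the specific values and any field names beyond those are illustrative, and the real file includes more products and attributes:

INVENTORY_DATA = {
    "products": [
        {
            "product_id": "P001",
            "name": "iPhone",
            "current_stock": 157,      # units on hand
            "cost_per_unit": 650.00,   # illustrative cost
        },
        # ... more products in the real file
    ],
    "warehouse": {
        "max_capacity_units": 10000,
        "current_utilization_units": 6200,
    },
    # plus business constraints such as budgets and storage limits
}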

Three key tool functions that our agent will use:

  • get_inventory_status: Returns current stock levels and alerts for products running low
  • get_product_details: Provides detailed information about specific products
  • get_reorder_recommendations: Suggests optimal reorder quantities based on current trends

Here’s a small sample of how one of these tools is implemented:

@tool("get_inventory_status")
def get_inventory_status() -> Dict[str, Any]:
    """Get current inventory status for all products including stock levels, alerts, and warehouse utilization."""
    today = datetime.now()
    status = {
        "overview": {
            "total_products": len(INVENTORY_DATA["products"]),
            "total_stock_value": sum(p["current_stock"] * p["cost_per_unit"] for p in INVENTORY_DATA["products"]),
            "warehouse_utilization": f"{(INVENTORY_DATA['warehouse']['current_utilization_units'] / INVENTORY_DATA['warehouse']['max_capacity_units']) * 100:.1f}%",
            "as_of_date": today.strftime("%Y-%m-%d")
        },
        "alerts": [],
        "product_status": []
    }
    
    # Additional logic to populate alerts and product status...
    
    return status
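
One piece worth calling out is create_inventory_tools, which inventory_agent.py imports. In the repository it assembles the tool list for the agent; a minimal version consistent with the three tools listed above might look like this:

def create_inventory_tools():
    """Return the list of tools the inventory agent can call."""
    return [
        get_inventory_status,
        get_product_details,
        get_reorder_recommendations,
    ]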

With the agent up and running, try asking the following questions:

  • Which products are running low?
  • Is there anything I need to reorder right now?
  • Which product has the highest growth rate?
  • What’s the total value of our inventory?
  • When will we run out of product P001 at current sales rates?

And with that, you’ve got a basic conversational agent, available at any time via text, that can answer detailed questions and provide actionable insights about your inventory.

Where to go from here

You’ve now built a complete conversational agent that’s accessible through text messaging! Let’s recap what we’ve accomplished:

  • We set up a serverless environment on Modal that can receive webhook calls from Twilio
  • We created a Twilio SMS integration to send and receive text messages
  • We implemented a LangGraph agent system with a supervisor architecture
  • We built a specialized inventory management agent with tools to analyze product data
  • We connected everything together into a working system that maintains conversation state

What’s particularly powerful about this approach is how easily it can be extended. Since we’ve used the supervisor architecture, you could add new specialized agents to handle different business domains like:

  • A financial analysis agent for tracking revenue and expenses
  • A marketing agent for analyzing campaign performance
  • A customer service agent for handling common support queries

Each new agent would only need its own set of specialized tools and domain-specific instructions.

The beauty of this architecture is that the core agent system remains the same regardless of how you expand the functionality or where you deploy it.

By removing traditional user interface barriers and meeting users where they already are—on their phones via familiar text messaging—you’ve created a business intelligence tool that’s both powerful and accessible. This is just one example of how AI agents and natural language can transform complex digital experiences into simple conversations.

You can download the full project on Github.