Let’s build a restaurant reservation system that lets callers speak with a voice agent over a real-time phone call. The service will have three main features:
- Agent Outbound Call: The agent can act as both a customer helper and a restaurant assistant. For example, it can be configured as an AI restaurant employee that calls customers back to resolve orders or cancel reservations.
- Inbound Call to Agent: People can call from their mobile phones and use telephony services to speak with the agent to make reservations.
- Retrieval Augmented Generation (RAG): The system uses RAG to efficiently and reliably fetch restaurant data. For example, when a user asks about cancelling a reservation or about special offers for a particular day, it searches its knowledge base for the passages most relevant to the user’s query before responding.
Project Overview and Architecture
The restaurant reservation agent combines several AI technologies, models, services, and platforms into an all-in-one solution for agent phone calling, messaging, and RAG. Because it integrates with Twilio messaging, you can also send an SMS to the service, in addition to calling it with the built-in phone app on your device.
The demos in this article were tested using the iPhone Phone App.
You will use:
- Vision Agents: For the agent orchestration and voice pipeline assembly. It provides all the building blocks for quickly creating voice, video, and vision AI apps in Python.
- Turbopuffer: To equip the Twilio telephony system with a knowledge base.
- Twilio: As the telephony service.
- ElevenLabs: For voice synthesis (text-to-speech).
- Deepgram: For voice recognition (speech-to-text).
- Gemini: As the underlying AI model that processes the speech conversations in the system.
Below are the demos you will build. The first demonstrates a phone call between a customer and the AI reservation service using the iPhone. The second demo is part of the first, but it illustrates the agent retrieving real-time information about restaurant opening hours and cancellation policy using a Turbopuffer knowledge base.
Download the two sample demos from GitHub and try them with API credentials for the following services: Stream, Turbopuffer, Google (Gemini), ElevenLabs, Deepgram, and Twilio.
The app will be served on your localhost through an NGROK public URL. Sign up for an NGROK account if you don't have one yet.
Initial AI Phone Call: A Customer Booking Reservation
Agent Retrieving Required Knowledgebase Document to Answer a Customer
Quick Start in Python
Ensure you have Python 3.12 or later installed on your machine, then configure your project with the following to get started.
```
# Requirements: Python 3.12 or later

# .env
STREAM_API_KEY=...
STREAM_API_SECRET=...

# For vector and full-text search
TURBOPUFFER_API_KEY=...

# AI services
GOOGLE_API_KEY=...
ELEVENLABS_API_KEY=...
DEEPGRAM_API_KEY=...

# Telephony service
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...

# Put your local host on a public URL
NGROK_URL=https://replace_with_your.ngrok-free.app

# Create a new Python project with uv
uv init

# Install Vision Agents and required Python plugins
uv add vision-agents
uv add "vision-agents[getstream, turbopuffer, twilio, gemini, elevenlabs, deepgram]"
```
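Before running the demos, it can help to confirm that every credential is actually set. The following is a minimal sketch and not part of Vision Agents: the helper name `missing_credentials` is hypothetical, and the variable list simply mirrors the `.env` entries above.

```python
import os

# Names mirror the .env entries listed in the configuration above.
REQUIRED_VARS = [
    "STREAM_API_KEY",
    "STREAM_API_SECRET",
    "TURBOPUFFER_API_KEY",
    "GOOGLE_API_KEY",
    "ELEVENLABS_API_KEY",
    "DEEPGRAM_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "NGROK_URL",
]


def missing_credentials(env=None):
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials are set.")
```

If you keep the values in a `.env` file, load it first (for example with `load_dotenv()` from python-dotenv, as the demo scripts do) so the check sees them.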
Basic Turbopuffer Usage in Vision Agents
With your Python project configured and Vision Agents installed along with its plugins, you can now start using Turbopuffer in your project. Here is the sample code for basic usage.
```python
import asyncio

from vision_agents.plugins import turbopuffer


async def main():
    # Initialize RAG and index every document in ./knowledge
    rag = turbopuffer.TurboPufferRAG(namespace="my-knowledge")
    await rag.add_directory("./knowledge")

    # Hybrid search (default)
    results = await rag.search("How does the chat API work?")
    print(results)

    # Vector-only search
    results = await rag.search("How does the chat API work?", mode="vector")

    # BM25-only search
    results = await rag.search("chat API pricing", mode="bm25")

    # Or use the convenience function, which creates and indexes in one step
    rag = await turbopuffer.create_rag(
        namespace="product-knowledge",
        knowledge_dir="./knowledge"
    )


if __name__ == "__main__":
    # The RAG calls are coroutines, so they need a running event loop
    asyncio.run(main())
```
In the example above, we initialize Turbopuffer for RAG, index the ./knowledge directory, and run hybrid, vector-only, and BM25-only searches.
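Hybrid search merges a semantic (vector) ranking with a keyword (BM25) ranking. This article does not state how the Turbopuffer plugin fuses the two, so the sketch below uses reciprocal rank fusion (RRF), a common fusion technique, purely as an illustration. The function name, the document names, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) per list it appears in;
    documents ranked highly by both searches accumulate the most.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
vector_hits = ["hours.md", "menu.md", "policy.md"]
bm25_hits = ["policy.md", "hours.md", "offers.md"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is why hybrid mode tends to handle both paraphrased questions and exact keywords such as dish names.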
Restaurant Information Retrieval Example With Turbopuffer
Add inbound_phone_call_turbopuffer_rag.py to your uv-based Python project and replace its content with the following. In the example below, we implement an inbound AI phone-calling system with a RAG backend using Turbopuffer and LangChain, and we use function calling to let the LLM query the knowledge base.
Additionally, you should create a project knowledge base. Let’s add the knowledge directory and create three Markdown documents in it.
You can copy the content of each knowledge base document from the GitHub project.
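Before starting the server, you may want to confirm that the knowledge directory contains the Markdown files you expect. This is a minimal sketch using only the standard library. The helper name `list_knowledge_files` is hypothetical, and it only looks at the top level of the directory; whether the plugin also indexes subdirectories is not covered here.

```python
from pathlib import Path


def list_knowledge_files(knowledge_dir, extensions=(".md",)):
    """Return the sorted knowledge files directly inside knowledge_dir."""
    root = Path(knowledge_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )


if __name__ == "__main__":
    files = list_knowledge_files("knowledge")
    print(f"Found {len(files)} Markdown document(s)")
    for path in files:
        print(" -", path.name)
```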
```python
import asyncio
import logging
import os
import traceback
import uuid
from pathlib import Path

import uvicorn
from uvicorn.middleware.proxy_headers import ProxyHeadersMiddleware
from dotenv import load_dotenv
from fastapi import Depends, FastAPI, Request, WebSocket
from fastapi.responses import JSONResponse
from vision_agents.core import User, Agent
from vision_agents.plugins import (
    getstream,
    gemini,
    twilio,
    elevenlabs,
    deepgram,
    turbopuffer,
)

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

load_dotenv()

# Keep only the host part; the scheme (wss://) is added when the media URL is built
NGROK_URL = os.environ["NGROK_URL"].replace("https://", "").replace("http://", "").rstrip("/")
KNOWLEDGE_DIR = Path(__file__).parent / "knowledge"

# Global TurboPuffer RAG state (initialized on startup)
rag = None

app = FastAPI()
# Trust proxy headers from ngrok so Twilio signature validation works (https vs http)
app.add_middleware(ProxyHeadersMiddleware, trusted_hosts=["*"])
call_registry = twilio.TwilioCallRegistry()


@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    logger.error(f"Unhandled exception: {exc}\n{traceback.format_exc()}")
    return JSONResponse(status_code=500, content={"detail": str(exc)})


@app.post("/twilio/voice")
async def twilio_voice_webhook(
    _: None = Depends(twilio.verify_twilio_signature),
    data: twilio.CallWebhookInput = Depends(twilio.CallWebhookInput.as_form),
):
    """Twilio call webhook. Validates signature and starts the media stream."""
    logger.info(
        f"📞 Call from {data.caller} ({data.caller_city or 'unknown location'})"
    )
    call_id = str(uuid.uuid4())

    async def prepare_call():
        agent = await create_agent()
        await agent.create_user()

        # Derive a stable user ID from the caller's phone number
        phone_number = data.from_number or "unknown"
        sanitized_number = (
            phone_number.replace("+", "")
            .replace(" ", "")
            .replace("(", "")
            .replace(")", "")
        )
        phone_user = User(
            name=f"Call from {phone_number}", id=f"phone-{sanitized_number}"
        )
        await agent.edge.create_user(user=phone_user)

        stream_call = await agent.create_call("default", call_id=call_id)
        return agent, phone_user, stream_call

    # Register the call; preparation runs while Twilio opens the media stream
    twilio_call = call_registry.create(call_id, data, prepare=prepare_call)
    url = f"wss://{NGROK_URL}/twilio/media/{call_id}/{twilio_call.token}"
    logger.info("twilio redirect to %s", url)

    return twilio.create_media_stream_response(url)


@app.websocket("/twilio/media/{call_id}/{token}")
async def media_stream(websocket: WebSocket, call_id: str, token: str):
    """Receive real-time audio stream from Twilio."""
    twilio_call = call_registry.validate(call_id, token)

    logger.info(f"🔗 Media stream connected for {twilio_call.caller}")

    twilio_stream = twilio.TwilioMediaStream(websocket)
    await twilio_stream.accept()
    twilio_call.twilio_stream = twilio_stream

    try:
        (
            agent,
            phone_user,
            stream_call,
        ) = await twilio_call.await_prepare()
        twilio_call.stream_call = stream_call

        await twilio.attach_phone_to_call(stream_call, twilio_stream, phone_user.id)

        async with agent.join(stream_call, participant_wait_timeout=0):
            await agent.llm.simple_response(
                text="Act as a restaurant assistant and help customers make a reservation. Greet the caller warmly and mention what menu is available and special offers for the day. Use your knowledge base to provide relevant booking or reservation information."
            )
            await twilio_stream.run()
    finally:
        # Always release the registry entry, even if the call fails
        call_registry.remove(call_id)


async def create_rag_from_directory():
    """Initialize TurboPuffer RAG from the knowledge directory."""
    global rag

    if not KNOWLEDGE_DIR.exists():
        logger.warning(f"Knowledge directory not found: {KNOWLEDGE_DIR}")
        return

    logger.info(f"📚 Initializing TurboPuffer RAG from {KNOWLEDGE_DIR}")
    rag = await turbopuffer.create_rag(
        namespace="restaurant-knowledge-turbopuffer",
        knowledge_dir=KNOWLEDGE_DIR,
        extensions=[".md"],
    )
    logger.info(
        f"✅ TurboPuffer RAG ready with {len(rag._indexed_files)} documents indexed"
    )


async def create_agent() -> Agent:
    """Create a phone call restaurant reservation agent with TurboPuffer RAG."""
    instructions = """Read the instructions in @phone_call_rag_instructions.md"""

    llm = gemini.LLM("gemini-2.5-flash-lite")

    @llm.register_function(
        description="Search restaurant knowledge base for detailed information about the menu, special offers, and reservation information."
    )
    async def search_knowledge(query: str) -> str:
        # rag stays None when the knowledge directory was not found at startup
        if rag is None:
            return "The knowledge base is not available."
        return await rag.search(query, top_k=3)

    return Agent(
        edge=getstream.Edge(),
        agent_user=User(id="ai-agent", name="AI"),
        instructions=instructions,
        tts=elevenlabs.TTS(voice_id="FGY2WhTYpPnrIDTdsKH5"),
        stt=deepgram.STT(eager_turn_detection=True),
        llm=llm,
    )


if __name__ == "__main__":
    asyncio.run(create_rag_from_directory())
    logger.info("Starting with TurboPuffer RAG backend")
    uvicorn.run(app, host="localhost", port=8000)
```
For this demo, we built a voice AI agent that answers phone calls via Twilio with TurboPuffer RAG support.
When the call is initiated, Twilio triggers a webhook to prepare the call. To see the result, start NGROK to expose your local server to a public URL and use an active Twilio phone number for testing. Refer to the next section for instructions on obtaining a Twilio phone number.
ngrok http 8000
Running the command above will display information in your Terminal similar to the image below.
Get a Twilio Phone Number and Configure
Navigate to your Twilio console account and buy a new phone number. Under Phone Numbers, select Manage → Active numbers, and set the A call comes in Webhook to the URL NGROK generated for you, followed by the /twilio/voice path that the script serves: https://replace_with_yours.ngrok-free.app/twilio/voice. To receive SMS on the same number, add the URL to the A message comes in Webhook under Messaging Services.
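The inbound script derives two addresses from the same NGROK host: the HTTPS webhook that Twilio calls, and the secure WebSocket URL that carries the audio. The sketch below mirrors the string handling in the script; the helper names are hypothetical and the host, call ID, and token values are placeholders.

```python
def normalize_host(ngrok_url: str) -> str:
    """Strip the scheme and trailing slash, as the inbound script does for NGROK_URL."""
    return ngrok_url.replace("https://", "").replace("http://", "").rstrip("/")


def voice_webhook_url(ngrok_url: str) -> str:
    """URL to paste into the 'A call comes in' webhook field."""
    return f"https://{normalize_host(ngrok_url)}/twilio/voice"


def media_stream_url(ngrok_url: str, call_id: str, token: str) -> str:
    """WebSocket URL Twilio connects to for the audio stream."""
    return f"wss://{normalize_host(ngrok_url)}/twilio/media/{call_id}/{token}"


if __name__ == "__main__":
    base = "https://replace_with_yours.ngrok-free.app/"  # placeholder host
    print(voice_webhook_url(base))
    print(media_stream_url(base, "example-call-id", "example-token"))
```

Because the scheme is stripped first, it does not matter whether NGROK_URL is stored with https://, http://, or no scheme at all.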
Keep the NGROK Terminal running, open a new Terminal window, and execute the command below.
NGROK_URL=https://c541-176-72-38-94.ngrok-free.app uv run inbound_phone_call_turbopuffer_rag.py --from +418758761952 --to +11413901752
This command consists of your generated NGROK URL, the phone number you want to call from (--from), and the Twilio phone number to call (--to). To place an outbound call, swap the positions of the two phone numbers.
You should now see an output similar to this image.
The three indexed documents in this image are those knowledge base Markdown files we added in the previous section.
Make an Outbound Phone Call
You can make outbound calls from your purchased Twilio phone number and receive them on desktop and mobile devices. Add a new file outbound_phone_call_turbopuffer.py to the project and replace its content with this sample code.
```python
import asyncio
import logging
import os
import uuid

import click
import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, WebSocket
from twilio.rest import Client
from vision_agents.core import Agent, User
from vision_agents.plugins import gemini, getstream, twilio

load_dotenv()

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

# Keep only the host part; the scheme (wss://) is added when the media URL is built
NGROK_URL = os.environ["NGROK_URL"].replace("https://", "").replace("http://", "").rstrip("/")

app = FastAPI()
call_registry = twilio.TwilioCallRegistry()


async def create_agent() -> Agent:
    return Agent(
        edge=getstream.Edge(),
        agent_user=User(id="ai-agent", name="AI Assistant"),
        instructions="Act as a restaurant assistant. Read the instructions in @phone_call_rag_instructions.md and respond to customers to make a reservation. Use your knowledge base to provide relevant booking or reservation information.",
        llm=gemini.Realtime(),
    )


async def initiate_outbound_call(from_number: str, to_number: str) -> str:
    """Initiate an outbound call via Twilio. Returns the call_id."""
    twilio_client = Client(
        os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"]
    )

    call_id = str(uuid.uuid4())

    async def prepare_call():
        agent = await create_agent()
        phone_user = User(name=f"Outbound call {call_id[:8]}", id=f"phone-{call_id}")

        # Create both users in a single API call
        await agent.edge.create_users([agent.agent_user, phone_user])

        stream_call = await agent.create_call("default", call_id)
        logger.info("prepared the call, ready to start")
        return agent, phone_user, stream_call

    twilio_call = call_registry.create(call_id, prepare=prepare_call)
    url = f"wss://{NGROK_URL}/twilio/media/{call_id}/{twilio_call.token}"

    # Build the TwiML once and reuse it for logging and for the call itself
    twiml = twilio.create_media_stream_twiml(url)
    logger.info("Forwarding to media url: %s \n %s", url, twiml)

    twilio_client.calls.create(
        twiml=twiml,
        to=to_number,
        from_=from_number,
    )

    logger.info(f"📞 Initiated call {call_id} from {from_number} to {to_number}")
    return call_id


@app.websocket("/twilio/media/{call_sid}/{token}")
async def media_stream(websocket: WebSocket, call_sid: str, token: str):
    twilio_call = call_registry.validate(call_sid, token)

    logger.info(f"🔗 Media stream connected for call {call_sid}")

    twilio_stream = twilio.TwilioMediaStream(websocket)
    await twilio_stream.accept()
    twilio_call.twilio_stream = twilio_stream

    try:
        (
            agent,
            phone_user,
            stream_call,
        ) = await twilio_call.await_prepare()
        twilio_call.stream_call = stream_call

        await twilio.attach_phone_to_call(stream_call, twilio_stream, phone_user.id)

        async with agent.join(stream_call, participant_wait_timeout=0):
            await agent.llm.simple_response(
                text="Act as a restaurant assistant. Read the instructions in @phone_call_rag_instructions.md and respond to customers to make a reservation. Use your knowledge base to provide relevant booking or reservation information."
            )
            await twilio_stream.run()
    finally:
        # Always release the registry entry, even if the call fails
        call_registry.remove(call_sid)


async def run_with_server(from_number: str, to_number: str):
    """Start the server and initiate the outbound call once ready."""
    config = uvicorn.Config(app, host="localhost", port=8000, log_level="info")
    server = uvicorn.Server(config)

    # Start server in background task
    server_task = asyncio.create_task(server.serve())

    # Wait for server to be ready
    while not server.started:
        await asyncio.sleep(0.1)

    logger.info("🚀 Server ready, initiating outbound call...")

    # Initiate the outbound call
    await initiate_outbound_call(from_number, to_number)

    # Keep running until server shuts down
    await server_task


@click.command()
@click.option(
    "--from",
    "from_number",
    required=True,
    help="The phone number to call from. Needs to be active in your Twilio account",
)
@click.option("--to", "to_number", required=True, help="The phone number to call")
def main(from_number: str, to_number: str):
    logger.info(
        "Starting outbound example. Note that latency is higher in dev. Deploy to US east for low latency"
    )
    asyncio.run(run_with_server(from_number, to_number))


if __name__ == "__main__":
    main()
```
Next, start NGROK with ngrok http 8000, then run the Python script to receive a voice call on your desktop or mobile device from your Twilio phone number.
NGROK_URL=https://c541-176-72-38-94.ngrok-free.app uv run outbound_phone_call_turbopuffer.py --from +19417703556 --to +35845856995
Here’s what you should see:
How Twilio is Used
Twilio is used for real-time audio streaming and handles the voice conversational flow during a phone call. The plugin for Vision Agents helps to integrate VoIP/telephony services into apps.
In the context of this project, the Twilio plugin helps with:
- Webhook Management: Handling webhooks for incoming calls and messages.
- Call Registry: Using phone call data and timestamps to track active calls.
- Media Streaming: Using Twilio Media Streams to transmit audio bidirectionally.
- Audio Conversion: Converting raw PCM audio to and from the format Twilio Media Streams carry (8 kHz mu-law).
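To make the audio conversion step concrete, here is a minimal, pure-Python sketch of G.711 mu-law encoding, the 8-bit format Twilio Media Streams carry as base64 payloads. It is an illustration only and not the plugin's implementation: the function names are hypothetical, and it assumes the input is already 16-bit little-endian mono PCM at 8 kHz, so resampling is out of scope.

```python
import base64
import struct

BIAS = 0x84   # 132, added to the magnitude before locating the segment
CLIP = 32635  # largest magnitude that still fits in 15 bits after the bias


def linear_to_mulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, from bit 14 down.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def pcm16_to_mulaw(pcm: bytes) -> bytes:
    """Convert 16-bit little-endian PCM bytes to mu-law bytes (one byte per sample)."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return bytes(linear_to_mulaw(s) for s in samples)


def to_twilio_payload(pcm: bytes) -> str:
    """Base64-encode mu-law audio for the payload field of a media message."""
    return base64.b64encode(pcm16_to_mulaw(pcm)).decode("ascii")


if __name__ == "__main__":
    silence = struct.pack("<4h", 0, 0, 0, 0)
    print(to_twilio_payload(silence))
```

Each 16-bit sample becomes a single byte, which halves the bandwidth of the stream at the cost of some dynamic range.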
The following steps occur during a Twilio phone call. When a call is initiated, Twilio triggers the webhook, which prepares the call. This starts a bi-directional stream using Twilio’s start.stream method and sends the stream to /twilio/media. Once the media stream connects, the script awaits the prepared call and attaches the phone user to it. The agent’s call session then remains active until the call ends, after which the call is removed from the registry.
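The create, validate, and remove steps of the call registry can be sketched as a small in-memory class. This is a hypothetical illustration of the pattern, not the source of `twilio.TwilioCallRegistry`: the class and field names are assumptions, and the real registry also tracks caller data and the prepared agent.

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class PendingCall:
    call_id: str
    token: str  # one-time secret embedded in the media stream URL


class SimpleCallRegistry:
    """Hypothetical in-memory registry that ties a media stream to a webhook call."""

    def __init__(self):
        self._calls = {}

    def create(self, call_id: str) -> PendingCall:
        # Called from the webhook: remember the call and mint an unguessable token.
        call = PendingCall(call_id=call_id, token=secrets.token_urlsafe(16))
        self._calls[call_id] = call
        return call

    def validate(self, call_id: str, token: str) -> PendingCall:
        # Called when the WebSocket connects: reject unknown calls and wrong tokens.
        call = self._calls.get(call_id)
        if call is None or not hmac.compare_digest(call.token, token):
            raise PermissionError("unknown call or invalid token")
        return call

    def remove(self, call_id: str) -> None:
        # Called in the finally block once the call ends.
        self._calls.pop(call_id, None)


if __name__ == "__main__":
    registry = SimpleCallRegistry()
    call = registry.create("example-call-id")
    print(registry.validate("example-call-id", call.token).call_id)
    registry.remove("example-call-id")
```

Putting the token in the WebSocket path means only the party that received the webhook response can attach audio to that call.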
Extend the Restaurant Reservation Agent
We have covered one of the many use cases of combining fast, efficient information retrieval with AI phone calling in restaurant reservations. By integrating Turbopuffer with Twilio in Vision Agents, teams can build scalable, enterprise-ready smart solutions for food ordering, sales, marketing automation, 24/7 customer support, and more.
To go beyond what you created in this article, check out the Vision Agents documentation to discover more about RAG for agents, use cases, AI integrations, and sample demos on GitHub.
Also, you can follow the open-source project on X or get support on Discord.
Troubleshooting Guide
When you try to run the demos for inbound and outbound calls, you may encounter some errors. Here are common issues and how to fix them.
- Spaces in Phone Numbers: Your Twilio phone number may contain spaces between digit groups when you copy it from the console dashboard. Spaces in the phone numbers you pass to the command that runs the Python script can result in an error, so remove them all. Also, do not wrap the phone number in quotation marks, such as “+19417703556”. The following shows the correct form of the phone numbers.
NGROK_URL=https://c541-176-72-38-94.ngrok-free.app uv run outbound_phone_call_turbopuffer.py --from +19417703556 --to +35845856995
- Ngrok URL Not Working: When the NGROK URL is not working, stop the NGROK process in its Terminal and generate the public URL for your localhost again with ngrok http 8000. If the URL changed, update NGROK_URL and the Twilio webhook to match.
- Twilio Phone Number Not Verified: If you get a Twilio phone number verification error, go to your Twilio console and verify the number under Verified Caller IDs.
- API Credentials Not Set: Ensure the Turbopuffer API key and other credentials are available to the script. Both scripts call load_dotenv(), so a .env file in the project directory works. Alternatively, export them in your Terminal or store them permanently in your machine's shell profile. On macOS, for instance, these can be stored in the hidden .zprofile or .zshrc files.
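The phone number cleanup described in the first item can be automated. The sketch below strips spaces, quotes, and common separators, then checks the result against the E.164 shape (a plus sign followed by up to 15 digits). The function name is hypothetical and the demo scripts do not include this check.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def normalize_phone_number(raw: str) -> str:
    """Return the number in E.164 form, or raise ValueError if it cannot be one."""
    # Drop surrounding whitespace and straight or curly quotation marks.
    cleaned = raw.strip().strip("\"'“”")
    # Remove spaces, parentheses, dashes, and dots inside the number.
    cleaned = re.sub(r"[\s()\-.]", "", cleaned)
    if not E164_PATTERN.match(cleaned):
        raise ValueError(f"not a valid E.164 phone number: {raw!r}")
    return cleaned


if __name__ == "__main__":
    print(normalize_phone_number("+1 941 770 3556"))
```

Running the values for --from and --to through a check like this before calling Twilio turns a confusing API error into an immediate, readable one.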
