How to Add Function-Calling Bots with MCP

This how-to guide shows you how to give function-calling capabilities to AI bots in your Stream video calls, using the Model Context Protocol (MCP).

For a complete runnable example, see the GitHub sample.

Need an MCP overview first? See the MCP explanation.

You will create a small assistant that:

  1. Joins a Stream video call
  2. Listens to participants’ speech
  3. Converts speech → text with a Deepgram STT plugin
  4. Sends the transcript to an OpenAI LLM that may call MCP tools
  5. Executes the function and speaks the result back with ElevenLabs TTS

The same flow as a diagram:

flowchart LR
    User((User Audio)) -->|audio| Bot[Bot]
    Bot -->|PCM data| STT[STT]
    STT -->|transcript| LLM[LLM]
    LLM -->|MCP| Func[Python Function]
    Func -->|result| Out[Chat & TTS]
    Out -->|message & audio| User

Prerequisites

  • Stream account and Stream app - create a Stream app in the Stream Dashboard and copy your STREAM_API_KEY and STREAM_API_SECRET
  • API keys for Deepgram, OpenAI and ElevenLabs
  • Python 3.10+
  • uv package manager (recommended)

Create a project folder

In your terminal, create a project folder and initialize a uv project:

mkdir mcp-bot && cd mcp-bot
uv init   # creates the pyproject.toml that uv add records dependencies into

Add dependencies

uv add \
  "getstream[webrtc]>=2.3.0a4" \
  "getstream-plugins-deepgram>=0.1.1" \
  "getstream-plugins-elevenlabs>=0.1.0" \
  "fastmcp>=2.10.1" \
  "openai>=1.93.0" \
  "httpx>=0.28.1" \
  "python-dotenv>=1.1.1"

The specifiers are quoted so the shell doesn't treat > as an output redirect or try to expand the brackets.

Set Environment Variables

Copy the template below to a file called .env and fill in your actual keys.

STREAM_API_KEY=your_stream_api_key
STREAM_API_SECRET=your_stream_api_secret
EXAMPLE_BASE_URL=https://pronto.getstream.io

DEEPGRAM_API_KEY=your_deepgram_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
OPENAI_API_KEY=your_openai_api_key
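
If you want to confirm Python can see the keys before going further, a throwaway check such as this (our own helper, not part of the bot) will do:

# check_env.py - optional sanity check, delete it afterwards
import os
from dotenv import load_dotenv

load_dotenv()
for key in ("STREAM_API_KEY", "STREAM_API_SECRET", "DEEPGRAM_API_KEY",
            "ELEVENLABS_API_KEY", "OPENAI_API_KEY"):
    print(key, "OK" if os.getenv(key) else "MISSING")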

Create an MCP Server (server.py)

Create a file called server.py and add the following:

# server.py
from fastmcp import FastMCP
import httpx

mcp = FastMCP("Stream MCP Server")

# Define a tool that can be used by your MCP Client
@mcp.tool()
def get_forecast(city: str) -> str:
    """Return today's weather for <city>."""
    data = httpx.get(f"https://wttr.in/{city}?format=%C+%t").text
    return data.strip()

if __name__ == "__main__":
    mcp.run()

get_forecast will be the tool called by your agent.
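
Before wiring the server into the bot, you can exercise the tool directly with a standalone fastmcp client. A minimal sketch (the file name test_server.py is our own choice):

# test_server.py - optional: call the tool without any bot involved
import asyncio
import fastmcp

async def main():
    # Passing a .py path makes fastmcp run the server as a stdio subprocess
    async with fastmcp.Client("server.py") as client:
        print([t.name for t in await client.list_tools()])
        result = await client.call_tool("get_forecast", {"city": "London"})
        print(result.data)

asyncio.run(main())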

Create an LLM helper (agent.py)

Create a file called agent.py. This will contain the logic for the MCP client and define how the LLM works with your tool.

# agent.py
import json, re, os, logging, openai, fastmcp
from dotenv import load_dotenv

# main.py imports this module before it calls load_dotenv(), so load here too
load_dotenv()

openai.api_key = os.environ["OPENAI_API_KEY"]
logging.basicConfig(level=logging.INFO)

# The system prompt that will be passed in to the LLM
SYSTEM = """
You are a helpful assistant. When the user needs real-world weather data, call the get_forecast tool.
If you call a tool, reply with exactly tool_name["arg": "value"] on a single line and nothing else.
You can chain up to 3 tool calls.
"""

def _chat(messages):
    resp = openai.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

async def chat_with_tools(prompt: str, client: fastmcp.Client) -> str:
    tools = await client.list_tools()
    tool_names = {t.name for t in tools}
    history = [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": prompt},
    ]

    # Allow at most 3 chained tool calls, matching the system prompt
    for _ in range(3):
        reply = _chat(history)
        m = re.match(r"^(\w+)\[(.*)\]$", reply.strip())

        if not (m and m.group(1) in tool_names):
            return reply

        name, args_json = m.groups()
        args = json.loads(f"{{{args_json}}}")
        result = await client.call_tool(name, args)
        # Record both the call and its result so the model can build on it
        history.append({"role": "assistant", "content": reply})
        history.append({"role": "user", "content": f"{name} returned: {result.data}"})

    # Out of tool budget: ask the model for a final answer
    return _chat(history)

When your bot sends a transcript of your speech to the LLM, it runs this tool-calling loop, giving the model the option of invoking the tool you defined above.
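
To make the convention concrete, here is the parsing step from chat_with_tools in isolation:

import json, re

reply = 'get_forecast["city": "London"]'        # what the LLM is told to emit
m = re.match(r"^(\w+)\[(.*)\]$", reply.strip())
name, args_json = m.groups()                    # 'get_forecast', '"city": "London"'
args = json.loads(f"{{{args_json}}}")           # {'city': 'London'}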

Write the Bot Logic (main.py)

Now we’re ready for the main bot. Create a file called main.py and add the following sections.

Import Relevant Objects

import os, uuid, asyncio, logging, webbrowser
from urllib.parse import urlencode
from dotenv import load_dotenv

from getstream.stream import Stream
from getstream.models import UserRequest
from getstream.video import rtc
from getstream.video.rtc import audio_track
from getstream.video.rtc.track_util import PcmData
from getstream.plugins.deepgram.stt import DeepgramSTT
from getstream.plugins.elevenlabs.tts import ElevenLabsTTS

from agent import chat_with_tools
import fastmcp

Set Logger and Create a Client

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

load_dotenv()

CLIENT = Stream.from_env()
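
Stream.from_env() reads STREAM_API_KEY and STREAM_API_SECRET from the environment, which is why load_dotenv() must run first.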

Define Utility Methods

def _create_user(user_id: str, name: str):
    CLIENT.upsert_users(UserRequest(id=user_id, name=name))


def _open_browser(call_id: str, token: str):
    base = f"{os.getenv('EXAMPLE_BASE_URL')}/join/"
    params = {"api_key": CLIENT.api_key, "token": token, "skip_lobby": "true"}
    url = f"{base}{call_id}?{urlencode(params)}"
    webbrowser.open(url)
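
_open_browser builds a join link to Stream's hosted example app (the EXAMPLE_BASE_URL from your .env) with the human user's token, so you can enter the call from a browser tab while the bot runs in your terminal.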

Create a Function to Run the Bot

async def _run_bot(call, bot_id):
    stt = DeepgramSTT()
    tts = ElevenLabsTTS()
    track = audio_track.AudioStreamTrack(framerate=16000)
    tts.set_output_track(track)

    async with await rtc.join(call, bot_id) as conn:
        await conn.add_tracks(audio=track)
        async with fastmcp.Client("server.py") as mcp_client:

            @conn.on("audio")
            async def on_audio(pcm: PcmData, user):
                await stt.process_audio(pcm, user)

            @stt.on("transcript")
            async def on_transcript(text, user, meta):
                if not meta.get("is_final", False):
                    return
                answer = await chat_with_tools(text, mcp_client)
                if answer:
                    await tts.send(answer)

            await conn.wait()
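
The two handlers implement the pipeline from the diagram: every incoming audio frame is forwarded to Deepgram, and each final transcript is passed through chat_with_tools, which may call your MCP tool, before ElevenLabs speaks the answer on the bot's audio track.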

Create the Main Loop

async def main():
    call_id = str(uuid.uuid4())
    human_id = f"user-{uuid.uuid4()}"
    bot_id = f"mcp-bot-{uuid.uuid4()}"

    _create_user(human_id, "Human")
    _create_user(bot_id, "MCP Bot")

    token = CLIENT.create_token(human_id, expiration=3600)
    call = CLIENT.video.call("default", call_id)
    call.get_or_create(data={"created_by_id": bot_id})
    _open_browser(call_id, token)

    try:
        await _run_bot(call, bot_id)
    finally:
        CLIENT.delete_users([human_id, bot_id])


if __name__ == "__main__":
    asyncio.run(main())
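
The finally block deletes both temporary users, so repeated runs don't leave stray accounts in your Stream app.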

Run Everything

Now we’re ready to run the code. Run with

uv run main.py    # starts the bot and opens the browser

You can now speak in the call. Ask the bot what the weather is like in a city, e.g. “What’s the weather like in London?”. The bot calls get_forecast, gets the weather and replies aloud.

Next Steps

You have now built an MCP-enabled assistant bot that runs entirely on Stream’s Python SDK. You can:

  • Add more @mcp.tool() functions in server.py (see the sketch below)
  • Swap providers - any STT/TTS plugin will work!
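
For example, a second tool is just another decorated function; this sketch adds a hypothetical get_utc_time tool to server.py that list_tools() will pick up automatically:

# server.py - an extra tool is just another decorated function
from datetime import datetime, timezone

@mcp.tool()
def get_utc_time() -> str:
    """Return the current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat()

Remember to mention new tools in the SYSTEM prompt in agent.py so the model knows when to call them.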