OpenAI

OpenAI Realtime is a low-latency, speech-to-speech API that transforms live audio input into spoken responses using OpenAI models. It combines real-time transcription, language understanding, and text-to-speech synthesis into a single streamlined pipeline.

The OpenAI Realtime plugin in the Stream Python AI SDK wraps this functionality in a convenient interface, allowing you to send live audio and receive natural, voice-based responses with minimal delay. This is ideal for building conversational agents, voice bots, or AI-driven avatars that talk back—instantly and naturally.

Installation

Install the Stream OpenAI plugin with uv:

uv add getstream-plugins-openai

Example

Check out our OpenAI example to see a practical implementation of the plugin and get inspiration for your own projects, or read on for some key details.

Initialisation

The OpenAI plugin for Stream is exposed as the OpenAIRealtime class:

from getstream.plugins.openai import OpenAIRealtime

sts_bot = OpenAIRealtime()

Parameters

These are the parameters available in the OpenAIRealtime plugin:

Name	Type	Default	Description
`api_key`	`str` or `None`	`None`	Your OpenAI API key. If not provided, the SDK will look for it in env vars.
`model`	`str`	`"gpt-4o-realtime-preview"`	The OpenAI model to use for speech-to-speech. Supports real-time models only.
`voice`	`str` or `None`	`None`	The voice to use for spoken responses. If `None`, defaults to OpenAI’s standard.
`instructions`	`str` or `None`	`None`	Optional system prompt to guide GPT’s behavior and tone.
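
As a minimal sketch of how these parameters might be assembled, the helper below collects the constructor arguments and falls back to an environment variable when no key is passed. The helper name `build_realtime_kwargs` and the variable name `OPENAI_API_KEY` are assumptions made for illustration; the table only states that the SDK looks in env vars.

```python
import os
from typing import Mapping, Optional


def build_realtime_kwargs(
    api_key: Optional[str] = None,
    model: str = "gpt-4o-realtime-preview",
    voice: Optional[str] = None,
    instructions: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Collect constructor arguments, mirroring the defaults in the table."""
    # Assumption: OPENAI_API_KEY is the variable consulted when api_key is None.
    resolved_key = api_key or env.get("OPENAI_API_KEY")
    if not resolved_key:
        raise ValueError("No API key passed and OPENAI_API_KEY is not set")
    return {
        "api_key": resolved_key,
        "model": model,
        "voice": voice,
        "instructions": instructions,
    }


if __name__ == "__main__":
    kwargs = build_realtime_kwargs(
        instructions="Keep answers short and friendly.",
        env={"OPENAI_API_KEY": "sk-example"},  # placeholder key for the demo
    )
    print(sorted(kwargs))
```

The resulting dictionary could then be passed on as `OpenAIRealtime(**kwargs)`. Failing early on a missing key tends to produce a clearer error than one raised later during connection.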

Functionality

Connect

The connect() method connects the OpenAIRealtime plugin to the conversational OpenAI model. It takes the call the bot should join and the user ID the bot will use:

sts_bot.connect(call, agent_user_id=bot_user_id)

Send User Message

In a call with a conversational bot, the user and the bot take turns talking to each other. The send_user_message() method allows you to send a message from the human side of the conversation:

sts_bot.send_user_message("Give a very short greeting to the user.")

Request Assistant Response

Once the user's turn is complete, the request_assistant_response() method asks OpenAI to generate the next assistant turn:

sts_bot.request_assistant_response()

Update Session

The update_session() method allows you to update the current OpenAI session:

await sts_bot.update_session(
    instructions="You are now a Shakespearean character."
)

Function Calling

You can give the model the ability to call functions in your code by passing a tools list to update_session():

await sts_bot.update_session(
    turn_detection={
        "type": "semantic_vad",
        "eagerness": "low",
        "create_response": True,
        "interrupt_response": True,
    },
    tools=[
        {
            "type": "function",
            "name": "start_closed_captions",
            "description": "start closed captions for the call",
            "parameters": {
                "type": "object",
                "properties": {},
                "required": [],
            },
        }
    ],
)
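
When several functions are exposed, building each entry by hand gets repetitive. The sketch below shows one way to generate entries in the same shape as the tools list above. The helper `function_tool` and the second tool, `set_caption_language`, are hypothetical and are not part of the plugin.

```python
from typing import Any, Dict, List, Optional


def function_tool(
    name: str,
    description: str,
    properties: Optional[Dict[str, Any]] = None,
    required: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build one entry for the tools list in the shape shown above."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties or {},
            "required": required or [],
        },
    }


tools = [
    function_tool("start_closed_captions", "start closed captions for the call"),
    # Hypothetical second tool that takes one string argument.
    function_tool(
        "set_caption_language",
        "set the language used for closed captions",
        properties={"language": {"type": "string"}},
        required=["language"],
    ),
]
```

The `tools` list can then be passed as the `tools` argument of update_session().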

If the model decides to call one of these functions, you will receive a function_call event. You can then run the function and pass its output back to the model.

The send_function_call_output() method lets you return the result of a function call the model requested:

await sts_bot.send_function_call_output(tool_call_id, result.to_json())
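
The round trip described above can be sketched as a small dispatcher: look up the requested function, run it, and send the serialized result back. This is an illustration under stated assumptions, not the plugin's own mechanism. The event attributes `name`, `arguments`, and `call_id` are assumed, and `FakeBot` is a stand-in so the example runs without a live call.

```python
import asyncio
import json
from types import SimpleNamespace
from typing import Any, Awaitable, Callable, Dict

# Registry of local coroutines the model is allowed to call.
HANDLERS: Dict[str, Callable[..., Awaitable[Any]]] = {}


def tool(name: str):
    """Register a coroutine under the name advertised in the tools list."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@tool("start_closed_captions")
async def start_closed_captions() -> dict:
    # Placeholder body; a real handler would call into your application.
    return {"ok": True}


async def handle_function_call(bot, event) -> None:
    """Run the requested function and return its result to the model."""
    # Assumption: the event exposes name, arguments (JSON string) and call_id.
    handler = HANDLERS.get(event.name)
    if handler is None:
        result = {"error": f"unknown function: {event.name}"}
    else:
        args = json.loads(event.arguments or "{}")
        result = await handler(**args)
    await bot.send_function_call_output(event.call_id, json.dumps(result))


class FakeBot:
    """Stand-in for OpenAIRealtime that records what would be sent."""
    def __init__(self):
        self.sent = []

    async def send_function_call_output(self, tool_call_id, output):
        self.sent.append((tool_call_id, output))


if __name__ == "__main__":
    bot = FakeBot()
    event = SimpleNamespace(
        name="start_closed_captions", arguments="{}", call_id="call_1"
    )
    asyncio.run(handle_function_call(bot, event))
    print(bot.sent)
```

Returning an error payload for unknown names, rather than raising, gives the model a chance to recover within the conversation.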

OpenAI Events

The OpenAI Realtime API emits events throughout a conversation, and the Stream SDK's OpenAI integration gives you access to them. These events are distinct from Stream events, which are emitted from the Stream side.

async with sts_bot.connect(call, agent_user_id=bot_user_id) as connection:
    # Process events using an async iterator
    async for event in connection:
        # Handle different event types
        if event.type == "conversation.updated":
            print(f"conversation.updated: {event}")

        elif event.type == "conversation.item.completed":
            print(f"conversation.item.completed: {event}")

        elif event.type == "error":
            print(f"error: {event}")
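
As the number of handled event types grows, the if/elif chain can be replaced with a lookup table. The sketch below is one possible arrangement; `fake_connection` is a hypothetical async generator standing in for the connection object, and `"some.unhandled.event"` is a made-up type used to show the fallthrough.

```python
import asyncio
from types import SimpleNamespace
from typing import Callable, Dict, List

seen: List[str] = []  # records which handlers ran, for demonstration only


def on_conversation_updated(event) -> None:
    seen.append("updated")


def on_item_completed(event) -> None:
    seen.append("completed")


def on_error(event) -> None:
    seen.append("error")


# Map each event type string to the function that handles it.
ROUTES: Dict[str, Callable[[object], None]] = {
    "conversation.updated": on_conversation_updated,
    "conversation.item.completed": on_item_completed,
    "error": on_error,
}


def dispatch(event) -> bool:
    """Call the handler for event.type; return False if none is registered."""
    handler = ROUTES.get(event.type)
    if handler is None:
        return False
    handler(event)
    return True


async def fake_connection():
    """Hypothetical stand-in that yields events like the real connection."""
    for event_type in ("conversation.updated", "some.unhandled.event", "error"):
        yield SimpleNamespace(type=event_type)


async def main() -> None:
    async for event in fake_connection():
        dispatch(event)


if __name__ == "__main__":
    asyncio.run(main())
    print(seen)
```

In real use, the `async for` loop would iterate over the connection returned by connect() instead of `fake_connection()`.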
