OpenAI

OpenAI Realtime is a low-latency, speech-to-speech API that transforms live audio input into spoken responses using OpenAI’s GPT-4o model.
It combines real-time transcription, language understanding, and text-to-speech synthesis into a single streamlined pipeline.

The OpenAI Realtime plugin in the Stream Python AI SDK wraps this functionality in a convenient interface, allowing you to send live audio and receive natural, voice-based responses with minimal delay.
This makes it a good fit for conversational agents, voice bots, and AI-driven avatars that need to reply in natural speech with little perceptible delay.

It supports configurable voice personas, custom system instructions to guide GPT’s behavior, and seamless integration with real-time audio streams.

Initialisation

The OpenAI plugin for Stream is exposed as the OpenAIRealtime class:

sts_bot = OpenAIRealtime()

Parameters

These are the parameters available in the OpenAIRealtime plugin:

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str or None | None | Your OpenAI API key. If not provided, the SDK will look for it in env vars. |
| model | str | "gpt-4o-realtime-preview" | The OpenAI model to use for speech-to-speech. Supports real-time models only. |
| voice | str or None | None | The voice to use for spoken responses. If None, defaults to OpenAI’s standard. |
| instructions | str or None | None | Optional system prompt to guide GPT’s behavior and tone. |
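
As a rough sketch of how these parameters fit together, the helper below collects constructor arguments into a dict. The helper name build_realtime_kwargs, the OPENAI_API_KEY variable name, and the "alloy" voice are assumptions made for illustration and are not taken from the SDK. The commented-out last line shows where the dict would be passed to OpenAIRealtime.

```python
import os
from typing import Optional


def build_realtime_kwargs(
    voice: Optional[str] = None,
    instructions: Optional[str] = None,
    model: str = "gpt-4o-realtime-preview",
    env_var: str = "OPENAI_API_KEY",  # assumed variable name, not confirmed by the SDK docs
) -> dict:
    """Collect keyword arguments for OpenAIRealtime (hypothetical helper)."""
    kwargs = {"model": model}
    api_key = os.environ.get(env_var)
    if api_key:
        # Pass the key explicitly only when it is set; otherwise leave it out
        # so the SDK can perform its own environment lookup.
        kwargs["api_key"] = api_key
    if voice is not None:
        kwargs["voice"] = voice
    if instructions is not None:
        kwargs["instructions"] = instructions
    return kwargs


kwargs = build_realtime_kwargs(voice="alloy", instructions="Be brief and friendly.")
# sts_bot = OpenAIRealtime(**kwargs)  # requires the Stream SDK to be installed
```

Leaving out unset values, rather than passing None, keeps the call equivalent to relying on the defaults listed in the table.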

Functionality

Connect

The connect() method on the OpenAIRealtime plugin connects the bot to the conversational OpenAI model. It takes the call the bot should join and the user ID the bot should join as:

sts_bot.connect(call, agent_user_id=bot_user_id)

Send User Message

In a call with a conversational bot, the user and the bot take turns talking to each other. The send_user_message() method allows you to send a message from the human side of the conversation:

sts_bot.send_user_message("Give a very short greeting to the user.")

Request Assistant Response

Once a user turn has been added, the request_assistant_response() method asks OpenAI to generate the next assistant turn:

sts_bot.request_assistant_response()
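
The two calls above are often used as a pair: add a user turn, then ask for the reply. The sketch below wraps them in a single helper. Both prompt_bot and the RecordingBot stand-in are hypothetical and exist only so the example runs without the SDK. If these methods are coroutines in your SDK version, they would need to be awaited instead.

```python
class RecordingBot:
    """Stand-in for OpenAIRealtime that records calls (not part of the SDK)."""

    def __init__(self):
        self.calls = []

    def send_user_message(self, text):
        self.calls.append(("send_user_message", text))

    def request_assistant_response(self):
        self.calls.append(("request_assistant_response",))


def prompt_bot(bot, text):
    """Add a user turn, then request the assistant's reply (hypothetical helper)."""
    bot.send_user_message(text)
    bot.request_assistant_response()


bot = RecordingBot()
prompt_bot(bot, "Give a very short greeting to the user.")
print(bot.calls)
```

With a real plugin instance, the same helper would be called as prompt_bot(sts_bot, "...") after the bot has connected to the call.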

Update Session

The update_session() method changes the settings of the current OpenAI session while it is running, for example its instructions:

sts_bot.update_session(
    instructions="You are now a Shakespearean character."
)
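
One way to use update_session() is to keep a few named instruction presets and switch between them mid-call. The sketch below assumes only the instructions keyword shown above. The PERSONAS dict, the switch_persona helper, and the FakeBot stand-in are hypothetical.

```python
PERSONAS = {
    "default": "You are a helpful voice assistant. Keep answers short.",
    "shakespeare": "You are now a Shakespearean character.",
}


def switch_persona(bot, name):
    """Apply a named persona through update_session (hypothetical helper)."""
    if name not in PERSONAS:
        raise KeyError(f"unknown persona: {name}")
    bot.update_session(instructions=PERSONAS[name])


class FakeBot:
    """Stand-in for OpenAIRealtime that remembers the last instructions."""

    def __init__(self):
        self.instructions = None

    def update_session(self, **kwargs):
        self.instructions = kwargs.get("instructions", self.instructions)


fake = FakeBot()
switch_persona(fake, "shakespeare")
print(fake.instructions)
```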

OpenAI Events

The OpenAI Realtime API emits events throughout a conversation, and the Stream SDK's OpenAI integration gives you access to them. These events are distinct from Stream events, which are emitted from the Stream side.

async with sts_bot.connect(call, agent_user_id=bot_user_id) as connection:
    # Process events using an async iterator
    async for event in connection:
        # Handle different event types
        if event.type == "conversation.updated":
            print(f"conversation.updated: {event}")

        elif event.type == "conversation.item.completed":
            print(f"conversation.item.completed: {event}")

        elif event.type == "error":
            print(f"error: {event}")
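
As the number of event types grows, the if/elif chain can be swapped for a lookup table that maps each event type to a handler. The sketch below shows that pattern. The dispatch_events function and the fake_connection generator are hypothetical, and the generator only imitates the connection object by yielding hand-made events with a type attribute.

```python
import asyncio
from types import SimpleNamespace


async def dispatch_events(connection, handlers, default=None):
    """Route each event to the handler registered for its type."""
    async for event in connection:
        handler = handlers.get(event.type, default)
        if handler is not None:
            handler(event)


async def fake_connection():
    """Stand-in for the connection object; yields hand-made events."""
    for event_type in ("conversation.updated", "conversation.item.completed", "error"):
        yield SimpleNamespace(type=event_type)


seen = []
handlers = {
    "conversation.updated": lambda e: seen.append(("updated", e.type)),
    "error": lambda e: seen.append(("error", e.type)),
}

# "conversation.item.completed" has no handler and no default, so it is skipped.
asyncio.run(dispatch_events(fake_connection(), handlers))
print(seen)
```

In real use, the connection obtained from sts_bot.connect(...) would be passed in place of fake_connection().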

Example

Check out our OpenAI example to see a practical implementation of the plugin and get inspiration for your own projects.
