Add Audio Moderation
This how-to guide shows you how to build real-time audio moderation into a Stream Video call using the Stream Python AI SDK and the Stream Moderation API.
For a complete runnable example, see the GitHub sample.
Need an overview first? See the Moderating Video Calls explanation.
What You Will Build
A small “moderation bot” that:
- Joins a call as a participant
- Listens to raw PCM audio from every speaker
- Transcribes speech with a Speech-to-Text (STT) plugin (in our case, Deepgram)
- Sends each utterance to the Moderation API
- Outputs the moderation verdict
Note: we’re using Deepgram as our speech-to-text provider here, but you can use any provider you like.
Set Up Your Environment
Install the SDK and STT Plugin
uv add "getstream>=2.3.0a4" "getstream-plugins-deepgram>=0.1.1"
Note: we’re using uv for package/dependency management, but you can use another tool if you prefer.
Create a .env file in your project root:
# Stream credentials
STREAM_API_KEY=your_stream_api_key
STREAM_API_SECRET=your_stream_api_secret
EXAMPLE_BASE_URL=https://pronto.getstream.io
# STT provider (example for Deepgram)
DEEPGRAM_API_KEY=your_deepgram_api_key
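Before starting the bot, it can help to fail fast when a credential is missing. The following is a minimal sketch using only the standard library. The names `REQUIRED_VARS`, `missing_env_vars`, and `check_env` are illustrative and not part of the Stream SDK; the variable names come from the .env file above.

```python
import os

# Variable names taken from the .env file above.
REQUIRED_VARS = (
    "STREAM_API_KEY",
    "STREAM_API_SECRET",
    "EXAMPLE_BASE_URL",
    "DEEPGRAM_API_KEY",
)


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def check_env(env=None):
    """Raise a RuntimeError listing every missing variable (hypothetical helper)."""
    missing = missing_env_vars(env=env)
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Calling `check_env()` right after `load_dotenv()` would stop the script with a clear message instead of a less obvious error later on.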
Initialise the Core Clients
Create a file called main.py and add the following content:
import logging
import os
import uuid

from dotenv import load_dotenv
from getstream.stream import Stream
from getstream.video import rtc
from getstream.plugins.deepgram import DeepgramSTT

load_dotenv()
logging.basicConfig(level=logging.INFO)  # make the logging.info() calls below visible
client = Stream.from_env()  # initialises with STREAM_API_KEY / SECRET
stt = DeepgramSTT()  # uses DEEPGRAM_API_KEY from env
Create Users and Call
call_id = str(uuid.uuid4())
print(f"📞 Call ID: {call_id}")
user_id = f"user-{uuid.uuid4()}"
# create_user() is a helper that is not defined in this guide; see the GitHub sample
create_user(client, user_id, "My User")
logging.info("👤 Created user: %s", user_id)
user_token = client.create_token(user_id, expiration=3600)
logging.info("🔑 Created token for user: %s", user_id)
bot_user_id = f"moderation-bot-{uuid.uuid4()}"
create_user(client, bot_user_id, "Moderation Bot")
logging.info("🤖 Created bot user: %s", bot_user_id)
# Create the call
call = client.video.call("default", call_id)
call.get_or_create(data={"created_by_id": bot_user_id})
print(f"📞 Call created: {call_id}")
Create a Moderation Policy Configuration
Create a Moderation policy config with the key custom:python-ai-test. The policy below will flag text deemed insulting.
client = Stream.from_env()
client.moderation.upsert_config(
    key="custom:python-ai-test",
    ai_text_config={
        "rules": [{"label": "INSULT", "action": "flag"}],
    },
)
You can also create a moderation policy config from the Stream dashboard, by creating an app, then navigating to Moderation > Policies > Add New.
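If you manage several rules, you may prefer to generate the `ai_text_config` dictionary from a simple mapping. This sketch only builds a dictionary in the shape shown above. The helper name `build_ai_text_config` is hypothetical, and it does not validate labels or actions against what the Moderation API accepts.

```python
def build_ai_text_config(rules):
    """Build an ai_text_config dict from a {label: action} mapping.

    Produces the same shape as the config passed to upsert_config above.
    Labels and actions are passed through unchanged (no validation).
    """
    return {
        "rules": [
            {"label": label, "action": action} for label, action in rules.items()
        ]
    }


# Reproduces the single-rule config used in this guide.
config = build_ai_text_config({"INSULT": "flag"})
```

The result could then be passed as `ai_text_config=config` in the `upsert_config` call.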
Open a Web Browser for Testing
import webbrowser
from urllib.parse import urlencode

api_key = os.getenv("STREAM_API_KEY")
base_url = f"{os.getenv('EXAMPLE_BASE_URL')}/join/"
params = {"api_key": api_key, "token": user_token, "skip_lobby": "true"}
url = f"{base_url}{call_id}?{urlencode(params)}"
print(f"Opening browser to: {url}")
webbrowser.open(url)
This will open your browser and authenticate you as the user you created earlier.
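The URL construction can be factored into a small function, which makes it easy to check that the token and other parameters are percent-encoded correctly. This is a sketch using only the standard library. The function name `build_join_url` is an assumption, while the `/join/` path and the query parameters mirror the snippet above.

```python
from urllib.parse import quote, urlencode


def build_join_url(base_url, call_id, api_key, token):
    """Return the browser URL for joining `call_id` as the token's user."""
    params = {"api_key": api_key, "token": token, "skip_lobby": "true"}
    # rstrip avoids a double slash when base_url already ends with "/".
    root = base_url.rstrip("/")
    # quote() escapes the call ID for use as a path segment;
    # urlencode() escapes each query parameter value.
    return f"{root}/join/{quote(call_id, safe='')}?{urlencode(params)}"
```

The returned string can be handed to `webbrowser.open` exactly as in the snippet above.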
Add the Moderation Bot and Process Audio
Once the moderation bot is added to the call, it starts listening for audio events emitted by the Python AI SDK. When it receives an audio event, it transcribes the audio via STT and sends a transcript event.
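import asyncio
import time
import uuid
from getstream.models import ModerationPayload
from getstream.video.rtc.track_util import PcmData  # assumed import path; confirm against the GitHub sample

# Note: `async with` must run inside an async function (for example, one started with asyncio.run)
async with rtc.join(call, bot_user_id) as connection:

    @connection.on("audio")
    async def on_audio(pcm: PcmData, user):
        # Process audio through STT
        await stt.process_audio(pcm, user)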
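The decorator-based handlers in this guide follow a common event-emitter pattern: a handler is registered under an event name and is awaited whenever that event is emitted. The sketch below illustrates the pattern with a toy `TinyEmitter` class and a fake STT stage. It is not the SDK's implementation, and every name in it is made up for illustration.

```python
import asyncio
from collections import defaultdict


class TinyEmitter:
    """Toy async event emitter (illustrative only, not the Stream SDK's)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event):
        """Decorator that registers an async handler for `event`."""
        def register(handler):
            self._handlers[event].append(handler)
            return handler
        return register

    async def emit(self, event, *args):
        """Await every handler registered for `event`, in registration order."""
        for handler in self._handlers[event]:
            await handler(*args)


class FakeSTT(TinyEmitter):
    """Stand-in for an STT plugin: 'transcribes' bytes by decoding them as text."""

    async def process_audio(self, pcm, user):
        await self.emit("transcript", pcm.decode("utf-8"), user, {"confidence": 1.0})


async def demo():
    connection, stt = TinyEmitter(), FakeSTT()
    seen = []

    @connection.on("audio")
    async def on_audio(pcm, user):
        await stt.process_audio(pcm, user)  # the audio event feeds the STT stage

    @stt.on("transcript")
    async def on_transcript(text, user, metadata):
        seen.append((user, text))  # a real handler would run moderation here

    await connection.emit("audio", b"hello there", "alice")
    return seen
```

Running `asyncio.run(demo())` pushes one fake audio chunk through both handlers, which shows how the audio handler and the transcript handler are chained.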
Add the Moderation Check
Add this event handler inside the same async with block. It runs when the STT provider sends a transcript event.
    @stt.on("transcript")
    async def on_transcript(text: str, user, metadata: dict):
        timestamp = time.strftime("%H:%M:%S")
        user_info = user.name if user and hasattr(user, "name") else "unknown"
        print(f"[{timestamp}] {user_info}: {text}")
        if metadata.get("confidence"):
            print(f"    └─ confidence: {metadata['confidence']:.2%}")
        # Moderation check (executed in a background thread to avoid blocking).
        # moderate() is a helper that is not defined in this guide; see the GitHub sample.
        moderation = await asyncio.to_thread(moderate, client, text, user_info)
        print(
            f"    └─ moderation recommended action: {moderation.recommended_action} for transcript: {text}"
        )

    # Keep the connection alive and wait for audio
    await connection.wait()
That’s it! Your bot now moderates speech in real-time.
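The transcript handler wraps the moderation call in `asyncio.to_thread` because a synchronous network call would otherwise stall the event loop and delay incoming audio. The sketch below demonstrates the effect with a stub. `slow_moderate` is a hypothetical stand-in that sleeps instead of calling the Moderation API, and the dictionary it returns is invented for the demo.

```python
import asyncio
import time


def slow_moderate(text):
    """Hypothetical blocking call; sleeps to simulate network latency."""
    time.sleep(0.2)
    return {"text": text, "recommended_action": "keep"}


async def heartbeat(ticks):
    """Record a tick every 50 ms to show the event loop is still running."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)


async def demo():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking function runs in a worker thread; the loop keeps ticking.
    result = await asyncio.to_thread(slow_moderate, "hello")
    beat.cancel()
    return result, len(ticks)
```

If `slow_moderate` were called directly inside `demo`, the heartbeat would not tick at all during the 200 ms wait, which is the stall the background thread avoids.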
Run the Code
Run the code with
uv run main.py
It will open up a web browser so you can join the call. You will see a bot participant joining. Now you can speak to it and test the moderation features by saying nice or insulting things!
You should see output in your terminal for each sentence you speak, for example:
- An innocent sentence, which is reviewed as safe
- An insulting sentence, which is reviewed as potentially infringing the moderation policy
Review the Content
When a sentence is recommended to be flagged by the moderation API, it’s also available in the review queue. You can access this via the Python SDK or the Dashboard.
Access it via the SDK with
from getstream.models import QueryReviewQueueResponse, ReviewQueueItemResponse
response: QueryReviewQueueResponse = client.moderation.query_review_queue().data
item: ReviewQueueItemResponse
for item in response.items:
    print("--------------------------------")
    print(f"Flagged audio transcript with ID: {item.id}")
    print(f"Recommended action: {item.recommended_action}")
    if item.recommended_action != "keep":
        print(f"Transcript: {item.moderation_payload.texts[0]}")
    print(f"Created at: {item.created_at.isoformat()}")
You can also review this content in the Dashboard, where you can take further actions such as banning or deleting users.
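When post-processing the review queue, you might want only the items that need attention. The sketch below filters stand-in objects that mimic the attributes used in the loop above (`id`, `recommended_action`, `moderation_payload.texts`, `created_at`). The `FakeItem` and `FakePayload` classes and both helper functions are invented for the demo and are not SDK types.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FakePayload:
    texts: list = field(default_factory=list)


@dataclass
class FakeItem:
    id: str
    recommended_action: str
    moderation_payload: FakePayload
    created_at: datetime


def needs_review(items):
    """Return items whose recommended action is anything other than 'keep'."""
    return [item for item in items if item.recommended_action != "keep"]


def summarize(item):
    """Format one queue item as a single log line."""
    texts = item.moderation_payload.texts
    transcript = texts[0] if texts else "<no text>"
    return f"{item.created_at.isoformat()} {item.id} [{item.recommended_action}] {transcript}"


# Made-up sample data standing in for response.items.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
items = [
    FakeItem("a1", "keep", FakePayload(["nice weather today"]), now),
    FakeItem("b2", "flag", FakePayload(["example flagged sentence"]), now),
]
flagged = needs_review(items)
```

With the real SDK, the same functions could be applied to `response.items`, assuming the items expose these attributes as they do in the loop above.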
Next Steps
- You can find a full, working example on GitHub.
- There’s much more you can do with the Moderation API! Check out the Moderation API documentation.