Moonshine

Moonshine is a high-performance Speech-to-Text (STT) engine optimized for local and edge deployments. Designed to run efficiently in low-resource environments such as browsers and embedded devices, Moonshine suits real-time, offline voice applications where privacy and low latency are critical.

The Moonshine plugin in the Stream Python AI SDK lets you integrate Moonshine STT into your application.

Installation

Install the Stream Moonshine plugin with uv:

uv add getstream-plugins-moonshine

You’ll also need to install the Moonshine ONNX version from GitHub:

uv add "useful-moonshine-onnx@git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"

Example

Check out our Moonshine example to see a practical implementation of the plugin and get inspiration for your own projects, or read on for some key details.

Initialisation

The Moonshine plugin for Stream is exposed as the MoonshineSTT class:

from getstream.plugins.moonshine import MoonshineSTT

stt = MoonshineSTT()

We recommend pairing the Moonshine plugin with a VAD (voice activity detection) plugin such as Silero, so that only audio containing speech is transcribed and unnecessary local processing is avoided.
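
To illustrate why gating audio before transcription saves work, here is a minimal sketch that uses a plain energy threshold as a stand-in for a real VAD. It is not Silero and not part of the Stream SDK: `is_speech`, `gate_frames`, and the -40 dBFS threshold are hypothetical names and values chosen for illustration, and samples are assumed to be floats in [-1.0, 1.0].

```python
import math
from typing import Iterable, List, Sequence


def is_speech(frame: Sequence[float], threshold_dbfs: float = -40.0) -> bool:
    """Crude energy gate: True when the frame's RMS level exceeds the threshold."""
    if not frame:
        return False
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    if rms == 0:
        return False  # digital silence
    return 20 * math.log10(rms) > threshold_dbfs


def gate_frames(frames: Iterable[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the frames that would be worth sending to the STT engine."""
    return [frame for frame in frames if is_speech(frame)]


# Two 20 ms frames of silence and two of a 200 Hz tone at 16 kHz
silence = [0.0] * 320
tone = [0.1 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(320)]

kept = gate_frames([silence, tone, silence, tone])
print(f"forwarded {len(kept)} of 4 frames")  # forwarded 2 of 4 frames
```

A real VAD model is far more robust than an energy threshold, but the effect on the pipeline is the same: silent frames never reach the transcription step.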

Parameters

These are the parameters available in the MoonshineSTT plugin for you to customise:

Name	Type	Default	Description
`model_name`	`str`	`"moonshine/base"`	The Moonshine model to use. Supported options are `"moonshine/tiny"` and `"moonshine/base"`.
`sample_rate`	`int`	`16000`	The sample rate (in Hz) of the audio input. Must match Moonshine’s expected rate.
`language`	`str`	`"en-US"`	Language code for transcription. Currently, only `"en-US"` is supported.
`min_audio_length_ms`	`int`	`100`	Minimum length (in milliseconds) of audio required before processing.
`target_dbfs`	`float`	`-26.0`	Target RMS loudness level (in dBFS) for audio normalization before transcription.
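
To make `min_audio_length_ms` and `target_dbfs` more concrete, the sketch below works through the arithmetic these two parameters imply. It is an illustration under stated assumptions, not the plugin's internal code: the helper names `min_samples`, `rms_dbfs`, and `normalize_rms` are hypothetical, and samples are assumed to be floats in [-1.0, 1.0].

```python
import math
from typing import List, Sequence


def min_samples(sample_rate: int = 16000, min_audio_length_ms: int = 100) -> int:
    """Number of samples corresponding to the minimum audio length."""
    return sample_rate * min_audio_length_ms // 1000


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dBFS; returns -inf for digital silence."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def normalize_rms(samples: Sequence[float], target_dbfs: float = -26.0) -> List[float]:
    """Scale samples so their RMS level matches target_dbfs, clipping to [-1, 1]."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # nothing to scale
    gain = 10 ** ((target_dbfs - current) / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# A quiet 440 Hz tone, exactly min_samples() long at 16 kHz
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(min_samples())]
louder = normalize_rms(quiet)
print(min_samples(), round(rms_dbfs(louder), 2))
```

With the default values, `min_samples()` is 1600, which is 100 ms of audio at 16 kHz.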

Functionality

Process Audio

Once you have joined the call, listen for audio events on the connection and pass each one to the STT instance for processing:

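from getstream.video import rtc
from getstream.video.rtc.track_util import PcmData

# `call` and `bot_user_id` are assumed to be defined earlier in your application
async with rtc.join(call, bot_user_id) as connection:

    @connection.on("audio")
    async def on_audio(pcm: PcmData, user):
        # Forward each incoming audio chunk to Moonshine STT
        await stt.process_audio(pcm, user)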

Events

Transcript Event

The transcript event is triggered when a final transcript is available from Moonshine:

from typing import Any

@stt.on("transcript")
async def on_transcript(text: str, user: Any, metadata: dict):
    # Process transcript event here
    print(f"{user}: {text}")

Error Event

If an error occurs, an error event is fired:

@stt.on("error")
async def on_stt_error(error):
    # Process error event here
    print(f"STT error: {error}")
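
The two handlers can be exercised without any audio or model by using a stand-in object. In the sketch below, `FakeSTT` is a hypothetical class that mimics only the `on(...)` decorator shape used on this page; it is not the real `MoonshineSTT` and its `emit` method is an assumption made for the demo. The handlers collect transcripts per user and record errors.

```python
import asyncio
from collections import defaultdict
from typing import Any


class FakeSTT:
    """Hypothetical stand-in: supports `@stt.on(event)` and a manual `emit`."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def on(self, event: str):
        def register(handler):
            self._handlers[event].append(handler)
            return handler
        return register

    async def emit(self, event: str, *args: Any) -> None:
        for handler in self._handlers[event]:
            await handler(*args)


stt = FakeSTT()
transcripts = defaultdict(list)  # user -> list of final transcripts
errors = []


@stt.on("transcript")
async def on_transcript(text: str, user: Any, metadata: dict):
    transcripts[user].append(text)


@stt.on("error")
async def on_stt_error(error):
    errors.append(error)


async def main() -> None:
    # Simulate one final transcript and one error
    await stt.emit("transcript", "hello world", "user-1", {})
    await stt.emit("error", RuntimeError("simulated failure"))


asyncio.run(main())
print(dict(transcripts), len(errors))
```

Keeping handlers small and side-effect free like this makes them easy to test before wiring them to a live call.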

Close

You can close the STT connection with the close() method:

stt.close()
