DeepSeek recently released a powerful new model, DeepSeek-V3.2, that's now instantly accessible via OpenRouter. In under 5 minutes, you can turn it into a real-time, voice-enabled math and physics agent that not only solves problems but also explains its reasoning out loud.
DeepSeek's latest open-source reasoning and agentic model, V3.2, introduces DeepSeek Sparse Attention (DSA), a fine-grained mechanism that boosts training and inference efficiency in long-context scenarios while delivering output quality on par with DeepSeek-V3.1-Terminus. The model is gaining momentum for its strong reasoning and agentic tool-use capabilities.
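If you're curious what "sparse attention" means in practice, here is a toy sketch of the general idea. This illustrates generic top-k sparse attention, not DeepSeek's actual DSA implementation: each query attends only to a small budget of the most relevant tokens, so cost scales with that budget rather than the full context length.

```python
# Toy top-k sparse attention -- an illustration of the general idea only,
# NOT DeepSeek's actual DSA implementation.
import numpy as np

def sparse_attention(q, K, V, k=32):
    scores = K @ q                        # relevance of each key to this query
    top = np.argsort(scores)[-k:]         # keep only the k highest-scoring tokens
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                          # softmax over the selected tokens only
    return w @ V[top]                     # weighted sum of the selected values

rng = np.random.default_rng(0)
L, d = 4096, 64                           # long context, one attention head
q = rng.normal(size=d)
K, V = rng.normal(size=(L, d)), rng.normal(size=(L, d))
out = sparse_attention(q, K, V)           # attends to 32 tokens instead of 4096
```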
In this demo, the agent solves a vector addition problem using the law of cosines, calculates the magnitude of the resultant (≈20.53 N), and then verbally explains every step when asked "How did you get that?", all in natural conversation.
Here's exactly how to build the same agent yourself.
What You'll Build
- A real-time voice AI tutor specialized in Math & Physics
- Powered by DeepSeek-V3.2 (via OpenRouter)
- Text-to-speech and speech-to-text → ElevenLabs
- Real-time audio/video transport → Stream
- Turn detection → Smart-Turn
- Built with the open-source Vision Agents framework
Requirements (API Keys)
You'll need API keys from:
- Stream (for video calls & WebRTC)
- OpenRouter (hosts DeepSeek-V3.2)
- ElevenLabs (STT so you can speak to the agent, TTS so you can hear it respond)
Step 1: Set Up the Project
```bash
# Create a new project with uv (highly recommended) or pip
uv init deepseek-physics-tutor
cd deepseek-physics-tutor

# Install Vision Agents + required plugins
uv add vision-agents
uv add "vision-agents[getstream, openrouter, elevenlabs, smart-turn]"
```
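Before writing any code, you can quickly confirm the plugins resolved correctly; the import path below is the same one used in the full example in Step 2:

```python
# Sanity check: these imports should succeed after the install step above
from vision_agents.plugins import openrouter, getstream, elevenlabs, smart_turn
print("Vision Agents plugins loaded")
```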
Step 2: Full Working Code
Replace the content of the uv project's main.py with this:
```python
"""
DeepSeek V3.2 Maths and Physics Tutor

This example demonstrates how to use the DeepSeek V3.2 model with the
OpenRouter plugin in a Vision Agent. OpenRouter provides access to multiple
LLM providers through a unified API.

The DeepSeek V3.2 model is a powerful LLM that can solve Maths and Physics
problems based on what the user shows it through their camera feed.

Set the OPENROUTER_API_KEY environment variable before running.
"""

import logging

from dotenv import load_dotenv

from vision_agents.core import User, Agent, cli
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import (
    openrouter,
    getstream,
    elevenlabs,
    smart_turn,
)

logger = logging.getLogger(__name__)

load_dotenv()


async def create_agent(**kwargs) -> Agent:
    """Create the agent with OpenRouter LLM."""
    # model = "deepseek/deepseek-v3.2"  # Can also use other models like anthropic/claude-3-opus or gemini
    model = "deepseek/deepseek-v3.2-speciale"

    # Determine personality based on model
    if "deepseek" in model.lower():
        personality = "Talk like a Maths and Physics tutor."
    elif "anthropic" in model.lower():
        personality = "Talk like a robot."
    elif "openai" in model.lower() or "gpt" in model.lower():
        personality = "Talk like a pirate."
    elif "gemini" in model.lower():
        personality = "Talk like a cowboy."
    elif "x-ai" in model.lower():
        personality = "Talk like a 1920s Chicago mobster."
    else:
        personality = "Talk casually."

    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="OpenRouter AI", id="agent"),
        instructions=f"""
        You are an expert in Maths and Physics.
        You help users solve Maths and Physics problems based on what they
        show you through their camera feed.
        Always provide concise and clear instructions, and explain the
        step-by-step process to the user so they can understand how you
        arrive at the final answer.
        {personality}
        """,
        llm=openrouter.LLM(model=model),
        tts=elevenlabs.TTS(),
        stt=elevenlabs.STT(),
        turn_detection=smart_turn.TurnDetection(
            pre_speech_buffer_ms=2000, speech_probability_threshold=0.9
        ),
    )
    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    """Join the call and start the agent."""
    # Ensure the agent user is created
    await agent.create_user()

    # Create a call
    call = await agent.create_call(call_type, call_id)

    logger.info("🤖 Starting OpenRouter Agent...")

    # Have the agent join the call/room
    with await agent.join(call):
        logger.info("Joining call")
        logger.info("LLM ready")
        # Open demo page for the user to join the call
        await agent.edge.open_demo(call)
        # Wait until the call ends (don't terminate early)
        await agent.finish()


if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))
```
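A note on the TurnDetection arguments in the code above: pre_speech_buffer_ms and speech_probability_threshold are the two knobs that control how eagerly the agent decides you have started or finished speaking. The exact semantics are defined by the Smart-Turn plugin, so treat the interpretations in the comments below as assumptions to verify against its docs; still, if the agent talks over you or pauses too long, these are the first values to experiment with:

```python
from vision_agents.plugins import smart_turn

# Looser turn detection. Parameter names come from the example above; the
# interpretations in these comments are assumptions, check the plugin docs.
turn_detection = smart_turn.TurnDetection(
    pre_speech_buffer_ms=1000,          # shorter buffer: reacts faster, may clip openings
    speech_probability_threshold=0.8,   # lower threshold: more eager to treat audio as speech
)
```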
Step 3: Run It
Run the following commands in your Terminal to store the API credentials in your working environment.
```bash
export OPENROUTER_API_KEY=sk-...
export ELEVENLABS_API_KEY=...
export STREAM_API_KEY=...
export STREAM_API_SECRET=...
export EXAMPLE_BASE_URL=https://pronto-staging.getstream.io
```
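Alternatively, because the script calls load_dotenv(), you can keep the same variables in a .env file next to main.py instead of exporting them in every shell session:

```bash
# .env (picked up by load_dotenv() in main.py)
OPENROUTER_API_KEY=sk-...
ELEVENLABS_API_KEY=...
STREAM_API_KEY=...
STREAM_API_SECRET=...
EXAMPLE_BASE_URL=https://pronto-staging.getstream.io
```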
Lastly, execute the script with `uv run main.py`.
A browser tab will open automatically with a simple video call interface, which automatically connects you to the voice agent. You can now enable your microphone and start talking!
Example interaction from the demo:
You: "Two forces of 8 N and 13 N act on an object at an angle of 25 degrees to each other. What is the magnitude of the resultant force?"
Agent: "The magnitude of the resultant force is approximately 15.26 units."
You: "How did you get that?"
Agent: "I used the law of cosines. First, the angle between the vectors is 180° - 25° = 155°. Then..."
Why We Love This Combination
This stack is powerful yet simple: Vision Agents handles all the heavy lifting, including turn detection, real-time streaming, interruptions, and orchestration with industry-leading voice AI services and LLMs.
OpenRouter gives you instant, no-waitlist access to the latest DeepSeek-V3.2 models (and hundreds more) with unified billing and routing.
Stream delivers mature WebRTC infrastructure that keeps end-to-end latency under 500 ms even on consumer connections. And the entire agent (everything except the hosted API calls) is fully open source and runs locally on your machine.
Links & Resources
-
""" DeepSeek V3.2 Maths and Physics Tutor This example demonstrates how to use the DeepSeek V3.2 model with the OpenRouter plugin with a Vision Agent. OpenRouter provides access to multiple LLM providers through a unified API. The DeepSeek V3.2 model is a powerful LLM that is able to solve Maths and Physics problems based on what the user shows you through their camera feed. Set OPENROUTER_API_KEY environment variables before running. """ import asyncio import logging from dotenv import load_dotenv from vision_agents.core import User, Agent, cli from vision_agents.core.agents import AgentLauncher from vision_agents.plugins import ( openrouter, getstream, elevenlabs, smart_turn, ) logger = logging.getLogger(__name__) load_dotenv() async def create_agent(**kwargs) -> Agent: """Create the agent with OpenRouter LLM.""" #model = "deepseek/deepseek-v3.2" # Can also use other models like anthropic/claude-3-opus/gemini model = "deepseek/deepseek-v3.2-speciale" # Determine personality based on model if "deepseek" in model.lower(): personality = "Talk like a Maths and Physics tutor." elif "anthropic" in model.lower(): personality = "Talk like a robot." elif "openai" in model.lower() or "gpt" in model.lower(): personality = "Talk like a pirate." elif "gemini" in model.lower(): personality = "Talk like a cowboy." elif "x-ai" in model.lower(): personality = "Talk like a 1920s Chicago mobster." else: personality = "Talk casually." agent = Agent( edge=getstream.Edge(), agent_user=User(name="OpenRouter AI", id="agent"), instructions=f""" You are an expert in Maths and Physics. You help users solve Maths and Physics problems based on what they show you through their camera feed. Always provide concise and clear instructions, and explain the step-by-step process to the user so they can understand how you arrive at the final answer. {personality} """, llm=openrouter.LLM(model=model), tts=elevenlabs.TTS(), stt=elevenlabs.STT(), turn_detection=smart_turn.TurnDetection( pre_speech_buffer_ms=2000, speech_probability_threshold=0.9 ), ) return agent async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None: """Join the call and start the agent.""" # Ensure the agent user is created await agent.create_user() # Create a call call = await agent.create_call(call_type, call_id) logger.info("🤖 Starting OpenRouter Agent...") # Have the agent join the call/room with await agent.join(call): logger.info("Joining call") logger.info("LLM ready") # Open demo page for the user to join the call await agent.edge.open_demo(call) # Wait until the call ends (don't terminate early) await agent.finish() if __name__ == "__main__": cli(AgentLauncher(create_agent=create_agent, join_call=join_call))
Give it a try and play around with the type of tutor: calculus, mechanics, electricity, or something completely different. 🤓
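For instance, here's a minimal variation of the Step 2 agent configured as a dedicated calculus tutor. It reuses the exact constructor arguments from the example; only the instructions and display name change:

```python
from vision_agents.core import User, Agent
from vision_agents.plugins import openrouter, getstream, elevenlabs, smart_turn

# Drop-in replacement for the Agent(...) construction in create_agent()
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Calculus Tutor", id="agent"),
    instructions="""
    You are an expert calculus tutor.
    Help users work through limits, derivatives, and integrals they show
    you through their camera feed, explaining each step clearly.
    """,
    llm=openrouter.LLM(model="deepseek/deepseek-v3.2-speciale"),
    tts=elevenlabs.TTS(),
    stt=elevenlabs.STT(),
    turn_detection=smart_turn.TurnDetection(
        pre_speech_buffer_ms=2000, speech_probability_threshold=0.9
    ),
)
```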
Happy coding!