
Build an AI Math & Physics Agent with DeepSeek v3.2

4 min read
Amos G.
Published December 10, 2025

DeepSeek recently released a powerful new model, DeepSeek-V3.2, that's now instantly accessible via OpenRouter. In under 5 minutes, you can turn it into a real-time, voice-enabled math and physics agent that not only solves problems but also explains its reasoning out loud.

DeepSeek's latest open-source reasoning and agentic model, V3.2, leverages the new DeepSeek Sparse Attention (DSA). This fine-grained mechanism boosts training and inference efficiency in long-context scenarios while delivering output quality on par with DeepSeek-V3.1-Terminus. The model is gaining momentum for its strong reasoning and agentic tool-use capabilities.

In this demo, the agent solves a vector addition problem using the law of cosines, calculates the magnitude (≈15.26 units), and then verbally explains every step when asked "How did you get that?" — all in natural conversation.

Here's exactly how to build the same agent yourself.

What You'll Build

Outline of the build for the AI math and physics agent with DeepSeek v3.2
  • A real-time voice AI tutor specialized in Math & Physics

  • Powered by DeepSeek-V3.2 (via OpenRouter)

  • Text-to-speech and speech-to-text → ElevenLabs

  • Real-time audio/video transport → Stream

  • Turn detection → Smart-Turn

  • Built with the open-source Vision Agents framework

Requirements (API Keys)

You'll need API keys from:

  • Stream (for video calls & WebRTC)

  • OpenRouter (hosts DeepSeek-V3.2)

  • ElevenLabs (STT so you can speak to the agent, TTS so you can hear it respond)

Step 1: Set Up the Project

```bash
# Create a new project with uv (highly recommended) or pip
uv init deepseek-physics-tutor
cd deepseek-physics-tutor

# Install Vision Agents + required plugins
uv add vision-agents
uv add "vision-agents[getstream, openrouter, elevenlabs, smart-turn]"
```

Step 2: Full Working Code

Replace the content of the uv project's main.py with this:

```python
"""
DeepSeek V3.2 Maths and Physics Tutor

This example demonstrates how to use the DeepSeek V3.2 model with the
OpenRouter plugin in a Vision Agent. OpenRouter provides access to multiple
LLM providers through a unified API. DeepSeek V3.2 is a powerful LLM that can
solve Maths and Physics problems based on what the user shows through their
camera feed.

Set the OPENROUTER_API_KEY environment variable before running.
"""

import logging

from dotenv import load_dotenv

from vision_agents.core import User, Agent, cli
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import (
    openrouter,
    getstream,
    elevenlabs,
    smart_turn,
)

logger = logging.getLogger(__name__)

load_dotenv()


async def create_agent(**kwargs) -> Agent:
    """Create the agent with OpenRouter LLM."""
    # model = "deepseek/deepseek-v3.2"  # Can also use other models like anthropic/claude-3-opus/gemini
    model = "deepseek/deepseek-v3.2-speciale"

    # Determine personality based on model
    if "deepseek" in model.lower():
        personality = "Talk like a Maths and Physics tutor."
    elif "anthropic" in model.lower():
        personality = "Talk like a robot."
    elif "openai" in model.lower() or "gpt" in model.lower():
        personality = "Talk like a pirate."
    elif "gemini" in model.lower():
        personality = "Talk like a cowboy."
    elif "x-ai" in model.lower():
        personality = "Talk like a 1920s Chicago mobster."
    else:
        personality = "Talk casually."

    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="OpenRouter AI", id="agent"),
        instructions=f"""
        You are an expert in Maths and Physics. You help users solve Maths and
        Physics problems based on what they show you through their camera feed.
        Always provide concise and clear instructions, and explain the
        step-by-step process to the user so they can understand how you arrive
        at the final answer.
        {personality}
        """,
        llm=openrouter.LLM(model=model),
        tts=elevenlabs.TTS(),
        stt=elevenlabs.STT(),
        turn_detection=smart_turn.TurnDetection(
            pre_speech_buffer_ms=2000, speech_probability_threshold=0.9
        ),
    )
    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    """Join the call and start the agent."""
    # Ensure the agent user is created
    await agent.create_user()
    # Create a call
    call = await agent.create_call(call_type, call_id)

    logger.info("🤖 Starting OpenRouter Agent...")

    # Have the agent join the call/room
    with await agent.join(call):
        logger.info("Joining call")
        logger.info("LLM ready")

        # Open demo page for the user to join the call
        await agent.edge.open_demo(call)

        # Wait until the call ends (don't terminate early)
        await agent.finish()


if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))
```
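The model-to-personality branching in create_agent can also be factored into a small standalone helper, which makes it easy to unit-test before wiring it into the agent. This is just a sketch; the name pick_personality is illustrative and not part of the Vision Agents API:

```python
# Sketch: the personality-selection branch from create_agent, factored into
# a testable helper. pick_personality is an illustrative name, not framework API.
def pick_personality(model: str) -> str:
    """Map an OpenRouter model ID to a speaking style for the agent."""
    m = model.lower()
    if "deepseek" in m:
        return "Talk like a Maths and Physics tutor."
    if "anthropic" in m:
        return "Talk like a robot."
    if "openai" in m or "gpt" in m:
        return "Talk like a pirate."
    if "gemini" in m:
        return "Talk like a cowboy."
    if "x-ai" in m:
        return "Talk like a 1920s Chicago mobster."
    return "Talk casually."
```

Because OpenRouter model IDs are prefixed with the provider (e.g. "deepseek/…", "google/…"), simple substring checks like these are enough to branch on the provider.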

Step 3: Run It

Run the following commands in your Terminal to store the API credentials in your working environment.

```bash
export OPENROUTER_API_KEY=sk-...
export ELEVENLABS_API_KEY=...
export STREAM_API_KEY=...
export STREAM_API_SECRET=...
export EXAMPLE_BASE_URL=https://pronto-staging.getstream.io
```
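Because main.py calls load_dotenv(), you can alternatively keep the same keys in a .env file next to main.py instead of exporting them in every new terminal session (the values below are placeholders):

```shell
# .env — picked up automatically by python-dotenv at startup
OPENROUTER_API_KEY=sk-...
ELEVENLABS_API_KEY=...
STREAM_API_KEY=...
STREAM_API_SECRET=...
EXAMPLE_BASE_URL=https://pronto-staging.getstream.io
```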

Lastly, execute the Python script with uv run main.py.

A browser tab will open automatically with a simple video call interface, which automatically connects you to the voice agent. You can now enable your microphone and start talking!

Example interaction from the demo:

You: "Two forces of 8 N and 13 N act on an object at an angle of 25 degrees to each other. What is the magnitude of the resultant force?"
Agent: "The magnitude of the resultant force is approximately 15.26 units."
You: "How did you get that?"
Agent: "I used the law of cosines. First, the angle between the vectors is 180° - 25° = 155°. Then..."

Why We Love This Combination  

This stack is powerful yet simple: Vision Agents handles all the heavy lifting, including turn detection, real-time streaming, interruptions, and orchestration with industry-leading voice AI services and LLMs.

OpenRouter gives you instant, no-wait-list access to the latest DeepSeek-V3.2 models (and hundreds more) with unified billing and routing.

Stream delivers mature WebRTC infrastructure that keeps end-to-end latency under 500 ms even on consumer connections, and the entire agent (except the API calls) is fully open-source and runs locally on your machine. 

  • Vision Agents site

  • Vision Agents GitHub

  • DeepSeek models on OpenRouter

  • Stream Video & Voice

Give it a try and play around with the type of tutor: calculus, mechanics, electricity, or something completely different. 🤓

Happy coding!
