WebRTC For The Brave

Implementing Real-Time Video Streaming in Unity with WebRTC

Video Streaming with WebRTC in Unity

This tutorial walks through a practical WebRTC implementation for Unity developers. Using the Unity WebRTC package, we will set up a peer-to-peer (P2P) connection that shares video between two devices on a local network. We'll start by building a simple signaling server with the websocket-sharp library. We'll then move to the Unity Engine, where the Unity WebRTC package handles the video streaming. By the end of this tutorial, you will have a foundational understanding of WebRTC and a functional video streaming application in Unity.

If you're new to WebRTC, starting with our introduction to WebRTC for Unity Developers is a good idea. This guide will help you understand the code we're going to write. But if you prefer to jump straight into coding, that's okay, too. We've added extra comments in the code to make learning easier, and we explain each part of the code in detail after showing it.

The Signaling Server

A signaling server is essential for initiating a P2P connection through WebRTC, as it facilitates the exchange of connectivity information.

In this tutorial, we'll set up a basic signaling server that will simply relay each received message to all connected clients, excluding the sender. This rudimentary functionality will suffice for connecting two peers, which is our current objective. However, for a real-world application, you would typically implement user authentication (possibly using APIs like Steam or Google Play Services) and ensure that only peers within the same session or call are connected.

The websocket-sharp library conveniently includes a WebSocket server feature, which we'll utilize for our signaling server.

Be aware that running the signaling server locally usually restricts access to devices within the same private network. Configuring access from external networks is beyond this tutorial's scope.

The service to resend every received message to all other participants:

The main program:
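As a rough guide to what such a server can look like, here is a minimal sketch covering both the relay service and the main program. It is not the tutorial's original listing: the class name SignalingService, the port 8080, and the "/" path are assumptions chosen for illustration, and error handling is left out.

```csharp
using System;
using WebSocketSharp;
using WebSocketSharp.Server;

// Relays every text message to all connected clients except the sender.
// The class name is an assumption made for this sketch.
public class SignalingService : WebSocketBehavior
{
    protected override void OnOpen()
    {
        Console.WriteLine($"Client connected: {ID}");
    }

    protected override void OnMessage(MessageEventArgs e)
    {
        if (!e.IsText)
        {
            return; // signaling messages in this tutorial are JSON text
        }

        // Sessions.IDs lists the IDs of all sessions handled by this service.
        foreach (var id in Sessions.IDs)
        {
            if (id == ID)
            {
                continue; // skip the sender
            }

            Sessions.SendTo(e.Data, id);
        }
    }

    protected override void OnClose(CloseEventArgs e)
    {
        Console.WriteLine($"Client disconnected: {ID}");
    }
}

public static class Program
{
    // Assumed port; use whichever port your clients connect to.
    private const int Port = 8080;

    public static void Main()
    {
        var server = new WebSocketServer(Port);
        server.AddWebSocketService<SignalingService>("/");
        server.Start();

        Console.WriteLine($"Signaling server listening on port {Port}. Press any key to stop.");
        Console.ReadKey(true);

        server.Stop();
    }
}
```

Because every message is forwarded to every other session, this design only makes sense for two peers; with more clients you would route messages per session or call.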

That’s it!

You can download the complete project from this repository.

Run the project in your IDE (like Visual Studio or Rider). A console window will appear to confirm that the server is running:

The Unity Project

For this tutorial, we'll be working with Unity version 2022.3.18f1 LTS. If you're using a different version, just ensure it's compatible with the Unity WebRTC package.

Open the Starting Unity Project

We've set up a simple project for you that includes a basic user interface, which will serve as our foundation for this tutorial:

You can find the starter project in the following GitHub repository. Download it and open the Unity_StartingProject folder in the Unity Editor to get started.

Add the websocket-sharp package

This step is already done in the starter project: the websocket-sharp DLL is placed in the Assets/Plugins/websocket-sharp directory.

Test WebSocket connection

Before proceeding, it's important to confirm that our Unity application can successfully connect to the signaling server.

Start by creating a WebSocketClient.cs script in the Assets/Scripts directory. Open this script in your IDE, and enter the following code:

Let’s go through this script step by step:

  • In the Awake method, it initializes a WebSocket instance, subscribes to relevant events, and attempts to connect to our signaling server.
  • Event handlers, such as OnMessage and OnError, enqueue messages and errors to ConcurrentQueue instances to be processed on the main thread. This is needed because websocket-sharp raises its events on background threads, while most Unity APIs may only be called from the main thread.
  • The Update method processes the queues, triggering the MessageReceived event for messages, and logging errors.
  • A public SendWebSocketMessage method is provided to send messages to the server.
  • The _serverIp field, decorated with the SerializeField attribute, can be set in the Unity Inspector, allowing us to set the server's IP address.
  • A test message is sent upon establishing a connection to verify the WebSocket communication.
  • The OnDestroy method disconnects and cleans up the WebSocket client when the object is destroyed or when the application is closed.
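The structure described above can be sketched roughly as follows. Treat it as an illustration under assumptions rather than the tutorial's exact script: the _port field, its default of 8080, and the URL format are assumed, and the temporary test message sent on connection is omitted.

```csharp
using System;
using System.Collections.Concurrent;
using UnityEngine;
using WebSocketSharp;

public class WebSocketClient : MonoBehaviour
{
    // Raised on the main thread for every message received from the server.
    public event Action<string> MessageReceived;

    [SerializeField] private string _serverIp = "127.0.0.1";
    [SerializeField] private int _port = 8080; // assumed port

    // Filled on websocket-sharp's background threads, drained in Update.
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    private readonly ConcurrentQueue<string> _errors = new ConcurrentQueue<string>();

    private WebSocket _ws;

    public void SendWebSocketMessage(string message)
    {
        if (_ws == null || _ws.ReadyState != WebSocketState.Open)
        {
            Debug.LogWarning("WebSocket is not open; message dropped.");
            return;
        }

        _ws.Send(message);
    }

    private void Awake()
    {
        _ws = new WebSocket($"ws://{_serverIp}:{_port}/");
        _ws.OnMessage += OnMessage;
        _ws.OnError += OnError;
        _ws.Connect(); // blocks until the handshake finishes or fails
    }

    private void Update()
    {
        // Runs on the main thread, so Unity APIs are safe to call here.
        while (_messages.TryDequeue(out var message))
        {
            MessageReceived?.Invoke(message);
        }

        while (_errors.TryDequeue(out var error))
        {
            Debug.LogError(error);
        }
    }

    private void OnDestroy()
    {
        if (_ws == null)
        {
            return;
        }

        _ws.OnMessage -= OnMessage;
        _ws.OnError -= OnError;
        _ws.Close();
        _ws = null;
    }

    // These two handlers are invoked on background threads.
    private void OnMessage(object sender, MessageEventArgs e) => _messages.Enqueue(e.Data);

    private void OnError(object sender, ErrorEventArgs e) => _errors.Enqueue(e.Message);
}
```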

Open the Main scene. Create a new Game Object named WebSocket in the scene hierarchy, and attach the WebSocketClient.cs script to it.

The Server IP

For the purposes of this tutorial, we'll work within a local network. Run the WebSocket server on your local machine and use its local network IPv4 address to connect other devices within the same network.

To find your machine's local IP address, run ipconfig in the Windows Command Prompt, or ifconfig in a macOS or Linux terminal (on recent Linux distributions, ip addr). Look for the IPv4 address associated with your primary network interface.

For example, the output of ipconfig on a Windows machine might look like this:

In this example, the machine's local IP address is 192.168.1.104.

Once you determine your local IPv4 address, paste it into the WebSocketClient component in Unity:

Save the scene and run the project.

Your signaling server's console should display a new connection log, indicating that a WebSocket client has connected and received a test message:

Once you verify the WebSocket connection between your Unity client and the signaling server, the OnOpen method that sends the test message can be removed from the WebSocketClient.cs script:

Add Unity’s WebRTC package

Next, add Unity's WebRTC package to the project. You can follow the official documentation for detailed installation steps.

Transferring the data between the peers

In the signaling phase, we exchange [SDP](https://getstream.io/resources/projects/webrtc/basics/sdp-messages/) Offer/Answer and ICE Candidates. It's practical to serialize these objects to JSON for this purpose.

However, serialization can be challenging with objects from certain libraries, such as WebRTC, due to their complex structures. To resolve this, we implement Data Transfer Objects (DTOs). These simple, lightweight containers carry data types in a format that's easy to serialize.

Let's apply this concept. In the Assets/Scripts directory, create a new subdirectory named DTO. Then, in the Assets/Scripts/DTO/ directory, create the following scripts:

SdpDTO.cs will serialize SDP Offers & Answers:

IceCandidateDTO.cs will handle the serialization of ICE Candidates during ICE Trickling:

For ease of deserialization, we'll use a wrapper DTO that includes the serialized data and indicates its type.

Add these scripts to the Assets/Scripts/DTO/ directory as well:

DtoType.cs defines the possible DTO types:

DTOWrapper.cs simplifies the deserialization process by wrapping the serialized data along with its type indicator:

The usefulness of these DTOs and their wrapper will become clear as we proceed with serialization and deserialization.
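To make the idea concrete, here is one possible shape for the four types, together with a small helper that wraps and unwraps them. This is a sketch under assumptions: the field names inside SdpDTO and IceCandidateDTO, the DtoType values, the helper class DtoJson, and the choice of Unity's JsonUtility as the serializer are all illustrative and may differ from the original scripts. In a project, each type would live in its own file.

```csharp
using System;
using UnityEngine;

// Identifies which DTO is stored in DTOWrapper.Payload (values are assumed).
public enum DtoType
{
    ICE = 0,
    SDP = 1
}

[Serializable]
public class SdpDTO
{
    public int Type;   // numeric value of the SDP type (offer or answer)
    public string Sdp; // the SDP text
}

[Serializable]
public class IceCandidateDTO
{
    public string Candidate;
    public string SdpMid;
    public int SdpMLineIndex; // plain int: JsonUtility does not serialize int?
}

[Serializable]
public class DTOWrapper
{
    public int Type;       // a DtoType value stored as an int
    public string Payload; // JSON of the wrapped DTO
}

// Hypothetical helper, not part of the tutorial's scripts.
public static class DtoJson
{
    // Serializes the DTO, then wraps it together with its type indicator.
    public static string Wrap<T>(T dto, DtoType type)
    {
        var wrapper = new DTOWrapper
        {
            Type = (int)type,
            Payload = JsonUtility.ToJson(dto)
        };

        return JsonUtility.ToJson(wrapper);
    }

    // Reads the wrapper first, then deserializes the payload to the right type.
    public static void Unwrap(string json, Action<SdpDTO> onSdp, Action<IceCandidateDTO> onIce)
    {
        var wrapper = JsonUtility.FromJson<DTOWrapper>(json);

        switch ((DtoType)wrapper.Type)
        {
            case DtoType.SDP:
                onSdp(JsonUtility.FromJson<SdpDTO>(wrapper.Payload));
                break;
            case DtoType.ICE:
                onIce(JsonUtility.FromJson<IceCandidateDTO>(wrapper.Payload));
                break;
            default:
                throw new ArgumentOutOfRangeException(nameof(json), $"Unknown DTO type: {wrapper.Type}");
        }
    }
}
```

The two-step deserialization is the reason for the wrapper: the receiver cannot know which class to deserialize into until it has read the type indicator.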

The Video Manager

Now, let's focus on the primary script that will manage the WebRTC connection.

In the Assets/Scripts/ directory, create a new script named VideoManager.cs and paste the initial structure:

We will develop this class step-by-step to ensure clarity in each segment.

Begin by defining two private fields:

The _webSocketClient is a reference to the WebSocketClient we created earlier. The _peerConnection is an [RTCPeerConnection](https://getstream.io/resources/projects/webrtc/basics/rtcpeerconnection) from the WebRTC library, representing the connection between the two peers.

Now, add the Awake method:

In Awake, we:

  • Retrieve the WebSocketClient component using FindObjectOfType, which searches the entire scene for this component. While this method is sufficient for our demo, it's generally not recommended for production due to performance implications.
  • Set up an RTCConfiguration with a STUN server URL provided by Google.
  • Instantiate RTCPeerConnection and subscribe to several WebRTC events:
    • OnNegotiationNeeded: Invoked when a session renegotiation is necessary.
    • OnIceCandidate: Fires upon the discovery of a potential network path.
    • OnTrack: Occurs when a new media track is received from the peer.
  • Subscribe to the MessageReceived event of WebSocketClient, which activates upon the arrival of a new WebSocket message.
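The WebRTC part of this setup might look like the sketch below. It is illustrative only: the class name is hypothetical, the handlers merely log, the STUN URL is Google's commonly used public server, and the WebSocket subscription is left out to keep the example self-contained.

```csharp
using UnityEngine;
using Unity.WebRTC;

// Hypothetical class name; in the tutorial this logic lives in VideoManager.
public class PeerConnectionSetupSketch : MonoBehaviour
{
    private RTCPeerConnection _peerConnection;

    private void Awake()
    {
        // A STUN server lets each peer discover its public-facing address.
        var config = new RTCConfiguration
        {
            iceServers = new[]
            {
                new RTCIceServer { urls = new[] { "stun:stun.l.google.com:19302" } }
            }
        };

        _peerConnection = new RTCPeerConnection(ref config);

        // Placeholder handlers that only log; real handlers are developed later.
        _peerConnection.OnNegotiationNeeded = () => Debug.Log("Negotiation needed");
        _peerConnection.OnIceCandidate = candidate => Debug.Log($"Local ICE candidate: {candidate.Candidate}");
        _peerConnection.OnTrack = e => Debug.Log($"Remote track received: {e.Track.Kind}");
    }

    private void OnDestroy()
    {
        _peerConnection?.Close();
        _peerConnection?.Dispose();
    }
}
```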

Next, let’s define event handler methods.

Add the following code to the class:

In this tutorial, we'll initiate the negotiation when the user clicks the Connect button. In a real-life application, however, the negotiation process should be repeated whenever the OnNegotiationNeeded event is triggered.

We'll now define code for serializing and sending messages to other peers.

Add the following method:

This generic method handles the serialization of any DTO, along with a DtoType. The DTO is serialized to JSON, placed into the Payload of a DTOWrapper, and the Type is set accordingly. This structured approach simplifies the deserialization process on the receiving end by indicating the exact type of the encapsulated data.

These methods extend the SendMessageToOtherPeer generic method to handle SDP messages and ICE Candidate serialization and sending:

Let's proceed to handle incoming WebSocket messages with the OnWebSocketMessageReceived method:

Upon receiving a message, we deserialize it into the DTOWrapper and use the dtoWrapper.Type to know the exact type to which payload should be deserialized. For ICE Candidates, we call the _peerConnection.AddIceCandidate method to update the WebRTC peer connection. When we encounter an SDP message, we initiate the corresponding offer or answer processing sequence by calling OnRemoteSdpOfferReceived or OnRemoteSdpAnswerReceived, which we'll define in the upcoming steps.

Next, we'll implement a coroutine that begins the WebRTC connection process by creating and sending an SDP offer to the other peer:

This method:

  1. Invokes _peerConnection.CreateOffer() to generate the local SDP offer.
  2. Sets the generated offer as the local description of the connection using _peerConnection.SetLocalDescription.
  3. Sends the SDP offer to the other peer using SendSdpToOtherPeer, which serializes and transmits it over the WebSocket.

Now, let's construct the coroutine responsible for processing an incoming SDP offer:

Upon receiving an SDP offer, this coroutine will:

  1. Set the offer as the remote description of the WebRTC connection by calling _peerConnection.SetRemoteDescription.
  2. Generate an SDP answer by calling _peerConnection.CreateAnswer.
  3. Apply the created answer as the local description via _peerConnection.SetLocalDescription.
  4. Send the SDP answer to the initiating peer through SendSdpToOtherPeer.

Next, we’ll define a coroutine to handle an incoming SDP Answer:

This method simply sets the received SDP answer as the remote description with _peerConnection.SetRemoteDescription.

To summarize the whole process:

  1. Peer A creates an SDP Offer.
  2. Peer A sets the SDP offer as its local description.
  3. Peer A sends the offer to Peer B.
  4. Peer B receives the offer and sets it as its remote description.
  5. Peer B generates an SDP answer.
  6. Peer B sets the SDP Answer as its local description.
  7. Peer B sends the answer back to Peer A.
  8. Peer A receives the answer and sets it as its remote description.

This exchange of offer and answer is the core of the WebRTC signaling process, allowing both peers to agree on the media and connection details before starting the actual media transfer.
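The eight steps can be sketched as three coroutines. This is a hedged illustration, not the original VideoManager code: the class name and the SendSdp callback are hypothetical stand-ins for the signaling layer, and the parameterless CreateOffer and CreateAnswer overloads are assumed to be available in your version of the Unity WebRTC package.

```csharp
using System;
using System.Collections;
using UnityEngine;
using Unity.WebRTC;

// Hypothetical class; in the tutorial these coroutines belong to VideoManager.
public class NegotiationSketch : MonoBehaviour
{
    // Hypothetical hook: assign a method that sends the SDP over signaling.
    public Action<RTCSessionDescription> SendSdp;

    private RTCPeerConnection _peerConnection;

    private void Awake() => _peerConnection = new RTCPeerConnection();

    // Steps 1-3, executed by Peer A.
    public IEnumerator CreateAndSendOffer()
    {
        var offerOp = _peerConnection.CreateOffer();
        yield return offerOp;
        if (offerOp.IsError)
        {
            Debug.LogError($"CreateOffer failed: {offerOp.Error.message}");
            yield break;
        }

        var offer = offerOp.Desc;
        var setLocalOp = _peerConnection.SetLocalDescription(ref offer);
        yield return setLocalOp;
        if (setLocalOp.IsError)
        {
            Debug.LogError($"SetLocalDescription failed: {setLocalOp.Error.message}");
            yield break;
        }

        SendSdp?.Invoke(offer);
    }

    // Steps 4-7, executed by Peer B.
    public IEnumerator OnRemoteSdpOfferReceived(RTCSessionDescription offer)
    {
        var setRemoteOp = _peerConnection.SetRemoteDescription(ref offer);
        yield return setRemoteOp;
        if (setRemoteOp.IsError)
        {
            Debug.LogError($"SetRemoteDescription failed: {setRemoteOp.Error.message}");
            yield break;
        }

        var answerOp = _peerConnection.CreateAnswer();
        yield return answerOp;
        if (answerOp.IsError)
        {
            Debug.LogError($"CreateAnswer failed: {answerOp.Error.message}");
            yield break;
        }

        var answer = answerOp.Desc;
        var setLocalOp = _peerConnection.SetLocalDescription(ref answer);
        yield return setLocalOp;
        if (setLocalOp.IsError)
        {
            Debug.LogError($"SetLocalDescription failed: {setLocalOp.Error.message}");
            yield break;
        }

        SendSdp?.Invoke(answer);
    }

    // Step 8, executed by Peer A.
    public IEnumerator OnRemoteSdpAnswerReceived(RTCSessionDescription answer)
    {
        var setRemoteOp = _peerConnection.SetRemoteDescription(ref answer);
        yield return setRemoteOp;
        if (setRemoteOp.IsError)
        {
            Debug.LogError($"SetRemoteDescription failed: {setRemoteOp.Error.message}");
        }
    }

    private void OnDestroy()
    {
        _peerConnection?.Close();
        _peerConnection?.Dispose();
    }
}
```

Each WebRTC call returns an asynchronous operation, which is why the steps are written as coroutines that yield until the operation completes and check IsError before moving on.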

Next, we'll set up an event handler for ICE Candidates generated by the WebRTC peer connection:

This handler will transmit each ICE Candidate discovered by the WebRTC engine to the other peer.
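As an illustration of what travels over the signaling channel, the sketch below flattens a local candidate into primitive values and rebuilds a candidate on the receiving side. The helper class and method names are hypothetical, and defaulting a missing m-line index to 0 is an assumption made for brevity.

```csharp
using UnityEngine;
using Unity.WebRTC;

// Hypothetical helper; the names below are not part of the tutorial's scripts.
public static class IceCandidateSketch
{
    // Sending side: reduce the candidate to values that serialize easily.
    public static (string candidate, string sdpMid, int sdpMLineIndex) Flatten(RTCIceCandidate iceCandidate)
    {
        // SdpMLineIndex is nullable; 0 is used as a fallback in this sketch.
        return (iceCandidate.Candidate, iceCandidate.SdpMid, iceCandidate.SdpMLineIndex ?? 0);
    }

    // Receiving side: rebuild the candidate from the transmitted values.
    public static RTCIceCandidate Rebuild(string candidate, string sdpMid, int sdpMLineIndex)
    {
        var init = new RTCIceCandidateInit
        {
            candidate = candidate,
            sdpMid = sdpMid,
            sdpMLineIndex = sdpMLineIndex
        };

        return new RTCIceCandidate(init);
    }

    // Receiving side: hand the rebuilt candidate to the peer connection.
    public static void AddRemote(RTCPeerConnection peerConnection, string candidate, string sdpMid, int sdpMLineIndex)
    {
        var added = peerConnection.AddIceCandidate(Rebuild(candidate, sdpMid, sdpMLineIndex));
        if (!added)
        {
            Debug.LogWarning("The remote ICE candidate was rejected by the peer connection.");
        }
    }
}
```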

Now, let's tackle the handling of received media tracks:

  • OnTrack is triggered each time a track is received from another peer. Although we're only processing video tracks in this tutorial, audio tracks can be managed similarly.
  • Upon receiving a video track, we subscribe to the OnVideoReceived event to be notified when the video stream's texture becomes available.
  • RemoteVideoReceived is an event to which the UI layer will subscribe and set up the rendering of the received texture representing the video stream.
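A minimal sketch of this flow, assuming a hypothetical class name and a peer connection created without configuration, could look like this:

```csharp
using System;
using UnityEngine;
using Unity.WebRTC;

// Hypothetical class; in the tutorial this logic is part of VideoManager.
public class RemoteTrackSketch : MonoBehaviour
{
    // The UI layer subscribes to this to render the remote video.
    public event Action<Texture> RemoteVideoReceived;

    private RTCPeerConnection _peerConnection;

    private void Awake()
    {
        _peerConnection = new RTCPeerConnection();
        _peerConnection.OnTrack = OnTrack;
    }

    private void OnTrack(RTCTrackEvent trackEvent)
    {
        // Only video tracks are handled here; audio tracks are ignored.
        if (trackEvent.Track is VideoStreamTrack videoTrack)
        {
            videoTrack.OnVideoReceived += OnVideoReceived;
        }
    }

    // Called once the texture carrying the remote video becomes available.
    private void OnVideoReceived(Texture texture)
    {
        RemoteVideoReceived?.Invoke(texture);
    }

    private void OnDestroy()
    {
        _peerConnection?.Close();
        _peerConnection?.Dispose();
    }
}
```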

We'll also define a few public members to allow UI interaction with the VideoManager:

  • CanConnect and IsConnected are properties that will inform the UI about the connection's state, controlling the interactability of the Connect and Disconnect buttons.
  • The SetActiveCamera method configures the active camera device, utilizing Unity's WebCamTexture class.
  • The Connect method initiates the WebRTC connection setup sequence upon user interaction.
  • The Disconnect method closes and cleans up the WebRTC connection when the user clicks the Disconnect button to end the session.
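The camera and disconnect handling might be sketched as below. Several details are assumptions: the class name is hypothetical, IsConnected is derived from the connection state, replacing a previous track via RemoveTrack is one possible approach, and the WebRTC.Update coroutine is started because the package documentation asks for it when streaming video. The Connect method is omitted here.

```csharp
using UnityEngine;
using Unity.WebRTC;

// Hypothetical class; in the tutorial these members belong to VideoManager.
public class LocalVideoSketch : MonoBehaviour
{
    public bool IsConnected =>
        _peerConnection != null &&
        _peerConnection.ConnectionState == RTCPeerConnectionState.Connected;

    private RTCPeerConnection _peerConnection;
    private VideoStreamTrack _videoTrack;
    private RTCRtpSender _videoSender;

    private void Awake()
    {
        _peerConnection = new RTCPeerConnection();

        // Assumed requirement: lets the package process video frames each frame.
        StartCoroutine(WebRTC.Update());
    }

    public void SetActiveCamera(WebCamTexture activeCamera)
    {
        if (_peerConnection == null)
        {
            return; // already disconnected
        }

        // Remove the track of the previously selected camera, if any.
        if (_videoSender != null)
        {
            _peerConnection.RemoveTrack(_videoSender);
            _videoTrack.Dispose();
        }

        // Wrap the camera texture in a video track and send it to the peer.
        _videoTrack = new VideoStreamTrack(activeCamera);
        _videoSender = _peerConnection.AddTrack(_videoTrack);
    }

    public void Disconnect()
    {
        _videoTrack?.Dispose();
        _videoTrack = null;
        _videoSender = null;

        // A closed connection cannot be reused; a new one is needed to reconnect.
        _peerConnection?.Close();
        _peerConnection?.Dispose();
        _peerConnection = null;
    }

    private void OnDestroy() => Disconnect();
}
```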

The VideoManager class is now complete.

The last step is to attach the VideoManager to a game object. Go to the Main scene, create a new game object, and add the VideoManager script to it:

The UI layer

Let's now focus on constructing the UI layer logic.

Begin by editing the PeerView.cs script to add the following method:

This method will be invoked to display the video from the other participant. The Texture argument, which originates from the video track delivered by WebRTC's OnTrack event, contains the incoming video stream. Additionally, the method adjusts the dimensions of the RawImage component to maintain the correct aspect ratio of the incoming video.
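The aspect-ratio adjustment comes down to a small calculation, sketched here as a standalone helper. The class and method names are hypothetical, and the original script may compute the size differently; the idea is to scale the video by the smaller of the two width and height ratios so that it fits inside the available area.

```csharp
using System;

// Hypothetical helper illustrating the aspect-ratio calculation.
public static class AspectFit
{
    // Returns the largest size that fits in the container while keeping
    // the source's aspect ratio.
    public static (float width, float height) Fit(
        float sourceWidth, float sourceHeight, float maxWidth, float maxHeight)
    {
        if (sourceWidth <= 0f || sourceHeight <= 0f)
        {
            throw new ArgumentException("Source dimensions must be positive.");
        }

        // The smaller ratio is the one that keeps both dimensions inside the container.
        float scale = Math.Min(maxWidth / sourceWidth, maxHeight / sourceHeight);
        return (sourceWidth * scale, sourceHeight * scale);
    }
}
```

For example, a 1280x720 texture shown in a 640x640 area is scaled by 0.5 to 640x360. In a component like PeerView, the result could then be assigned to the RawImage's rectTransform.sizeDelta. The guard against non-positive dimensions matters because a camera that has not started yet can report an invalid texture size.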

With the PeerView class complete, we can move on to the UIManager.

Open the UIManager script and declare these fields:

The _activeCamera will reference the currently selected camera, while _videoManager references the VideoManager we've previously created.

Next, add the Awake method:

  • We retrieve the VideoManager instance using Unity’s FindObjectOfType method. This method is known to be slow, but since we’re focusing on WebRTC in this tutorial, it’ll suffice.
  • We check if any camera devices are available and print an error to the console if not. Make sure your camera is accessible to Unity and that this error does not appear; otherwise, the video streaming feature will not function.
  • The OnConnectButtonClicked and OnDisconnectButtonClicked callback methods are added as listeners to the onClick events of the Connect and Disconnect buttons.
  • The camera selection dropdown is populated with names from the WebCamTexture.devices array, which enumerates the cameras recognized by the operating system. Keep in mind that a camera might still be inaccessible to Unity. For example, on Windows, Unity will not be able to access the camera device if it's already being used by another application.
  • The camera dropdown's onValueChanged event is linked to the SetActiveCamera method to switch the active camera.
  • Finally, we subscribe to VideoManager's RemoteVideoReceived event to handle incoming video tracks as they are received.

Next, we’ll define the SetActiveCamera method:

  • This method is subscribed to the camera dropdown's onValueChanged event, which provides the index of the selected camera. We use _cameraDropdown.options[deviceIndex].text to retrieve the device's name.
  • If a camera is already active, we stop it before starting the new one.
  • We create an instance of WebCamTexture and call Play() to start capturing the video from the camera device.
  • We then check the _activeCamera.isPlaying property to verify that the camera device indeed starts. Starting the camera may sometimes fail. For example, an application cannot access a camera device on Windows if another application already uses it.
  • Finally, the camera is passed to the VideoManager using the PassActiveCameraToVideoManager coroutine.

Now, add the following coroutine:

This method passes the camera to the PeerView, representing the local peer, and to the VideoManager. This is delayed until the camera is updated for the first time. Starting a camera is an asynchronous operation, and it is possible that the first frames will return an invalid texture. To get a sense of why we wait for the camera to start before passing it further, consider the PeerView.SetVideoTexture method: if the passed video texture is invalid, the calculated texture size will most likely be invalid as well.
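The waiting logic can be sketched as a coroutine that polls WebCamTexture.didUpdateThisFrame. The class name, the onReady callback, and the five-second timeout are assumptions added for illustration and are not taken from the original script.

```csharp
using System;
using System.Collections;
using UnityEngine;

// Hypothetical component demonstrating how to wait for the first camera frame.
public class CameraStartSketch : MonoBehaviour
{
    private void Start()
    {
        if (WebCamTexture.devices.Length == 0)
        {
            Debug.LogError("No camera devices found.");
            return;
        }

        var webCam = new WebCamTexture(WebCamTexture.devices[0].name);
        webCam.Play();

        StartCoroutine(WaitForFirstFrame(
            webCam,
            ready => Debug.Log($"Camera ready: {ready.width}x{ready.height}")));
    }

    // Invokes onReady once the camera has delivered its first frame.
    // The timeout is an assumption so the coroutine cannot wait forever.
    private IEnumerator WaitForFirstFrame(
        WebCamTexture webCam, Action<WebCamTexture> onReady, float timeoutSeconds = 5f)
    {
        float deadline = Time.realtimeSinceStartup + timeoutSeconds;

        while (!webCam.didUpdateThisFrame)
        {
            if (Time.realtimeSinceStartup > deadline)
            {
                Debug.LogError("The camera did not deliver a frame in time.");
                yield break;
            }

            yield return null; // try again on the next frame
        }

        onReady(webCam);
    }
}
```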

Next, add the OnRemoteVideoReceived method:

This method is an event handler subscribed to the _videoManager.RemoteVideoReceived event. It’ll pass the received video texture to the remote peer view.

Now, define the click event handlers for the Connect and Disconnect buttons:

Next, add this code:

This ensures the first camera in the dropdown is activated by default upon starting the application.

Finally, implement the Update method:

The Update method dynamically updates the interactability of the Connect and Disconnect buttons based on the current connection state. While suitable for our tutorial, a more event-driven approach might be preferable for production to avoid checking the state every frame.

With these additions, the UIManager is fully functional and ready to manage the user interface for our video streaming application.

Complete Project

You can download the complete project from this repository.

Run the project

Everything is now ready to test our application. Make sure the signaling server is running and that the server's IPv4 address is correctly set in the WebSocket component of both clients, as explained in The Server IP section. Launch the Unity project on two devices and press the Connect button on either device. You should see a video stream being successfully exchanged between them:

Keep in mind that even within a home network, a peer-to-peer connection might not be established due to specific network configurations. Addressing such connectivity issues falls beyond the scope of this tutorial.

Stream's Video SDK for Unity

While this guide provides a hands-on approach to building video streaming capabilities from scratch using WebRTC and Unity, we understand the complexity and time investment required to create a robust real-time communication solution. For developers seeking a more streamlined, ready-to-use option that minimizes integration efforts, Stream's Video SDK for Unity might be a suitable option.

To kickstart your development with Stream's Video SDK for Unity, we recommend exploring the following tutorials tailored to specific use cases:

Conclusion

In this tutorial, we've taken a deep dive into implementing real-time video streaming in Unity using WebRTC, focusing on the practical steps necessary to set up a peer-to-peer video-sharing connection. We started by establishing a basic signaling server using the websocket-sharp library, which is crucial for initiating the WebRTC connection between peers. We then transitioned to Unity, utilizing the Unity WebRTC package to handle the video streaming functionality. With this knowledge, you're now equipped to explore the vast potential of interactive media in your Unity projects.