Unity Video Calling Tutorial

Introduction

This tutorial teaches you how to build Zoom/WhatsApp-style video calling for your app.

  • Calls run on Stream's global edge network for optimal latency & reliability.
  • Permissions give you fine-grained control over who can do what.
  • Video quality and codecs are automatically optimized.
  • Powered by Stream's Video Calling API.

Prerequisites

To follow this tutorial, you need the Unity Editor installed. We'll use the 2021.3.26f1 LTS version, but any supported Unity version will work. You can check the complete list of supported Unity Editor versions here.

Step 1 - Download Starting Project

To get you started quickly, we've prepared a starting Unity project that you can download from our GitHub repository. You can download it using git or click here to download the project as a zip file.

Download the project and open it in Unity Editor.

The starting project contains the UI components that we will use throughout this tutorial.

Step 2 - Add Stream's Video SDK for Unity

Follow the installation section to see how to import Stream's Video SDK into a Unity Project.

After completing this step, you should see the Stream Video & Audio Chat SDK package under Packages in the Project window.

Imported SDK package

Step 3 - Setup Video Manager

In the Project window:

  1. Open the VideoCallingTutorial scene from the Scenes folder
  2. Navigate to Assets -> Scripts folder
  3. Create a new script file and name it VideoManager.cs
  4. Open the VideoManager.cs in your code editor and replace it with the following script:

Let's go through this script step by step to understand what's happening.

These fields:

define the authorization variables: api key, user id and the user token - essential to connect a User to the Stream Video API. The SerializeField attribute exposes them in Unity's Inspector and allows us to set them in the Unity Editor.

The _joinCallId variable will be used to store the ID of the call that you will be joining.

Here, we instantiate the client for Stream's Video & Audio SDK:

StreamVideoClient is the main component that you'll be using to access features of the Stream Video API.

Next, we wrap the authorization credentials in a convenient struct:

And finally, we call the ConnectUserAsync method that will attempt to establish a connection:

We're using .NET's modern async/await syntax, which makes writing asynchronous code that waits for server response very clean & concise.

After the await completes, we should be connected to the Stream server.

We've wrapped the asynchronous ConnectUserAsync method in a try/catch block. If you're unfamiliar with handling exceptions in asynchronous methods: we advise you to always wrap awaited methods in a try/catch block. Otherwise, an exception thrown during an async operation can easily go unnoticed.
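The original script listing is not reproduced in this text, so here is a minimal sketch of what the walkthrough above describes. StreamVideoClient, ConnectUserAsync, _joinCallId, and the StreamClient property are named in this tutorial. Everything else is an assumption made for illustration: the field names _apiKey, _userId, and _userToken, the IStreamVideoClient interface, the StreamVideoClient.CreateDefaultClient() factory, the AuthCredentials struct and its argument order, and the StreamVideo.Core and StreamVideo.Libs.Auth namespaces. Check them against the SDK version you installed.

```csharp
using System;
using StreamVideo.Core;       // assumed namespace of StreamVideoClient
using StreamVideo.Libs.Auth;  // assumed namespace of AuthCredentials
using UnityEngine;

// Sketch only: connects a user to the Stream server when the scene loads.
public class VideoManager : MonoBehaviour
{
    // Exposed so that other scripts (e.g. a UI manager) can reach the client.
    public IStreamVideoClient StreamClient { get; private set; }

    // [SerializeField] makes private fields editable in the Inspector.
    [SerializeField]
    private string _apiKey;

    [SerializeField]
    private string _userId;

    [SerializeField]
    private string _userToken;

    [SerializeField]
    private string _joinCallId;

    private async void Awake()
    {
        // Assumed factory method that creates the SDK client instance.
        StreamClient = StreamVideoClient.CreateDefaultClient();

        // Assumed constructor order: API key, user ID, user token.
        var credentials = new AuthCredentials(_apiKey, _userId, _userToken);

        try
        {
            await StreamClient.ConnectUserAsync(credentials);
            Debug.Log("User connected to Stream server");
        }
        catch (Exception e)
        {
            // Without this catch, a failed connection could fail silently.
            Debug.LogException(e);
        }
    }
}
```

Awake is declared async void here because Unity calls it as a message and cannot await it; that is also why the try/catch sits inside the method rather than around a caller.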

Next, go to the Hierarchy window (make sure the VideoCallingTutorial scene is open), create an empty game object and name it VideoManager:

Created VideoManager empty Game Object

Next, drag the newly created VideoManager.cs script onto the VideoManager game object and save the scene.

You should now have a game object with the VideoManager.cs script attached.

GameObject with attached VideoManager.cs script

Once you select the VideoManager game object, you should see the Api Key, User Id, User Token, and Join Call Id fields visible in the Inspector window.

VideoManager.cs fields visible in the Inspector window

Step 4 - Connect a user to Stream server

To run this script and connect a user to the Stream Video API, you need a valid set of authorization credentials. In an actual project, you'd obtain the API key assigned to your application from the Stream Dashboard. The user ID and the user token would typically be generated by your backend service using one of our server-side SDKs. To keep things simple, you can use the following set of credentials for this tutorial:

Here are credentials to try out the app with:

Property    Value
API Key     Waiting for an API key ...
Token       Token is generated ...
User ID     Loading ...
Call ID     Creating random call ID ...

For testing you can join the call on our web-app: Join Call

Note: The test credentials provided above are unique for each browser session. To join the call from multiple devices, make sure you copy and use the same Call ID.

Copy the values from the window above and paste them into the corresponding fields in the VideoManager game object:

  • API KEY -> Api Key
  • Token -> User Token
  • User ID -> User Id
  • Call ID -> Join Call Id

Filled Credentials

Save the scene.

Once you run the project (Press the Play button in Unity Editor), you should see a log in Unity Console confirming that the user is connected to the Stream server:

Log confirming that the user is connected to the Stream server

The following sections will use the Call ID that you've pasted into the Join Call Id field.

Step 5 - Add Methods to Join and Leave a call

In this step, we'll add methods to Join and Leave the call - they'll be called when the user clicks on the Join and Leave buttons in the UI.

Open the VideoManager.cs script and apply the following changes:

First, add the _activeCall field to the class:

The fields part of the class should now look like this:

Next, add the following methods to the VideoManager class:
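Since the listings for this step are missing from this text, the following is a hedged sketch of how the _activeCall field and the two methods could look. JoinCallAsync, LeaveCallAsync, _activeCall, and _joinCallId are names used by this tutorial. The SDK calls are assumptions: the IStreamCall type, the client's JoinCallAsync overload with StreamCallType.Default and the create/ring/notify arguments, and IStreamCall.LeaveAsync() should all be verified against the SDK. Only the members relevant to this step are shown; StreamClient is assumed to be assigned in Awake as in Step 3.

```csharp
using System.Threading.Tasks;
using StreamVideo.Core;                 // assumed namespace
using StreamVideo.Core.StatefulModels;  // assumed namespace of IStreamCall
using UnityEngine;

// Sketch only: the join/leave part of VideoManager.
public class VideoManager : MonoBehaviour
{
    // Assumed to be created and connected in Awake (see Step 3).
    public IStreamVideoClient StreamClient { get; private set; }

    [SerializeField]
    private string _joinCallId;

    // The call we are currently in, or null when not in a call.
    private IStreamCall _activeCall;

    public async Task JoinCallAsync()
    {
        // Assumed signature: call type, call ID, then flags controlling
        // whether the call is created if missing and whether to ring/notify.
        _activeCall = await StreamClient.JoinCallAsync(
            StreamCallType.Default, _joinCallId,
            create: true, ring: false, notify: false);

        Debug.Log($"Joined call: {_joinCallId}");
    }

    public async Task LeaveCallAsync()
    {
        if (_activeCall == null)
        {
            Debug.LogWarning("Leave request ignored. There is no active call to leave.");
            return;
        }

        await _activeCall.LeaveAsync();
        _activeCall = null;
    }
}
```

Both methods return Task rather than void so that the UI layer can await them and update the buttons only after the operation has finished.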

Step 6 - Create UI Manager script

Next, we'll add a UI manager script that will keep references to all interactable UI elements and handle user interactions.

UX tips:

  • On standalone platforms (Windows, macOS, Linux), it's typical to show a dropdown with all available devices from which the user can pick which one to use
  • On mobile platforms (Android, iOS), the OS controls the camera & microphone devices. A typical pattern for camera handling is to pick the front camera by default and allow users to toggle between the front and the back camera. The standard approach for the microphone is to select the default microphone.

In this tutorial, we're going with the dropdowns.

  1. Go to the Scripts folder, create a new C# script, and name it UIManager.cs
  2. Open the script in your code editor and paste the following content:

Next, in the Unity Editor:

  1. Go to the Hierarchy window, create a new empty game object, and name it UIManager.
  2. Select the UIManager game object and attach the UIManager.cs script to it.
  3. Save the scene

You should now have a UIManager game object with UIManager.cs script attached to it: UI Manager script attached to a game object

Step 7 - Setup UIManager references in the inspector

The UIManager will use scene UI elements and the prefab asset that we've provided with the starting project. We'll now attach all references visible in the Inspector window when you select the UIManager game object:

Game Objects from the scene hierarchy

In case you're wondering how to do this quickly, you can follow these steps:

  1. Select the UIManager game object in the Hierarchy window and lock the window (Read Unity Docs on how to lock Inspector window)
  2. Use the Hierarchy's window search input to search for each game object by name and drag it into the corresponding field in the locked Inspector window.
  3. Unlock the Inspector window

Here's a full list of UIManager fields and the corresponding game objects that we want to attach:

  • Participants Container - Search for ParticipantsContainer in the Hierarchy window
  • Participant Prefab - This is the only one that we'll drag in from the Project window, drag in the Prefabs/ParticipantPanel prefab file
  • Microphone Dropdown - Search for the MicrophoneDropdown in the Hierarchy window
  • Camera Dropdown - Search for the CameraDropdown in the Hierarchy window
  • Join Button - Search for the JoinButton in the Hierarchy window
  • Leave Button - Search for the LeaveButton in the Hierarchy window
  • Video Manager - Search for the VideoManager in the Hierarchy window

Save the scene after assigning all the inspector references.

These are the game objects, from the Scene Hierarchy, that need to be attached to the UIManager script: References Scene Hierarchy Game Object

And from the project files, we've attached the Prefabs/ParticipantPanel prefab:

Prefab from the project files

After completing this step, the UIManager should have all references set up as shown below: UI Manager with all references assigned

Step 8 - Add the UI logic

Next, add the following code to the UIManager.cs script:

The OnJoinButtonClickedAsync method calls the VideoManager's JoinCallAsync method. Once the JoinCallAsync method await completes, it means that we have joined a call. We then update the UI to reflect this state -> hide the join button and show the leave button.

The OnLeaveButtonClickedAsync method calls the VideoManager's LeaveCallAsync method and once the await completes, it will also update the UI and show the join button and hide the leave button.

Next, add the following code to the UIManager.cs script:

The first two fields will store references to Stream's Video and Audio Managers that facilitate interactions with the camera and microphone. The lists will store available devices to choose from. We'll see this used in the following steps.

Next, add the following Start method to the UIManager.cs script:

Let's walk through this code step by step.

First, we get a reference to the StreamVideoClient instance exposed by the StreamClient property of the VideoManager. We then grab references to Stream's Audio and Video Manager classes that handle all interactions with the microphone and camera.

Note that we're executing this code using the Start method instead of the Awake method. This is because the StreamClient property is initialized by the VideoManager in the Awake method, which Unity calls before the Start method. This way, we ensure that the VideoManager has created the instance of Stream Client before the UIManager tries to access it.

Next, we handle the microphone dropdown:

  • ClearOptions is called to remove default options from the dropdown.
  • _audioDeviceManager.EnumerateDevices() fetches available microphone devices. We then add them to the _microphoneDevices list, so we can later retrieve the selected option by index.
  • _microphoneDevices.Select(d => d.Name).ToList() uses LINQ to get the Name property from each microphone struct.
  • _microphoneDropdown.AddOptions(microphoneLabels) adds the list of microphone names as dropdown entries.

You need at least one microphone device available. Otherwise, _audioDeviceManager.EnumerateDevices() will return no devices and the code will fail.

Next, we do the same for the camera dropdown:

  • ClearOptions is called to remove default options from the dropdown.
  • _videoDeviceManager.EnumerateDevices() fetches available camera devices. We then add them to the _cameraDevices list, so we can later retrieve the selected option by index.
  • _cameraDevices.Select(d => d.Name).ToList() uses LINQ to get the Name property from each camera object.
  • _cameraDropdown.AddOptions(cameraLabels) adds the list of camera names as dropdown entries.

You need at least one camera device available. Otherwise, _videoDeviceManager.EnumerateDevices() will return no devices and the code will fail.
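The microphone and camera dropdowns are filled in exactly the same way, so the four steps listed above can be sketched as one generic helper. This is an illustration, not the tutorial's original code: DeviceDropdownUtils and Populate are hypothetical names, and the sketch assumes the dropdowns in the starting project are TextMeshPro TMP_Dropdown components (swap in UnityEngine.UI.Dropdown if they are not; both offer ClearOptions and AddOptions).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using TMPro;

// Hypothetical helper: fills a dropdown from any collection of devices.
public static class DeviceDropdownUtils
{
    // Copies `devices` into `cache` (so a dropdown index maps back to a device)
    // and adds one dropdown entry per device, labeled via `getLabel`.
    // Returns false when no devices were found.
    public static bool Populate<T>(
        TMP_Dropdown dropdown,
        IEnumerable<T> devices,
        List<T> cache,
        Func<T, string> getLabel)
    {
        // Remove the placeholder options that come with the dropdown.
        dropdown.ClearOptions();

        cache.Clear();
        cache.AddRange(devices);

        // Build the visible labels in the same order as the cached devices.
        List<string> labels = cache.Select(getLabel).ToList();
        dropdown.AddOptions(labels);

        return cache.Count > 0;
    }
}
```

In the UIManager this could be called as Populate(_microphoneDropdown, _audioDeviceManager.EnumerateDevices(), _microphoneDevices, d => d.Name), and likewise for the camera. Checking the returned flag lets you log a clear warning when no device is connected instead of failing later on an empty list.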

Next, we subscribe to the buttons' onClick events:

And finally, we set the leave button to be hidden by default. We'll show it once we're connected to a call.
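Putting the button handling from this step together, a minimal sketch could look like the following. OnJoinButtonClickedAsync, OnLeaveButtonClickedAsync, JoinCallAsync, and LeaveCallAsync are names from this tutorial; the field names and the use of UnityEngine.UI.Button are assumptions, and the device dropdown code is left out to keep the example short. It relies on the VideoManager class from the earlier steps exposing JoinCallAsync and LeaveCallAsync as awaitable methods.

```csharp
using System;
using UnityEngine;
using UnityEngine.UI;

// Sketch only: the button-related part of UIManager.
public class UIManager : MonoBehaviour
{
    [SerializeField]
    private Button _joinButton;

    [SerializeField]
    private Button _leaveButton;

    [SerializeField]
    private VideoManager _videoManager;

    private void Start()
    {
        // React to button clicks.
        _joinButton.onClick.AddListener(OnJoinButtonClickedAsync);
        _leaveButton.onClick.AddListener(OnLeaveButtonClickedAsync);

        // The leave button only makes sense once we are in a call.
        _leaveButton.gameObject.SetActive(false);
    }

    private async void OnJoinButtonClickedAsync()
    {
        try
        {
            await _videoManager.JoinCallAsync();

            // We are in the call now: swap the visible button.
            _joinButton.gameObject.SetActive(false);
            _leaveButton.gameObject.SetActive(true);
        }
        catch (Exception e)
        {
            Debug.LogException(e);
        }
    }

    private async void OnLeaveButtonClickedAsync()
    {
        try
        {
            await _videoManager.LeaveCallAsync();

            // We left the call: show the join button again.
            _joinButton.gameObject.SetActive(true);
            _leaveButton.gameObject.SetActive(false);
        }
        catch (Exception e)
        {
            Debug.LogException(e);
        }
    }
}
```

The handlers are async void because button listeners cannot return a Task, which makes the try/catch inside each handler the only place where a failed join or leave can be reported.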

Step 8 - Test joining the call


You should now have the Api Key, User Id, User Token, and the Join CallId all filled: VideoManager references all filled

Remember to save the scene after you update the values in the inspector window.

Now run the project. After clicking the Join button, you should see a log confirming that we've connected to a call: Log confirming that we're connected to the call

Step 9 - Handle call participants

Next, we'll add handling of call participants. Call participants are usually other users who joined the same call. It's worth noting that a single user can join through multiple devices and therefore join as multiple participants. In that case, each participant object will have the same UserId but a different SessionId.

Open VideoManager.cs in your code editor and apply the following changes.

First, add the following code to the class:

Next, add these two methods:

Next, replace the JoinCallAsync method with the following code:

What we did here:

  • Iterate using a foreach loop over the call.Participants array that contains participants present in the call when we joined. We skip the participant representing the local user because usually, we want to show other participants only.
  • Subscribe to the call.ParticipantJoined event - triggered whenever a new participant joins the call
  • Subscribe to the call.ParticipantLeft event - triggered whenever a participant leaves the call

Next, replace the LeaveCallAsync method with the following code:

We've extended the LeaveCallAsync method to unsubscribe from events when we're leaving the call.

Our current implementation doesn't handle the ParticipantJoined and the ParticipantLeft events yet - it only propagates them. This is a common pattern that provides a separation of concerns between the business logic and the UI layer. In later steps, we'll handle these events in the UIManager class.
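As the listings for this step are not included in this text, here is a hedged sketch of the participant handling described above. call.Participants, call.ParticipantJoined, call.ParticipantLeft, and IStreamVideoCallParticipant are named in this tutorial. The remaining details are assumptions to verify against the SDK: the IsLocalParticipant property, the (sessionId, userId) signature of the ParticipantLeft handler, the client's JoinCallAsync overload, and the choice to forward only the session ID from VideoManager.

```csharp
using System;
using System.Threading.Tasks;
using StreamVideo.Core;                 // assumed namespace
using StreamVideo.Core.StatefulModels;  // assumed namespace
using UnityEngine;

// Sketch only: VideoManager extended with participant events.
public class VideoManager : MonoBehaviour
{
    // Propagated to the UI layer; VideoManager itself does not react to them.
    public event Action<IStreamVideoCallParticipant> ParticipantJoined;
    public event Action<string> ParticipantLeft; // argument: session ID

    // Assumed to be created and connected in Awake (see Step 3).
    public IStreamVideoClient StreamClient { get; private set; }

    [SerializeField]
    private string _joinCallId;

    private IStreamCall _activeCall;

    public async Task JoinCallAsync()
    {
        _activeCall = await StreamClient.JoinCallAsync(
            StreamCallType.Default, _joinCallId,
            create: true, ring: false, notify: false);

        // Report participants that were already in the call when we joined,
        // skipping the one that represents the local user.
        foreach (var participant in _activeCall.Participants)
        {
            if (!participant.IsLocalParticipant)
            {
                OnParticipantJoined(participant);
            }
        }

        // Report participants that join or leave later.
        _activeCall.ParticipantJoined += OnParticipantJoined;
        _activeCall.ParticipantLeft += OnParticipantLeft;
    }

    public async Task LeaveCallAsync()
    {
        if (_activeCall == null)
        {
            return;
        }

        // Unsubscribe first so no events arrive for a call we are leaving.
        _activeCall.ParticipantJoined -= OnParticipantJoined;
        _activeCall.ParticipantLeft -= OnParticipantLeft;

        await _activeCall.LeaveAsync();
        _activeCall = null;
    }

    private void OnParticipantJoined(IStreamVideoCallParticipant participant)
        => ParticipantJoined?.Invoke(participant);

    // Assumed SDK handler signature: session ID and user ID of the leaver.
    private void OnParticipantLeft(string sessionId, string userId)
        => ParticipantLeft?.Invoke(sessionId);
}
```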

Step 10 - Setup UI view for a call participant

Now let's prepare the logic that will handle every call participant and their tracks. Participant tracks represent either video or audio data they're streaming during the call.

Open the Scripts/ParticipantPanel.cs script in your code editor and replace it with the following code:

Now let's go through all of this code to understand what it's doing.

On the top of the class we have the public Init method. We'll call it after we spawn the participant panel prefab to initialize the view for a specific participant.

What the Init method does:

  • Iterates with a foreach loop over the available tracks. The OnTrackAdded method is called for each track - this method will handle receiving video or audio data from the track.
  • Subscribes to the TrackAdded event - this way we'll also handle tracks that were added later in the call.

Next is the OnTrackAdded method that handles receiving data from the track:

  • If the track is of StreamAudioTrack type, we get the AudioSource reference and call streamAudioTrack.SetAudioSourceTarget(audioSource); in order to start receiving audio data into this AudioSource component.
  • If the track is of StreamVideoTrack type, we get the RawImage reference and call streamVideoTrack.SetRenderTarget(rawImage); in order to start receiving video data into this RawImage component.

Lastly, we've defined the OnDestroy method in which we unsubscribe from the TrackAdded event. The OnDestroy is a special method called by Unity Engine when the object is about to get destroyed.
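The script for this step is not reproduced in this text, so the sketch below only illustrates the structure described above. Init, OnTrackAdded, OnDestroy, TrackAdded, StreamAudioTrack.SetAudioSourceTarget, and StreamVideoTrack.SetRenderTarget come from this tutorial. The GetTracks() method, the IStreamTrack interface, the handler signature, and the namespaces are assumptions. The sketch also assumes that the ParticipantPanel prefab already contains an AudioSource and a RawImage somewhere in its hierarchy.

```csharp
using StreamVideo.Core.StatefulModels;         // assumed namespace
using StreamVideo.Core.StatefulModels.Tracks;  // assumed namespace
using UnityEngine;
using UnityEngine.UI;

// Sketch only: view for a single call participant.
public class ParticipantPanel : MonoBehaviour
{
    private IStreamVideoCallParticipant _participant;

    public void Init(IStreamVideoCallParticipant participant)
    {
        _participant = participant;

        // Handle tracks that are already available...
        foreach (var track in _participant.GetTracks())
        {
            OnTrackAdded(_participant, track);
        }

        // ...and tracks that are added later in the call.
        _participant.TrackAdded += OnTrackAdded;
    }

    private void OnTrackAdded(IStreamVideoCallParticipant participant, IStreamTrack track)
    {
        switch (track)
        {
            case StreamAudioTrack streamAudioTrack:
                // Route the received audio into an AudioSource on this panel.
                var audioSource = GetComponentInChildren<AudioSource>();
                streamAudioTrack.SetAudioSourceTarget(audioSource);
                break;

            case StreamVideoTrack streamVideoTrack:
                // Render the received video into a RawImage on this panel.
                var rawImage = GetComponentInChildren<RawImage>();
                streamVideoTrack.SetRenderTarget(rawImage);
                break;
        }
    }

    private void OnDestroy()
    {
        // Init may never have been called, so guard against null.
        if (_participant != null)
        {
            _participant.TrackAdded -= OnTrackAdded;
        }
    }
}
```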

Step 11 - Add UI logic to handle call participants

Open the UIManager.cs in your code editor and make the following changes:

Add this field to the class:

It'll be used to store spawned participant UI panels. We use a dictionary so that, whenever a participant leaves the call, we can easily retrieve the corresponding UI panel and destroy it. We'll use the Session ID as the key because it uniquely identifies each participant in a call.

Next, add the following OnParticipantJoined method:

This method will be called for every participant in the call. It does 3 things:

  • Spawns a new instance of the ParticipantPanel prefab.
  • Calls the ParticipantPanel.Init method and passes the IStreamVideoCallParticipant object.
  • Adds the newly spawned panel to the _participantPanels dictionary using the participant's SessionId as key.

Next, add the following OnParticipantLeft method:

The OnParticipantLeft method will be called whenever a participant leaves the call - it retrieves the ParticipantPanel object that corresponds to the participant that left the call and destroys it.

Next, add these 2 lines anywhere in the Start method:

This will subscribe to the VideoManager events that we've added earlier and trigger the logic to handle participants joining or leaving the call.

Lastly, replace the OnLeaveButtonClickedAsync with the following code:

What was added is this part:

This will destroy all participant panels after we leave the call and clear the dictionary.
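The participant handling from this step can be sketched as follows. _participantPanels, OnParticipantJoined, OnParticipantLeft, ParticipantPanel.Init, and SessionId are names from this tutorial. The sketch assumes that VideoManager exposes ParticipantJoined as Action<IStreamVideoCallParticipant> and ParticipantLeft as Action<string> carrying the session ID; ClearParticipantPanels is a hypothetical helper name for the cleanup you would call after leaving the call. Button and dropdown code is omitted.

```csharp
using System.Collections.Generic;
using StreamVideo.Core.StatefulModels;  // assumed namespace
using UnityEngine;

// Sketch only: the participant-related part of UIManager.
public class UIManager : MonoBehaviour
{
    [SerializeField]
    private Transform _participantsContainer;

    [SerializeField]
    private ParticipantPanel _participantPrefab;

    [SerializeField]
    private VideoManager _videoManager;

    // Spawned panels, keyed by the participant's session ID.
    private readonly Dictionary<string, ParticipantPanel> _participantPanels
        = new Dictionary<string, ParticipantPanel>();

    private void Start()
    {
        _videoManager.ParticipantJoined += OnParticipantJoined;
        _videoManager.ParticipantLeft += OnParticipantLeft;
    }

    private void OnParticipantJoined(IStreamVideoCallParticipant participant)
    {
        // Spawn the panel under the container and bind it to the participant.
        var panel = Instantiate(_participantPrefab, _participantsContainer);
        panel.Init(participant);

        _participantPanels.Add(participant.SessionId, panel);
    }

    private void OnParticipantLeft(string sessionId)
    {
        // Ignore sessions we never created a panel for (e.g. the local user).
        if (!_participantPanels.TryGetValue(sessionId, out var panel))
        {
            return;
        }

        _participantPanels.Remove(sessionId);
        Destroy(panel.gameObject);
    }

    // Hypothetical helper: call this after leaving the call.
    private void ClearParticipantPanels()
    {
        foreach (var panel in _participantPanels.Values)
        {
            Destroy(panel.gameObject);
        }

        _participantPanels.Clear();
    }
}
```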

Step 12 - Setup Camera and Microphone

At this point, we're able to receive video & audio streamed by others. The last missing feature is sending video & audio from our local device. Let's get right to it!

Open the UIManager.cs script.

Next, add the following method:

The deviceIndex argument is provided by the dropdown's onValueChanged event and will represent the index of the selected dropdown option. Previously, we've stored all microphone devices in the _microphoneDevices list, so we can retrieve the selected microphone by index. We can now call the _audioDeviceManager.SelectDevice(selectedMicrophone, enable: true); to select the device in Stream's Audio Device Manager. The additional enable argument defines whether the microphone should capture audio input immediately.

Next, add this method that will handle the camera dropdown:

We use the deviceIndex argument, provided by the dropdown's onValueChanged event, to get the CameraDeviceInfo representing the selected camera. We then supply it into the _videoDeviceManager.SelectDevice method to choose this camera for video capturing. This method accepts additional arguments, allowing you to control the maximum resolution of the sent video stream and whether the video capturing should start immediately.

Lastly, append this code to the Start method:

Let's go step by step through these changes.

Here:

we subscribe to the onValueChanged events in order to switch the active microphone or camera device whenever the user selects a different device from a dropdown menu.

Lastly:

we call the OnMicrophoneDeviceChanged & OnCameraDeviceChanged methods providing 0 as a deviceIndex argument so that we start capturing the input from the first device from the list. This is a reasonable default behaviour.
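Both handlers in this step do the same thing: they map a dropdown index to a stored device and pass that device to a SelectDevice call. A small generic helper can make this mapping explicit and guard against an out-of-range index, which is what happens with index 0 when no device was found. DeviceSelection and TrySelect are hypothetical names introduced for illustration; they are not part of the SDK or of the tutorial's original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: resolves a dropdown index to a device and selects it.
public static class DeviceSelection
{
    // Invokes `select` with the device stored at `deviceIndex`.
    // Returns false (and does nothing) when the index is out of range,
    // for example when the device list is empty.
    public static bool TrySelect<T>(
        IReadOnlyList<T> devices,
        int deviceIndex,
        Action<T> select)
    {
        if (deviceIndex < 0 || deviceIndex >= devices.Count)
        {
            return false;
        }

        select(devices[deviceIndex]);
        return true;
    }
}
```

Inside OnMicrophoneDeviceChanged this could be used as TrySelect(_microphoneDevices, deviceIndex, d => _audioDeviceManager.SelectDevice(d, enable: true)), with the SelectDevice call taken from the text above. The camera handler would follow the same shape with _cameraDevices and _videoDeviceManager.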

Step 13 - Android/iOS

An additional step is needed if you wish to test this app on Android or iOS. You can skip this step if you only test on a standalone platform (Win, Linux, macOS). Mobile platforms require applications to explicitly request permission to use the camera or microphone, and the user must grant it. Otherwise, the app may be unable to access the camera or microphone.

Handling permissions on Android

Add the using UnityEngine.Android; directive to the top of the UIManager.cs file.

And add this code to the Start method.
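The original snippet is not included in this text. As a sketch of one way to request the permissions, the example below uses Unity's UnityEngine.Android.Permission class. It is written as a separate component with the hypothetical name AndroidPermissionsRequester so that it is self-contained; the same calls can be placed in the Start method of UIManager as described above. The array-based RequestUserPermissions overload is assumed to be available in your Unity version (it is part of recent LTS releases).

```csharp
using UnityEngine;
#if UNITY_ANDROID
using UnityEngine.Android;
#endif

// Hypothetical component: asks for microphone and camera access on Android.
public class AndroidPermissionsRequester : MonoBehaviour
{
    private void Start()
    {
#if UNITY_ANDROID
        bool hasMicrophone = Permission.HasUserAuthorizedPermission(Permission.Microphone);
        bool hasCamera = Permission.HasUserAuthorizedPermission(Permission.Camera);

        if (!hasMicrophone || !hasCamera)
        {
            // Request both permissions in a single call so that the system
            // shows the dialogs one after another.
            Permission.RequestUserPermissions(new[]
            {
                Permission.Microphone,
                Permission.Camera
            });
        }
#endif
    }
}
```

The #if UNITY_ANDROID guards keep the script compiling and inert on other build targets.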

Handling permissions on iOS

Add this code to the Start method of the UIManager class.
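The original snippet is missing here as well. One possible approach, shown as a sketch, uses Unity's Application.RequestUserAuthorization API. IosPermissionsRequester is a hypothetical component name used to keep the example self-contained; the same calls can live in UIManager instead. Note that iOS builds also need the camera and microphone usage descriptions filled in under Player Settings, otherwise the request cannot succeed.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component: asks for camera and microphone access on iOS.
public class IosPermissionsRequester : MonoBehaviour
{
    // Start can be a coroutine, which lets us wait for each system dialog.
    private IEnumerator Start()
    {
        if (!Application.HasUserAuthorization(UserAuthorization.WebCam))
        {
            yield return Application.RequestUserAuthorization(UserAuthorization.WebCam);
        }

        if (!Application.HasUserAuthorization(UserAuthorization.Microphone))
        {
            yield return Application.RequestUserAuthorization(UserAuthorization.Microphone);
        }

        if (!Application.HasUserAuthorization(UserAuthorization.WebCam) ||
            !Application.HasUserAuthorization(UserAuthorization.Microphone))
        {
            Debug.LogWarning("Camera or microphone access was not granted.");
        }
    }
}
```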

Step 14 - Test

We're now ready to test our app!

It's best to use multiple devices for testing. Otherwise, you may run into conflicts when multiple applications attempt to use the microphone or camera on a single device.

A quick testing setup:

  1. Right-click the Join Call button in the box below and copy the URL.
  2. Send this URL to another device you have, e.g., your smartphone. You can send it to multiple devices if you want.
  3. Run the project in the Unity Editor and click the Join button to join the call.
  4. Open the Join Call URL you've sent to other devices and join the call.

There are a few things to double-check:

  • Ensure that you're using the same App ID and Call ID on all devices you wish to connect
  • On standalone platforms (Win, macOS, Linux), headphones with a microphone can be represented as two devices: Headphones and the Headset. Select the Headset in the Unity app to use the microphone in the application and as the audio output device in the OS.

Here are credentials to try out the app with:

Property    Value
API Key     Waiting for an API key ...
Token       Token is generated ...
User ID     Loading ...
Call ID     Creating random call ID ...

For testing you can join the call on our web-app: Join Call

Note: Test credentials provided above will be unique for each browser session. To join the call from multiple devices, ensure you copy and use the same Call ID

Congrats! You should now be able to join the video call from multiple devices.

Mobile platforms

If you're primarily interested in mobile platforms, you can follow our docs on building for Android or iOS and test the app we've just created on your smartphone as a bonus step.

Troubleshooting

In order for this tutorial app to work, Unity needs to be able to access the microphone and the camera devices.

If you're seeing cryptic errors like the one below: Unity not being able to access microphone or camera

This means that Unity is not able to access the selected device. In that case, make sure that no other application is using the device and try again. You may also need to restart the Unity Editor.

Recap

Please do let us know if you ran into any issues while building a video calling app with our Unity SDK. Our team is also happy to review your UI designs and offer recommendations on how to achieve it with Stream.

We've used Stream's Video Calling API, which means calls run on a global edge network of video servers. Because these servers are closer to your users, calls have lower latency and better reliability. The Unity SDK enables you to build in-app video calling, audio rooms and livestreaming very easily.

We hope you've enjoyed this tutorial and please do feel free to reach out if you have any suggestions or questions.

Final Thoughts

In this video app tutorial we built a fully functioning Unity Video Calling app with our Video SDK for Unity library.

Both the video SDK for Unity and the API have plenty more features available to support more advanced use-cases.


Next Steps

Create your free Stream account to start building with our Video & Audio SDKs, or contact our team if you have additional questions.
