Build a 1-on-1 Video Chat with SwiftUI and Dolby.io

In this tutorial, we'll integrate video chat into an iOS application using Dolby.io's Interactivity APIs, formerly known as Voxeet. Video chat can easily be integrated with Stream Chat for a seamless communication experience.

Note: the library is still named Voxeet.

For this part, the application will support 1-on-1 private chat. Since Dolby is a pure client-side library, we only configure our iOS application. However, to let the UI indicate whether a user has a call waiting, we use a few endpoints in the backend.

Note: Because these are minor and largely stub implementations, we don't go into them in this tutorial. Please refer to the source code on GitHub if you're curious. Also, ensure the backend is running if following along with this tutorial.

The app performs these steps:

  • Initialize the Voxeet libraries.
  • When a user navigates to the "People" screen, show a video icon next to each user's name.
  • When a user taps this icon, start and join a Voxeet conference with a unique alias. The user waits for the other party to join. The application informs the backend of the new call.
  • When the other user taps the same icon on their side, they join the existing conference, and the two are placed in a 1-on-1 call.
  • When either user leaves, the call is ended in the application. The backend is informed of the call ending.

Voxeet's UXKit takes care of the connection and presentation for this 1-on-1 call. Our application needs to create and join the call, and the UXKit will overlay the video call UI.

Let's dive in.

Step 1: Create a Dolby Account and Install Voxeet Dependencies

Go to dolby.io and create an account. Once registered, navigate to the Dashboard (you can click in the top-right if you're not there). Find the list of applications:

Dolby Applications

If you already have an app ("my first app"), click into it. If you don't, create an app by hitting "Add New App". Now grab the API key and secret for our application. The keys we need are under the "Interactivity APIs" section:

Dolby Keys

Click on "Show Secret" to view the secret.

Next, add VoxeetUXKit to our Podfile and run pod install:

https://gist.github.com/nparsons08/4f91d14cd8baca194de16e4920d6001c

Note: Since the WebRTC dependency is larger than GitHub's file limit, we did not include the Pods in the repo. Please ensure you run pod install before attempting to run!

Step 2: Configure the Voxeet UXKit Library

Since the Voxeet library is not tied to a backend user account, we can configure it when the application loads. We do this in the AppDelegate:

https://gist.github.com/nparsons08/feda96aeb508eff058664691c3caf379

Change the key and secret placeholders to the values you retrieved in Step 1. We configure Voxeet not to use push notifications, to turn on the speaker and video by default, and to appear maximized. We also set telecom to true. When this is set, the conference behaves like a cellular call: when either party hangs up or declines the request (decline is not implemented in this tutorial), the call ends for both.
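The snippet below is a minimal sketch of that configuration, not a copy of the gist. It assumes the 2.x releases of VoxeetSDK and VoxeetUXKit, so property names may differ in other versions, and the key and secret strings are placeholders for your own values.

```swift
import UIKit
import VoxeetSDK
import VoxeetUXKit

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate {
    func application(
        _ application: UIApplication,
        didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?
    ) -> Bool {
        // Placeholders: replace with the key and secret from the Dolby.io dashboard.
        VoxeetSDK.shared.initialize(consumerKey: "YOUR_CONSUMER_KEY",
                                    consumerSecret: "YOUR_CONSUMER_SECRET")
        VoxeetUXKit.shared.initialize()

        // Assumed 2.x property names; check them against your installed version.
        VoxeetSDK.shared.notification.push.type = .none        // no push notifications
        VoxeetSDK.shared.conference.defaultBuiltInSpeaker = true // speaker on by default
        VoxeetSDK.shared.conference.defaultVideo = true          // video on by default
        VoxeetUXKit.shared.appearMaximized = true                // full-size call UI
        VoxeetUXKit.shared.telecom = true                        // hang-up ends the call for both

        return true
    }
}
```

In a shipping app you would likely load the key and secret from configuration rather than hard-coding them in source.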

If you'd like to use push notifications to notify a user when a call is incoming, check out CallKit combined with VoxeetSDK.shared.notification.push.type = .callKit. This is out of scope for this tutorial.

Step 3: Starting a Call

Next, we add a video action to the list of people, in between the start chat and follow icons from previous parts of this series. Here's what our view will look like:

People List

To do this, we just add another image view in our ListView in PeopleView:

https://gist.github.com/nparsons08/f1ef26ab57fb8d9a7b33aa131f699b70

This is a simple system icon that we load via systemName. With the icon in place, we add a tap handler via onTapGesture to start our 1-on-1 call. Ignore the icon's foregroundColor call for now. We'll get to that in a bit.

Let's look at startConferenceCall:

https://gist.github.com/nparsons08/e0c174a13cd7b6365d34d51020080b16

Here we create our conference call using Voxeet with an alias. We use this alias as an identifier, so the other user's application knows how to join the same call. The call to .create yields a conference object. Before joining, we call our backend via startCall to register the call, so the other user knows there's a call waiting. This is simply a POST request:

https://gist.github.com/nparsons08/9fabd98f5b5f7d4ddd1ee5c81276cc2a

Note: We use Voxeet's conference implementation as it's perfect for facilitating a video chat between two people. The conference object is more powerful than this (multiple users, broadcast streams, screen sharing, etc.), but here we only use it for a 1-on-1 call. The terms conference and call are used interchangeably in this tutorial, given the scope of our application.
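To make the alias and the POST request described above more concrete, here is a small sketch that runs without the Voxeet SDK. Everything in it is an assumption for illustration: the conferenceAlias helper, the /calls path, the localhost URL, and the to field are hypothetical and may not match the gists or the real backend.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

/// Hypothetical helper: yields the same alias whichever user starts the call.
func conferenceAlias(between userA: String, and userB: String) -> String {
    // Sorting makes the result independent of argument order.
    return [userA, userB].sorted().joined(separator: "-")
}

/// Hypothetical payload; the real backend's field names may differ.
struct StartCallPayload: Codable, Equatable {
    let from: String
    let to: String
}

/// Builds (but does not send) a POST that registers a waiting call.
func makeStartCallRequest(baseURL: URL, from: String, to: String) throws -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("calls"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(StartCallPayload(from: from, to: to))
    return request
}

// Example values only; the backend address is an assumption.
let alias = conferenceAlias(between: "bob", and: "alice")
let startRequest = try makeStartCallRequest(
    baseURL: URL(string: "http://localhost:8080")!,
    from: "bob",
    to: "alice"
)
```

Deriving the alias from both user names means neither side has to be told the alias out of band. Note that a plain separator like this could collide if user names themselves contain a hyphen.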

Once we've notified the backend of the call, we join the conference via .join. Since we're using Voxeet's UXKit, a video chat UI slides up from the bottom automatically:

Video Waiting

Step 4: Joining a Call

Now that the user has started a call with someone, we want the other user to see that a call has started so they can join. To keep things simple, we just turn the video icon red if there's an active call. Recall from above that we set the video icon color in foregroundColor by calling .videoIconColor:

https://gist.github.com/nparsons08/0687ba0e1c02eda4d5f0ad7e0590168f

Here we'll check a @State var calls for a call from the other user. If we do find one, we color the icon red. The calls var gets initialized when PeopleView appears via fetch:

https://gist.github.com/nparsons08/0d8c16e323f966b1bebad962f3ade037

We call account.fetchCalls to retrieve a list of active calls for the current user:

https://gist.github.com/nparsons08/78f87c35de4784a976f8216266c23f5a

This is simply a GET request against our mock backend (refer to source). The response object will have a from field, indicating which user started the call.
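The check behind the red icon can be sketched in a few lines. This is an illustration under assumptions: the response is taken to be a JSON array of objects, only the from field comes from the tutorial, and IconTint is a hypothetical stand-in for SwiftUI's Color so the sketch runs without SwiftUI.

```swift
import Foundation

/// Mirrors the one field the tutorial mentions; any other fields are unknown.
struct Call: Codable {
    let from: String
}

/// Hypothetical stand-in for SwiftUI's Color.
enum IconTint {
    case red       // a call from this user is waiting
    case standard  // no call waiting
}

/// Returns .red when the given user has started a call with us.
func videoIconTint(for user: String, calls: [Call]) -> IconTint {
    return calls.contains(where: { $0.from == user }) ? .red : .standard
}

// Assumed response shape, for illustration only.
let sampleResponse = """
[{"from": "alice"}]
""".data(using: .utf8)!

let calls = try JSONDecoder().decode([Call].self, from: sampleResponse)
```

In the SwiftUI view, the same comparison would return a Color to pass to foregroundColor.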

Now that we know someone is calling, the user will see a screen like this:

People List - Active

If the user joins the call, the UXKit UI will change to show the call has started:

Video Active

Step 5: Leaving a Call

When a user is done, they hang up using the end call icon. We don't need to do anything special on our side to end the call; Voxeet takes care of that. We do, however, need to listen for the conference call ending so we can notify the backend of the change.

Note: In a production application, you'd likely want to use Voxeet's push notifications and CallKit bindings or look at binding Dolby's Interactivity API WebSockets to the backend. While the approach below works, it is less robust than using the full Notification system built into Dolby's Interactivity APIs.

We'll bind to the notification center and listen for the conference call to end. Upon ending, we'll notify the backend that the call has finished:

https://gist.github.com/nparsons08/0e78a521def60816711fca5c2ac93bba

We use SwiftUI's convenient .onReceive method to bind to the NotificationCenter event .VTConferenceDestroyed. When this fires, we know the call has finished. Since we configured VoxeetUXKit.shared.telecom = true, the notification fires on both sides when either party hangs up. When we get the notification, we simply call the backend to stop the call via account.stopCall and fetch the new set of calls to refresh the view (effectively removing the red video indicator). The stopCall is a simple DELETE request to the backend:

https://gist.github.com/nparsons08/33b94bbd8d45a762ccf13fedb8ff17a4
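The listen-then-clean-up pattern can be tried out without SwiftUI or the Voxeet SDK. The sketch below swaps in a plain Foundation observer for .onReceive and a stand-in notification name for .VTConferenceDestroyed. It also assumes that the stop request uses the HTTP DELETE method and a hypothetical /calls/<user> path. It builds the request but does not send it.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

extension Notification.Name {
    /// Stand-in for Voxeet's .VTConferenceDestroyed so this runs without the SDK.
    static let conferenceEndedStandIn = Notification.Name("ConferenceEndedStandIn")
}

/// Builds (but does not send) the stop request; the path is hypothetical.
func makeStopCallRequest(baseURL: URL, user: String) -> URLRequest {
    let url = baseURL.appendingPathComponent("calls").appendingPathComponent(user)
    var request = URLRequest(url: url)
    request.httpMethod = "DELETE"
    return request
}

// Collects the requests the handler would send.
var pendingRequests: [URLRequest] = []

// queue: nil runs the block synchronously on the posting thread.
let observer = NotificationCenter.default.addObserver(
    forName: .conferenceEndedStandIn, object: nil, queue: nil
) { _ in
    let request = makeStopCallRequest(
        baseURL: URL(string: "http://localhost:8080")!,  // assumed backend address
        user: "alice"
    )
    pendingRequests.append(request)
}

// Simulate the SDK announcing that the conference ended.
NotificationCenter.default.post(name: .conferenceEndedStandIn, object: nil)
NotificationCenter.default.removeObserver(observer)
```

In the real view, .onReceive manages the subscription for you, so there is no observer token to remove.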

And we're done! We now have a 1-on-1 chat between two parties integrated into our application using Dolby's Voxeet offering. For more information on how to integrate Stream Chat within your application, have a look at the GitHub repo.
