WebRTC For The Brave

WebRTC for Swift Developers

In this article, we will explore what it takes to render a video chat using WebRTC in Swift.

When looking into implementing video calling, WebRTC (Web Real-Time Communication) is one of the best options available. While you are most likely better off using an existing WebRTC-based stack in production, building a barebones implementation yourself is a great way to deepen your understanding of how WebRTC works.
We have written more about the basics of WebRTC in our WebRTC for the Brave series. If you are unfamiliar with the concept, read our introduction to WebRTC first.

WebRTC on iOS

Google maintains the open-source WebRTC project. For iOS, a prebuilt library is available that lets you integrate WebRTC with iOS-specific features such as camera, audio, and video access, network handling, and much more.
However, the guides Google provides rely on CocoaPods, and Google’s CocoaPods distribution is deprecated.
To make the current WebRTC build available as a Swift package, we keep a Swift package definition up to date with the latest WebRTC binaries. At Stream, we use this framework daily to build our video service and support our customers.
Using our build of the WebRTC binaries as a Swift package makes it easy to integrate WebRTC into your codebase.
In Xcode, add the package by searching for its Git URL:

https://github.com/GetStream/stream-video-swift-webrtc.git

Or, when adding it to a Package.swift file:

swift
…
.package(url: "https://github.com/GetStream/stream-video-swift-webrtc.git", branch: "main")
…
.product(name: "StreamWebRTC", package: "stream-video-swift-webrtc")
…

We’ve made a customized version of stasel/WebRTC-iOS available.
You can find our version of the WebRTC example in this GitHub repository.

We’ve added StreamWebRTC as a Swift package to this project and made a few minor adjustments so the code compiles with our WebRTC framework build.

Let’s now take a quick tour of the WebRTC-iOS-Sample.
The core component in the entire codebase is the WebRTCClient. You can find the complete 300+ lines of code here.

PeerConnection

Looking at this code, you’ll see that the WebRTCClient is, at its core, an implementation of the RTCPeerConnectionDelegate protocol.

The WebRTC framework provides this protocol, and a PeerConnection is one of the core concepts related to connecting with a remote peer. It handles the Session Description Protocol (SDP) and allows you to respond to the changes associated with ICE (Interactive Connectivity Establishment).

The WebRTCClient initiates and handles the SDP offer. It uses the SignalingClient to send and receive SDP messages to and from the signaling server.
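
To make that flow concrete, here is a minimal sketch of how the offer could be created. It is close to what the sample does but not a verbatim copy: it assumes a peerConnection created by the factory described in the next section, and a completion handler through which the caller passes the finished SDP to the SignalingClient.

swift
// Minimal sketch of initiating an SDP offer. `peerConnection` is assumed to be
// the RTCPeerConnection owned by the WebRTCClient; the caller is expected to
// forward the resulting SDP through the SignalingClient.
func offer(completion: @escaping (RTCSessionDescription) -> Void) {
    let constraints = RTCMediaConstraints(
        mandatoryConstraints: [kRTCMediaConstraintsOfferToReceiveAudio: kRTCMediaConstraintsValueTrue,
                               kRTCMediaConstraintsOfferToReceiveVideo: kRTCMediaConstraintsValueTrue],
        optionalConstraints: nil)

    self.peerConnection.offer(for: constraints) { sdp, error in
        guard let sdp = sdp, error == nil else { return }

        // Apply the offer locally, then hand it off so it can be sent over signaling.
        self.peerConnection.setLocalDescription(sdp) { _ in
            completion(sdp)
        }
    }
}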

RTCPeerConnectionFactory

The WebRTCClient maintains an instance of an RTCPeerConnectionFactory. The factory is responsible for creating the peer connection and for building and keeping track of audio/video tracks and sources. It’s a key component that bridges to the native C++ code of the core WebRTC library.

If you look at the creation of the WebRTCClient, you will also notice that the servers used for ICE are Google’s STUN servers.

swift
private static let factory: RTCPeerConnectionFactory = {
  RTCInitializeSSL()
  let videoEncoderFactory = RTCDefaultVideoEncoderFactory()
  let videoDecoderFactory = RTCDefaultVideoDecoderFactory()
  return RTCPeerConnectionFactory(encoderFactory: videoEncoderFactory, decoderFactory: videoDecoderFactory)
}()
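
Those STUN servers are passed to the peer connection through an RTCConfiguration when the factory creates it. Below is a rough sketch of that step; the exact server list and configuration flags in the sample may differ slightly.

swift
// Sketch: creating the peer connection with Google's public STUN servers.
// In the sample this happens inside the WebRTCClient's initializer, which then
// sets the WebRTCClient itself as the connection's delegate.
let config = RTCConfiguration()
config.iceServers = [RTCIceServer(urlStrings: ["stun:stun.l.google.com:19302"])]
config.sdpSemantics = .unifiedPlan                    // modern Unified Plan SDP semantics
config.continualGatheringPolicy = .gatherContinually  // keep gathering candidates as the network changes

let constraints = RTCMediaConstraints(mandatoryConstraints: nil, optionalConstraints: nil)
let peerConnection = WebRTCClient.factory.peerConnection(with: config,
                                                         constraints: constraints,
                                                         delegate: nil)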
        

The Signaling Client

As we described earlier, the WebRTCClient initiates signaling, but the SignalingClient does all the encoding, decoding, and handling of SDP messages.

The SignalingClient needs to connect to a signaling server, and fortunately, there’s a basic implementation of one in our repository as well. The signaling server is nothing more than a WebSocket server that forwards every message it receives to all other connected clients: a simple but essential task that makes the exchange of SDP and ICE candidates (and with it, NAT traversal) possible.

swift
import Foundation
import StreamWebRTC

protocol SignalClientDelegate: AnyObject {
    func signalClientDidConnect(_ signalClient: SignalingClient)
    func signalClientDidDisconnect(_ signalClient: SignalingClient)
    func signalClient(_ signalClient: SignalingClient, didReceiveRemoteSdp sdp: RTCSessionDescription)
    func signalClient(_ signalClient: SignalingClient, didReceiveCandidate candidate: RTCIceCandidate)
}

final class SignalingClient {

    private let decoder = JSONDecoder()
    private let encoder = JSONEncoder()
    private let webSocket: WebSocketProvider
    weak var delegate: SignalClientDelegate?

    init(webSocket: WebSocketProvider) {
        self.webSocket = webSocket
    }

    func connect() {
        self.webSocket.delegate = self
        self.webSocket.connect()
    }

    func send(sdp rtcSdp: RTCSessionDescription) {
        let message = Message.sdp(SessionDescription(from: rtcSdp))
        do {
            let dataMessage = try self.encoder.encode(message)

            self.webSocket.send(data: dataMessage)
        }
        catch {
            debugPrint("Warning: Could not encode sdp: \\(error)")
        }
    }

    func send(candidate rtcIceCandidate: RTCIceCandidate) {
        let message = Message.candidate(IceCandidate(from: rtcIceCandidate))
        do {
            let dataMessage = try self.encoder.encode(message)
            self.webSocket.send(data: dataMessage)
        }
        catch {
            debugPrint("Warning: Could not encode candidate: \\(error)")
        }
    }
}

extension SignalingClient: WebSocketProviderDelegate {
    func webSocketDidConnect(_ webSocket: WebSocketProvider) {
        self.delegate?.signalClientDidConnect(self)
    }

    func webSocketDidDisconnect(_ webSocket: WebSocketProvider) {
        self.delegate?.signalClientDidDisconnect(self)

        // try to reconnect every two seconds
        DispatchQueue.global().asyncAfter(deadline: .now() + 2) {
            debugPrint("Trying to reconnect to signaling server...")
            self.webSocket.connect()
        }
    }

    func webSocket(_ webSocket: WebSocketProvider, didReceiveData data: Data) {
        let message: Message
        do {
            message = try self.decoder.decode(Message.self, from: data)
        }
        catch {
            debugPrint("Warning: Could not decode incoming message: \\(error)")
            return
        }

        switch message {
        case .candidate(let iceCandidate):
            self.delegate?.signalClient(self, didReceiveCandidate: iceCandidate.rtcIceCandidate)
        case .sdp(let sessionDescription):
            self.delegate?.signalClient(self, didReceiveRemoteSdp: sessionDescription.rtcSessionDescription)
        }

    }
}
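
On the app side, whoever owns the SignalingClient (the main view controller in the sample) implements SignalClientDelegate and hands incoming messages back to the WebRTC layer. Here is a rough sketch, assuming the WebRTCClient exposes set(remoteSdp:completion:) and set(remoteCandidate:completion:) wrappers around its peer connection, as the sample does.

swift
// Sketch: forwarding incoming signaling messages to the WebRTCClient.
// `webRTCClient` and the view controller type are assumed from the sample.
extension MainViewController: SignalClientDelegate {

    func signalClientDidConnect(_ signalClient: SignalingClient) {
        // Signaling is up; offers and candidates can now be exchanged.
    }

    func signalClientDidDisconnect(_ signalClient: SignalingClient) {
        // The SignalingClient retries the connection on its own (see above).
    }

    func signalClient(_ signalClient: SignalingClient, didReceiveRemoteSdp sdp: RTCSessionDescription) {
        // Apply the remote offer/answer to the peer connection.
        self.webRTCClient.set(remoteSdp: sdp) { error in
            if let error = error {
                debugPrint("Warning: Could not set remote sdp: \(error)")
            }
        }
    }

    func signalClient(_ signalClient: SignalingClient, didReceiveCandidate candidate: RTCIceCandidate) {
        // Add the remote ICE candidate so connectivity checks can proceed.
        self.webRTCClient.set(remoteCandidate: candidate) { _ in }
    }
}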
        

The Signaling Server

For signaling, we will use the server described in our lesson about Signaling Servers in our WebRTC tutorials.

Connecting Audio and Video

After establishing a PeerConnection, it is time to create a few things:

  • A Video Source
  • A Video Capturer that feeds camera (or, on the simulator, file-based) frames into the Video Source
  • Local and Remote Video Tracks
  • Local and Remote Data Channels

These components give you enough control over the camera capabilities to select a maximum resolution and control video rendering.

In the background, the WebSocket connection to the signaling server remains open. When network conditions require adjusting or changing an ICE candidate, the WebRTCClient is informed and responds accordingly.
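
This happens through the RTCPeerConnectionDelegate callbacks that the WebRTCClient implements. Two of the most relevant ones are sketched below; they hand newly gathered ICE candidates and connection state changes to the client's own app-facing delegate (the delegate method names follow the sample, and the remaining required callbacks are omitted here).

swift
// Sketch: excerpt of the RTCPeerConnectionDelegate callbacks that keep ICE moving.
// `delegate` is the WebRTCClient's app-facing delegate (e.g. the view controller).
func peerConnection(_ peerConnection: RTCPeerConnection, didGenerate candidate: RTCIceCandidate) {
    // A new local ICE candidate was gathered; relay it to the remote peer via signaling.
    self.delegate?.webRTCClient(self, didDiscoverLocalCandidate: candidate)
}

func peerConnection(_ peerConnection: RTCPeerConnection, didChange newState: RTCIceConnectionState) {
    // The ICE connection state changed (connected, disconnected, failed, ...).
    self.delegate?.webRTCClient(self, didChangeConnectionState: newState)
}

With that plumbing in place, creating the audio and video senders looks like this: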

swift
private func createMediaSenders() {
    let streamId = "stream"

    // Audio
    let audioTrack = self.createAudioTrack()
    self.peerConnection.add(audioTrack, streamIds: [streamId])

    // Video
    let videoTrack = self.createVideoTrack()
    self.localVideoTrack = videoTrack
    self.peerConnection.add(videoTrack, streamIds: [streamId])
    self.remoteVideoTrack = self.peerConnection.transceivers.first { $0.mediaType == .video }?.receiver.track as? RTCVideoTrack

    // Data
    if let dataChannel = createDataChannel() {
        dataChannel.delegate = self
        self.localDataChannel = dataChannel
    }
}

private func createAudioTrack() -> RTCAudioTrack {
    let audioConstrains = RTCMediaConstraints(mandatoryConstraints: nil, optionalConstraints: nil)
    let audioSource = WebRTCClient.factory.audioSource(with: audioConstrains)
    let audioTrack = WebRTCClient.factory.audioTrack(with: audioSource, trackId: "audio0")
    return audioTrack
}

private func createVideoTrack() -> RTCVideoTrack {
    let videoSource = WebRTCClient.factory.videoSource()

    #if targetEnvironment(simulator)
    self.videoCapturer = RTCFileVideoCapturer(delegate: videoSource)
    #else
    self.videoCapturer = RTCCameraVideoCapturer(delegate: videoSource)
    #endif

    let videoTrack = WebRTCClient.factory.videoTrack(with: videoSource, trackId: "video0")
    return videoTrack
}
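
Note that on a real device the RTCCameraVideoCapturer still has to be started with a concrete camera, format, and frame rate; this is where the resolution control mentioned earlier comes from, and it is also where the tracks get attached to renderer views. Below is a rough sketch along the lines of the sample's startCaptureLocalVideo(renderer:) and renderRemoteVideo(to:) helpers; treat it as an illustration rather than a drop-in copy.

swift
// Sketch: start capturing from the front camera at its highest resolution and
// attach the local and remote tracks to renderer views (e.g. RTCMTLVideoView).
// CMVideoFormatDescriptionGetDimensions needs CoreMedia/AVFoundation in scope.
func startCaptureLocalVideo(renderer: RTCVideoRenderer) {
    guard let capturer = self.videoCapturer as? RTCCameraVideoCapturer,
          let camera = RTCCameraVideoCapturer.captureDevices().first(where: { $0.position == .front }),
          // Pick the format with the widest frame, i.e. the maximum resolution.
          let format = RTCCameraVideoCapturer.supportedFormats(for: camera)
              .max(by: { CMVideoFormatDescriptionGetDimensions($0.formatDescription).width < CMVideoFormatDescriptionGetDimensions($1.formatDescription).width }),
          let fps = format.videoSupportedFrameRateRanges.map({ $0.maxFrameRate }).max()
    else { return }

    capturer.startCapture(with: camera, format: format, fps: Int(fps))
    self.localVideoTrack?.add(renderer)   // local preview
}

func renderRemoteVideo(to renderer: RTCVideoRenderer) {
    self.remoteVideoTrack?.add(renderer)  // remote participant
}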
        

The linked GitHub repository contains all the parts needed to set up a working WebRTC connection.

Run the WebRTC iOS Sample Project.

Conclusion

In this lesson, you've worked with an iOS-based implementation of WebRTC. We covered several of the core concepts, such as peer-to-peer communication, the signaling server, SDP (Session Description Protocol), and ICE (Interactive Connectivity Establishment). You've also learned how to implement real-time video streaming on iOS using Swift. For further exploration, you can access the complete source code in our GitHub repository.