
Capturing Snapshots

Capturing a snapshot of a call is a very common use case for video-call products. Luckily, StreamVideo provides you with the means to implement it easily in your own app. Below you will find the few simple steps required:

Overview

  • Capturing snapshots during a call: Capture snapshots from any CallView.
  • Capturing local participant photo: Capture photos from the local participant's camera.

Capturing snapshots during a call

The snapshot ViewModifier

The StreamVideoSwiftUI SDK ships with a ViewModifier that should be attached to the container of the view hierarchy we would like to capture in our snapshots. You can simply attach the modifier as in the example below:

YourHostView()
    .snapshot(
        trigger: snapshotTrigger,
        snapshotHandler: { snapshot in
            // Further processing ...
        }
    )

The snapshot ViewModifier requires a few parameters:

  • trigger: The object that controls when the capture should occur. Usually the button that triggers the snapshot and the view that will be captured live in different view hierarchies, which makes passing bindings from one root view to another difficult. For this reason the ViewModifier expects a trigger of type SnapshotTriggering to bridge this communication. Below we will see a simple implementation for the trigger.
  • snapshotHandler: The closure that will be called once the snapshot capture completes. The SDK passes the snapshot to the handler for further processing.

The snapshot trigger

As we discussed above, in a common scenario the button that triggers the snapshot capture and the view that we need to snapshot will be in different view hierarchies. The trigger exists to bridge the communication between them and make capturing a snapshot easy and performant. Here is a simple implementation of the trigger:

final class StreamSnapshotTrigger: SnapshotTriggering {
    lazy var binding: Binding<Bool> = Binding<Bool>(
        get: { [weak self] in
            self?.currentValueSubject.value ?? false
        },
        set: { [weak self] in
            self?.currentValueSubject.send($0)
        }
    )

    var publisher: AnyPublisher<Bool, Never> { currentValueSubject.eraseToAnyPublisher() }

    private let currentValueSubject = CurrentValueSubject<Bool, Never>(false)

    init() {}

    func capture() {
        binding.wrappedValue = true
    }
}

To use it easily in our Views, we are going to insert it into the SDK's dependency injection system for easy fetching:

/// Provides the default value of the `StreamSnapshotTrigger` class.
struct StreamSnapshotTriggerKey: InjectionKey {
    @MainActor
    static var currentValue: StreamSnapshotTrigger = .init()
}

extension InjectedValues {
    /// Provides access to the `StreamSnapshotTrigger` class to the views and view models.
    var snapshotTrigger: StreamSnapshotTrigger {
        get {
            Self[StreamSnapshotTriggerKey.self]
        }
        set {
            Self[StreamSnapshotTriggerKey.self] = newValue
        }
    }
}

With the trigger in the DI system, we can easily create a SnapshotButtonView that we can add to our controls (or anywhere in our app) to trigger a snapshot:

struct SnapshotButtonView: View {
    @Injected(\.snapshotTrigger) var snapshotTrigger

    var body: some View {
        Button {
            snapshotTrigger.capture()
        } label: {
            Label {
                Text("Capture snapshot")
            } icon: {
                Image(systemName: "circle.inset.filled")
            }
        }
    }
}
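
To surface the button during a call, one option is to return it alongside the default controls from your ViewFactory. The layout below is only a sketch; adjust it to your own controls:

func makeCallControlsView(viewModel: CallViewModel) -> some View {
    HStack {
        // The default call controls shipped with the SDK.
        DefaultViewFactory.shared.makeCallControlsView(viewModel: viewModel)

        // Our snapshot button, resolved through the DI system above.
        SnapshotButtonView()
    }
}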

The snapshotHandler

Once the snapshot is ready, the StreamVideo SDK will call the snapshotHandler we passed to the snapshot ViewModifier. At this point we have control over the snapshot and what we want to do with it.
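
For instance, a minimal handler could simply store the snapshot in the user's photo library (a sketch; saving requires the NSPhotoLibraryAddUsageDescription key in your Info.plist):

YourHostView()
    .snapshot(trigger: snapshotTrigger) { snapshot in
        // Persist the captured UIImage to the photo library.
        UIImageWriteToSavedPhotosAlbum(snapshot, nil, nil, nil)
    }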

In the example below, we want to snapshot the CallParticipants View and, using WebRTC's event channel, send it to all participants in the call. When a new snapshot is received, each participant will display a simple ToastView with the new snapshot.

First, we need to attach the snapshot ViewModifier to the desired View. The ViewFactory comes in handy here, as we can update it as follows:

class CustomViewFactory: ViewFactory {

    @Injected(\.snapshotTrigger) var snapshotTrigger

    func makeVideoParticipantsView(
        viewModel: CallViewModel,
        availableFrame: CGRect,
        onChangeTrackVisibility: @escaping @MainActor (CallParticipant, Bool) -> Void
    ) -> some View {
        DefaultViewFactory.shared.makeVideoParticipantsView(
            viewModel: viewModel,
            availableFrame: availableFrame,
            onChangeTrackVisibility: onChangeTrackVisibility
        )
        .snapshot(trigger: snapshotTrigger) { [weak viewModel, weak self] in
            guard
                let resizedImage = self?.resize(image: $0, to: CGSize(width: 30, height: 30)),
                let snapshotData = resizedImage.jpegData(compressionQuality: 0.8)
            else { return }
            Task {
                do {
                    try await viewModel?.call?.sendCustomEvent([
                        "snapshot": .string(snapshotData.base64EncodedString())
                    ])
                    log.debug("Snapshot was sent successfully ✅")
                } catch {
                    log.error("Snapshot failed to send with error: \(error)")
                }
            }
        }
    }

    private func resize(
        image: UIImage,
        to targetSize: CGSize
    ) -> UIImage? {
        guard
            image.size.width > targetSize.width || image.size.height > targetSize.height
        else {
            return image
        }

        let widthRatio = targetSize.width / image.size.width
        let heightRatio = targetSize.height / image.size.height

        // Determine the scale factor that preserves aspect ratio
        let scaleFactor = min(widthRatio, heightRatio)

        let scaledWidth = image.size.width * scaleFactor
        let scaledHeight = image.size.height * scaleFactor
        let targetRect = CGRect(
            x: (targetSize.width - scaledWidth) / 2,
            y: (targetSize.height - scaledHeight) / 2,
            width: scaledWidth,
            height: scaledHeight
        )

        // Create a new image context
        UIGraphicsBeginImageContextWithOptions(targetSize, false, 0)
        image.draw(in: targetRect)

        let newImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()

        return newImage
    }
}
danger

As we want to send the snapshot using WebRTC's internal event channel, we are affected by some limitations: the size of each event cannot exceed the 100KB limit. For this reason, we reduce the snapshot's size and quality before sending it.
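
If you want to guard against oversized payloads explicitly, you can check the encoded size before sending. A minimal sketch (the canSend helper is ours; the constant mirrors the limit above):

func canSend(_ snapshotData: Data) -> Bool {
    // Base64 inflates the payload by roughly a third, so check the
    // encoded size rather than the raw JPEG size against the limit.
    let encodedSize = snapshotData.base64EncodedString().utf8.count
    return encodedSize <= 100 * 1024
}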

To present the snapshot to every participant, we need to subscribe to call events of type CustomVideoEvent and extract the snapshot before passing it to the UI for presentation. For simplicity, we are going to encapsulate all this logic in a DemoSnapshotViewModel object, like below:

@MainActor
final class DemoSnapshotViewModel: ObservableObject {

    private let viewModel: CallViewModel
    private var snapshotEventsTask: Task<Void, Never>?

    @Published var toast: Toast?

    init(_ viewModel: CallViewModel) {
        self.viewModel = viewModel
        subscribeForSnapshotEvents()
    }

    private func subscribeForSnapshotEvents() {
        guard let call = viewModel.call else {
            snapshotEventsTask?.cancel()
            snapshotEventsTask = nil
            return
        }

        snapshotEventsTask = Task {
            for await event in call.subscribe(for: CustomVideoEvent.self) {
                guard
                    let imageBase64Data = event.custom["snapshot"]?.stringValue,
                    let imageData = Data(base64Encoded: imageBase64Data),
                    let image = UIImage(data: imageData)
                else {
                    return
                }

                toast = .init(
                    style: .custom(
                        baseStyle: .success,
                        icon: AnyView(
                            Image(uiImage: image)
                                .resizable()
                                .frame(maxWidth: 30, maxHeight: 30)
                                .aspectRatio(contentMode: .fit)
                                .clipShape(Circle())
                        )
                    ),
                    message: "Snapshot captured!"
                )
            }
        }
    }
}

Finally, we can use the snapshotViewModel in the CallView (or any other View) to present a toast when a new snapshot arrives, as you can see below:

YourRootView()
    .toastView(toast: $snapshotViewModel.toast)
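
How the snapshotViewModel is created depends on your setup. A minimal sketch, assuming a hypothetical CustomCallContainer that wraps the SDK's CallView:

struct CustomCallContainer: View {
    @ObservedObject var viewModel: CallViewModel
    @StateObject private var snapshotViewModel: DemoSnapshotViewModel

    @MainActor
    init(viewModel: CallViewModel) {
        self.viewModel = viewModel
        // Bridge the CallViewModel into the snapshot view model so it can
        // subscribe to the call's CustomVideoEvents.
        _snapshotViewModel = StateObject(wrappedValue: DemoSnapshotViewModel(viewModel))
    }

    var body: some View {
        // In a real app you would likely hold on to a single factory instance
        // instead of creating one per body evaluation.
        CallView(viewFactory: CustomViewFactory(), viewModel: viewModel)
            .toastView(toast: $snapshotViewModel.toast)
    }
}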

Capturing local participant photo

To capture photos of the local participant, we can leverage AVCaptureSession. The Call object allows us to attach an AVCapturePhotoOutput & an AVCaptureVideoDataOutput to the active AVCaptureSession and AVCaptureDevice.

let photoOutput: AVCapturePhotoOutput = .init()
let videoOutput: AVCaptureVideoDataOutput = .init()

guard call?.cId != oldValue?.cId else { return }
do {
    if #available(iOS 16.0, *) {
        try call?.addVideoOutput(videoOutput)
        /// Following Apple guidelines for videoOutputs from here:
        /// https://developer.apple.com/library/archive/technotes/tn2445/_index.html
        videoOutput.alwaysDiscardsLateVideoFrames = true
    } else {
        try call?.addCapturePhotoOutput(photoOutput)
    }
} catch {
    log.error("Failed to setup for localParticipant snapshot", error: error)
}
note

A maximum of one output of each type may be added. For applications linked on or after iOS 16.0, this restriction no longer applies to AVCaptureVideoDataOutputs. When adding more than one AVCaptureVideoDataOutput, AVCaptureSession.hardwareCost must be taken into account. Given that WebRTC adds a videoOutput for frame processing, we cannot accept videoOutputs on versions prior to iOS 16.0.

To capture a photo we can choose one of the following ways, depending on which output we would like to use:

Capture photo using AVCapturePhotoOutput

func capturePhoto() {
    guard !photoOutput.connections.isEmpty else { return }
    photoOutput.capturePhoto(with: .init(), delegate: self)
}

// MARK: - AVCapturePhotoCaptureDelegate

func photoOutput(
    _ output: AVCapturePhotoOutput,
    didFinishProcessingPhoto photo: AVCapturePhoto,
    error: Error?
) {
    if let error {
        log.error("Failed to capture photo.", error: error)
    } else if let data = photo.fileDataRepresentation() {
        Task { await sendImageData(data) }
    }
}

Capture photo using AVCaptureVideoDataOutput

func captureVideoFrame() {
    guard !videoOutput.connections.isEmpty else { return }
    Task { await state.setIsCapturingVideoFrame(true) }
}

// MARK: - AVCaptureVideoDataOutputSampleBufferDelegate

func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
) {
    Task {
        guard await state.isCapturingVideoFrame else { return }

        if let imageBuffer = sampleBuffer.imageBuffer {
            let ciImage = CIImage(cvPixelBuffer: imageBuffer)
            if let data = UIImage(ciImage: ciImage).jpegData(compressionQuality: 1) {
                await sendImageData(data)
            }
        }

        await state.setIsCapturingVideoFrame(false)
    }
}

We use a State actor to control when a frame should be captured:

actor State {
    private(set) var isCapturingVideoFrame = false

    func setIsCapturingVideoFrame(_ value: Bool) {
        isCapturingVideoFrame = value
    }
}

Both examples use the same method to send the image data:

func sendImageData(_ data: Data) async {
    defer { videoOutput.setSampleBufferDelegate(nil, queue: nil) }

    guard
        let snapshot = UIImage(data: data),
        let resizedImage = resize(image: snapshot, to: .init(width: 30, height: 30)),
        let snapshotData = resizedImage.jpegData(compressionQuality: 0.8)
    else {
        return
    }

    do {
        try await call?.sendCustomEvent([
            "snapshot": .string(snapshotData.base64EncodedString())
        ])
    } catch {
        log.error("Failed to send image.", error: error)
    }
}

We remove the videoOutput's delegate so that we stop receiving frames and avoid video delays due to unnecessary processing.
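
Because the delegate is removed after each send, it needs to be re-attached before the next capture, otherwise captureOutput will no longer be called. One way to handle this is inside captureVideoFrame (a sketch; the queue label is illustrative):

private let videoOutputQueue = DispatchQueue(label: "video-frame-capturing")

func captureVideoFrame() {
    guard !videoOutput.connections.isEmpty else { return }
    // Re-attach the delegate so frames start flowing again after
    // `sendImageData(_:)` removed it.
    videoOutput.setSampleBufferDelegate(self, queue: videoOutputQueue)
    Task { await state.setIsCapturingVideoFrame(true) }
}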

Conclusion

By combining the snapshot ViewModifier with a trigger and your own logic inside the snapshotHandler, you control both when a snapshot is captured and what happens with it afterwards.
