In this tutorial, we'll build encrypted chat on iOS using Swift by combining Stream Chat and Virgil Security. Together, they make it easy to build a highly secure solution with all the features you expect. These two services allow developers to integrate chat that is zero-knowledge. The application embeds Virgil Security's E3Kit with Stream Chat's Swift components.
Note: All source code for this application is available on GitHub.
What is end-to-end encryption?
End-to-end encryption means that two people can exchange messages over the internet without anyone else being able to read them, even if the transmission or storage is compromised. To do this, the app encrypts the message before it leaves a user's device, and only the intended recipient can decrypt the message.
Virgil Security is a vendor that allows us to create end-to-end encryption via public/private key technology. Virgil provides a platform that allows us to create, store, and offer robust end-to-end encryption securely.
During this tutorial, we will create a Stream Chat app that uses Virgil's encryption to prevent anyone except the intended parties from reading messages. No one in your company, nor any cloud provider you use, can read these messages. Even if a malicious person gained access to the database containing them, all they would see is encrypted text, called ciphertext.
Building an Encrypted Chat Application
To build this application, we'll mostly rely on a few libraries from Stream Chat and Virgil (please check out the dependencies in the source to see which versions). Our final product will encrypt text on the device before sending a message. Decryption and verification will both happen on the receiver's device. Stream's Chat API will only see ciphertext, ensuring our users' data is never seen by anyone else, including you.
To accomplish this, the app performs the following process:
- A user authenticates with your backend.
- The iOS app requests a Stream auth token and API key from the backend. The Swift app creates a Stream Chat Client for that user.
- The mobile app requests a Virgil auth token from the backend and registers with Virgil, which generates their private and public key. The app stores the private key locally, and the public key in Virgil Cloud.
- Once the user decides who they want to chat with, the app creates and joins a Stream Chat Channel.
- The app asks Virgil's API, via E3Kit, for the receiver's public key.
- The user types a message and encrypts it with the receiver's public key using E3Kit, then sends it to Stream. After that, Stream Chat relays the message to the receiver. Stream only receives ciphertext, meaning they will never see the original message.
- When the message is received, the app decrypts and verifies it using E3Kit and passes it along to Stream Chat's iOS components for display.
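The shape of this flow can be sketched without either SDK. In the toy example below, every name is hypothetical: MessageCipher stands in for E3Kit, Relay stands in for Stream, and PlaceholderCipher uses Base64 purely as a visible transformation. Base64 is not encryption; the point is only to show where encryption and decryption sit relative to the relay.

```swift
import Foundation

// Hypothetical stand-in for E3Kit: anything that can transform text both ways.
protocol MessageCipher {
    func encrypt(_ plaintext: String) -> String
    func decrypt(_ ciphertext: String) -> String?
}

// NOT real encryption: Base64 is only a visible placeholder transformation.
struct PlaceholderCipher: MessageCipher {
    func encrypt(_ plaintext: String) -> String {
        return Data(plaintext.utf8).base64EncodedString()
    }

    func decrypt(_ ciphertext: String) -> String? {
        guard let data = Data(base64Encoded: ciphertext) else { return nil }
        return String(data: data, encoding: .utf8)
    }
}

// Hypothetical stand-in for Stream: it stores and forwards whatever it is given.
final class Relay {
    private(set) var stored: [String] = []

    func send(_ text: String) {
        stored.append(text)
    }
}

let cipher = PlaceholderCipher()
let relay = Relay()

// Sender: transform the text on the device, then hand it to the relay.
relay.send(cipher.encrypt("hello"))

// The relay only ever holds the transformed text.
print(relay.stored)

// Receiver: reverse the transformation on the device before display.
let shown = relay.stored.compactMap { cipher.decrypt($0) }
print(shown)
```

In the real app, the relay never has the means to reverse the transformation, because the private key stays on the receiver's device.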
While this looks complicated, Stream and Virgil do most of the work for us. We'll use Stream's out-of-the-box UI components to render the chat UI and Virgil to do all of the cryptography and key management. We simply combine these services.
The code is split between the iOS frontend, contained in the ios folder, and the Express (Node.js) backend, found in the backend folder. See the README.md in each folder for installation and run instructions. If you'd like to follow along with running code, make sure you get both the backend and ios running before continuing.
Let's walk through and look at the critical code needed for each step.
Prerequisites
Basic knowledge of iOS (Swift) and Node.js is expected. This code is intended to run locally on your machine.
You will need an account with Stream and Virgil. Once you've created your accounts, you can place your credentials in backend/.env if you'd like to run the code. You can use backend/.env.example as a reference for the required credentials. Please see the README in the backend directory for more information.
Step 0: Set up the Backend
For our Swift app to securely interact with Stream and Virgil, the backend provides four endpoints:
- POST /v1/authenticate: This endpoint generates an auth token that allows the iOS application to communicate with the other endpoints. To keep things simple, this endpoint lets the client be any user: the frontend tells the backend who it wants to authenticate as. In your application, this should be replaced with real authentication appropriate for your app.
- POST /v1/stream-credentials: This returns the data required for the iOS app to establish a session with Stream. To return this info, we need to tell Stream this user exists and ask it to create a valid auth token:
// backend/src/controllers/v1/stream-credentials.js
const { StreamChat } = require('stream-chat');

exports.streamCredentials = async (req, res) => {
  const data = req.body;
  const apiKey = process.env.STREAM_API_KEY;
  const apiSecret = process.env.STREAM_API_SECRET;

  // Server-side client: created with the API secret so it can mint user tokens.
  const client = new StreamChat(apiKey, apiSecret);

  const user = Object.assign({}, data, {
    id: `${req.user.sender}`,
    role: 'admin',
    image: `https://robohash.org/${req.user.sender}`,
  });

  const token = client.createToken(user.id);
  // Tell Stream this user exists before handing the token to the frontend.
  await client.updateUsers([user]);

  res.status(200).json({ user, token, apiKey });
};
The response payload has this shape:
{
  "apiKey": "<string>",
  "token": "<string>",
  "user": {
    "id": "<string>",
    "role": "<string>",
    "image": "<string>"
  }
}
  - apiKey: The Stream account identifier for your Stream instance. It is needed to identify which account your frontend is trying to connect with.
  - token: A JWT that authorizes the frontend with Stream.
  - user: This object contains the data that the frontend needs to connect and render the user's view.
- POST /v1/virgil-credentials: This returns the authentication token used to connect the frontend to Virgil. We use the Virgil Crypto SDK to generate a valid auth token for us:
// backend/src/controllers/v1/virgil-credentials.js
const { JwtGenerator } = require('virgil-sdk');
const { initCrypto, VirgilCrypto, VirgilAccessTokenSigner } = require('virgil-crypto');

async function getJwtGenerator() {
  await initCrypto();

  const virgilCrypto = new VirgilCrypto();
  // initialize JWT generator with your App ID and App Key ID you got in
  // Virgil Dashboard.
  return new JwtGenerator({
    appId: process.env.VIRGIL_APP_ID,
    apiKeyId: process.env.VIRGIL_KEY_ID,
    apiKey: virgilCrypto.importPrivateKey(process.env.VIRGIL_PRIVATE_KEY),
    accessTokenSigner: new VirgilAccessTokenSigner(virgilCrypto)
  });
}

// Build the generator once and reuse it across requests.
const generatorPromise = getJwtGenerator();

exports.virgilCredentials = async (req, res) => {
  const generator = await generatorPromise;
  const virgilJwtToken = generator.generateToken(req.user.sender);

  res.json({ token: virgilJwtToken.toString() });
};
In this case, the frontend only needs the auth token.
- GET /v1/users: Endpoint for returning all users, which exists just to get a list of people to chat with. Please refer to the source if you're curious. Please note the backend uses in-memory storage, so if you restart, it will forget all of the users.
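On the iOS side, the stream-credentials payload shown above maps naturally onto a Decodable type. The following is a minimal sketch, not the tutorial's own code (which reads the response as an NSDictionary): the type name StreamCredentials, the helper decodeCredentials, and the sample values are all made up for illustration.

```swift
import Foundation

// Mirrors the stream-credentials payload shape. Extra fields on "user" are ignored.
struct StreamCredentials: Decodable {
    struct User: Decodable {
        let id: String
        let role: String
        let image: String
    }

    let apiKey: String
    let token: String
    let user: User
}

// Hypothetical helper: throws instead of crashing when a field is missing.
func decodeCredentials(from json: String) throws -> StreamCredentials {
    return try JSONDecoder().decode(StreamCredentials.self, from: Data(json.utf8))
}

// Sample values are invented; a real token is a JWT issued by the backend.
let sample = """
{
  "apiKey": "key123",
  "token": "header.payload.signature",
  "user": { "id": "alice", "role": "admin", "image": "https://robohash.org/alice" }
}
"""

do {
    let credentials = try decodeCredentials(from: sample)
    print(credentials.user.id, credentials.apiKey)
} catch {
    print("Could not decode credentials: \(error)")
}
```

Decoding into a typed struct surfaces a malformed response as a thrown error at one place, rather than as a failed cast wherever a field is read.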
Step 1: User Authenticates With Backend
The first step is to authenticate a user and get our Stream and Virgil credentials. To keep things simple, we have an insecure form that allows you to log in as anyone:
This is a simple form that takes any arbitrary name, effectively allowing us to log in as anyone (please use an appropriate authentication method for your application). First, let's add to Main.storyboard. We add a "Login View Controller" scene that's backed by a custom controller, LoginViewController (to be defined). This controller is embedded in a Navigation Controller. Your storyboard should look something like this:
The form is a simple Stack View with a username field and a submit button. Let's look at our custom LoginViewController:
// ios/EncryptedChat/LoginViewController.swift:3
class LoginViewController: UIViewController {
    @IBOutlet weak var usernameField: UITextField!

    @IBAction func login(_ sender: Any) {
        guard let userId = usernameField.text, !userId.isBlank else {
            usernameField.placeholder = " ⚠️ User id"
            return
        }

        Account.shared.login(userId) {
            DispatchQueue.main.async {
                self.performSegue(withIdentifier: "UsersSegue", sender: self)
            }
        }
    }
}
The usernameField is bound to the storyboard's Username Field, and the login method is bound to the Login button. When a user clicks login, we check if there's a username and if so, we log in via Account.shared.login. Account is a shared object that will store our credentials for future backend interactions. Once the user logs in, we initiate the UsersSegue, which boots our next scene. We'll see how this is done in a second, but first, let's see how the Account object logs in.
Here's how we define Account:
// ios/EncryptedChat/Account.swift:5
class Account {
    public static let shared = Account()

    let apiRoot = "http://localhost:8080"
    var authToken: String? = nil
    var userId: String? = nil

    public func login(_ userId: String, completion: @escaping () -> Void) {
        AF
            .request("\(apiRoot)/v1/authenticate",
                     method: .post,
                     parameters: ["user" : userId],
                     encoder: JSONParameterEncoder.default)
            .responseJSON { response in
                let body = response.value as! NSDictionary
                let authToken = body["authToken"]! as! String

                self.authToken = authToken
                self.userId = userId

                self.setupStream(completion)
            }
    }

    //...
}
First, we set up our shared object that will store our login state in the authToken and userId properties. Note that apiRoot is the base URL where our backend is running. Please follow the instructions in the backend folder to run it. Our login method uses Alamofire (AF) to make a POST request to our backend with the user to log in. Upon success, we store the authToken and userId, and then we call setupStream.
The method setupStream initializes our Stream Chat client. Here's the implementation:
// ios/EncryptedChat/Account.swift:39
private func setupStream(_ completion: @escaping () -> Void) {
    AF
        .request("\(apiRoot)/v1/stream-credentials",
                 method: .post,
                 headers: ["Authorization" : "Bearer \(authToken!)"])
        .responseJSON { response in
            let body = response.value as! NSDictionary
            let token = body["token"]! as! String
            let apiKey = body["apiKey"]! as! String

            Client.config = .init(apiKey: apiKey, logOptions: .info)
            Client.shared.set(
                user: User(id: self.userId!),
                token: token
            )

            self.setupVirgil(completion)
        }
}
We call the backend with our credentials from login. We get back a token, which is a Stream Chat token. This token allows our mobile application to communicate directly with Stream without going through our backend. We also get an apiKey, which identifies the Stream account we're using. We need this data to initialize our Stream Client instance and set the user. Last, we initialize Virgil via setupVirgil:
// ios/EncryptedChat/Account.swift:59
private func setupVirgil(_ completion: @escaping () -> Void) {
    AF
        .request("\(apiRoot)/v1/virgil-credentials",
                 method: .post,
                 headers: ["Authorization" : "Bearer \(authToken!)"])
        .responseJSON { response in
            let body = response.value as! NSDictionary
            let token = body["token"]! as! String

            VirgilClient.configure(identity: self.userId!, token: token)

            completion()
        }
}
This method requests our Virgil credentials and configures a VirgilClient class we defined, which wraps Virgil's E3Kit library. Once that's done, we call the completion to indicate success and allow the application to move on.
Let's see what the beginning of our VirgilClient implementation looks like:
// ios/EncryptedChat/VirgilClient.swift:4
class VirgilClient {
    public static let shared = VirgilClient()

    private var eThree: EThree? = nil
    // ...

    public static func configure(identity: String, token: String) {
        let tokenCallback: EThree.RenewJwtCallback = { completion in
            completion(token, nil)
        }

        let eThree = try! EThree(identity: identity, tokenCallback: tokenCallback)

        eThree.register { error in
            if let error = error {
                if error as? EThreeError == .userIsAlreadyRegistered {
                    print("Already registered")
                } else {
                    print("Failed registering: \(error.localizedDescription)")
                }
            }
        }

        shared.eThree = eThree
    }

    // ...
}
VirgilClient is a singleton that is set up via configure. We take the identity (which is our username) and token, and then we generate a tokenCallback. The EThree client uses this callback to get a new JWT. In our case, we've kept it simple by just returning the same token, but in a real application, you'd likely want to replace this with a REST call to the backend.
We use this token callback and identity to initialize eThree. We then use this instance to register the user.
Now we're set up to start chatting!
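As noted above, a real application would likely fetch a fresh token inside the callback rather than return the same one. Here is a rough sketch of that idea under a few assumptions: the two typealiases mirror how the callback is used in configure (completion(token, nil)) and stand in for E3Kit's own types, while TokenFetcher, makeTokenCallback, and FakeBackend are hypothetical names. In the app, the fetcher would wrap a request to /v1/virgil-credentials.

```swift
import Foundation

// Assumption: local stand-ins shaped like the callback used in configure above.
typealias JwtCompletion = (String?, Error?) -> Void
typealias RenewJwtCallback = (@escaping JwtCompletion) -> Void

// Hypothetical: anything that asynchronously produces a fresh token or an error.
typealias TokenFetcher = (@escaping (Result<String, Error>) -> Void) -> Void

// Adapts a fetcher into the (token, error) callback shape.
func makeTokenCallback(fetchToken: @escaping TokenFetcher) -> RenewJwtCallback {
    return { completion in
        fetchToken { result in
            switch result {
            case .success(let token):
                completion(token, nil)
            case .failure(let error):
                completion(nil, error)
            }
        }
    }
}

// Fake backend for illustration: hands out a new token on every request.
final class FakeBackend {
    private var issued = 0

    func fetchToken(_ done: @escaping (Result<String, Error>) -> Void) {
        issued += 1
        done(.success("token-\(issued)"))
    }
}

let backend = FakeBackend()
let callback = makeTokenCallback(fetchToken: backend.fetchToken)

callback { token, error in
    print(token ?? "no token", error == nil)
}
```

Keeping the fetcher injectable means the renewal logic can be exercised without a network connection, and a failed request reaches the caller as an error instead of a stale token.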
Step 2: Listing Users
Next, we'll create a view to list users. In the Main.storyboard we add a UITableView and back it by our custom UsersViewController:
And here are the first few lines of UsersViewController:
// ios/EncryptedChat/UsersViewController.swift:7
class UsersViewController: UITableViewController {
    var users = [String]()

    override func viewDidLoad() {
        super.viewDidLoad()
        loadUsers()
    }

    func loadUsers() {
        Account.shared.users { users in
            self.users = users
            self.tableView.reloadData()
        }
    }

    // ...
}
First, we fetch the users when the view loads. We do this via the Account instance configured during login. This action simply hits the /v1/users endpoint. Refer to the source if you're curious. We store the users in a users property and reload the table view.
Let's see how we configure the table cells:
// ios/EncryptedChat/UsersViewController.swift:7
override func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return users.count
}

override func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    let cell = tableView.dequeueReusableCell(withIdentifier: "DefaultCell", for: indexPath)
    cell.textLabel!.text = users[indexPath.row]
    return cell
}
To render the table correctly, we indicate the number of rows via users.count and set the text of the cell to the user at that index. With this list of users, we can set up a click action on a user's row to start a private 1-on-1 chat with them.
Step 3: Starting an Encrypted Chat Channel
First, add a new blank view to Main.storyboard backed by a new custom class EncryptedChatViewController (we'll see its implementation in a minute):
Add a segue from the user's table row to show the new controller:
With that setup, we can hook into the segue via the UITableViewController's prepare method:
// ios/EncryptedChat/UsersViewController.swift:32
override func prepare(for segue: UIStoryboardSegue, sender: Any?) {
    let userId = Account.shared.userId!
    let userToChatWith = users[tableView.indexPathForSelectedRow!.row]
    let channelId = [userId, userToChatWith].sorted().joined(separator: "-")
    let viewController = segue.destination as! EncryptedChatViewController

    let channelPresenter = ChannelPresenter(
        channel: Client.shared.channel(
            type: .messaging,
            id: channelId,
            members: [User(id: userId), User(id: userToChatWith)]
        )
    )

    viewController.user = userId
    viewController.otherUser = userToChatWith
    viewController.presenter = channelPresenter
}
Before transitioning to the new view, we need to set the view controller up. We generate a unique channel id using the user ids. We initialize a ChannelPresenter from the Stream library, set the type to messaging, and restrict the users. We grab the view controller from the segue and back it with the ChannelPresenter. This will tell the controller which channel to use. We also tell it what users are communicating.
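The channel id is built by sorting the two user ids before joining them, so both participants arrive at the same id no matter who opens the chat. A small standalone illustration follows; the helper name channelId(for:and:) is ours, not part of the project.

```swift
// Builds the same channel id regardless of which user starts the conversation.
func channelId(for userA: String, and userB: String) -> String {
    return [userA, userB].sorted().joined(separator: "-")
}

let fromAlice = channelId(for: "alice", and: "bob")
let fromBob = channelId(for: "bob", and: "alice")

// Both sides compute the same id, so they join the same channel.
print(fromAlice, fromBob, fromAlice == fromBob)
```

One caveat with this scheme: if user ids may themselves contain "-", two different pairs can produce the same id (for example "a-b" with "c", and "a" with "b-c"), so a production app would probably want a separator that cannot appear in an id.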
Let's see how the controller creates this view:
// ios/EncryptedChat/EncryptedChatViewController.swift:6
class EncryptedChatViewController: ChatViewController {
    var user: String?
    var otherUser: String?

    // ...
}
Luckily, Stream comes with excellent UI components out of the box. We'll simply inherit from ChatViewController to do the hard work of displaying a chat. The presenter we set up tells the Stream UI component how to render. All we need to do now is hook into the message lifecycle to encrypt and decrypt messages on the fly.
Step 4: Sending an Encrypted Message
Now we're ready to send our first encrypted message. Since we're using Stream's built-in UI, all we need to do is hook into the message sending cycle. We'll do this via the messagePreparationCallback on the ChannelPresenter:
// ios/EncryptedChat/EncryptedChatViewController.swift:10
override func viewDidLoad() {
    super.viewDidLoad()

    guard let presenter = presenter else {
        return
    }

    VirgilClient.shared.prepareUser(otherUser!)

    presenter.messagePreparationCallback = {
        var message = $0
        message.text = VirgilClient.shared.encrypt(message.text, for: self.otherUser!)
        return message
    }
}
Upon loading the view, we set up the callback, which is a hook provided by Stream, to make any modifications we'd like to the message before sending it over the wire. Since we want to encrypt the message fully, we grab it and modify the text with the VirgilClient object.
For this to work, we need to look up the public key of the other user. Since this action requires a call to the Virgil API, it's asynchronous. We don't want to do this during message preparation, so we do it ahead of time via VirgilClient.shared.prepareUser(otherUser!). Let's see how that method is implemented:
// ios/EncryptedChat/VirgilClient.swift:29
public func prepareUser(_ user: String) {
    eThree!.findUser(with: user) { [weak self] result, _ in
        self?.userCards[user] = result!
    }
}
This is relatively simple with Virgil's E3Kit. We find the user's Card, which stores all of the information we need to encrypt and decrypt messages. We'll use this Card in the encrypt method during message preparation:
// ios/EncryptedChat/VirgilClient.swift:35
public func encrypt(_ text: String, for user: String) -> String {
    return try! eThree!.authEncrypt(text: text, for: userCards[user]!)
}
Once again, this is made easy by Virgil. We simply pass the text and the correct user card to authEncrypt, and we're done! Our message is now ciphertext ready to go over the wire. Stream's library will take care of the rest.
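Note that prepareUser force-unwraps the lookup result and encrypt force-unwraps the cached card, so a failed lookup, or a message sent before the lookup finishes, would crash the app. One way to soften this is sketched below, assuming nothing about E3Kit beyond the callback shape seen above: CardCache and its lookup closure are hypothetical, and the cache is generic so the sketch runs without the SDK.

```swift
import Foundation

// Generic over the card type so this sketch does not depend on E3Kit.
final class CardCache<Card> {
    // Hypothetical lookup, e.g. a thin wrapper around the findUser call shown above.
    typealias Lookup = (String, @escaping (Card?, Error?) -> Void) -> Void

    private var cards: [String: Card] = [:]
    private let lookup: Lookup

    init(lookup: @escaping Lookup) {
        self.lookup = lookup
    }

    // Fetches and stores the card, then reports whether the user is ready.
    func prepare(_ user: String, completion: @escaping (Bool) -> Void) {
        if cards[user] != nil {
            completion(true)
            return
        }
        lookup(user) { [weak self] card, _ in
            guard let card = card else {
                completion(false)
                return
            }
            self?.cards[user] = card
            completion(true)
        }
    }

    // Returns nil instead of crashing when the card has not been fetched.
    func card(for user: String) -> Card? {
        return cards[user]
    }
}

// Fake directory for illustration: only "bob" has a card.
let cache = CardCache<String> { user, done in
    done(user == "bob" ? "card-for-bob" : nil, nil)
}

cache.prepare("bob") { ready in print("bob ready:", ready) }
cache.prepare("eve") { ready in print("eve ready:", ready) }
```

With something like this, the chat screen could keep the send button disabled until prepare reports success. The sketch does not synchronize access to the dictionary, so real code would need to confine it to one queue.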
Step 5: Decrypting a Message
Since we're using the ChatViewController to render the view, we only need to hook into the cell rendering. We need to decrypt the message text and pass it along. We override the messageCell method:
// ios/EncryptedChat/EncryptedChatViewController.swift:26
override func messageCell(at indexPath: IndexPath, message: Message, readUsers: [User]) -> UITableViewCell {
    var modifyMessage = message

    modifyMessage.text = message.user.id == user ?
        VirgilClient.shared.decryptMine(message.text) :
        VirgilClient.shared.decryptTheirs(message.text, from: otherUser!)

    return super.messageCell(at: indexPath, message: modifyMessage, readUsers: readUsers)
}
We make a copy of the message and set the message's text to the decrypted value. In this case, we need to know if it's our message or theirs. First, we'll look at how to decrypt ours:
// ios/EncryptedChat/VirgilClient.swift:39
public func decryptMine(_ text: String) -> String {
    return try! eThree!.authDecrypt(text: text)
}
Since the message was encrypted initially on the device, we have everything we need. We simply ask Virgil E3Kit to decrypt it via authDecrypt. Decrypting the other user's messages is a bit more work:
// ios/EncryptedChat/VirgilClient.swift:43
public func decryptTheirs(_ text: String, from user: String) -> String {
    return try! eThree!.authDecrypt(text: text, from: userCards[user]!)
}
In this case, we use the same method, but we pass the user card that we retrieved earlier, which verifies the message's authenticity. Now we can see our full chat:
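One caveat about the decryption helpers above: both use try!, so any message that fails to decrypt or verify (for example, text that was never encrypted) would crash the cell rendering. A possible way to degrade gracefully is sketched here; displayText, the placeholder string, and fakeDecrypt are all hypothetical and exist only to make the example runnable without E3Kit.

```swift
import Foundation

// Runs a throwing decrypt call and falls back to a placeholder on failure.
func displayText(for ciphertext: String,
                 placeholder: String = "[unable to decrypt]",
                 decrypt: (String) throws -> String) -> String {
    do {
        return try decrypt(ciphertext)
    } catch {
        return placeholder
    }
}

// Fake decryptor for illustration: it only accepts text with an "enc:" prefix.
struct NotCiphertext: Error {}

func fakeDecrypt(_ text: String) throws -> String {
    guard text.hasPrefix("enc:") else { throw NotCiphertext() }
    return String(text.dropFirst(4))
}

print(displayText(for: "enc:hello", decrypt: fakeDecrypt))
print(displayText(for: "plain old message", decrypt: fakeDecrypt))
```

In the app, the closure passed as decrypt would wrap the authDecrypt call shown above with try instead of try!, so a bad message shows the placeholder rather than taking down the chat view.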
And that's all! We now have an application that uses end-to-end encryption to protect a user's conversation. Happy coding!