Storage

By default, transcriptions are stored in an AWS S3 bucket managed by Stream, located in the same region as your application. Transcription files are retained for two weeks before being automatically deleted. If you need to keep transcriptions longer or prefer not to store this data with Stream, you can opt to use your own storage solution.

Use your own storage

Stream supports the following external storage providers:

  - Amazon S3
  - Google Cloud Storage
  - Azure Blob Storage

If you need support for a different storage provider, you can participate in the conversation here.

To use your own storage you need to:

  1. Configure a new storage for your Stream application
  2. (Optional) Check the storage configuration for correctness. Calling the check endpoint creates a test markdown file in the storage to verify the configuration, and returns an error if the file cannot be created. On success, a file named stream-<uuid>.md is uploaded to the storage. Each call to this endpoint creates a new file.
  3. Configure your call type(s) to use the new storage

Once the setup is complete, call recordings and transcription files will be automatically stored in your own storage.

// 1. create a new storage with all the required parameters
await serverSideClient.createExternalStorage({
  bucket: 'my-bucket',
  name: 'my-s3',
  storage_type: 's3',
  path: 'directory_name/',
  aws_s3: {
    s3_region: 'us-east-1',
    s3_api_key: 'my-access-key',
    s3_secret: 'my-secret',
  },
});

// 2. (Optional) Check storage configuration for correctness
// In case of any errors, this will throw a ResponseError.
await serverSideClient.checkExternalStorage({
  name: 'my-s3',
});

// 3. update the call type to use the new storage
await serverSideClient.updateCallType({
  name: 'my-call-type',
  external_storage: 'my-s3',
});

Multiple storage providers and default storage

You can configure multiple storage providers for your application. When starting a transcription or recording, you can specify which storage provider to use for that particular call. If none is specified, the default storage provider will be used.

When transcribing or recording a call, the storage provider is selected in this order:

  1. If specified at the call level, the storage provider chosen for that particular call will be used.
  2. If specified at the call type level, the storage provider designated for that call type will be used.
  3. If neither applies, Stream S3 storage will be used.

Note: All Stream applications have Stream S3 storage enabled by default, which you can refer to as "stream-s3" in the configuration.
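The selection order can be sketched as a small function. This is an illustration of the documented precedence only: `resolveStorage` and its parameter names are hypothetical and not part of the Stream SDK, and the actual resolution happens on Stream's side.

```javascript
// Hypothetical helper (not an SDK function): mirrors the documented order,
// call level first, then call type level, then Stream S3 storage.
const DEFAULT_STORAGE = 'stream-s3';

function resolveStorage(callLevelStorage, callTypeStorage) {
  if (callLevelStorage) return callLevelStorage; // 1. set for this call
  if (callTypeStorage) return callTypeStorage;   // 2. set on the call type
  return DEFAULT_STORAGE;                        // 3. fall back to Stream S3
}

console.log(resolveStorage('my-storage', 'my-s3'));
console.log(resolveStorage(undefined, 'my-s3'));
console.log(resolveStorage(undefined, undefined));
```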

// update the call type to use Stream S3 storage for recordings
await serverSideClient.updateCallType({
  name: 'my-call-type',
  external_storage: 'stream-s3',
});

// specify the "my-storage" storage when starting call transcription
await call.startTranscription({
  transcription_external_storage: 'my-storage',
});

Storage configuration

All storage providers have these 4 shared parameters:

| Name | Description | Required |
| --- | --- | --- |
| name | The name of the provider; this must be unique | yes |
| storage_type | The type of storage to use; allowed values are s3, gcs, and abs | yes |
| bucket | The name of the bucket on the service provider | yes |
| path | The path prefix to use for storing files | no |
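As a rough sketch, the shared parameters can be checked locally before sending a configuration to Stream. The function `validateSharedStorageParams` is a hypothetical helper, not an SDK call; it only covers the rules listed above and cannot verify that the name is unique within your application.

```javascript
// Hypothetical client-side check of the four shared parameters.
const ALLOWED_STORAGE_TYPES = ['s3', 'gcs', 'abs'];

function validateSharedStorageParams(config) {
  const errors = [];
  // name, storage_type and bucket are required, non-empty strings
  for (const key of ['name', 'storage_type', 'bucket']) {
    if (typeof config[key] !== 'string' || config[key].length === 0) {
      errors.push(`${key} is required`);
    }
  }
  // storage_type must be one of the documented values
  if (config.storage_type && !ALLOWED_STORAGE_TYPES.includes(config.storage_type)) {
    errors.push(`storage_type must be one of: ${ALLOWED_STORAGE_TYPES.join(', ')}`);
  }
  // path is optional, but must be a string when present
  if (config.path !== undefined && typeof config.path !== 'string') {
    errors.push('path must be a string when provided');
  }
  return errors;
}

console.log(validateSharedStorageParams({ name: 'my-s3', storage_type: 's3', bucket: 'my-bucket' }));
console.log(validateSharedStorageParams({ name: 'my-ftp', storage_type: 'ftp', bucket: 'my-bucket' }));
```

An empty array means the shared parameters look well-formed; provider-specific parameters still need to be supplied separately.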

Amazon S3

To use Amazon S3 as your storage provider, you have two authentication options: IAM role or API key.

If you do not specify the s3_api_key parameter, Stream will use IAM role authentication. In that case make sure to have the correct IAM role configured for your application.

| Name | Description | Required |
| --- | --- | --- |
| s3_region | The AWS region where the bucket is hosted | yes |
| s3_api_key | The AWS API key | no |
| s3_secret | The AWS API secret | no |
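The choice between the two authentication options can be illustrated with a small helper. `s3AuthMode` is hypothetical and not part of the SDK. It assumes, beyond what is stated above, that a key supplied without a secret is a configuration mistake.

```javascript
// Hypothetical helper: reports which authentication option an aws_s3
// block would select, based on whether s3_api_key is present.
function s3AuthMode(awsS3) {
  if (!awsS3 || !awsS3.s3_region) {
    throw new Error('s3_region is required');
  }
  if (!awsS3.s3_api_key) {
    return 'iam_role'; // no key given: IAM role authentication is used
  }
  if (!awsS3.s3_secret) {
    // Assumption: a key without its secret cannot authenticate.
    throw new Error('s3_secret is expected when s3_api_key is set');
  }
  return 'api_key';
}

console.log(s3AuthMode({ s3_region: 'us-east-1' }));
console.log(s3AuthMode({ s3_region: 'us-east-1', s3_api_key: 'my-access-key', s3_secret: 'my-secret' }));
```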

Example S3 policy

With this option, you omit the key and secret and instead set up a resource-based policy that grants Stream the s3:PutObject permission on your S3 bucket. Attach the following policy to your bucket (replace the value of Resource with the fully qualified ARN of your S3 bucket):

{
  "Version": "2012-10-17",
  "Id": "StreamExternalStoragePolicy",
  "Statement": [
    {
      "Sid": "ExampleStatement01",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::185583345998:root"
      },
      "Action": ["s3:PutObject"],
      "Resource": ["arn:aws:s3:::bucket_name/*", "arn:aws:s3:::bucket_name"]
    }
  ]
}
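If you manage several buckets, the policy above can be generated from the bucket name instead of being edited by hand. A minimal sketch follows. `buildStreamBucketPolicy` is a hypothetical helper, and the principal ARN is copied from the example policy above.

```javascript
// Hypothetical helper: builds the example bucket policy for a given bucket.
// The principal is the one shown in the example policy in this section.
const STREAM_PRINCIPAL = 'arn:aws:iam::185583345998:root';

function buildStreamBucketPolicy(bucketName) {
  return {
    Version: '2012-10-17',
    Id: 'StreamExternalStoragePolicy',
    Statement: [
      {
        Sid: 'ExampleStatement01',
        Effect: 'Allow',
        Principal: { AWS: STREAM_PRINCIPAL },
        Action: ['s3:PutObject'],
        // Both the objects in the bucket and the bucket itself, as in the example
        Resource: [`arn:aws:s3:::${bucketName}/*`, `arn:aws:s3:::${bucketName}`],
      },
    ],
  };
}

console.log(JSON.stringify(buildStreamBucketPolicy('my-bucket'), null, 2));
```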

Google Cloud Storage

To use Google Cloud Storage as your storage provider, you need to send your service account credentials exactly as they are stored in your JSON file. Stream only needs permission to write new files; no other permission is required.

Note: We recommend reading the credentials from the file to avoid copy-and-paste errors.

await serverSideClient.createExternalStorage({
  bucket: 'my-bucket',
  name: 'my-gcs',
  storage_type: 'gcs',
  path: 'directory_name/',
  gcs_credentials: 'content of the service account file',
});
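One way to follow the recommendation to read the credentials from the file is sketched below. `buildGcsStorageRequest` is a hypothetical helper, not an SDK function. It uses Node's built-in `fs` module and assumes the service account file is valid JSON, which it checks before the content is forwarded unchanged.

```javascript
const fs = require('node:fs');

// Hypothetical helper: reads the service account file and builds the
// request object for a 'gcs' external storage.
function buildGcsStorageRequest(credentialsPath, { name, bucket, path }) {
  const raw = fs.readFileSync(credentialsPath, 'utf8');
  JSON.parse(raw); // throws a SyntaxError early if the file is not valid JSON
  return {
    name,
    storage_type: 'gcs',
    bucket,
    path,
    gcs_credentials: raw, // file content passed through as-is
  };
}
```

The returned object can then be passed to `serverSideClient.createExternalStorage(...)` in place of the inline literal shown above.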

Example policy

{
  "bindings": [
    {
      "role": "roles/storage.objectCreator",
      "members": ["service_account_principal_identifier"]
    }
  ]
}

Azure Blob Storage

To use Azure Blob Storage as your storage provider, you need to create a container and a service principal with the following parameters:

| Name | Description | Required |
| --- | --- | --- |
| abs_account_name | The account name | yes |
| abs_account_key | The account key | yes |
| abs_client_secret | The client secret | yes |
| abs_tenant_id | The tenant ID | yes |

Stream only needs permission to write new files; no other permission is required.

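await serverSideClient.createExternalStorage({
  name: 'my-abs',
  storage_type: 'abs',
  bucket: 'my-bucket',
  path: 'directory_name/',
  // all four Azure parameters are listed as required in the table above
  abs_account_name: '...',
  abs_account_key: '...',
  abs_client_secret: '...',
  abs_tenant_id: '...',
});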
