Importing Data

Stream offers built-in tooling to help you migrate from your current chat provider while keeping the process smooth.

To import data into Stream, create an import file and upload it either through the dashboard or by using the CLI.

You can refer to the File Format section below for details about the expected format.

To get started, you can download a sample file to familiarize yourself with the expected structure, and then import it using the CLI.

Import with the CLI

1. Install the CLI

The easiest way to install the Stream CLI is via Homebrew:

$ brew tap GetStream/stream-cli https://github.com/GetStream/stream-cli
$ brew install stream-cli

For other installation methods, see the CLI Introduction.

2. Configure Authentication

Before using the CLI, you need to authenticate with your Stream credentials:

$ stream-cli config new

This will prompt you for your API key and secret, which can be found on the Stream Dashboard.

3. Upload a file

Once your file has been validated (see the Validation section below), upload it to start the import:

$ stream-cli chat upload-import my-data.jsonl
{
  "created_at": "2022-05-16T09:02:37.991181Z",
  "path": "s3://stream-import/1171432/7e7fbaf4-e266-4877-96da-fbacf650d0a1/my-data.jsonl",
  "mode": "upsert",
  "id": "79502357-3f4b-486e-9a78-400a184a1088",
  "state": "uploaded",
  "size": 1230
}

You can also specify the import mode using --mode insert to only insert new items and skip existing ones.

4. Check Import Status

Monitor the status of your import using the import ID returned from the upload:

$ stream-cli chat get-import 79502357-3f4b-486e-9a78-400a184a1088

Use the --watch flag to continuously poll the import status until completion:

$ stream-cli chat get-import 79502357-3f4b-486e-9a78-400a184a1088 --watch

To list all imports for your application:

$ stream-cli chat list-imports

For more detailed descriptions of all CLI import commands, please refer to the Stream CLI import docs.
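
If you script your imports, the commands above can be driven from Python. The following is a minimal sketch, assuming stream-cli is installed, configured, and on your PATH, and that upload-import prints the JSON response shown in step 3. The helper names (upload_command, import_id_from_output, run_upload) are hypothetical and not part of the CLI.

```python
import json
import subprocess


def upload_command(path, mode=None):
    """Build the argument list for `stream-cli chat upload-import`."""
    argv = ["stream-cli", "chat", "upload-import", path]
    if mode is not None:
        # --mode is the flag described in step 3 (for example "insert").
        argv += ["--mode", mode]
    return argv


def import_id_from_output(output):
    """Extract the import ID from the JSON printed by upload-import."""
    return json.loads(output)["id"]


def run_upload(path, mode=None):
    """Run the upload and return the import ID (needs a configured stream-cli)."""
    result = subprocess.run(
        upload_command(path, mode), capture_output=True, text=True, check=True
    )
    return import_id_from_output(result.stdout)
```

The returned ID can then be passed to `stream-cli chat get-import` to check the import status.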

Upsert vs Insert Mode

Upsert

The Upsert mode imports all the data in the file, including data that already exists in our system.

  • If an item exists, the fields provided will be overwritten.
  • Custom data will be replaced.

Since some omitted fields may be overwritten, it is safest to include all the data you want to persist for each item.

Insert

The Insert mode will skip import items if they already exist.

  • It will check for existence of an item by its id. If it exists it will be skipped, even if the fields provided differ from what exists in the database.
  • If it does not exist, the whole object will be inserted.
  • This mode is only available on the Stream CLI.
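
The difference between the two modes can be illustrated with a toy in-memory model. This is only a sketch of the behavior described above, not how Stream implements it: the apply_import function and the dictionary store are hypothetical, it only handles items that carry an id, and it models upsert as a full replacement of the stored object, which is the worst case the advice about omitted fields guards against.

```python
def apply_import(store, items, mode="upsert"):
    """Apply import items to a dict keyed by (item type, id).

    Returns the keys that were skipped because they already existed.
    """
    skipped = []
    for item in items:
        key = (item["type"], item["data"]["id"])
        if key in store and mode == "insert":
            # Insert mode: an existing item is left untouched,
            # even if the imported fields differ.
            skipped.append(key)
            continue
        # Upsert mode (or a new item): the imported data wins.
        # Assumption: any field omitted from the import is lost.
        store[key] = dict(item["data"])
    return skipped
```

In this model, upserting a user with only id and name drops a previously stored role, which is why including every field you want to keep is the safest approach.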

Import from the Dashboard

While we work on improving the import experience, importing from the dashboard is temporarily disabled.
Please use the CLI in the meantime.

File Format

As you prepare your import file, keep the following requirements in mind; otherwise the import may fail during validation.

  • File Structure: The file must use the JSON Lines format, where each line is a valid JSON object. Each object represents one item of type user, device, channel, member, message, reaction, or future_channel_ban that will be imported into your application.

  • Item order: Items in the file should be in a specific order, otherwise the validation step will fail. This order makes reference validation more efficient.
    For a Chat import, the items in the file should be defined in the following order:

    • users
    • devices
    • future_channel_ban
    • channels
    • members
    • messages
    • reactions
  • Object References: Every object you reference in the import should either appear as its own object in the file, or the record should already exist in your Stream application.
    For example, if you import a message that references a user_id, your import file must either include a separate user item with the same ID, or that user must already exist in your application.
    Unresolved references are a common cause of failed imports, especially with large files, so check them carefully before uploading.

  • Read State: By default, all imported messages will be marked as read. If you provide the last_read timestamp on a member item, that member's unread count will be based on the number of messages created after the last_read timestamp.

  • Distinct Channels: Distinct channels are channels that are created by providing a list of member IDs rather than a channel ID. Under the hood, our API generates a unique channel ID from the member IDs. To import a distinct channel, include the member_ids field and omit the id field entirely.

  • Timestamps: All timestamps must use the same format as the API (RFC 3339), for example: 1985-04-12T23:20:50.52Z.
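
The ordering and timestamp requirements above can be handled when the file is generated. The sketch below is one possible approach, assuming your items are already dictionaries with type and data keys; ITEM_ORDER, rfc3339, and to_jsonl are hypothetical names chosen for this example.

```python
import json
from datetime import datetime, timezone

# Item types in the order required for a Chat import.
ITEM_ORDER = ["user", "device", "future_channel_ban", "channel",
              "member", "message", "reaction"]


def rfc3339(dt):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_jsonl(items):
    """Serialize items as JSON Lines, sorted into the required type order.

    sorted() is stable, so items of the same type keep their relative order.
    An unknown item type raises ValueError.
    """
    ordered = sorted(items, key=lambda item: ITEM_ORDER.index(item["type"]))
    return "".join(json.dumps(item, separators=(",", ":")) + "\n" for item in ordered)
```

Writing the result of to_jsonl to a file such as my-data.jsonl gives one JSON object per line, with users first and reactions last.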

Object format

As mentioned earlier, each line in the file should be a valid JSON object with the following format:

| Name | Type | Description |
| --- | --- | --- |
| type | string | the item type for this object, allowed values: user, device, channel, member, message, reaction, future_channel_ban |
| data | object | the data for this object, see below for the format of each type |

Here's an example of a valid file:

{"type":"user","data":{"id":"user_001","name":"Jesse","image":"http://getstream.com","created_at":"2017-01-01T01:00:00Z","role":"moderator","invisible":true,"description":"Taj Mahal guitar player at some point"}}
{"type":"device","data":{"id":"device_001","user_id":"user_001","push_provider_type":"firebase","push_provider_name":"firebase"}}
{"type":"channel","data":{"id":"channel_001","type":"messaging","created_by":"user_001","name":"Rock'n Roll Circus"}}
{"type":"member","data":{"channel_type":"messaging","channel_id":"channel_001","user_id":"user_001","is_moderator":true,"created_at":"2017-02-01T02:00:00Z"}}
{"type":"message","data":{"id":"message_001","channel_type":"messaging","channel_id":"channel_001","user":"user_001","text":"Learn how to build a chat app with Stream","type":"regular","created_at":"2017-02-01T02:00:00Z","attachments":[{"type":"video","asset_url":"https://www.youtube.com/watch?v=o-am4BY-dhs","image_url":"https://i.ytimg.com/vi/o-am4BY-dhs/mqdefault.jpg","thumb_url":"https://i.ytimg.com/vi/o-am4BY-dhs/mqdefault.jpg"}]}}
{"type":"reaction","data":{"message_id":"message_001","type":"love","user_id":"user_001","created_at":"2019-03-02T15:00:00Z"}}
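
As a rough illustration of the object format, the following sketch parses a single line and checks its type and data envelope. It is a local sanity check under the assumptions stated in this section, not a replacement for the import validation; parse_line and ALLOWED_TYPES are hypothetical names.

```python
import json

# Item types listed in the File Format section.
ALLOWED_TYPES = ("user", "device", "channel", "member",
                 "message", "reaction", "future_channel_ban")


def parse_line(line):
    """Parse one line of the import file and return (item type, data).

    Raises ValueError for malformed JSON or a wrong envelope
    (json.JSONDecodeError is a subclass of ValueError).
    """
    obj = json.loads(line)
    if not isinstance(obj, dict):
        raise ValueError("each line must be a JSON object")
    item_type, data = obj.get("type"), obj.get("data")
    if item_type not in ALLOWED_TYPES:
        raise ValueError(f"invalid item type {item_type!r}")
    if not isinstance(data, dict):
        raise ValueError("data must be a JSON object")
    return item_type, data
```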

Data format

All time fields should be in RFC 3339 format.

Note that you can add custom fields to users, channels, members, messages (including attachments) and reactions. The limit is 5KB of custom field data per object.
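
To get a rough idea of how much custom data an item carries, you can measure the fields that are not part of the documented schema. The sketch below is only an estimate: it assumes the size is counted as the UTF-8 length of the compact JSON encoding, which this page does not state, and custom_data_size and known_fields are hypothetical names.

```python
import json


def custom_data_size(data, known_fields):
    """Estimate the size in bytes of the custom fields of an item.

    known_fields: the field names documented for this item type.
    Assumption: size is the UTF-8 length of the compact JSON encoding.
    """
    custom = {key: value for key, value in data.items() if key not in known_fields}
    encoded = json.dumps(custom, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))
```

Comparing the result against the documented limit before uploading can help catch oversized items early.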

User Type

The user type fields are shown below:

| name | type | description |
| --- | --- | --- |
| blocked_user_ids | array | list of blocked users (only user IDs) |
| channel_mutes | array | list of muted channels (only channel CIDs) |
| created_at | string | creation time (default to import time) |
| deactivated_at | string | deactivation time |
| deleted_at | string | deletion time |
| id | string | unique user ID (required) |
| invisible | boolean | visibility state (default to false) |
| language | string | language |
| privacy_settings | object | control user's privacy settings: delivery receipts, typing indicators and read receipts |
| push_preferences | object | push preferences |
| role | string | the user's role (default to user) |
| teams | array | list of teams the user is part of |
| teams_role | object | mapping of teams to user roles (see more) |
| user_mutes | array | list of muted users (only user IDs) |
| * | string/array/object | add as many custom fields as needed (up to 5 KiB) |

{"type":"user","data":{"id":"user_001","name":"Jesse","image":"http://getstream.com","created_at":"2017-01-01T01:00:00Z","role":"moderator","invisible":true,"teams":["admins"],"teams_role":{"admins":"team_moderator"},"description":"Taj Mahal guitar player at some point"}}

Device Type

Importing devices is the equivalent of registering devices with Stream.

This is useful when migrating from another chat provider to Stream because:

  • Users already have devices registered for push notifications
  • You want to preserve these registrations so users continue receiving notifications immediately after migration
  • Without importing devices, users would miss notifications until they open the app again

The device type fields are shown below:

| name | type | description |
| --- | --- | --- |
| created_at | string | creation time (default to import time) |
| id | string | unique device id |
| push_provider_type | string | must be one of the following: firebase, apn, huawei or xiaomi |
| push_provider_name | string | name that matches the Push Configuration on your app |
| user_id | string | user ID |

{"type":"device","data":{"id":"device_001","user_id":"user_001","created_at": "2019-01-11T02:00:00Z","push_provider_type":"firebase","push_provider_name":"production-firebase-config"}}

Channel Type

The channel type fields are shown below:

| name | type | description |
| --- | --- | --- |
| banned_users | array | list of banned users (only user IDs) |
| created_at | string | creation time (default to import time) |
| created_by | string | user who created the channel (user ID) |
| disabled | boolean | disabled status (default to false) |
| frozen | boolean | frozen status (default to false) |
| id | string | unique channel ID (required only if member_ids is not provided) |
| member_ids | array | user IDs used for distinct channels (required only if id is not provided) |
| team | string | channel team |
| truncated_at | string | truncation time |
| type | string | channel type (required) |
| * | string/list/object | add as many custom fields as needed (up to 5 KiB) |

// with channel ID
{"type": "channel","data": {"id": "channel_001","type": "livestream","created_by": "user_001","name": "Rock'n Roll Circus"}}
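
For a distinct channel, the item carries member_ids and no id, as described in the File Format section. A minimal sketch of building such an item follows; distinct_channel_item is a hypothetical helper, and the user IDs in the usage note are placeholders.

```python
import json


def distinct_channel_item(channel_type, member_ids, created_by):
    """Build a channel item for a distinct channel: member_ids set, id omitted."""
    return {
        "type": "channel",
        "data": {
            "type": channel_type,
            "member_ids": list(member_ids),
            "created_by": created_by,
        },
    }


def to_line(item):
    """Serialize one item as a single JSON Lines entry."""
    return json.dumps(item, separators=(",", ":"))
```

For example, to_line(distinct_channel_item("messaging", ["user_001", "user_002"], "user_001")) produces one line that can be placed in the channels part of the file.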

Member Type

Channel members store the mapping between users and channels. The fields are shown below:

| name | type | description |
| --- | --- | --- |
| archived_at | string | time when the channel was archived |
| channel_id | string | channel ID (required only if channel_member_ids is not provided) |
| channel_role | string | member role (default to channel_member) |
| channel_type | string | channel type |
| created_at | string | creation time (default to import time) |
| channel_member_ids | array | user IDs for distinct channels (required only if channel_id is not provided) |
| hide_channel | boolean | hidden status (default to false) |
| hide_messages_before | string | messages will be hidden before this time |
| invited | boolean | whether the user was invited (default to false) |
| invited_accepted_at | string | time when the user accepted the invite |
| invited_rejected_at | string | time when the user rejected the invite |
| last_read | string | last time the member read the channel |
| user_id | string | user ID |
| * | string/list/object | add as many custom fields as needed (up to 5 KiB) |

If your app uses multi-tenancy, the referenced channel and user items must have a matching team.

{"type":"member","data":{"channel_id":"channel_001","channel_type":"livestream","user_id":"user_001","channel_role":"channel_member","created_at":"2017-02-01T02:00:00Z"}}
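
The effect of last_read on the unread count, as described under Read State, can be estimated before importing. The sketch below simply counts messages created strictly after last_read; it is an approximation that ignores any additional server-side rules, and parse_ts and expected_unread are hypothetical names. It assumes UTC timestamps ending in Z without fractional seconds.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an RFC 3339 UTC timestamp such as 2017-02-01T02:00:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def expected_unread(last_read, message_created_ats):
    """Count the messages created after the member's last_read timestamp."""
    cutoff = parse_ts(last_read)
    return sum(1 for created_at in message_created_ats if parse_ts(created_at) > cutoff)
```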

Message Type

The message type fields are shown below:

| name | type | description |
| --- | --- | --- |
| attachments | array | message attachments, see the attachment section below |
| channel_id | string | channel ID (required only if channel_member_ids is not provided) |
| channel_member_ids | list of strings | user IDs for distinct channels (required only if channel_id is not provided) |
| channel_type | string | channel type |
| created_at | string | creation time (default to import time) |
| deleted_at | string | deletion time |
| html | string | safe HTML generated from the text |
| id | string | unique message ID |
| mentioned_users_ids | array | mentioned user IDs |
| parent_id | string | parent message ID (type should be "reply") |
| pin_expires | string | time when pin expires (requires pinned_at and pinned_by_id) |
| pinned_at | string | time when message was pinned (requires pin_expires and pinned_by_id) |
| pinned_by_id | string | pinned_by user ID (requires pinned_at and pin_expires) |
| quoted_message_id | string | quoted message ID |
| restricted_visibility | array | user IDs that can see this message (see documentation for more information) |
| show_in_channel | boolean | whether the reply should also be shown in the channel (default to false) |
| text | string | message text |
| type | string | message type (available types: regular, reply, deleted or system) |
| user | string | user ID who posted the message |
| * | string/list/object | add as many custom fields as needed (up to 5 KiB) |

{"type":"message","data":{"id":"message_001","channel_type":"livestream","channel_id":"channel_001","user":"user_001","text":"Such a great song, check out my solo at 2:25","type":"regular","created_at":"2017-02-01T02:00:00Z"}}
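
Some message fields depend on each other: the three pin fields require one another, and a message with parent_id should have type "reply". A minimal sketch of checking these two rules locally is shown below; message_problems is a hypothetical helper and covers only the dependencies listed in the table above.

```python
PIN_FIELDS = ("pinned_at", "pinned_by_id", "pin_expires")


def message_problems(data):
    """Return a list of problems found in the data of a message item."""
    problems = []
    present = [field for field in PIN_FIELDS if field in data]
    if present and len(present) != len(PIN_FIELDS):
        # The pin fields must be provided together or not at all.
        missing = [field for field in PIN_FIELDS if field not in data]
        problems.append("pin fields missing: " + ", ".join(missing))
    if "parent_id" in data and data.get("type") != "reply":
        problems.append('parent_id is set but type is not "reply"')
    return problems
```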

Message Attachments

Attachments are a great way to extend Stream's functionality. If you want to have a custom product attachment, location attachment, checkout, etc., attachments are the way to go.
The fields below are automatically picked up and shown by our component libraries.

Note that all attachment URLs must be publicly accessible, otherwise the import will fail.

| name | type | description |
| --- | --- | --- |
| asset_url | string | URL to the audio, video, or image resource |
| image_url | string | URL to the attached image |
| migrate_resources | boolean | if true, attachment will be migrated to our CDN (default to false) |
| thumb_url | string | URL to the attachment thumbnail (recommended for images and videos) |
| type | string | attachment type (built-in types: audio, video, image and text) |
| * | string/list/object | add as many custom fields as needed (up to 5 KiB) |

{"type":"message","data":{...,"attachments":[{"type":"image","image_url":"https://my.domain.com/image.jpg","thumb_url":"https://my.domain.com/image-thumb.jpg"},{"type":"video","asset_url":"https://my.domain.com/video.mp4","thumb_url":"https://my.domain.com/video-thumb.jpg"}]}}

For attachment migration, only image_url, thumb_url and asset_url fields will be migrated to our CDN and the original URL will be replaced with the new one. The files should not be empty. The import will fail if resource migration fails. In the error you can see the URL and message ID for the failed migration.
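
Because a failed resource migration fails the whole import, it can help to list the URLs that would be migrated and check them yourself first. The sketch below collects those URLs from a message, assuming migrate_resources is set per attachment as in the table above; urls_to_migrate is a hypothetical helper and does not contact any server.

```python
# Only these attachment fields are migrated to the CDN.
MIGRATED_URL_FIELDS = ("image_url", "thumb_url", "asset_url")


def urls_to_migrate(message_data):
    """Return the attachment URLs of a message that are flagged for migration."""
    urls = []
    for attachment in message_data.get("attachments", []):
        if attachment.get("migrate_resources"):
            urls.extend(
                attachment[field] for field in MIGRATED_URL_FIELDS if field in attachment
            )
    return urls
```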

Reaction Type

The reaction type fields are shown below:

| name | type | description |
| --- | --- | --- |
| created_at | string | creation time |
| message_id | string | message ID |
| type | string | reaction type |
| user_id | string | user ID |
| * | string/list/object | add as many custom fields as needed (up to 5 KiB) |

{"type":"reaction","data":{"message_id":"message_001","type":"love","user_id":"user_001","created_at":"2019-03-02T15:00:00Z"}}

Future Channel Ban Type

The future_channel_ban type fields are shown below:

| name | type | description |
| --- | --- | --- |
| created_at | string | creation time |
| created_by | string | ID of the user who initiated this future channel ban |
| target_id | string | ID of the user who will be banned from all future channels |

{"type":"future_channel_ban","data":{"created_by":"user_001","target_id":"user_002","created_at":"2019-03-02T15:00:00Z"}}

Validation

We use JSON Schema to define and validate the structure of our data.
The Chat schema files are available here.

These schema files cover approximately 99% of our validation rules. Some validations depend on your specific configuration, such as custom permission policies or custom channel type configurations.

To validate that your data is in the correct format, you can either:

  • validate the data on the fly (while the data is generated), or
  • validate the data once the file is generated

There are many JSON Schema validators available for different programming languages that you can use to validate your data.
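
Two of the rules from the File Format section, item order and user references, are not easy to express in a schema but can be checked with a short script once the file is generated. The following is a sketch using only the standard library. The preflight function and the USER_REF_FIELDS mapping are assumptions made for this example, based on the user ID fields listed in the tables above, and the check covers user references only.

```python
import json

ITEM_ORDER = ["user", "device", "future_channel_ban", "channel",
              "member", "message", "reaction"]

# Fields that hold a single user ID, per item type (taken from the tables above).
USER_REF_FIELDS = {
    "device": ["user_id"],
    "future_channel_ban": ["created_by", "target_id"],
    "channel": ["created_by"],
    "member": ["user_id"],
    "message": ["user"],
    "reaction": ["user_id"],
}


def preflight(lines, existing_user_ids=()):
    """Check item order and user references; return a list of error strings.

    existing_user_ids: IDs of users that already exist in the application.
    """
    errors = []
    known_users = set(existing_user_ids)
    highest_rank = 0
    for number, line in enumerate(lines, start=1):
        item = json.loads(line)
        kind, data = item["type"], item["data"]
        rank = ITEM_ORDER.index(kind)
        if rank < highest_rank:
            errors.append(f"line {number}: {kind} appears after a later item type")
        highest_rank = max(highest_rank, rank)
        if kind == "user":
            known_users.add(data["id"])
        for field in USER_REF_FIELDS.get(kind, []):
            if field in data and data[field] not in known_users:
                errors.append(f"line {number}: {field} {data[field]!r} is not a known user")
    return errors
```

Because users come first in the required order, a single pass over the file is enough to resolve user references.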

Error Messages

When problems occur during analysis, they will show up in the dashboard. A list of errors will be shown in JSON format. Where applicable, the offending item will be included, for example:

{
  "errors": [
    {
      "error": "Validation error: max channelID length exceeded (64)",
      "item_type": "channel",
      "item": {
        "id": "waytoolongwaytoolongwaytoolongwaytoolongwaytoolongwaytoolongwaytoolong",
        "type": "messaging",
        "created_by": "userA-7D3CA510-CB3C-479E-B5FA-69FC2D48410F",
        "created_at": "0001-01-01T00:00:00Z",
        "updated_at": "0001-01-01T00:00:00Z"
      }
    }
  ],
  "stats": {
    "total": {
      "messages": 0,
      "members": 0,
      "reactions": 0,
      "channels": 1,
      "users": 0
    }
  }
}

| Error | Description |
| --- | --- |
| Validation error: max "field" length exceeded (field-length) | Maximum length of field exceeded |
| Validation error: either channel.id or channel.member_ids should be provided, but not both | Either define channel as a regular channel or a distinct channel, but not both |
| Validation error: channel.id or channel.member_ids required, but not both | At least one of channel.id or channel.member_ids must be provided |
| Validation error: "field" required | Missing required field |
| Validation error: "field" is a reserved field | Field provided is reserved |
| Validation error: duplicated item "id" | Item and id combination is duplicated |
| Validation error: created_by user id doesn't exist (channel "messaging:abc"). please include all users as separate user entries | All users referenced by all objects, for example in channel.created_by, should be included in the import file |
| Validation error: 'value' is not a valid field | The value provided for a particular field is not valid. For example, a channel.id contains invalid characters |
| Validation error: user id with teams X cannot be a member of channel Y with team Z | The member item references a user and channel that do not have a matching team |
| Parse error: invalid item type "foobar" | An item was included with an invalid item type; only user, device, channel, member, message, reaction and future_channel_ban are allowed |
| Parse error: invalid character ',' looking for beginning of value | The import contains malformed JSON |

This is not an exhaustive list of possible errors, but these are the most common ones.
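
When an import reports many errors, grouping them by item type can show where to start. The sketch below assumes the error report has the JSON shape shown in the example above (an errors list whose entries have error and item_type keys); summarize_errors is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_errors(report_text):
    """Count the reported errors per item type."""
    report = json.loads(report_text)
    counts = Counter(
        entry.get("item_type", "unknown") for entry in report.get("errors", [])
    )
    return dict(counts)
```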