
Design instant messaging system

These are notes on designing an instant messaging system like Facebook Messenger, Tencent WeChat, WhatsApp, Slack, or Alibaba Dingding.

User stories

Out of scope for future research

Assumptions

Data model

Client

// POST /api/v1/rtm.send
{
    "sourceID": "xxxxxx",  // message producer
    "targetID": "xxxxxx",  // clientID or groupID of the message consumer
    "text": "hello"
}

{
    "sourceID": "xxxxxx",  // message producer
    "targetID": "xxxxxx",  // clientID or groupID of the message consumer
    "encodedMedia": "4jsxied=="  // binary encoded media data
}

// GET /api/v1/rtm.receive
{
    "sourceID": "xxxxxx",  // message consumer
    "targetID": "xxxxxx",  // message producer
    "lastPos": "jD9ace6==" // hash marking the last position read from the message queue
}

Message Server

    type Message struct {
        // sequenceID has the form channelID:timestamp to make it globally unique.
        // For 1:1 messaging, channelID can be derived from sourceID and targetID;
        // for group messaging, channelID can be the targetID.
        sequenceID string
        text       string // text content of the message
        mediaURL   string // URL of the media content
    }

Message processing backend

    type MessageQueue struct {
        messages []Message
    }

    // In-memory buffer that holds incoming messages
    // until a worker processes them
    type MessageBuffer struct {
        MessageQueue
    }

    // A collection of MessageQueue
    type MessageQueues struct {
        messageQueues []MessageQueue
    }
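
A queue like the one above needs at least an append and a positional read. The sketch below assumes an integer offset as the read position, whereas the client API shows `lastPos` as an opaque hash. The `Append` and `ReadAfter` methods are hypothetical.

```go
package main

import "fmt"

type Message struct {
	sequenceID string
	text       string
}

type MessageQueue struct {
	messages []Message
}

// Append adds a message to the tail and returns its position.
func (q *MessageQueue) Append(m Message) int {
	q.messages = append(q.messages, m)
	return len(q.messages) - 1
}

// ReadAfter returns a copy of every message after lastPos together
// with the new last position. Pass -1 when nothing has been read yet.
func (q *MessageQueue) ReadAfter(lastPos int) ([]Message, int) {
	start := lastPos + 1
	if start < 0 {
		start = 0
	}
	if start >= len(q.messages) {
		return nil, lastPos // nothing new
	}
	out := make([]Message, len(q.messages)-start)
	copy(out, q.messages[start:])
	return out, len(q.messages) - 1
}

func main() {
	q := &MessageQueue{}
	q.Append(Message{sequenceID: "alice-bob:1", text: "hello"})
	q.Append(Message{sequenceID: "alice-bob:2", text: "are you there?"})

	msgs, pos := q.ReadAfter(-1)
	fmt.Println(len(msgs), pos) // 2 1
}
```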

MessageQueue:

[Figure: message queue]

Architecture

[Figure: architecture]

Conventional architecture

  1. If the target is online, messages are synced directly without being stored in the DB
  2. If the target is offline, messages are stored in the offline DB
  3. When the target comes back online, it reads the messages from the offline DB
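
The three steps above can be sketched with in-memory maps. Everything here is an illustrative assumption: the `ConventionalServer` type, the `delivered` map standing in for a live client connection, and the choice to delete messages from the offline DB once they are read.

```go
package main

import "fmt"

type Message struct {
	sourceID, targetID, text string
}

// ConventionalServer is a hypothetical in-memory model of the flow.
type ConventionalServer struct {
	online    map[string]bool
	delivered map[string][]Message // stands in for a live connection to each client
	offlineDB map[string][]Message // messages waiting for offline targets
}

func NewConventionalServer() *ConventionalServer {
	return &ConventionalServer{
		online:    make(map[string]bool),
		delivered: make(map[string][]Message),
		offlineDB: make(map[string][]Message),
	}
}

// Send syncs directly to an online target (step 1), otherwise it
// stores the message in the offline DB (step 2).
func (s *ConventionalServer) Send(m Message) {
	if s.online[m.targetID] {
		s.delivered[m.targetID] = append(s.delivered[m.targetID], m)
		return
	}
	s.offlineDB[m.targetID] = append(s.offlineDB[m.targetID], m)
}

// Connect marks the client online and hands over whatever was
// waiting in the offline DB (step 3). Deleting after the read is an assumption.
func (s *ConventionalServer) Connect(clientID string) []Message {
	s.online[clientID] = true
	pending := s.offlineDB[clientID]
	delete(s.offlineDB, clientID)
	s.delivered[clientID] = append(s.delivered[clientID], pending...)
	return pending
}

func main() {
	s := NewConventionalServer()
	s.Send(Message{sourceID: "alice", targetID: "bob", text: "hi"}) // bob is offline
	fmt.Println(len(s.Connect("bob")))                              // 1

	s.Send(Message{sourceID: "alice", targetID: "bob", text: "welcome back"}) // bob is online
	fmt.Println(len(s.delivered["bob"]))                                      // 2
}
```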

Analysis:

Modern architecture

  1. Messages are first stored in MessagePersistentStore
  2. Once a message is stored successfully, it is pushed to MessageSyncStore
  3. A notification is sent to the target to indicate that there are new messages
  4. The target pulls the messages from MessageSyncStore
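
The four steps above can be sketched as a persist-then-sync pipeline. The in-memory stores, the notification counter, and the decision to clear the sync store after a pull are all illustrative assumptions. Only the names MessagePersistentStore and MessageSyncStore come from the notes.

```go
package main

import (
	"errors"
	"fmt"
)

type Message struct {
	sequenceID, targetID, text string
}

// ModernServer is a hypothetical in-memory model of the flow.
type ModernServer struct {
	persistentStore []Message            // stands in for MessagePersistentStore
	syncStore       map[string][]Message // stands in for MessageSyncStore, keyed by target
	notifications   map[string]int       // pending "new messages" notifications per target
}

func NewModernServer() *ModernServer {
	return &ModernServer{
		syncStore:     make(map[string][]Message),
		notifications: make(map[string]int),
	}
}

// Send persists first (step 1), then pushes to the sync store (step 2)
// and notifies the target (step 3). Nothing is synced if persisting fails.
func (s *ModernServer) Send(m Message) error {
	if m.sequenceID == "" {
		return errors.New("cannot persist a message without a sequenceID")
	}
	s.persistentStore = append(s.persistentStore, m)
	s.syncStore[m.targetID] = append(s.syncStore[m.targetID], m)
	s.notifications[m.targetID]++
	return nil
}

// Pull is what the target calls after a notification (step 4).
// Clearing the sync store afterwards is an assumption; the
// persistent store keeps the full history.
func (s *ModernServer) Pull(targetID string) []Message {
	msgs := s.syncStore[targetID]
	delete(s.syncStore, targetID)
	s.notifications[targetID] = 0
	return msgs
}

func main() {
	s := NewModernServer()
	if err := s.Send(Message{sequenceID: "alice-bob:1", targetID: "bob", text: "hi"}); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Println(s.notifications["bob"]) // 1
	fmt.Println(len(s.Pull("bob")))     // 1
	fmt.Println(len(s.persistentStore)) // 1
}
```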

Analysis:

How messages are synced

Pull model

[Figure: pull model]

For example, if I am chatting with 10 of my friends, there are 10 message queues, one between each friend and me. Each time I open the app, it loops over all the message queues to pull new messages.
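
The loop over all message queues could look like the sketch below. The `Client` type, the per-channel read counter, and the sorted iteration order (used only to keep the output deterministic) are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

type Message struct {
	channelID, text string
}

// Client remembers, per channel, how many messages it has already read.
type Client struct {
	lastPos map[string]int
}

// PullAll visits every queue and collects the messages that arrived
// since the previous pull. Queues with nothing new are still visited.
func (c *Client) PullAll(queues map[string][]Message) []Message {
	ids := make([]string, 0, len(queues))
	for id := range queues {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic order for the example

	var out []Message
	for _, id := range ids {
		q := queues[id]
		read := c.lastPos[id]
		if read < len(q) {
			out = append(out, q[read:]...)
			c.lastPos[id] = len(q)
		}
	}
	return out
}

func main() {
	queues := map[string][]Message{
		"me-alice": {{channelID: "me-alice", text: "hi"}},
		"me-bob":   {{channelID: "me-bob", text: "lunch?"}, {channelID: "me-bob", text: "12:30?"}},
		"me-carol": {}, // idle queue, still polled
	}
	c := &Client{lastPos: make(map[string]int)}
	fmt.Println(len(c.PullAll(queues))) // 3
	fmt.Println(len(c.PullAll(queues))) // 0
}
```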

The cons of this approach are:

Push model

[Figure: push model]

The cons of this approach are:

Conclusion

Most IM systems use the push model for 1:1 messaging and the pull model for group messaging.

How messages are persisted

What are the requirements

Options

There is a debate between leader-based and leaderless storage. Amazon Chime uses DynamoDB, which is leader based. Slack uses a MySQL cluster, which is active-active with strong consistency. Alibaba Dingding uses Table Store, which is also leader based. Cassandra (leaderless) should also work and performs well, but it needs some mechanism to coordinate message consistency and ordering across all nodes, which a leader-based system handles easily.

What other companies are using

Overall architecture

[Figure: design architecture]

Dingding’s architecture

Conventional architecture

Slack’s architecture

How slack works

Questions

Miscellaneous

References