Design a news feed system

Requirements


Assumptions

Data model

User profiles

type UserProfile struct {
    UserID string
    Name   string
    Age    int
    // ...
}

User relations

type UserRelations struct {
    UserID    string
    Follows   []string // IDs of the users the current user follows
    Followers []string // IDs of the users who follow the current user
}
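
To illustrate how the two slices stay consistent, here is a minimal sketch of a follow operation that updates both sides of the relation. The `relationStore` map, the `get` and `contains` helpers, and the `Follow` method are hypothetical names introduced for this example; a real system would keep these records in a database rather than in memory.

```go
package main

import "fmt"

// UserRelations mirrors the struct above.
type UserRelations struct {
	UserID    string
	Follows   []string
	Followers []string
}

// relationStore is a hypothetical in-memory stand-in for the relations table.
type relationStore map[string]*UserRelations

// get returns the record for userID, creating it on first use.
func (s relationStore) get(userID string) *UserRelations {
	r, ok := s[userID]
	if !ok {
		r = &UserRelations{UserID: userID}
		s[userID] = r
	}
	return r
}

// contains reports whether id is present in ids.
func contains(ids []string, id string) bool {
	for _, v := range ids {
		if v == id {
			return true
		}
	}
	return false
}

// Follow records that follower follows followee, updating both records.
// Self-follows and duplicate follows are ignored.
func (s relationStore) Follow(follower, followee string) {
	if follower == followee {
		return
	}
	a, b := s.get(follower), s.get(followee)
	if contains(a.Follows, followee) {
		return
	}
	a.Follows = append(a.Follows, followee)
	b.Followers = append(b.Followers, follower)
}

func main() {
	s := relationStore{}
	s.Follow("alice", "bob")
	s.Follow("alice", "bob") // duplicate, ignored
	fmt.Println(s["alice"].Follows, s["bob"].Followers)
}
```

Storing both directions duplicates data, but it lets the system answer "whom do I follow" and "who follows me" with a single lookup each.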

Feeds

type Feed struct {
    UserID   string
    FeedID   string // could be built from userID + timestamp, which also serves as a sequential ID
    Text     string
    MediaURL string // usually generated by the object storage service, with a UUID as a suffix
    Comments []Comment
    Likes    []Like // one entry per user who liked the feed
    // Alternatively, the likes could be stored as a count:
    // Likes int
    // ...
}

Comment

type Comment struct {
    FeedID string
    UserID string // who wrote the comment
    Text   string
    // ...
}

Like

type Like struct {
    FeedID string
    UserID string // who added the like
    // ...
}

Media

type Media struct {
    MediaID     string
    UserID      string
    MediaURL    string
    Description string
    // ...
}

Storage

To persist data models

We need to persist the data models to meet the data persistence requirements.

There are plenty of videos that discuss the differences between SQL and NoSQL databases, like this one.

We would consider the following when making the decisions:

To persist media

We need a separate data store for pictures and videos. A good candidate is an object storage service that exposes an HTTP API.

Click here to learn more about the differences between file storage, block storage, and object storage.
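
As a rough illustration of the `MediaURL` field described in the data model, the sketch below builds an object URL that ends in a random UUID. In practice the object storage service usually generates this URL itself; the `MediaURL` and `newUUIDv4` functions, the base URL, and the `<userID>/<uuid>` key layout are all assumptions made for this example.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 returns a random (version 4) UUID string built from crypto/rand.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// MediaURL builds the URL under which a new upload would be stored.
// baseURL and the "<userID>/<uuid>" layout are hypothetical.
func MediaURL(baseURL, userID string) (string, error) {
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s/%s/%s", baseURL, userID, id), nil
}

func main() {
	u, err := MediaURL("https://media.example.com", "bob123")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

The feed record then stores only this URL, while the picture or video itself lives in the object store.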

MessageQueue/Kafka to support Push/Pull model

To stream feeds to consumers, post requests could be placed on a message queue to await processing and then dispatched to Kafka, which buffers the feeds for online consumers. This helps reduce how often the database is accessed.

Architecture

[Figure: architecture diagram]

The diagram above is a brief, high-level architecture based on the push model. The service does not necessarily have to be push-based. The blue lines show the workflow of posting a news feed, and the green lines show the workflow of getting a news feed.
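
To make the push model concrete, here is a minimal in-memory sketch of fan-out on write: posting a feed (the blue path) copies it into every follower's precomputed timeline, so getting a feed (the green path) is a simple read. The `PushFeeds` and `FeedItem` types and their methods are hypothetical names for this example and do not correspond to specific components in the diagram.

```go
package main

import "fmt"

// FeedItem is a trimmed-down feed record.
type FeedItem struct {
	FeedID   string
	AuthorID string
	Text     string
}

// PushFeeds keeps one precomputed timeline per user.
type PushFeeds struct {
	followers map[string][]string   // author ID -> follower IDs
	timelines map[string][]FeedItem // user ID -> feed items, newest first
}

func NewPushFeeds() *PushFeeds {
	return &PushFeeds{
		followers: make(map[string][]string),
		timelines: make(map[string][]FeedItem),
	}
}

// Follow registers follower as a follower of author.
func (p *PushFeeds) Follow(follower, author string) {
	p.followers[author] = append(p.followers[author], follower)
}

// Post pushes the item to the front of every follower's timeline.
func (p *PushFeeds) Post(item FeedItem) {
	for _, f := range p.followers[item.AuthorID] {
		p.timelines[f] = append([]FeedItem{item}, p.timelines[f]...)
	}
}

// Timeline returns a copy of up to limit of the newest items for userID.
func (p *PushFeeds) Timeline(userID string, limit int) []FeedItem {
	if limit < 0 {
		limit = 0
	}
	tl := p.timelines[userID]
	if limit < len(tl) {
		tl = tl[:limit]
	}
	return append([]FeedItem(nil), tl...)
}

func main() {
	p := NewPushFeeds()
	p.Follow("alice", "bob")
	p.Post(FeedItem{FeedID: "f1", AuthorID: "bob", Text: "first"})
	p.Post(FeedItem{FeedID: "f2", AuthorID: "bob", Text: "second"})
	for _, item := range p.Timeline("alice", 10) {
		fmt.Println(item.FeedID, item.Text)
	}
}
```

The trade-off is that writes grow with the number of followers, which is one reason a service might prefer a pull-based or mixed approach for accounts with very many followers.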

Components design

API Gateway

An API gateway could help provide the following:

Feeds generation service

Feeds retrieval service

Count likes

[Figure: count-likes workflow]

The workflow above is from this YouTube video.
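
Because like counts do not have to be real time, one option is to buffer likes in memory and persist them in batches. The sketch below shows that idea only; it is not a reproduction of the workflow in the figure. The `LikeBatcher` type, its methods, and the `persist` callback are hypothetical, and the sketch does not deduplicate repeated likes from the same user.

```go
package main

import (
	"fmt"
	"sync"
)

// LikeBatcher accumulates like counts in memory until they are flushed.
type LikeBatcher struct {
	mu      sync.Mutex
	pending map[string]int // feed ID -> likes not yet persisted
}

func NewLikeBatcher() *LikeBatcher {
	return &LikeBatcher{pending: make(map[string]int)}
}

// Like records one like for feedID without touching the database.
func (b *LikeBatcher) Like(feedID string) {
	b.mu.Lock()
	b.pending[feedID]++
	b.mu.Unlock()
}

// Flush hands the accumulated deltas to persist, one call per feed,
// and resets the buffer. It would typically run on a timer.
func (b *LikeBatcher) Flush(persist func(feedID string, delta int)) {
	b.mu.Lock()
	batch := b.pending
	b.pending = make(map[string]int)
	b.mu.Unlock()
	for id, n := range batch {
		persist(id, n)
	}
}

func main() {
	b := NewLikeBatcher()
	b.Like("feed-1")
	b.Like("feed-1")
	b.Like("feed-2")

	totals := map[string]int{} // stands in for the database
	b.Flush(func(id string, delta int) { totals[id] += delta })
	fmt.Println(totals["feed-1"], totals["feed-2"])
}
```

With this approach the database sees one write per feed per flush interval instead of one write per like.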

Streaming likes on Live video

More details can be found here.

Add comments

Counting likes can be done in batches because it does not have to be real time. Adding comments, however, needs to be handled as quickly as possible. A user could send a request such as {"feedId": "xxxx", "userId": "bob123", "text": "this is awesome"} to the server for processing. The idea is similar to messaging: the server side has a message queue that receives the requests and uses worker threads to process them. Comments are ordered by the order in which the requests are inserted into the database.
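
A minimal sketch of this comment pipeline follows, assuming the request body is JSON with the `feedId`, `userId`, and `text` fields shown above. The channel stands in for the message queue and `CommentStore` for the database; both, along with `processQueue`, are hypothetical names for this example. The sequence number is assigned at insertion time, which mirrors ordering comments by insertion order.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// CommentRequest is the payload a client sends to add a comment.
type CommentRequest struct {
	FeedID string `json:"feedId"`
	UserID string `json:"userId"`
	Text   string `json:"text"`
}

// StoredComment is a comment plus its per-feed insertion order.
type StoredComment struct {
	Seq int
	CommentRequest
}

// CommentStore is an in-memory stand-in for the comments table.
type CommentStore struct {
	mu     sync.Mutex
	byFeed map[string][]StoredComment
}

func NewCommentStore() *CommentStore {
	return &CommentStore{byFeed: make(map[string][]StoredComment)}
}

// Insert appends the comment and assigns the next sequence number for its feed.
func (s *CommentStore) Insert(c CommentRequest) {
	s.mu.Lock()
	defer s.mu.Unlock()
	list := s.byFeed[c.FeedID]
	s.byFeed[c.FeedID] = append(list, StoredComment{Seq: len(list) + 1, CommentRequest: c})
}

// processQueue decodes raw requests with a pool of workers and inserts
// the valid ones. Malformed requests are dropped in this sketch.
func processQueue(queue <-chan []byte, store *CommentStore, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for raw := range queue {
				var c CommentRequest
				if err := json.Unmarshal(raw, &c); err != nil || c.FeedID == "" {
					continue
				}
				store.Insert(c)
			}
		}()
	}
	wg.Wait()
}

func main() {
	queue := make(chan []byte, 4)
	queue <- []byte(`{"feedId": "xxxx", "userId": "bob123", "text": "this is awesome"}`)
	queue <- []byte(`{"feedId": "xxxx", "userId": "amy456", "text": "agreed"}`)
	close(queue)

	store := NewCommentStore()
	processQueue(queue, store, 1)
	for _, c := range store.byFeed["xxxx"] {
		fmt.Println(c.Seq, c.UserID, c.Text)
	}
}
```

With more than one worker, the stored order reflects when each insert happens rather than when each request arrived, which is consistent with ordering by insertion but may differ slightly from arrival order.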

Scaling

https://www.youtube.com/watch?v=hnpzNAPiC0E&ab_channel=InfoQ

Tools to use

[Figure: tools to use]

References