
Distributed counter

Requirements and User stories

Functional requirements

Non-functional requirements

Should ask:


Calculation

Data model

type Post struct {
	Post_ID  int64
	User_ID  int32
	Title    string
	Content  string
	VideoRef string
}
type Counter struct {
	Post_ID int64 // The ID for posts, pages, and videos. int32 is too small: an unsigned 32-bit
	              // integer maxes out at ~4 billion, a signed one at ~2 billion.
	Count   int32 // Is it realistic to exceed 2 billion counts on a single post? (A signed int32
	              // can handle ~2 billion.)
}

APIs

curl -X PUT https://endpoint/v1/post_id/counter \
-d '{"action": "increment"}'
# The resource above can also be https://endpoint/v1/post_id/views or /likes

# Response
status code: 200 OK
content-encoding: gzip
content-type: application/json
{
  "count": 812,
  "updated_at": "2030-10-10T12:11:42Z"
}

---
curl -X PUT https://endpoint/v1/post_id/counter \
-d '{"action": "decrement"}'

# Response
status code: 200 OK
content-encoding: gzip
content-type: application/json
{
  "count": 811,
  "updated_at": "2030-10-10T12:14:42Z"
}

---
curl -X GET https://endpoint/v1/post_id/counter

# Response
status code: 200 OK
content-encoding: gzip
content-type: application/json
{
  "count": 811,
  "updated_at": "2030-10-10T12:16:42Z"
}

Architecture

Database

Relational Database Solution

Challenges:

Potential solutions and enhancements:

Conclusion: a relational database is not the right choice. Even with the enhancements above, some fundamental problems with relational databases remain.

NoSQL Database solution

NoSQL databases are popular because of their high scalability. A key-value store is a good fit for our use case; a wide-column database like Cassandra is also worth considering.

A modern NoSQL database can comfortably handle ~600 kOps/s with a 3-5 node cluster (operations per second here covers file I/O, network traffic, API requests, system calls, and task completions). Ref: https://benchant.com/blog/mongodb-vs-scylladb-benchmark

Where is the count of a post stored

Leader-based:

The biggest problems with this leader-based solution are (1) read/write performance and (2) the leader becoming a throughput bottleneck. If we don't want both reads and writes to go through the leader, will a leader-based design with partitioning work?

Leader-based with partitioning:

Even if we partition, all writes still have to be redirected to the partition's leader. Read performance is not improved either if we split a hot post into partitions like Post A-1 and Post A-2, because every read then has to aggregate the counts from all partitions.

Leaderless:


Cache

From the analysis above, a NoSQL database can usually handle ~600 kOps/s. Now imagine 100 million users concurrently watching the same post/video or clicking Like: in the worst case we have to handle 100 million concurrent writes. ~600 kOps/s is nowhere near that.
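As a rough back-of-envelope check using those two numbers: 100,000,000 writes / 600,000 ops/s ≈ 167 seconds, so a single cluster would need almost three minutes just to absorb one such burst, ignoring reads entirely. We either need roughly two orders of magnitude more database capacity or a buffering layer in front of it.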

Can we put a cache layer between the backend servers and the database, so that we offload some of the traffic that would otherwise hit the database directly? The answer is yes.

Using a message broker as the buffer
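The idea: backend servers publish small increment/decrement events to the broker, and a consumer aggregates them in memory, flushing one batched update per post to the database instead of one write per user action. A minimal sketch of the consumer side, with a channel standing in for the broker subscription (the Event type and flush interval are illustrative, not part of the original design):

package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical increment/decrement message published by the backend.
type Event struct {
	PostID int64
	Delta  int32 // +1 for a like, -1 for an unlike
}

// aggregate drains events, folds them into per-post deltas in memory, and
// flushes one batched update per post every flushInterval.
func aggregate(events <-chan Event, flushInterval time.Duration) {
	pending := make(map[int64]int32)
	ticker := time.NewTicker(flushInterval)
	defer ticker.Stop()
	for {
		select {
		case e, ok := <-events:
			if !ok {
				return
			}
			pending[e.PostID] += e.Delta
		case <-ticker.C:
			for id, delta := range pending {
				// In a real system this would be a single
				// "UPDATE counter SET count = count + ? WHERE post_id = ?".
				fmt.Printf("flush post %d: %+d\n", id, delta)
			}
			pending = make(map[int64]int32)
		}
	}
}

func main() {
	events := make(chan Event, 1024)
	go aggregate(events, 500*time.Millisecond)
	events <- Event{PostID: 42, Delta: 1}
	events <- Event{PostID: 42, Delta: 1}
	events <- Event{PostID: 7, Delta: -1}
	time.Sleep(time.Second)
}

Millions of raw events thus collapse into a handful of database writes per flush interval, at the cost of the count being slightly stale.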

Cons:

Redis

Cons: Redis's INCR and DECR commands directly modify the value stored at a key, with no built-in mechanism to track previous values or prevent duplicate updates from accumulating. Repeatedly executing INCR on the same key always increments the value, even if the same request is sent multiple times. This lack of idempotency can lead to over-counting or under-counting. If a network partition happens between the backend server and Redis, we have to be very careful with server retries, because they might cause inaccurate counts.
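One common mitigation is to deduplicate on a client-generated request ID before incrementing. A minimal sketch, assuming go-redis (github.com/redis/go-redis/v9); the key names and TTL are illustrative:

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// idempotentIncr runs atomically inside Redis: it records the request ID with
// SET NX and only applies INCRBY if this request has not been seen before.
var idempotentIncr = redis.NewScript(`
if redis.call('SET', KEYS[2], 1, 'NX', 'EX', ARGV[2]) then
  return redis.call('INCRBY', KEYS[1], ARGV[1])
end
return redis.call('GET', KEYS[1])
`)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	counterKey := "counter:post:42"
	requestID := "req:abc-123" // generated once by the client, reused on retries

	// Running this twice with the same requestID increments only once.
	for i := 0; i < 2; i++ {
		n, err := idempotentIncr.Run(ctx, rdb,
			[]string{counterKey, "dedup:" + requestID}, 1, 86400).Result()
		fmt.Println(n, err)
	}
}

The dedup key's TTL bounds how long retries stay idempotent; after it expires, a very late retry would count again.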

CRDT-based cache

We can use a CRDT (PN-Counter) based in-memory counter as the cache. See here for more details on how the CRDT solution works. Redis does have CRDT support (via Redis Enterprise's Active-Active databases).

It can be implemented as a look-aside cache with a write-around policy.
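For intuition: a PN-Counter keeps two grow-only counters per replica, one for increments (P) and one for decrements (N), and merges replica states by taking the per-replica maximum. A minimal sketch (type and method names are illustrative):

package main

import "fmt"

// PNCounter is a state-based CRDT: P accumulates increments and N accumulates
// decrements, each keyed by replica ID.
type PNCounter struct {
	ID string
	P  map[string]int64
	N  map[string]int64
}

func NewPNCounter(id string) *PNCounter {
	return &PNCounter{ID: id, P: map[string]int64{}, N: map[string]int64{}}
}

func (c *PNCounter) Increment() { c.P[c.ID]++ }
func (c *PNCounter) Decrement() { c.N[c.ID]++ }

// Value is the sum of all increments minus the sum of all decrements.
func (c *PNCounter) Value() int64 {
	var v int64
	for _, p := range c.P {
		v += p
	}
	for _, n := range c.N {
		v -= n
	}
	return v
}

// Merge folds another replica's state in by taking per-replica maxima. The
// operation is commutative, associative, and idempotent, so replicas converge
// regardless of message order or duplication.
func (c *PNCounter) Merge(other *PNCounter) {
	for id, p := range other.P {
		if p > c.P[id] {
			c.P[id] = p
		}
	}
	for id, n := range other.N {
		if n > c.N[id] {
			c.N[id] = n
		}
	}
}

func main() {
	a, b := NewPNCounter("a"), NewPNCounter("b")
	a.Increment()
	a.Increment()
	b.Decrement()
	a.Merge(b)
	b.Merge(a)
	fmt.Println(a.Value(), b.Value()) // 1 1
}

Because Merge is idempotent, replaying the same replica state (for example, after a retried gossip message) never double-counts, which is exactly the property plain Redis INCR lacks.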

Push real-time count updates to all subscribed users

Please see these notes for more details on how LinkedIn handles live like updates. The green boxes in the architecture diagram show the workflow.
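LinkedIn's published approach streams updates to subscribed clients over Server-Sent Events (SSE). A minimal sketch of an SSE endpoint using Go's standard library (the endpoint path and the ticker-driven update source are illustrative stand-ins for the real counter pipeline):

package main

import (
	"fmt"
	"net/http"
	"time"
)

// countUpdates would normally be fed by the counter pipeline; here a ticker
// stands in for real updates.
func countUpdates() <-chan int64 {
	ch := make(chan int64)
	go func() {
		var count int64 = 811
		for range time.Tick(time.Second) {
			count++
			ch <- count
		}
	}()
	return ch
}

func sseHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	updates := countUpdates()
	for {
		select {
		case <-r.Context().Done(): // client disconnected
			return
		case count := <-updates:
			// Each SSE frame is "data: <payload>\n\n".
			fmt.Fprintf(w, "data: {\"count\": %d}\n\n", count)
			flusher.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/v1/post_id/counter/stream", sseHandler)
	http.ListenAndServe(":8080", nil)
}

A browser can consume this with a plain EventSource, with no WebSocket infrastructure needed for one-way count updates.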

Failure handling

Monitoring and health checks for the distributed counter should be implemented. The following entities should be monitored:

Chaos engineering can be used to test the resiliency of the distributed counter and identify any single point of failure (SPOF). The CRDT database should be tested against network-partition scenarios. A sample test case for a partitioned network in the CRDT database is the following:
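As an illustration, such a convergence test might look like the sketch below. It assumes the PNCounter type from the CRDT-based cache section above, and simulates the partition by simply withholding merges (a stand-in for a real fault-injection tool):

// Assumes the PNCounter type from the CRDT-based cache section is available
// in the same package.
import "testing"

func TestConvergenceAfterPartition(t *testing.T) {
	a, b := NewPNCounter("a"), NewPNCounter("b")

	// 1. Partition the network between replicas a and b (simulated here by
	//    not merging), then apply writes on both sides of the partition.
	for i := 0; i < 5; i++ {
		a.Increment()
	}
	b.Increment()
	b.Decrement()
	b.Decrement()

	// 2. Heal the partition: replicas exchange and merge state.
	a.Merge(b)
	b.Merge(a)

	// 3. Both replicas must converge to the same value: 5 + 1 - 2 = 4.
	if a.Value() != b.Value() || a.Value() != 4 {
		t.Fatalf("replicas diverged: a=%d b=%d", a.Value(), b.Value())
	}
}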

Scaling

// TODO: What happens when a new node is added to the Redis cluster or to the DB cluster? How is data migration handled?
// TODO: How do we scale the front-end servers for real-time like streaming?

Miscellaneous

References