Scribe: Transporting petabytes per hour via a distributed, buffered queueing system

TL;DR: Scribe is mainly a system for transporting logs from Point A to Point B within Facebook. Point A is where logs are produced, and Point B is where they are processed.

Why Scribe

The motivation is the same as for other open source solutions such as Fluentd: we need a way to collect, aggregate, and deliver large volumes of logs from their source to another place for processing. Such a solution must handle high data throughput and be highly available and scalable.

User stories

Design details

Producer libs

Scribed

Write Service

Storage back-end

Read Service

Availability

Scalability

Compare with Fluentd/Fluentbit

Follow up

The original blog post gives only a high-level introduction to what this system is, without many details. We could dive deeper into the following aspects:

References