
Caching

Why caching is needed

When local caches are used, they only benefit the local application consuming the data. In a distributed caching environment, the data can be replicated to multiple cache servers so that all consumers across multiple geolocations benefit.

There are different levels of caching:

Cache write policies

Reference: Open CAS cache mode

Types of cache architecture

Look-aside

[Figure: look-aside cache]

Look-aside On read


Pros:

Cons:
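
To make the read path concrete, here is a minimal Go sketch of look-aside (cache-aside) reads. The Cache and DB map types are hypothetical stand-ins for a real cache (such as Redis) and the primary datastore; the point is that the application, not the cache, handles the miss.

```go
package main

import "fmt"

type Cache map[string]string // stand-in for Redis/Memcached
type DB map[string]string    // stand-in for the primary datastore

// getUser reads from the cache first; only on a miss does it go to the DB
// and then back-fill the cache so later reads are served from the cache.
func getUser(cache Cache, db DB, id string) string {
	if v, ok := cache[id]; ok {
		return v // cache hit: the DB is not touched
	}
	v := db[id]   // cache miss: read from the source of truth
	cache[id] = v // populate the cache for subsequent readers
	return v
}

func main() {
	cache, db := Cache{}, DB{"u1": "alice"}
	fmt.Println(getUser(cache, db, "u1")) // first call misses and fills the cache
	fmt.Println(getUser(cache, db, "u1")) // second call is a cache hit
}
```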

Look-aside On write


Pros:

Cons:
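
A minimal Go sketch of the look-aside write path, again with hypothetical Cache and DB stand-ins. This variant invalidates the cached entry after writing to the database, so the next read repopulates it and stale data is avoided without dual writes.

```go
package main

import "fmt"

type Cache map[string]string // stand-in for Redis/Memcached
type DB map[string]string    // stand-in for the primary datastore

// updateUser writes to the source of truth first, then invalidates the
// cached copy; the next read will re-fill the cache with the new value.
func updateUser(cache Cache, db DB, id, name string) {
	db[id] = name
	delete(cache, id)
}

func main() {
	cache, db := Cache{"u1": "alice"}, DB{"u1": "alice"}
	updateUser(cache, db, "u1", "bob")
	fmt.Println(cache["u1"], db["u1"]) // cache entry gone, DB updated
}
```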

Look-through

Look-through On read

[Figure: look-through cache]


Pros:

Cons:
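
A minimal Go sketch of a look-through (read-through) cache with a hypothetical backing-store type: the application only talks to the cache, and the cache itself loads from the backing store on a miss.

```go
package main

import "fmt"

type DB map[string]string // stand-in for the backing datastore

type ReadThroughCache struct {
	store   map[string]string
	backing DB
}

// Get serves hits from memory; on a miss the cache, not the application,
// fetches from the backing store and keeps a copy.
func (c *ReadThroughCache) Get(key string) string {
	if v, ok := c.store[key]; ok {
		return v
	}
	v := c.backing[key]
	c.store[key] = v
	return v
}

func main() {
	c := &ReadThroughCache{store: map[string]string{}, backing: DB{"u1": "alice"}}
	fmt.Println(c.Get("u1")) // miss, loaded from the backing store
	fmt.Println(c.Get("u1")) // hit, served from the cache
}
```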

Look-through On write

Sync writes

[Figure: look-through cache, sync write]


Pros:

Cons:
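
A minimal Go sketch of synchronous (write-through) writes with hypothetical types: the cache forwards every write to the backing store before acknowledging, so the cache and the store never diverge, at the cost of write latency.

```go
package main

import "fmt"

type DB map[string]string // stand-in for the backing datastore

type WriteThroughCache struct {
	store   map[string]string
	backing DB
}

// Set writes to the source of truth synchronously, then updates the
// cached copy, and only then returns to the caller.
func (c *WriteThroughCache) Set(key, value string) {
	c.backing[key] = value
	c.store[key] = value
}

func main() {
	c := &WriteThroughCache{store: map[string]string{}, backing: DB{}}
	c.Set("u1", "alice")
	fmt.Println(c.store["u1"], c.backing["u1"]) // both updated before Set returns
}
```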

Async writes

[Figure: look-through cache, async write]


Pros:

Cons:
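
A minimal Go sketch of asynchronous (write-back) writes with hypothetical types: Set returns as soon as the in-memory copy is updated, and a background goroutine flushes queued writes to the backing store later. Writes can be lost if the cache node dies before the flush.

```go
package main

import "fmt"

type write struct{ key, value string }

type WriteBackCache struct {
	store   map[string]string
	backing map[string]string // stand-in for the backing datastore
	pending chan write
	done    chan struct{}
}

func NewWriteBackCache() *WriteBackCache {
	c := &WriteBackCache{
		store:   map[string]string{},
		backing: map[string]string{},
		pending: make(chan write, 1024),
		done:    make(chan struct{}),
	}
	// Background flusher: drains queued writes into the backing store.
	go func() {
		for w := range c.pending {
			c.backing[w.key] = w.value
		}
		close(c.done)
	}()
	return c
}

// Set acknowledges as soon as the cached copy is updated; the write
// reaches the backing store asynchronously via the pending queue.
func (c *WriteBackCache) Set(key, value string) {
	c.store[key] = value
	c.pending <- write{key, value}
}

func main() {
	c := NewWriteBackCache()
	c.Set("u1", "alice")
	close(c.pending) // demo only: stop accepting writes so the flusher can finish
	<-c.done         // wait for the asynchronous flush to complete
	fmt.Println(c.backing["u1"])
}
```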

Granularity of caching key

We know that a cache is usually a key-value store, and what data to store in it varies case by case. Usually there are two types I can think of:

How cache eviction works

Distributed cache

How distributed caching (hash table) works

Each region has its own local cache cluster, and the data is replicated to all other regions.

Netflix EVCache Reading

[Figure: Netflix EVCache read path]

Netflix EVCache Writing

[Figure: Netflix EVCache write path]

Facebook Memcache Reading

[Figure: Facebook Memcache read path]

Facebook Memcache Writing

Write from master region

[Figure: mcsqueal invalidation pipeline]

Write from non-master region

How a distributed cache replicates across regions

Netflix cross region replication

[Figure: Netflix cross-region replication]

How to scale cache

To have more shards

Not all keys reside in a single node of the cache cluster; keys are sharded across multiple nodes, and each node holds a subset of the keys. Redis has 16384 hash slots and uses CRC16 of the key modulo 16384 to find the slot (and hence the node) a key belongs to. When a new, empty shard is added, we need to transfer keys from the other nodes to it; once the new shard catches up, those keys can be cleaned up from the original nodes. In Redis the resharding command has to be run manually to move keys; a consistent hash ring can also be used to minimize data movement.
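
As an illustration of the slot calculation described above, here is a small Go sketch (not Redis' actual source, and ignoring hash tags such as {user}) that computes CRC16 (XMODEM variant, which Redis Cluster uses) of a key modulo 16384.

```go
package main

import "fmt"

// crc16 implements CRC16-XMODEM (init 0x0000, polynomial 0x1021, no reflection).
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// hashSlot maps a key to one of the 16384 Redis Cluster hash slots.
func hashSlot(key string) uint16 {
	return crc16([]byte(key)) % 16384
}

func main() {
	fmt.Println(hashSlot("user:1001")) // the node owning this slot serves the key
}
```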

To have more replicas of a particular shard

Adding a new replica of a shard is easy: we just copy data to it from the main/master shard. Once it catches up, we can allow that node to take part in consensus voting.

To put some cold data on SSD

[Figure: using SSD for cold cache data]

How K8S client-go caching works

How a custom controller works

[Figure: client-go and custom controller interaction]

There could be multiple controllers reconciling the same set of resources; it would be a huge load if every controller talked to the API server to ask for the state of those resources, so caching is really important.

The Thread Safe Store is the cache where the informer adds the objects.
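
A minimal sketch of how a controller reads objects from the informer's local cache instead of querying the API server on every reconcile; it uses the standard client-go informer/lister API, assumes a kubeconfig at the default location, and trims error handling.

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes ~/.kube/config; error handling trimmed for brevity.
	config, _ := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	clientset, _ := kubernetes.NewForConfig(config)

	// The shared informer keeps a watch open against the API server and
	// mirrors the objects into an in-memory, thread-safe store.
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	podInformer := factory.Core().V1().Pods()

	podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			// A real controller would enqueue a key onto a workqueue here.
			fmt.Println("pod added to local cache")
		},
	})

	stopCh := make(chan struct{})
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	// Reads go through the lister, which is served from the local cache,
	// not from a request to the API server.
	pods, _ := podInformer.Lister().Pods("default").List(labels.Everything())
	fmt.Println("pods in cache:", len(pods))
	close(stopCh)
}
```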

Comparison between Redis and Memcached

More details can be found in the links in the references section below.

References