
Design i18n service

What is i18n

Internationalization (i18n) is the process of adapting software to accommodate different languages, cultures, and regions (i.e., locales) while minimizing the additional engineering changes required for localization.

Requirements

Functional requirements

Non-functional requirements

Data models

Phrase

A phrase consists of a word or a sentence. It is the base unit in a translation service.

type Phrase struct {
	ID       string // A globally unique ID that identifies a phrase
	Locale   string // Identifies the locale, e.g., en, fr, cn
	Version  string // Tracks different versions of a phrase; a timestamp could also serve as the version
	Default  string // The default text of the phrase
	Singular string // The singular form of the phrase, e.g., he or 他
	Plural   string // The plural form of the phrase, e.g., they or 他们
	// ...
}

We could use ID+Locale as the key to identify every phrase, including translated phrases, e.g., 123_en, 123_fr. We could also use ID+Locale+Version as the key to identify every versioned phrase; a timestamp works as the version too.
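The key scheme described above can be sketched as a small helper. This is only an illustration: the function name `PhraseKey` is hypothetical, and appending the version with the same `_` separator is an assumption, since the text only shows the ID+Locale form.

```go
package main

import (
	"fmt"
	"strings"
)

// PhraseKey builds a storage key from a phrase ID, a locale, and an
// optional version. The "_" separator follows the 123_en example in the
// text; joining the version the same way is an assumption.
func PhraseKey(id, locale, version string) string {
	parts := []string{id, locale}
	if version != "" {
		parts = append(parts, version)
	}
	return strings.Join(parts, "_")
}

func main() {
	fmt.Println(PhraseKey("123", "en", ""))   // 123_en
	fmt.Println(PhraseKey("123", "fr", "v2")) // 123_fr_v2
}
```

Note that a key built this way is only unambiguous if IDs and locales never contain the separator themselves.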

Data persistence

We need to store the base phrase when it is created, as well as the translated phrases.

SQL or NoSQL

Which NoSQL to use

Considering extensibility, a wide-column data store such as BigTable or Cassandra, or a key-value store such as DynamoDB, would be a good choice.

Leader based or leaderless

Cassandra has a leaderless distributed architecture, whereas DynamoDB and BigTable are leader-based.

APIs

type I18n interface {
	// Add a phrase, return phrase ID and error status
	AddPhrase(defaultText, singular, plural, locale string) (string, error)
	// Get a phrase, return the phrase and error status
	GetPhrase(id, locale string) (string, error)
	// Translate a phrase from one locale to another
	// externalTranslator is a hook point which could be used to link to external translation service
	Translate(id, sourceLocale, targetLocale string, externalTranslator func(phrase, sourceLocale, targetLocale string) (string, error)) (string, error)
}

Architecture

Airbnb i18n platform

[Figure: Airbnb i18n platform]

My own architecture

[Figure: architecture]

Components

Content service

Translation service

Workflow

Design details

Row structure if using wide-column

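| Row key | default | singular | plural |
| --- | --- | --- | --- |
| 123_cn | 你好 | | |
| 123_en | hello | | |

| Row key | cn_default | cn_singular | cn_plural | en_default | en_singular | en_plural |
| --- | --- | --- | --- | --- | --- | --- |
| 123 | 你好 | | | Hello | | |

| Row key | default | singular | plural |
| --- | --- | --- | --- |
| 123 | encode(cn:你好,en:Hello) | | |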
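As a rough illustration of how a lookup would address a text under each of the three layouts above, consider the sketch below. The function names are hypothetical, and JSON is used for the third layout only as an assumption, since the source says just `encode(...)` without naming an encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Layout1 addresses a text as one row per ID+Locale, one column per form.
func Layout1(id, locale, form string) (rowKey, column string) {
	return id + "_" + locale, form
}

// Layout2 addresses a text as one row per ID, one column per locale and form.
func Layout2(id, locale, form string) (rowKey, column string) {
	return id, locale + "_" + form
}

// EncodeCell packs the texts of every locale for one form into a single
// cell (third layout). JSON is an assumed encoding.
func EncodeCell(texts map[string]string) (string, error) {
	b, err := json.Marshal(texts)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// DecodeCell extracts the text for one locale from a third-layout cell.
func DecodeCell(cell, locale string) (string, bool, error) {
	var texts map[string]string
	if err := json.Unmarshal([]byte(cell), &texts); err != nil {
		return "", false, err
	}
	text, ok := texts[locale]
	return text, ok, nil
}

func main() {
	fmt.Println(Layout1("123", "cn", "default")) // 123_cn default
	fmt.Println(Layout2("123", "cn", "default")) // 123 cn_default

	cell, _ := EncodeCell(map[string]string{"cn": "你好", "en": "Hello"})
	text, ok, _ := DecodeCell(cell, "en")
	fmt.Println(text, ok) // Hello true
}
```

In this sketch, the third layout has to decode and re-encode the whole cell to change a single locale, whereas the first two layouts can write one cell directly.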

There are three ways to structure a row. Each has pros and cons, and there are a few things to consider:

Failure handling

Content Service failure

Translation Service failure

If the Translation Service needs to persist the offset, there are two options for handling failures:

Database failures

A distributed datastore has its own mechanisms for handling failures such as node crashes. The basic idea is to keep replicas and use leader election (for coordination and concurrency handling).

Cache layer failures

Client local store

This is why Airbnb’s i18n platform has a client lib and an i18n agent installed on the client side:

[Figure: client lib and i18n agent in Airbnb’s i18n platform]

The goal is to keep the client local store in sync with the server side. The server side is the source of truth.
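One way to picture this synchronization is a version comparison in which the server copy always wins. The sketch below is hypothetical: `VersionedText` and `Sync` are illustrative names, both stores are modelled as in-memory maps, and nothing here reflects how Airbnb's agent actually works. A real agent would fetch the server state over the network.

```go
package main

import "fmt"

// VersionedText is a phrase text together with its version.
type VersionedText struct {
	Version string
	Text    string
}

// Sync makes local match server, treating server as the source of truth.
// Keys are assumed to be ID+Locale strings. It returns how many local
// entries were added, replaced, or removed.
func Sync(local, server map[string]VersionedText) int {
	changed := 0
	// Pull every entry that is missing locally or has a different version.
	for k, sv := range server {
		if lv, ok := local[k]; !ok || lv.Version != sv.Version {
			local[k] = sv
			changed++
		}
	}
	// Drop local entries that the server no longer has.
	for k := range local {
		if _, ok := server[k]; !ok {
			delete(local, k)
			changed++
		}
	}
	return changed
}

func main() {
	server := map[string]VersionedText{
		"123_en": {Version: "2", Text: "Hello"},
		"123_cn": {Version: "1", Text: "你好"},
	}
	local := map[string]VersionedText{
		"123_en": {Version: "1", Text: "hello"}, // stale
		"999_en": {Version: "1", Text: "gone"},  // removed on the server
	}
	fmt.Println(Sync(local, server)) // 3
	fmt.Println(local["123_en"].Text)
}
```

Because the server is authoritative, the sketch never pushes local changes upward; a second call to `Sync` with unchanged inputs reports zero changes.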

References