Skip to the content.

Geolocation based service

Use cases

Requirements

Functional requests

Non-functional requests

Some ideas of indexing geolocation

GeoHash

GeoHash problems

geohash-edge-case

geohash-non-linearity

Query might result in searching so many blocks because of above known issues. So we want to improve: a) continuity in search, b) make sure there are minimum blocks to be queried.

Quadtree + GeoHash

Quadtree wiki page

Now the search nearby becomes searching all sibling leaf nodes or the leaf nodes under a particular common ancestor. If we convert the tree structure to be key-value pairs (key is the geohash, value is the serialized node properties), it becomes searching the key-value pairs having the same prefix.

Quadtree + GeoHash problem

The problem is also from the limitations of GeoHash. GeoHash values of two adjacent nodes might not be continuous, like what is shown in non-linearity, so just searching the common prefix nodes is not enough.

Quadtree + Hilbert Curve

hilbert-curve

quadtree-hilbert-curve-line

Data models and APIs

type QuadTreeNode struct {
	ID        string   // Generated by Hilbert Curve algorithm based on the coordinates
	Level     int      // The depth of current cell in quadtree
	Parent    Cell     // The pointer to the parent cell
	Children  Cell     // Four children under current cell
	Latitude  string   // Latitude current cell represent
	Longitude string   // Longitude current cell represent
	Objects   map[string]string // List of object IDs and their serialized properties, the object could be restaurant or driver or user
}

type Restaurant struct {
	ID string
  Latitude  string   // Latitude current cell represent
  Longitude string   // Longitude current cell represent
  Dishes []Dish
}

type Order struct {
	ID string
	RestaurantID string
	CustomerID string
	Dishes []Dish
}

type Person struct {
	ID string
	Type string // driver or customer
  Latitude  string   // Latitude current cell represent
  Longitude string   // Longitude current cell represent
  Profile map[string]string
}

// E.g. add a new restaurant on map
func AddObject(latitude string, longitude string, level int, properties map[string]string) bool {}
// E.g. delete a restaurant on map
func DeleteObject(latitude string, longitude string, level int) bool {}

func SearchNearby(latitude string, longitude string, radius int, maxCount) []Object {}

// More TBA

Data store

Static data

All kinds of “metadata” follow the “write-once” pattern, they could be stored in database for durability. It does not matter SQL or NoSQL.

Ephemeral data store

Those ephemeral data could be stored as key-value in memory. Lyft uses Redis for real time driver locations. Kafka is not a datastore, it is also ok if you want to push driver location heartbeat events to it.

Architecture

architecture

Components design

Restaurant profile service

Index service

Index service itself is stateless, so that it could be easily horizontally scaled. The location data is persistent in cache and database.

Query service

Checkout service

Dispatch service

Notification service

How to generate RegionID

All regions will be “Hilbert Curve” linked if running through above process. The longer the region ID is the more granular the region would be.

how-to-generate-region-id

How to add/update/delete a new landmark on map

How to register a new restaurant

How to delete a new restaurant

For deletion, the restaurantID is passed in as part of the payload, so backend service can easily delete the corresponding data entries from cache and database.

There is a special case we might want to handle is that if customer places order and restaurant deletion happen concurrently:

Deletion succeeds when above steps succeeds. Or we could have another check right before checkout to make sure the restaurant does still exist.

How does query nearby work

How to partition

How does Uber solve hotspot

Uber Ringpop

How drivers and customer share real-time location

Above is how lyft implements the driver-customer location sharing.

However, it could also be implemented by using the pub-sub streaming mechanism that both driver and customer subscribe to the same temporary topic. Driver publish location update events to topic, a backend consumer thread consumes the events and send updates to customer. OR the customer’s client lib consumes the events by querying from the topic.

How is driver notified an order

Notification service deserves a dedicated article. This will be added soon.

References