
Large-Scale Stream Processing Platform at Grab - Biweekly Engineering - Episode 19

How Grab built a large-scale stream processing platform - Indexing ads at Pinterest - Designing idempotent APIs: A Stripe Discussion

Greetings! Welcome to the 19th episode of your favourite Biweekly Engineering! I hope you had a wonderful weekend and are all cheered up for this episode. :)

If you have missed past episodes, feel free to read them here.

Cabo da Roca - westernmost point of mainland Europe

Large-scale Stream-Processing Platform at Grab

Grab, South-East Asia’s giant super-app, grew rapidly from a startup to a public company within a decade. As you can imagine, hyper-growth in a company brings hyper-growth in engineering challenges. Today, we will get to know how Grab solved one of their recurring engineering challenges - processing gazillions of events efficiently. From the article:

We saw common patterns of event processing emerge across our Go backend ecosystem, including:

- Filtering and mapping stream events of one type to another

- Aggregating events into time windows and materialising them back to the event log or to various types of transactional and analytics databases

Platform Use-Cases

Just like any well-written engineering article, the author explains the use cases for a stream processing system:

  • Asynchronous state management - when changes occur in a system and states are logged as events in a centralised log, multiple interested backend systems can subscribe to the event stream without blocking each other.

  • Time windowed aggregations - knowing what happened in a fixed time window is useful for optimising the user interface and incentives, as well as for attributing bookings to specific user actions (see the sketch after this list).

  • Stream joining, filtering, mapping - multiple streams can be mapped to a new stream, filtered, and joined to create sub-streams and materialise enriched data.

  • Real-time business intelligence - what’s happening in real-time to a product in the app is important information for business analysts.

  • Archival - storing data in cold storage for offline analysis and disaster recovery is also important for building data-driven products.
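To make the time-windowed aggregation use case a bit more concrete, here is a minimal Go sketch of a tumbling-window count. The event type, fields, and window size are invented for illustration; the article does not describe Grab's actual schemas.

```go
package main

import (
	"fmt"
	"time"
)

// BookingEvent is a hypothetical stream event; the real platform's
// event schema is not described in the article.
type BookingEvent struct {
	City      string
	Timestamp time.Time
}

// windowStart truncates an event timestamp to the start of its
// fixed (tumbling) window.
func windowStart(t time.Time, window time.Duration) time.Time {
	return t.Truncate(window)
}

func main() {
	window := 5 * time.Minute
	events := []BookingEvent{
		{City: "Singapore", Timestamp: time.Date(2023, 1, 1, 10, 2, 0, 0, time.UTC)},
		{City: "Singapore", Timestamp: time.Date(2023, 1, 1, 10, 4, 0, 0, time.UTC)},
		{City: "Jakarta", Timestamp: time.Date(2023, 1, 1, 10, 7, 0, 0, time.UTC)},
	}

	// Count bookings per city per 5-minute window; in a real pipeline this
	// aggregate would be materialised to a database or back to the event log.
	counts := map[string]int{}
	for _, e := range events {
		key := fmt.Sprintf("%s|%s", e.City, windowStart(e.Timestamp, window).Format(time.RFC3339))
		counts[key]++
	}
	for k, v := range counts {
		fmt.Println(k, v)
	}
}
```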

Platform Requirements

To build a seamless stream processing platform, the team needed to satisfy some fundamental requirements: scalability and elasticity, zero operational burden on product teams, multi-tenancy, fault tolerance for data completeness, and tunable processing latency.

A Stream Processing Framework

In order to build a successful platform, the team at Grab also built a stream processing framework, which makes it easier for product teams to implement their streaming use cases. The framework supports deduplication, filtering, mapping, aggregation, joining, error handling, and more.
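The article does not show the framework's API, so the following Go sketch only illustrates the general shape such a framework could take: small composable stages for deduplication, filtering, and mapping. All type and function names here are invented for illustration.

```go
package pipeline

// Event is a hypothetical envelope for a stream record; the real
// framework's types are not published in the article.
type Event struct {
	Key   string
	Value []byte
}

// Stage transforms a stream event; returning nil drops the event.
type Stage func(Event) []Event

// Dedupe, Filter, and Map are illustrative combinators mirroring the
// capabilities the article lists (deduplication, filtering, mapping).
func Dedupe(seen map[string]bool) Stage {
	return func(e Event) []Event {
		if seen[e.Key] {
			return nil
		}
		seen[e.Key] = true
		return []Event{e}
	}
}

func Filter(pred func(Event) bool) Stage {
	return func(e Event) []Event {
		if pred(e) {
			return []Event{e}
		}
		return nil
	}
}

func Map(fn func(Event) Event) Stage {
	return func(e Event) []Event { return []Event{fn(e)} }
}

// Run pushes each incoming event through the stages in order and emits
// whatever survives to the output channel.
func Run(in <-chan Event, out chan<- Event, stages ...Stage) {
	for e := range in {
		batch := []Event{e}
		for _, s := range stages {
			var next []Event
			for _, ev := range batch {
				next = append(next, s(ev)...)
			}
			batch = next
		}
		for _, ev := range batch {
			out <- ev
		}
	}
	close(out)
}
```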

Architecture

In terms of architecture, as you can possibly guess already, the team used Apache Kafka, the de-facto technology for handling events at a high scale. They have a bunch of critical Kafka clusters that are polled by streaming pipelines written by product teams. These pipelines connect to external systems called sinks where processed data is sent. A sink can be anything - another Kafka topic, a database like ScyllaDB or Aurora, cold storage like S3, etc.

Each pipeline is an isolated deployment. The pipelines are stateless and are supported by a metastore cluster (built on ScyllaDB) that holds their state.
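As a rough illustration of what one such stateless pipeline could look like, here is a hedged Go sketch using the segmentio/kafka-go client (the article does not say which client Grab uses). The broker address, topic name, and Sink interface are placeholders, not the platform's actual wiring.

```go
package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

// Sink abstracts where processed records go; per the article this could be
// another Kafka topic, ScyllaDB, Aurora, or S3. The interface is ours.
type Sink interface {
	Write(ctx context.Context, key, value []byte) error
}

// logSink is a stand-in sink that just logs records.
type logSink struct{}

func (logSink) Write(_ context.Context, key, value []byte) error {
	log.Printf("sink write key=%s value=%s", key, value)
	return nil
}

func main() {
	ctx := context.Background()

	// A consumer group reading from one of the Kafka clusters; broker
	// address and topic are placeholders.
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "example-pipeline",
		Topic:   "booking-events",
	})
	defer reader.Close()

	var sink Sink = logSink{}

	// The pipeline itself stays stateless: read, transform, write to the sink.
	// Any state it needed would live in the metastore, not in this process.
	for {
		msg, err := reader.ReadMessage(ctx)
		if err != nil {
			log.Printf("read error: %v", err)
			break
		}
		if err := sink.Write(ctx, msg.Key, msg.Value); err != nil {
			log.Printf("sink error: %v", err)
		}
	}
}
```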

During my time at Grab, we had our own streaming pipelines in our team. We frequently solved complex problems using the stream processing framework, and the experience was quite interesting.

- A personal anecdote

How Pinterest Built an Ads Indexing System

The Need for Performant Ads Serving

Online advertising requires performance. Think of your feed on Facebook. The ads that you see on the page are served in real time, based on your recent searches, interests, location, age, etc. If an ad takes seconds to load in your feed, you will simply scroll past it, which is something advertisers wouldn’t want.

Being able to serve ads in real time while also being efficient and optimised is an interesting engineering challenge. Every company with an ads business needs to build systems that support this level of efficiency and optimisation.

The core business of Pinterest is ads. As you can imagine, they want their ads serving to be fast and relevant. This is why ads indexing is needed: indexing ads enables lightning-fast and correct lookup of ads based on targeting criteria, which in turn enables optimised ads serving and performance.

So how did Pinterest build an ads indexing system? Let us briefly discuss.

Data Types in Ads

Each ad that you see on Pinterest has two kinds of data:

  • Ads control data - metadata like targeting specifications or bidding criteria.

  • Ads content data - data like image signature, historical performance, or text annotations.

The system has specific SLAs:

  • An update in the ads control data should be reflected in serving within 15 minutes.

  • An update in the ads content data can take hours to become available in serving.

Architecture

Given these two different SLAs, the system was designed as a Lambda architecture: a real-time pipeline that reacts to changes in ads control data, and a batch pipeline that periodically updates and reconciles both ads control and ads content data.

The system is built using several major components:

  • Gateway - a streamer based on Kafka Streams that subscribes to changes in the binlog of the ads database and streams them to the Updater.

  • Updater - a service that receives the changes and ingests them into the Storage Repo.

  • Storage Repo - a columnar key-value store that holds all the data (raw updates, intermediate data, servable documents) and notifies the downstream system, Argus, of these changes.

  • Argus - a service that subscribes to changes in the Storage Repo, runs complex computations to generate servable documents, and stores them back in the Storage Repo.

Apart from the above components, there are batch processing workflows that run periodically to ensure data integrity and consistency.
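To see how the incremental path fits together, here is a toy Go sketch of the Updater → Storage Repo → Argus flow, with in-memory maps and a channel standing in for the real storage and notification machinery. All names, fields, and the "computation" itself are invented for illustration; Pinterest's actual schemas are not published in the article.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// AdChange is a hypothetical binlog change to ads control data.
type AdChange struct {
	AdID      string
	Targeting string
	Bid       float64
}

// storageRepo stands in for the key-value Storage Repo: it keeps raw
// updates and servable documents, and notifies Argus on every write.
type storageRepo struct {
	mu         sync.Mutex
	rawUpdates map[string]AdChange
	documents  map[string]string
	notify     chan string // changed ad IDs, consumed by Argus
}

// Updater role: receive a change from the Gateway and ingest it.
func (s *storageRepo) ingest(c AdChange) {
	s.mu.Lock()
	s.rawUpdates[c.AdID] = c
	s.mu.Unlock()
	s.notify <- c.AdID
}

// Argus role: on notification, run the (here trivial) computation that
// turns a raw update into a servable document and store it back.
func (s *storageRepo) argus(done *sync.WaitGroup) {
	defer done.Done()
	for adID := range s.notify {
		s.mu.Lock()
		c := s.rawUpdates[adID]
		doc := fmt.Sprintf("ad=%s targeting=%s bid=%.2f", c.AdID, strings.ToLower(c.Targeting), c.Bid)
		s.documents[adID] = doc
		s.mu.Unlock()
		fmt.Println("servable document:", doc)
	}
}

func main() {
	repo := &storageRepo{
		rawUpdates: map[string]AdChange{},
		documents:  map[string]string{},
		notify:     make(chan string),
	}
	var done sync.WaitGroup
	done.Add(1)
	go repo.argus(&done)

	// Gateway role: a binlog change streamed from the ads database.
	repo.ingest(AdChange{AdID: "ad-42", Targeting: "US,Cooking", Bid: 1.25})

	close(repo.notify) // no more changes in this toy run
	done.Wait()
}
```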

The achievements of the effort are pretty good:

- Ads control data update-to-serve p90 latency < 60 seconds, 99.9% of the time

- Ads control data update-to-serve max latency < 24 hours, 100% of the time

- Single-digit daily number of dropped messages in the incremental pipeline

Designing APIs with Idempotency

Unreliable Networks

The networks that power the internet are, unfortunately, unreliable (who would have thought, eh?). And in distributed systems, every communication happens over the network. So things will go wrong all the time, as mentioned in the article:

Consider a call between any two nodes. There are a variety of failures that can occur:

- The initial connection could fail as the client tries to connect to a server.

- The call could fail midway while the server is fulfilling the operation, leaving the work in limbo.

- The call could succeed, but the connection break before the server can tell its client about it.

When a failure occurs during a request, what should the client do? In some cases, simply retrying is enough. But in others, retrying can have significant consequences. For example, in payment requests, a double payment might occur if safeguards are not in place.

Idempotency

This is where idempotency comes into play. Sensitive APIs like these have to be idempotent, meaning a request can be sent to the API multiple times, but its effects will occur only once.

When an API is idempotent, clients can safely retry until a success response is received, and handle errors without bothering much about what happened on the server side. This makes life easier for the clients.

Idempotency Keys

In payment systems, idempotency keys are used to ensure the idempotency of a request. Every time a client makes a request to the server, this key is attached to the request. On the server side, the key is stored along with relevant data. Whenever a request arrives, the server checks whether the key has already been seen and processed, or whether it is being seen for the first time, and executes its operation accordingly.
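Here is a minimal Go sketch of that server-side check, with an in-memory map standing in for the persistent key store a real payment system would use; the request and result types are invented for illustration and are not Stripe's API.

```go
package main

import (
	"fmt"
	"sync"
)

// PaymentRequest carries the client-generated idempotency key alongside
// the payload; field names are illustrative only.
type PaymentRequest struct {
	IdempotencyKey string
	AmountCents    int64
}

// server remembers the result recorded for each idempotency key.
// A real system would persist this in a database, typically with an expiry.
type server struct {
	mu      sync.Mutex
	results map[string]string
}

// Charge executes the payment only the first time a key is seen;
// retries with the same key get the stored result back.
func (s *server) Charge(req PaymentRequest) string {
	s.mu.Lock()
	defer s.mu.Unlock()

	if res, ok := s.results[req.IdempotencyKey]; ok {
		return res // already processed: return the recorded outcome, do not charge again
	}

	// First time this key is seen: perform the side effect once and record it.
	res := fmt.Sprintf("charged %d cents", req.AmountCents)
	s.results[req.IdempotencyKey] = res
	return res
}

func main() {
	s := &server{results: map[string]string{}}
	req := PaymentRequest{IdempotencyKey: "key-123", AmountCents: 500}

	// The client retries after a timeout; both calls return the same result
	// and the charge happens only once.
	fmt.Println(s.Charge(req))
	fmt.Println(s.Charge(req))
}
```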

This is the most common strategy used not only by payment systems, but also by other systems where retrying requests without such a safeguard is problematic.

That marks the end of today’s episode! I hope you enjoyed the discussion, but don’t forget to read the articles for yourself. :)

Keep an eye on the horizon. See you soon!
