Metrics at Scale - An Airbnb Case Study | Biweekly Engineering | Episode 22

How Airbnb built Minerva - a high-scale unified metrics platform

Biweekly Engineering
November 08, 2023

Aha, look who’s back!

After a break, your favourite Biweekly Engineering is back with its 22nd episode! Hope my readers are doing absolutely fine, and managed to catch up with the curated blog posts shared in the past episodes.

Today, we will learn about Minerva, Airbnb’s metrics platform at scale. I personally had quite a bit of fun reading through the posts and hopefully, you will, too.

Let’s begin!

Fire in the forest - fall colours in Canada

A Humble Start

How Airbnb Achieved Metric Consistency at Scale

Part-I: Introducing Minerva — Airbnb’s Metric Platform

medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70

Airbnb had a rather humble start in its data journey. All the critical data was placed in a bunch of tables called ‘core_data’. Over time, core_data gained popularity, and more and more teams started to use its tables to generate derived tables and run complex queries and analytics on top of them.

But there were growing pains. The data warehouse became more complex, the same data was generated by multiple teams, fixes upstream didn’t guarantee that downstream jobs were also rerun and fixed, and debugging became increasingly difficult. Moreover, different teams would often report inconsistent values for the same metric.

Airbnb knew they needed to overcome these challenges.

Enter Minerva

In a multi-year journey, Airbnb built Minerva, its in-house metrics platform. The core_data tables were rebuilt from scratch with leaner, normalised schemas. Minerva was meant to be built on top of these tables with three specific goals:

Programmatic joins of tables to build analytics-friendly insights
Reliable data backfills when business logic is changed (also when some bugs or issues are fixed)
Consistent and correct generation of derived data

“Minerva takes fact and dimension tables as inputs, performs data denormalization, and serves the aggregated data to downstream applications.”

Where Minerva stands at a high level

According to the article, which was published in 2021, Minerva was supporting more than 12,000 metrics, 4,000 dimensions, and 200 data producers.
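
To make the fact-and-dimension idea a bit more concrete, here is a tiny sketch in plain Python. It is not Minerva's code: the table contents and the helper names (bookings, listings, denormalize, aggregate) are made up for illustration. It only shows the general shape of joining a fact table with a dimension table and then aggregating a metric by a dimension.

```python
from collections import defaultdict

# Hypothetical fact table: one row per booking.
bookings = [
    {"listing_id": 1, "nights": 3},
    {"listing_id": 2, "nights": 2},
    {"listing_id": 1, "nights": 4},
    {"listing_id": 3, "nights": 1},
]

# Hypothetical dimension table: listing attributes keyed by listing_id.
listings = {
    1: {"country": "CA"},
    2: {"country": "FR"},
    3: {"country": "CA"},
}


def denormalize(facts, dimension, key):
    """Join every fact row with the dimension attributes it points to."""
    return [{**row, **dimension[row[key]]} for row in facts]


def aggregate(rows, metric, by):
    """Sum a metric column, grouped by one dimension column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[metric]
    return dict(totals)


wide_rows = denormalize(bookings, listings, "listing_id")
nights_by_country = aggregate(wide_rows, "nights", "country")
print(nights_by_country)
```

A real platform does this over warehouse tables rather than in-memory lists, but the join-then-aggregate shape is the same.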

Data Production

Data production in Minerva

Minerva produces data through the stages shown above. Metrics are defined, validated, and reviewed. Then the DAG that produces the data is generated. The Minerva runtime executes the DAG with support for self-healing, data checks, monitoring, and batch processing. After that, the Minerva API serves the derived data to many different downstream consumers.

💡Quick Insight

The term DAG stands for Directed Acyclic Graph. We have all read about DAGs in our algorithms course. But what does it mean in the data engineering world? If you are unsure, check this article to understand how they are relevant.
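
If you prefer code to definitions, here is a small sketch using graphlib from the Python standard library (Python 3.9+). The task names are hypothetical and have nothing to do with Minerva's real jobs. The point is simply that a pipeline is a DAG: each task lists the tasks it depends on, and a topological sort gives an order in which everything can safely run.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "bookings_fact": set(),
    "listings_dim": set(),
    "joined_bookings": {"bookings_fact", "listings_dim"},
    "nights_booked_metric": {"joined_bookings"},
}


def run_order(deps):
    """Return one valid execution order for the tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_valid_dag(deps):
    """A pipeline with a cycle can never finish, so it is rejected."""
    try:
        run_order(deps)
        return True
    except CycleError:
        return False


order = run_order(dependencies)
print(order)
```

The exact order of independent tasks may vary, but a task always comes after everything it depends on.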

Apart from the above, Minerva also supports backfills to ensure changes in business logic or issues in data are handled. In my personal experience, having a seamless and robust backfilling process is critical to a successful data platform at scale.
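
One small piece of any backfill process is figuring out which date partitions need to be produced in the first place. Here is a minimal sketch of that step. The function name and the sample dates are my own assumptions, not something taken from the Airbnb articles.

```python
from datetime import date, timedelta


def missing_partitions(start, end, existing):
    """Return the dates in [start, end] that have no partition yet, oldest first."""
    total_days = (end - start).days + 1
    expected = (start + timedelta(days=offset) for offset in range(total_days))
    return [day for day in expected if day not in existing]


# Hypothetical state: partitions for 1 and 3 November already exist.
existing = {date(2023, 11, 1), date(2023, 11, 3)}
to_backfill = missing_partitions(date(2023, 11, 1), date(2023, 11, 4), existing)
print(to_backfill)
```

A scheduler could then queue one job per missing date.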

Lastly, Minerva supports cost attribution to keep the cost of producing data in check, and a data retention mechanism to clean up obsolete data when required.

Data Consumption

In Minerva, the same metric is produced once and consumed everywhere. This was one of the motivations behind building Minerva. As the article mentions, in the past, different teams would derive the same metrics using slightly different tables which would lead to inconsistency. Minerva enables consistent data consumption.

Apart from the features discussed above, Minerva offers more:

Data catalog - to easily search through available metrics and related metadata.
Data exploration - to visually gather insights from data.
A/B testing - to support consistent metrics in A/B tests.
Executive reporting - to provide critical insights to the leadership.
Data analysis - to enable data analysts to run analyses programmatically through notebooks.

Design Principles of Minerva

How Airbnb Standardized Metric Computation at Scale

Part II: The six design principles of Minerva compute infrastructure

medium.com/airbnb-engineering/airbnb-metric-computation-with-minerva-part-2-9afe6695b486

Briefly, the development of Minerva followed a set of core principles:

Data standardisation
- Minerva standardised the sources and consumption of data.
- Instead of thinking in terms of tables, Minerva was built around metrics and dimensions.
- Data producers have to provide self-sufficient metadata while configuring and setting up metrics in the platform.
Declarative design
- Minerva popularised the idea of what instead of how. Producers define what they need to produce but they don’t care about how the data is produced.
Scalability
- Minerva is highly scalable, both computationally and operationally.
- It reuses existing data to avoid wastage of resources.
- The platform was designed following the DRY (Don’t Repeat Yourself) principle.
- It can self-heal: missing data is backfilled automatically whenever required.
- Minerva can also trigger alerts to responsible teams for any kind of serious platform or data issues.
Consistency
- Minerva ensures data consistency by automatically triggering backfills when there are changes in the configurations or dimension sources.
Availability
- Minerva ensures availability through the use of a staging environment for data backfills.
Testability
- Minerva provides testability to the users through its prototyping tool to ensure data correctness.
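
To see how the declarative and consistency principles can fit together, consider the sketch below. It is only an illustration under my own assumptions: MetricConfig, its fields, and the choice of which fields count as data-affecting are all hypothetical. The idea is that a producer declares what a metric is, and the platform fingerprints the parts of that declaration that affect the data, so a changed fingerprint signals that a backfill is needed.

```python
import hashlib
import json
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MetricConfig:
    """Hypothetical declarative metric definition: the what, not the how."""
    name: str
    source_table: str
    expression: str
    dimensions: tuple
    owner: str  # descriptive only, does not change the produced data


# Assumption: only these fields influence the data that gets produced.
DATA_FIELDS = ("source_table", "expression", "dimensions")


def fingerprint(config):
    """Hash only the fields that affect the produced data."""
    payload = {field: getattr(config, field) for field in DATA_FIELDS}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def needs_backfill(old, new):
    """A changed fingerprint means existing data is stale."""
    return fingerprint(old) != fingerprint(new)


v1 = MetricConfig("nights_booked", "bookings_fact", "SUM(nights)", ("country",), "team-a")
v2 = replace(v1, owner="team-b")                        # metadata change only
v3 = replace(v1, dimensions=("country", "device"))      # changes the data

print(needs_backfill(v1, v2), needs_backfill(v1, v3))
```

Changing the owner leaves the fingerprint alone, while adding a dimension changes it and would trigger a backfill in this toy setup.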

Minerva API Architecture

How Airbnb Enables Consistent Data Consumption at Scale

Part-III: Building a coherent consumption experience

medium.com/airbnb-engineering/how-airbnb-enables-consistent-data-consumption-at-scale-1c0b6a8b9206

As we have already mentioned, Minerva was designed to relieve its users of the burden of how and where, so they can think only about what. It also supports seamless integration with downstream applications.

This is why Minerva API was designed.

The above high-level architecture has two major components:

Metadata fetcher - to manage data source metadata.
API server - to serve data to the clients through a split-apply-combine strategy.
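
Split-apply-combine is easier to grasp with a toy example. The sketch below is not the Minerva API: the request shape, the SOURCES lookup, and the function names are all assumptions made for illustration. A request for several metrics is split into one sub-query per metric, each sub-query is applied to its own source, and the partial results are combined into a single response keyed by the dimension value.

```python
# Hypothetical pre-aggregated sources, one per metric, keyed by dimension value.
SOURCES = {
    "nights_booked": {"CA": 8, "FR": 2},
    "bookings": {"CA": 3, "FR": 1},
}


def split(request):
    """Break a multi-metric request into one sub-query per metric."""
    return [
        {"metric": metric, "dimension": request["dimension"]}
        for metric in request["metrics"]
    ]


def apply_query(sub_query):
    """Run a single sub-query; here it is just a dictionary lookup."""
    return sub_query["metric"], SOURCES[sub_query["metric"]]


def combine(results):
    """Merge per-metric results into one row per dimension value."""
    combined = {}
    for metric, values in results:
        for dimension_value, value in values.items():
            combined.setdefault(dimension_value, {})[metric] = value
    return combined


def serve(request):
    return combine(apply_query(sub_query) for sub_query in split(request))


response = serve({"metrics": ["nights_booked", "bookings"], "dimension": "country"})
print(response)
```

Because the sub-queries are independent, a real server could run them in parallel before combining.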

Lastly, as Part III shows in detail, the Minerva API integrates with many different downstream clients (like Apache Superset) and provides a consistent view of the data.

And that’s a wrap! If you feel curious to know more about Minerva, don’t miss the details from the articles.

Till the next episode, goodbye!
