
Avoiding Coupling in Microservice Architecture | Biweekly Engineering - Episode 23

How to avoid coupling in microservices with Capital One Tech | Introduction to Apache Kafka | A million push notifications at Gojek

Hello hello my favourite subscribers! Welcome again to the 23rd episode of Biweekly Engineering - the newsletter that brings you curated software engineering articles from the internet!

Today, we have

  • How to avoid coupling in microservices from the Capital One Tech blog

  • Back to basics - introduction to Apache Kafka

Let’s fly! 🚀

A rainy day in Prague, Czechia

Avoiding Coupling in Microservices

When a monolithic system is transitioning to microservices, it’s common to end up building a distributed monolith. This becomes a nightmare for the system owners, because now you have all the disadvantages of a monolithic system combined with all the complications of the microservices world! 😬

One of the fundamental principles of building a successful and meaningful microservice architecture is avoiding strong coupling between services.

It’s not always possible, but do it where it fits.

In this well-written article, the author discusses some basic strategies that avoid creating strong coupling in a microservice architecture. Let’s summarise:

  • No database sharing - A database shouldn’t be shared between services, even where multiple services need to retrieve the same data. Such a database should be owned by the most relevant service, which should expose APIs for other services to consume.

  • No code sharing - Code should not be shared between microservices; rather, each service should have its own copy of the code in the form of a library. For example, if a service owns an object definition that is directly used by another service, a change to the object could break the other service. This should be avoided by having libraries that are consumed separately by each service.

  • Asynchronous communication or safety mechanisms - In places where synchronous communication is not necessary, prefer asynchronous communication. Where that’s not possible, have mechanisms to make sure a caller doesn’t get stuck or wait too long for a response from a callee.

  • No test environment sharing - Test environments should not be shared between services. If two services depend on the same test environment to run their own tests, they can bring the environment down.

  • Avoiding downstream calls in integration tests - During integration tests, try to avoid calling downstream services. This dependency can cause integration tests to fail when those services are down - something we surely don’t want. An ideal approach is to spin up those downstream dependencies as part of the testing process.

  • No excess data sharing - A service should share only the data that is absolutely needed by the clients, instead of sharing excessive data that might get passed around to other services. Sometimes, this can also create security concerns when sensitive data is shared without a real use-case.
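To make the safety-mechanism point above concrete, here is a minimal sketch (service names and return shapes are hypothetical, not from the article) of guarding a synchronous call with a timeout so the caller degrades gracefully instead of waiting indefinitely:

```python
import concurrent.futures
import time

def call_with_timeout(fn, timeout_s, fallback):
    """Run a synchronous downstream call, but never let the caller
    wait longer than timeout_s; return a fallback response instead."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn).result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The callee is slow or stuck; degrade gracefully.
        return fallback
    finally:
        pool.shutdown(wait=False)

# A fast callee returns normally; a stuck one triggers the fallback.
fast = call_with_timeout(lambda: {"in_stock": True}, 1.0, None)
slow = call_with_timeout(lambda: time.sleep(0.5) or {"in_stock": True},
                         0.05, {"in_stock": None, "degraded": True})
```

In production you’d typically reach for a circuit breaker or your HTTP client’s built-in timeout, but the principle is the same: the caller’s fate is never tied to the callee’s responsiveness.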

Note that every system has its unique constraints. It is not always possible to remain true to these principles to the core, and it is fairly common to see deviation from them. As system owners, we need to make sure the deviation is happening for a good reason, not because of a poor design choice.

Introduction to Apache Kafka

Back to basics - here is a brief but solid introduction to Apache Kafka from one of my favourite newsletters - 2 Minute Streaming!

So what is Apache Kafka?

At its core, it’s a distributed commit log.

A log is a simple and efficient data structure. You just append new data to the end of the log, and existing entries cannot be mutated. This is what Kafka leverages to support queuing messages at high scale.
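The append-only idea can be sketched in a few lines (a toy illustration, not Kafka’s actual storage format or API):

```python
class AppendOnlyLog:
    """Toy append-only log: records go to the end and are never mutated."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Appending returns the record's offset: its fixed position in the log.
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset):
        # Reads never change the log; each consumer tracks its own offset.
        return self._records[offset]

log = AppendOnlyLog()
first = log.append("event-a")   # offset 0
second = log.append("event-b")  # offset 1
```

Because appends are sequential and reads are by offset, both operations are cheap - which is a big part of why the log scales so well.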

The Apache Kafka ecosystem has some core concepts, briefly described below:

  • Producers produce data to Kafka and consumers consume data. This is called a publisher-subscriber (often affectionately called pub-sub) architecture.

  • Producers produce data to a topic and consumers consume by subscribing to the topic.

  • Each topic is partitioned to facilitate scalability.

  • Partitions are replicated.

  • Brokers receive data from producers and serve data to the consumers. They are responsible for storing the partitions and their replicas.

Since Kafka is a distributed system, it needs coordination among the broker nodes. The coordination is done by the controller brokers. A controller broker acts as the source of truth for metadata and handles broker failures.
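To make the partitioning concept concrete: producers map each keyed message to a partition deterministically, so all messages with the same key land in the same partition and stay ordered. Kafka’s default partitioner hashes the key with murmur2; the sketch below uses CRC32 purely to show the idea:

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Hash the message key, then take it modulo the partition count.
    # (Kafka's default partitioner uses murmur2; CRC32 here is illustrative.)
    return zlib.crc32(key) % num_partitions

# The same key always maps to the same partition,
# which is what preserves per-key ordering.
p1 = partition_for(b"user-42", 3)
p2 = partition_for(b"user-42", 3)
```

Messages without a key are instead spread across partitions (round-robin or sticky batching, depending on the client version), trading ordering for throughput.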

💡 Did you know?

Kafka was built at LinkedIn, which open-sourced it in 2011.

In many ways, Apache Kafka changed the world. Numerous companies all over the world use Kafka to build their systems at scale. Indeed, a fine example of a successful technology!

This was the first-ever post published on the LinkedIn Engineering blog:

Aaand we are done for today! I wanted to keep the episode shorter to provide more room for you to actually take your time to read the articles.

Until next episode, ciao!
