SDC#12 - From Monolithic to Microservices
Architecture of Uber's Schemaless Database and More...
Hello, this is Saurabh…👋
Welcome to the 116 new subscribers who have joined us since last week.
If you aren’t subscribed yet, join 1200+ curious developers looking to expand their knowledge by subscribing to this newsletter.
In this edition, I cover the following topics:
🖥 System Design Concept → Moving From Monolithic to Microservices
🧰 Case Study → Architecture of Uber’s Schemaless Database
🍔 Food For Thought → 24 Principles for System Design Interviews
So, let’s dive in.
🖥 Moving from Monolithic to Microservices
All right, Microservices are bad! (Don’t be angry without reading the full post)
They’ve been getting a lot of hate lately anyway.
But does that make plain-old Monolithic systems the epitome of good?
Not really!
At the very least, you should evolve your system to a Modular Monolith.
The Modular Monolith is a pattern that combines the benefits of modular design with the simplicity of a monolithic architecture.
Some advantages of a Modular Monolith are:
Loosely-coupled modules
Well-defined boundaries
Explicit dependencies
Here’s what it looks like:
Basically, the Modular Monolith architecture divides our application into neat little boxes known as modules. These modules are largely independent of each other and isolated in terms of impact.
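To make the idea a bit more concrete, here is a minimal Python sketch of two modules living in one deployable application. The module names (`InventoryModule`, `OrdersModule`), the `InventoryApi` interface, and `build_app` are all hypothetical and only meant to illustrate well-defined boundaries and explicit dependencies.

```python
from dataclasses import dataclass
from typing import Protocol


class InventoryApi(Protocol):
    """Public boundary of the (hypothetical) inventory module."""

    def reserve(self, sku: str, quantity: int) -> bool: ...


class InventoryModule:
    """Keeps its stock data private; other modules only see InventoryApi."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)

    def reserve(self, sku: str, quantity: int) -> bool:
        if self._stock.get(sku, 0) < quantity:
            return False
        self._stock[sku] -= quantity
        return True


@dataclass
class OrdersModule:
    """Declares its dependency explicitly instead of reaching into inventory data."""

    inventory: InventoryApi

    def place_order(self, sku: str, quantity: int) -> str:
        return "confirmed" if self.inventory.reserve(sku, quantity) else "rejected"


def build_app() -> OrdersModule:
    # One process wires the modules together: one build, one deployment.
    inventory = InventoryModule({"book": 2})
    return OrdersModule(inventory=inventory)
```

Because the orders module only talks to the `InventoryApi` interface, the inventory implementation could later be swapped for a remote call without touching the ordering code.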
“How do we deploy stuff?” - you may ask.
Don’t worry.
You will still build and deploy a single application.
But you make the development process more efficient.
And you certainly make maintenance a lot easier.
In other words, Modular Monoliths have the potential to provide many of the advantages of microservices without the problems that come with them.
However, that’s not all.
With a Modular Monolith, you also have the opportunity to evolve your system into a Vertical Slice architecture.
So, instead of horizontal logical layers, your code is now split into vertical slices of business functionality.
Here’s what it looks like:
When you add or change a feature in a vertically-sliced application, you can scope the changes to the area of business concern.
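As a rough illustration, a vertical slice can keep the input shape, the business rule, and the data access for one feature in a single place. The "cancel order" feature and every name in this Python sketch are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CancelOrderRequest:
    order_id: str


@dataclass
class CancelOrderSlice:
    """Everything the hypothetical 'cancel order' feature needs lives here:
    the request shape, the business rule, and the data access."""

    orders: dict[str, str] = field(default_factory=dict)  # order_id -> status

    def handle(self, request: CancelOrderRequest) -> str:
        status = self.orders.get(request.order_id)
        if status is None:
            return "not_found"
        if status == "shipped":
            # Business rule owned by this slice: shipped orders stay as they are.
            return "too_late"
        self.orders[request.order_id] = "cancelled"
        return "cancelled"
```

A change to the cancellation rule touches only this slice, rather than a controller layer, a service layer, and a repository layer.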
And guess what?
The vertically-sliced modules can be split off into potential microservices over a period of time. And in doing so, you’ll have learned a thing or two about your domain and the best way to split your system into functionalities.
So, what’s the big takeaway from all of this?
No need to hate microservices
No need to hate monoliths
Evolve your architecture as your application needs to evolve.
Choose the right tool for the job and be happy.
🧰 Architecture of Uber’s Schemaless Database
A couple of editions ago, I spoke about why Uber ditched their monolithic Postgres and moved to an in-house database known as Schemaless.
You can check out that post over here.
Time to give a sneak peek into the overall architecture of the Schemaless Database and what we can learn from it.
As per Uber’s claim, Schemaless is a scalable and fault-tolerant datastore.
It has two types of nodes:
Worker Nodes
Storage Nodes
The job of the worker nodes is to receive requests from the clients, fan out those requests to the storage nodes, and aggregate the results.
The storage nodes store the actual data. The data itself is divided into a fixed number of shards (typically 4096).
Here’s what the high-level arrangement looks like:
“Why the need for two separate node types?” - you may ask.
It’s because having separate nodes allows the team to scale each part independently.
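To make the division of labour concrete, here is a rough Python sketch of a worker that maps row keys onto 4096 shards, fans a batch read out to the storage nodes, and merges the results. The SHA-256 hash, the modulo-based shard-to-node assignment, and all class and function names are my own assumptions for illustration; the source article does not describe these details.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4096  # the fixed shard count mentioned above


def shard_for(row_key: str) -> int:
    # Assumption: a stable hash of the key, modulo the shard count.
    digest = hashlib.sha256(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class StorageNode:
    """Holds the actual rows for the shards assigned to it."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}

    def get_many(self, keys: list[str]) -> dict[str, str]:
        return {key: self.rows[key] for key in keys if key in self.rows}


class WorkerNode:
    """Stores no data itself; it only routes requests and aggregates results."""

    def __init__(self, nodes: list[StorageNode]) -> None:
        self.nodes = nodes

    def node_index(self, row_key: str) -> int:
        # Hypothetical static shard -> storage node assignment.
        return shard_for(row_key) % len(self.nodes)

    def put(self, row_key: str, value: str) -> None:
        self.nodes[self.node_index(row_key)].rows[row_key] = value

    def get_many(self, keys: list[str]) -> dict[str, str]:
        # Fan out: group the keys by the storage node that owns them.
        by_node: dict[int, list[str]] = defaultdict(list)
        for key in keys:
            by_node[self.node_index(key)].append(key)
        # Aggregate: merge the partial results into one response.
        result: dict[str, str] = {}
        for index, node_keys in by_node.items():
            result.update(self.nodes[index].get_many(node_keys))
        return result
```

Since the worker holds no state, more workers can be added to handle request volume without moving any data, and more storage nodes can be added without changing the client-facing tier.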
Each shard is mapped to a particular storage node. Also, each shard is replicated to a number of storage nodes. The replication factor can be controlled via configuration.
The typical replication factor is 3: one primary and two replicas. The replica nodes are distributed across multiple data centers to maintain data redundancy in case of an outage.
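Here is a small Python sketch of how a shard could be assigned one primary and two replicas. The placement policy (one copy per data center, chosen round-robin by shard id) and the names `Node`, `ReplicaSet`, and `place_shard` are assumptions for illustration, not Uber's actual placement logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    datacenter: str


@dataclass(frozen=True)
class ReplicaSet:
    primary: Node
    replicas: tuple[Node, ...]


def place_shard(
    shard_id: int,
    nodes_by_dc: dict[str, list[Node]],
    replication_factor: int = 3,  # configurable, as described above
) -> ReplicaSet:
    datacenters = sorted(nodes_by_dc)
    if replication_factor > len(datacenters):
        # Simplification of this sketch: every copy gets its own data center.
        raise ValueError("this sketch needs one data center per copy")
    chosen = []
    for offset in range(replication_factor):
        dc = datacenters[(shard_id + offset) % len(datacenters)]
        candidates = nodes_by_dc[dc]
        chosen.append(candidates[shard_id % len(candidates)])
    # The first copy acts as the primary, the rest as replicas.
    return ReplicaSet(primary=chosen[0], replicas=tuple(chosen[1:]))
```

With three data centers, losing any one of them still leaves two copies of every shard reachable.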
Here’s what a possible arrangement can look like:
Let’s now understand what happens in the case of read and write requests to the Schemaless DB.
Read Requests
When a read request is made to the Schemaless database, the worker node can read the data from any storage node.
The client can configure the request to be read from the primary or the replica nodes. By default, the primary is chosen. This guarantees read-after-write consistency.
Basically, it’s a guarantee that if the client makes some updates, they will always see their own updates.
We spoke about this topic in great detail in an earlier post about the problems caused by replication lag.
If the primary node itself is down, read requests fail over to the replica nodes. There’s a chance that the replicas may have stale data. But Uber typically sees subsecond replication latency, so this issue isn’t a big deal for them.
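The read path described above can be sketched in a few lines of Python. The `prefer_replica` flag, the `NodeDown` exception, and the fallback order are hypothetical; they only illustrate "primary by default, fail over to a replica when the primary is down".

```python
class NodeDown(Exception):
    """Raised when a storage node cannot serve the request."""


class StorageNode:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.cells = {}

    def read(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.cells.get(key)


def read(key, primary, replicas, prefer_replica=False):
    """Return (value, name_of_node_that_served_it)."""
    # Default: try the primary first, which gives read-after-write consistency.
    order = [*replicas, primary] if prefer_replica else [primary, *replicas]
    for node in order:
        try:
            return node.read(key), node.name
        except NodeDown:
            continue  # fail over to the next copy; it may be slightly stale
    raise NodeDown("no copy of the shard is reachable")
```

If a replica has not yet received the latest write, a failed-over read returns the older value, which is exactly the staleness risk mentioned above.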
Write Requests
Writes are more interesting.
A replica going down doesn’t impact the writes because all the writes go to the primary node.
But if the primary is down, Schemaless still accepts write requests. However, the data is persisted to disk on another randomly chosen primary node.
The trade-off here is that subsequent read requests cannot see these writes until the original primary is back up or a replica node is promoted to primary.
This approach is also known as Buffered Writes. More on that in another post.
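Until then, here is a simplified Python sketch of the idea. The `Primary` class, the `write` and `replay_buffered` functions, and the way buffered writes are handed back to the recovered primary are all assumptions made for illustration; the source only says that the write is persisted on another randomly chosen primary.

```python
import random


class Primary:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.cells = {}     # data for the shards this primary owns
        self.buffered = []  # (owner_name, key, value) held on behalf of others


def write(key, value, owner, other_primaries, rng=random):
    """Return the name of the node that persisted the write."""
    if owner.up:
        owner.cells[key] = value
        return owner.name
    # The owning primary is down: persist on another randomly chosen primary.
    candidates = [node for node in other_primaries if node.up]
    if not candidates:
        raise RuntimeError("no primary available to buffer the write")
    fallback = rng.choice(candidates)
    fallback.buffered.append((owner.name, key, value))
    return fallback.name


def replay_buffered(owner, other_primaries):
    """Hand buffered writes back to the owner, in the order each node received them."""
    for node in other_primaries:
        remaining = []
        for owner_name, key, value in node.buffered:
            if owner_name == owner.name:
                owner.cells[key] = value
            else:
                remaining.append((owner_name, key, value))
        node.buffered = remaining
```

The write is durable as soon as the fallback node accepts it, but it only becomes readable from the shard once `replay_buffered` has run.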
At face value, relying on a single node for writes may sound problematic.
But Uber prioritized a total order on writes to each shard. Because every write to a shard goes through one node, every replica receives the same writes in the same order, so data read from any replica reflects that same order.
In other words, staying with a single node for write requests safeguards them from behavior such as going backward in time.
P.S. This post is inspired by the explanation provided on the Uber Engineering Blog. However, the diagrams have been drawn or re-drawn based on the information shared to make things clearer. You can find the original article over here.
🍔 Food For Thought
👉 24 Principles for System Design Interviews
System Design Interviews can be tricky to navigate.
You don’t get days to think about a system design.
Often, you are expected to come up with a reasonably good answer within an hour or so while having a constant discussion with the panel.
To make a candidate’s life easier, I’ve compiled a list of 24 general principles that can help you provide better answers.
I posted it on X (Twitter), and so far it has 550+ likes and 80+ reposts. Do check it out below👇
Here’s the link to the post:
https://x.com/ProgressiveCod2/status/1713443724473295181?s=20
👉 How do you deal with daily standup meetings that keep going on forever?
I asked this question on X (formerly Twitter) and got some really interesting answers.
Seems like it’s a common problem.
Do check it out👇
Link to the tweet:
https://x.com/ProgressiveCod2/status/1705178338514313397?s=20
That’s it for today! ☀️
Enjoyed this issue of the newsletter?
Share with your friends and colleagues
See you later with another value-packed edition — Saurabh.