SDC#24 - How DNS Works?

The Interview Question You Don't Want to Get Wrong...

Jan 16, 2024

Hello, this is Saurabh…👋

Welcome to the 337 new subscribers who have joined us since last week.

If you aren’t subscribed yet, join 3900+ curious Software Engineers looking to expand their system design knowledge by subscribing to this newsletter.

In this edition, I cover the below topics

🖥 System Design Concept → How does DNS Resolution Work?

🍔 Food For Thought → The Impact of AI on UI and APIs

So, let’s dive in.

🖥 What is DNS?

DNS stands for Domain Name System and acts as the telephone directory of the Internet.

As you know, the Internet is a collection of machines and each machine is identified by a numerical address. This address is known as the IP address of the machine and is a way for others to contact this machine.

But human beings are not good at remembering numbers. Or, to put it another way, human beings are better at remembering names.

DNS eliminates the need for humans to keep track of the IP addresses of all their favorite websites. It’s a distributed, hierarchical system of servers that maps easily remembered names (such as Google[dot]com or Amazon[dot]com) to IP addresses.

This process of converting a hostname into an IP address is called DNS resolution. Whenever you open a website on your browser, the DNS resolution process runs in the background.
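To make this concrete, here is a minimal Python sketch that asks the operating system's resolver to turn a hostname into IP addresses. It relies on the standard library's socket.getaddrinfo; the helper name resolve_addresses is my own and only for illustration.

```python
import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the unique IP addresses the OS resolver reports for hostname."""
    # getaddrinfo hands the lookup to the operating system, which consults
    # its own cache and configured DNS resolver on our behalf.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example (needs network access):
# print(resolve_addresses("www.google.com"))
```

The addresses you get back depend on your network and location, so treat any output as specific to your machine.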

But what exactly happens in the DNS resolution process?

To understand the whole picture, it’s important to first learn about the different types of DNS Servers involved in the process.

DNS Server Types

There are 4 types of DNS servers:

DNS Recursor
Root Nameserver
TLD Nameserver
Authoritative Nameserver

The Domain Name System works in an inverted tree structure. At the top of the tree is the root name server that is followed by TLDs and TLDs are followed by Authoritative Name Servers.

You can play around with the diagram on Eraser.io

Let’s look at each type in more detail:

1 - DNS Recursor

The DNS recursor is a server that receives queries from client machines through applications such as web browsers.

Think of the recursor as a librarian whose job is to find a particular book somewhere in the library. The librarian might go from section to section searching for the book. However, for the person who requests the book, the process of finding the book is completely opaque.

In a similar manner, the DNS recursor goes across the various DNS servers and finds the IP address. If needed, it also makes additional requests to other DNS servers to satisfy the DNS query.

Here’s what the overall sequence looks like:

2 - Root Nameserver

The root name server is the first step in resolving the host names to IP addresses.

In a domain name, the root is represented by the hidden trailing “.” at the end of the domain name. Typing this extra “.” is not necessary because it is implied whenever the name is resolved.

At a high level, DNS administration is structured as a hierarchy of managed areas called zones. The root zone sits at the very top of that hierarchy and the root name servers serve that zone.
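One way to picture the hierarchy is to walk a hostname from the root down, one label at a time. The sketch below does exactly that; the function name name_hierarchy is hypothetical, and it only manipulates strings without making any DNS queries.

```python
def name_hierarchy(hostname: str) -> list[str]:
    """List the names from the root down to the full hostname."""
    labels = hostname.rstrip(".").split(".")
    path = ["."]  # the root, i.e. the hidden trailing dot
    # Walk the labels right to left: com. -> google.com. -> www.google.com.
    for i in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[i:]) + ".")
    return path


print(name_hierarchy("www.google.com"))
# ['.', 'com.', 'google.com.', 'www.google.com.']
```

Reading the list left to right mirrors the order in which a resolver with an empty cache works its way down the tree.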

There are 13 DNS root servers as shown below.

However, it’s NOT entirely correct that there are just 13 physical root servers in the world.

Imagine the utter chaos that would cause on the Internet.

This was a reality during the early days of the Internet, when there was just one server for each of the 13 IP addresses shown in the table. However, today each of the 13 IP addresses is backed by many servers, and Anycast routing directs each request to a nearby one, which also spreads the load.

“Who operates these DNS Root Servers?” - you may ask.

The Internet Corporation for Assigned Names and Numbers (ICANN) operates servers for one of the 13 IP addresses. Others are operated by various important organizations like NASA, University of Maryland, Verisign and so on.

You can refer to the complete list in the above table.

“But if root nameservers are at the top of the hierarchy, how do resolvers find them?” - you may ask

Well - it almost sounds like a chicken-and-egg problem.

Since root name servers are the top of the DNS hierarchy, recursive resolvers can’t find them in a DNS lookup.

To deal with this, every DNS resolver has the list of the 13 root server IP addresses built into its software. Whenever a DNS lookup is initiated, the recursive resolver communicates with one of those 13 IP addresses.
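Here is a rough sketch of what that built-in list (often called "root hints") could look like inside a resolver. The table is deliberately abbreviated to three of the 13 addresses, and pick_root_server is a hypothetical helper. Real resolvers usually choose more cleverly, for example by preferring the server that has responded fastest.

```python
import random

# Abbreviated root hints: three of the 13 root server addresses.
ROOT_HINTS = {
    "a.root-servers.net.": "198.41.0.4",
    "f.root-servers.net.": "192.5.5.241",
    "k.root-servers.net.": "193.0.14.129",
}


def pick_root_server(hints=ROOT_HINTS, rng=random):
    """Pick one root server at random (an assumption made for simplicity)."""
    name = rng.choice(sorted(hints))
    return name, hints[name]


name, ip = pick_root_server()
print(f"Starting the lookup at {name} ({ip})")
```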

3 - TLD Nameserver

The next step in the search for an IP address takes the DNS resolution process to the TLD nameserver.

TLD server stands for Top-Level Domain server. You can think of it as a specific section inside a library that holds books of a particular type.

In the context of DNS, the TLD server hosts the last portion of the hostname. In other words, the TLD server is responsible for the “com” in amazon[dot]com.

Basically, a TLD nameserver maintains information for all the domain names that share a common domain extension such as .com, .net or anything else.

“Who manages the TLD nameservers?” - you may ask.

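The TLDs are coordinated by the Internet Assigned Numbers Authority (IANA), a function of ICANN that manages the root zone. The nameservers for each TLD are run by that TLD’s registry operator (Verisign, for example, operates .com and .net). The IANA also breaks up the TLDs into two main groups: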

Generic top-level domains such as .com, .org, .net and so on.
Country code top-level domains such as .us, .in, .uk and so on.
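As a quick illustration of the two groups, the sketch below classifies a hostname by its last label. It leans on one rule of thumb: two-letter TLDs are reserved for country codes. The function name tld_group is hypothetical, and the sketch ignores the smaller categories (such as internationalized country-code TLDs), so treat it as a simplification.

```python
def tld_group(hostname: str) -> str:
    """Classify a hostname's TLD as 'country-code' or 'generic'."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    # Two-letter ASCII TLDs are country codes (.us, .in, .uk, ...).
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "country-code"
    # Everything else is treated as generic in this simplified view.
    return "generic"


print(tld_group("amazon.com"))  # generic
print(tld_group("bbc.co.uk"))   # country-code
```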

4 - Authoritative Nameservers

The final piece of the DNS resolution process is fulfilled by the Authoritative Nameservers.

When the Recursive Resolver receives a response from a TLD nameserver, that response directs the resolver to an authoritative nameserver.

The authoritative nameserver contains information specific to the domain name it serves (for example, amazon[dot]com).

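Basically, the authoritative nameserver stores the IP address of the server in a DNS A record. If the domain has a CNAME record instead, the nameserver returns the alias domain, and the resolver then has to look up that alias to get the IP address.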

In other words, if the authoritative name server has access to the record, it will return the IP address back to the DNS recursive resolver.
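The A record versus CNAME behaviour can be sketched with a tiny in-memory zone. Everything here is made up for illustration: the ZONE table, the lookup helper, and the names and addresses, which come from the ranges reserved for documentation (example.com and 192.0.2.0/24).

```python
# Hypothetical zone data using documentation-only names and addresses.
ZONE = {
    ("example.com.", "A"): "192.0.2.10",
    ("www.example.com.", "CNAME"): "example.com.",
}


def lookup(name: str, zone=ZONE, max_hops: int = 8) -> str:
    """Follow CNAME aliases until an A record is found."""
    for _ in range(max_hops):
        if (name, "A") in zone:
            return zone[(name, "A")]
        if (name, "CNAME") in zone:
            # The name is an alias: restart the lookup with its target.
            name = zone[(name, "CNAME")]
            continue
        raise LookupError(f"no A or CNAME record for {name}")
    raise LookupError("CNAME chain too long")


print(lookup("www.example.com."))  # 192.0.2.10
```

The max_hops limit is a design choice in this sketch to guard against alias loops.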

Types of DNS Queries

There are three main types of DNS queries:

Recursive
Iterative
Non-recursive

Let’s understand each of them.

1 - Recursive Query

In this type of query, a DNS client expects that a DNS server (typically a DNS recursive resolver) will find the IP address and get the job done. If the server isn’t able to find any record, the server will return an error message.

The client doesn’t care how many queries the server has made to find the result whether positive or negative.

2 - Iterative Query

In the iterative query approach, the DNS client doesn’t mind if the DNS server just returns the best answer it possibly can.

If the DNS server that gets the query doesn’t have a match for the domain name, it returns a referral to a DNS server that’s authoritative for a lower level of the namespace.

It falls back on the DNS client to make another query to the referral address and the process continues with additional DNS servers that fall within the query chain.

3 - Non-recursive Query

The third type of query comes into the picture when a DNS client queries a DNS server for a record it has access to. This can happen because of two main reasons:

The server may be the authoritative server
The DNS record information exists within its cache.
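The three query types can be modelled with a toy simulation. This is a sketch under heavy assumptions: the NameServer class, the resolve_recursively function, and the server names are all hypothetical, and no real DNS traffic is involved. Each toy server either answers from its own records (non-recursive), hands back a referral (iterative), or reports an error, while the resolver chases referrals on the client's behalf (recursive).

```python
from dataclasses import dataclass, field


def in_zone(hostname: str, zone: str) -> bool:
    """True if hostname is the zone itself or sits below it."""
    return hostname == zone or hostname.endswith("." + zone)


@dataclass
class NameServer:
    """A toy server that answers from its records or refers downward."""
    name: str
    records: dict = field(default_factory=dict)    # hostname -> IP address
    referrals: dict = field(default_factory=dict)  # zone -> NameServer

    def query(self, hostname: str):
        # Non-recursive case: the server already holds the record.
        if hostname in self.records:
            return ("answer", self.records[hostname])
        # Iterative case: return the best referral it can.
        for zone, server in self.referrals.items():
            if in_zone(hostname, zone):
                return ("referral", server)
        return ("error", "NXDOMAIN")


def resolve_recursively(hostname: str, root: NameServer):
    """Chase referrals until an answer or an error comes back."""
    server, trail = root, []
    while True:
        trail.append(server.name)
        kind, value = server.query(hostname)
        if kind == "referral":
            server = value
            continue
        return kind, value, trail


authoritative = NameServer("ns1.example.com.", records={"www.example.com.": "192.0.2.10"})
tld = NameServer("com-tld", referrals={"example.com.": authoritative})
root = NameServer("root", referrals={"com.": tld})

print(resolve_recursively("www.example.com.", root))
# ('answer', '192.0.2.10', ['root', 'com-tld', 'ns1.example.com.'])
```

The trail in the result shows the chain of servers the resolver had to visit, which the client never sees.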

Steps in a DNS Lookup

Let’s now look at the main part of the post and understand what happens during a DNS lookup process.

In other words, what happens from a DNS resolution perspective when you type www[dot]google[dot]com in your browser?

Here’s the complete process expressed visually.

Let’s look at each step in more detail:

Step 1

The browser sends the DNS query to the operating system.
The operating system checks within its cache for the IP address.
If found, it’s all good. If not, then the OS makes a query to the DNS resolver.
This query is a recursive query meaning that the resolver must return either an IP address or an error.
For most users, the DNS resolver is provided by their Internet Service Provider.

Step 2

The DNS resolver starts by querying one of the root DNS servers for the IP of www[dot]google[dot]com. Remember those 13 root IP addresses backed by a cluster of servers. Yes, those folks get the query.
Now, this query isn’t recursive but iterative. The response doesn’t have to be the final IP address. It can be a referral to a server that is closer to the answer.
This is what the query trace looks like when you run the command dig +trace www.google.com

.			46248	IN	NS	j.root-servers.net.
.			46248	IN	NS	h.root-servers.net.
.			46248	IN	NS	d.root-servers.net.
.			46248	IN	NS	m.root-servers.net.
.			46248	IN	NS	l.root-servers.net.
.			46248	IN	NS	f.root-servers.net.
.			46248	IN	NS	g.root-servers.net.
.			46248	IN	NS	c.root-servers.net.
.			46248	IN	NS	b.root-servers.net.
.			46248	IN	NS	i.root-servers.net.
.			46248	IN	NS	a.root-servers.net.
.			46248	IN	NS	e.root-servers.net.
.			46248	IN	NS	k.root-servers.net.

Step 3

Now, these root servers hold the locations of the top-level domains such as .com, .net and so on.
Since the root servers don’t have the direct IP information for www[dot]google[dot]com, they return the location of the .com servers.

Step 4

With this info, the resolver queries one of the .com TLD servers for the location of google[dot]com.
Like the root servers, each TLD also has 4-13 clustered name servers existing in many locations. As we saw earlier, there are two types of TLDs - country specific and generic.
This is what Step 4 looks like when you execute dig +trace www.google.com

com.			172800	IN	NS	e.gtld-servers.net.
com.			172800	IN	NS	b.gtld-servers.net.
com.			172800	IN	NS	a.gtld-servers.net.
com.			172800	IN	NS	d.gtld-servers.net.
com.			172800	IN	NS	i.gtld-servers.net.
com.			172800	IN	NS	f.gtld-servers.net.
com.			172800	IN	NS	j.gtld-servers.net.
com.			172800	IN	NS	k.gtld-servers.net.
com.			172800	IN	NS	c.gtld-servers.net.
com.			172800	IN	NS	g.gtld-servers.net.
com.			172800	IN	NS	h.gtld-servers.net.
com.			172800	IN	NS	l.gtld-servers.net.
com.			172800	IN	NS	m.gtld-servers.net.

Step 5

The query from the DNS resolver to the TLD is also an iterative query.
The TLD server responds with the IP address of the domain’s nameserver.
For example, though the TLD server doesn’t have the IP address for google[dot]com, it knows the location of Google’s name servers. These could be name servers such as “ns1.google.com” to “ns4.google.com”.

Step 6

Finally, the DNS resolver queries one of Google’s nameservers for the IP address of www[dot]google[dot]com.
See the below section from the output of the dig command.

google.com.		172800	IN	NS	ns2.google.com.
google.com.		172800	IN	NS	ns1.google.com.
google.com.		172800	IN	NS	ns3.google.com.
google.com.		172800	IN	NS	ns4.google.com.

Step 7

This time, the queried name server knows the IP address since it is the authoritative nameserver for that domain. This query can be considered a non-recursive query.
It responds with an A or AAAA address record.

Step 8

At this point, the DNS resolver has finished the recursion process and is able to respond to the end user’s operating system with an IP address.
See the below output from the dig command.

www.google.com.		300	IN	A	142.250.194.132
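Every line of dig output in this walkthrough has the same five columns: name, TTL (in seconds), class, record type, and record data. Here is a small sketch that parses one such line. The ResourceRecord type and parse_record function are my own names for illustration, and the sketch assumes a well-formed line.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str    # e.g. www.google.com.
    ttl: int     # how long the record may be cached, in seconds
    rclass: str  # almost always IN (Internet)
    rtype: str   # A, AAAA, NS, CNAME, ...
    rdata: str   # the value, such as an IP address or a nameserver


def parse_record(line: str) -> ResourceRecord:
    """Split one dig answer line into its five columns."""
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rclass, rtype, rdata.strip())


record = parse_record("www.google.com.\t\t300\tIN\tA\t142.250.194.132")
print(record.rtype, record.rdata, record.ttl)
# A 142.250.194.132 300
```

The TTL column matters for the caching behaviour: this A record may be reused for 300 seconds, while the NS records above can be kept for 172800 seconds (two days).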

Step 9 & 10

With the IP address in hand, the operating system provides it to the browser, which then initiates a TCP connection to start loading the page using HTTP.
In case you are interested in reading more about HTTP and how it works, I wrote a super-detailed edition about it.

SDC#22 - HTTP/1.1 vs HTTP/2

Saurabh Dashora

January 2, 2024


DNS Caching

Now, before we end the discussion, it’s also important to understand the role of caching in DNS.

DNS caching involves storing data closer to the client so that the DNS query can be resolved earlier and one can avoid going through the chain.

There are two caching locations before the DNS query even reaches the DNS resolver.

1 - Browser DNS Caching

Modern web browsers are designed to cache DNS records for a set amount of time.

When a request is made for a DNS record, the browser cache is the first location that’s checked for the requested record.

2 - OS Level DNS Caching

This is the second spot where a DNS record can be cached.

When a DNS query reaches the operating system, a component known as the “stub resolver” checks its own cache to see if it already has the record.

If not, it sends a DNS query (flagged as recursive, as we saw earlier) to a DNS recursive resolver, typically the one run by the Internet Service Provider.

However, the caching doesn’t stop here either.

The DNS resolver also maintains its own cache. If the resolver doesn’t have the A records for a domain but has the NS records for the authoritative nameserver, it can directly query those name servers and bypass the root and TLD servers.
If the resolver doesn’t have the NS records, it can send a query to the TLD servers directly and skip the root server. This depends on whether the TLD (.com, .net) server locations are cached within the resolver.
In the highly unlikely event that the resolver doesn’t have records pointing to the TLD servers, it will query the root server. This can happen after a DNS cache has been purged for some reason.
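The resolver's fallback order can be sketched as a small TTL cache plus a function that picks the closest cached starting point. This is a simplified model, not how any particular resolver is implemented: DnsCache, starting_point, and the injectable clock parameter are all hypothetical names.

```python
import time


class DnsCache:
    """A toy TTL cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (value, expires_at)

    def put(self, name, rtype, value, ttl):
        self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]  # expired: drop it
            return None
        return value


def starting_point(cache, hostname):
    """Pick the closest cached point in the hierarchy for hostname."""
    address = cache.get(hostname, "A")
    if address is not None:
        return ("cached answer", address)
    labels = hostname.rstrip(".").split(".")
    # Try google.com. first, then com., looking for cached NS records.
    for i in range(1, len(labels)):
        zone = ".".join(labels[i:]) + "."
        nameserver = cache.get(zone, "NS")
        if nameserver is not None:
            return ("ask nameserver for " + zone, nameserver)
    return ("ask a root server", None)


cache = DnsCache()
cache.put("com.", "NS", "a.gtld-servers.net.", ttl=172800)
print(starting_point(cache, "www.google.com."))
# ('ask nameserver for com.', 'a.gtld-servers.net.')
```

With only the .com NS record cached, the resolver skips the root and starts at the TLD server, which matches the second case described above.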

👉 So - what do you think about DNS? And have you faced any other variations of this question in your interviews?

🍔 Food For Thought

👉 Is all future development going to be backend only?

Last week, there was a very interesting take by Naval that suggests that AIs will eventually make UIs and APIs unnecessary.

What could be the implications of this transformation?

Would all future development be only backend development?

Will systems interact with each other using prompts?

Here’s the link:

https://twitter.com/naval/status/1745304473587810393?s=20

That’s it for today! ☀️

Enjoyed this issue of the newsletter?

Share with your friends and colleagues

See you next week with another value-packed edition — Saurabh
