Arslan Ahmad

June 18th, 2025

How to Design a URL Shortener Service (System Design Interview Guide)

Learn how to design a URL shortening service (like TinyURL or Bit.ly) in a system design interview. We cover the full design: requirements (e.g. generating short links, redirection, custom aliases), system architecture, database schema, and strategies for scalability and fault tolerance.

1. Why do we need URL shortening?

URL shortening is used to create shorter aliases for long URLs. We call these shortened aliases “short links.” Users are redirected to the original URL when they hit these short links. Short links save a lot of space when displayed, printed, messaged, or tweeted. Additionally, users are less likely to mistype shorter URLs.

For example, if we shorten the following URL through TinyURL:

https://www.designgurus.io/course/grokking-the-system-design-interview

We would get:

https://tinyurl.com/vzet59pa

The shortened URL is nearly one-third the size of the actual URL.

URL shortening is used to optimize links across devices, track individual links to analyze audience, measure ad campaigns' performance, or hide affiliated original URLs.

If you haven’t used tinyurl.com before, please try creating a new shortened URL and spend some time going through the various options their service offers. This will help you a lot in understanding this chapter.

URL Shortener System Design

2. Requirements and Goals of the System

💡 You should always clarify requirements at the beginning of the interview. Be sure to ask questions to find the exact scope of the system that the interviewer has in mind.

Our URL shortening system should meet the following requirements:

Functional Requirements:

Given a URL, our service should generate a shorter and unique alias of it. This is called a short link. This link should be short enough to be easily copied and pasted into applications.
When users access a short link, our service should redirect them to the original link.
Users should optionally be able to pick a custom short link for their URL.
Links will expire after a standard default timespan. Users should be able to specify the expiration time.

Non-Functional Requirements:

The system should be highly available. This is required because, if our service is down, all the URL redirections will start failing.
URL redirection should happen in real-time with minimal latency.
Shortened links should not be guessable (not predictable).

Extended Requirements:

Analytics; e.g., how many times a redirection happened?
Our service should also be accessible through REST APIs by other services.

3. Capacity Estimation and Constraints

Our system will be read-heavy. There will be lots of redirection requests compared to new URL shortenings. Let’s assume a 100:1 ratio between read and write.

Traffic estimates: Assuming, we will have 500M new URL shortenings per month, with 100:1 read/write ratio, we can expect 50B redirections during the same period:

What would be Queries Per Second (QPS) for our system? New URLs shortenings per second:

<center> 500 million / (30 days * 24 hours * 3600 seconds) = ~200 URLs/s </center>

Considering 100:1 read/write ratio, URLs redirections per second will be:

Storage estimates: Let’s assume we store every URL shortening request (and associated shortened link) for 5 years. Since we expect to have 500M new URLs every month, the total number of objects we expect to store will be 30 billion:

<center> 500 million * 5 years * 12 months = 30 billion </center>

Let’s assume that each stored object will be approximately 500 bytes (just a ballpark estimate--we will dig into it later). We will need 15TB of total storage:

<center> 30 billion * 500 bytes = 15 TB </center>

Bandwidth estimates: For write requests, since we expect 200 new URLs every second, total incoming data for our service will be 100KB per second:

<center> 200 * 500 bytes = 100 KB/s </center>

For read requests, since every second we expect ~20K URLs redirections, total outgoing data for our service would be 10MB per second:

<center> 20K * 500 bytes = ~10 MB/s </center>

Memory estimates: If we want to cache some of the hot URLs that are frequently accessed, how much memory will we need to store them? If we follow the 80-20 rule, meaning 20% of URLs generate 80% of traffic, we would like to cache these 20% hot URLs.

Since we have 20K requests per second, we will be getting 1.7 billion requests per day:

<center> 20K * 3600 seconds * 24 hours = ~1.7 billion </center>

To cache 20% of these requests, we will need 170GB of memory.

<center> 0.2 * 1.7 billion * 500 bytes = ~170GB </center>

One thing to note here is that since there will be many duplicate requests (of the same URL), our actual memory usage will be less than 170GB.

High-level estimates: Assuming 500 million new URLs per month and 100:1 read:write ratio, following is the summary of the high level estimates for our service:

<table class="tg" style="color: rgb(61, 61, 78); font-family: "Droid Serif", Georgia, serif; font-size: 18px; border-color: rgba(210,210,214,var(--tw-border-opacity)); --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-spacing: 0px; --tw-border-opacity:1; margin: 2em auto; table-layout: fixed; width: 439.5px; text-align: start;"> <tbody style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">New URLs</td> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">200/s</td> </tr> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">URL redirections</td> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">20K/s</td> </tr> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">Incoming data</td> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">100KB/s</td> </tr> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">Outgoing data</td> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">10MB/s</td> </tr> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">Storage for 5 years</td> <td class="tg-rmb8" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(204, 219, 180); vertical-align: top;">15TB</td> </tr> <tr style="--tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent;"> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">Memory for cache</td> <td class="tg-yw4l" style="padding: 10px 5px; --tw-shadow:0 0 transparent; --tw-ring-inset:var(--tw-empty, ); --tw-ring-offset-width:0px; --tw-ring-offset-color:#fff; --tw-ring-color:rgba(59,130,246,0.5); --tw-ring-offset-shadow:0 0 transparent; --tw-ring-shadow:0 0 transparent; border-width: 1px; border-style: solid; border-color: rgb(187, 187, 187); --tw-border-opacity:1; word-break: normal; hyphens: auto; font-family: Arial, sans-serif; font-size: 14px; overflow: hidden; color: black; background-color: rgb(214, 230, 189); vertical-align: top;">170GB</td> </tr> </tbody> </table>

💡 Once we've finalized the requirements, it's always a good idea to define the system APIs. This should explicitly state what is expected from the system.

We can have SOAP or REST APIs to expose the functionality of our service. Following could be the definitions of the APIs for creating and deleting URLs:

createURL(api_dev_key, original_url, custom_alias=None, user_name=None, expire_date=None)

Parameters:
api_dev_key (string): The API developer key of a registered account. This will be used to, among other things, throttle users based on their allocated quota.
original_url (string): Original URL to be shortened.
custom_alias (string): Optional custom key for the URL.
user_name (string): Optional user name to be used in the encoding.
expire_date (string): Optional expiration date for the shortened URL.

Returns: (string)
A successful insertion returns the shortened URL; otherwise, it returns an error code.

deleteURL(api_dev_key, url_key)

Where “url_key” is a string representing the shortened URL to be retrieved; a successful deletion returns ‘URL Removed’.

How do we detect and prevent abuse? A malicious user can put us out of business by consuming all URL keys in the current design. To prevent abuse, we can limit users via their api_dev_key. Each api_dev_key can be limited to a certain number of URL creations and redirections per some time period (which may be set to a different duration per developer key).

5. Database Design

💡 Defining the DB schema in the early stages of the interview would help to understand the data flow among various components and later would guide towards data partitioning.

A few observations about the nature of the data we will store:

We need to store billions of records.
Each object we store is small (less than 1K).
There are no relationships between records—other than storing which user created a URL.
Our service is read-heavy.

Database Schema:

We would need two tables: one for storing information about the URL mappings and one for the user's data who created the short link.

Database Schema

What kind of database should we use? Since we anticipate storing billions of rows, and we don’t need to use relationships between objects – a NoSQL store like DynamoDB, Cassandra or Riak is a better choice. A NoSQL choice would also be easier to scale. Please see 'SQL vs NoSQL' for more details.

6. Basic System Design and Algorithm

The problem we are solving here is how to generate a short and unique key for a given URL.

In the TinyURL example in Section 1, the shortened URL is “https://tinyurl.com/vzet59pa”. The last eight characters of this URL constitute the short key we want to generate. We’ll explore two solutions here:

a. Encoding actual URL

We can compute a unique hash (e.g., MD5 or SHA256, etc.) of the given URL. The hash can then be encoded for display. This encoding could be base36 ([a-z ,0-9]) or base62 ([A-Z, a-z, 0-9]) and if we add ‘+’ and ‘/’ we can use Base64 encoding. A reasonable question would be, what should be the length of the short key? 6, 8, or 10 characters?

Using base64 encoding, a 6 letters long key would result in 64^6 = ~68.7 billion possible strings.
Using base64 encoding, an 8 letters long key would result in 64^8 = ~281 trillion possible strings.

With 68.7B unique strings, let's assume six letter keys would suffice for our system.

If we use the MD5 algorithm as our hash function, it will produce a 128-bit hash value. After base64 encoding, we’ll get a string having more than 21 characters (since each base64 character encodes 6 bits of the hash value). Now we only have space for 6 (or 8) characters per short key; how will we choose our key then? We can take the first 6 (or 8) letters for the key. This could result in key duplication; to resolve that, we can choose some other characters out of the encoding string or swap some characters.

What are the different issues with our solution? We have the following couple of problems with our encoding scheme:

If multiple users enter the same URL, they can get the same shortened URL, which is not acceptable.
What if parts of the URL are URL-encoded? e.g., http://www.designgurus.org/distributed.php?id=design, and http://www.designgurus.org/distributed.php%3Fid%3Ddesign are identical except for the URL encoding.

Workaround for the issues: We can append an increasing sequence number to each input URL to make it unique and then generate its hash. We don’t need to store this sequence number in the databases, though. Possible problems with this approach could be an ever-increasing sequence number. Can it overflow? Appending an increasing sequence number will also impact the performance of the service.

Another solution could be to append the user id (which should be unique) to the input URL. However, if the user has not signed in, we would have to ask the user to choose a uniqueness key. Even after this, if we have a conflict, we have to keep generating a key until we get a unique one.

b. Generating keys offline

We can have a standalone Key Generation Service (KGS) that generates random six-letter strings beforehand and stores them in a database (let’s call it key-DB). Whenever we want to shorten a URL, we will take one of the already-generated keys and use it. This approach will make things quite simple and fast. Not only are we not encoding the URL, but we won’t have to worry about duplications or collisions. KGS will make sure all the keys inserted into key-DB are unique

Can concurrency cause problems? As soon as a key is used, it should be marked in the database to ensure that it is not used again. If there are multiple servers reading keys concurrently, we might get a scenario where two or more servers try to read the same key from the database. How can we solve this concurrency problem?

Servers can use KGS to read/mark keys in the database. KGS can use two tables to store keys: one for keys that are not used yet, and one for all the used keys. As soon as KGS gives keys to one of the servers, it can move them to the used keys table. KGS can always keep some keys in memory to quickly provide them whenever a server needs them.

For simplicity, as soon as KGS loads some keys in memory, it can move them to the used keys table. This ensures each server gets unique keys. If KGS dies before assigning all the loaded keys to some server, we will be wasting those keys--which could be acceptable, given the huge number of keys we have.

KGS also has to make sure not to give the same key to multiple servers. For that, it must synchronize (or get a lock on) the data structure holding the keys before removing keys from it and giving them to a server.

What would be the key-DB size? With base64 encoding, we can generate 68.7B unique six letters keys. If we need one byte to store one alpha-numeric character, we can store all these keys in:

<center> 6 (characters per key) * 68.7B (unique keys) = 412 GB. </center>

Isn't KGS a single point of failure? Yes, it is. To solve this, we can have a standby replica of KGS. Whenever the primary server dies, the standby server can take over to generate and provide keys.

Can each app server cache some keys from key-DB? Yes, this can surely speed things up. Although, in this case, if the application server dies before consuming all the keys, we will end up losing those keys. This can be acceptable since we have 68B unique six-letter keys.

How would we perform a key lookup? We can look up the key in our database to get the full URL. If it’s present in the DB, issue an “HTTP 302 Redirect” status back to the browser, passing the stored URL in the “Location” field of the request. If that key is not present in our system, issue an “HTTP 404 Not Found” status or redirect the user back to the homepage.

Should we impose size limits on custom aliases? Our service supports custom aliases. Users can pick any ‘key’ they like, but providing a custom alias is not mandatory. However, it is reasonable (and often desirable) to impose a size limit on a custom alias to ensure we have a consistent URL database. Let’s assume users can specify a maximum of 16 characters per customer key (as reflected in the above database schema).

7. Data Partitioning and Replication

To scale out our DB, we need to partition it so that it can store information about billions of URLs. Therefore, we need to develop a partitioning scheme that would divide and store our data into different DB servers.

a. Range Based Partitioning: We can store URLs in separate partitions based on the hash key's first letter. Hence we will save all the URL hash keys starting with the letter ‘A’ (and 'a') in one partition, save those that start with the letter ‘B’ in another partition, and so on. This approach is called range-based partitioning. We can even combine certain less frequently occurring letters into one database partition. Thus, we should develop a static partitioning scheme to always store/find a URL in a predictable manner.

The main problem with this approach is that it can lead to unbalanced DB servers. For example, we decide to put all URLs starting with the letter ‘E’ into a DB partition, but later we realize that we have too many URLs that start with the letter ‘E.’

b. Hash-Based Partitioning: In this scheme, we take a hash of the object we are storing. We then calculate which partition to use based upon the hash. In our case, we can take the hash of the ‘key’ or the short link to determine the partition in which we store the data object.

Our hashing function will randomly distribute URLs into different partitions (e.g., our hashing function can always map any 'key' to a number between [1…256]). This number would represent the partition in which we store our object.

This approach can still lead to overloaded partitions, which can be solved using 'Consistent Hashing'.

8. Cache

We can cache URLs that are frequently accessed. We can use any off-the-shelf solution like Memcached, which can store full URLs with their respective hashes. Thus, the application servers, before hitting the backend storage, can quickly check if the cache has the desired URL.

How much cache memory should we have? We can start with 20% of daily traffic and, based on clients’ usage patterns, we can adjust how many cache servers we need. As estimated above, we need 170GB of memory to cache 20% of daily traffic. Since a modern-day server can have 256GB of memory, we can easily fit all the cache into one machine. Alternatively, we can use a couple of smaller servers to store all these hot URLs.

Which cache eviction policy would best fit our needs? When the cache is full, and we want to replace a link with a newer/hotter URL, how would we choose? Least Recently Used (LRU) can be a reasonable policy for our system. Under this policy, we discard the least recently used URL first. We can use a Linked Hash Map or a similar data structure to store our URLs and Hashes, which will also keep track of the URLs that have been accessed recently.

To further increase the efficiency, we can replicate our caching servers to distribute the load between them.

How can each cache replica be updated? Whenever there is a cache miss, our servers would be hitting a backend database. Whenever this happens, we can update the cache and pass the new entry to all the cache replicas. Each replica can update its cache by adding the new entry. If a replica already has that entry, it can simply ignore it.

Request flow for accessing a shortened URL

9. Load Balancer (LB)

We can add a Load balancing layer at three places in our system:

Between Clients and Application servers
Between Application Servers and database servers
Between Application Servers and Cache servers

Initially, we could use a simple Round Robin approach that distributes incoming requests equally among backend servers. This LB is simple to implement and does not introduce any overhead. Another benefit of this approach is that if a server is dead, LB will take it out of the rotation and stop sending any traffic to it.

A problem with Round Robin LB is that we do not consider the server load. As a result, if a server is overloaded or slow, the LB will not stop sending new requests to that server. To handle this, a more intelligent LB solution can be placed that periodically queries the backend server about its load and adjusts traffic based on that.

10. Purging or DB cleanup

Should entries stick around forever, or should they be purged? If a user-specified expiration time is reached, what should happen to the link?

If we chose to continuously search for expired links to remove them, it would put a lot of pressure on our database. Instead, we can slowly remove expired links and do a lazy cleanup. Our service will ensure that only expired links will be deleted, although some expired links can live longer but will never be returned to users.

Whenever a user tries to access an expired link, we can delete the link and return an error to the user.
A separate Cleanup service can run periodically to remove expired links from our storage and cache. This service should be very lightweight and scheduled to run only when the user traffic is expected to be low.
We can have a default expiration time for each link (e.g., two years).
After removing an expired link, we can put the key back in the key-DB to be reused.
Should we remove links that haven’t been visited in some length of time, say six months? This could be tricky. Since storage is getting cheap, we can decide to keep links forever.

Detailed component design for URL shortening

11. Telemetry

How many times a short URL has been used, what were user locations, etc.? How would we store these statistics? If it is part of a DB row that gets updated on each view, what will happen when a popular URL is slammed with a large number of concurrent requests?

Some statistics worth tracking: country of the visitor, date and time of access, web page that referred the click, browser, or platform from where the page was accessed.

12. Security and Permissions

Can users create private URLs or allow a particular set of users to access a URL?

We can store the permission level (public/private) with each URL in the database. We can also create a separate table to store UserIDs that have permission to see a specific URL. If a user does not have permission and tries to access a URL, we can send an error (HTTP 401) back. Given that we are storing our data in a NoSQL wide-column database like Cassandra, the key for the table storing permissions would be the ‘Hash’ (or the KGS generated ‘key’). The columns will store the UserIDs of those users that have permission to see the URL.

12. Summary

In my experience, every successful software developer has followed a systematic approach to solve a system design question during an interview. The current interview process requires us to present a reasonable solution within 45 minutes and the key to success is being organized during the system design interview. Hopefully, this step-by-step approach will help you stay on track during the interview.

Please take a look at Grokking the System Design Interview for more system design interview questions like:

To learn software architecture and practice advanced system design interview questions take a look at Grokking the Advanced System Design Interview.

Want to learn more about system design interviews, check followings posts from Design Gurus:
[1] System Design Interviews: A Step-By-Step Guide
[2] System Design Interview Basics: CAP vs. PACELC
[3] Mastering the System Design Interview: A Complete Guide

FAQs

1. What are the key components of a URL shortener service?

API Gateway: Acts as the entry point for clients, routing API calls to backend services. It can handle cross-cutting concerns (like logging, rate limiting, or authentication) before requests reach the core application.
Database: A storage system to persist the mapping from short codes to original URLs. This is often a fast key-value store or NoSQL database to allow quick lookups (Common System Design Interview Examples (With Solutions)). Every short URL code serves as the key to retrieve the full URL.
Hashing/ID Generator: A mechanism to create unique short codes for URLs. This could be a hashing function or a sequential ID generator (which is then encoded) to ensure each long URL gets a unique short identifier.
Caching: An in-memory cache (e.g., Redis) for frequently accessed URL mappings. Caching popular links reduces database load and speeds up redirects, since most traffic is read-heavy (many more redirects than new URL shortenings).
Load Balancer: Distributes incoming requests across multiple servers so no single machine becomes a bottleneck. This improves availability and allows the service to handle more traffic by scaling horizontally.
CDN: A Content Delivery Network can be used to serve static assets (like the website UI or images) and deliver content from edge locations closer to users, reducing latency and offloading work from origin servers (What is CDN?). (While the redirection itself is a quick lookup, a CDN helps with any static content and can improve global access speeds.)

2. What scalability strategies help a URL shortener handle high traffic?

Sharding: Partition the URL data across multiple database servers (for example, by hashing the short code or using a range of IDs for each shard). This spreads out the load and storage, allowing the system to handle a massive number of entries without one DB becoming a hotspot.
Caching: Use an in-memory cache layer to store popular URL mappings and serve frequent redirects quickly. This greatly reduces latency and database reads, as most redirect lookups can be served from cache. Cache invalidation or TTLs ensure the cache doesn’t serve stale data if URLs expire.
CDN: Leverage a CDN for any static content and to improve response times for users globally. By caching content at the network edge, a CDN reduces the load on core servers and speeds up access for users far from your data center (What is CDN?). This is especially useful if your service provides a web interface or API responses that can be cached.
Asynchronous Processing: Offload non-critical tasks from the request/response cycle. For instance, logging clicks or updating analytics can be done via message queues and background workers (Kafka, RabbitMQ, etc.), so the user’s request to shorten or redirect isn’t slowed down (What is asked in a system design interview?). This keeps the core operations (generating a short link or looking one up) fast, while heavy processing happens in the background.
Auto-Scaling: In cloud environments, set up auto-scaling for your service instances. This means the system can automatically add more servers during traffic spikes and remove them during low usage. Auto-scaling ensures the shortener service remains responsive under varying loads without manual intervention, providing elasticity and cost efficiency.

3. SQL or NoSQL – what’s the right database choice for a URL shortener, and why use key-value stores?

A URL shortener’s storage needs are simple (map short code → long URL), and the choice between SQL and NoSQL often comes down to scalability vs. consistency requirements. In practice, many URL shorteners opt for NoSQL or key-value databases for performance:

NoSQL (Key-Value Store): NoSQL databases like DynamoDB or Cassandra are designed to scale horizontally across many servers, handling high write/read throughput with low latency. They fit this use case well – the data model is a simple key-value mapping. This means lookups by short code are fast and easy to distribute. For example, a NoSQL store can partition data by short code hash, allowing the service to scale to billions of entries. (In short, NoSQL is chosen for scalability and performance, whereas SQL can become a bottleneck at large scale (What is asked in a system design interview?).)
SQL (Relational DB): A relational database can certainly store URL mappings and ensures strong consistency (ACID properties). However, scaling a single SQL database to internet-scale often requires sharding and careful tuning, which adds complexity. SQL might be used if the system requires complex queries or transactions (which most URL shorteners don’t need) or strict consistency. In most cases, the simpler access patterns of a URL shortener favor a distributed NoSQL solution (What is asked in a system design interview?), using the relational database either for smaller-scale systems or where strong consistency overrides scaling needs.

4. What security measures protect a URL shortener service?

Rate Limiting: Enforce limits on how many requests (or URL-create operations) each user or IP can make in a given time frame. This prevents abuse such as spamming the service with too many URLs or DDoS attacks (Common System Design Interview Examples (With Solutions)). For example, the system might allow only a certain number of shorten requests per minute per user.
Blacklists: Maintain a blacklist of disallowed URLs or domains. If a user tries to shorten a link known to be malicious, phishing, or otherwise unsafe, the service should reject it. Similarly, you might blacklist abusive users or IPs from using the API if they violate terms. This helps keep the platform free of known spam or malware links.
Authentication: Require user accounts or API keys for generating URLs, especially for advanced features (like custom aliases or bulk shortening). Authentication ensures accountability and makes it harder for anonymous actors to misuse the service (What is asked in a system design interview?). For instance, only logged-in users might be allowed to create a large number of short URLs, which deters bots and attackers.
Bot Detection: Implement bot detection and mitigation (e.g., CAPTCHAs or anomaly detection algorithms). If the system notices a single client creating or accessing thousands of short URLs in a short time, it could be an automated bot. The service should flag or throttle such behavior. Techniques include challenge-response tests (CAPTCHA), analyzing request patterns, and integrating bot detection services to differentiate real users from scripts.
Malware Scanning: Integrate security checks for target URLs. When a new URL is submitted, the shortener can scan it using malware/phishing databases or services (like Google Safe Browsing). This ensures the generated short link doesn’t direct users to a known malicious site. Additionally, on each redirect, the service can perform a quick check against an updated blacklist of URLs/domains to block any that have become unsafe since being shortened.

5. How are short URLs generated (Base62, hashing, counters, etc.)?

Generating unique short codes is a core function of a URL shortener. Several methods are commonly used, each with pros and cons:

Base62 Encoding: This is a standard way to create compact alphanumeric codes. The idea is to take a number (for example, an auto-incrementing ID) and encode it in a base-62 system (digits 0-9, lowercase a-z, and uppercase A-Z). Base62 can represent large numbers in only a few characters – e.g., a 6-character code in Base62 yields roughly 56 billion combinations. This provides a huge space of unique IDs while keeping the URL short.
Hashing: In this approach, the service computes a hash of the original URL (using a hash function like MD5 or SHA-256) and then uses a portion of that hash (possibly encoded in Base62) as the short code (Designing a URL Shortening Service like TinyURL). This method ensures the same long URL always maps to the same short code (unless you add randomness). However, hashing can lead to collisions (different URLs yielding the same hash) if not handled carefully, so the system may need to check for collisions and resolve them (for example, by appending an extra character or re-hashing with a salt in case of duplicates).
Counter/Sequential ID: A simple and popular scheme is to use a global counter or sequence. Each time a new URL needs shortening, increment the counter and use that number as the unique ID. The number is then encoded (often in Base62) to produce the short string. This guarantees unique codes with no collisions and is straightforward to implement. The challenge is ensuring the counter is distributed (if you have multiple servers) or using a centralized service (like a database sequence or a dedicated ID generator service, e.g., Twitter’s Snowflake algorithm) to generate IDs without conflicts.
UUIDs/Random Strings: Another method is to generate a random unique code for each URL, such as using a UUID or a secure random generator. This doesn’t require keeping a counter, but raw UUIDs are quite long (128 bits). To use them in URLs, they might be Base64/Base62 encoded or truncated. Truly random codes have a very low chance of collision, especially if they are long enough, but you might still check for uniqueness in the database. Some systems pre-generate a list of random codes (or use a Key Generation Service) so that each incoming URL can just pick an unused code. The downside is that purely random or UUID-based codes can be longer than sequential codes and may not be as human-readable, but they are effective for decentralization and avoiding any single point of ID generation.

Each of these methods can be used in real-world URL shorteners, and often a combination is employed (e.g., a counter for most cases and a fallback to random generation if needed, or hashing with a tweak to avoid collisions). The key is to ensure the generated short link is unique, not easily guessable, and sufficiently short. (Designing a URL Shortening Service like TinyURL)

Here are few more related links:

System Design Interview

Data Partitioning

What our users say

ABHISHEK GUPTA

My offer from the top tech company would not have been possible without this course. Many thanks!!

AHMET HANIF

Whoever put this together, you folks are life savers. Thank you :)

Ashley Pean

Check out Grokking the Coding Interview. Instead of trying out random Algos, they break down the patterns you need to solve them. Helps immensely with retention!