What are different rate limiting algorithms?


Rate limiting is a technique for controlling how many requests a server accepts within a specified time frame. It prevents overuse of resources, improves server reliability, and ensures fair usage among users. It is especially common in API management, where it guards against abuse and keeps traffic manageable.

Different Rate Limiting Algorithms:

1. Fixed Window Counter

  • Description: Divides time into fixed windows and counts the number of requests in each window.
  • Example: If the limit is 100 requests per hour, and a user makes 100 requests in the first half-hour, they will be blocked for the remaining half-hour, even if the server is underutilized during that time.
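The fixed window behaviour described above can be sketched in a few lines of Python. This is a minimal, single-process illustration, not a production implementation: the class name `FixedWindowCounter`, its parameters, and the choice to pass the timestamp in explicitly (rather than read a clock) are all assumptions made for the example.

```python
class FixedWindowCounter:
    """Allow at most `limit` requests per fixed window of `window_seconds`.

    Hypothetical sketch: timestamps are passed in as seconds and are
    assumed to be non-decreasing.
    """

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.current_window = None  # index of the window being counted
        self.count = 0              # requests accepted in that window

    def allow(self, now: float) -> bool:
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # A new window has started, so the counter resets.
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowCounter(limit=100, window_seconds=3600)
first_hundred = [limiter.allow(float(t)) for t in range(100)]  # all accepted
blocked = limiter.allow(1800.0)     # still inside the first hour: rejected
next_window = limiter.allow(3600.0)  # second hour begins: accepted again
```

Because the counter resets at each window boundary, a client can send a full quota just before the boundary and another just after it; the sliding approaches below are meant to reduce that effect.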

2. Sliding Log

  • Description: Keeps a time-stamped log of requests. For each new request, it checks whether the number of logged requests within the trailing time frame has already reached the limit.
  • Example: If the limit is 100 requests per hour, each incoming request is checked against the log of requests in the past hour. Older entries are discarded.
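As a rough sketch of the sliding log, the Python below stores one timestamp per accepted request and discards entries older than the window before each check. The names (`SlidingLog`, `allow`) are hypothetical, and timestamps are passed in explicitly and assumed to be non-decreasing.

```python
from collections import deque


class SlidingLog:
    """Allow at most `limit` requests in any trailing `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.log = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        cutoff = now - self.window_seconds
        # Discard entries that have fallen out of the trailing window.
        while self.log and self.log[0] <= cutoff:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False


limiter = SlidingLog(limit=3, window_seconds=10)
burst = [limiter.allow(t) for t in (0.0, 1.0, 2.0)]  # three accepted
too_soon = limiter.allow(5.0)    # log still holds three entries: rejected
after_expiry = limiter.allow(10.0)  # the entry at t=0 has expired: accepted
```

The trade-off is memory: the log grows with the limit, since every accepted request in the window keeps its own timestamp.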

3. Sliding Window Counter

  • Description: A hybrid of the fixed window and the sliding log that combines the fixed window's simplicity with an approximation of the sliding log's accuracy.
  • Example: If the limit is 100 requests per hour, the server counts the requests in the current window and adds a fraction of the requests from the previous window, weighted by how much of the previous window still overlaps the trailing hour.
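One common way to compute that weighted estimate is `current_count + previous_count * (1 - elapsed_fraction)`, where `elapsed_fraction` is how far the current window has progressed. The sketch below assumes that formula; the class name, parameters, and explicit non-decreasing timestamps are illustrative choices rather than anything prescribed by the text above.

```python
class SlidingWindowCounter:
    """Approximate a sliding window using two fixed-window counters."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.current_window = 0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now: float) -> bool:
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # Keep the old count only if it belongs to the adjacent window;
            # after a longer gap the previous window is treated as empty.
            adjacent = window == self.current_window + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_window = window
            self.current_count = 0
        elapsed_fraction = (now % self.window_seconds) / self.window_seconds
        # Weight the previous window by the part that still overlaps.
        estimated = self.current_count + self.previous_count * (1.0 - elapsed_fraction)
        if estimated < self.limit:
            self.current_count += 1
            return True
        return False


limiter = SlidingWindowCounter(limit=10, window_seconds=60)
first_window = [limiter.allow(float(t)) for t in range(10)]  # quota used up
at_boundary = limiter.allow(60.0)   # previous window still counts in full
halfway = [limiter.allow(90.0) for _ in range(6)]  # half of it has aged out
```

Unlike the fixed window, the quota here does not reset all at once at the boundary: at `t=60` the request is still rejected, and by `t=90` only about half of the quota has been freed.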

4. Token Bucket

  • Description: Uses tokens to control traffic flow. Tokens are added to a bucket at a regular rate and requests consume tokens. If the bucket runs out of tokens, new requests are denied.
  • Example: A bucket can hold 10 tokens, and 1 token is added every 10 seconds. A request needs 1 token to pass. If a sudden burst of 15 requests arrives while the bucket is full, only 10 go through; the remaining 5 are denied and must wait for new tokens.
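The example numbers above (10 tokens, 1 new token every 10 seconds) can be reproduced with a small sketch. It refills lazily, computing the tokens earned since the last call instead of running a timer. `TokenBucket`, `seconds_per_token`, and the explicit `now` argument are hypothetical names chosen for illustration, and the bucket is assumed to start full.

```python
class TokenBucket:
    """Token bucket with lazy refill; timestamps must be non-decreasing."""

    def __init__(self, capacity: int, seconds_per_token: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.seconds_per_token = seconds_per_token
        self.tokens = float(capacity)  # assumption: the bucket starts full
        self.last_refill = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        earned = (now - self.last_refill) / self.seconds_per_token
        self.tokens = min(float(self.capacity), self.tokens + earned)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=10, seconds_per_token=10.0)
burst = [bucket.allow(0.0) for _ in range(15)]  # 10 accepted, 5 denied
one_refilled = bucket.allow(10.0)  # one token earned after 10 seconds
empty_again = bucket.allow(10.0)   # that token is already spent
```

This design tolerates bursts up to the bucket capacity while holding the long-run average to the refill rate.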

5. Leaky Bucket

  • Description: Requests are added to a queue (bucket) and processed at a fixed rate to smooth out burst traffic.
  • Example: If the bucket size is 10 and the rate is 1 request per second, and a burst of 20 requests comes in, the first 10 are queued and processed at 1 per second. The other 10 find the bucket full and are discarded; they would be queued only if the bucket were larger.
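A queue-based leaky bucket along these lines might look like the sketch below. It is a simplified model under stated assumptions: `LeakyBucket`, `offer`, and `leak` are hypothetical names, timestamps are passed in explicitly and assumed non-decreasing, and "processing" a request just means removing it from the queue and returning it.

```python
from collections import deque


class LeakyBucket:
    """Bounded queue drained at a fixed rate of `leak_rate` requests/second."""

    def __init__(self, capacity: int, leak_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.queue = deque()
        self.last_leak = now

    def leak(self, now: float) -> list:
        """Return the requests that have drained out since the last leak."""
        drained = int((now - self.last_leak) * self.leak_rate)
        if drained <= 0:
            return []
        # Advance the clock only by the whole requests actually accounted for.
        self.last_leak += drained / self.leak_rate
        count = min(drained, len(self.queue))
        return [self.queue.popleft() for _ in range(count)]

    def offer(self, request, now: float) -> bool:
        """Queue the request if there is room; otherwise discard it."""
        self.leak(now)
        if len(self.queue) < self.capacity:
            self.queue.append(request)
            return True
        return False


bucket = LeakyBucket(capacity=10, leak_rate=1.0)
accepted = [bucket.offer(i, now=0.0) for i in range(20)]  # 10 queued, 10 dropped
processed = bucket.leak(3.0)          # three seconds later: requests 0, 1, 2
room_again = bucket.offer(99, now=3.0)  # space has opened up in the queue
```

In contrast to the token bucket, the output rate here stays constant regardless of how bursty the input is, at the cost of added queueing delay.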

Applications of Rate Limiting

  • APIs and Web Services: To control traffic and prevent abuse.
  • Network Traffic: To control data flow in networks.
  • Application Servers: To prevent overload and ensure fair usage.

When implementing rate limiting, choose an algorithm that fits the system's needs, balancing fairness, efficiency, and resource utilization.

Ref: Grokking the System Design Interview

CONTRIBUTOR
Design Gurus Team
Copyright © 2025 Design Gurus, LLC. All rights reserved.