362. Design Hit Counter - Detailed Explanation

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Problem Statement:

Design a Hit Counter that counts the number of hits received in the past 5 minutes (i.e., past 300 seconds). The system should provide two functions:

  • hit(timestamp): Record a hit that occurs at the given timestamp (in seconds).
  • getHits(timestamp): Return the number of hits in the past 5 minutes from the given timestamp (i.e., count all hits that occurred from timestamp-299 to timestamp, inclusive).

Assume that:

  • Timestamps are given in non-decreasing chronological order (each new call’s timestamp is >= previous call’s timestamp).
  • The earliest timestamp starts at 1.
  • It’s possible to have multiple hits at the same timestamp (multiple calls to hit(t) in the same second).

Example 1:
Operations: hit(1), hit(2), hit(3), getHits(4), hit(300), getHits(300), getHits(301)

  • After hits at timestamps 1, 2, 3, getHits(4) counts hits in [4-299, 4] = [−295, 4], which includes 1, 2, 3. Output: 3.
  • After a hit at 300, getHits(300) counts hits in [300-299, 300] = [1, 300], which includes 1, 2, 3, 300. Output: 4.
  • getHits(301) counts hits in [301-299, 301] = [2, 301], which includes 2, 3, 300 (hit at 1 is now 300 seconds old and no longer counted). Output: 3.

Example 2:
Operations: hit(100) (called 3 times), getHits(100), getHits(101), getHits(400)

  • Three hits at timestamp 100.

  • getHits(100) counts hits in [100-299, 100] = [−199, 100], which includes all three hits at 100. Output: 3.

  • getHits(101) counts hits in [101-299, 101] = [−198, 101], still includes the three hits at 100 (they occurred within the last 5 minutes). Output: 3.

  • getHits(400) counts hits in [400-299, 400] = [101, 400]. The hits at 100 are now more than 5 minutes old (100 is outside [101,400]), so they are not counted. Output: 0.

Example 3:
Operations: hit(1), hit(2), hit(3), getHits(300), getHits(600), hit(600), getHits(600)

  • Hits at 1, 2, 3.

  • getHits(300) counts hits in [300-299, 300] = [1, 300], which includes 1, 2, 3. Output: 3.

  • getHits(600) (with no new hits since 3) counts hits in [600-299, 600] = [301, 600]. All hits at 1, 2, 3 are now expired (older than 5 minutes). Output: 0.

  • A new hit at 600.

  • getHits(600) now counts hits in [600-299, 600] = [301, 600] again, but this time it includes the hit at 600. Output: 1.

Constraints & Expectations:

  • We need to handle a continuous stream of hits efficiently. Even if there are many hits, the operations should be fast.
  • Memory should be managed so that we do not store an unbounded number of past hits unnecessarily (only the last 5 minutes of hits are relevant at any time).
  • The solution should handle the case of no hits gracefully (return 0 for getHits if there have been no hits in the past 5 minutes).

Hints:

  • Hint 1: Think about the sliding window of time (5 minutes = 300 seconds). How can you keep track of hits that fall into the last 300-second window and discard those that are older? What data structure allows adding new items and removing old items in order? (Consider using a FIFO structure like a queue.)

  • Hint 2: A naive approach would be to record every hit and then on each query count how many fall in the range. However, iterating over all hits for each query is inefficient. Instead, try to update the count incrementally as new hits come in and old hits fall out of the 5-minute window.

  • Hint 3 (Follow-up idea): If there could be an extremely large number of hits per second, storing every single hit might be inefficient. Can you group hits by timestamp to optimize memory and counting? (For example, store pairs of {timestamp, number_of_hits_at_that_timestamp} in your data structure.)

Approaches:

  1. Brute Force Approach:
    • Idea: Use a list (or array) to record the timestamp of every hit. For getHits(timestamp), iterate through the list and count how many hits fall within the last 300 seconds of the given timestamp.

    • Why it's inefficient: If there are N hits recorded, each call to getHits may scan up to N elements to count relevant hits. In the worst case with many hits, this becomes very slow (linear time per query). As the number of hits grows, the time to answer a query grows too, which is not optimal. Also, if we never remove old hits, the list can grow indefinitely, wasting memory on hits that are no longer needed.

    • Walkthrough Example:

      Suppose we have hits at times [10, 50, 300, 301] and we call getHits(301). A brute-force method would scan the entire list: it checks 10 (which is older than 301-299=2, so not counted), checks 50 (older than 2, not counted), checks 300 (within [2,301], counted), checks 301 (within [2,301], counted). It would return 2. If we then call getHits(302), it would again scan all hits. As this list grows, scanning all elements for each query is slow.

Python Implementation (Brute Force):

Python3
Python3

. . . .

Java Implementation (Brute Force):

Java
Java

. . . .

Complexity Analysis (Brute Force):

  • Time Complexity: Recording a hit is O(1) (just appending to a list). However, each getHits call is O(N) in the worst case, where N is the number of hits recorded, because we potentially check each hit. If many queries are made or N is large, this is very slow.

  • Space Complexity: O(N) for storing all hit timestamps. This can become large if the system runs for a long time without cleaning up old hits.

  1. Optimized Approach (Using Queue/Deque):
    • Idea: Use a queue to maintain only the hits from the last 300 seconds. The queue naturally preserves the order of hits (oldest in front, newest at back). For each new hit(t), add the timestamp to the queue. For each getHits(t), first remove any timestamps from the front of the queue that are older than 5 minutes from t (i.e., remove hits with timestamp <= t-300). The remaining elements in the queue are all hits within the last 300 seconds. The answer is then the length of the queue.

    • Why it’s efficient: The queue only holds hits that are still relevant (within the last 5 minutes). Old hits get removed as we process queries, so the queue size is at most 300 seconds worth of hits. In the worst case where there’s at least 1 hit every second, the queue would hold at most 300 entries (or slightly more if multiple per second, unless we group them). Each hit is enqueued once and dequeued at most once, making each operation very efficient on average.

    • Walkthrough Example: Consider the sequence of operations from Example 1:

      • At t=1: call hit(1). Queue = [1].

      • At t=2: call hit(2). Queue = [1, 2].

      • At t=3: call hit(3). Queue = [1, 2, 3].

      • At t=4: call getHits(4). Before counting, remove hits older than 5 minutes from time 4 (remove hits ≤ 4-300 = -296). None are older, so Queue stays [1,2,3]. Count = 3.

      • At t=300: call hit(300). Queue = [1, 2, 3, 300].

      • At t=300: call getHits(300). Remove hits older than 5 minutes from time 300 (remove hits ≤ 300-300 = 0). None are ≤0, so Queue stays [1,2,3,300]. Count = 4.

      • At t=301: call getHits(301). Remove hits older than 5 minutes from time 301 (remove hits ≤ 301-300 = 1). This removes the hit at timestamp 1. Now Queue = [2, 3, 300]. Count = 3.

      This matches the expected outputs: 3, 4, and 3 for the getHits calls, respectively. Notice how the queue automatically drops the oldest hit when it falls out of the 300-second window, and we never iterate over all hits explicitly – we just pop from the front until the oldest timestamp is within the window.

Python Implementation (Optimized with Queue):

Python3
Python3

. . . .

Java Implementation (Optimized with Queue):

Java
Java

. . . .

Complexity Analysis (Optimized):

  • Time Complexity: Both hit and getHits operate in O(1) time on average. Recording a hit is O(1). For getHits, removing expired hits is O(k) where k is the number of hits to remove, but each old hit is removed exactly once in its lifetime. Thus, each hit will be dequeued at most one time, making the amortized time per operation O(1). This is much faster than the brute force approach for large N.
  • Space Complexity: O(H) where H is the number of hits in the last 5 minutes. In the worst case (if there’s at least one hit every second and we don't group them), H could be up to 300 (or a bit more if multiple per second without grouping). This is effectively O(300) which is a constant upper bound, or O(N) in terms of N hits but with old hits constantly removed. The memory use remains bounded by the 5-minute window of data, which is efficient.

Step-by-Step Walkthrough (Optimized Approach Visualization):

Let's visualize how the optimized queue approach works with a sequence of operations. We will use a queue to store recent hit timestamps and update it as hits come in or as we query the hit count.

Suppose we perform the following operations:

hit(1) -> hit(2) -> hit(3) -> getHits(4) -> hit(300) -> getHits(300) -> getHits(301)

We maintain a queue of timestamps of hits that are within the last 5 minutes of the current time:

  • After hit(1): Queue contains [1]. (Hit at 1 recorded)

  • After hit(2): Queue contains [1, 2]. (Hits at 1 and 2 recorded in order)

  • After hit(3): Queue contains [1, 2, 3]. (Hits at 1, 2, 3 in the queue)

  • On getHits(4): We check for hits older than 5 minutes from time 4. The window is [4-299, 4] = [-295, 4], so no hits are too old (all current hits 1,2,3 are within the last 300 seconds). Queue remains [1, 2, 3]. The number of hits in the queue is 3, so getHits(4) returns 3.

  • After hit(300): Queue contains [1, 2, 3, 300]. (We add the hit at 300; note that 1, 2, 3 are still in queue but 1 is now exactly 299 seconds older than 300)

  • On getHits(300): Now check for hits older than 5 minutes from time 300. The window is [300-299, 300] = [1, 300], so we should keep hits that occurred at timestamp ≥1. The hit at 1 is exactly at the boundary (300 - 299 = 1, so 1 is 299 seconds older than 300, which is within 5 minutes). Therefore, no hits are removed yet (since we remove hits with timestamp ≤ 0, and none qualify). Queue stays [1, 2, 3, 300]. The queue size is 4, so getHits(300) returns 4.

  • On getHits(301): Now consider the window [301-299, 301] = [2, 301]. Any hit with timestamp < 2 is now out of the 5-minute range. The hit at 1 is older than 5 minutes from 301 (because 301 - 300 = 1, so anything at 1 or earlier is beyond 300 seconds ago). We remove 1 from the queue. Queue becomes [2, 3, 300]. All remaining hits (2, 3, 300) are within the last 5 minutes (between 2 and 301). The queue size is 3, so getHits(301) returns 3.

This step-by-step process shows how we incrementally maintain the count of hits. We never recount from scratch; we just drop off old hits as they expire and add new ones as they come, keeping the queue up to date. This makes each operation efficient.

Common Mistakes:

  • Not removing expired hits: A common mistake is to record hits but forget to remove hits that are older than 5 minutes. This can cause getHits to return incorrect counts (too high) and also waste memory. Always ensure you discard hits that fall outside the 300-second window when processing a query.
  • Off-by-one errors in time window: It’s easy to make a mistake with inclusive/exclusive bounds. Remember that if we want the last 5 minutes inclusive, we consider hits with timestamps in the range [timestamp-299, timestamp]. In implementation, removing hits with timestamp <= current_time - 300 is correct, because for a current time t, anything at time t-300 or earlier is more than 5 minutes old (since at t-300 it’s exactly 5 minutes difference, which should be excluded). In our example, a hit at time 1 is counted at time 300 (difference 299s) but not at time 301 (difference 300s). Be careful with these boundary conditions.
  • Using the wrong data structure: Some might try using structures like a fixed-size array or a hash map without fully understanding how to evict old entries. A queue is conceptually simpler for this problem. If using an array of size 300 (indexed by second modulo 300), one might forget to reset counts when reusing an index or check if an entry is stale (the timestamp associated with that index is not within the current window). Ensure that if you use a circular buffer, you also store the timestamp of the last update for each slot to know if it's still valid.
  • Not handling multiple hits in the same second: If multiple hits occur at the same timestamp and you choose to optimize by grouping (e.g., storing [timestamp, count] pairs), be careful to update the count rather than adding multiple separate entries for the same timestamp in that structure. If you don't account for this, you might either overcount or undercount. (In the simple queue approach above, we simply add multiple entries of the same timestamp, which works fine too.)
  • Assuming queries only come after hits: In some designs, people might assume getHits will always be called after at least one hit. Make sure your code returns 0 for getHits if no hits have been recorded in the past 5 minutes or even if no hits at all have been recorded yet.

Edge Cases:

Consider these edge cases to ensure your solution handles them correctly:

  • No hits at all: Calling getHits on a brand new counter (or after a long period with no hits) should return 0.
  • No hits in the last 5 minutes: If there were hits earlier, but none in the recent 5-minute window, the function should also return 0 (and ideally, the data structure would have removed all old hits). For example, if hits occurred at times 1,2,3 and now the timestamp is 600, there are no recent hits to count.
  • Continuous hits over a long time: The data structure should not grow indefinitely. For instance, if there is at least one hit every few seconds for hours, the queue approach should continuously drop old timestamps and not use more memory than necessary.
  • Multiple hits at the exact same timestamp: e.g., hit(10) called 5 times when timestamp = 10. The getHits(10) should count all 5 hits. Our queue approach handles this by having five entries of value 10 (or if optimized by grouping, one entry with count 5). Ensure your logic accounts for all hits.
  • Timestamp jumps forward significantly: If a getHits(t) is called where t is much larger than the last recorded hit’s time (e.g., last hit at 100, now getHits(1000)), the code should correctly remove all old hits. In our queue approach, the while-loop will clear out everything since all hits are older than t-300. The result will be 0. Make sure such scenarios don’t break the logic (like an empty queue should be handled properly).
  • Follow-up (High Hit Rate): The LeetCode problem asks how to handle the case where there could be a very large number of hits per second. In that scenario, storing every single hit might use too much memory. A common improvement is to group hits by second. For example, instead of storing a timestamp for each hit, store pairs like (timestamp, hit_count_in_that_second). If multiple hits share the same timestamp, you increment the count for that timestamp. The queue would then store at most 300 entries (one per second in the window). This reduces memory usage when there are many hits in the same second, while logic for adding and evicting remains similar.
  • Sliding Window Average or Sum: Problems like “Moving Average from Data Stream” (LC 346) or “Maximum number of events in a time frame” use similar sliding window techniques. They require maintaining a window of the last N elements or last T seconds of data and computing a statistic. Techniques from this hit counter (using a queue or circular buffer) apply to those as well.
  • Recent Calls Counter: LeetCode has a very similar problem called “Number of Recent Calls” (LC 933) where you have to count recent requests in the last 3000 milliseconds. The solution is almost identical: using a queue to store timestamps of calls and removing old ones. If you understood the Hit Counter, you can solve that easily.
  • Logger Rate Limiter: Another related problem (LC 359) involves limiting how often a message can be logged within a time frame. While the context is different (it's about allowing/disallowing messages rather than counting hits), it also uses a time-window concept (in that case, a dictionary of message -> last timestamp seen, or similar sliding window logic). This is a good exercise in thinking about time-based data structures.
TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
What is the best coding challenge platform?
Incorporating unit tests and edge case checks in interviews
What is the minimum salary for a CCNA?
Related Courses
Image
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
Image
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
Image
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
Image
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.
;