1792. Maximum Average Pass Ratio - Detailed Explanation
Problem Statement
Imagine a school with several classes. Each class has a certain number of students who will pass the exam and a total number of students in the class. We are given this information as a list of classes, where each class is represented as [pass_i, total_i]
. We also have extraStudents
– a number of additional brilliant students we can assign to any classes. These extra students are guaranteed to pass the exam in whichever class they go to. Our goal is to distribute these extra students among the classes in such a way that the average pass ratio across all classes is maximized.
- Pass ratio of a class: The fraction of students in the class who pass, i.e.
pass_i / total_i
. - Average pass ratio: The average of all classes’ pass ratios (sum of pass ratios divided by number of classes).
We need to return the maximum possible average pass ratio after optimally assigning all extraStudents
to classes. (The answer is typically accepted within an error of 1e-5.)
Example 1:
- Input:
classes = [[1, 2], [3, 5], [2, 2]]
,extraStudents = 2
- Output:
0.78333
- Explanation: Initially, the pass ratios are 1/2 = 0.5, 3/5 = 0.6, and 2/2 = 1.0. We have 2 extra students to assign. The optimal assignment is to put both extra students into the first class (which currently has the lowest pass ratio). This yields new class stats: first class becomes [3,4] (3 passes out of 4), second stays [3,5], third stays [2,2]. The new pass ratios are 3/4 = 0.75, 3/5 = 0.6, 2/2 = 1.0. The average of these is (0.75 + 0.6 + 1.0) / 3 = 0.78333 .
Example 2:
- Input:
classes = [[2, 4], [3, 9], [4, 5], [2, 10]]
,extraStudents = 4
- Output:
0.53485
- Explanation: One optimal distribution is to assign the extra students to the classes in such a way that two go to the first class, one to the second class, and one to the fourth class. After assignment, the classes could become: [4,6], [4,10], [4,5], [3,11]. Their pass ratios would be 4/6 ≈ 0.667, 4/10 = 0.4, 4/5 = 0.8, 3/11 ≈ 0.273. The average of these ratios is approximately 0.53485. (There could be other distributions that yield the same maximum average.)
Example 3:
- Input:
classes = [[1, 3], [2, 4]]
,extraStudents = 1
- Output:
0.50000
- Explanation: Initially, the pass ratios are 1/3 ≈ 0.3333 and 2/4 = 0.5, with an average of ~0.4167. We have 1 extra student. If we add this student to the first class, that class becomes [2,4] with ratio 0.5, and the second remains [2,4] with ratio 0.5. The new average is (0.5 + 0.5)/2 = 0.5. If instead we added the student to the second class, we’d get [1,3] (0.3333) and [3,5] (0.6) with an average of ~0.4667. So the best choice is to assign the extra student to the first class, giving an average of 0.5.
Constraints:
1 <= classes.length <= 10^5
(there can be up to 100,000 classes)- Each
classes[i]
has two values:1 <= pass_i <= total_i <= 10^5
(passes and totals are positive, and passes are not more than total) 1 <= extraStudents <= 10^5
(up to 100,000 extra students can be assigned)- The answer will be checked with a tolerance of 1e-5, so minor floating-point differences are acceptable.
Hints
-
Hint 1: Consider how adding an extra student (who will definitely pass) affects a class’s pass ratio. Try to compute the increase in pass ratio for a class if you add one extra passing student. (Think in terms of a formula for the difference.)
-
Hint 2: Each extra student gives a diminishing benefit if you keep adding them to the same class. The first extra student might increase a class’s pass ratio a lot, but the next extra student added to that same class will increase it a bit less, and so on. This means we should be strategic about where to put each student.
-
Hint 3: A good strategy is to always assign the next student to the class that would gain the most improvement in pass ratio from that one student. This hints at a greedy approach. Can you think of a data structure that efficiently supports repeatedly picking the class with the maximum potential gain?
Multiple Approaches
We will discuss two approaches to solve this problem:
- Approach 1: Brute Force Greedy (Naively assign students one by one by scanning through classes).
- Approach 2: Optimized Greedy using a Priority Queue (Max-Heap).
Both approaches use the idea of greedily assigning each extra student to the class where it yields the most benefit. The difference lies in how efficiently we find that class at each step.
Approach 1: Brute Force Greedy (Naive)
Idea: Distribute the extra students one at a time. For each extra student, check every class to find out which class would get the largest increase in pass ratio if the student were added there. Assign the student to that class, update the class’s pass and total counts, then repeat for the next student.
Step-by-step:
-
Calculate initial ratios: Start by computing the current pass ratio for each class (
pass_i / total_i
). Also, for each class, figure out how the pass ratio would change if an extra passing student is added. The increase (or "gain") for adding one student to class i can be calculated as:
[ \text{gain}_i = \frac{pass_i + 1}{total_i + 1} - \frac{pass_i}{total_i} ]
This formula gives the marginal improvement in the class’s pass ratio if we add one student. -
Greedy selection (without a heap): Among all classes, find the class with the highest gain for adding a student. Assign one extra student to that class (meaning increment
pass_i
andtotal_i
for that class). -
Update and repeat: After adding the student to that class, update that class’s pass ratio and recompute its gain for any further students. Then again find the class with the largest gain among all classes for the next student.
-
Continue this process until all
extraStudents
have been assigned.
By the end, all extra students are allocated in a (hopefully) optimal way. Then we compute the average pass ratio across all classes with the updated numbers.
Visual Example (Approach 1):
Suppose classes = [[1,2], [3,5], [2,2]]
and extraStudents = 2
(Example 1). Initially, the gains for each class if we add one student are:
- Class1 (1/2 → 2/3): gain = 0.6667 - 0.5 = 0.1667
- Class2 (3/5 → 4/6): gain = 0.6667 - 0.6 = 0.0667
- Class3 (2/2 → 3/3): gain = 1.0000 - 1.0 = 0 (class3 is already at 100% pass rate)
The best gain is for Class1, so we assign the first extra student to Class1. Now classes become [[2,3], [3,5], [2,2]]
. Recompute gains for another extra student:
- Class1 (now 2/3 → 3/4): gain = 0.75 - 0.6667 = 0.0833
- Class2 (still 3/5 → 4/6): gain = 0.0667
- Class3 (still 2/2 → 3/3): gain = 0
Now Class1 still has the highest gain, so assign the second student to Class1 again. Classes become [[3,4], [3,5], [2,2]]
. No more extra students left. Finally, compute the average: (3/4 + 3/5 + 2/2) / 3 = 0.78333, as required.
Pros: This greedy strategy will find the optimal distribution (it’s essentially the right idea). It’s straightforward to implement by following the problem statement directly.
Cons: It’s extremely slow for large inputs. Each extra student assignment involves scanning all classes to find the best gain. In the worst case, if there are n
classes and m
extra students, this approach can take O(n * m) time, which is not feasible when both n and m can be up to 100k (that could be 10^10 operations!). We need to optimize the selection of the best class.
Approach 2: Optimized Greedy with a Priority Queue (Max-Heap)
Idea: We still assign students one by one to the class with the highest current gain, but we use a max-heap (priority queue) to efficiently retrieve the class with the largest gain at each step . Instead of scanning all classes for each student, we keep the classes organized by their potential gain.
Step-by-step:
-
Compute initial gains: As before, calculate the initial gain for adding one student to each class using the formula ((pass_i+1)/(total_i+1) - (pass_i/total_i)). Push each class into a max-heap keyed by this gain (i.e., the class with the highest gain should be at the top of the heap).
-
Greedy allocation using heap: While we still have extra students to assign:
- Extract the class with the maximum gain from the heap (this is O(log n) operation for the heap).
- Add one extra student to this class (update its
pass
andtotal
by +1). - Recompute that class’s new gain after adding the student.
- Push the class back into the heap with its updated gain (so it can be considered for further students).
-
After assigning all
extraStudents
, compute the average pass ratio across all classes using the finalpass/total
values.
Using the heap ensures that at each step we get the class with the largest marginal increase without scanning all classes. The heap order updates dynamically as classes’ gains change when we add students.
Why this works: Every time we assign a student, we choose the class where this single addition has the most impact on improving the average. This greedy choice is optimal because adding the student to any other class first would yield a smaller immediate improvement, and due to the diminishing returns property, we can never get that missed improvement back later. By always taking the maximum marginal increase first, we ensure no opportunity for a larger gain is wasted. This strategy is guaranteed to produce the maximum average in the end.
Visual Example (Approach 2):
Let’s walk through the same example [[1,2],[3,5],[2,2]]
with extraStudents=2
using a max-heap:
-
Initial heap: We compute gains as in Approach 1. We push (gain, class) for each class into a max-heap. Initially the heap contains:
(0.1667, Class1), (0.0667, Class2), (0, Class3)
with Class1 at the top (max gain). -
Assign 1st extra: Pop from heap -> get Class1 (gain 0.1667). Assign one student to Class1 (now [2,3]). Recompute Class1’s gain = 0.0833. Push Class1 back with gain 0.0833. (Class2 and Class3 remain in heap with their old gains because they haven’t changed yet.)
-
Heap after 1st assignment:
(0.0833, Class1), (0.0667, Class2), (0, Class3)
– Class1 still top. -
Assign 2nd extra: Pop from heap -> Class1 again (gain 0.0833). Assign one student to Class1 (now [3,4]). Recompute gain for Class1 = 0.0500 (if another student were added). Push it back. Extra students are now 0, so we stop.
-
We then compute the average from final ratios: same result 0.78333.
Even though Class1 kept being the top candidate in this case, the heap was crucial for efficiency. In larger cases, different classes will pop to the top as their gains overtake each other when updated.
Pros: This approach is much faster. Using a max-heap reduces the selection time for each student from O(n) to O(log n). The overall time becomes O((n + m) log n) for n classes and m extra students, which is feasible for n, m up to 100k. The greedy strategy ensures the result is optimal.
Cons: The implementation is a bit more complex due to using a priority queue and maintaining updates. Also, we need to be careful with precision (use double or a suitable type for gains) and performance of recomputing the gain. But this is manageable.
Code Implementations
Below are implementations of both approaches in Python and Java, along with example usage in a main
or test function to demonstrate the solution.
Python Implementation – Brute Force Greedy
Note: This brute force approach will work for the small examples, but it will be very slow for large inputs. It is mainly for understanding and not intended for actual large-scale use.
Python Implementation – Optimized Greedy (Priority Queue)
Java Implementation – Brute Force Greedy
Java Implementation – Optimized Greedy (Priority Queue)
Note: In the Java optimized solution, we use a custom comparator for the priority queue that always compares the current gain of two classes. We directly store the class’s [pass, total]
in the heap and compute the gain on the fly in the comparator each time it’s needed. This ensures the heap is ordered by the latest gain values without manually updating them (because whenever a class is polled and re-offered after updating, its gain changes appropriately in the comparator’s eyes).
Complexity Analysis
-
Brute Force Greedy (Approach 1): Let
n
be the number of classes andm
be the number of extra students.- Time Complexity: O(n * m). For each of the
m
extra students, we scan through up ton
classes to find the best gain. In the worst case, with bothn
andm
up to 100k, this could be on the order of 10^10 operations, which is not tractable. - Space Complexity: O(1) (or O(n) if we count storing the class list and some minor overhead). We only use a few variables for tracking indices and gains. The class data can be considered input space. No significant extra space is used besides that.
- Time Complexity: O(n * m). For each of the
-
Optimized Greedy with Heap (Approach 2):
-
Time Complexity: O((n + m) log n). We initially push n classes into the heap (cost ~O(n) for building the heap). Then we perform m iterations, each doing a
poll
andoffer
on the heap, which are O(log n) each. Thus, roughly O(n + 2m * log n) which simplifies to O((n + m) log n) in big-O terms. With n, m up to 100k, this is on the order of (200k * log(100k)) which is around 200k * ~17 ≈ 3.4 million operations, easily doable. -
Space Complexity: O(n). We store all classes in the heap. Aside from that, we use a constant amount of extra variables. So it’s linear in the number of classes (which in the worst case is 100k, fine for memory).
-
Comparison: The optimized approach is significantly better for large inputs. The brute force may be conceptually simpler, but it won’t run within time limits for large constraints. The heap-based approach handles the maximum input sizes efficiently.
Step-by-Step Walkthrough (Optimized Solution)
Let’s do a detailed walkthrough of the optimized greedy approach using a concrete example to see how the algorithm progresses at each step. We’ll use Example 1: classes = [[1,2], [3,5], [2,2]]
, extraStudents = 2
.
Initial State:
We start by calculating the initial pass ratios and the initial gains for each class if one student is added:
Class | Pass/Total (Current Ratio) | Ratio if +1 student | Gain (Increase in Ratio) |
---|---|---|---|
1 | 1/2 = 0.5000 | 2/3 ≈ 0.6667 | +0.1667 |
2 | 3/5 = 0.6000 | 4/6 ≈ 0.6667 | +0.0667 |
3 | 2/2 = 1.0000 | 3/3 = 1.0000 | +0.0000 |
We then create a max-heap of these gains. Initially, the heap (max at top) would contain:
[(+0.1667, Class1), (+0.0667, Class2), (+0.0000, Class3)]
(We keep track of which class each gain belongs to.)
Iteration 1 (Assign first extra student):
- We extract the top of the heap, which is Class1 with gain 0.1667. This means currently, Class1 is the best place to put an extra student.
- Assign the extra student to Class1. Class1’s stats update from [1,2] to [2,3].
- Now recalculate Class1’s new ratio and new gain:
-
New ratio for Class1 = 2/3 ≈ 0.6667.
-
If we were to add another student to Class1, the ratio would be 3/4 = 0.75. The new gain for an additional student would be 0.75 - 0.6667 = 0.0833.
-
- We push Class1 back into the heap with its updated gain 0.0833. Classes 2 and 3 remain in the heap with their previous gains (they haven’t changed yet because we didn’t add to them).
- Heap after 1st assignment:
[(+0.0833, Class1), (+0.0667, Class2), (+0.0000, Class3)]
. Class1 is still on top, but notice its gain dropped from 0.1667 to 0.0833 after getting one student, reflecting diminishing returns.
Iteration 2 (Assign second extra student):
- Again, extract the top of the heap. Now it’s Class1 with gain 0.0833 (it’s still the highest gain; Class2 is 0.0667, Class3 is 0).
- Assign the next extra student to Class1. Update Class1 from [2,3] to [3,4].
- Recalculate Class1’s new ratio and gain:
- New ratio = 3/4 = 0.75.
- If one more student were added, ratio would be 4/5 = 0.8. New gain would be 0.8 - 0.75 = 0.0500.
- Push Class1 back into heap with gain 0.0500. (If we had more extra students, we’d continue, but now we’ve used both extras.)
- Heap after 2nd assignment:
[(+0.0667, Class2), (+0.0500, Class1), (+0.0000, Class3)]
. (Class2 is now at top, but we have no more students to assign.)
Final Calculation:
Now that all extra students are assigned, we compute the average pass ratio:
- Class1: 3/4 = 0.75
- Class2: 3/5 = 0.6 (unchanged from start in this example)
- Class3: 2/2 = 1.0 (unchanged)
Average = (0.75 + 0.6 + 1.0) / 3 = 2.35 / 3 = 0.78333, which matches the expected result.
Through this walkthrough, we saw:
- How the gain values changed after each assignment (Class1’s gain went down each time it got an extra student).
- How the algorithm always picked the class with the highest current gain.
- How other classes stayed in the heap and would become the top if their gain ever surpassed the others (in this case, if we had more students, eventually Class2’s turn would come once Class1’s gain dropped below Class2’s gain).
This step-by-step process demonstrates the greedy allocation in action, using the priority queue to always make the locally optimal choice (which leads to the globally optimal result).
Common Mistakes & Edge Cases
-
Choosing the highest current ratio instead of highest gain: A common pitfall is to mistakenly add students to the class with the lowest pass percentage or the highest fail rate, or conversely the highest current ratio. In truth, neither of those naively yields the optimal result. It’s all about the marginal improvement. For example, a class with a low ratio might not always give the biggest boost if its total is huge (adding one student to a very large class has little effect on its ratio). Always compute the gain from adding a student, and use that to decide.
-
Not updating the gain after each assignment: If you add an extra student to a class and then forget to recalculate that class’s new gain for the next iteration, you’ll likely assign subsequent students incorrectly. The gain values change after each student is added to a class (diminishing returns).
-
Integer division vs floating point: When implementing, especially in languages like Java or C++, make sure to use floating-point division for calculating ratios and gains. Using integer division (which truncates) will lead to incorrect comparisons of gains (e.g., 1/2 would be 0 in integer math). In Python this isn’t an issue since
/
yields float by default. -
Precision considerations: The final result needs to be within 1e-5 of the correct answer. Using double precision (
double
in Java, Python float) is typically sufficient. Be careful if you try to format or round the result – it’s usually fine to just return the double. Summing many ratios (up to 100k classes) in double precision is okay, but avoid using single precision floats as they may introduce too much error. -
Edge Case – No extra students: If
extraStudents = 0
, simply compute and return the current average pass ratio without any changes. -
Edge Case – All students already passing: If every class already has
pass_i = total_i
(100% pass rate in each class), then the average is 1.0 and adding extra students won’t change any ratios (they’ll also pass, keeping each class at 100%). The algorithm should handle this by essentially doing nothing (each class’s gain will be 0, so it doesn’t matter where extras go). -
Edge Case – Classes with 1 student or very small classes: The formula works just fine even if a class has only one student. For example, a class [0,1] (0 out of 1 passing) has ratio 0; adding a student yields [1,2] with ratio 0.5 (gain 0.5, which is quite high). The algorithm naturally handles these cases, tending to allocate to small classes if they have low pass counts because the gain is often large.
-
Large inputs: Ensure your solution is efficient (use the heap approach). With up to 100k extra students, a slow approach will time out. Also ensure your heap operations and memory usage are optimal. Python’s heapq and Java’s PriorityQueue can handle 100k elements easily.
Alternative Variations
-
Maximizing Weighted Average: A slight variation is if we wanted to maximize the total pass ratio or a weighted average where classes have different importance. The greedy strategy would change if, say, classes had different weights. In this problem, each class contributes equally to the average, which is why we simply average the ratios. If instead we cared about the overall pass percentage of all students (combined pass/combined total), the problem would be trivial – you’d always add all extra students to any classes (it wouldn’t matter, as each extra student adds the same amount to total passes overall). It’s the averaging per class that makes it non-trivial.
-
Probabilistic variant: If those extra students weren’t guaranteed to pass but instead had some probability p of passing, the problem would turn into maximizing expected value. The greedy approach might still apply by comparing expected gains, but the calculation of “gain” would be different (it might involve diminishing returns in expected pass probability). That would be a more complex variation and not as straightforward as this guaranteed pass scenario.
-
Resource Allocation Problems: This problem is essentially a resource allocation task with diminishing returns. Similar setups could be allocating limited resources (like budget, time, or staff) to projects to maximize overall success rate or outcome. The technique we used – always assign the next resource to the project/class that yields the biggest improvement – is a common greedy strategy in such scenarios.
-
Greedy + Heap pattern: The approach here is a classic example of using a greedy algorithm with a priority queue to incrementally build an optimal solution. This pattern appears in various problems (some listed below). Whenever you have to repeatedly choose the next best option out of many and the state changes in a way that affects what the "best" next option is, a heap is a great tool to keep track of the best candidates efficiently.
Related Problems for Further Practice
To reinforce the concepts from this problem (greedy choice with priority queue, maximizing or minimizing a certain metric by distribution or selection), you can practice the following problems:
-
LeetCode 502. IPO: You have a certain initial capital and a list of projects with their profits and capital requirements. You can do at most k projects. This uses a greedy strategy with a max-heap to always pick the most profitable project available within your current capital. (It’s a similar "pick the best next option" using heaps.)
-
LeetCode 1642. Furthest Building You Can Reach: You are given heights of buildings and a limited number of bricks and ladders. You must climb from building to building. The greedy solution uses a min-heap to decide where to use ladders (always use ladders on the largest jumps). This is another resource allocation problem where you greedily allocate a resource to the most costly needs.
-
LeetCode 630. Course Schedule III: You have a list of courses with durations and deadlines, and you want to take as many courses as possible. The solution involves sorting by deadlines and using a max-heap to drop the longest course when needed. This showcases a greedy strategy where a heap is used to efficiently decide which course to drop to accommodate a new one.
-
LeetCode 1383. Maximum Performance of a Team: You are selecting engineers with speed and efficiency under certain constraints. The approach uses sorting and a min-heap to maintain the team with the largest performance. It’s a bit different, but it relies on a greedy selection and heap to maintain the optimal combination as you iterate.
GET YOUR FREE
Coding Questions Catalog
