1086. High Five - Detailed Explanation
Problem Statement
You are given a list of score records for different students. Each record is a two-element list where the first element is the student id and the second element is a score. A student can have multiple scores. Your task is to calculate the average of the top five scores for each student and return the result in the form of a list of lists sorted by the student id in ascending order.
Example 1
- Input:
items = [[1, 91], [1, 92], [2, 93], [2, 97], [1, 60], [2, 77], [1, 65], [1, 87], [1, 100], [2, 100], [2, 76]]
- Output:
[[1, 87], [2, 88]]
- Explanation:
-
For student 1, the top five scores are: 91, 92, 87, 100, 65.
The average is calculated as:
( \text{average} = \lfloor(91 + 92 + 87 + 100 + 65) / 5\rfloor = \lfloor435/5\rfloor = 87 ). -
For student 2, the top five scores are: 93, 97, 77, 100, 76.
The average is calculated as:
( \text{average} = \lfloor(93 + 97 + 77 + 100 + 76) / 5\rfloor = \lfloor443/5\rfloor = 88 ).
-
Constraints
- The number of score records can vary.
- Each record is of the form
[student_id, score]
. - It is guaranteed that each student has at least five scores.
Hints for the Approach
- Hint 1: Use a hash map (or dictionary) where the key is the student id and the value is a collection of scores. This allows you to group the scores for each student efficiently.
- Hint 2: To get the top five scores quickly for each student, you can either:
- Sort the scores in descending order and then pick the first five.
- Or use a min-heap of size five to maintain the top five scores in one pass.
- Hint 3: Once you have the top five scores for each student, compute the average (using integer division) and then return the results sorted by student id.
Approaches
Approach 1: Using Dictionary and Sorting
-
Group Scores:
Iterate over each score record and group scores by student id using a dictionary. -
Sort and Pick Top Five:
For each student, sort their scores in descending order and take the first five scores. -
Compute Average:
Calculate the average of these top five scores using integer division. -
Sort the Result:
Finally, sort the output list by student id in ascending order.
Pseudocode
initialize dictionary: student_scores
for each record in items:
student_id, score = record
add score to student_scores[student_id]
result = []
for each student_id in student_scores:
sort student_scores[student_id] in descending order
top_five = first five elements of sorted list
average = sum(top_five) // 5
add [student_id, average] to result
sort result by student_id
return result
Approach 2: Using Dictionary and Min-Heap
-
Group Scores with Min-Heap:
Instead of storing all scores and then sorting, use a min-heap (priority queue) for each student. Maintain the heap such that it only contains the top five scores.- When adding a new score, if the heap’s size is less than five, simply add the score.
- If the heap is full and the new score is greater than the smallest element in the heap, remove the smallest and add the new score.
-
Compute Average:
For each student, compute the average from the five scores present in the heap. -
Sort the Result:
Output the result sorted by student id.
Pseudocode
initialize dictionary: student_heaps
for each record in items:
student_id, score = record
if student_id not in student_heaps:
initialize a min-heap for student_id
add score to student_heaps[student_id]
if size of student_heaps[student_id] > 5:
remove smallest element from student_heaps[student_id]
result = []
for each student_id in student_heaps:
average = sum(student_heaps[student_id]) // 5
add [student_id, average] to result
sort result by student_id
return result
Step-by-Step Walkthrough with Visual Example
Let’s walk through Approach 1 with the provided example:
Given items:
[[1, 91], [1, 92], [2, 93], [2, 97], [1, 60],
[2, 77], [1, 65], [1, 87], [1, 100], [2, 100], [2, 76]]
Step 1: Group Scores by Student
- Student 1: [91, 92, 60, 65, 87, 100]
- Student 2: [93, 97, 77, 100, 76]
Step 2: Sort Scores in Descending Order and Pick Top Five
-
Student 1: Sorted scores → [100, 92, 91, 87, 65, 60]
Top five: [100, 92, 91, 87, 65] -
Student 2: Sorted scores → [100, 97, 93, 77, 76]
Top five: [100, 97, 93, 77, 76]
Step 3: Compute the Average
- Student 1: ( \text{average} = (100 + 92 + 91 + 87 + 65) // 5 = 435 // 5 = 87 )
- Student 2: ( \text{average} = (100 + 97 + 93 + 77 + 76) // 5 = 443 // 5 = 88 )
Step 4: Sort the Output
Result sorted by student id:
[[1, 87], [2, 88]]
Code Implementation
Python Implementation
Java Implementation
Complexity Analysis
Approach 1 (Dictionary + Sorting)
- Time Complexity:
-
Grouping scores: (O(n)) where (n) is the number of records.
-
Sorting scores for each student: If there are (m) students and each has (k) scores on average, then sorting takes (O(m \cdot k \log k)).
-
Overall, the time complexity is (O(n + m \cdot k \log k)).
-
- Space Complexity:
- (O(n)) for storing scores in the dictionary.
Approach 2 (Dictionary + Min-Heap)
- Time Complexity:
- Inserting each score into the min-heap is (O(\log 5)) (constant time since the heap size is capped at 5) for each record, leading to (O(n)) overall.
- Space Complexity:
- (O(n)) for the dictionary and the heaps.
Common Mistakes and Edge Cases
Common Mistakes
-
Not Removing Extra Scores:
Failing to ensure that only the top five scores are used when a student has more than five scores. -
Sorting Order:
Sorting in the wrong order (ascending instead of descending) can lead to incorrect averages. -
Integer Division Issues:
Be mindful that the average is calculated using integer division.
Edge Cases
-
Exact Five Scores:
When a student has exactly five scores, the solution should simply compute the average without any extra processing. -
Multiple Students:
Ensure the solution correctly sorts the output by student id. -
Handling Ties:
Although not a direct issue in this problem, be careful when there are duplicate scores or similar edge scenarios.
Alternative Variations and Related Problems
Alternative Variations
-
Top K Scores:
What if you need to calculate the average of the top K scores, where K is a parameter? -
Weighted Scores:
A variation where scores have different weights. -
Dynamic Updates:
Consider scenarios where new scores are continuously added, and the average must be updated in real-time.
Related Problems for Further Practice
-
Top K Frequent Elements:
Use heaps or dictionaries to find the most frequent elements. -
Kth Largest Element in an Array:
A problem that also benefits from min-heap usage. -
Merge Intervals:
Practice combining and sorting data for effective grouping.
GET YOUR FREE
Coding Questions Catalog
