973. K Closest Points to Origin - Detailed Explanation


Problem Statement

Given an array of 2D points on the X-Y plane (each point is an integer coordinate [x, y]) and an integer k, find the k closest points to the origin (0, 0). The distance of a point from the origin is given by the Euclidean distance formula:

distance = √(x^2 + y^2)

We need to return k points whose distances to (0,0) are the smallest. You may return the answer in any order, and the answer is guaranteed to be unique (except for the order).

Examples

Example 1:

  • Input: points = [[1,3],[-2,2]], k = 1
  • Output: [[-2,2]]
  • Explanation: The distance between (1, 3) and the origin is √10, and the distance between (-2, 2) and the origin is √8. Since √8 < √10, point (-2,2) is closer to the origin. We only need the closest k=1 point, so the answer is [[-2,2]].

Example 2:

  • Input: points = [[3,3],[5,-1],[-2,4]], k = 2
  • Output: [[3,3],[-2,4]]
  • Explanation: The two closest points to the origin are (-2,4) and (3,3) (in any order). The answer [[-2,4],[3,3]] would also be accepted because both points are in the result set regardless of order.

Example 3:

  • Input: points = [[0,0],[2,2],[3,3]], k = 2
  • Output: [[0,0],[2,2]]
  • Explanation: The point (0,0) has distance 0 (the smallest possible). The next closest point is (2,2) with distance √8 ≈ 2.83. Thus, the 2 closest points are (0,0) and (2,2). (Note: If k equals the number of points, the output would include all points.)

Constraints

  • 1 <= k <= points.length <= 10^4
  • -10^4 <= x_i, y_i <= 10^4

These constraints mean we could have up to 10,000 points, and each coordinate can be as large as 10,000 (positive or negative). The result must always include exactly k points (no more, no less).

Hints

  • Distance Comparison: To determine which points are closer, compare their squared distances x^2 + y^2 instead of the actual distances. This avoids dealing with square roots: if d1 < d2, then √d1 < √d2, so squared distances order the points exactly the same way. Using squared distance simplifies calculations and prevents floating-point precision issues.
  • Brute Force Thinking: How would you solve this by hand? You would likely calculate all distances and then pick the smallest k. This straightforward method can be implemented by sorting the points by distance. Keep this simple approach in mind as a starting point.
  • Optimize for Efficiency: If you don't need to fully sort all points, can you find a way to keep track of just the k closest as you go through the points? Consider using a data structure like a heap (priority queue) to maintain the closest points found so far, or an algorithm like Quickselect to avoid sorting the entire list.
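
The first hint can be checked quickly with the points from Example 1 (the helper name `squared_dist` is illustrative):

```python
# Comparing squared distances avoids sqrt and floating point entirely.
def squared_dist(point):
    x, y = point
    return x * x + y * y

print(squared_dist([1, 3]))   # 10
print(squared_dist([-2, 2]))  # 8, so (-2, 2) is closer
```

Since 8 < 10, the same ordering holds as with √8 < √10, with no square roots involved.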

Approach 1: Brute Force (Sort by Distance)

The brute-force (or straightforward) solution is to compute the distance of every point from the origin and then sort all the points by this distance. After sorting, the first k points in the sorted order will be the k closest points.

Steps:

  1. Compute Distances: Calculate the squared Euclidean distance for each point as dist = x^2 + y^2. (Using squared distance is sufficient for comparison and avoids the overhead of square root.)

  2. Sort Points: Sort the list of points in ascending order based on their distance to the origin (smallest distance first).

  3. Select k Points: Take the first k points from the sorted list. These will be the k points with the smallest distances.

  4. Return Result: Return those k points. The order of these k points in the output does not matter (any order is acceptable).

This approach is easy to implement and understand. However, sorting all n points has a time complexity of O(n log n). Given the constraints (n up to 10,000), sorting is efficient enough in practice. But we will later explore a more optimal approach.

Complexity Analysis (Brute Force): Sorting the points dominates the complexity at O(n log n) time. Computing each distance is O(1) and done n times (O(n)), which is minor compared to sorting. The space complexity is O(n) if we count the output list (or O(1) extra space if sorting is in-place). This is acceptable for n = 10,000.

Python Implementation (Brute Force)
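
A minimal sketch of the sorting approach described above (the function name `k_closest` is illustrative):

```python
from typing import List

def k_closest(points: List[List[int]], k: int) -> List[List[int]]:
    # Sort by squared distance to the origin; no sqrt needed for comparison.
    points.sort(key=lambda p: p[0] * p[0] + p[1] * p[1])
    return points[:k]

print(k_closest([[1, 3], [-2, 2]], 1))  # [[-2, 2]]
```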


Java Implementation (Brute Force)
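
A minimal Java sketch of the same sorting approach (class and method names follow the usual LeetCode signature):

```java
import java.util.Arrays;

class Solution {
    public int[][] kClosest(int[][] points, int k) {
        // Sort by squared distance to the origin. For these bounds the squared
        // distances (at most 2e8) and their differences fit safely in an int.
        Arrays.sort(points, (a, b) ->
            (a[0] * a[0] + a[1] * a[1]) - (b[0] * b[0] + b[1] * b[1]));
        return Arrays.copyOfRange(points, 0, k);
    }
}
```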


Approach 2: Optimal Solution (Using a Heap / Priority Queue)

While sorting is straightforward, we can be more efficient when we only need the k closest points rather than a full sorted order. Using a heap (priority queue), we can keep track of the k closest points seen so far as we iterate through the list, without sorting all points.

The idea is to use a max-heap of size k. A max-heap allows us to quickly retrieve and remove the farthest point among the current k closest, which helps in maintaining only the k best candidates:

  • We iterate through each point and compute its distance.
  • We push each point into the heap with its distance as the key. If the heap size exceeds k, we remove the point with the largest distance (i.e., the farthest point) to keep only the k closest in the heap.
  • This way, after processing all points, the heap will contain only the k points with smallest distances.

Using a max-heap of size k ensures that we never store more than k points at once. Each insertion or removal from the heap takes O(log k) time, and we do this for n points, leading to O(n log k) time complexity. When k is much smaller than n, this is a significant improvement over O(n log n).

Steps:

  1. Initialize an empty max-heap (in languages with only min-heap, we can invert the distance sign to simulate a max-heap).

  2. Iterate over points: For each point, calculate its squared distance d = x^2 + y^2.

    • If the heap currently has fewer than k points, add the new point to the heap.
    • If the heap already has k points and the new point's distance is less (closer) than the largest distance in the heap (the root of the max-heap), then remove the farthest point from the heap and add the new point. If the new point is farther than any in the heap, skip it (since we already have k closer points).
  3. Finalize result: After processing all points, the heap contains k closest points. Extract these points from the heap. (They may not be in sorted order, which is fine.)

  4. Return the k points collected.

This approach avoids sorting all points. We essentially perform n insertions (and maybe removals) on a heap of size at most k, which is efficient especially when k << n. In the worst case where k ≈ n, the complexity is O(n log n), comparable to sorting, but for smaller k it's better.

Complexity Analysis (Heap Optimized): Time complexity is O(n log k), since each of the n points may cause a heap insertion and possibly a removal (each heap operation is O(log k)). In the best scenario (very small k), this is much faster than sorting. The space complexity is O(k) to store the heap of k closest points (plus space for output). If k = n, space is O(n). Generally, this is also efficient.

Python Implementation (Using Heap)
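
A minimal sketch of the max-heap approach in Python. Python's `heapq` is a min-heap, so squared distances are negated to simulate a max-heap (the function name is illustrative):

```python
import heapq
from typing import List

def k_closest(points: List[List[int]], k: int) -> List[List[int]]:
    heap = []  # min-heap of negated squared distances == max-heap of distances
    for x, y in points:
        d = x * x + y * y
        if len(heap) < k:
            heapq.heappush(heap, (-d, x, y))
        elif -heap[0][0] > d:
            # New point is closer than the farthest of the current k: swap it in.
            heapq.heapreplace(heap, (-d, x, y))
    return [[x, y] for _, x, y in heap]

print(k_closest([[3, 3], [5, -1], [-2, 4]], 2))  # the two closest, in heap order
```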


Java Implementation (Using Priority Queue)
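
A minimal Java sketch using `PriorityQueue` with a reversed comparator as the max-heap (class and method names follow the usual LeetCode signature):

```java
import java.util.PriorityQueue;

class Solution {
    public int[][] kClosest(int[][] points, int k) {
        // Max-heap ordered by squared distance (largest distance on top).
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) ->
            (b[0] * b[0] + b[1] * b[1]) - (a[0] * a[0] + a[1] * a[1]));
        for (int[] p : points) {
            heap.offer(p);
            if (heap.size() > k) {
                heap.poll(); // drop the current farthest point
            }
        }
        int[][] result = new int[k][2];
        for (int i = 0; i < k; i++) {
            result[i] = heap.poll();
        }
        return result;
    }
}
```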


Step-by-Step Walkthrough Example

Let's walk through an example step by step to see how the algorithms work. We will use the heap approach for illustration, as it dynamically updates the set of closest points:

Example: points = [[3,3], [5,-1], [-2,4]], k = 2 (this is the same as Example 2 above).

We want the 2 points closest to the origin out of these three.

  • Initial State: Start with an empty heap (no points considered yet).

  • Process (3,3): Distance = 3^2 + 3^2 = 18.

    • Heap: add (3,3).
    • Heap now contains: {(3,3)} (distance 18). (Size = 1, which is less than k, so no removals yet.)
  • Process (5,-1): Distance = 5^2 + (-1)^2 = 25 + 1 = 26.

    • Heap: add (5,-1).
    • Heap now contains: {(3,3), (5,-1)} with distances {18, 26}. (Size = 2, equals k.)
    • Currently, the farthest point in the heap is (5,-1) with dist 26, and the closer is (3,3) with dist 18.
  • Process (-2,4): Distance = (-2)^2 + 4^2 = 4 + 16 = 20.

    • The heap is full (size k=2). Check the largest distance in heap (which is 26 for point (5,-1)).
    • Compare new point's distance (20) with the largest distance in heap (26). Since 20 < 26, the new point (-2,4) is closer than the farthest point currently in heap.
    • Remove the farthest point (5,-1) from the heap, and add (-2,4).
    • Heap now contains: {(3,3), (-2,4)} with distances {18, 20}.
  • Final: The heap contains (3,3) and (-2,4) which are the two closest points to the origin. We return these points.

Step-by-step, we've maintained a set of the closest points. At the end of the iteration, we correctly have the two nearest points. Visually, if you plot these points, you can see that (3,3) and (-2,4) are indeed closer to the origin than (5,-1).

If we were using the sorting approach on this example, we would compute all distances [18, 26, 20], sort them to [18, 20, 26], and pick the first two corresponding points. Both methods yield the same result. The heap approach did not sort the third point that turned out to be farthest; it eliminated it on the fly, which saves some work.
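
The trace above can be replayed with a short script that runs both approaches on the Example 2 input and confirms they select the same two points:

```python
import heapq

points, k = [[3, 3], [5, -1], [-2, 4]], 2
dist = lambda p: p[0] * p[0] + p[1] * p[1]

# Sorting approach: distances are [18, 26, 20]; sorting picks 18 and 20.
by_sort = sorted(points, key=dist)[:k]

# Heap approach: max-heap of the k closest, simulated by negating distances.
heap = []
for p in points:
    heapq.heappush(heap, (-dist(p), p))
    if len(heap) > k:
        heapq.heappop(heap)  # evict the farthest point seen so far
by_heap = [p for _, p in heap]

print(sorted(by_sort) == sorted(by_heap))  # True
```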

Common Mistakes and Edge Cases

Common Mistakes to Avoid:

  • Using Actual Distance (Floating Point): It’s not necessary to use sqrt or compute the actual Euclidean distance. Using squared distance (x^2 + y^2) is enough for comparison. Computing the square root for each point could introduce floating-point errors and is slower. Many beginners unnecessarily calculate the real distance; instead, compare squared distances directly for a cleaner solution.

  • Incorrect Heap Type: If using a heap, be careful about min-heap vs max-heap. In many languages (like Python), the default heap is a min-heap. A min-heap of all n points is itself a valid approach: building it takes O(n), and popping the closest point k times adds O(k log n). However, a max-heap of size k is typically more efficient when k << n, because it never stores more than k points. Whichever you choose, make sure the heap ordering you implement matches the approach you intend.

  • Not Accounting for Negative Coordinates: Remember that a point can have negative x or y (e.g., [-2,4]). Squaring handles negatives (because (-2)^2 = 4, same as 2^2), so the formula works for all quadrants. Just be careful with data types if you square large numbers: in Java, for example, use a larger data type (long) if needed to avoid overflow when computing x^2 + y^2, since intermediate values might exceed the range of 32-bit int. In this problem's constraints, int is sufficient (max (10000)^2 + (10000)^2 = 2e8, which is below 2^31-1), but it's a good habit to consider overflow for larger bounds.

  • Assuming Output Order Matters: Some might think they need to output points sorted by distance. The problem specifically states that any order is fine. Don't spend extra time ordering the result – it’s not required. Just output the k points in the order you obtained them (unless the problem explicitly asks for sorted order, which in this case it does not).
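
The min-heap-of-all-points alternative mentioned in the heap-type bullet can be sketched as follows (`k_closest_minheap` is an illustrative name):

```python
import heapq

def k_closest_minheap(points, k):
    # Build a min-heap of (squared distance, point) over all n points: O(n).
    heap = [(x * x + y * y, [x, y]) for x, y in points]
    heapq.heapify(heap)
    # Pop the smallest distance k times: O(k log n).
    return [heapq.heappop(heap)[1] for _ in range(k)]

print(k_closest_minheap([[3, 3], [5, -1], [-2, 4]], 2))  # [[3, 3], [-2, 4]]
```

This stores all n points (O(n) space) instead of k, which is the trade-off against the max-heap-of-size-k approach.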

Edge Cases to Consider:

  • Minimum Input: If points has length equal to k (e.g., k = n), then the answer is simply all the points. Your solution should handle this gracefully (the heap approach, for instance, will end up just keeping all points).

  • Single Point or k=1: If there is only one point or k is 1, the solution should immediately return that single smallest-distance point. Make sure no off-by-one errors occur in such simple cases.

  • Points with Equal Distances: Multiple points can have the exact same distance to the origin (for example, (2,2) and (-2,-2) both have distance √8). Both approaches still work when distances tie: the returned set is still a valid set of k closest points, and since any order is accepted, it does not matter which tied point lands in which position. Just be aware when testing your solution that ties can happen, and the output may contain any of the tied points.

  • Origin Point: A point that is exactly at the origin (0,0) has distance 0, which will always be among the closest. Ensure your code handles zeros correctly (which it will in the distance formula). If such a point is present and k >= 1, it should definitely appear in the output.

By keeping these points in mind, you can avoid common pitfalls and ensure your solution works for all valid inputs.

Alternative Variations of the Problem

This problem can be adapted or extended in a few ways:

  • K Farthest Points to Origin: Instead of closest, find the k farthest points. This is essentially the opposite criteria. It can be solved by a similar approach: for example, using a min-heap of size k (to discard closer points and keep farthest) or sorting in descending order by distance. The logic is analogous, just inverted.

  • Closest Points to a Different Reference: We might be asked to find k points closest to an arbitrary point (not the origin). The approach remains the same, except the distance formula would be √((x - x0)^2 + (y - y0)^2) for a given reference point (x0, y0). We would still use squared distances and similar algorithms (sorting, heap, quickselect).

  • Higher Dimensions: The concept can extend to 3D points or higher dimensions. For example, "k closest points to the origin in 3D space" would use distance √(x^2 + y^2 + z^2). The algorithms (sorting, heap, selection) remain the same in principle.

  • Streaming Data: A variation is when points are not all known in advance, but come in one by one (a stream of points), and you need to continuously maintain the k closest points seen so far. In that scenario, using a heap is an excellent approach because you can update it with each new point in O(log k) time.

  • K Closest Elements in a Sorted Array: On a one-dimensional line (which is a simpler case), there is a classic problem of finding k closest numbers to a target value. While the context and techniques (two-pointer or binary search) differ, it’s a related idea of "finding closest by some distance metric."
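
The arbitrary-reference variation only changes the distance key; a sketch using the sorting approach (`k_closest_to` and its `ref` parameter are illustrative names):

```python
from typing import List

def k_closest_to(points: List[List[int]], k: int, ref: List[int]) -> List[List[int]]:
    # Same algorithm as before; only the (squared) distance function changes.
    x0, y0 = ref
    return sorted(points, key=lambda p: (p[0] - x0) ** 2 + (p[1] - y0) ** 2)[:k]

print(k_closest_to([[1, 3], [-2, 2], [4, 4]], 1, [3, 3]))  # [[4, 4]]
```

Passing `ref = [0, 0]` reduces this to the original problem.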

Understanding the solution to k closest points can help in solving these variants, as the underlying idea of comparing distances and selecting a subset of elements by a criterion is common.

Solving similar problems helps reinforce the concepts. Here are a few related problems:

  • "Kth Largest Element in an Array" (LeetCode 215) – Find the k-th largest number in an unsorted array. This can be solved with a heap or quickselect, and is analogous in that you're selecting a subset (or one element) based on order statistics.

  • "Top K Frequent Elements" (LeetCode 347) – Given an array, find the k most frequent elements. This uses a heap or bucket sort, again focusing on selecting k items by a certain score (frequency instead of distance).

  • "K Closest Elements" (LeetCode 658) – Given a sorted array and a target, find the k closest values to the target. This is a one-dimensional version of finding closest points and can be approached with two pointers or binary search.

  • "Closest Points to Origin" in Streaming Data (variant) – Not a specific LeetCode problem, but a common interview question: given a large stream of points or a very large array that doesn’t fit in memory, how do you find the k closest points? (Hint: You would again use a max-heap of size k, reading points sequentially).

  • Amazon Delivery Optimization (Interview variation) – This is known in interviews (like an Amazon OA) as finding the closest k delivery centers or destinations to a source. It’s essentially the same problem as k closest points to origin but framed in a real-world scenario.

CONTRIBUTOR
Design Gurus Team
Copyright © 2025 Design Gurus, LLC. All rights reserved.