N Closest Points

Given a set of points, return the set of n closest points to the origin.

The Idea: The distance to any arbitrary point (x, y) from the origin can be defined with pythagoreans distances formula. $d = \sqrt{|x_2 - x_1|^2 + |y_2 - y_1|^2}$ . So all we do is sort on this basis.

Complexity: O(NlogN) time and O(1) space

Implementation 1:

using point = pair<int, int>;

double distance(const point &p)
{
    return sqrt(pow(p.first, 2) + pow(p.second, 2));
}

vector<point> closest_points( vector<point> &points,  int n)
{
    sort(points.begin(), points.end(), [](auto &left, auto &right) {
        return distance(left) < distance(right);
    });

    return vector<point>(points.begin(), points.begin() + n);
}

int main()
{
    vector<point> test = {
        {1,1}, {10,20}, {15, 8}, {1,0}, {2,3}, {5,6}
    };

    vector<point> cp = closest_points(test, 3);
    for (point p : cp) {
        cout << "(" << p.first << ", " << p.second << ")" << endl;
    }
}

Implementation 2:

Same complexity, subtle variation of work.

struct Point {
    Point(int x, int y) : x(x), y(y) {
        distance = _distance(x, y);
    };
    int x, y;
    double distance;

private:
    // from the origin (0,0) -> (a,b)
    double _distance(double a, double b) {
        return(sqrt(pow(a, 2) + pow(b, 2)));
    }
};

vector<Point> n_closest_points(const vector<pair<int, int>> &pts, const int n) {
    vector<Point> points;
    points.reserve(pts.size());
    for (auto &point : pts) points.emplace_back(point.first, point.second);

    sort(points.begin(), points.end(),
        [](const Point & a, const Point & b) {
        return a.distance < b.distance;
    });

    return vector<Point>(points.begin(), points.begin() + n);
}

int main()
{
    vector<pair<int, int>> pts = { {1,1}, {1,10}, {2,3}, {0,0}, {3,2} };
    for (int n = 1; n <= pts.size(); n++) {
        vector<Point> n_close = n_closest_points(pts, n);
        for (auto p : n_close) {
            cout << p.x << ' ' << p.y << '\t';
        }
        cout << endl;
    }
}

Implementation 3:

The Idea: The key point to realize is that we only need to maintain k elements, and these elements do not have to be sorted. If we maintain a heap of k elements, and simply continue popping the max distance (root) of the heap with the next smallest distance, we only do work proportional to the amount of elements maintained in the heap, which is k.

Complexity: O(N) + O(klog(k)) + O(k-n) + O(k). O(N) for finding the euclidean distances first, then O(klog(k)) for heapifying the first k elements, then O(k-n) checking the maintaining distances, and finally O(k) for pretty printing the actual coordinates from the tuples that were initially created.

import math
import heapq

def n_closest_points(pts, k):
    # make turn into max heap by making distance negative
    pts2 = [(-math.sqrt(x*x + y*y), x, y) for x, y in pts]
    k = min(k, len(pts2))

    # add in the first k elements of the list into heap
    # heapq will compare by default with the first element
    # of the tuple so no need to define comparator
    pq = pts2[0: k]
    heapq.heapify(pq)

    # append the remaining points to heapq.
    # since heapq only needs to manage k points at a time
    # it runs very efficiently
    # we only replace elements in the heap if we exceed the
    # maximum element of the top of the heap which will be the
    # maximum distance
    for pt in pts2[k:]:
        # nsmallest will return a list of the nsmallest elements
        root = heapq.nsmallest(1, pq)[0][0]
        if pt[0] > root:
            heapq.heapreplace(pq, pt)

    return [(x, y) for dist, x, y in heapq.nsmallest(k , pq)]


pts1 = [[1,1], [1,10], [2,3], [0,0], [3,2]]
pts2 = [[1,1], [10,20], [15, 8], [1,0], [2,3], [5,6]]

print(n_closest_points(pts1, 3))
print(n_closest_points(pts1, 2))
print(n_closest_points(pts1, 1))

print(n_closest_points(pts2, 5))
print(n_closest_points(pts2, 4))

PreviousFrequency Compression NextFind Sum Pairs

Last updated 5 years ago

Was this helpful?