Given a set of points, return the set of n closest points to the origin.
The Idea: The distance to any arbitrary point (x, y) from the origin can be defined with pythagoreans distances formula. d=∣x2−x1∣2+∣y2−y1∣2. So all we do is sort on this basis.
Complexity: O(NlogN) time and O(1) space
Implementation 1:
using point = pair<int, int>;
double distance(const point &p)
{
return sqrt(pow(p.first, 2) + pow(p.second, 2));
}
vector<point> closest_points( vector<point> &points, int n)
{
sort(points.begin(), points.end(), [](auto &left, auto &right) {
return distance(left) < distance(right);
});
return vector<point>(points.begin(), points.begin() + n);
}
int main()
{
vector<point> test = {
{1,1}, {10,20}, {15, 8}, {1,0}, {2,3}, {5,6}
};
vector<point> cp = closest_points(test, 3);
for (point p : cp) {
cout << "(" << p.first << ", " << p.second << ")" << endl;
}
}
Implementation 2:
Same complexity, subtle variation of work.
struct Point {
Point(int x, int y) : x(x), y(y) {
distance = _distance(x, y);
};
int x, y;
double distance;
private:
// from the origin (0,0) -> (a,b)
double _distance(double a, double b) {
return(sqrt(pow(a, 2) + pow(b, 2)));
}
};
vector<Point> n_closest_points(const vector<pair<int, int>> &pts, const int n) {
vector<Point> points;
points.reserve(pts.size());
for (auto &point : pts) points.emplace_back(point.first, point.second);
sort(points.begin(), points.end(),
[](const Point & a, const Point & b) {
return a.distance < b.distance;
});
return vector<Point>(points.begin(), points.begin() + n);
}
int main()
{
vector<pair<int, int>> pts = { {1,1}, {1,10}, {2,3}, {0,0}, {3,2} };
for (int n = 1; n <= pts.size(); n++) {
vector<Point> n_close = n_closest_points(pts, n);
for (auto p : n_close) {
cout << p.x << ' ' << p.y << '\t';
}
cout << endl;
}
}
Implementation 3:
The Idea: The key point to realize is that we only need to maintain k elements, and these elements do not have to be sorted. If we maintain a heap of k elements, and simply continue popping the max distance (root) of the heap with the next smallest distance, we only do work proportional to the amount of elements maintained in the heap, which is k.
Complexity: O(N) + O(klog(k)) + O(k-n) + O(k). O(N) for finding the euclidean distances first, then O(klog(k)) for heapifying the first k elements, then O(k-n) checking the maintaining distances, and finally O(k) for pretty printing the actual coordinates from the tuples that were initially created.
import mathimport heapqdefn_closest_points(pts,k):# make turn into max heap by making distance negative pts2 = [(-math.sqrt(x*x + y*y), x, y) for x, y in pts] k =min(k, len(pts2))# add in the first k elements of the list into heap# heapq will compare by default with the first element# of the tuple so no need to define comparator pq = pts2[0: k] heapq.heapify(pq)# append the remaining points to heapq.# since heapq only needs to manage k points at a time# it runs very efficiently# we only replace elements in the heap if we exceed the# maximum element of the top of the heap which will be the# maximum distancefor pt in pts2[k:]:# nsmallest will return a list of the nsmallest elements root = heapq.nsmallest(1, pq)[0][0]if pt[0]> root: heapq.heapreplace(pq, pt)return [(x, y) for dist, x, y in heapq.nsmallest(k , pq)]pts1 = [[1,1], [1,10], [2,3], [0,0], [3,2]]pts2 = [[1,1], [10,20], [15,8], [1,0], [2,3], [5,6]]print(n_closest_points(pts1, 3))print(n_closest_points(pts1, 2))print(n_closest_points(pts1, 1))print(n_closest_points(pts2, 5))print(n_closest_points(pts2, 4))