Build a k-d Tree in JavaScript

This challenge asks you to implement a k-dimensional tree (k-d tree) in JavaScript. A k-d tree is a spatial data structure that organizes points in k-dimensional space and supports efficient nearest-neighbor and range queries, which makes it useful in geographic information systems, computer graphics, and machine learning.

Problem Description

You need to create a JavaScript class, KDTree, that can store and query points in a k-dimensional space. The k-d tree should be built recursively by splitting the points along different dimensions at each level.

Key Requirements:

  1. Node Structure: Each node in the k-d tree should store a point, the dimension it was split on, and references to its left and right children.
  2. Construction: The KDTree class should have a constructor that accepts an array of points and builds the tree. Each point will be an array of numbers representing its coordinates (e.g., [x, y] for 2D).
  3. Splitting Logic: At each level of the tree, points should be split along the dimension depth % k, where depth is the current level and k is the dimensionality of the points. The median point along that dimension should be chosen as the splitting point (a construction sketch follows this list).
  4. Nearest Neighbor Search: Implement a method findNearest(queryPoint) that returns the point in the tree closest to the given queryPoint.
  5. Range Search (Optional but Recommended): Implement a method findRange(queryPoint, radius) that returns all points within a specified radius of the queryPoint.
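
To make requirements 1–3 concrete, here is one possible construction sketch. The Node class and the buildNode helper are illustrative names, not part of the required API; the median is found by sorting each sub-array, which is simpler but slower than a selection algorithm such as Quickselect.

class Node {
  constructor(point, dimension) {
    this.point = point;          // e.g. [x, y] for 2D
    this.dimension = dimension;  // axis this node splits on
    this.left = null;
    this.right = null;
  }
}

class KDTree {
  constructor(points) {
    this.k = points.length > 0 ? points[0].length : 0;
    // Copy the input so the caller's array is not reordered.
    this.root = this.buildNode(points.slice(), 0);
  }

  buildNode(points, depth) {
    if (points.length === 0) return null;
    const dim = depth % this.k;              // cycle through the dimensions
    points.sort((a, b) => a[dim] - b[dim]);  // median by sorting (simple, not optimal)
    const mid = Math.floor(points.length / 2);
    const node = new Node(points[mid], dim);
    node.left = this.buildNode(points.slice(0, mid), depth + 1);
    node.right = this.buildNode(points.slice(mid + 1), depth + 1);
    return node;
  }
}

With the six points from Example 1 below and this median convention (Math.floor(points.length / 2)), the resulting tree matches the structure sketched in that example.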

Expected Behavior:

  • The KDTree should be initialized with a list of points.
  • findNearest should return the single closest point.
  • findRange should return an array of points.

Edge Cases:

  • Empty Tree: What happens if the tree is initialized with no points?
  • Duplicate Points: How should duplicate points be handled?
  • Points with Different Dimensions: The implementation should assume all points have the same dimensionality (k). You can throw an error or handle this gracefully (a small validation helper is sketched after this list).
  • 1D Points: The implementation should work for k=1.
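
If you decide to reject inconsistent input, a small guard along these lines could run before the tree is built. The helper name assertUniformDimensions is purely illustrative.

function assertUniformDimensions(points) {
  if (points.length === 0) return 0;
  const k = points[0].length;
  for (const p of points) {
    if (!Array.isArray(p) || p.length !== k) {
      throw new Error(`Expected a ${k}-dimensional point, got: ${JSON.stringify(p)}`);
    }
  }
  return k;  // the dimensionality shared by every point
}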

Examples

Example 1: Building a 2D Tree

Input:
const points = [
  [2, 3],
  [5, 4],
  [9, 6],
  [4, 7],
  [8, 1],
  [7, 2]
];

// Assume KDTree class is defined elsewhere

const kdTree = new KDTree(points);

// (Internal representation after building - not directly observable but conceptually described)
// Root: [7, 2] (split on x-axis, dimension 0)
//  Left Child: [5, 4] (split on y-axis, dimension 1)
//   Left-Left: [2, 3]
//   Left-Right: [4, 7]
//  Right Child: [9, 6] (split on y-axis, dimension 1)
//   Right-Left: [8, 1]

Example 2: Nearest Neighbor Search in a 2D Tree

Input:
const points = [
  [2, 3],
  [5, 4],
  [9, 6],
  [4, 7],
  [8, 1],
  [7, 2]
];
const kdTree = new KDTree(points);
const queryPoint = [8, 3];

Output:
[7, 2]

Explanation:
The point [7, 2] is the closest point in the tree to [8, 3].
Distance([8, 3], [7, 2]) = sqrt((8-7)^2 + (3-2)^2) = sqrt(1^2 + 1^2) = sqrt(2) ≈ 1.414
Other distances:
[8, 1]: sqrt((8-8)^2 + (3-1)^2) = sqrt(0^2 + 2^2) = sqrt(4) = 2
[5, 4]: sqrt((8-5)^2 + (3-4)^2) = sqrt(3^2 + (-1)^2) = sqrt(10) ≈ 3.162
[9, 6]: sqrt((8-9)^2 + (3-6)^2) = sqrt((-1)^2 + (-3)^2) = sqrt(10) ≈ 3.162
[4, 7]: sqrt((8-4)^2 + (3-7)^2) = sqrt(4^2 + (-4)^2) = sqrt(32) ≈ 5.657
[2, 3]: sqrt((8-2)^2 + (3-3)^2) = sqrt(6^2 + 0^2) = sqrt(36) = 6

Note on ties: if a query point is equidistant from several stored points (for example, [3, 5] is sqrt(5) away from each of [2, 3], [5, 4], and [4, 7]), returning any one of the tied points is a valid answer.

Example 3: Range Search in a 2D Tree

Input:
const points = [
  [2, 3],
  [5, 4],
  [9, 6],
  [4, 7],
  [8, 1],
  [7, 2]
];
const kdTree = new KDTree(points);
const queryPoint = [6, 3];
const radius = 3; // Search for points within radius 3 of [6, 3]

Output:
[
  [5, 4],
  [7, 2],
  [8, 1]
]
(The order of the returned points may vary depending on traversal order.)

Explanation:
- Distance([6, 3], [5, 4]) = sqrt((6-5)^2 + (3-4)^2) = sqrt(1^2 + (-1)^2) = sqrt(2) ≈ 1.414 (within radius 3)
- Distance([6, 3], [7, 2]) = sqrt((6-7)^2 + (3-2)^2) = sqrt((-1)^2 + 1^2) = sqrt(2) ≈ 1.414 (within radius 3)
- Distance([6, 3], [8, 1]) = sqrt((6-8)^2 + (3-1)^2) = sqrt((-2)^2 + 2^2) = sqrt(8) ≈ 2.828 (within radius 3)
- Distance([6, 3], [2, 3]) = sqrt((6-2)^2 + (3-3)^2) = sqrt(4^2 + 0^2) = sqrt(16) = 4 (outside radius 3)
- The remaining points, [4, 7] (sqrt(20) ≈ 4.472) and [9, 6] (sqrt(18) ≈ 4.243), are also outside the radius.

Example 4: Handling Empty Input

Input:
const points = [];
const kdTree = new KDTree(points);
const queryPoint = [1, 1];

Output for findNearest:
null

Output for findRange:
[]

Explanation:
An empty tree contains no points, so nearest neighbor search should return null, and range search should return an empty array.

Constraints

  • The number of points N will be between 0 and 100,000.
  • The dimensionality of points k will be between 1 and 10.
  • Coordinates of points will be integers between -1,000,000 and 1,000,000.
  • The radius for range search will be a non-negative number.
  • Your findNearest implementation should aim for an average time complexity better than O(N) (ideally O(log N) on average).
  • Your findRange implementation should aim for an average time complexity better than O(N) (ideally O(sqrt(N) + M) for 2D, where M is the number of points in range).

Notes

  • Median Finding: For building the tree, you'll need an efficient way to find the median of a sub-array along a specific dimension. You can use sorting or a selection algorithm (like Quickselect) for this. Sorting each time will be simpler but less performant.
  • Distance Calculation: You'll need a helper function to calculate the Euclidean distance between two points. For comparison purposes, you can often work with squared distances to avoid the computationally more expensive square root operation.
  • Nearest Neighbor Search Algorithm: The standard k-d tree nearest neighbor search involves traversing the tree, pruning branches that cannot possibly contain a closer point than the current best. This pruning is key to achieving logarithmic average-case performance. A sketch of this search appears after these notes.
  • Range Search Algorithm: Range search can be pruned in a similar way: skip any subtree whose splitting plane lies farther from the query point than the search radius (also sketched below).
  • Immutability (Optional but good practice): Consider if your tree building process should be immutable (i.e., not modify the original input points array). For this challenge, modifying it in place to sort might be acceptable for simplicity.
  • Error Handling: Be mindful of potential errors, such as points with inconsistent dimensions.
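
The sketch below shows how the squared-distance trick and the nearest-neighbor pruning rule described above might look in practice. It operates on the Node structure from the construction sketch in the requirements section; the function names (squaredDistance, nearest, rangeSearch) are illustrative, and wrapping them as findNearest / findRange methods on KDTree is straightforward.

// Squared Euclidean distance -- sufficient for comparisons, avoids Math.sqrt.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Nearest-neighbor search: returns the stored point closest to queryPoint,
// or null when the (sub)tree is empty.
function nearest(node, queryPoint, best = null) {
  if (node === null) return best;

  if (best === null ||
      squaredDistance(queryPoint, node.point) < squaredDistance(queryPoint, best)) {
    best = node.point;
  }

  const diff = queryPoint[node.dimension] - node.point[node.dimension];
  const nearSide = diff < 0 ? node.left : node.right;  // subtree containing the query point
  const farSide  = diff < 0 ? node.right : node.left;

  best = nearest(nearSide, queryPoint, best);

  // Visit the far side only if the splitting plane is closer than the current best point;
  // otherwise that whole subtree can be pruned.
  if (diff * diff < squaredDistance(queryPoint, best)) {
    best = nearest(farSide, queryPoint, best);
  }
  return best;
}

Range search follows the same idea, except that a subtree is pruned only when the splitting plane lies farther from the query point than the search radius:

// Range search: collects every stored point within `radius` of queryPoint.
function rangeSearch(node, queryPoint, radius, result = []) {
  if (node === null) return result;

  if (squaredDistance(queryPoint, node.point) <= radius * radius) {
    result.push(node.point);
  }

  const diff = queryPoint[node.dimension] - node.point[node.dimension];
  if (diff <= radius) rangeSearch(node.left, queryPoint, radius, result);    // left side may still contain hits
  if (diff >= -radius) rangeSearch(node.right, queryPoint, radius, result);  // right side may still contain hits
  return result;
}

Wrapped as methods, nearest(this.root, q) returning null on an empty tree and rangeSearch(this.root, q, r) returning an empty array match the behavior shown in Example 4.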