I got stung by this one after processing quite a bit of data. When doing a nearest neighbor search, I have been leveraging the GiST index functionality in PostGIS. The documentation describes how to take advantage of these indexes by using the && operator to first find overlapping bounding boxes and then do the compute intensive calculation on the smaller subset of matched features. However, there is a condition in which the overlapping bounding box operator does not return the nearest features. Perhaps this is well know, but I got hit by it.

Consider the case of searching for the nearest road from a point on a map. A natural way of performing this search is to expand a bounding box around the point and use the && operator to select the roads that intersect that bounding box. Then, the distance to each of those roads in the returned subset is computed and the minimum distance is returned. Observe the following scenario:

The tiny square in the upper right (just below and to the right of the rectangle) is the bbox of the point from which we wish to find the nearest road. The yellow rectangle is the bbox of the nearest road. The large transparent blue rectangle is a bbox of the next nearest road. So the only overlapping bbox for the point is the large rectangle. Thus, the && operator does not find the nearest road and our calculation is wrong.

The solution I have come up with so far is to use the bbox operator, compute the nearest distance, and use a new bbox expanded around the point using the just computed distance. This operation will find any overlapping bbox within that range and will come up with the correct nearest road. I don’t like this solution as it requires two && searches and multiple distance computations – not very optimal.

Share this post:Follow CCRi: