mirror of https://github.com/tidwall/tile38.git
2a4272c95f
The current kNN implementation has two areas that can be improved:

- The current behavior is somewhat incorrect. When performing a kNN query, the current code fetches k items from the index and then sorts these items by Haversine distance. The problem with this approach is that the items fetched from the index are ordered by a Euclidean metric, so there is no guarantee that item k + 1 is not closer than item k in great-circle distance. Incorrect results can therefore be returned when closer items exist beyond the first k.
- The secondary sort is a performance killer. It requires buffering all k items (again; they were already run through a priority queue in the index) and then sorting them. Since the items are mostly sorted, and Go's sort implementation is a quicksort, this is the worst case for the sort algorithm.

Both of these can be fixed by applying a proper distance metric in the index nearby operation. In addition, this cleans up the code considerably, removing a number of special cases that applied only to NEARBY operations.

This change implements a geodetic distance metric that ensures that the order from the index is correct, eliminating the need for the secondary sort and the special filtering cases in the ScanWriter code.

||
---|---|---|
aof.go | ||
aofmigrate.go | ||
aofshrink.go | ||
atomic.go | ||
atomic_test.go | ||
bson.go | ||
checksum.go | ||
client.go | ||
config.go | ||
crud.go | ||
dev.go | ||
expire.go | ||
expression.go | ||
fence.go | ||
follow.go | ||
hooks.go | ||
json.go | ||
json_test.go | ||
keys.go | ||
live.go | ||
output.go | ||
pubsub.go | ||
readonly.go | ||
respconn.go | ||
scan.go | ||
scanner.go | ||
scripts.go | ||
search.go | ||
server.go | ||
stats.go | ||
stats_cpu.go | ||
stats_cpu_darlin.go | ||
test.go | ||
token.go | ||
token_test.go |
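
The ordering problem described in the commit message can be illustrated with a small standalone sketch. It is not tile38 code: the `point` type, the helper names, and the brute-force sort standing in for the index are all assumptions made for illustration. It compares a Euclidean metric on raw degrees with a Haversine metric (assuming a spherical Earth of radius 6371 km) and shows that the two can disagree about which item is nearest, so picking k items by the first metric and re-sorting by the second can miss a closer item.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// earthRadiusMeters is an assumed mean radius for a spherical Earth model.
const earthRadiusMeters = 6371e3

// point is a hypothetical type for this sketch, not a tile38 type.
type point struct {
	id       string
	lat, lon float64 // degrees
}

// euclideanDegrees treats latitude and longitude as flat Cartesian
// coordinates, which ignores how meridians converge toward the poles.
func euclideanDegrees(a, b point) float64 {
	return math.Hypot(a.lat-b.lat, a.lon-b.lon)
}

// haversineMeters returns the great-circle distance between a and b.
func haversineMeters(a, b point) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(b.lat - a.lat)
	dLon := toRad(b.lon - a.lon)
	s := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(a.lat))*math.Cos(toRad(b.lat))*
			math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusMeters * math.Asin(math.Sqrt(s))
}

// nearest returns the k points closest to target under dist.
// A full sort stands in for the index's priority queue here.
func nearest(target point, pts []point, k int, dist func(a, b point) float64) []point {
	out := append([]point(nil), pts...) // copy so the input is not reordered
	sort.SliceStable(out, func(i, j int) bool {
		return dist(target, out[i]) < dist(target, out[j])
	})
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	target := point{id: "target", lat: 60, lon: 0}
	pts := []point{
		{id: "east", lat: 60, lon: 2},    // 2.0 degrees away, about 111 km
		{id: "north", lat: 61.5, lon: 0}, // 1.5 degrees away, about 167 km
	}

	fmt.Println(nearest(target, pts, 1, euclideanDegrees)[0].id) // prints "north"
	fmt.Println(nearest(target, pts, 1, haversineMeters)[0].id)  // prints "east"
}
```

With k = 1, the Euclidean ordering returns `north` even though `east` is closer on the sphere, and no re-sort of that single result can recover `east`. Ordering by a geodetic metric at the point of selection avoids both the wrong answer and the second sort.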