Each MATCH is inclusive OR, thus
WITHIN fleet MATCH train* truck* BOUNDS 33 -112 34 -113
will find all trains and trucks that within the provides bounds.
This commit allows for buffering any GeoJSON object.
For example:
INTERSECTS fleet BUFFER 1000 OBJECT {...LineString...}
This will buffer add a 1 kilometer buffer to a linesting and
search the 'fleet' collection for all objects that
intersect the buffered linestring.
This commit also allows for performing INTERSECTS with a POINT
type. Thus allowing for a polygon-over-point operation, which is
an inverted point-in-polygon.
This commit changes the logic for managing the expiration of
objects in the database.
Before: There was a server-wide hashmap that stored the
collection key, id, and expiration timestamp for all objects
that had a TTL. The hashmap was occasionally probed at 20
random positions, looking for objects that have expired. Those
expired objects were immediately deleted, and if there was 5
or more objects deleted, then the probe happened again, with
no delay. If the number of objects was less than 5 then the
there was a 1/10th of a second delay before the next probe.
Now: Rather than a server-wide hashmap, each collection has
its own ordered priority queue that stores objects with TTLs.
Rather than probing, there is a background routine that
executes every 1/10th of a second, which pops the expired
objects from the collection queues, and deletes them.
The collection/queue method is a more stable approach than
the hashmap/probing method. With probing, we can run into
major cache misses for some cases where there is wide
TTL duration, such as in the hours or days. This may cause
the system to occasionally fall behind, leaving should-be
expired objects in memory. Using a queue, there is no
cache misses, all objects that should be expired will be
right away, regardless of the TTL durations.
Fixes#616
The current KNN implementation has two areas that can be improved:
- The current behavior is somewhat incorrect. When performing a kNN
query, the current code fetches k items from the index, and then sorts
these items according to Haversine distance. The problem with this
approach is that since the items fetched from the index are ordered by
a Euclidean metric, there is no guarantee that item k + 1 is not closer
than item k in great circle distance, and hence incorrect results can be
returned when closer items beyond k exist.
- The secondary sort is a performance killer. This requires buffering
all k items (again...they were already run through a priority queue in)
the index, and then a sort. Since the items are mostly sorted, and
Go's sort implementation is a quickSort this is the worst case for the
sort algorithm.
Both of these can be fixed by applying a proper distance metric in
the index nearby operation. In addition, this cleans up the code
considerably, removing a number of special cases that applied only
to NEARBY operations.
This change implements a geodetic distance metric that ensures that
the order from the index is correct, eliminating the need for the
secondary sort and special filtering cases in the ScanWriter code.
This commit fixes a case where a roaming geofence will not fire
a "faraway" event when it's supposed to.
The fix required rewriting the nearby/faraway detection logic. It
is now much more accurate and takes overall less memory, but it's
also a little slower per operation because each object proximity
is checked twice per update. Once to compare the old object's
surrounding, and once to evaulated the new object. The two lists
are then used to generate accurate "nearby" and "faraway" results.