This commit changes the logic for managing the expiration of
objects in the database.
Before: There was a server-wide hashmap that stored the
collection key, id, and expiration timestamp for all objects
that had a TTL. The hashmap was occasionally probed at 20
random positions, looking for objects that have expired. Those
expired objects were immediately deleted, and if there was 5
or more objects deleted, then the probe happened again, with
no delay. If the number of objects was less than 5 then the
there was a 1/10th of a second delay before the next probe.
Now: Rather than a server-wide hashmap, each collection has
its own ordered priority queue that stores objects with TTLs.
Rather than probing, there is a background routine that
executes every 1/10th of a second, which pops the expired
objects from the collection queues, and deletes them.
The collection/queue method is a more stable approach than
the hashmap/probing method. With probing, we can run into
major cache misses for some cases where there is wide
TTL duration, such as in the hours or days. This may cause
the system to occasionally fall behind, leaving should-be
expired objects in memory. Using a queue, there is no
cache misses, all objects that should be expired will be
right away, regardless of the TTL durations.
Fixes#616
This commit includes updates that affects the build, testing, and
deployment of Tile38.
- The root level build.sh has been broken up into multiple scripts
and placed in the "scripts" directory.
- The vendor directory has been updated to follow the Go modules
rules, thus `make` should work on isolated environments. Also
some vendored packages may have been updated to a later
version, if needed.
- The Makefile has been updated to allow for making single
binaries such as `make tile38-server`. There is some scaffolding
during the build process, so from now on all binaries should be
made using make. For example, to run a development version of
the tile38-cli binary, do this:
make tile38-cli && ./tile38-cli
not this:
go run cmd/tile38-cli/main.go
- Travis.CI docker push script has been updated to address a
change to Docker's JSON repo meta output, which in turn fixes
a bug where new Tile38 versions were not being properly pushed
to Docker
This commit fixes an issue where Tile38 was using lots of extra
memory to track objects that are marked to expire. This was
creating problems with applications that set big TTLs.
How it worked before:
Every collection had a unique hashmap that stores expiration
timestamps for every object in that collection. Along with
the hashmaps, there's also one big server-wide list that gets
appended every time a new SET+EX is performed.
From a background routine, this list is looped over at least
10 times per second and is randomly searched for potential
candidates that might need expiring. The routine then removes
those entries from the list and tests if the objects matching
the entries have actually expired. If so, these objects are
deleted them from the database. When at least 25% of
the 20 candidates are deleted the loop is immediately
continued, otherwise the loop backs off with a 100ms pause.
Why this was a problem.
The list grows one entry for every SET+EX. When TTLs are long,
like 24-hours or more, it would take at least that much time
before the entry is removed. So for databased that have objects
that use TTLs and are updated often this could lead to a very
large list.
How it was fixed.
The list was removed and the hashmap is now search randomly. This
required a new hashmap implementation, as the built-in Go map
does not provide an operation for randomly geting entries. The
chosen implementation is a robinhood-hash because it provides
open-addressing, which makes for simple random bucket selections.
Issue #502