Using LimitedSearch, it only checks for overlapping matches in one
place instead of checking at each byte.
This gains about 50% in compression speed while only losing about
2% in compression ratio.
I've been experimenting for a while with a new brotli compressor.
Instead of being a translation of the C implementation,
it's a rewrite in Go, with a modular structure thanks to interfaces.
(A few low-level functions still come from the C version, though.)
The performance is getting to the point where it seems to be worth
adding to the brotli repository.