Commit Graph

87 Commits

Author SHA1 Message Date
Andy Balholm 57434b5091 Encoder: check for empty block
Fixes #51
2024-07-29 09:56:04 -07:00
Andy Balholm 97e8583d85 matchfinder.M4: some refinements to scoring 2024-01-24 16:11:21 -08:00
Andy Balholm 17e5901d05 Make my matchfinder work more accessible. 2024-01-11 17:31:05 -08:00
Andy Balholm cf812c06f8 matchfinder: add M0
M0 is a MatchFinder based on the algorithm for brotli level 0.
2024-01-11 16:00:40 -08:00
Andy Balholm 1b6cf3696e matchfinder: remove MultiHash
It was an interesting experiment, but it didn't
do any better than M4.
2024-01-09 06:29:08 -08:00
Andy Balholm 265f3afc2a matchfinder: penalize score for overlapping matches 2024-01-09 06:03:56 -08:00
Andy Balholm a8d524a96d matchfinder: replace Score function with DistanceBitCost 2024-01-09 05:40:40 -08:00
Andy Balholm 578645e154 matchfinder: add MultiHash 2024-01-09 05:06:39 -08:00
Andy Balholm 24b2bfad2d matchfinder.M4: add Score function 2024-01-02 13:38:12 -08:00
Andy Balholm 4a024e3eff matchfinder.M4: add match chain 2024-01-01 16:13:22 -08:00
Andy Balholm 3a1c5cd370 Fix typo in comment. 2024-01-01 15:06:21 -08:00
Andy Balholm 0d2aef37af matchfinder.M4: factor out extendMatch2 2023-12-30 16:25:51 -08:00
Andy Balholm 63f3f4372d matchfinder.M4: add LimitedSearch option
Using LimitedSearch, it only checks for overlapping matches in one
place instead of checking at each byte.
This gains about 50% in compression speed while only losing about
2% in compression ratio.
2023-12-30 15:56:13 -08:00
Andy Balholm 924a0eb0c6 matchfinder.M4: more refactoring
Factor out matchEmitter.trim, and make TableBits configurable.
2023-12-28 17:21:34 -08:00
Andy Balholm c506503c67 matchfinder: factor out matchEmitter 2023-12-28 17:01:08 -08:00
Andy Balholm 349ed2fce1 Add matchfinder package.
I've been experimenting for a while with a new brotli compressor.
Instead of being a translation of the C implementation,
it's a rewrite in Go, with a modular structure thanks to interfaces.
(A few low-level functions still come from the C version, though.)

The performance is getting to the point where it seems to be worth
adding to the brotli repository.
2023-12-28 16:09:32 -08:00
Jay Wren b7a4cf9ec5 remove Content-Type requirement
* https://github.com/golang/go/issues/31753 is fixed
2023-02-25 11:50:08 -05:00
Andy Balholm 2848168f55 Reader.Reset: recover from errors.
When a Reader encounters an error, its internal state may be corrupted.
So Reset needs to be extra thorough, to avoid cascading errors.

Based on PR 37 by Sovianum.
2022-09-23 19:39:05 -07:00
Andy Balholm 786ec621f6 Reuse ringbuffer in Reader.
Fixes #33
2022-05-18 12:06:45 -07:00
Matt Dainty f001d275a3 Fix Reader.Reset() docs 2022-05-12 18:31:16 +01:00
Andy Balholm 34a5640cc1 Add an example for Writer.Reset.
It is based on the one for flate.Writer.Reset.

Fixes #31.
2022-05-03 10:55:39 -07:00
Andy Balholm 1d750214c2 Optimize log2FloorNonZero with math/bits. 2021-09-22 11:21:01 -07:00
zuiwuchang ec682fbe0a Rest set writer.err=nil 2021-08-25 10:29:47 +08:00
Andy Balholm cf8bc3b664 More staticcheck advice. 2021-08-18 18:04:51 -07:00
Andy Balholm a61eb82231 Follow some advice from staticcheck. 2021-08-12 12:37:09 -07:00
Andy Balholm e073f0d4ed Merge branch 'issue22' 2021-07-15 12:01:52 -07:00
Andy Balholm 177b8acd6c Add test for issue 22. 2021-07-15 12:01:31 -07:00
Andy Balholm 5376c15dde Retract v1.0.1 2021-05-26 15:10:20 -07:00
Andy Balholm 94609f9606 Revert "Faster bit writing."
This reverts commit c3da72aa01.

With the sample data from issue 22, one byte in the output file is zero
instead of the correct value. For now at least, we'll go back to the old
way of writing bits.

Fixes #22
2021-04-27 10:48:28 -07:00
Andy Balholm 47c0dbab12 Simplify control flow in Reader.Reset. 2021-03-01 09:45:40 -08:00
Mike Faraponov 87d8f4575c Reduce allocations of buffer for reused readers 2021-03-01 09:42:42 -08:00
Andy Balholm 729edfbcfe Add documentation link.
Fixes #16
2020-08-04 09:53:43 -07:00
oguzyildiz1991 1b06c5640c Check size of split.lengths instead of a nil check
Fixes https://github.com/andybalholm/brotli/issues/14
2020-07-17 09:24:47 -07:00
Andy Balholm c3da72aa01 Faster bit writing.
Replace the functions in write_bits.go with a bitWriter type based on
the compress/flate package.
2020-06-18 18:58:27 -07:00
Andy Balholm ef7a42160d Use a 64-bit store in writeBits.
This is an optimization that was present in the C version (behind an
ifdef). It gives a nice speed boost to the lower compression levels.
2020-06-06 14:13:21 -07:00
Andy Balholm 097c1c5bc9 Use 32-bit loads in isMatch1 and isMatch5. 2020-05-15 11:17:23 -07:00
Andy Balholm 8f8b18645c Read multiple bytes in findMatchLengthWithLimit
Use 64- or 32-bit loads instead of reading a byte at a time. The
original C source did something like this, in a very C-ish way. It
needed to be simplified to translate it to Go. The exact way this works
was suggested by the assembly code in github.com/golang/snappy.
2020-05-15 10:43:19 -07:00
Erik Dubbelboer a01a7b12c9 Reuse buffers and objects using sync.Pool
This reduces the amount of garbage generated and relieves pressure on
the GC.

For a workload without reusing the Writer (using Writer.Reset) the number of allocations goes from 31 to 9.
While for a workload when you reuse the Writer the number of allocations goes from 25 to 0.
2020-05-10 10:36:19 +02:00
Andy Balholm e2c5f2109f Use len and cap instead of num_commands_ and cmd_alloc_size_. 2020-05-08 16:48:16 -07:00
Andy Balholm 4b2775ea5e Fix some ugly compound literals. 2020-05-08 15:14:47 -07:00
Andy Balholm b2497e8d72 Revert "Use sort.Sort to sort Huffman trees."
This reverts commit 6b5963335e.

It doesn't really have the performance benefit I thought it did.
2020-05-08 13:52:56 -07:00
Andy Balholm 6b5963335e Use sort.Sort to sort Huffman trees. 2020-05-07 17:51:21 -07:00
Andy Balholm 7c7a5a10ef Push output directly to dst instead of buffering. 2020-05-07 17:27:37 -07:00
Andy Balholm 625cbb6f92 Replace storage_size_ with len(storage). 2020-05-07 15:40:58 -07:00
Andy Balholm f41712f811 Remove unnecessary parameters from encodeData.
The pointers passed to out_size and output were always the same,
so there is no need to have them as parameters.

Based on 00ca10b927 and b3ee528567
2020-05-06 17:20:27 -07:00
Andy Balholm 511ca97d30 Benchmark all compression levels. 2020-05-06 16:08:01 -07:00
Andy Balholm 2c14228f02 Preserve w.commands across Reset. 2020-05-05 17:36:16 -07:00
Andy Balholm cb9be97eb7 Reuse more memory when a Writer is Reset. 2020-05-05 17:18:33 -07:00
Andy Balholm 00ca370ce2 Put full license in bench_test.go 2020-05-05 16:14:45 -07:00
Andy Balholm 3c3658f2fb A couple of tweaks 2020-05-04 16:09:27 -07:00
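
The message for commit 8f8b18645c describes comparing several bytes at a time with 64- or 32-bit loads instead of reading one byte at a time. The following is a minimal sketch of that general idea, not the code from the repository: the function name matchLen, its signature, and the use of encoding/binary for the loads are assumptions made for illustration. It XORs two 8-byte little-endian words and counts trailing zero bits to locate the first differing byte.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// matchLen is a hypothetical helper (not the library's function). It returns
// the length of the common prefix of a and b, capped at limit bytes.
func matchLen(a, b []byte, limit int) int {
	if limit > len(a) {
		limit = len(a)
	}
	if limit > len(b) {
		limit = len(b)
	}
	n := 0
	// Compare 8 bytes per iteration while a full word remains.
	for n+8 <= limit {
		x := binary.LittleEndian.Uint64(a[n:]) ^ binary.LittleEndian.Uint64(b[n:])
		if x != 0 {
			// In a little-endian word the first byte occupies the lowest
			// bits, so trailing zero bits / 8 is the count of equal bytes.
			return n + bits.TrailingZeros64(x)/8
		}
		n += 8
	}
	// Finish the remaining tail one byte at a time.
	for n < limit && a[n] == b[n] {
		n++
	}
	return n
}

func main() {
	fmt.Println(matchLen([]byte("abcdefghijXYZ"), []byte("abcdefghijKLM"), 13))
	fmt.Println(matchLen([]byte("hello world!"), []byte("help, world!"), 12))
}
```

The byte-wise tail loop keeps the sketch correct for inputs shorter than one word; a tuned implementation might add a 32-bit step before it, as the commit message suggests.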