Use 64- or 32-bit loads instead of reading a byte at a time. The
original C source did something like this, in a very C-ish way. It
needed to be simplified to translate it to Go. The exact way this works
was suggested by the assembly code in github.com/golang/snappy.