Rollup merge of #69050 - nnethercote:micro-optimize-leb128, r=michaelwoerister
Micro-optimize the heck out of LEB128 reading and writing.
This commit makes the following writing improvements:
- Removes the unnecessary `write_to_vec` function.
- Reduces the number of conditions per loop from 2 to 1.
- Avoids a mask and a shift on the final byte.
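
As a rough illustration of the writing changes, here is a minimal sketch
of an unsigned LEB128 writer with one condition per iteration. The name
`write_u64_leb128` and the `Vec<u8>` output are assumptions made for this
example; they are not taken from the actual patch.

```rust
/// Hypothetical helper for illustration; not the function from the patch.
/// Appends `value` to `out` in unsigned LEB128 form.
fn write_u64_leb128(out: &mut Vec<u8>, mut value: u64) {
    loop {
        if value < 0x80 {
            // Final byte: the continuation bit is already clear, so the
            // value can be emitted without a mask or a trailing shift.
            out.push(value as u8);
            break;
        } else {
            // Non-final byte: low 7 bits plus the continuation bit.
            out.push(((value & 0x7f) | 0x80) as u8);
            value >>= 7;
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    write_u64_leb128(&mut buf, 300);
    println!("{:02x?}", buf);
}
```

The single `value < 0x80` test both detects the last byte and ends the
loop, which is where the "2 conditions to 1" saving would come from.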
And the following reading improvements:
- Removes an unnecessary type annotation.
- Fixes a dangerous unchecked slice access. Given a slice such as
`[0x80]`, the current code reads some number of bytes past the end of
the slice. The bounds check at the end subsequently triggers, unless
something bad (like a crash) happens first. The cost of doing the
bounds check in the loop body is negligible.
- Avoids a mask on the final byte.
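
The reading side can be sketched along the same lines. This is a hedged
example, not the code from the patch: the name `read_u64_leb128`, the
`Option` return type, and the use of `checked_shl` to reject over-long
input are all assumptions made so the sketch is safe to run on arbitrary
bytes.

```rust
/// Hypothetical helper for illustration; not the function from the patch.
/// Returns the decoded value and the number of bytes consumed, or `None`
/// if the input ends early or is too long to fit in a `u64`.
fn read_u64_leb128(slice: &[u8]) -> Option<(u64, usize)> {
    let mut result = 0u64;
    let mut shift = 0u32;
    let mut position = 0usize;
    loop {
        // Checked access inside the loop: a truncated input such as
        // `[0x80]` yields `None` instead of reading past the end.
        let byte = *slice.get(position)?;
        position += 1;
        if byte & 0x80 == 0 {
            // Final byte: the high bit is clear, so no mask is needed.
            result |= u64::from(byte).checked_shl(shift)?;
            return Some((result, position));
        }
        // Non-final byte: strip the continuation bit before shifting.
        result |= u64::from(byte & 0x7f).checked_shl(shift)?;
        shift += 7;
    }
}

fn main() {
    println!("{:?}", read_u64_leb128(&[0xe5, 0x8e, 0x26]));
    println!("{:?}", read_u64_leb128(&[0x80]));
}
```

A plain indexing expression (`slice[position]`) would give the same
per-iteration bounds check and panic on truncated input; `get` is used
here only so the failure case can be observed as a value.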
And the following improvements for both reading and writing:
- Changes `for` to `loop` for the loops, avoiding an unnecessary
condition on each iteration. This also removes the need for
`leb128_size`.
All of these changes give significant perf wins, up to 5%.