This stores the stack of iterators inline (we have a maximum depth with
`uint` keys), and then uses direct pointer offsetting to manipulate it,
in a blazing fast way:
Before:
bench_iter_large ... bench: 43187 ns/iter (+/- 3082)
bench_iter_small ... bench: 618 ns/iter (+/- 288)
After:
bench_iter_large ... bench: 13497 ns/iter (+/- 1575)
bench_iter_small ... bench: 220 ns/iter (+/- 91)
Also, removes `.each_{key,value}_reverse` as an offering to
placate the gods of external iterators for my heinous sin of
attempting to add new internal ones (in a previous version of this
PR).