@jongiddy noticed bad performance due to the lack of inlining on `then`
and `then_with`. I confirmed that inlining really is the culprit by
creating a custom `then` function and repeating his benchmark on my
machine with and without the `#[inline]` attribute.
The numbers were exactly the same on my machine without the attribute.
With `#[inline]` I got the same performance as I did with manually
inlined implementation.