Now that it has its own header, various reporter TUs that want to
format text do not have to also include Clara. Together with
outlining implementations from a header into a separate TU, this
has noticeably improved the compilation times of the testing impl.
As part of this split, I also implemented some improvements to the
TextFlow code in comparison to the upstream code. These are:
* Replaced the `Spacer` type with a free function that constructs
special `Column` that does the same thing.
* Generic performance improvements, such as eliminating needless
allocations, reserving space in needed allocations, and using smarter
algorithms in some places.
* Because `Column` only ever stored 1 string in its vector, it now
holds the string directly instead.