Use __builtin_clear_padding and __builtin_zero_non_value_bits that were
introduced in gcc 11 and MSVC 19.27 to clear the padding bits in atomic
types. The intrinsics are used in bitwise_cast and atomic reference
constructors.
Also, separated atomic impl specializations for enums to allow using
static_cast to convert values to storage. This in turn allows to
relax compiler requirements to mark atomic constructors constexpr.
Updated docs and added tests for structs with padding and constexpr
atomic constructors.
The generic implementation is based on the lock pool. A list of condition
variables (or waiting futexes) is added per lock. Basically, the lock
pool serves as a global hash table, where each lock represents
a bucket and each wait state is an element. Every wait operation
allocates a wait state keyed on the pointer to the atomic object. Notify
operations look up the wait state by the atomic pointer and notify
the condition variable/futex. The corresponding lock needs to be acquired
to protect the wait state list during all wait/notify operations.
Backends not involving the lock pool are going to be added later.
The implementation of wait operation extends the C++20 definition in that
it returns the newly loaded value instead of void. This allows the caller
to avoid loading the value himself.
The waiting/notifying operations are not address-free. Address-free variants
will be added later.
Added tests for the new operations and refactored existing tests for atomic
operations. Added docs for the new operations.