unordered/doc/modules/ROOT/pages/rationale.adoc

143 lines
7.4 KiB
Plaintext

[#rationale]
:idprefix: rationale_
= Implementation Rationale
== Closed-addressing Containers
`boost::unordered_[multi]set` and `boost::unordered_[multi]map`
adhere to the standard requirements for unordered associative
containers, so the interface was fixed. But there are
still some implementation decisions to make. The priorities are
conformance to the standard and portability.
The http://en.wikipedia.org/wiki/Hash_table[Wikipedia article on hash tables^]
has a good summary of the implementation issues for hash tables in general.
=== Data Structure
By specifying an interface for accessing the buckets of the container the
standard pretty much requires that the hash table uses closed addressing.
It would be conceivable to write a hash table that uses another method. For
example, it could use open addressing, and use the lookup chain to act as a
bucket but there are some serious problems with this:
* The standard requires that pointers to elements aren't invalidated, so
the elements can't be stored in one array, but will need a layer of
indirection instead - losing the efficiency and most of the memory gain,
the main advantages of open addressing.
* Local iterators would be very inefficient and may not be able to
meet the complexity requirements.
* There are also the restrictions on when iterators can be invalidated. Since
open addressing degrades badly when there are a high number of collisions the
restrictions could prevent a rehash when it's really needed. The maximum load
factor could be set to a fairly low value to work around this - but the
standard requires that it is initially set to 1.0.
* And since the standard is written with a eye towards closed
addressing, users will be surprised if the performance doesn't reflect that.
So closed addressing is used.
=== Number of Buckets
There are two popular methods for choosing the number of buckets in a hash
table. One is to have a prime number of buckets, another is to use a power
of 2.
Using a prime number of buckets, and choosing a bucket by using the modulus
of the hash function's result will usually give a good result. The downside
is that the required modulus operation is fairly expensive. This is what the
containers used to do in most cases.
Using a power of 2 allows for much quicker selection of the bucket to use,
but at the expense of losing the upper bits of the hash value. For some
specially designed hash functions it is possible to do this and still get a
good result but as the containers can take arbitrary hash functions this can't
be relied on.
To avoid this a transformation could be applied to the hash function, for an
example see
http://web.archive.org/web/20121102023700/http://www.concentric.net/~Ttwang/tech/inthash.htm[Thomas Wang's article on integer hash functions^].
Unfortunately, a transformation like Wang's requires knowledge of the number
of bits in the hash value, so it was only used when `size_t` was 64 bit.
Since release 1.79.0, https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing[Fibonacci hashing]
is used instead. With this implementation, the bucket number is determined
by using `(h * m) >> (w - k)`, where `h` is the hash value, `m` is `2^w` divided
by the golden ratio, `w` is the word size (32 or 64), and `2^k` is the
number of buckets. This provides a good compromise between speed and
distribution.
Since release 1.80.0, prime numbers are chosen for the number of buckets in
tandem with sophisticated modulo arithmetic. This removes the need for "mixing"
the result of the user's hash function as was used for release 1.79.0.
== Open-addresing Containers
The C++ standard specification of unordered associative containers impose
severe limitations on permissible implementations, the most important being
that closed addressing is implicitly assumed. Slightly relaxing this specification
opens up the possibility of providing container variations taking full
advantage of open-addressing techniques.
The design of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` has been
guided by Peter Dimov's https://pdimov.github.io/articles/unordered_dev_plan.html[Development Plan for Boost.Unordered^].
We discuss here the most relevant principles.
=== Hash Function
Given its rich functionality and cross-platform interoperability,
`boost::hash` remains the default hash function of open-addressing containers.
As it happens, `boost::hash` for integral and other basic types does not possess
the statistical properties required by open addressing; to cope with this,
we implement a post-mixing stage:
{nbsp}{nbsp}{nbsp}{nbsp} _a_ <- _h_ *mul* _C_, +
{nbsp}{nbsp}{nbsp}{nbsp} _h_ <- *high*(_a_) *xor* *low*(_a_),
where *mul* is an _extended multiplication_ (128 bits in 64-bit architectures, 64 bits in 32-bit environments),
and *high* and *low* are the upper and lower halves of an extended word, respectively.
In 64-bit architectures, _C_ is the integer part of 2^64^&#8725;https://en.wikipedia.org/wiki/Golden_ratio[_&phi;_],
whereas in 32 bits _C_ = 0xE817FB2Du has been obtained from https://arxiv.org/abs/2001.05304[Steele and Vigna (2021)^].
When using a hash function directly suitable for open addressing, post-mixing can be opted out of via a dedicated `xref:reference/hash_traits.adoc#hash_traits_hash_is_avalanching[hash_is_avalanching]` trait.
`boost::hash` specializations for string types are marked as avalanching.
=== Platform Interoperability
The observable behavior of `boost::unordered_flat_set`/`unordered_node_set` and `boost::unordered_flat_map`/`unordered_node_map` is deterministically
identical across different compilers as long as their ``std::size_t``s are the same size and the user-provided
hash function and equality predicate are also interoperable
&#8212;this includes elements being ordered in exactly the same way for the same sequence of
operations.
Although the implementation internally uses SIMD technologies, such as https://en.wikipedia.org/wiki/SSE2[SSE2^]
and https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(NEON)[Neon^], when available,
this does not affect interoperatility. For instance, the behavior is the same
for Visual Studio on an x64-mode Intel CPU with SSE2 and for GCC on an IBM s390x without any supported SIMD technology.
== Concurrent Containers
The same data structure used by Boost.Unordered open-addressing containers has been chosen
also as the foundation of `boost::concurrent_flat_set`/`boost::concurrent_node_set` and
`boost::concurrent_flat_map`/`boost::concurrent_node_map`:
* Open-addressing is faster than closed-addressing alternatives, both in non-concurrent and
concurrent scenarios.
* Open-addressing layouts are eminently suitable for concurrent access and modification
with minimal locking. In particular, the metadata array can be used for implementations of
lookup that are lock-free up to the last step of actual element comparison.
* Layout compatibility with Boost.Unordered flat containers allows for
xref:concurrent.adoc#concurrent_interoperability_with_non_concurrent_containers[fast transfer]
of all elements between a concurrent container and its non-concurrent counterpart,
and vice versa.
=== Hash Function and Platform Interoperability
Concurrent containers make the same decisions and provide the same guarantees
as Boost.Unordered open-addressing containers with regards to
xref:#rationale_hash_function[hash function defaults] and
xref:#rationale_platform_interoperability[platform interoperability].