Initial commit

[SVN r7646]
This commit is contained in:
Beman Dawes 2000-07-27 14:46:23 +00:00
parent b231894f1b
commit ba62287576

489
c++_type_traits.htm Normal file
View File

@ -0,0 +1,489 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title>C++ Type traits</title>
</head>
<body bgcolor="#FFFFFF" link="#0000FF" vlink="#800080">
<h2 align="center">C++ Type traits</h2>
<p align="center"><em>by John Maddock and Steve Cleary</em></p>
<p align="center"><em>This is a draft of an article that will appear in a future
issue of </em><a href="http://www.ddj.com"><em>Dr Dobb's Journal</em></a></p>
<p>Generic programming (writing code which works with any data type meeting a
set of requirements) has become the method of choice for providing reusable
code. However, there are times in generic programming when &quot;generic&quot;
just isn't good enough - sometimes the differences between types are too large
for an efficient generic implementation. This is when the traits technique
becomes important - by encapsulating those properties that need to be considered
on a type by type basis inside a traits class, we can minimise the amount of
code that has to differ from one type to another, and maximise the amount of
generic code.</p>
<p>Consider an example: when working with character strings, one common
operation is to determine the length of a null terminated string. Clearly it's
possible to write generic code that can do this, but it turns out that there are
much more efficient methods available: for example, the C library functions <font size="2" face="Courier New">strlen</font>
and <font size="2" face="Courier New">wcslen</font> are usually written in
assembler, and with suitable hardware support can be considerably faster than a
generic version written in C++. The authors of the C++ standard library realised
this, and abstracted the properties of <font size="2" face="Courier New">char</font>
and <font size="2" face="Courier New">wchar_t</font> into the class <font size="2" face="Courier New">char_traits</font>.
Generic code that works with character strings can simply use <font size="2" face="Courier New">char_traits&lt;&gt;::length</font>
to determine the length of a null terminated string, safe in the knowledge that
specialisations of <font size="2" face="Courier New">char_traits</font> will use
the most appropriate method available to them.</p>
<h4>Type traits</h4>
<p>Class <font size="2" face="Courier New">char_traits</font> is a classic
example of a collection of type specific properties wrapped up in a single class
- what Nathan Myers termed a <i>baggage class</i>[1]. In the Boost type-traits
library, we[2] have written a set of very specific traits classes, each of which
encapsulate a single trait from the C++ type system; for example, is a type a
pointer or a reference type? Or does a type have a trivial constructor, or a
const-qualifier? The type-traits classes share a unified design: each class has
a single member <i>value</i>, a compile-time constant that is true if the type
has the specified property, and false otherwise. As we will show, these classes
can be used in generic programming to determine the properties of a given type
and introduce optimisations that are appropriate for that case.</p>
<p>The type-traits library also contains a set of classes that perform a
specific transformation on a type; for example, they can remove a top-level
const or volatile qualifier from a type. Each class that performs a
transformation defines a single typedef-member <i>type</i> that is the result of
the transformation. All of the type-traits classes are defined inside namespace <font size="2" face="Courier New">boost</font>;
for brevity, namespace-qualification is omitted in most of the code samples
given.</p>
<h4>Implementation</h4>
<p>There are far too many separate classes contained in the type-traits library
to give a full implementation here - see the source code in the Boost library
for the full details - however, most of the implementation is fairly repetitive
anyway, so here we will just give you a flavour for how some of the classes are
implemented. Beginning with possibly the simplest class in the library, is_void&lt;T&gt;
has a member <i>value</i> that is true only if T is void.</p>
<pre>template &lt;typename T&gt;
struct is_void
{ static const bool value = false; };
template &lt;&gt;
struct is_void&lt;void&gt;
{ static const bool value = true; };</pre>
<p>Here we define a primary version of the template class <font size="2" face="Courier New">is_void</font>,
and provide a full-specialisation when T is void. While full specialisation of a
template class is an important technique, sometimes we need a solution that is
halfway between a fully generic solution, and a full specialisation. This is
exactly the situation for which the standards committee defined partial
template-class specialisation. As an example, consider the class
boost::is_pointer&lt;T&gt;: here we needed a primary version that handles all
the cases where T is not a pointer, and a partial specialisation to handle all
the cases where T is a pointer:</p>
<pre>template &lt;typename T&gt;
struct is_pointer
{ static const bool value = false; };
template &lt;typename T&gt;
struct is_pointer&lt;T*&gt;
{ static const bool value = true; };</pre>
<p>The syntax for partial specialisation is somewhat arcane and could easily
occupy an article in its own right; like full specialisation, in order to write
a partial specialisation for a class, you must first declare the primary
template. The partial specialisation contains an extra &lt;&gt; after the
class name that contains the partial specialisation parameters; these define the
types that will bind to that partial specialisation rather than the default
template. The rules for what can appear in a partial specialisation are somewhat
convoluted, but as a rule of thumb if you can legally write two function
overloads of the form:</p>
<pre>void foo(T);
void foo(U);</pre>
<p>Then you can also write a partial specialisation of the form:</p>
<pre>template &lt;typename T&gt;
class c{ /*details*/ };
template &lt;typename T&gt;
class c&lt;U&gt;{ /*details*/ };</pre>
<p>This rule is by no means foolproof, but it is reasonably simple to remember
and close enough to the actual rule to be useful for everyday use.</p>
<p>As a more complex example of partial specialisation consider the class
remove_bounds&lt;T&gt;. This class defines a single typedef-member <i>type</i>
that is the same type as T but with any top-level array bounds removed; this is
an example of a traits class that performs a transformation on a type:</p>
<pre>template &lt;typename T&gt;
struct remove_bounds
{ typedef T type; };
template &lt;typename T, std::size_t N&gt;
struct remove_bounds&lt;T[N]&gt;
{ typedef T type; };</pre>
<p>The aim of remove_bounds is this: imagine a generic algorithm that is passed
an array type as a template parameter, <font size="2" face="Courier New">remove_bounds</font>
provides a means of determining the underlying type of the array. For example <code>remove_bounds&lt;int[4][5]&gt;::type</code>
would evaluate to the type <code>int[5]</code>. This example also shows that the
number of template parameters in a partial specialisation does not have to match
the number in the default template. However, the number of parameters that
appear after the class name do have to match the number and type of the
parameters in the default template.</p>
<h4>Optimised copy</h4>
<p>As an example of how the type traits classes can be used, consider the
standard library algorithm copy:</p>
<pre>template&lt;typename Iter1, typename Iter2&gt;
Iter2 copy(Iter1 first, Iter1 last, Iter2 out);</pre>
<p>Obviously, there's no problem writing a generic version of copy that works
for all iterator types Iter1 and Iter2; however, there are some circumstances
when the copy operation can best be performed by a call to <font size="2" face="Courier New">memcpy</font>.
In order to implement copy in terms of <font size="2" face="Courier New">memcpy</font>
all of the following conditions need to be met:</p>
<ul>
<li>Both of the iterator types Iter1 and Iter2 must be pointers.</li>
<li>Both Iter1 and Iter2 must point to the same type - excluding <font size="2" face="Courier New">const</font>
and <font size="2" face="Courier New">volatile</font>-qualifiers.</li>
<li>The type pointed to by Iter1 must have a trivial assignment operator.</li>
</ul>
<p>By trivial assignment operator we mean that the type is either a scalar
type[3] or:</p>
<ul>
<li>The type has no user defined assignment operator.</li>
<li>The type does not have any data members that are references.</li>
<li>All base classes, and all data member objects must have trivial assignment
operators.</li>
</ul>
<p>If all these conditions are met then a type can be copied using <font size="2" face="Courier New">memcpy</font>
rather than using a compiler generated assignment operator. The type-traits
library provides a class <i>has_trivial_assign</i>, such that <code>has_trivial_assign&lt;T&gt;::value</code>
is true only if T has a trivial assignment operator. This class &quot;just
works&quot; for scalar types, but has to be explicitly specialised for
class/struct types that also happen to have a trivial assignment operator. In
other words if <i>has_trivial_assign</i> gives the wrong answer, it will give
the &quot;safe&quot; wrong answer - that trivial assignment is not allowable.</p>
<p>The code for an optimised version of copy that uses <font size="2" face="Courier New">memcpy</font>
where appropriate is given in listing 1. The code begins by defining a template
class <i>copier</i>, that takes a single Boolean template parameter, and has a
static template member function <font size="2" face="Courier New">do_copy</font>
which performs the generic version of <font size="2">copy</font> (in other words
the &quot;slow but safe version&quot;). Following that there is a specialisation
for <i>copier&lt;true&gt;</i>: again this defines a static template member
function <font size="2" face="Courier New">do_copy</font>, but this version uses
memcpy to perform an &quot;optimised&quot; copy.</p>
<p>In order to complete the implementation, what we need now is a version of
copy, that calls <code>copier&lt;true&gt;::do_copy</code> if it is safe to use <font size="2" face="Courier New">memcpy</font>,
and otherwise calls <code>copier&lt;false&gt;::do_copy</code> to do a
&quot;generic&quot; copy. This is what the version in listing 1 does. To
understand how the code works look at the code for <font size="2" face="Courier New">copy</font>
and consider first the two typedefs <i>v1_t</i> and <i>v2_t</i>. These use <code>std::iterator_traits&lt;Iter1&gt;::value_type</code>
to determine what type the two iterators point to, and then feed the result into
another type-traits class <i>remove_cv</i> that removes the top-level
const-volatile-qualifiers: this will allow copy to compare the two types without
regard to const- or volatile-qualifiers. Next, <font size="2" face="Courier New">copy</font>
declares an enumerated value <i>can_opt</i> that will become the template
parameter to copier - declaring this here as a constant is really just a
convenience - the value could be passed directly to class <font size="2" face="Courier New">copier</font>.
The value of <i>can_opt</i> is computed by verifying that all of the following
are true:</p>
<ul>
<li>first that the two iterators point to the same type by using a type-traits
class <i>is_same</i>.</li>
<li>Then that both iterators are real pointers - using the class <i>is_pointer</i>
described above.</li>
<li>Finally that the pointed-to types have a trivial assignment operator using
<i>has_trivial_assign</i>.</li>
</ul>
<p>Finally we can use the value of <i>can_opt</i> as the template argument to
copier - this version of copy will now adapt to whatever parameters are passed
to it, if its possible to use <font size="2" face="Courier New">memcpy</font>,
then it will do so, otherwise it will use a generic copy.</p>
<h4>Was it worth it?</h4>
<p>It has often been repeated in these columns that &quot;premature optimisation
is the root of all evil&quot; [4]. So the question must be asked: was our
optimisation premature? To put this in perspective the timings for our version
of copy compared a conventional generic copy[5] are shown in table 1.</p>
<p>Clearly the optimisation makes a difference in this case; but, to be fair,
the timings are loaded to exclude cache miss effects - without this accurate
comparison between algorithms becomes difficult. However, perhaps we can add a
couple of caveats to the premature optimisation rule:</p>
<ul>
<li>If you use the right algorithm for the job in the first place then
optimisation will not be required; in some cases, <font size="2" face="Courier New">memcpy</font>
is the right algorithm.</li>
<li>If a component is going to be reused in many places by many people then
optimisations may well be worthwhile where they would not be so for a single
case - in other words, the likelihood that the optimisation will be
absolutely necessary somewhere, sometime is that much higher. Just as
importantly the perceived value of the stock implementation will be higher:
there is no point standardising an algorithm if users reject it on the
grounds that there are better, more heavily optimised versions available.</li>
</ul>
<h4>Table 1: Time taken to copy 1000 elements using copy&lt;const T*, T*&gt;
(times in micro-seconds)</h4>
<table border="1" cellpadding="7" cellspacing="1" width="529">
<tr>
<td valign="top" width="33%">
<p align="center">Version</p>
</td>
<td valign="top" width="33%">
<p align="center">T</p>
</td>
<td valign="top" width="33%">
<p align="center">Time</p>
</td>
</tr>
<tr>
<td valign="top" width="33%">&quot;Optimised&quot; copy</td>
<td valign="top" width="33%">char</td>
<td valign="top" width="33%">0.99</td>
</tr>
<tr>
<td valign="top" width="33%">Conventional copy</td>
<td valign="top" width="33%">char</td>
<td valign="top" width="33%">8.07</td>
</tr>
<tr>
<td valign="top" width="33%">&quot;Optimised&quot; copy</td>
<td valign="top" width="33%">int</td>
<td valign="top" width="33%">2.52</td>
</tr>
<tr>
<td valign="top" width="33%">Conventional copy</td>
<td valign="top" width="33%">int</td>
<td valign="top" width="33%">8.02</td>
</tr>
</table>
<p>&nbsp;</p>
<h4>Pair of References</h4>
<p>The optimised copy example shows how type traits may be used to perform
optimisation decisions at compile-time. Another important usage of type traits
is to allow code to compile that otherwise would not do so unless excessive
partial specialization is used. This is possible by delegating partial
specialization to the type traits classes. Our example for this form of usage is
a pair that can hold references [6].</p>
<p>First, let us examine the definition of &quot;std::pair&quot;, omitting the
comparision operators, default constructor, and template copy constructor for
simplicity:</p>
<pre>template &lt;typename T1, typename T2&gt;
struct pair
{
typedef T1 first_type;
typedef T2 second_type;
T1 first;
T2 second;
pair(const T1 &amp; nfirst, const T2 &amp; nsecond)
:first(nfirst), second(nsecond) { }
};</pre>
<p>Now, this &quot;pair&quot; cannot hold references as it currently stands,
because the constructor would require taking a reference to a reference, which
is currently illegal [7]. Let us consider what the constructor's parameters
would have to be in order to allow &quot;pair&quot; to hold non-reference types,
references, and constant references:</p>
<table border="1" cellpadding="7" cellspacing="1" width="638">
<tr>
<td valign="top" width="50%">Type of &quot;T1&quot;</td>
<td valign="top" width="50%">Type of parameter to initializing constructor</td>
</tr>
<tr>
<td valign="top" width="50%">
<pre>T</pre>
</td>
<td valign="top" width="50%">
<pre>const T &amp;</pre>
</td>
</tr>
<tr>
<td valign="top" width="50%">
<pre>T &amp;</pre>
</td>
<td valign="top" width="50%">
<pre>T &amp;</pre>
</td>
</tr>
<tr>
<td valign="top" width="50%">
<pre>const T &amp;</pre>
</td>
<td valign="top" width="50%">
<pre>const T &amp;</pre>
</td>
</tr>
</table>
<p>A little familiarity with the type traits classes allows us to construct a
single mapping that allows us to determine the type of parameter from the type
of the contained class. The type traits classes provide a transformation &quot;add_reference&quot;,
which adds a reference to its type, unless it is already a reference.</p>
<table border="1" cellpadding="7" cellspacing="1" width="580">
<tr>
<td valign="top" width="21%">Type of &quot;T1&quot;</td>
<td valign="top" width="27%">Type of &quot;const T1&quot;</td>
<td valign="top" width="53%">Type of &quot;add_reference&lt;const
T1&gt;::type&quot;</td>
</tr>
<tr>
<td valign="top" width="21%">
<pre>T</pre>
</td>
<td valign="top" width="27%">
<pre>const T</pre>
</td>
<td valign="top" width="53%">
<pre>const T &amp;</pre>
</td>
</tr>
<tr>
<td valign="top" width="21%">
<pre>T &amp;</pre>
</td>
<td valign="top" width="27%">
<pre>T &amp; [8]</pre>
</td>
<td valign="top" width="53%">
<pre>T &amp;</pre>
</td>
</tr>
<tr>
<td valign="top" width="21%">
<pre>const T &amp;</pre>
</td>
<td valign="top" width="27%">
<pre>const T &amp;</pre>
</td>
<td valign="top" width="53%">
<pre>const T &amp;</pre>
</td>
</tr>
</table>
<p>This allows us to build a primary template definition for &quot;pair&quot;
that can contain non-reference types, reference types, and constant reference
types:</p>
<pre>template &lt;typename T1, typename T2&gt;
struct pair
{
typedef T1 first_type;
typedef T2 second_type;
T1 first;
T2 second;
pair(boost::add_reference&lt;const T1&gt;::type nfirst,
boost::add_reference&lt;const T2&gt;::type nsecond)
:first(nfirst), second(nsecond) { }
};</pre>
<p>Add back in the standard comparision operators, default constructor, and
template copy constructor (which are all the same), and you have a std::pair
that can hold reference types!</p>
<p>This same extension <i>could</i> have been done using partial template
specialization of &quot;pair&quot;, but to specialize &quot;pair&quot; in this
way would require three partial specializations, plus the primary template. Type
traits allows us to define a single primary template that adjusts itself
auto-magically to any of these partial specializations, instead of a brute-force
partial specialization approach. Using type traits in this fashion allows
programmers to delegate partial specialization to the type traits classes,
resulting in code that is easier to maintain and easier to understand.</p>
<h4>Conclusion</h4>
<p>We hope that in this article we have been able to give you some idea of what
type-traits are all about. A more complete listing of the available classes are
in the boost documentation, along with further examples using type traits.
Templates have enabled C++ uses to take the advantage of the code reuse that
generic programming brings; hopefully this article has shown that generic
programming does not have to sink to the lowest common denominator, and that
templates can be optimal as well as generic.</p>
<h4>Acknowledgements</h4>
<p>The authors would like to thank Beman Dawes and Howard Hinnant for their
helpful comments when preparing this article.</p>
<h4>References</h4>
<ol>
<li>Nathan C. Myers, C++ Report, June 1995.</li>
<li>The type traits library is based upon contributions by Steve Cleary, Beman
Dawes, Howard Hinnant and John Maddock: it can be found at www.boost.org.</li>
<li>A scalar type is an arithmetic type (i.e. a built-in integer or floating
point type), an enumeration type, a pointer, a pointer to member, or a
const- or volatile-qualified version of one of these types.</li>
<li>This quote is from Donald Knuth, ACM Computing Surveys, December 1974, pg
268.</li>
<li>The test code is available as part of the boost utility library (see
algo_opt_examples.cpp), the code was compiled with gcc 2.95 with all
optimisations turned on, tests were conducted on a 400MHz Pentium II machine
running Microsoft Windows 98.</li>
<li>John Maddock and Howard Hinnant have submitted a &quot;compressed_pair&quot;
library to Boost, which uses a technique similar to the one described here
to hold references. Their pair also uses type traits to determine if any of
the types are empty, and will derive instead of contain to conserve space --
hence the name &quot;compressed&quot;.</li>
<li>This is actually an issue with the C++ Core Language Working Group (issue
#106), submitted by Bjarne Stroustrup. The tentative resolution is to allow
a &quot;reference to a reference to T&quot; to mean the same thing as a
&quot;reference to T&quot;, but only in template instantiation, in a method
similar to multiple cv-qualifiers.</li>
<li>For those of you who are wondering why this shouldn't be const-qualified,
remember that references are always implicitly constant (for example, you
can't re-assign a reference). Remember also that &quot;const T &amp;&quot;
is something completely different. For this reason, cv-qualifiers on
template type arguments that are references are ignored.</li>
</ol>
<h2>Listing 1</h2>
<pre>namespace detail{
template &lt;bool b&gt;
struct copier
{
template&lt;typename I1, typename I2&gt;
static I2 do_copy(I1 first,
I1 last, I2 out);
};
template &lt;bool b&gt;
template&lt;typename I1, typename I2&gt;
I2 copier&lt;b&gt;::do_copy(I1 first,
I1 last,
I2 out)
{
while(first != last)
{
*out = *first;
++out;
++first;
}
return out;
}
template &lt;&gt;
struct copier&lt;true&gt;
{
template&lt;typename I1, typename I2&gt;
static I2* do_copy(I1* first, I1* last, I2* out)
{
memcpy(out, first, (last-first)*sizeof(I2));
return out+(last-first);
}
};
}
template&lt;typename I1, typename I2&gt;
inline I2 copy(I1 first, I1 last, I2 out)
{
typedef typename
boost::remove_cv&lt;
typename std::iterator_traits&lt;I1&gt;
::value_type&gt;::type v1_t;
typedef typename
boost::remove_cv&lt;
typename std::iterator_traits&lt;I2&gt;
::value_type&gt;::type v2_t;
enum{ can_opt =
boost::is_same&lt;v1_t, v2_t&gt;::value
&amp;&amp; boost::is_pointer&lt;I1&gt;::value
&amp;&amp; boost::is_pointer&lt;I2&gt;::value
&amp;&amp; boost::
has_trivial_assign&lt;v1_t&gt;::value
};
return detail::copier&lt;can_opt&gt;::
do_copy(first, last, out);
}</pre>
<hr>
<p>© Copyright John Maddock and Steve Cleary, 2000</p>
</body>
</html>