added notes about benchmarks

Hans Dembinski 2016-05-06 15:05:10 -04:00
parent 202de7eb7c
commit 7a4019a35b
6 changed files with 121 additions and 41 deletions


@ -2,9 +2,9 @@
Fast n-dimensional histogram with convenient interface for C++ and Python
This project contains an easy-to-use powerful n-dimensional histogram class implemented in `C++0x`, optimized for convenience and excellent performance under heavy duty. The histogram has a complete [C++](http://yosefk.com/c++fqa/defective.html) and [Python](http://www.python.org) interface. Histogram instances can be moved over the language boundary with ease. [Numpy](http://www.numpy.org) is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
This project contains an easy-to-use powerful n-dimensional histogram class implemented in `C++03`, optimized for convenience and excellent performance under heavy duty. Move semantics are supported using `boost::move`. The histogram has a complete [C++](http://yosefk.com/c++fqa/defective.html) and [Python](http://www.python.org) interface, and can be passed over the language boundary with ease. [Numpy](http://www.numpy.org) is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
My goal is to submit this project to the [Boost](http://www.boost.org) libraries, that's why it uses the boost directory structure and namespace. The code is released under the [Boost Software License](http://www.boost.org/LICENSE_1_0.txt).
My goal is to submit this project to [Boost](http://www.boost.org), which is why it uses the Boost directory structure and namespace. The code is released under the [Boost Software License](http://www.boost.org/LICENSE_1_0.txt).
### Dependencies
@ -21,8 +21,9 @@ My goal is to submit this project to the [Boost](http://www.boost.org) libraries
* Intuitive and convenient interface
* Support for different binning schemes, including binning of angles
* Support for weighted events, with variance estimates for each bin
* Support for move semantics using `boost::move`
* Optional underflow- and overflow-bins for each dimension
* High-performance through cache-friendly design
* High performance through cache-friendly design
* Space-efficient memory storage that dynamically grows as needed
* Serialization support with zero-suppression
* Multi-language support: C++ and Python
@ -39,6 +40,32 @@ make install # (or just 'make' to run the tests)
To run the tests, do `make test` or `ctest -V` for more output.
## Benchmarks
The following table shows results of a simple benchmark against
* `TH1I`, `TH3I` and `THnI` of the [ROOT framework](https://root.cern.ch)
* `histogram` and `histogramdd` from the Python module `numpy`
The benchmark against ROOT is implemented in C++; the benchmark against numpy is implemented in Python. For a full discussion of the benchmark, see `docs/html/notes.html`.
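The numpy side of such a benchmark could be sketched as follows. This is a minimal illustration of the procedure described in the notes (pre-allocate the random numbers, time only the fill, repeat 10 times, keep the minimum), not the project's actual benchmark script. The number of fills, the bin count, the axis range, and the parameters of the normal distribution are assumptions chosen for illustration.

```python
import timeit

import numpy as np


def best_time(func, repeats=10):
    """Run func `repeats` times and return the fastest wall-clock time in seconds."""
    return min(timeit.repeat(func, repeat=repeats, number=1))


# Assumed parameters, chosen only for illustration.
n_fills = 1_000_000      # far fewer than the 12M fills in the table
n_bins = 100
axis_range = (0.0, 1.0)

# Pre-allocate the input outside the timed part, so that only the fill is measured.
rng = np.random.default_rng(seed=1)
uniform = rng.random(n_fills)                 # all values fall inside axis_range
normal = rng.normal(0.5, 0.3, size=n_fills)   # some values fall outside axis_range

t_uniform = best_time(lambda: np.histogram(uniform, bins=n_bins, range=axis_range))
t_normal = best_time(lambda: np.histogram(normal, bins=n_bins, range=axis_range))

print(f"uniform: {t_uniform:.4f} s, normal: {t_normal:.4f} s")
```

Note that `numpy.histogram` discards values outside `range` instead of counting them in underflow and overflow bins, so the two libraries do not do exactly the same work on the normal-distributed input.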
Test system: Intel Core i7-4500U CPU clocked at 1.8 GHz, 8 GB of DDR3 RAM
=================  =======  =======  =======  =======  =======  =======
distribution       uniform                    normal
-----------------  -------------------------  -------------------------
dimension          1D       3D       6D       1D       3D       6D
=================  =======  =======  =======  =======  =======  =======
No. of fills       12M      4M       2M       12M      4M       2M
C++: ROOT [t/s]    0.127    0.199    0.185    0.168    0.143    0.179
C++: boost [t/s]   0.172    0.177    0.155    0.172    0.171    0.150
Py: numpy [t/s]    0.825    0.727    0.436    0.824    0.426    0.401
Py: boost [t/s]    0.209    0.229    0.192    0.207    0.194    0.168
=================  =======  =======  =======  =======  =======  =======
`boost::histogram` shows consistent performance comparable to the specialized ROOT histograms. It is faster than ROOT's implementation of an N-dimensional histogram `THnI`. The performance of `boost::histogram` is similar in C++ and Python, showing only a small overhead in Python. It is consistently faster than numpy's histogram functions.
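Because the number of fills differs between the 1D, 3D and 6D columns, the timings are easier to compare as time per fill. The sketch below does that arithmetic for the uniform columns. It assumes that the `[t/s]` rows give the total wall-clock time in seconds for the listed number of fills; the table does not state this explicitly.

```python
# Number of fills per dimension, as listed in the table.
fills = {"1D": 12e6, "3D": 4e6, "6D": 2e6}

# Timings for the uniform distribution, copied from the table.
# Assumption: each value is the total time in seconds for all fills.
uniform_seconds = {
    "C++: ROOT": {"1D": 0.127, "3D": 0.199, "6D": 0.185},
    "C++: boost": {"1D": 0.172, "3D": 0.177, "6D": 0.155},
    "Py: numpy": {"1D": 0.825, "3D": 0.727, "6D": 0.436},
    "Py: boost": {"1D": 0.209, "3D": 0.229, "6D": 0.192},
}


def ns_per_fill(seconds, n_fills):
    """Convert a total time in seconds into nanoseconds per fill."""
    return seconds / n_fills * 1e9


for label, row in uniform_seconds.items():
    per_fill = {dim: round(ns_per_fill(t, fills[dim]), 1) for dim, t in row.items()}
    print(label, per_fill)
```

Under that assumption, 0.172 s for 12M fills in the 1D C++ case corresponds to roughly 14 ns per fill.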
## Rationale
There is a lack of a widely-used free histogram class. While it is easy to write a 1-dimensional histogram, writing an n-dimensional histogram poses more of a challenge. If you add serialization and Python/Numpy support onto the wish-list, the air becomes thin. The main competitor is the [ROOT framework](https://root.cern.ch). This histogram class is designed to be more convenient to use, and as fast or faster than the equivalent ROOT histograms. It comes without heavy baggage; instead, it has a clean and modern C++ design which follows the advice given in popular C++ books, like those of [Meyers](http://www.aristeia.com/books.html) and [Sutter and Alexandrescu](http://www.gotw.ca/publications/c++cs.htm).


@ -45,7 +45,7 @@
<p>Distributed under the <a class="reference external" href="http://www.boost.org/LICENSE_1_0.txt">Boost Software License, Version 1.0</a>, see accompanying file LICENSE.</p>
<div class="section" id="description">
<h2>Description<a class="headerlink" href="#description" title="Permalink to this headline"></a></h2>
<p>This project contains an easy-to-use powerful n-dimensional histogram class implemented in <code class="docutils literal"><span class="pre">C++0x</span></code>, optimized for convenience and excellent performance under heavy duty. The histogram has a complete C++ and a <a class="reference external" href="http://www.python.org">Python</a> interface, and can be moved over the language boundary with ease. <a class="reference external" href="http://www.numpy.org">Numpy</a> is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.</p>
<p>This project contains an easy-to-use powerful n-dimensional histogram class implemented in <code class="docutils literal"><span class="pre">C++03</span></code>, optimized for convenience and excellent performance under heavy duty. Move semantics are supported using <cite>boost::move</cite>. The histogram has a complete C++ and a <a class="reference external" href="http://www.python.org">Python</a> interface, and can be passed over the language boundary with ease. <a class="reference external" href="http://www.numpy.org">Numpy</a> is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.</p>
<p>My goal is to submit this project to the <a class="reference external" href="http://www.boost.org">Boost Libraries</a>.</p>
</div>
<div class="section" id="contents">


@ -88,19 +88,29 @@
<div class="section" id="benchmarks">
<h2>Benchmarks<a class="headerlink" href="#benchmarks" title="Permalink to this headline"></a></h2>
<p>One design goal of this project is to be fast. The act of filling the histogram with a number should be insignificant compared to the CPU cycles spent to retrieve/generate that number. Naturally, we also want to beat the competition.</p>
<p>The following shows the results of a simple benchmark against the histogram classes TH1I, TH3I and THnI of the ROOT framework. The comparison is not fair, since TH1I and TH3I are specialized classes for 1 dimension and 3 dimensions. In addition, all ROOT histograms lack a comparable system to define different binning schemes for each axis.</p>
<p>Large vectors are pre-allocated and with random numbers drawn from a uniform or normal distribution for all tests.
In the timed part, these numbers are read from the vector and put into the histograms. This reduces the overhead to memory access. All tests are run 10 times, the minimum is shown.</p>
<p>The following table shows results of a simple benchmark against</p>
<ul class="simple">
<li><code class="xref cpp cpp-type docutils literal"><span class="pre">TH1I</span></code>, <code class="xref cpp cpp-type docutils literal"><span class="pre">TH3I</span></code> and <code class="xref cpp cpp-type docutils literal"><span class="pre">THnI</span></code> of the <a class="reference external" href="https://root.cern.ch">ROOT framework</a></li>
<li><a class="reference internal" href="types.html#module-histogram" title="histogram"><code class="xref py py-func docutils literal"><span class="pre">histogram()</span></code></a> and <code class="xref py py-func docutils literal"><span class="pre">histogramdd()</span></code> from the Python module <code class="xref py py-mod docutils literal"><span class="pre">numpy</span></code></li>
</ul>
<p>The benchmark against ROOT is implemented in C++; the benchmark against numpy is implemented in Python.</p>
<p>Remarks:</p>
<ul class="simple">
<li>The comparison with ROOT gives ROOT an advantage, since <code class="xref cpp cpp-type docutils literal"><span class="pre">TH1I</span></code> and <code class="xref cpp cpp-type docutils literal"><span class="pre">TH3I</span></code> are specialized classes for 1 dimension and 3 dimensions, not a general class for N dimensions like <code class="xref cpp cpp-class docutils literal"><span class="pre">boost::histogram</span></code>. ROOT histograms also lack a comparably flexible system to define different binning schemes for each axis.</li>
<li>Large vectors are pre-allocated and filled with random numbers drawn from a uniform or normal distribution for all tests. In the timed part, these numbers are read from the vector and put into the histograms. This reduces the overhead to memory access alone.</li>
<li>The test with uniform random numbers never fills the overflow and underflow bins, while the test with random numbers from a normal distribution does. This explains some of the differences between the two distributions.</li>
<li>All tests are repeated 10 times, and the minimum time is shown.</li>
</ul>
<p>Test system: Intel Core i7-4500U CPU clocked at 1.8 GHz, 8 GB of DDR3 RAM</p>
<table border="1" class="docutils">
<colgroup>
<col width="22%" />
<col width="13%" />
<col width="13%" />
<col width="13%" />
<col width="13%" />
<col width="13%" />
<col width="13%" />
<col width="29%" />
<col width="12%" />
<col width="12%" />
<col width="12%" />
<col width="12%" />
<col width="12%" />
<col width="12%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">distribution</th>
@ -117,24 +127,49 @@ In the timed part, these numbers are read from the vector and put into the histo
</tr>
</thead>
<tbody valign="top">
<tr class="row-odd"><td>ROOT</td>
<td>0.01046</td>
<td>0.02453</td>
<td>0.01050</td>
<td>0.01406</td>
<td>0.01766</td>
<td>0.01028</td>
<tr class="row-odd"><td>No. of fills</td>
<td>12M</td>
<td>4M</td>
<td>2M</td>
<td>12M</td>
<td>4M</td>
<td>2M</td>
</tr>
<tr class="row-even"><td>boost</td>
<td>0.01603</td>
<td>0.01922</td>
<td>0.00662</td>
<td>0.01604</td>
<td>0.01836</td>
<td>0.00750</td>
<tr class="row-even"><td>C++: ROOT [t/s]</td>
<td>0.127</td>
<td>0.199</td>
<td>0.185</td>
<td>0.168</td>
<td>0.143</td>
<td>0.179</td>
</tr>
<tr class="row-odd"><td>C++: boost [t/s]</td>
<td>0.172</td>
<td>0.177</td>
<td>0.155</td>
<td>0.172</td>
<td>0.171</td>
<td>0.150</td>
</tr>
<tr class="row-even"><td>Py: numpy [t/s]</td>
<td>0.825</td>
<td>0.727</td>
<td>0.436</td>
<td>0.824</td>
<td>0.426</td>
<td>0.401</td>
</tr>
<tr class="row-odd"><td>Py: boost [t/s]</td>
<td>0.209</td>
<td>0.229</td>
<td>0.192</td>
<td>0.207</td>
<td>0.194</td>
<td>0.168</td>
</tr>
</tbody>
</table>
<p><code class="xref cpp cpp-class docutils literal"><span class="pre">boost::histogram</span></code> shows consistent performance comparable to the specialized ROOT histograms. It is faster than ROOT&#8217;s implementation of an N-dimensional histogram <code class="xref cpp cpp-type docutils literal"><span class="pre">THnI</span></code>. The performance of <code class="xref cpp cpp-class docutils literal"><span class="pre">boost::histogram</span></code> is similar in C++ and Python, showing only a small overhead in Python. It is consistently faster than numpy&#8217;s histogram functions.</p>
</div>
</div>

File diff suppressed because one or more lines are too long


@ -10,7 +10,7 @@ Distributed under the `Boost Software License, Version 1.0 <http://www.boost.org
Description
-----------
This project contains an easy-to-use powerful n-dimensional histogram class implemented in ``C++0x``, optimized for convenience and excellent performance under heavy duty. The histogram has a complete C++ and a `Python <http://www.python.org>`_ interface, and can be moved over the language boundary with ease. `Numpy <http://www.numpy.org>`_ is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
This project contains an easy-to-use powerful n-dimensional histogram class implemented in ``C++03``, optimized for convenience and excellent performance under heavy duty. Move semantics are supported using `boost::move`. The histogram has a complete C++ and a `Python <http://www.python.org>`_ interface, and can be passed over the language boundary with ease. `Numpy <http://www.numpy.org>`_ is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
My goal is to submit this project to the `Boost Libraries <http://www.boost.org>`_.


@ -55,18 +55,36 @@ Benchmarks
One design goal of this project is to be fast. The act of filling the histogram with a number should be insignificant compared to the CPU cycles spent to retrieve/generate that number. Naturally, we also want to beat the competition.
The following shows the results of a simple benchmark against the histogram classes TH1I, TH3I and THnI of the ROOT framework. The comparison is not fair, since TH1I and TH3I are specialized classes for 1 dimension and 3 dimensions. In addition, all ROOT histograms lack a comparable system to define different binning schemes for each axis.
The following table shows results of a simple benchmark against
Large vectors are pre-allocated and with random numbers drawn from a uniform or normal distribution for all tests.
In the timed part, these numbers are read from the vector and put into the histograms. This reduces the overhead to memory access. All tests are run 10 times, the minimum is shown.
* :cpp:type:`TH1I`, :cpp:type:`TH3I` and :cpp:type:`THnI` of the `ROOT framework <https://root.cern.ch>`_
* :py:func:`histogram` and :py:func:`histogramdd` from the Python module :py:mod:`numpy`
The benchmark against ROOT is implemented in C++; the benchmark against numpy is implemented in Python.
Remarks:
* The comparison with ROOT gives ROOT an advantage, since :cpp:type:`TH1I` and :cpp:type:`TH3I` are specialized classes for 1 dimension and 3 dimensions, not a general class for N dimensions like :cpp:class:`boost::histogram`. ROOT histograms also lack a comparably flexible system to define different binning schemes for each axis.
* Large vectors are pre-allocated and filled with random numbers drawn from a uniform or normal distribution for all tests. In the timed part, these numbers are read from the vector and put into the histograms. This reduces the overhead to memory access alone.
* The test with uniform random numbers never fills the overflow and underflow bins, while the test with random numbers from a normal distribution does. This explains some of the differences between the two distributions.
* All tests are repeated 10 times, and the minimum time is shown.
Test system: Intel Core i7-4500U CPU clocked at 1.8 GHz, 8 GB of DDR3 RAM
============  =======  =======  =======  =======  =======  =======
distribution  uniform                    normal
------------  -------------------------  -------------------------
dimension     1D       3D       6D       1D       3D       6D
============  =======  =======  =======  =======  =======  =======
ROOT          0.01046  0.02453  0.01050  0.01406  0.01766  0.01028
boost         0.01603  0.01922  0.00662  0.01604  0.01836  0.00750
============  =======  =======  =======  =======  =======  =======
=================  =======  =======  =======  =======  =======  =======
distribution       uniform                    normal
-----------------  -------------------------  -------------------------
dimension          1D       3D       6D       1D       3D       6D
=================  =======  =======  =======  =======  =======  =======
No. of fills       12M      4M       2M       12M      4M       2M
C++: ROOT [t/s]    0.127    0.199    0.185    0.168    0.143    0.179
C++: boost [t/s]   0.172    0.177    0.155    0.172    0.171    0.150
Py: numpy [t/s]    0.825    0.727    0.436    0.824    0.426    0.401
Py: boost [t/s]    0.209    0.229    0.192    0.207    0.194    0.168
=================  =======  =======  =======  =======  =======  =======
:cpp:class:`boost::histogram` shows consistent performance comparable to the specialized ROOT histograms. It is faster than ROOT's implementation of an N-dimensional histogram :cpp:type:`THnI`. The performance of :cpp:class:`boost::histogram` is similar in C++ and Python, showing only a small overhead in Python. It is consistently faster than numpy's histogram functions.