mirror of
https://github.com/boostorg/histogram.git
synced 2025-05-09 14:57:57 +00:00
switch to boost license
parent
916e687376
commit
4267564d30
34
LICENSE
@ -1,21 +1,23 @@
|
|||||||
The MIT License (MIT)
|
Boost Software License - Version 1.0 - August 17th, 2003
|
||||||
|
|
||||||
Copyright (c) 2016 Hans Dembinski
|
Permission is hereby granted, free of charge, to any person or organization
|
||||||
|
obtaining a copy of the software and accompanying documentation covered by
|
||||||
|
this license (the "Software") to use, reproduce, display, distribute,
|
||||||
|
execute, and transmit the Software, and to prepare derivative works of the
|
||||||
|
Software, and to permit third-parties to whom the Software is furnished to
|
||||||
|
do so, all subject to the following:
|
||||||
|
|
||||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
The copyright notices in the Software and this entire statement, including
|
||||||
of this software and associated documentation files (the "Software"), to deal
|
the above license grant, this restriction and the following disclaimer,
|
||||||
in the Software without restriction, including without limitation the rights
|
must be included in all copies of the Software, in whole or in part, and
|
||||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
all derivative works of the Software, unless such copies or derivative
|
||||||
copies of the Software, and to permit persons to whom the Software is
|
works are solely in the form of machine-executable object code generated by
|
||||||
furnished to do so, subject to the following conditions:
|
a source language processor.
|
||||||
|
|
||||||
The above copyright notice and this permission notice shall be included in all
|
|
||||||
copies or substantial portions of the Software.
|
|
||||||
|
|
||||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
|
||||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
|
||||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
|
||||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||||
SOFTWARE.
|
DEALINGS IN THE SOFTWARE.
|
||||||
|
26
README.md
@ -4,7 +4,7 @@ Fast n-dimensional histogram with convenient interface for C++ and Python
|
|||||||
|
|
||||||
This project contains an easy-to-use powerful n-dimensional histogram class implemented in `C++0x`, optimized for convenience and excellent performance under heavy duty. The histogram has a complete C++ and a [Python](http://www.python.org) interface, and can be moved over the language boundary with ease. [Numpy](http://www.numpy.org) is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
|
This project contains an easy-to-use powerful n-dimensional histogram class implemented in `C++0x`, optimized for convenience and excellent performance under heavy duty. The histogram has a complete C++ and a [Python](http://www.python.org) interface, and can be moved over the language boundary with ease. [Numpy](http://www.numpy.org) is fully supported; histograms can be filled with Numpy arrays at C speeds and are convertible into Numpy arrays without copying data. Histograms can be streamed from/to files and pickled in Python.
|
||||||
|
|
||||||
My goal is to submit this project to the [Boost](http://www.boost.org) libraries, that's why it uses the boost directory structure and namespace. The code is released under the MIT License, making it free to use in open- and closed-source projects.
|
My goal is to submit this project to the [Boost](http://www.boost.org) libraries, that's why it uses the boost directory structure and namespace. The code is released under the [Boost Software License](http://www.boost.org/LICENSE_1_0.txt).
|
||||||
|
|
||||||
### Dependencies
|
### Dependencies
|
||||||
|
|
||||||
@ -35,9 +35,9 @@ My goal is to submit this project to the [Boost](http://www.boost.org) libraries
|
|||||||
|
|
||||||
`cmake ../histogram.git/CMake`
|
`cmake ../histogram.git/CMake`
|
||||||
|
|
||||||
`make install` (or just `make` to run the tests afterwards)
|
`make install` (or just `make` to run the tests)
|
||||||
|
|
||||||
To run the tests, do `ctest`.
|
To run the tests, do `make test` or `ctest -V` for more output.
|
||||||
|
|
||||||
## Rationale
|
## Rationale
|
||||||
|
|
||||||
@ -52,16 +52,28 @@ I designed the histogram based on a decade of experience collected in working wi
|
|||||||
|
|
||||||
### Interface convenience, language transparency
|
### Interface convenience, language transparency
|
||||||
|
|
||||||
A histogram should have the same consistent interface whatever the dimension. Like `std::vector` it should *just work*; users shouldn't be forced to make *a priori* choices among several histogram classes and options every time they encounter a new data set. Python is a great language for data analysis, so the histogram needs Python bindings. Data analysis in Python is Numpy-based, so Numpy support is a must. The histogram should be usable as an interface between a complex simulation or data-storage system written in C++ and data-analysis/plotting in Python: define the histogram in Python, let it be filled on the C++ side, and then get it back for further data analysis or plotting.
|
A histogram should have the same consistent interface whatever the dimension. Like `std::vector` it should *just work*; users shouldn't be forced to make *a priori* choices among several histogram classes and options every time they encounter a new data set. Python is a great language for data analysis, so the histogram needs Python bindings.
|
||||||
|
|
||||||
|
Data analysis in Python is Numpy-based, so Numpy support is a must. The histogram should be usable as an interface between a complex simulation or data-storage system written in C++ and data-analysis/plotting in Python: define the histogram in Python, let it be filled on the C++ side, and then get it back for further data analysis or plotting.
|
||||||
|
|
||||||
### Powerful binning strategies
|
### Powerful binning strategies
|
||||||
|
|
||||||
The histogram supports half a dozen different binning strategies, conveniently encapsulated in axis objects. There is the standard sorting of real-valued data into bins of equal or varying width, but also binning of angles or integer values. Extra bins that count over- and underflow values are added by default. This feature can be turned off individually for each dimension. The extra bins do not disturb normal counting. On an axis with n-bins, the first bin has the index `0`, the last bin `n-1`, while the under- and overflow bins are accessible at `-1` and `n`, respectively.
|
The histogram supports half a dozen different binning strategies, conveniently encapsulated in axis objects. There is the standard sorting of real-valued data into bins of equal or varying width, but also binning of angles or integer values.
|
||||||
|
|
||||||
|
Extra bins that count over- and underflow values are added by default. This feature can be turned off individually for each dimension. The extra bins do not disturb normal counting. On an axis with n-bins, the first bin has the index `0`, the last bin `n-1`, while the under- and overflow bins are accessible at `-1` and `n`, respectively.
|
||||||
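The index convention described above can be illustrated with a minimal sketch. The struct `regular_axis_sketch` and its members are hypothetical names made up for this example and are not the library's own axis classes; it only assumes an equal-width axis over `[lo, hi)` with `n` bins.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical equal-width axis, for illustration only;
// this is not the library's axis implementation.
struct regular_axis_sketch {
  double lo;  // lower edge of bin 0
  double hi;  // upper edge of bin n-1
  int n;      // number of regular bins

  // Returns -1 for underflow, n for overflow,
  // and an index in [0, n-1] for values inside [lo, hi).
  int index(double x) const {
    assert(n > 0 && lo < hi);
    if (x < lo) return -1;
    if (!(x < hi)) return n;  // NaN also lands here (a choice of this sketch)
    const int i = static_cast<int>(std::floor((x - lo) / (hi - lo) * n));
    return i < n ? i : n - 1;  // guard against rounding at the upper edge
  }
};
```

With `regular_axis_sketch a{0.0, 1.0, 4}`, a value below `0.0` maps to `-1`, a value of `1.0` or above maps to `4`, and everything in between maps to `0` through `3`, so the extra bins never shift the indices of the regular bins.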
|
|
||||||
### Performance, cache-friendliness and memory-efficiency
|
### Performance, cache-friendliness and memory-efficiency
|
||||||
|
|
||||||
Dense storage in memory is a must for high performance. Unfortunately, the [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) quickly becomes a problem as the number of dimensions grows, leading to histograms which consume large amounts (up to GBs) of memory. Fortunately, having many dimensions typically reduces the number of counts per bin, since tuples get spread over many dimensions. The histogram uses an adaptive count size per bin to exploit this, which starts with the smallest size per bin of 1 byte and increases transparently as needed up to 8 bytes per bin. A `std::vector` grows in *length* as new elements are added, while the count storage grows in *depth*.
|
Dense storage in memory is a must for high performance. Unfortunately, the [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) quickly becomes a problem as the number of dimensions grows, leading to histograms which consume large amounts (up to GBs) of memory.
|
||||||
|
|
||||||
|
Fortunately, having many dimensions typically reduces the number of counts per bin, since tuples get spread over many dimensions. The histogram uses an adaptive count size per bin to exploit this, which starts with the smallest size per bin of 1 byte and increases transparently as needed up to 8 bytes per bin. A `std::vector` grows in *length* as new elements are added, while the count storage grows in *depth*.
|
||||||
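The idea of storage that grows in *depth* can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's storage class: the name `adaptive_counts_sketch` and all of its members are hypothetical, and the sketch widens every bin at once (1, 2, 4, then 8 bytes) as soon as a single bin would overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical illustration of depth-adaptive counters;
// not the library's actual storage implementation.
class adaptive_counts_sketch {
 public:
  explicit adaptive_counts_sketch(std::size_t nbins)
      : c8_(nbins, 0), depth_(1) {}

  // Bytes currently used per bin: 1, 2, 4 or 8.
  int depth() const { return depth_; }

  // Only one of the four buffers is non-empty at any time.
  std::size_t size() const {
    return c8_.size() + c16_.size() + c32_.size() + c64_.size();
  }

  // Increment bin i, widening all bins first if bin i is saturated.
  void increase(std::size_t i) {
    assert(i < size());
    if (depth_ == 1) {
      if (c8_[i] < std::numeric_limits<std::uint8_t>::max()) { ++c8_[i]; return; }
      widen(c8_, c16_);
      depth_ = 2;
    }
    if (depth_ == 2) {
      if (c16_[i] < std::numeric_limits<std::uint16_t>::max()) { ++c16_[i]; return; }
      widen(c16_, c32_);
      depth_ = 4;
    }
    if (depth_ == 4) {
      if (c32_[i] < std::numeric_limits<std::uint32_t>::max()) { ++c32_[i]; return; }
      widen(c32_, c64_);
      depth_ = 8;
    }
    ++c64_[i];
  }

  std::uint64_t value(std::size_t i) const {
    switch (depth_) {
      case 1: return c8_[i];
      case 2: return c16_[i];
      case 4: return c32_[i];
      default: return c64_[i];
    }
  }

 private:
  // Copy the counts into the wider buffer and release the narrow one.
  template <class From, class To>
  static void widen(std::vector<From>& from, std::vector<To>& to) {
    to.assign(from.begin(), from.end());
    std::vector<From>().swap(from);
  }

  std::vector<std::uint8_t> c8_;
  std::vector<std::uint16_t> c16_;
  std::vector<std::uint32_t> c32_;
  std::vector<std::uint64_t> c64_;
  int depth_;
};
```

In this sketch a histogram with sparse counts stays at 1 byte per bin; the 256th increment of any single bin triggers one copy into 2-byte counters, and callers of `increase` and `value` never see the change.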
|
|
||||||
|
### Support for weighted counts and variance estimates
|
||||||
|
|
||||||
|
A histogram categorizes and counts, so the natural choice for the data type of the counts is an integer type. However, in particle physics, histograms are often filled with weighted events, for example, to make sure that two histograms look the same in one variable, while the distribution of another, correlated variable is a subject of study.
|
||||||
|
|
||||||
|
This histogram can be filled with either weighted or unweighted counts. In the weighted case, the sum of weights is stored in a double. The histogram provides a variance estimate in both cases. In the unweighted case, the estimate is computed from the count itself, using Poisson theory. In the weighted case, the sum of squared weights is stored alongside the sum of weights, and used to compute a variance estimate.
|
||||||
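The bookkeeping for a single bin can be sketched in a few lines. The struct `weighted_bin_sketch` is a hypothetical name for this example, not a type from the library, and it assumes the usual estimate in which the variance of a weighted sum is the sum of squared weights. With unit weights that sum equals the count, which matches the Poisson estimate for the unweighted case.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-bin accumulator, for illustration only;
// not the library's actual bin type.
struct weighted_bin_sketch {
  double sum_w = 0.0;   // sum of weights
  double sum_w2 = 0.0;  // sum of squared weights

  // Add one entry; w = 1 reproduces plain counting.
  void fill(double w = 1.0) {
    assert(std::isfinite(w));
    sum_w += w;
    sum_w2 += w * w;
  }

  double value() const { return sum_w; }

  // Variance estimate: sum of squared weights.
  // For unit weights this equals the count (Poisson estimate).
  double variance() const { return sum_w2; }
};
```

Filling a bin twice with the default weight gives a value of 2 and a variance of 2, while filling it with weights 0.5 and 2.0 gives a value of 2.5 and a variance of 4.25.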
|
|
||||||
## State of project
|
## State of project
|
||||||
|
|
||||||
The histogram is feature-complete for 1.0 version. Roughly 300 unit tests make sure that the implementation works as expected. Comprehensive documentation is a to-do. To grow further, the project needs test users, code review, and feedback.
|
The histogram is feature-complete for 1.0 version. More than 300 individual tests make sure that the implementation works as expected. Comprehensive documentation is a to-do. To grow further, the project needs test users, code review, and feedback.
|
||||||
|