README.md

This is a C++ version of the DataSketches core library. For background, see the Apache DataSketches home page.

Apache DataSketches is an open source, high-performance library of stochastic streaming algorithms commonly called “sketches” in the data sciences. Sketches are small, stateful programs that process massive data as a stream and can provide approximate answers, with mathematical guarantees, to computationally difficult queries orders-of-magnitude faster than traditional, exact methods.
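To make the idea of a sketch concrete, here is a minimal, self-contained toy example. It is not code from this library and does not use its API: the class name `toy_kmv_sketch`, its methods, and the hash mixer are all assumptions made for illustration. It shows one well-known technique, K-Minimum-Values (KMV) distinct counting, in which the sketch keeps only the `k` smallest hash values seen in the stream and estimates the number of distinct items from the k-th smallest one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>

// Toy K-Minimum-Values distinct-count sketch (illustration only,
// hypothetical names; not the DataSketches implementation).
class toy_kmv_sketch {
public:
  explicit toy_kmv_sketch(std::size_t k) : k_(k) {
    assert(k >= 2);  // the (k - 1) / kth estimator needs k >= 2
  }

  // Process one stream item. State never grows beyond k hash values.
  void update(std::uint64_t item) {
    const std::uint64_t h = mix(item);
    if (mins_.size() < k_) {
      mins_.insert(h);  // std::set ignores duplicates
    } else if (h < *mins_.rbegin()) {
      // Only evict the current maximum if h was actually new.
      if (mins_.insert(h).second) {
        mins_.erase(std::prev(mins_.end()));
      }
    }
  }

  // Approximate number of distinct items seen so far.
  double estimate() const {
    if (mins_.size() < k_) {
      // Fewer than k distinct hashes retained: the count is exact.
      return static_cast<double>(mins_.size());
    }
    // Normalize the k-th smallest hash into [0, 1) by dividing by 2^64.
    const double kth =
        static_cast<double>(*mins_.rbegin()) / 18446744073709551616.0;
    return (static_cast<double>(k_) - 1.0) / kth;
  }

private:
  // splitmix64-style mixer; chosen here only because it is short.
  static std::uint64_t mix(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
  }

  std::size_t k_;
  std::set<std::uint64_t> mins_;  // the k smallest hashes, sorted
};
```

With `k = 256`, feeding 100,000 distinct integers retains only 256 hash values, and the relative error of the estimate is typically on the order of 1/sqrt(k). Repeating items already seen does not change the result, which is what makes this kind of summary suitable for streams.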

This code requires C++11. It was tested with GCC 4.8.5 (standard in Red Hat at the time of this writing), GCC 8.2.0, and Apple LLVM version 10.0.1 (clang-1001.0.46.4).

This includes Python bindings. For the Python interface, see the README notes in the python subdirectory.

This library is header-only. The provided build process is needed only for building the unit tests and the Python library.

Building the unit tests requires cmake 3.12.0 or higher.

Installing the latest cmake on OSX: brew install cmake

Building and running unit tests using cmake for OSX and Linux:

$ mkdir build
$ cd build
$ cmake ..
$ make
$ make test

Building and running unit tests using cmake for Windows from the command line:

$ mkdir build
$ cd build
$ cmake ..
$ cd ..
$ cmake --build build --config Release
$ cmake --build build --config Release --target RUN_TESTS

How to Contact Us