We define characterization as the task of comprehensively measuring the accuracy or speed of our library. These characterization tests are often long-running (some can run for days) and very resource-intensive, which makes them unsuitable for inclusion in unit tests. The code in this repository includes some of the test suites we use to create some of the plots on our website and to provide evidence for our speed and accuracy claims. It is shared here so that others can reproduce our characterizations.
The code here is shared “as-is” and does not pretend to have the same level of quality as the primary repositories (java, pig, hive and vector). This code is not archived to Maven Central and will change from time to time as we grow these characterization suites.
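To make the idea of a characterization concrete, here is a minimal sketch of a speed sweep and a relative-error calculation. It is not code from this repository: the class name `SpeedProfileSketch`, its methods, and the use of `HashSet` as a stand-in for the structure under test are all assumptions made for illustration, and the code uses only the JDK.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of this repository's test suites.
public class SpeedProfileSketch {

  // Average nanoseconds per update when inserting n distinct longs.
  // HashSet is just a placeholder for whatever structure is being measured.
  static double nanosPerUpdate(int n) {
    Set<Long> target = new HashSet<>();
    long start = System.nanoTime();
    for (long i = 0; i < n; i++) {
      target.add(i);
    }
    long elapsed = System.nanoTime() - start;
    // Using the result keeps the loop from being optimized away.
    if (target.size() != n) {
      throw new IllegalStateException("unexpected size");
    }
    return (double) elapsed / n;
  }

  // Signed relative error of an estimate against the known true value.
  static double relativeError(double estimate, double truth) {
    return (estimate - truth) / truth;
  }

  public static void main(String[] args) {
    // Sweep stream lengths by powers of two, one data point per length.
    for (int lgN = 10; lgN <= 20; lgN += 2) {
      int n = 1 << lgN;
      System.out.printf("n=%d  ns/update=%.1f%n", n, nanosPerUpdate(n));
    }
    System.out.printf("relative error=%.3f%n", relativeError(1030.0, 1000.0));
  }
}
```

A real characterization would typically repeat each point over many trials and much larger stream lengths to smooth out JIT warm-up and garbage-collection noise, which is one reason such runs can take so long.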
This DataSketches component is pure Java and must be compiled with JDK 8.
This DataSketches component is structured as a Maven project, and Maven is the recommended build tool.
There are two types of tests: normal unit tests and tests run by the strict profile.
To run normal unit tests:
$ mvn clean test
To run the strict profile tests:
$ mvn clean test -P strict
See the pom.xml file for the top-level and test dependencies.