
Characterization Java & C++ Component

We define characterization as the task of comprehensively measuring the accuracy or speed performance of our library. These characterization tests are often long-running (some can run for days) and very resource-intensive, which makes them unsuitable for inclusion in unit tests. The code in this repository comprises some of the test suites we use to create plots on our website and to provide evidence for our speed and accuracy claims.
This code is shared here so that others can duplicate our own characterizations.

The code here is shared “as-is” and does not pretend to have the same level of quality as the primary repositories (java, pig, hive and vector). This code is not archived to Maven Central and will change from time to time as we grow these characterization suites.


DataSketches Library Website

Java Core Overview

Java Core Javadocs

Build / Run Instructions (Java)

JDK8 is required to compile

The Java classes of this DataSketches component must be compiled using JDK 8.

Recommended Build Tool

This DataSketches component is structured as a Maven project, and Maven is the recommended build tool.

There are two types of tests: normal unit tests and tests run by the strict profile.

To run normal unit tests:

$ mvn clean test

To run the strict profile tests:

$ mvn clean test -P strict



See the pom.xml file for the top-level dependencies and the test dependencies.


  • The characterization tests are called profiles and are located by type under the directories:
    • src/main/java/org/apache/datasketches/characterization/<type>/<test>.java
  • These tests have many parameters that are specified in a corresponding configuration “.conf” file located in directories:
    • src/main/resources/<type>/<test>.conf
    • One of the parameters specified by the .conf file is the specific “Job Profile” that is to be run using that configuration.
  • It is recommended that you use your IDE and run the test by executing org.apache.datasketches.job.main(<location of .conf file>). The IDE should resolve all the required dependencies specified by the pom.xml file for you. With Eclipse, the command is “run as java application”. IntelliJ should have something similar. The output is sent to Standard Out.
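The configuration-driven pattern described above can be illustrated with a minimal, self-contained sketch. It is not the repository's code: the class `ConfDrivenLauncher`, the `Profile` interface, the `EchoProfile` class, and the keys `profile` and `trials` are all hypothetical names chosen for illustration, and the real .conf syntax may differ from the plain key=value format assumed here. The sketch only shows the general idea of a configuration naming the class to run, which is then instantiated by reflection and handed the remaining parameters.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: none of these names come from the repository.
public class ConfDrivenLauncher {

  /** Hypothetical contract for a runnable characterization profile. */
  public interface Profile {
    String run(Properties conf);
  }

  /** Hypothetical example profile: reports the trial count it was given. */
  public static class EchoProfile implements Profile {
    @Override
    public String run(Properties conf) {
      return "trials=" + conf.getProperty("trials", "1");
    }
  }

  /**
   * Parses key=value text (assumed format), then instantiates and runs
   * the class named by the hypothetical "profile" key.
   */
  public static String launch(String confText) {
    Properties conf = new Properties();
    try {
      conf.load(new StringReader(confText));
      String className = conf.getProperty("profile");
      if (className == null) {
        throw new IllegalArgumentException("missing 'profile' key");
      }
      Profile profile = (Profile) Class.forName(className)
          .getDeclaredConstructor().newInstance();
      return profile.run(conf);
    } catch (IOException | ReflectiveOperationException e) {
      throw new IllegalStateException("cannot launch profile", e);
    }
  }

  public static void main(String[] args) {
    // Build the configuration in memory so the sketch needs no files.
    String confText = "profile=" + EchoProfile.class.getName() + "\n"
        + "trials=16\n";
    System.out.println(launch(confText));
  }
}
```

Selecting the class by name keeps the launcher independent of any particular test: adding a new profile would only require a new class and a configuration that names it.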

Build Instructions (C++)

From within Eclipse

  1. After your project is created, open “Project Properties”.
  2. From the Eclipse C++ Build Menu, check “Generate Makefiles automatically”.
  3. Under “Settings”, select “Compiler”, then “Includes”, and add the include directories for the appropriate sketches and common.
  4. Under “Optimization” select “-O3” and “-DNDEBUG”.

How to Contact Us