Pinot Hadoop


Pinot supports data segment generation from Hadoop.


To build the project:

mvn clean install -DskipTests

This creates a fat (shaded) JAR for pinot-hadoop.


Create a job properties configuration file, e.g.:

# Segment creation job configs:

# Segment tar push job configs:,controller_host_1

The Pinot data schema file must be available locally, and its path must be set in the job properties file.

To run the jobs, launch the org.apache.pinot.hadoop.PinotHadoopJobLauncher class (the main class of the shaded JAR in pinot-hadoop):

# Segment creation
    hadoop jar  pinot-hadoop-1.0-SNAPSHOT.jar SegmentCreation
At this point, the data segments have been built from the raw data files.
The next step is to push those segments to the Pinot controller.

# Segment tar push
    hadoop jar  pinot-hadoop-1.0-SNAPSHOT.jar SegmentTarPush

There is also a job that combines the two steps.

# Segment creation and tar push
    hadoop jar  pinot-hadoop-1.0-SNAPSHOT.jar SegmentCreationAndTarPush
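
The three invocations above differ only in the job type, so they can be wrapped in a small helper. The following is a minimal sketch, not part of pinot-hadoop: the function names `run_pinot_job` and `create_then_push` and the variables `HADOOP_BIN` and `PINOT_HADOOP_JAR` are hypothetical names introduced here for illustration. It rejects job types other than the three listed in this README, forwards any extra arguments unchanged, and only starts the push if segment creation succeeded.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper around the job types named in this README.
# HADOOP_BIN and PINOT_HADOOP_JAR are hypothetical variables (assumptions),
# defaulting to the command and JAR name used in the examples above.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"
PINOT_HADOOP_JAR="${PINOT_HADOOP_JAR:-pinot-hadoop-1.0-SNAPSHOT.jar}"

# Run one job type; any further arguments are passed through as-is.
run_pinot_job() {
  local job_type="$1"
  shift
  case "$job_type" in
    SegmentCreation|SegmentTarPush|SegmentCreationAndTarPush) ;;
    *)
      echo "unknown job type: $job_type" >&2
      return 2
      ;;
  esac
  "$HADOOP_BIN" jar "$PINOT_HADOOP_JAR" "$job_type" "$@"
}

# Run creation and push as two separate jobs; skip the push if creation fails.
create_then_push() {
  run_pinot_job SegmentCreation "$@" && run_pinot_job SegmentTarPush "$@"
}
```

Setting `HADOOP_BIN=echo` before loading the script prints each command instead of running it, which gives a cheap dry run. Running the two jobs separately, rather than using SegmentCreationAndTarPush, makes it easier to inspect the generated segments before they are pushed.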