Prerequisites

Caution

Each Spark version requires its own hivemall-spark-xxx-with-dependencies.jar, so use the jar built for the Spark version you are running.

Installation

First, download a pre-built Spark package from the official Spark website, then start spark-shell with the compiled Hivemall jar.

$ ./bin/spark-shell --jars hivemall-spark-xxx-with-dependencies.jar

Notice

If you would like to try Hivemall functions on the latest release of Spark, just run bin/spark-shell from inside the Hivemall package. This command automatically downloads the latest Spark release, compiles Hivemall against that version, and starts spark-shell with the compiled Hivemall binary.

Then, load the scripts for the Hivemall functions from within spark-shell.

scala> :load resources/ddl/define-all.spark
scala> :load resources/ddl/import-packages.spark

Installation via Spark Packages

Alternatively, you can install Hivemall with the --packages option of spark-shell.

$ ./bin/spark-shell --packages apache-hivemall:apache-hivemall:0.5.1-<spark version>

Replace <spark version> with the identifier for your Spark version, e.g., spark2.2 for Spark v2.2.x.
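
As a rough illustration of the substitution described above, the following shell sketch derives the suffix from a full Spark version string and prints the resulting command. The helper name spark_suffix and the example version 2.2.1 are hypothetical and not part of Hivemall. The sketch also assumes that the "spark<major>.<minor>" pattern shown for Spark v2.2.x carries over to other versions, which you should confirm against the packages actually published.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hivemall): turn a full Spark version
# such as 2.2.1 into the "spark<major>.<minor>" suffix, e.g. spark2.2.
spark_suffix() {
  # Keep only the first two dot-separated components of the version.
  major_minor=$(printf '%s\n' "$1" | cut -d. -f1,2)
  printf 'spark%s\n' "$major_minor"
}

# Assumed example value; replace it with the version of your Spark install.
SPARK_VERSION="2.2.1"
HIVEMALL_COORD="apache-hivemall:apache-hivemall:0.5.1-$(spark_suffix "$SPARK_VERSION")"

# Print the command instead of running it, so the sketch works without Spark.
echo "./bin/spark-shell --packages $HIVEMALL_COORD"
```

Printing the command rather than executing it lets you check the coordinate before Spark tries to resolve it.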