[maven-release-plugin] prepare release v0.5.5
20 files changed
tree: 6ca4db263d981ff7dbaba34d550feb85bd57fd58
README.md

Zeppelin

Documentation: User Guide
Mailing List: User and Dev mailing list
Continuous Integration: Build Status
Contributing: Contribution Guide
License: Apache 2.0

Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more.

Core features:

  • Web-based, notebook-style editor
  • Built-in Apache Spark support

To learn more about Zeppelin, visit our website: http://zeppelin.incubator.apache.org

Requirements

  • Java 1.7
  • Tested on Mac OS X, Ubuntu 14.X, CentOS 6.X
  • Maven (if you want to build from the source code)
  • Node.js Package Manager (npm)

Getting Started

Before Build

If you don't have the requirements installed yet, install them first. (The installation method varies by environment; the example below is for Ubuntu.)

sudo apt-get update
sudo apt-get install git
sudo apt-get install openjdk-7-jdk
sudo apt-get install npm
sudo apt-get install libfontconfig

# install maven
wget http://www.eu.apache.org/dist/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
sudo tar -zxf apache-maven-3.3.3-bin.tar.gz -C /usr/local/
sudo ln -s /usr/local/apache-maven-3.3.3/bin/mvn /usr/local/bin/mvn

Notes:

  • Ensure Node.js is installed by running node --version
  • Ensure Maven is version 3.1.x or higher by running mvn -version
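
The checks in the notes above can be scripted. The following is a minimal sketch, not part of Zeppelin: `version_at_least` is a hypothetical helper, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report each tool from the requirements list that is not on PATH.
for tool in java node npm mvn; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done

# Compare the installed Maven version, if any, against the 3.1 minimum.
if command -v mvn >/dev/null 2>&1; then
  mvn_version=$(mvn -version 2>/dev/null | awk '/Apache Maven/ {print $3; exit}')
  if version_at_least "$mvn_version" 3.1.0; then
    echo "maven $mvn_version is new enough"
  else
    echo "maven $mvn_version is older than 3.1"
  fi
fi
```

On some distributions the Node.js binary is installed under a different name, so a "missing: node" line may only mean the name differs.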

Build

If you want to build Zeppelin from the source, please first clone this repository, then:

mvn clean package -DskipTests

To build with a specific Spark version, Hadoop version or specific features, define one or more of the spark, pyspark, hadoop and yarn profiles, such as:

-Pspark-1.5   [Version to run in local spark mode]
-Ppyspark     [optional: enable PYTHON support in spark via the %pyspark interpreter]
-Pyarn        [optional: enable YARN support]
-Dhadoop.version=2.2.0  [hadoop distribution]
-Phadoop-2.2            [hadoop version]
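
As an illustration of how these options combine, the sketch below assembles a build command from its parts and prints it without running it. `zeppelin_build_cmd` is a hypothetical helper introduced here for illustration; only the flags themselves come from the list above.

```shell
# Hypothetical helper: prints (does not run) a Zeppelin build command.
#   $1 = Spark profile version, e.g. 1.5
#   $2 = Hadoop profile version, e.g. 2.4
#   $3 = exact hadoop.version, e.g. 2.4.0 (optional)
#   $4 = "yarn" to add -Pyarn (optional)
zeppelin_build_cmd() {
  cmd="mvn clean package -Pspark-$1"
  if [ -n "${3:-}" ]; then
    cmd="$cmd -Dhadoop.version=$3"
  fi
  cmd="$cmd -Phadoop-$2"
  if [ "${4:-}" = "yarn" ]; then
    cmd="$cmd -Pyarn"
  fi
  echo "$cmd -DskipTests"
}

# Print the command for Spark 1.1 on Hadoop 2.4.0 with YARN support.
zeppelin_build_cmd 1.1 2.4 2.4.0 yarn
```

Printing the command first makes it easy to review the profile combination before starting a long build.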

Currently, the final/full distribution is built with:

mvn clean package -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark

Spark 1.5.x

mvn clean package -Pspark-1.5 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.4.x

mvn clean package -Pspark-1.4 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.3.x

mvn clean package -Pspark-1.3 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.2.x

mvn clean package -Pspark-1.2 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests 

Spark 1.1.x

mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests 

CDH 5.X

mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests

Yarn (Hadoop 2.7.x)

mvn clean package -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests

Yarn (Hadoop 2.6.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests

Yarn (Hadoop 2.4.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.4.0 -Phadoop-2.4 -Pyarn -DskipTests

Yarn (Hadoop 2.3.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.3.0 -Phadoop-2.3 -Pyarn -DskipTests

Yarn (Hadoop 2.2.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -Pyarn -DskipTests

Ignite (1.1.0-incubating and later)

mvn clean package -Dignite.version=1.1.0-incubating -DskipTests

Configure

To configure Zeppelin options (such as the port number), edit the following files:

./conf/zeppelin-env.sh
./conf/zeppelin-site.xml

(You can copy ./conf/zeppelin-env.sh.template into ./conf/zeppelin-env.sh. Same for zeppelin-site.xml.)
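
The template copy can be scripted so that existing configuration is never overwritten. This is a minimal sketch: `init_conf` is a hypothetical helper, and it assumes every template in the directory follows the `<name>.template` naming shown above.

```shell
# Hypothetical helper: for every *.template file in the given conf directory,
# create the matching config file unless one already exists.
init_conf() {
  conf_dir=$1
  for template in "$conf_dir"/*.template; do
    [ -e "$template" ] || continue   # the glob matched no files
    target=${template%.template}     # strip the .template suffix
    if [ -e "$target" ]; then
      echo "kept    $target"
    else
      cp "$template" "$target"
      echo "created $target"
    fi
  done
}
```

From the repository root, this would be called as `init_conf ./conf`.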

Setting SPARK_HOME and HADOOP_HOME

Without SPARK_HOME and HADOOP_HOME, Zeppelin uses the embedded Spark and Hadoop binaries selected by the mvn build options. To use a system-provided Spark and Hadoop instead, export SPARK_HOME and HADOOP_HOME in zeppelin-env.sh. You can then use any supported version of Spark without rebuilding Zeppelin.

# ./conf/zeppelin-env.sh
export SPARK_HOME=...
export HADOOP_HOME=...
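
Before starting Zeppelin, it can help to confirm which case applies. The sketch below only mirrors the rule described above and is not Zeppelin's own logic; `spark_source` is a hypothetical helper.

```shell
# Hypothetical helper, not Zeppelin code: reports which Spark the rule above
# selects for a given SPARK_HOME value (pass "$SPARK_HOME", possibly empty).
spark_source() {
  if [ -z "${1:-}" ]; then
    echo "embedded"
  elif [ -d "$1" ]; then
    echo "system: $1"
  else
    echo "invalid: $1 is not a directory"
  fi
}

# Check the value currently exported in this shell.
spark_source "${SPARK_HOME:-}"
```

The same check applies to HADOOP_HOME by passing that variable instead.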

External cluster configuration

Mesos

# ./conf/zeppelin-env.sh
export MASTER=mesos://...
# Set either spark.executor.uri ...
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=/path/to/spark-*.tgz"
# ... or SPARK_HOME (uncomment to use it instead)
# export SPARK_HOME="/path/to/spark_home"
export MESOS_NATIVE_LIBRARY=/path/to/libmesos.so

If you set SPARK_HOME, deploy the Spark binaries to the same location on all worker nodes. If you set spark.executor.uri, every worker must be able to read that file on its own node.

Yarn

# ./conf/zeppelin-env.sh
export SPARK_HOME=/path/to/spark_dir

Run

./bin/zeppelin-daemon.sh start

Then open localhost:8080 in your browser.

For configuration details, check the ./conf subdirectory.
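
The server may take a moment to come up after the daemon starts. The sketch below polls the URL until it answers; `wait_for_url` is a hypothetical helper, and it assumes `curl` is installed.

```shell
# Hypothetical helper: polls a URL until it answers or the attempts run out.
#   $1 = URL, $2 = number of attempts, $3 = seconds to sleep between attempts
# Assumes curl is installed.
wait_for_url() {
  attempt=1
  while [ "$attempt" -le "$2" ]; do
    if curl --silent --fail --max-time 5 --output /dev/null "$1"; then
      echo "up after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$3"
  done
  echo "no response from $1" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:8080 30 2` would try for roughly a minute before giving up.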

Package

To package the final distribution including the compressed archive, run:

  mvn clean package -Pbuild-distr

To build a distribution with specific profiles, run:

  mvn clean package -Pbuild-distr -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark

The profiles -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark can be adjusted to build against a specific Spark version or to omit support for features such as YARN.

The archive is generated under the zeppelin-distribution/target directory.

Run end-to-end tests

Zeppelin comes with a set of end-to-end acceptance tests that drive a headless Selenium browser:

  # assumes zeppelin-server is running on localhost:8080 (use -Durl=.. to override)
  mvn verify

  # or let the build start and stop zeppelin-server from the packaged zeppelin-distribution/target
  mvn verify -P using-packaged-distr
