{% include JB/setup %}
If you want to build Zeppelin from source, you must first install the following dependencies: Git, a JDK (OpenJDK 11), Node.js/npm, libfontconfig, and Maven. If you haven't installed Git and Maven yet, check the Build requirements section and follow the step-by-step instructions there. Then clone the repository:
```bash
git clone https://github.com/apache/zeppelin.git
```
You can build Zeppelin with the following Maven command:
```bash
./mvnw clean package -DskipTests [Options]
```
Check the Build profiles section for further build options. If you are behind a proxy, follow the instructions in the Proxy settings section.
If you're interested in contributing, please check Contributing to Apache Zeppelin (Code) and Contributing to Apache Zeppelin (Website).
After a successful build, you can start Zeppelin directly by running:
```bash
./bin/zeppelin-daemon.sh start
```
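Once started, the web UI is available at http://localhost:8080 by default. The daemon script also supports the usual lifecycle subcommands; a quick sketch:

```bash
# check whether the Zeppelin daemon is running
./bin/zeppelin-daemon.sh status

# stop or restart it
./bin/zeppelin-daemon.sh stop
./bin/zeppelin-daemon.sh restart
```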
Note that the Spark profiles here only affect the unit tests of the Spark interpreter (for which no SPARK_HOME needs to be specified). Zeppelin does not require you to build against different Spark versions to make them work in Zeppelin: you can run any supported version of Spark as long as you specify SPARK_HOME. Zeppelin supports all versions of Spark from 3.3 to 3.5.
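For example, to run the Zeppelin you built against an existing Spark installation, you would typically export `SPARK_HOME` in `conf/zeppelin-env.sh`; the install path below is only a placeholder:

```bash
# conf/zeppelin-env.sh
# Point the Spark interpreter at an existing Spark installation (hypothetical path)
export SPARK_HOME=/opt/spark-3.5.1-bin-hadoop3
```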
To build with a specific Spark or Scala version, define one or more of the following profiles and options:
`-Pspark-[version]`: set the Spark major version.
Available profiles are:

```
-Pspark-3.5 -Pspark-3.4 -Pspark-3.3
```
The minor version can be adjusted with `-Dspark.version=x.x.x` (see the combined example below).
`-Pspark-scala-[version]` (optional): note that these profiles, too, only affect the unit tests of the Spark interpreter (no need to specify SPARK_HOME). Zeppelin supports all Scala versions (2.12, 2.13) in the Spark interpreter as long as you specify SPARK_HOME.
Available profiles are:

```
-Pspark-scala-2.12 -Pspark-scala-2.13
```
Note that the Hadoop profiles only affect the Zeppelin server; they do not affect any interpreter. The Zeppelin server uses Hadoop in some cases, such as using HDFS as notebook storage. You can check this page for more details about how to configure Hadoop in Zeppelin.
The Hadoop version can be adjusted with `-Dhadoop.version=x.x.x`.
`-Pvendor-repo` (optional): enable third-party vendor repositories (Cloudera, Hortonworks); see the sketch below.
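As a sketch, the Hadoop version override and the vendor repository profile can be combined on one command line (the version shown is only illustrative):

```bash
./mvnw clean package -Dhadoop.version=3.3.6 -Pvendor-repo -DskipTests
```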
`-Pexamples` (optional): build the examples under the `zeppelin-examples` directory.
Here are some examples with several options:
```bash
# build with spark-3.5, spark-scala-2.12
./mvnw clean package -Pspark-3.5 -Pspark-scala-2.12 -DskipTests

# build with spark-3.5, spark-scala-2.13
./mvnw clean package -Pspark-3.5 -Pspark-scala-2.13 -DskipTests
```
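If you also need to pin a minor Spark version for the unit tests, the `-Dspark.version` override mentioned above can be added to the same command; 3.5.1 is just an illustrative value:

```bash
# build with the spark-3.5 profile, pinning a specific minor version
./mvnw clean package -Pspark-3.5 -Pspark-scala-2.13 -Dspark.version=3.5.1 -DskipTests
```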
Ignite Interpreter
```bash
./mvnw clean package -Dignite.version=1.9.0 -DskipTests
```
Here are additional configurations that can optionally be tuned using the trailing `-D` option in Maven commands:
Spark package
```
spark.archive            # default spark-${spark.version}
spark.src.download.url   # default http://d3kbcqa49mib13.cloudfront.net/${spark.archive}.tgz
spark.bin.download.url   # default http://d3kbcqa49mib13.cloudfront.net/${spark.archive}-bin-without-hadoop.tgz
```
Py4J package
```
python.py4j.version       # default 0.10.9.7
pypi.repo.url             # default https://pypi.python.org/packages
python.py4j.repo.folder   # default /64/5c/01e13b68e8caafece40d549f232c9b5677ad1016071a48d04cc3895acaa3
```
The final URL for the Py4J package is produced as follows:
```
${pypi.repo.url}${python.py4j.repo.folder}py4j-${python.py4j.version}.zip
```
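With the defaults above (and assuming a `/` separator between the repo folder and the file name), this resolves to roughly:

```
https://pypi.python.org/packages/64/5c/01e13b68e8caafece40d549f232c9b5677ad1016071a48d04cc3895acaa3/py4j-0.10.9.7.zip
```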
Frontend Maven Plugin configurations
```
plugin.frontend.nodeDownloadRoot   # default https://nodejs.org/dist/
plugin.frontend.npmDownloadRoot    # default https://registry.npmjs.org/npm/-/
plugin.frontend.yarnDownloadRoot   # default https://github.com/yarnpkg/yarn/releases/download/
```
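If your environment mirrors these artifacts internally, the download roots can be overridden on the Maven command line; the mirror URLs below are placeholders, not real endpoints:

```bash
./mvnw clean package -DskipTests \
  -Dplugin.frontend.nodeDownloadRoot=https://mirror.example.com/nodejs/dist/ \
  -Dplugin.frontend.npmDownloadRoot=https://mirror.example.com/npm/ \
  -Dplugin.frontend.yarnDownloadRoot=https://mirror.example.com/yarn/
```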
If you don't have the requirements prepared, install them. (The installation method may vary according to your environment; the example below is for Ubuntu.)
```bash
sudo apt-get update
sudo apt-get install git
sudo apt-get install openjdk-11-jdk
sudo apt-get install npm
sudo apt-get install libfontconfig
sudo apt-get install r-base-dev
sudo apt-get install r-cran-evaluate
```
Notes:

- Ensure Node.js is installed by running `node --version`.
- Ensure Maven is working by running `./mvnw -version`.
- Configure Maven to use more memory than usual: `export MAVEN_OPTS="-Xmx2g -XX:MaxMetaspaceSize=512m"`

If you're behind a proxy, you'll need to configure Maven and npm to pass through it.
First of all, configure Maven in your `~/.m2/settings.xml`:
```xml
<settings>
  <proxies>
    <proxy>
      <id>proxy-http</id>
      <active>true</active>
      <protocol>http</protocol>
      <host>localhost</host>
      <port>3128</port>
      <!-- <username>usr</username> <password>pwd</password> -->
      <nonProxyHosts>localhost|127.0.0.1</nonProxyHosts>
    </proxy>
    <proxy>
      <id>proxy-https</id>
      <active>true</active>
      <protocol>https</protocol>
      <host>localhost</host>
      <port>3128</port>
      <!-- <username>usr</username> <password>pwd</password> -->
      <nonProxyHosts>localhost|127.0.0.1</nonProxyHosts>
    </proxy>
  </proxies>
</settings>
```
Then, the following commands will configure npm:
```bash
npm config set proxy http://localhost:3128
npm config set https-proxy http://localhost:3128
npm config set registry "http://registry.npmjs.org/"
npm config set strict-ssl false
```
Configure Git as well:
```bash
git config --global http.proxy http://localhost:3128
git config --global https.proxy http://localhost:3128
git config --global url."http://".insteadOf git://
```
To clean up, set `<active>false</active>` in Maven's `settings.xml` and run these commands:
```bash
npm config rm proxy
npm config rm https-proxy
git config --global --unset http.proxy
git config --global --unset https.proxy
git config --global --unset url."http://".insteadOf
```
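To confirm the settings are actually gone, you can query both tools; this quick check is not part of the original instructions:

```bash
npm config get proxy                   # prints "null" once the proxy is removed
git config --global --get http.proxy   # prints nothing once the proxy is unset
```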
Notes:

- If the proxy requires authentication, replace `localhost:3128` with the standard pattern `http://user:pwd@host:port`.

To package the final distribution including the compressed archive, run:
```bash
./mvnw clean package -Pbuild-distr
```
To build a distribution with specific profiles, run:
```bash
./mvnw clean package -Pbuild-distr -Pspark-3.5
```
The profile `-Pspark-3.5` can be adjusted if you wish to build against a specific Spark version.
The archive is generated under the `zeppelin-distribution/target` directory.
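Assuming the default naming convention for the distribution archive, it can be located with:

```bash
ls zeppelin-distribution/target/*.tgz
```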
Zeppelin comes with a set of end-to-end acceptance tests driven through a headless Selenium browser:
```bash
# assumes zeppelin-server running on localhost:8080 (use -Durl=.. to override)
mvn verify

# or take care of starting/stopping zeppelin-server from packaged zeppelin-distribution/target
mvn verify -P using-packaged-distr
```
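For instance, to run the suite against a Zeppelin server on another host or port (the address below is hypothetical):

```bash
mvn verify -Durl=http://zeppelin.example.com:8080
```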