<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Apache Zeppelin 0.10.1 Documentation: Install</title>
<meta name="description" content="This page will help you get started and will guide you through installing Apache Zeppelin and running it in the command line.">
<meta name="author" content="The Apache Software Foundation">
<!-- Enable responsive viewport -->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Le HTML5 shim, for IE6-8 support of HTML elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link href="/docs/0.10.1/assets/themes/zeppelin/font-awesome.min.css" rel="stylesheet">
<!-- Le styles -->
<link href="/docs/0.10.1/assets/themes/zeppelin/bootstrap/css/bootstrap.css" rel="stylesheet">
<link href="/docs/0.10.1/assets/themes/zeppelin/css/style.css?body=1" rel="stylesheet" type="text/css">
<link href="/docs/0.10.1/assets/themes/zeppelin/css/syntax.css" rel="stylesheet" type="text/css" media="screen" />
<!-- Le fav and touch icons -->
<!-- Update these with your own images
<link rel="shortcut icon" href="images/favicon.ico">
<link rel="apple-touch-icon" href="images/apple-touch-icon.png">
<link rel="apple-touch-icon" sizes="72x72" href="images/apple-touch-icon-72x72.png">
<link rel="apple-touch-icon" sizes="114x114" href="images/apple-touch-icon-114x114.png">
-->
<!-- Js -->
<script src="/docs/0.10.1/assets/themes/zeppelin/jquery-1.10.2.min.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/bootstrap/js/bootstrap.min.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/js/docs.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/js/anchor.min.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/js/toc.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/js/lunr.min.js"></script>
<script src="/docs/0.10.1/assets/themes/zeppelin/js/search.js"></script>
<!-- atom & rss feed -->
<link href="/docs/0.10.1/atom.xml" type="application/atom+xml" rel="alternate" title="Sitewide ATOM Feed">
<link href="/docs/0.10.1/rss.xml" type="application/rss+xml" rel="alternate" title="Sitewide RSS Feed">
<!-- Matomo -->
<script>
var _paq = window._paq = window._paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
_paq.push(["setDoNotTrack", true]);
_paq.push(["disableCookies"]);
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
var u="https://analytics.apache.org/";
_paq.push(['setTrackerUrl', u+'matomo.php']);
_paq.push(['setSiteId', '69']);
var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
})();
</script>
<!-- End Matomo Code -->
</head>
<body>
<div id="menu" class="navbar navbar-inverse navbar-fixed-top" role="navigation">
<div class="container navbar-container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<div class="navbar-brand">
<a class="navbar-brand-main" href="http://zeppelin.apache.org">
<img src="/docs/0.10.1/assets/themes/zeppelin/img/zeppelin_logo.png" width="50"
style="margin-top: -2px;" alt="I'm zeppelin">
<span style="margin-left: 5px; font-size: 27px;">Zeppelin</span>
<a class="navbar-brand-version" href="/docs/0.10.1"
style="font-size: 15px; color: white;"> 0.10.1
</a>
</a>
</div>
</div>
<nav class="navbar-collapse collapse" role="navigation">
<ul class="nav navbar-nav">
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Quick Start <b class="caret"></b></a>
<ul class="dropdown-menu">
<li class="title"><span>Getting Started</span></li>
<li><a href="/docs/0.10.1/quickstart/install.html">Install</a></li>
<li><a href="/docs/0.10.1/quickstart/explore_ui.html">Explore UI</a></li>
<li><a href="/docs/0.10.1/quickstart/tutorial.html">Tutorial</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Run Mode</span></li>
<li><a href="/docs/0.10.1/quickstart/kubernetes.html">Kubernetes</a></li>
<li><a href="/docs/0.10.1/quickstart/docker.html">Docker</a></li>
<li><a href="/docs/0.10.1/quickstart/yarn.html">Yarn</a></li>
<li role="separator" class="divider"></li>
<li><a href="/docs/0.10.1/quickstart/spark_with_zeppelin.html">Spark with Zeppelin</a></li>
<li><a href="/docs/0.10.1/quickstart/flink_with_zeppelin.html">Flink with Zeppelin</a></li>
<li><a href="/docs/0.10.1/quickstart/sql_with_zeppelin.html">SQL with Zeppelin</a></li>
<li><a href="/docs/0.10.1/quickstart/python_with_zeppelin.html">Python with Zeppelin</a></li>
<li><a href="/docs/0.10.1/quickstart/r_with_zeppelin.html">R with Zeppelin</a></li>
</ul>
</li>
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Usage<b class="caret"></b></a>
<ul class="dropdown-menu scrollable-menu">
<li class="title"><span>Dynamic Form</span></li>
<li><a href="/docs/0.10.1/usage/dynamic_form/intro.html">What is Dynamic Form?</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Display System</span></li>
<li><a href="/docs/0.10.1/usage/display_system/basic.html#text">Text Display</a></li>
<li><a href="/docs/0.10.1/usage/display_system/basic.html#html">HTML Display</a></li>
<li><a href="/docs/0.10.1/usage/display_system/basic.html#table">Table Display</a></li>
<li><a href="/docs/0.10.1/usage/display_system/basic.html#network">Network Display</a></li>
<li><a href="/docs/0.10.1/usage/display_system/angular_backend.html">Angular Display using Backend API</a></li>
<li><a href="/docs/0.10.1/usage/display_system/angular_frontend.html">Angular Display using Frontend API</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Interpreter</span></li>
<li><a href="/docs/0.10.1/usage/interpreter/overview.html">Overview</a></li>
<li><a href="/docs/0.10.1/usage/interpreter/interpreter_binding_mode.html">Interpreter Binding Mode</a></li>
<li><a href="/docs/0.10.1/usage/interpreter/user_impersonation.html">User Impersonation</a></li>
<li><a href="/docs/0.10.1/usage/interpreter/dependency_management.html">Dependency Management</a></li>
<li><a href="/docs/0.10.1/usage/interpreter/installation.html">Installing Interpreters</a></li>
<!--<li><a href="/docs/0.10.1/usage/interpreter/dynamic_loading.html">Dynamic Interpreter Loading (Experimental)</a></li>-->
<li><a href="/docs/0.10.1/usage/interpreter/execution_hooks.html">Execution Hooks (Experimental)</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Other Features</span></li>
<li><a href="/docs/0.10.1/usage/other_features/publishing_paragraphs.html">Publishing Paragraphs</a></li>
<li><a href="/docs/0.10.1/usage/other_features/personalized_mode.html">Personalized Mode</a></li>
<li><a href="/docs/0.10.1/usage/other_features/customizing_homepage.html">Customizing Zeppelin Homepage</a></li>
<li><a href="/docs/0.10.1/usage/other_features/notebook_actions.html">Notebook Actions</a></li>
<li><a href="/docs/0.10.1/usage/other_features/cron_scheduler.html">Cron Scheduler</a></li>
<li><a href="/docs/0.10.1/usage/other_features/zeppelin_context.html">Zeppelin Context</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>REST API</span></li>
<li><a href="/docs/0.10.1/usage/rest_api/interpreter.html">Interpreter API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/zeppelin_server.html">Zeppelin Server API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/notebook.html">Notebook API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/notebook_repository.html">Notebook Repository API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/configuration.html">Configuration API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/credential.html">Credential API</a></li>
<li><a href="/docs/0.10.1/usage/rest_api/helium.html">Helium API</a></li>
<li class="title"><span>Zeppelin SDK</span></li>
<li><a href="/docs/0.10.1/usage/zeppelin_sdk/client_api.html">Client API</a></li>
<li><a href="/docs/0.10.1/usage/zeppelin_sdk/session_api.html">Session API</a></li>
</ul>
</li>
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Setup<b class="caret"></b></a>
<ul class="dropdown-menu scrollable-menu">
<li class="title"><span>Basics</span></li>
<li><a href="/docs/0.10.1/setup/basics/how_to_build.html">How to Build Zeppelin</a></li>
<li><a href="/docs/0.10.1/setup/basics/hadoop_integration.html">Hadoop Integration</a></li>
<li><a href="/docs/0.10.1/setup/basics/multi_user_support.html">Multi-user Support</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Deployment</span></li>
<!--<li><a href="/docs/0.10.1/setup/deployment/docker.html">Docker Image for Zeppelin</a></li>-->
<li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-standalone-mode">Spark Cluster Mode: Standalone</a></li>
<li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-on-yarn-mode">Spark Cluster Mode: YARN</a></li>
<li><a href="/docs/0.10.1/setup/deployment/spark_cluster_mode.html#spark-on-mesos-mode">Spark Cluster Mode: Mesos</a></li>
<li><a href="/docs/0.10.1/setup/deployment/flink_and_spark_cluster.html">Zeppelin with Flink, Spark Cluster</a></li>
<li><a href="/docs/0.10.1/setup/deployment/cdh.html">Zeppelin on CDH</a></li>
<li><a href="/docs/0.10.1/setup/deployment/virtual_machine.html">Zeppelin on VM: Vagrant</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Security</span></li>
<li><a href="/docs/0.10.1/setup/security/authentication_nginx.html">HTTP Basic Auth using NGINX</a></li>
<li><a href="/docs/0.10.1/setup/security/shiro_authentication.html">Shiro Authentication</a></li>
<li><a href="/docs/0.10.1/setup/security/notebook_authorization.html">Notebook Authorization</a></li>
<li><a href="/docs/0.10.1/setup/security/datasource_authorization.html">Data Source Authorization</a></li>
<li><a href="/docs/0.10.1/setup/security/http_security_headers.html">HTTP Security Headers</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Notebook Storage</span></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-local-git-repository">Git Storage</a></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-s3">S3 Storage</a></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-azure">Azure Storage</a></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-oss">OSS Storage</a></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-zeppelinhub">ZeppelinHub Storage</a></li>
<li><a href="/docs/0.10.1/setup/storage/storage.html#notebook-storage-in-mongodb">MongoDB Storage</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Operation</span></li>
<li><a href="/docs/0.10.1/setup/operation/configuration.html">Configuration</a></li>
<li><a href="/docs/0.10.1/setup/operation/proxy_setting.html">Proxy Setting</a></li>
<li><a href="/docs/0.10.1/setup/operation/upgrading.html">Upgrading</a></li>
<li><a href="/docs/0.10.1/setup/operation/trouble_shooting.html">Trouble Shooting</a></li>
</ul>
</li>
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Interpreter <b class="caret"></b></a>
<ul class="dropdown-menu scrollable-menu">
<li class="title"><span>Interpreters</span></li>
<li><a href="/docs/0.10.1/usage/interpreter/overview.html">Overview</a></li>
<li role="separator" class="divider"></li>
<li><a href="/docs/0.10.1/interpreter/spark.html">Spark</a></li>
<li><a href="/docs/0.10.1/interpreter/flink.html">Flink</a></li>
<li><a href="/docs/0.10.1/interpreter/jdbc.html">JDBC</a></li>
<li><a href="/docs/0.10.1/interpreter/python.html">Python</a></li>
<li><a href="/docs/0.10.1/interpreter/r.html">R</a></li>
<li role="separator" class="divider"></li>
<li><a href="/docs/0.10.1/interpreter/alluxio.html">Alluxio</a></li>
<li><a href="/docs/0.10.1/interpreter/beam.html">Beam</a></li>
<li><a href="/docs/0.10.1/interpreter/bigquery.html">BigQuery</a></li>
<li><a href="/docs/0.10.1/interpreter/cassandra.html">Cassandra</a></li>
<li><a href="/docs/0.10.1/interpreter/elasticsearch.html">Elasticsearch</a></li>
<li><a href="/docs/0.10.1/interpreter/geode.html">Geode</a></li>
<li><a href="/docs/0.10.1/interpreter/groovy.html">Groovy</a></li>
<li><a href="/docs/0.10.1/interpreter/hazelcastjet.html">Hazelcast Jet</a></li>
<li><a href="/docs/0.10.1/interpreter/hbase.html">HBase</a></li>
<li><a href="/docs/0.10.1/interpreter/hdfs.html">HDFS</a></li>
<li><a href="/docs/0.10.1/interpreter/hive.html">Hive</a></li>
<li><a href="/docs/0.10.1/interpreter/ignite.html">Ignite</a></li>
<li><a href="/docs/0.10.1/interpreter/influxdb.html">influxDB</a></li>
<li><a href="/docs/0.10.1/interpreter/java.html">Java</a></li>
<li><a href="/docs/0.10.1/interpreter/jupyter.html">Jupyter</a></li>
<li><a href="/docs/0.10.1/interpreter/kotlin.html">Kotlin</a></li>
<li><a href="/docs/0.10.1/interpreter/ksql.html">KSQL</a></li>
<li><a href="/docs/0.10.1/interpreter/kylin.html">Kylin</a></li>
<li><a href="/docs/0.10.1/interpreter/lens.html">Lens</a></li>
<li><a href="/docs/0.10.1/interpreter/livy.html">Livy</a></li>
<li><a href="/docs/0.10.1/interpreter/mahout.html">Mahout</a></li>
<li><a href="/docs/0.10.1/interpreter/markdown.html">Markdown</a></li>
<li><a href="/docs/0.10.1/interpreter/mongodb.html">MongoDB</a></li>
<li><a href="/docs/0.10.1/interpreter/neo4j.html">Neo4j</a></li>
<li><a href="/docs/0.10.1/interpreter/pig.html">Pig</a></li>
<li><a href="/docs/0.10.1/interpreter/postgresql.html">Postgresql, HAWQ</a></li>
<li><a href="/docs/0.10.1/interpreter/sap.html">SAP</a></li>
<li><a href="/docs/0.10.1/interpreter/scalding.html">Scalding</a></li>
<li><a href="/docs/0.10.1/interpreter/scio.html">Scio</a></li>
<li><a href="/docs/0.10.1/interpreter/shell.html">Shell</a></li>
<li><a href="/docs/0.10.1/interpreter/sparql.html">Sparql</a></li>
<li><a href="/docs/0.10.1/interpreter/submarine.html">Submarine</a></li>
</ul>
</li>
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">More<b class="caret"></b></a>
<ul class="dropdown-menu scrollable-menu" style="right: 0; left: auto;">
<li class="title"><span>Extending Zeppelin</span></li>
<li><a href="/docs/0.10.1/development/writing_zeppelin_interpreter.html">Writing Zeppelin Interpreter</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Helium (Experimental)</span></li>
<li><a href="/docs/0.10.1/development/helium/overview.html">Overview</a></li>
<li><a href="/docs/0.10.1/development/helium/writing_application.html">Writing Helium Application</a></li>
<li><a href="/docs/0.10.1/development/helium/writing_spell.html">Writing Helium Spell</a></li>
<li><a href="/docs/0.10.1/development/helium/writing_visualization_basic.html">Writing Helium Visualization: Basics</a></li>
<li><a href="/docs/0.10.1/development/helium/writing_visualization_transformation.html">Writing Helium Visualization: Transformation</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>Contributing to Zeppelin</span></li>
<li><a href="/docs/0.10.1/setup/basics/how_to_build.html">How to Build Zeppelin</a></li>
<li><a href="/docs/0.10.1/development/contribution/useful_developer_tools.html">Useful Developer Tools</a></li>
<li><a href="/docs/0.10.1/development/contribution/how_to_contribute_code.html">How to Contribute (code)</a></li>
<li><a href="/docs/0.10.1/development/contribution/how_to_contribute_website.html">How to Contribute (website)</a></li>
<li role="separator" class="divider"></li>
<li class="title"><span>External Resources</span></li>
<li><a target="_blank" rel="noopener noreferrer" href="https://zeppelin.apache.org/community.html">Mailing List</a></li>
<li><a target="_blank" rel="noopener noreferrer" href="https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Home">Apache Zeppelin Wiki</a></li>
<li><a target="_blank" rel="noopener noreferrer" href="http://stackoverflow.com/questions/tagged/apache-zeppelin">Stackoverflow Questions about Zeppelin</a></li>
</ul>
</li>
<li>
<a href="/docs/0.10.1/search.html" class="nav-search-link">
<span class="fa fa-search nav-search-icon"></span>
</a>
</li>
</ul>
</nav><!--/.navbar-collapse -->
</div>
</div>
<div class="content">
<!--<div class="hero-unit Install">
<h1></h1>
</div>
-->
<div class="row">
<div class="col-md-12">
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<h1>Zeppelin interpreter on Docker</h1>
<p>The Zeppelin service runs on a local server and can launch each interpreter in a Docker container, which isolates the interpreter's operating environment. As a result, Zeppelin can be used without installing Python, Spark, etc. on the local node.</p>
<p>Key benefits:</p>
<ul>
<li>Each interpreter runs in an isolated environment</li>
<li>No need to install Python, Spark, etc. on the local node</li>
<li>The Docker image does not need the Zeppelin binary package pre-installed; the local Zeppelin interpreter lib files are uploaded to the container automatically</li>
<li>Local configuration files (such as spark-conf, hadoop-conf-dir, keytab file, ...) are uploaded to the container automatically, so the running environment in the container matches the local one</li>
<li>The Zeppelin server runs locally, which makes it easier to manage and maintain</li>
</ul>
<h2>Prerequisites</h2>
<ul>
<li>The apache/zeppelin Docker image</li>
<li>A Spark &gt;= 2.2.0 Docker image (if you use the Spark interpreter)</li>
<li>Docker 1.6+ (<a href="https://docs.docker.com/v17.12/install/">Install Docker</a>)</li>
<li>Interpreter containers use Docker&#39;s host network, so no dedicated network setup is needed</li>
</ul>
<h3>Docker Configuration</h3>
<p><code>DockerInterpreterProcess</code> communicates with Docker through its TCP interface. By default, Docker only exposes a Unix socket file, so you need to modify the Docker configuration file to open the TCP interface.</p>
<p>Edit <code>/etc/docker/daemon.json</code> and add <code>tcp://0.0.0.0:2375</code> to the <code>hosts</code> configuration item.</p>
<div class="highlight"><pre><code class="json language-json" data-lang="json"><span class="p">{</span>
<span class="err">...</span>
<span class="nt">&quot;hosts&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;tcp://0.0.0.0:2375&quot;</span><span class="p">,</span><span class="s2">&quot;unix:///var/run/docker.sock&quot;</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<p>For the <code>hosts</code> property, see the <a href="https://docs.docker.com/engine/reference/commandline/dockerd/">dockerd reference</a>.</p>
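<p>Once the Docker daemon has been restarted with the new <code>hosts</code> setting, it can help to confirm that the TCP endpoint answers before starting Zeppelin. The commands below are a minimal sketch, assuming the daemon listens on port 2375 of the local host as configured above and that <code>curl</code> is installed.</p>
<div class="highlight"><pre><code class="bash language-bash" data-lang="bash"># Query the daemon over TCP instead of the default Unix socket.
# Assumption: the daemon is reachable at 127.0.0.1:2375.
docker -H tcp://127.0.0.1:2375 version

# The same check against the Docker Engine HTTP API.
curl http://127.0.0.1:2375/version
</code></pre></div>
<p>If both commands print version information, the interface that <code>DockerInterpreterProcess</code> relies on is reachable.</p>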
<h2>Quickstart</h2>
<ol>
<li><p>Set these two configuration items in <code>zeppelin-site.xml</code>.</p>
<div class="highlight"><pre><code class="xml language-xml" data-lang="xml"><span class="nt">&lt;property&gt;</span>
<span class="nt">&lt;name&gt;</span>zeppelin.run.mode<span class="nt">&lt;/name&gt;</span>
<span class="nt">&lt;value&gt;</span>docker<span class="nt">&lt;/value&gt;</span>
<span class="nt">&lt;description&gt;</span>&#39;auto|local|k8s|docker&#39;<span class="nt">&lt;/description&gt;</span>
<span class="nt">&lt;/property&gt;</span>
<span class="nt">&lt;property&gt;</span>
<span class="nt">&lt;name&gt;</span>zeppelin.docker.container.image<span class="nt">&lt;/name&gt;</span>
<span class="nt">&lt;value&gt;</span>apache/zeppelin<span class="nt">&lt;/value&gt;</span>
<span class="nt">&lt;description&gt;</span>Docker image for interpreters<span class="nt">&lt;/description&gt;</span>
<span class="nt">&lt;/property&gt;</span>
</code></pre></div></li>
<li><p>Set the time zone in <code>zeppelin-env.sh</code>.</p>
<p>Use the same time zone as the Zeppelin server so that the time zone in the interpreter Docker container matches the server, e.g. <code>&quot;America/New_York&quot;</code> or <code>&quot;Asia/Shanghai&quot;</code>.</p>
<div class="highlight"><pre><code class="bash language-bash" data-lang="bash"><span class="nb">export </span><span class="nv">DOCKER_TIME_ZONE</span><span class="o">=</span><span class="s2">&quot;America/New_York&quot;</span>
</code></pre></div></li>
</ol>
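<p>With both steps above in place, the settings take effect the next time the Zeppelin server starts. The following is a minimal sketch, assuming Zeppelin was installed from the binary package, that the commands are run from <code>${ZEPPELIN_HOME}</code>, and that the <code>docker</code> CLI is available on the same host.</p>
<div class="highlight"><pre><code class="bash language-bash" data-lang="bash"># Restart the Zeppelin server so that zeppelin-site.xml and
# zeppelin-env.sh are read again.
bin/zeppelin-daemon.sh restart

# After running a paragraph in a notebook, list the running
# containers to see whether an interpreter container was started.
docker ps
</code></pre></div>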
<h2>Build Zeppelin image manually</h2>
<p>You can build a Zeppelin image that supports Kerberos authentication and includes the Spark binaries.</p>
<p>Use the <code>/scripts/docker/interpreter/Dockerfile</code> to build the image.</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">FROM apache/zeppelin:0.8.0
MAINTAINER Apache Software Foundation &lt;dev@zeppelin.apache.org&gt;
ENV SPARK_VERSION=2.3.3
ENV HADOOP_VERSION=2.7
# support Kerberos certification
RUN export DEBIAN_FRONTEND=noninteractive &amp;&amp; apt-get update &amp;&amp; apt-get install -yq krb5-user libpam-krb5 &amp;&amp; apt-get clean
RUN apt-get update &amp;&amp; apt-get install -y curl unzip wget grep sed vim tzdata &amp;&amp; apt-get clean
# auto upload zeppelin interpreter lib
RUN rm -rf /zeppelin
RUN rm -rf /spark
RUN wget https://www-us.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
RUN tar zxvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
RUN mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} spark
RUN rm spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
</code></pre></div>
<p>Then build the Docker image.</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text"># build image. Replace &lt;tag&gt;.
$ docker build -t &lt;tag&gt; .
</code></pre></div>
<h2>How it works</h2>
<h3>Zeppelin interpreter on Docker</h3>
<p>The Zeppelin service runs on a local server and automatically configures itself to use <code>DockerInterpreterLauncher</code>.</p>
<p><code>DockerInterpreterLauncher</code> uses <code>DockerInterpreterProcess</code> to create each interpreter in a container from the Docker image.</p>
<p><code>DockerInterpreterProcess</code> uploads the binaries and configuration files of the local Zeppelin service to the container:</p>
<ul>
<li>${ZEPPELIN_HOME}/bin</li>
<li>${ZEPPELIN_HOME}/lib</li>
<li>${ZEPPELIN_HOME}/interpreter/${interpreterGroupName}</li>
<li>${ZEPPELIN_HOME}/conf/zeppelin-site.xml</li>
<li>${ZEPPELIN_HOME}/conf/log4j.properties</li>
<li>${ZEPPELIN_HOME}/conf/log4j_yarn_cluster.properties</li>
<li>HADOOP_CONF_DIR</li>
<li>SPARK_CONF_DIR</li>
<li>/etc/krb5.conf</li>
<li>Keytab file configured in the interpreter properties
<ul>
<li>zeppelin.shell.keytab.location</li>
<li>spark.yarn.keytab</li>
<li>submarine.hadoop.keytab</li>
<li>zeppelin.jdbc.keytab.location</li>
<li>zeppelin.server.kerberos.keytab</li>
</ul></li>
</ul>
<p>Every file uploaded to the container keeps the same path it has on the local machine, which ensures that all configurations are used correctly.</p>
<h3>Spark interpreter on Docker</h3>
<p>When the interpreter group is <code>spark</code>, Zeppelin automatically sets the Spark configuration needed to use Spark on Docker.
All running modes of the Zeppelin Spark interpreter are supported: <code>local[*]</code>, <code>yarn-client</code>, and <code>yarn-cluster</code>.</p>
<h4>SPARK_CONF_DIR</h4>
<ol>
<li><p>Configuring in <code>zeppelin-env.sh</code></p>
<p>The interpreter image contains only the Spark binaries; no Spark configuration files are included.
The configuration files in the <code>spark-&lt;version&gt;/conf/</code> directory local to the Zeppelin service need to be uploaded to the <code>/spark/conf/</code> directory in the Spark interpreter container.
To do this, set <code>export SPARK_CONF_DIR=/spark-&lt;version&gt;-path/conf/</code> in the <code>zeppelin-env.sh</code> file.</p></li>
<li><p>Configuring in the Spark interpreter properties</p>
<p>Alternatively, you can configure it in the Spark interpreter properties.</p>
<table><thead>
<tr>
<th>Property name</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>SPARK_CONF_DIR</td>
<td>/spark-&lt;version&gt;-path.../conf/</td>
<td>Path to spark-&lt;version&gt;-path/conf/ local to the Zeppelin service</td>
</tr>
</tbody></table></li>
</ol>
<h4>HADOOP_CONF_DIR</h4>
<ol>
<li><p>Configuring in <code>zeppelin-env.sh</code></p>
<p>The interpreter image contains only the Spark binaries; no Hadoop configuration files are included.
The configuration files in the <code>hadoop-&lt;version&gt;/etc/hadoop</code> directory local to the Zeppelin service need to be uploaded to the Spark interpreter container.
To do this, set <code>export HADOOP_CONF_DIR=hadoop-&lt;version&gt;-path/etc/hadoop</code> in the <code>zeppelin-env.sh</code> file.</p></li>
<li><p>Configuring in the Spark interpreter properties</p>
<p>Alternatively, you can configure it in the Spark interpreter properties.</p>
<table><thead>
<tr>
<th>Property name</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>HADOOP_CONF_DIR</td>
<td>hadoop-&lt;version&gt;-path/etc/hadoop</td>
<td>Path to hadoop-&lt;version&gt;-path/etc/hadoop local to the Zeppelin service</td>
</tr>
</tbody></table></li>
</ol>
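<p>The two <code>zeppelin-env.sh</code> settings described above for <code>SPARK_CONF_DIR</code> and <code>HADOOP_CONF_DIR</code> can be sketched together as follows. The installation paths <code>/opt/spark-2.3.3</code> and <code>/opt/hadoop-2.7.7</code> are hypothetical; substitute the directories where Spark and Hadoop are installed on the host that runs the Zeppelin service.</p>
<div class="highlight"><pre><code class="bash language-bash" data-lang="bash"># zeppelin-env.sh (sketch; both paths below are hypothetical examples)

# Local Spark configuration directory to upload to the interpreter container.
export SPARK_CONF_DIR=/opt/spark-2.3.3/conf/

# Local Hadoop configuration directory to upload to the interpreter container.
export HADOOP_CONF_DIR=/opt/hadoop-2.7.7/etc/hadoop
</code></pre></div>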
<h4>Accessing Spark UI (or Service running in interpreter container)</h4>
<p>Because the Zeppelin interpreter container uses the host network, the <code>spark.ui.port</code> port is allocated automatically. Do not configure <code>spark.ui.port=xxxx</code> in <code>spark-defaults.conf</code>.</p>
<h2>Future work</h2>
<ul>
<li>Make the container resources available to each interpreter configurable.</li>
</ul>
<h2>Development</h2>
<p>Instead of building the Zeppelin distribution package and Docker image every time during development,
you can run Zeppelin locally (for example, inside your IDE in debug mode) and have it run interpreters through <a href="https://github.com/apache/zeppelin/blob/master/zeppelin-plugins/launcher/docker/src/main/java/org/apache/zeppelin/interpreter/launcher/DockerInterpreterLauncher.java">DockerInterpreterLauncher</a> by setting the following configuration variables.</p>
<ol>
<li>zeppelin-site.xml</li>
</ol>
<table><thead>
<tr>
<th>Configuration variable</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td><code>ZEPPELIN_RUN_MODE</code></td>
<td><code>docker</code></td>
<td>Make Zeppelin run interpreter on Docker</td>
</tr>
<tr>
<td><code>ZEPPELIN_DOCKER_CONTAINER_IMAGE</code></td>
<td><code>&lt;image&gt;:&lt;version&gt;</code></td>
<td>Zeppelin interpreter docker image to use</td>
</tr>
</tbody></table>
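<p>When Zeppelin is started from a shell, the two variables in the table above could be exported as shown below. This is a minimal sketch; the image name and tag <code>apache/zeppelin:0.10.1</code> are an assumed example of the <code>&lt;image&gt;:&lt;version&gt;</code> placeholder, so replace them with the image you actually want to test.</p>
<div class="highlight"><pre><code class="bash language-bash" data-lang="bash"># Run interpreters in Docker containers.
export ZEPPELIN_RUN_MODE=docker

# Interpreter image to launch (assumed example value).
export ZEPPELIN_DOCKER_CONTAINER_IMAGE=apache/zeppelin:0.10.1
</code></pre></div>
<p>When Zeppelin is launched from an IDE, the same two variables would be set in the run configuration&#39;s environment instead.</p>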
</div>
</div>
<hr>
<footer>
<!-- <p>&copy; 2022 The Apache Software Foundation</p>-->
</footer>
</div>
</body>
</html>