<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2018-03-12
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20180312" />
<meta http-equiv="Content-Language" content="en" />
<title>Falcon - Embedded Mode</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
</head>
<body class="topBarDisabled">
<div class="container">
<div id="banner">
<div class="pull-left">
<div id="bannerLeft">
<img src="images/falcon-logo.png" alt="Apache Falcon" width="200px" height="45px"/>
</div>
</div>
<div class="pull-right"> </div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class="">
<a href="index.html" title="Falcon">
Falcon</a>
</li>
<li class="divider ">/</li>
<li class="">Embedded Mode</li>
<li id="publishDate" class="pull-right">Last Published: 2018-03-12</li> <li class="divider pull-right">|</li>
<li id="projectVersion" class="pull-right">Version: 0.11</li>
</ul>
</div>
<div id="bodyColumn" >
<div class="section">
<h2>Embedded Mode<a name="Embedded_Mode"></a></h2>
<p>The following steps package and deploy Falcon in Embedded Mode. Complete Steps 1-3 of the <a href="./InstallationSteps.html">installation steps</a> before proceeding.</p></div>
<div class="section">
<h3>Package Falcon<a name="Package_Falcon"></a></h3>
<p>Ensure that you are in the base directory (where you cloned Falcon). Let's call it {project dir}.</p>
<div class="source">
<pre>
$mvn clean assembly:assembly -DskipTests -DskipCheck=true
</pre></div>
<div class="source">
<pre>
$ls {project dir}/distro/target/
</pre></div>
<p>The output should look like this:</p>
<div class="source">
<pre>
apache-falcon-${project.version}-bin.tar.gz
apache-falcon-${project.version}-sources.tar.gz
archive-tmp
maven-shared-archive-resources
</pre></div>
<p>* apache-falcon-${project.version}-sources.tar.gz contains the source files of the Falcon repo.</p>
<p>* apache-falcon-${project.version}-bin.tar.gz contains the project artifacts along with their dependencies, the configuration files, and the scripts required to deploy Falcon.</p>
<p>The binary tar can be found at {project dir}/distro/target/apache-falcon-${project.version}-bin.tar.gz</p>
<p>The tar is structured as follows:</p>
<div class="source">
<pre>
|- bin
   |- falcon
   |- falcon-start
   |- falcon-stop
   |- falcon-status
   |- falcon-config.sh
   |- service-start.sh
   |- service-stop.sh
   |- service-status.sh
|- conf
   |- startup.properties
   |- runtime.properties
   |- prism.keystore
   |- client.properties
   |- log4j.xml
   |- falcon-env.sh
|- docs
|- client
   |- lib (client support libs)
|- server
   |- webapp
      |- falcon.war
|- data
   |- falcon-store
   |- graphdb
   |- localhost
|- examples
   |- app
      |- hive
      |- oozie-mr
      |- pig
   |- data
   |- entity
      |- filesystem
      |- hcat
|- oozie
   |- conf
   |- libext
|- logs
|- hadooplibs
|- README
|- NOTICE.txt
|- LICENSE.txt
|- DISCLAIMER.txt
|- CHANGES.txt
</pre></div></div>
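<div class="section">
<p>After unpacking the tar, a quick sanity check can confirm that the directory matches the layout listed above. The function below is only a sketch: check_falcon_layout is a hypothetical helper name, and the entries it tests are a small sample taken from the listing above, not an authoritative manifest.</p>
<div class="source">
<pre>
#!/bin/sh
# Hypothetical helper (not shipped with Falcon): report entries from the
# documented layout that are missing under the given Falcon directory.
# Prints one line per missing entry and returns non-zero if any is missing.
check_falcon_layout() {
    dir="$1"
    missing=0
    for entry in bin/falcon bin/falcon-start bin/falcon-stop \
                 conf/startup.properties conf/client.properties; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}
</pre></div>
<p>Calling check_falcon_layout with the path of the extracted directory prints nothing and returns 0 when every sampled entry is present.</p></div>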
<div class="section">
<h3>Installing &amp; running Falcon<a name="Installing__running_Falcon"></a></h3>
<p>Running Falcon in embedded mode requires bringing up the Falcon server.</p>
<div class="source">
<pre>
$tar -xzvf {falcon package}
$cd falcon-${project.version}
</pre></div></div>
<div class="section">
<h4>Starting Falcon Server<a name="Starting_Falcon_Server"></a></h4>
<div class="source">
<pre>
$cd falcon-${project.version}
$bin/falcon-start [-port &lt;port&gt;]
</pre></div>
<p>By default:</p>
<p>* If falcon.enableTLS is set to true explicitly or is not set at all, Falcon starts on port 15443 over HTTPS.</p>
<p>* If falcon.enableTLS is set to false explicitly, Falcon starts on port 15000 over HTTP.</p>
<p>* To change the port, use the -port option.</p>
<p>* If falcon.enableTLS is not set explicitly, a port that ends with 443 automatically puts Falcon on HTTPS. Any other port puts Falcon on HTTP.</p>
<p>* The server starts with the conf from {falcon-server-dir}/falcon-${project.version}/conf. To override this (for example, to reuse the same conf across server upgrades), set the environment variable FALCON_CONF to the path of the conf dir. Instructions for configuring Falcon are <a href="./Configuration.html">here</a>.</p></div>
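<div class="section">
<p>The port and protocol rules above can be read as a small decision procedure. The function below is a sketch of that reading, not Falcon's own start-up logic. falcon_endpoint is a hypothetical name, and the case of falcon.enableTLS set to true together with a port that does not end in 443 is assumed to stay on HTTPS.</p>
<div class="source">
<pre>
#!/bin/sh
# Sketch of the documented rules (an interpretation, not Falcon's code).
# $1: value of falcon.enableTLS ("true", "false", or empty when not set)
# $2: port given with -port (empty when the option is omitted)
# Prints "scheme port".
falcon_endpoint() {
    tls="$1"
    port="$2"
    case "$tls" in
        true)  scheme=https ;;
        false) scheme=http ;;
        *)  # Not set: the port decides; without -port the default is 15443.
            case "${port:-15443}" in
                *443) scheme=https ;;
                *)    scheme=http ;;
            esac ;;
    esac
    # Fill in the default port for the chosen scheme when none was given.
    if [ -z "$port" ]; then
        if [ "$scheme" = https ]; then port=15443; else port=15000; fi
    fi
    echo "$scheme $port"
}
</pre></div>
<p>For example, falcon_endpoint &quot;&quot; 8443 prints https 8443, and falcon_endpoint false &quot;&quot; prints http 15000.</p></div>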
<div class="section">
<h4>Enabling server-client<a name="Enabling_server-client"></a></h4>
<p>If the server is not started on the default port 15443, edit the following property in {falcon-server-dir}/falcon-${project.version}/conf/client.properties:</p>
<p>falcon.url=http://{machine-ip}:{server-port}/</p></div>
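<div class="section">
<p>The property can be edited by hand, or with a small helper such as the sketch below. set_falcon_url is a hypothetical name, and the function assumes that client.properties already contains a falcon.url line and that the new URL contains no | or &amp; characters, which are special to sed here.</p>
<div class="source">
<pre>
#!/bin/sh
# Hypothetical helper (not part of Falcon): point falcon.url in a
# client.properties file at a new server URL. Only an existing falcon.url
# line is rewritten; the original file is kept alongside with a .bak suffix.
set_falcon_url() {
    file="$1"
    url="$2"
    sed -i.bak "s|^falcon\.url=.*|falcon.url=$url|" "$file"
}
</pre></div>
<p>A call takes the file and the new URL, for example set_falcon_url conf/client.properties http://{machine-ip}:{server-port}/ with the placeholders replaced by real values.</p></div>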
<div class="section">
<h4>Using Falcon<a name="Using_Falcon"></a></h4>
<div class="source">
<pre>
$cd falcon-${project.version}
$bin/falcon admin -version
Falcon server build version: {Version:&quot;${project.version}-SNAPSHOT-rd7e2be9afa2a5dc96acd1ec9e325f39c6b2f17f7&quot;,Mode:
&quot;embedded&quot;,Hadoop:&quot;${hadoop.version}&quot;}
$bin/falcon help
(for more details about Falcon cli usage)
</pre></div>
<p><b>Note</b>: HTTPS is the secure version of HTTP, the protocol over which data is sent between your browser and the website you are connected to. By default, Falcon runs in HTTPS mode, but it can be configured to use HTTP.</p></div>
<div class="section">
<h4>Dashboard<a name="Dashboard"></a></h4>
<p>Once the Falcon server is started, you can view the status of Falcon entities using the web-based dashboard. Open your browser at the corresponding port to use the web UI.</p>
<p>The Falcon dashboard makes its REST API calls as the user &quot;falcon-dashboard&quot;. If this user does not exist on your Falcon and Oozie servers, create it.</p>
<div class="source">
<pre>
## create user.
[root@falconhost ~] useradd -U -m falcon-dashboard -G users
## verify user is created with membership in correct groups.
[root@falconhost ~] groups falcon-dashboard
falcon-dashboard : falcon-dashboard users
[root@falconhost ~]
</pre></div></div>
<div class="section">
<h3>Running Examples using embedded package<a name="Running_Examples_using_embedded_package"></a></h3>
<div class="source">
<pre>
$cd falcon-${project.version}
$bin/falcon-start
</pre></div>
<p>Make sure the Hadoop and Oozie endpoints in examples/entity/filesystem/standalone-cluster.xml match your setup. The cluster locations (the staging and working dirs) MUST be created before submitting a cluster entity to Falcon.</p>
<p>* <b>staging</b> must have 777 permissions, and its parent dirs must have execute permissions.</p>
<p>* <b>working</b> must have 755 permissions, and its parent dirs must have execute permissions.</p>
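<p>The staging and working locations can be prepared with the standard hadoop fs commands. The function below is a sketch under stated assumptions: prepare_cluster_dirs and the HADOOP variable are hypothetical names, the two paths must be taken from the locations in your standalone-cluster.xml, and execute permissions on the parent dirs are not handled.</p>
<div class="source">
<pre>
#!/bin/sh
# Sketch: create the staging and working locations with the permissions
# required above. HADOOP can be overridden, for example HADOOP=echo for a
# dry run that only prints the arguments that would be passed to hadoop.
HADOOP="${HADOOP:-hadoop}"

# $1: staging dir, $2: working dir (both as listed in the cluster entity)
prepare_cluster_dirs() {
    staging="$1"
    working="$2"
    "$HADOOP" fs -mkdir -p "$staging" "$working"
    "$HADOOP" fs -chmod 777 "$staging"
    "$HADOOP" fs -chmod 755 "$working"
}
</pre></div>
<p>Run it before submitting the cluster entity, passing the staging path first and the working path second.</p>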
<div class="source">
<pre>
$bin/falcon entity -submit -type cluster -file examples/entity/filesystem/standalone-cluster.xml
</pre></div>
<p>Submit input and output feeds:</p>
<div class="source">
<pre>
$bin/falcon entity -submit -type feed -file examples/entity/filesystem/in-feed.xml
$bin/falcon entity -submit -type feed -file examples/entity/filesystem/out-feed.xml
</pre></div>
<p>Set up the workflow for the process:</p>
<div class="source">
<pre>
$hadoop fs -put examples/app /
</pre></div>
<p>Submit and schedule the process:</p>
<div class="source">
<pre>
$bin/falcon entity -submitAndSchedule -type process -file examples/entity/filesystem/oozie-mr-process.xml
$bin/falcon entity -submitAndSchedule -type process -file examples/entity/filesystem/pig-process.xml
$bin/falcon entity -submitAndSchedule -type process -file examples/entity/spark/spark-process.xml
</pre></div>
<p>Generate input data:</p>
<div class="source">
<pre>
$examples/data/generate.sh &lt;&lt;hdfs endpoint&gt;&gt;
</pre></div>
<p>Get status of instances:</p>
<div class="source">
<pre>
$bin/falcon instance -status -type process -name oozie-mr-process -start 2013-11-15T00:05Z -end 2013-11-15T01:00Z
</pre></div>
<p>HCat based example entities are in examples/entity/hcat. Spark based example entities are in examples/entity/spark.</p></div>
<div class="section">
<h4>Stopping Falcon Server<a name="Stopping_Falcon_Server"></a></h4>
<div class="source">
<pre>
$cd falcon-${project.version}
$bin/falcon-stop
</pre></div></div>
</div>
</div>
<hr/>
<footer>
<div class="container">
<div class="row span12">Copyright &copy; 2013-2018
<a href="http://www.apache.org">Apache Software Foundation</a>.
All Rights Reserved.
</div>
<p id="poweredBy" class="pull-right">
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
</a>
</p>
</div>
</footer>
</body>
</html>