<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2018-03-12
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20180312" />
<meta http-equiv="Content-Language" content="en" />
<title>Falcon - Falcon Recipes</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left">
<a href="../index.html" id="bannerLeft">
<img src="images/falcon-logo.png" alt="Apache Falcon" width="200px" height="45px"/>
</a>
</div>
<div class="pull-right"> <a href="http://www.apache.org" id="bannerRight">
<img src="images/apache-feather-tm.gif" alt="Falcon" height="45px"/>
</a>
</div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li id="publishDate" class="pull-right">Last Published: 2018-03-12</li>
</ul>
</div>
<div class="row-fluid">
<div id="bodyColumn" class="span9" >
<div class="section">
<h2>Falcon Recipes<a name="Falcon_Recipes"></a></h2></div>
<div class="section">
<h3>Overview<a name="Overview"></a></h3>
<p>A Falcon recipe is a static process template with a parameterized workflow that implements a specific use case. Recipes are defined in user space and do not support update or lifecycle management.</p>
<p>For example:</p>
<p></p>
<ul>
<li>Replicating directories from one HDFS cluster to another (not timed partitions)</li>
<li>Replicating Hive metadata (databases, tables, views, etc.)</li>
<li>Replicating between HDFS and Hive, in either direction</li>
<li>Data masking, etc.</li></ul></div>
<div class="section">
<h3>Proposal<a name="Proposal"></a></h3>
<p>Falcon provides a Process abstraction that encapsulates the configuration for a user workflow with scheduling controls. Every recipe can be modeled as a Process within Falcon that executes the user workflow periodically. The process and its associated workflow are parameterized. The user provides a properties file of name-value pairs, which Falcon substitutes into the template before scheduling the process. In this way Falcon translates each recipe into a process entity by replacing the parameters in the workflow definition.</p></div>
<div class="section">
<h3>Falcon CLI recipe support<a name="Falcon_CLI_recipe_support"></a></h3>
<p>The Falcon CLI supports recipes. Recipe command usage is described in the <a href="./falconcli/FalconCLI.html">Falcon CLI</a> documentation.</p>
<p>The CLI accepts the recipe option with a recipe name and an optional tool, and does the following:</p>
<ul>
<li>Validates the options. The name option is mandatory. The tool option is optional and is needed only when the user wants to override the base recipe tool</li>
<li>Looks for the &lt;name&gt;-workflow.xml, &lt;name&gt;-template.xml and &lt;name&gt;.properties files in the path specified by falcon.recipe.path in client.properties. If the files cannot be found, the Falcon CLI fails</li>
<li>Invokes a tool to substitute the properties into the templated process for the recipe. If the tool option is not passed, the base tool is invoked. The tool is responsible for generating the process entity at the path specified by the Falcon CLI</li>
<li>Validates the generated entity</li>
<li>Submits and schedules the entity</li>
<li>Stores the generated process entity files in the tmp directory</li></ul></div>
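<div class="section">
<p>The file lookup step described above can be sketched as follows. This is only an illustration of the naming convention, not the Falcon CLI's actual implementation: the function names, the recipe name <i>my-recipe</i> and the temporary directory standing in for falcon.recipe.path are all hypothetical.</p>

```python
import os
import tempfile


def recipe_files(recipe_path, name):
    """Return the three file paths the CLI is described as looking for."""
    return [
        os.path.join(recipe_path, name + "-workflow.xml"),
        os.path.join(recipe_path, name + "-template.xml"),
        os.path.join(recipe_path, name + ".properties"),
    ]


def missing_recipe_files(recipe_path, name):
    """List the expected files that are absent from recipe_path."""
    return [p for p in recipe_files(recipe_path, name) if not os.path.isfile(p)]


# Demo: a throwaway directory stands in for falcon.recipe.path (assumption).
with tempfile.TemporaryDirectory() as recipe_dir:
    # Create only two of the three files so that one is reported missing.
    for suffix in ("-workflow.xml", "-template.xml"):
        open(os.path.join(recipe_dir, "my-recipe" + suffix), "w").close()
    missing = [
        os.path.basename(p)
        for p in missing_recipe_files(recipe_dir, "my-recipe")
    ]
    print(missing)
```

<p>A non-empty result corresponds to the case in which the CLI is described as failing.</p></div>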
<div class="section">
<h3>Base Recipe tool<a name="Base_Recipe_tool"></a></h3>
<p>Falcon provides a base tool that recipes can override. The base recipe tool does the following:</p>
<ul>
<li>Expects three arguments and validates them: the recipe template file path, the recipe properties file path, and the path where the process entity to be submitted should be generated</li>
<li>Validates that the artifacts (the workflow and/or lib files specified in the recipe template) exist on the local filesystem or on HDFS at the specified path, and returns an error otherwise</li>
<li>Copies the artifacts to HDFS if they exist on the local filesystem
<ul>
<li>If the workflow is on the local filesystem, falcon.recipe.workflow.path in the recipe properties file is mandatory for it to be copied to HDFS. If the templated process requires custom libs, the falcon.recipe.workflow.lib.path property is mandatory for them to be copied from the local filesystem to HDFS. The recipe tool copies the local artifacts only if these properties are set in the properties file</li></ul></li>
<li>Looks for the pattern ##[A-Za-z0-9_.]*## in the templated process and substitutes each match with the corresponding property value. The process entity generated after the substitution is written to the empty file passed by the Falcon CLI</li></ul></div>
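<div class="section">
<p>The substitution step can be illustrated with a minimal sketch. It uses the placeholder pattern quoted above and the falcon.recipe. key prefix described in this document, but it is not the base tool's code. The function name is hypothetical, the sample input is an attribute fragment rather than a full template, and raising an error for a placeholder with no matching property is an assumption, since the tool's behaviour in that case is not described here.</p>

```python
import re

# Placeholder pattern as quoted in the documentation.
PLACEHOLDER = re.compile(r"##([A-Za-z0-9_.]*)##")


def substitute(template_text, properties):
    """Replace each ##key## with the value of falcon.recipe.key."""

    def lookup(match):
        key = "falcon.recipe." + match.group(1)
        if key not in properties:
            # Assumption: fail loudly on an unknown placeholder.
            raise KeyError(key)
        return properties[key]

    return PLACEHOLDER.sub(lookup, template_text)


# Fragment of a templated process and a matching property (sample values).
template = 'workflow name="##workflow.name##"'
props = {"falcon.recipe.workflow.name": "hdfs-dr-workflow"}
result = substitute(template, props)
print(result)
```
</div>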
<div class="section">
<h3>Recipe template file format<a name="Recipe_template_file_format"></a></h3>
<p></p>
<ul>
<li>Any templatized string should be in the format ##[A-Za-z0-9_.]*##.</li>
<li>Each templatized string needs a corresponding entry in the recipe properties file, of the form &quot;falcon.recipe.&lt;templatized-string&gt; = &lt;value to be substituted&gt;&quot;</li></ul>
<div class="source">
<pre>
Example: If the entry in the recipe template is &lt;workflow name=&quot;##workflow.name##&quot;&gt;,
the recipe properties file should contain the corresponding entry falcon.recipe.workflow.name=hdfs-dr-workflow
</pre></div></div>
<div class="section">
<h3>Recipe properties file format<a name="Recipe_properties_file_format"></a></h3>
<p></p>
<ul>
<li>A regular key-value properties file</li>
<li>Each property key should be prefixed with &quot;falcon.recipe.&quot;</li></ul>
<div class="source">
<pre>
Example: falcon.recipe.workflow.name=hdfs-dr-workflow
The recipe template will have &lt;workflow name=&quot;##workflow.name##&quot;&gt;. The recipe tool looks for the pattern ##workflow.name##
and replaces it with the property value &quot;hdfs-dr-workflow&quot;. The substituted template will have &lt;workflow name=&quot;hdfs-dr-workflow&quot;&gt;
</pre></div></div>
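<div class="section">
<p>To make the naming rule concrete, the sketch below reads key-value lines and maps each falcon.recipe. key to the placeholder name it would fill. It is a simplified illustration, not Falcon's parser: it handles only plain key=value lines and # comments (not the full Java properties syntax), the function name is hypothetical, and silently ignoring keys without the prefix is an assumption.</p>

```python
PREFIX = "falcon.recipe."


def placeholder_values(text):
    """Map placeholder names to values for keys that carry the recipe prefix."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        key = key.strip()
        if key.startswith(PREFIX):
            # falcon.recipe.workflow.name fills ##workflow.name##
            values[key[len(PREFIX):]] = value.strip()
    return values


sample = """
# sample recipe properties
falcon.recipe.workflow.name = hdfs-dr-workflow
unrelated.key = ignored
"""
parsed = placeholder_values(sample)
print(parsed)
```
</div>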
<div class="section">
<h3>Metrics<a name="Metrics"></a></h3>
<p>The HDFS DR and Hive DR recipes capture replication metrics such as TIMETAKEN, BYTESCOPIED and COPY (number of files copied) for an instance and populate them to the GraphDB.</p></div>
<div class="section">
<h3>Managing the scheduled recipe process<a name="Managing_the_scheduled_recipe_process"></a></h3>
<p></p>
<ul>
<li>A scheduled recipe process is similar to a regular process and is managed with the same commands
<ul>
<li>List : falcon entity -type process -name &lt;recipe-process-name&gt; -list</li>
<li>Status : falcon entity -type process -name &lt;recipe-process-name&gt; -status</li>
<li>Delete : falcon entity -type process -name &lt;recipe-process-name&gt; -delete</li></ul></li></ul></div>
<div class="section">
<h3>Sample recipes<a name="Sample_recipes"></a></h3>
<p></p>
<ul>
<li>Sample recipes are published in addons/recipes</li></ul></div>
<div class="section">
<h3>Types of recipes<a name="Types_of_recipes"></a></h3>
<p></p>
<ul>
<li><a href="./HDFSDR.html">HDFS Recipe</a></li>
<li><a href="./HiveDR.html">HiveDR Recipe</a></li></ul></div>
<div class="section">
<h3>Packaging<a name="Packaging"></a></h3>
<p></p>
<ul>
<li>There is no packaging for recipes at this time; it will be added soon.</li></ul></div>
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row span12">Copyright &copy; 2013-2018
<a href="http://www.apache.org">Apache Software Foundation</a>.
All Rights Reserved.
</div>
</div>
</footer>
</body>
</html>