<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2017-03-16
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20170316" />
<meta http-equiv="Content-Language" content="en" />
<title>Apache Atlas &#x2013; Falcon Atlas Bridge</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
</head>
<body class="topBarEnabled">
<div id="topbar" class="navbar navbar-fixed-top ">
<div class="navbar-inner">
<div class="container" style="width: 68%;"><div class="nav-collapse">
<ul class="nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Atlas <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="index.html" title="About">About</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="Wiki">Wiki</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="News">News</a>
</li>
<li> <a href="https://git-wip-us.apache.org/repos/asf/incubator-atlas.git" title="Git">Git</a>
</li>
<li> <a href="https://issues.apache.org/jira/browse/ATLAS" title="Jira">Jira</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS/PoweredBy" title="Powered by">Powered by</a>
</li>
<li> <a href="http://blogs.apache.org/atlas/" title="Blog">Blog</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Project Information <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="project-info.html" title="Summary">Summary</a>
</li>
<li> <a href="mail-lists.html" title="Mailing Lists">Mailing Lists</a>
</li>
<li> <a href="http://webchat.freenode.net?channels=apacheatlas&uio=d4" title="IRC">IRC</a>
</li>
<li> <a href="team-list.html" title="Team">Team</a>
</li>
<li> <a href="issue-tracking.html" title="Issue Tracking">Issue Tracking</a>
</li>
<li> <a href="source-repository.html" title="Source Repository">Source Repository</a>
</li>
<li> <a href="license.html" title="License">License</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Releases <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.8.0-incubating/" title="0.8-incubating">0.8-incubating</a>
</li>
<li> <a href="http://archive.apache.org/dist/incubator/atlas/0.7.1-incubating/" title="0.7.1-incubating">0.7.1-incubating</a>
</li>
<li> <a href="http://archive.apache.org/dist/incubator/atlas/0.7.0-incubating/" title="0.7-incubating">0.7-incubating</a>
</li>
<li> <a href="http://archive.apache.org/dist/incubator/atlas/0.6.0-incubating/" title="0.6-incubating">0.6-incubating</a>
</li>
<li> <a href="http://archive.apache.org/dist/incubator/atlas/0.5.0-incubating/" title="0.5-incubating">0.5-incubating</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="../index.html" title="latest">latest</a></li>
<li><a href="../0.8.0-incubating/index.html" title="0.8-incubating">0.8-incubating</a></li>
<li><a href="../0.7.1-incubating/index.html" title="0.7.1-incubating">0.7.1-incubating</a></li>
<li><a href="../0.7.0-incubating/index.html" title="0.7-incubating">0.7-incubating</a></li>
<li><a href="../0.6.0-incubating/index.html" title="0.6-incubating">0.6-incubating</a></li>
<li><a href="../0.5.0-incubating/index.html" title="0.5-incubating">0.5-incubating</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="http://www.apache.org/foundation/how-it-works.html" title="How Apache Works">How Apache Works</a>
</li>
<li> <a href="http://www.apache.org/foundation/" title="Foundation">Foundation</a>
</li>
<li> <a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsoring Apache">Sponsoring Apache</a>
</li>
<li> <a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a>
</li>
</ul>
</li>
</ul>
<form id="search-form" action="http://www.google.com/search" method="get" class="navbar-search pull-right" >
<input value="http://atlas.incubator.apache.org" name="sitesearch" type="hidden"/>
<input class="search-query" name="q" id="query" type="text" />
</form>
<script type="text/javascript" src="http://www.google.com/coop/cse/brand?form=search-form"></script>
</div>
</div>
</div>
</div>
<div class="container">
<div id="banner">
<div class="pull-left">
<a href=".." id="bannerLeft">
<img src="images/atlas-logo.png" alt="Apache Atlas" width="200px" height="45px"/>
</a>
</div>
<div class="pull-right"> <a href="http://incubator.apache.org" id="bannerRight">
<img src="images/apache-incubator-logo.png" alt="Apache Incubator"/>
</a>
</div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class="">
<a href="http://www.apache.org" class="externalLink" title="Apache">
Apache</a>
</li>
<li class="divider ">/</li>
<li class="">
<a href="index.html" title="Atlas">
Atlas</a>
</li>
<li class="divider ">/</li>
<li class="">Falcon Atlas Bridge</li>
<li id="publishDate" class="pull-right">Last Published: 2017-03-16</li> <li class="divider pull-right">|</li>
<li id="projectVersion" class="pull-right">Version: 0.8-incubating</li>
</ul>
</div>
<div id="bodyColumn" >
<div class="section">
<h2><a name="Falcon_Atlas_Bridge"></a>Falcon Atlas Bridge</h2></div>
<div class="section">
<h3><a name="Falcon_Model"></a>Falcon Model</h3>
<p>The default Falcon model is defined in org.apache.atlas.falcon.model.FalconDataModelGenerator. It defines the following types:</p>
<div class="source">
<pre>
falcon_cluster(ClassType) - super types [Infrastructure] - attributes [timestamp, colo, owner, tags]
falcon_feed(ClassType) - super types [DataSet] - attributes [timestamp, stored-in, owner, groups, tags]
falcon_feed_creation(ClassType) - super types [Process] - attributes [timestamp, stored-in, owner]
falcon_feed_replication(ClassType) - super types [Process] - attributes [timestamp, owner]
falcon_process(ClassType) - super types [Process] - attributes [timestamp, runs-on, owner, tags, pipelines, workflow-properties]
</pre></div>
<p>One falcon_process entity is created for each cluster on which the Falcon process is defined.</p>
<p>Entities are created and de-duplicated using the unique qualifiedName attribute. This attribute also provides a namespace and can be used for querying and lineage. The unique attribute values are:</p>
<ul>
<li>falcon_process - &lt;process name&gt;@&lt;cluster name&gt;</li>
<li>falcon_cluster - &lt;cluster name&gt;</li>
<li>falcon_feed - &lt;feed name&gt;@&lt;cluster name&gt;</li>
<li>falcon_feed_creation - &lt;feed name&gt;</li>
<li>falcon_feed_replication - &lt;feed name&gt;</li></ul></div>
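<div class="section">
<p>The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the documented qualifiedName patterns, not the bridge's own code (which is written in Java); the function names <tt>qualified_name</tt> and <tt>process_qualified_names</tt> and the sample entity names are hypothetical.</p>
<div class="source">
<pre>
```python
# Illustrative sketch only: mirrors the qualifiedName patterns listed above.
# Types whose unique name includes the cluster name.
CLUSTER_SCOPED_TYPES = {"falcon_process", "falcon_feed"}
# Types whose unique name is the entity name alone.
NAME_ONLY_TYPES = {"falcon_cluster", "falcon_feed_creation", "falcon_feed_replication"}


def qualified_name(type_name, entity_name, cluster_name=None):
    """Return the unique qualifiedName for a Falcon entity of the given type."""
    if type_name in CLUSTER_SCOPED_TYPES:
        if not cluster_name:
            raise ValueError(f"{type_name} requires a cluster name")
        return f"{entity_name}@{cluster_name}"
    if type_name in NAME_ONLY_TYPES:
        return entity_name
    raise ValueError(f"unknown Falcon type: {type_name}")


def process_qualified_names(process_name, cluster_names):
    """One falcon_process entity per cluster the process is defined for."""
    return [qualified_name("falcon_process", process_name, c) for c in cluster_names]
```
</pre></div>
<p>For example, a process named ingest defined on the clusters primary and backup would yield two falcon_process entities, one per cluster, each with its own qualifiedName.</p></div>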
<div class="section">
<h3><a name="Falcon_Hook"></a>Falcon Hook</h3>
<p>Falcon supports listeners on Falcon entity submission. The hook uses this to add entities to Atlas using the model defined in org.apache.atlas.falcon.model.FalconDataModelGenerator. To avoid blocking command execution, the hook submits each request to a thread pool executor. A pool thread then sends the entities as messages to the notification server; the Atlas server reads these messages and registers the entities.</p>
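<p>The hand-off described above can be illustrated with a small Python sketch. It is not the hook's implementation: the class <tt>HookSketch</tt>, the function <tt>send_with_retries</tt> and the <tt>send</tt> callable (standing in for the notification client) are hypothetical, and treating the retry count as a maximum number of attempts is an assumption. Python's executor only models a maximum thread count, so the other pool settings are left out.</p>
<div class="source">
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def send_with_retries(send, message, max_attempts=3):
    """Try send(message) up to max_attempts times; return True on success."""
    for _ in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:
            continue  # a real hook would log the failure before retrying
    return False


class HookSketch:
    """Hands entity messages to a worker pool so the caller is not blocked."""

    def __init__(self, send, max_threads=5, synchronous=False):
        self._send = send  # hypothetical stand-in for the notification client
        self._synchronous = synchronous
        self._pool = ThreadPoolExecutor(max_workers=max_threads)

    def on_entity_submitted(self, message):
        if self._synchronous:
            # Synchronous mode: deliver on the calling thread.
            return send_with_retries(self._send, message)
        # Asynchronous mode: returns a Future at once; delivery runs on a pool thread.
        return self._pool.submit(send_with_retries, self._send, message)

    def close(self):
        self._pool.shutdown(wait=True)
```
</pre></div>
<p>In asynchronous mode the caller gets a Future back immediately and can carry on with the command; the message is delivered, with retries, on a pool thread.</p>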
<p>To set up the Falcon hook:</p>
<ul>
<li>Add 'org.apache.atlas.falcon.service.AtlasService' to application.services in &lt;falcon-conf&gt;/startup.properties.</li>
<li>Link the Falcon hook jars into the Falcon classpath: 'ln -s &lt;atlas-home&gt;/hook/falcon/* &lt;falcon-home&gt;/server/webapp/falcon/WEB-INF/lib/'</li>
<li>In &lt;falcon-conf&gt;/falcon-env.sh, set an environment variable as follows:</li></ul>
<div class="source">
<pre>
export FALCON_SERVER_OPTS=&quot;&lt;atlas-home&gt;/hook/falcon/*:$FALCON_SERVER_OPTS&quot;
</pre></div>
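<p>Registering the service in startup.properties amounts to adding one class name to a comma-separated list. A minimal Python sketch of that edit follows, assuming the value is a comma-separated list that may use backslash line continuations. The helper <tt>with_atlas_service</tt> is hypothetical, and appending at the end of the list is an assumption, since this page does not say whether position matters.</p>
<div class="source">
<pre>
```python
ATLAS_SERVICE = "org.apache.atlas.falcon.service.AtlasService"


def with_atlas_service(services_value):
    """Return a comma-separated services list that includes AtlasService exactly once."""
    # Join backslash-continued lines, then split on commas and trim whitespace.
    joined = services_value.replace("\\\n", "")
    services = [s.strip() for s in joined.split(",") if s.strip()]
    if ATLAS_SERVICE not in services:
        services.append(ATLAS_SERVICE)  # assumption: position in the list is not significant
    return ",".join(services)
```
</pre></div>
<p>The function is idempotent, so running it on a value that already lists the Atlas service leaves the value unchanged.</p>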
<p>The following properties in &lt;atlas-conf&gt;/atlas-application.properties control the thread pool and notification details:</p>
<ul>
<li>atlas.hook.falcon.synchronous - boolean; set to true to run the hook synchronously. Default: false</li>
<li>atlas.hook.falcon.numRetries - number of retries on notification failure. Default: 3</li>
<li>atlas.hook.falcon.minThreads - core number of threads. Default: 5</li>
<li>atlas.hook.falcon.maxThreads - maximum number of threads. Default: 5</li>
<li>atlas.hook.falcon.keepAliveTime - keep-alive time in milliseconds. Default: 10</li>
<li>atlas.hook.falcon.queueSize - queue size for the thread pool. Default: 10000</li></ul>
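<p>To make the defaults above concrete, here is a small Python sketch that reads these properties from a properties-style text and falls back to the documented defaults. It is an illustration only: the function <tt>load_hook_settings</tt> is hypothetical, and the parsing is deliberately simplified (plain key=value lines and # comments), so it does not cover the full Java properties syntax.</p>
<div class="source">
<pre>
```python
# Documented defaults for the Falcon hook properties.
DEFAULTS = {
    "atlas.hook.falcon.synchronous": False,
    "atlas.hook.falcon.numRetries": 3,
    "atlas.hook.falcon.minThreads": 5,
    "atlas.hook.falcon.maxThreads": 5,
    "atlas.hook.falcon.keepAliveTime": 10,  # milliseconds
    "atlas.hook.falcon.queueSize": 10000,
}


def load_hook_settings(properties_text):
    """Parse key=value lines and fill in the documented defaults."""
    settings = dict(DEFAULTS)
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and lines without a key=value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key not in DEFAULTS:
            continue  # not a Falcon hook property
        if isinstance(DEFAULTS[key], bool):
            settings[key] = value.lower() == "true"
        else:
            settings[key] = int(value)
    return settings
```
</pre></div>
<p>The four pool settings line up with the core size, maximum size, keep-alive time and work queue capacity that Java's java.util.concurrent.ThreadPoolExecutor accepts, although the hook's actual wiring is not shown on this page.</p>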
<p>Refer to <a href="./Configuration.html">Configuration</a> for notification-related configuration.</p></div>
<div class="section">
<h3><a name="Limitations"></a>Limitations</h3>
<ul>
<li>The cluster name used in the Falcon cluster entity should be uniform across components such as Hive, Falcon and Sqoop. If used with Ambari, the Ambari cluster name should be used for the cluster entity.</li></ul></div>
</div>
</div>
<hr/>
<footer>
<div class="container">
<div class="row span12">Copyright &copy; 2015-2017
<a href="http://www.apache.org">Apache Software Foundation</a>.
All Rights Reserved.
</div>
<p id="poweredBy" class="pull-right">
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
</a>
</p>
</div>
</footer>
</body>
</html>