<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2015-12-30
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20151230" />
<meta http-equiv="Content-Language" content="en" />
<title>Apache Atlas &#x2013; Architecture</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
</head>
<body class="topBarEnabled">
<div id="topbar" class="navbar navbar-fixed-top ">
<div class="navbar-inner">
<div class="container" style="width: 68%;"><div class="nav-collapse">
<ul class="nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Atlas <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="index.html" title="About">About</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="Wiki">Wiki</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="News">News</a>
</li>
<li> <a href="https://git-wip-us.apache.org/repos/asf/incubator-atlas.git" title="Git">Git</a>
</li>
<li> <a href="https://issues.apache.org/jira/browse/ATLAS" title="Jira">Jira</a>
</li>
<li> <a href="https://cwiki.apache.org/confluence/display/ATLAS/PoweredBy" title="Powered by">Powered by</a>
</li>
<li> <a href="http://blogs.apache.org/atlas/" title="Blog">Blog</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Project Information <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="project-info.html" title="Summary">Summary</a>
</li>
<li> <a href="mail-lists.html" title="Mailing Lists">Mailing Lists</a>
</li>
<li> <a href="http://webchat.freenode.net?channels=apacheatlas&amp;uio=d4" title="IRC">IRC</a>
</li>
<li> <a href="team-list.html" title="Team">Team</a>
</li>
<li> <a href="issue-tracking.html" title="Issue Tracking">Issue Tracking</a>
</li>
<li> <a href="source-repository.html" title="Source Repository">Source Repository</a>
</li>
<li> <a href="license.html" title="License">License</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Releases <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.5.0-incubating/" title="0.5-incubating">0.5-incubating</a>
</li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="../index.html" title="latest">latest</a></li>
<li><a href="../0.6.0-incubating/index.html" title="0.6-incubating">0.6-incubating</a></li>
<li><a href="../0.5.0-incubating/index.html" title="0.5-incubating">0.5-incubating</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b class="caret"></b></a>
<ul class="dropdown-menu">
<li> <a href="http://www.apache.org/foundation/how-it-works.html" title="How Apache Works">How Apache Works</a>
</li>
<li> <a href="http://www.apache.org/foundation/" title="Foundation">Foundation</a>
</li>
<li> <a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsoring Apache">Sponsoring Apache</a>
</li>
<li> <a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a>
</li>
</ul>
</li>
</ul>
<form id="search-form" action="http://www.google.com/search" method="get" class="navbar-search pull-right" >
<input value="http://atlas.incubator.apache.org" name="sitesearch" type="hidden"/>
<input class="search-query" name="q" id="query" type="text" />
</form>
<script type="text/javascript" src="http://www.google.com/coop/cse/brand?form=search-form"></script>
</div>
</div>
</div>
</div>
<div class="container">
<div id="banner">
<div class="pull-left">
<a href=".." id="bannerLeft">
<img src="images/atlas-logo.png" alt="Apache Atlas" width="200px" height="45px"/>
</a>
</div>
<div class="pull-right"> <a href="http://incubator.apache.org" id="bannerRight">
<img src="images/apache-incubator-logo.png" alt="Apache Incubator"/>
</a>
</div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class="">
<a href="http://www.apache.org" class="externalLink" title="Apache">
Apache</a>
</li>
<li class="divider ">/</li>
<li class="">
<a href="index.html" title="Atlas">
Atlas</a>
</li>
<li class="divider ">/</li>
<li class="">Architecture</li>
<li id="publishDate" class="pull-right">Last Published: 2015-12-30</li> <li class="divider pull-right">|</li>
<li id="projectVersion" class="pull-right">Version: 0.6-incubating</li>
</ul>
</div>
<div id="bodyColumn" >
<div class="section">
<h2><a name="Architecture"></a>Architecture</h2></div>
<div class="section">
<h3><a name="Introduction"></a>Introduction</h3></div>
<div class="section">
<h3><a name="Atlas_High_Level_Architecture_-_Overview"></a>Atlas High Level Architecture - Overview</h3>
<p><img src="images/twiki/architecture.png" alt="" /></p>
<p>Architecturally, Atlas has the following components:</p>
<p></p>
<ul>
<li><b>A Web service</b>: This exposes RESTful APIs and a Web user interface to create, update and query metadata.</li>
<li><b>Metadata store</b>: Metadata is modeled as a graph and stored in the Titan graph database. Titan supports several backing stores for persisting the graph, including an embedded Berkeley DB, Apache HBase and Apache Cassandra. The choice of backing store determines the level of service availability.</li>
<li><b>Index store</b>: To support full-text search, Atlas also indexes the metadata, again through Titan. The index is held in a search backend such as Elasticsearch or Apache Solr.</li>
<li><b>Bridges / Hooks</b>: Libraries called &#x2018;hooks&#x2019; are enabled in systems such as Apache Hive, Apache Falcon and Apache Sqoop. They capture metadata events in those systems and propagate them to Atlas. The Atlas server consumes these events and updates its stores.</li>
<li><b>Metadata notification events</b>: Any update to metadata in Atlas, whether made through the hooks or the API, is propagated from Atlas to downstream systems as an event. Systems such as Apache Ranger consume these events and allow administrators to act on them, for example to configure access control policies.</li>
<li><b>Notification Server</b>: Atlas uses Apache Kafka as the notification server that connects the hooks, Atlas and the downstream consumers of metadata notification events. The hooks and Atlas write events to different Kafka topics. Kafka enables loosely coupled integration between these disparate systems.</li></ul></div>
<div class="section">
<h3><a name="Bridges"></a>Bridges</h3>
<p>External components such as Hive, Sqoop, Storm and Falcon should model their taxonomy using the Atlas type system and register the types with Atlas. For every entity created in the external component, the corresponding entity should be registered in Atlas as well. This is typically done in a hook that runs in the external component and is called for every entity operation. The hook generally processes the entity asynchronously using a thread pool to avoid adding latency to the main operation. It can then build the entity and register it using the Atlas REST APIs. However, any API failure, for example because of a network issue, can leave the entity unregistered in Atlas and the metadata inconsistent.</p>
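<p>The asynchronous hook pattern, and the consistency gap it leaves when a REST call fails, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: <tt>EntityHook</tt>, <tt>on_entity_operation</tt> and <tt>fake_register</tt> are illustrative names, not Atlas classes, and <tt>fake_register</tt> merely stands in for a call to the Atlas REST API.</p>
<div class="source"><pre>
```python
from concurrent.futures import ThreadPoolExecutor, Future


class EntityHook:
    """Hypothetical hook that registers entities off the caller's thread."""

    def __init__(self, register, workers=2):
        # `register` stands in for a call to the Atlas REST API (assumption).
        self._register = register
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def on_entity_operation(self, entity) -> Future:
        # Return immediately; the registration runs on a pool thread.
        return self._pool.submit(self._register, entity)

    def close(self):
        # Wait for all queued registrations to finish.
        self._pool.shutdown(wait=True)


registered = []
failed = []


def fake_register(entity):
    # Simulate a network failure for one entity to show the consistency gap.
    if entity["name"] == "orders":
        raise ConnectionError("Atlas unreachable")
    registered.append(entity["name"])


hook = EntityHook(fake_register)
futures = {name: hook.on_entity_operation({"type": "hive_table", "name": name})
           for name in ("customers", "orders")}
hook.close()

# A failed call surfaces only on the future; the main operation never sees it.
for name, future in futures.items():
    if future.exception() is not None:
        failed.append(name)
```
</pre></div>
<p>In this sketch the entity named <tt>orders</tt> exists in the external component but never reaches the metadata store, which is the kind of inconsistency the paragraph above describes.</p>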
<p>Atlas also exposes a notification interface that hooks can use for reliable entity registration. The hook sends a notification message containing the list of entities to be registered. The Atlas service contains a hook consumer that listens for these messages and registers the entities.</p>
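<p>The message-based registration flow can be sketched as follows. This is only an illustration under stated assumptions: an in-memory <tt>queue.Queue</tt> stands in for a durable message topic, the JSON message layout is invented for the example, and <tt>HookProducer</tt> and <tt>HookConsumer</tt> are hypothetical names rather than Atlas classes.</p>
<div class="source"><pre>
```python
import json
import queue


class HookProducer:
    """Hypothetical hook side: publishes one message per batch of entities."""

    def __init__(self, topic):
        self._topic = topic

    def send(self, entities):
        # The message carries the full list of entities to register.
        self._topic.put(json.dumps({"entities": entities}))


class HookConsumer:
    """Hypothetical server side: drains messages and registers entities."""

    def __init__(self, topic):
        self._topic = topic
        self.store = {}

    def poll(self):
        # Process every message currently waiting on the topic.
        processed = 0
        while not self._topic.empty():
            message = json.loads(self._topic.get())
            for entity in message["entities"]:
                self.store[entity["name"]] = entity
            processed += 1
        return processed


topic = queue.Queue()  # stand-in for a durable message topic
producer = HookProducer(topic)
consumer = HookConsumer(topic)

# The hook can publish even while the consumer is not running.
producer.send([{"type": "hive_db", "name": "sales"},
               {"type": "hive_table", "name": "orders"}])

# The entities are registered once the consumer polls the topic.
count = consumer.poll()
```
</pre></div>
<p>Because the hook only has to hand the message to the topic, a temporarily unavailable server delays registration instead of losing it, provided the real topic is durable.</p>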
<p>Available bridges are:</p>
<ul>
<li><a href="./Bridge-Hive.html">Hive Bridge</a></li></ul></div>
<div class="section">
<h3><a name="Notification"></a>Notification</h3>
<p>Notification is used for reliable entity registration from hooks and for entity and type change notifications. By default, Atlas provides Kafka integration, but it is possible to provide other implementations as well. The Atlas service starts an embedded Kafka server by default.</p>
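<p>One way to picture a pluggable notification layer is an abstract transport with interchangeable implementations. The sketch below is an assumption made for illustration, not the actual Atlas interface: the <tt>Notification</tt> class, its <tt>send</tt> and <tt>receive</tt> methods and the <tt>entities</tt> topic name are all hypothetical.</p>
<div class="source"><pre>
```python
from abc import ABC, abstractmethod


class Notification(ABC):
    """Hypothetical transport interface; not the actual Atlas API."""

    @abstractmethod
    def send(self, topic, messages): ...

    @abstractmethod
    def receive(self, topic): ...


class InMemoryNotification(Notification):
    """Alternative implementation that keeps messages in process memory."""

    def __init__(self):
        self._topics = {}

    def send(self, topic, messages):
        self._topics.setdefault(topic, []).extend(messages)

    def receive(self, topic):
        # Hand back all pending messages and clear the topic.
        return self._topics.pop(topic, [])


def publish_entity_change(notification, entity_name):
    # Callers depend only on the interface, so the transport is swappable.
    notification.send("entities", [{"operation": "update", "entity": entity_name}])


transport = InMemoryNotification()
publish_entity_change(transport, "orders")
pending = transport.receive("entities")
```
</pre></div>
<p>A Kafka-backed class implementing the same two methods could replace <tt>InMemoryNotification</tt> without changing <tt>publish_entity_change</tt>.</p>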
<p>Atlas also provides a <a href="./NotificationHookConsumer.html">NotificationHookConsumer</a> that runs in the Atlas service, listens for messages from hooks and registers the entities in Atlas.</p>
<p><img src="images/twiki/notification.png" alt="" /></p></div>
</div>
</div>
<hr/>
<footer>
<div class="container">
<div class="row span12">Copyright &copy; 2015
<a href="http://www.apache.org">Apache Software Foundation</a>.
All Rights Reserved.
</div>
<p id="poweredBy" class="pull-right">
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
</a>
</p>
</div>
</footer>
</body>
</html>