| <!DOCTYPE html> |
| <!-- |
| | Generated by Apache Maven Doxia at 2015-12-30 |
| | Rendered using Apache Maven Fluido Skin 1.3.0 |
| --> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head> |
| <meta charset="UTF-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <meta name="Date-Revision-yyyymmdd" content="20151230" /> |
| <meta http-equiv="Content-Language" content="en" /> |
| <title>Apache Atlas – Hive Atlas Bridge</title> |
| <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" /> |
| <link rel="stylesheet" href="./css/site.css" /> |
| <link rel="stylesheet" href="./css/print.css" media="print" /> |
| |
| |
| <script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script> |
| |
| |
| |
| <script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script> |
| |
| </head> |
| <body class="topBarEnabled"> |
| |
| |
| |
| |
| |
| <div id="topbar" class="navbar navbar-fixed-top "> |
| <div class="navbar-inner"> |
| <div class="container" style="width: 68%;"><div class="nav-collapse"> |
| |
| |
| <ul class="nav"> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Atlas <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| |
| <li> <a href="index.html" title="About">About</a> |
| </li> |
| |
| <li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="Wiki">Wiki</a> |
| </li> |
| |
| <li> <a href="https://cwiki.apache.org/confluence/display/ATLAS" title="News">News</a> |
| </li> |
| |
| <li> <a href="https://git-wip-us.apache.org/repos/asf/incubator-atlas.git" title="Git">Git</a> |
| </li> |
| |
| <li> <a href="https://issues.apache.org/jira/browse/ATLAS" title="Jira">Jira</a> |
| </li> |
| |
| <li> <a href="https://cwiki.apache.org/confluence/display/ATLAS/PoweredBy" title="Powered by">Powered by</a> |
| </li> |
| |
| <li> <a href="http://blogs.apache.org/atlas/" title="Blog">Blog</a> |
| </li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Project Information <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| |
| <li> <a href="project-info.html" title="Summary">Summary</a> |
| </li> |
| |
| <li> <a href="mail-lists.html" title="Mailing Lists">Mailing Lists</a> |
| </li> |
| |
| <li> <a href="http://webchat.freenode.net?channels=apacheatlas&uio=d4" title="IRC">IRC</a> |
| </li> |
| |
| <li> <a href="team-list.html" title="Team">Team</a> |
| </li> |
| |
| <li> <a href="issue-tracking.html" title="Issue Tracking">Issue Tracking</a> |
| </li> |
| |
| <li> <a href="source-repository.html" title="Source Repository">Source Repository</a> |
| </li> |
| |
| <li> <a href="license.html" title="License">License</a> |
| </li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Releases <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| |
| <li> <a href="http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.5.0-incubating/" title="0.5-incubating">0.5-incubating</a> |
| </li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="../index.html" title="latest">latest</a></li> |
| <li><a href="../0.6.0-incubating/index.html" title="0.6-incubating">0.6-incubating</a></li> |
| <li><a href="../0.5.0-incubating/index.html" title="0.5-incubating">0.5-incubating</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">ASF <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| |
| <li> <a href="http://www.apache.org/foundation/how-it-works.html" title="How Apache Works">How Apache Works</a> |
| </li> |
| |
| <li> <a href="http://www.apache.org/foundation/" title="Foundation">Foundation</a> |
| </li> |
| |
| <li> <a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsoring Apache">Sponsoring Apache</a> |
| </li> |
| |
| <li> <a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| |
| <form id="search-form" action="http://www.google.com/search" method="get" class="navbar-search pull-right" > |
| |
| <input value="http://atlas.incubator.apache.org" name="sitesearch" type="hidden"/> |
| <input class="search-query" name="q" id="query" type="text" /> |
| </form> |
| <script type="text/javascript" src="http://www.google.com/coop/cse/brand?form=search-form"></script> |
| |
| |
| |
| |
| |
| |
| |
| </div> |
| |
| </div> |
| </div> |
| </div> |
| |
| <div class="container"> |
| <div id="banner"> |
| <div class="pull-left"> |
| <a href=".." id="bannerLeft"> |
| <img src="images/atlas-logo.png" alt="Apache Atlas" width="200px" height="45px"/> |
| </a> |
| </div> |
| <div class="pull-right"> <a href="http://incubator.apache.org" id="bannerRight"> |
| <img src="images/apache-incubator-logo.png" alt="Apache Incubator"/> |
| </a> |
| </div> |
| <div class="clear"><hr/></div> |
| </div> |
| |
| <div id="breadcrumbs"> |
| <ul class="breadcrumb"> |
| |
| |
| <li class=""> |
| <a href="http://www.apache.org" class="externalLink" title="Apache"> |
| Apache</a> |
| </li> |
| <li class="divider ">/</li> |
| <li class=""> |
| <a href="index.html" title="Atlas"> |
| Atlas</a> |
| </li> |
| <li class="divider ">/</li> |
| <li class="">Hive Atlas Bridge</li> |
| |
| |
| |
| <li id="publishDate" class="pull-right">Last Published: 2015-12-30</li> <li class="divider pull-right">|</li> |
| <li id="projectVersion" class="pull-right">Version: 0.6-incubating</li> |
| |
| </ul> |
| </div> |
| |
| |
| |
| <div id="bodyColumn" > |
| |
| <div class="section"> |
| <h2><a name="Hive_Atlas_Bridge"></a>Hive Atlas Bridge</h2></div> |
| <div class="section"> |
| <h3><a name="Hive_Model"></a>Hive Model</h3> |
| <p>The default Hive model is defined in org.apache.atlas.hive.model.HiveDataModelGenerator, which declares the following types:</p> |
| <div class="source"> |
| <pre> |
| hive_object_type(EnumType) - values [GLOBAL, DATABASE, TABLE, PARTITION, COLUMN] |
| hive_resource_type(EnumType) - values [JAR, FILE, ARCHIVE] |
| hive_principal_type(EnumType) - values [USER, ROLE, GROUP] |
| hive_db(ClassType) - super types [Referenceable] - attributes [name, clusterName, description, locationUri, parameters, ownerName, ownerType] |
| hive_order(StructType) - attributes [col, order] |
| hive_resourceuri(StructType) - attributes [resourceType, uri] |
| hive_serde(StructType) - attributes [name, serializationLib, parameters] |
| hive_type(ClassType) - super types [] - attributes [name, type1, type2, fields] |
| hive_storagedesc(ClassType) - super types [Referenceable] - attributes [cols, location, inputFormat, outputFormat, compressed, numBuckets, serdeInfo, bucketCols, sortCols, parameters, storedAsSubDirectories] |
| hive_role(ClassType) - super types [] - attributes [roleName, createTime, ownerName] |
| hive_column(ClassType) - super types [Referenceable] - attributes [name, type, comment] |
| hive_table(ClassType) - super types [DataSet] - attributes [tableName, db, owner, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, parameters, viewOriginalText, viewExpandedText, tableType, temporary] |
| hive_partition(ClassType) - super types [Referenceable] - attributes [values, table, createTime, lastAccessTime, sd, columns, parameters] |
| hive_process(ClassType) - super types [Process] - attributes [startTime, endTime, userName, operationType, queryText, queryPlan, queryId, queryGraph] |
| |
| </pre></div> |
| <p>Entities are created and de-duplicated by their unique qualified names. The qualified name also acts as a namespace and can be used for querying and lineage. Note that dbName and tableName should be in lower case. clusterName is explained below.</p> |
| <ul> |
| <li>hive_db - attribute qualifiedName - <dbName>@<clusterName></li> |
| <li>hive_table - attribute name - <dbName>.<tableName>@<clusterName></li> |
| <li>hive_column - attribute qualifiedName - <dbName>.<tableName>.<columnName>@<clusterName></li> |
| <li>hive_partition - attribute qualifiedName - <dbName>.<tableName>.<partitionValues('-' separated)>@<clusterName></li> |
| <li>hive_process - attribute name - <queryString> - trimmed query string in lower case</li></ul></div> |
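| <p>The naming patterns above can be sketched as a few small Python helpers. This is only an illustration of the string formats listed in this section: the function names are hypothetical and are not part of Atlas, and lowercasing column names (but not partition values or the cluster name) is an assumption based on the Limitations section.</p> |

```python
def _norm(name):
    """Trim and lowercase a Hive identifier (Hive names are case insensitive)."""
    return name.strip().lower()


def db_qualified_name(cluster, db):
    # Pattern: <dbName>@<clusterName>
    return f"{_norm(db)}@{cluster}"


def table_qualified_name(cluster, db, table):
    # Pattern: <dbName>.<tableName>@<clusterName>
    return f"{_norm(db)}.{_norm(table)}@{cluster}"


def column_qualified_name(cluster, db, table, column):
    # Pattern: <dbName>.<tableName>.<columnName>@<clusterName>
    return f"{_norm(db)}.{_norm(table)}.{_norm(column)}@{cluster}"


def partition_qualified_name(cluster, db, table, values):
    # Pattern: <dbName>.<tableName>.<partitionValues>@<clusterName>,
    # with the partition values joined by '-' and otherwise left unchanged.
    return f"{_norm(db)}.{_norm(table)}.{'-'.join(values)}@{cluster}"
```

| <p>For example, under these assumptions a table Orders in database Sales on the cluster named primary would map to sales.orders@primary.</p> |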
| <div class="section"> |
| <h3><a name="Importing_Hive_Metadata"></a>Importing Hive Metadata</h3> |
| <p>org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports Hive metadata into Atlas using the model defined in org.apache.atlas.hive.model.HiveDataModelGenerator. The import-hive.sh script runs this import. Before running it, set the following configuration in <atlas-conf>/client.properties and set the environment variable $HIVE_CONF_DIR to the Hive conf directory:</p> |
| <div class="source"> |
| <pre> |
| <property> |
| <name>atlas.cluster.name</name> |
| <value>primary</value> |
| </property> |
| |
| </pre></div> |
| <p>Usage: <atlas package>/bin/import-hive.sh. Logs are written to <atlas package>/logs/import-hive.log.</p></div> |
| <div class="section"> |
| <h3><a name="Hive_Hook"></a>Hive Hook</h3> |
| <p>Hive supports listeners on command execution through Hive hooks. Atlas uses a hook to add, update, and remove entities using the model defined in org.apache.atlas.hive.model.HiveDataModelGenerator. The hook submits each request to a thread pool executor so that command execution is not blocked. A pool thread sends the entities as a message to the notification server; the Atlas server reads these messages and registers the entities. Follow these instructions in your Hive set-up to add the Hive hook for Atlas:</p> |
| <ul> |
| <li>Set up the Atlas hook in hive-site.xml of your Hive configuration:</li></ul> |
| <div class="source"> |
| <pre> |
| <property> |
| <name>hive.exec.post.hooks</name> |
| <value>org.apache.atlas.hive.hook.HiveHook</value> |
| </property> |
| |
| </pre></div> |
| <div class="source"> |
| <pre> |
| <property> |
| <name>atlas.cluster.name</name> |
| <value>primary</value> |
| </property> |
| |
| </pre></div> |
| <p></p> |
| <ul> |
| <li>Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' to hive-env.sh of your Hive configuration.</li> |
| <li>Copy <atlas-conf>/client.properties and <atlas-conf>/application.properties to the Hive conf directory.</li></ul> |
| <p>The following properties in <atlas-conf>/client.properties control the thread pool and notification details:</p> |
| <ul> |
| <li>atlas.hook.hive.synchronous - boolean; set to true to run the hook synchronously. Default: false</li> |
| <li>atlas.hook.hive.numRetries - number of retries on notification failure. Default: 3</li> |
| <li>atlas.hook.hive.minThreads - core number of threads in the pool. Default: 5</li> |
| <li>atlas.hook.hive.maxThreads - maximum number of threads in the pool. Default: 5</li> |
| <li>atlas.hook.hive.keepAliveTime - keep-alive time in milliseconds. Default: 10</li> |
| <li>atlas.hook.hive.queueSize - queue size for the thread pool. Default: 10000</li></ul> |
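| <p>The interaction between the thread pool, the synchronous flag, and the retry count can be sketched roughly as follows. This is a simplified Python illustration, not the hook's actual Java implementation: fire_hook, notify_with_retries, and the send callable are hypothetical names, and the sketch assumes that numRetries counts retries made after the first failed attempt. Python's executor has no equivalent of the separate core size, keep-alive time, or bounded queue, so only the maximum thread count is modelled.</p> |

```python
from concurrent.futures import ThreadPoolExecutor

NUM_RETRIES = 3  # mirrors the default of atlas.hook.hive.numRetries
MAX_THREADS = 5  # mirrors the default of atlas.hook.hive.maxThreads


def notify_with_retries(send, message, num_retries=NUM_RETRIES):
    """Call send(message) once, then retry up to num_retries times on failure.

    Assumption: num_retries excludes the first attempt.
    """
    last_error = None
    for _attempt in range(num_retries + 1):
        try:
            return send(message)
        except Exception as exc:  # remember the failure and try again
            last_error = exc
    raise last_error


def fire_hook(executor, send, message, synchronous=False):
    """Hand the notification to the pool so the caller is not blocked.

    When synchronous is True, wait for delivery before returning,
    loosely mirroring atlas.hook.hive.synchronous=true.
    """
    future = executor.submit(notify_with_retries, send, message)
    if synchronous:
        future.result()  # blocks; re-raises if every attempt failed
    return future
```

| <p>A caller would create one ThreadPoolExecutor(max_workers=MAX_THREADS) and pass it to fire_hook for each command; in the default asynchronous mode the returned future can be ignored, so the command itself is not delayed by notification failures.</p> |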
| <p>Refer to <a href="./Configuration.html">Configuration</a> for notification-related configuration.</p></div> |
| <div class="section"> |
| <h3><a name="Limitations"></a>Limitations</h3> |
| <p></p> |
| <ul> |
| <li>Because database, table, and column names are case insensitive in Hive, the corresponding names in entities are lowercase. Any search API call should therefore use lowercase when querying on entity names.</li> |
| <li>The Hive hook currently captures only the following Hive operations: create database, create table, create view, CTAS, load, import, export, query, alter table rename, and alter view rename.</li></ul></div> |
| </div> |
| </div> |
| |
| <hr/> |
| |
| <footer> |
| <div class="container"> |
| <div class="row span12">Copyright © 2015 |
| <a href="http://www.apache.org">Apache Software Foundation</a>. |
| All Rights Reserved. |
| |
| </div> |
| |
| |
| <p id="poweredBy" class="pull-right"> |
| <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> |
| <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> |
| </a> |
| </p> |
| |
| </div> |
| </footer> |
| </body> |
| </html> |