<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2018-03-12
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20180312" />
<meta http-equiv="Content-Language" content="en" />
<title>Falcon - HDFS mirroring Extension</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
</head>
<body class="topBarDisabled">
<div class="container">
<div id="banner">
<div class="pull-left">
<div id="bannerLeft">
<img src="images/falcon-logo.png" alt="Apache Falcon" width="200px" height="45px"/>
</div>
</div>
<div class="pull-right"> </div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class="">
<a href="index.html" title="Falcon">
Falcon</a>
</li>
<li class="divider ">/</li>
<li class="">HDFS mirroring Extension</li>
<li id="publishDate" class="pull-right">Last Published: 2018-03-12</li> <li class="divider pull-right">|</li>
<li id="projectVersion" class="pull-right">Version: 0.11</li>
</ul>
</div>
<div id="bodyColumn" >
<div class="section">
<h2>HDFS mirroring Extension<a name="HDFS_mirroring_Extension"></a></h2></div>
<div class="section">
<h3>Overview<a name="Overview"></a></h3>
<p>Falcon provides an HDFS mirroring extension that replicates data from a source cluster to a destination cluster. The extension replicates arbitrary directories on HDFS and piggybacks on the replication solution in Falcon, which uses the <a href="./DistCp.html">DistCp</a> tool. It also lets users replicate data from an on-premise cluster to the cloud, either Azure WASB or S3.</p></div>
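<div class="section">
<p>To give a feel for what the extension builds on, the sketch below prints the kind of plain DistCp command that a directory copy corresponds to. It is an illustration only: the options Falcon actually passes to DistCp are not described on this page, and the helper name &quot;distcp_command&quot; and all hosts, buckets, and paths are hypothetical placeholders.</p>

```shell
# Sketch only: prints a plain DistCp command for a directory copy.
# All hosts, buckets, and paths below are hypothetical placeholders.
distcp_command() {
  # $1 = source directory URI, $2 = target directory URI
  printf 'hadoop distcp %s %s\n' "$1" "$2"
}

# HDFS to HDFS, between two clusters
distcp_command hdfs://source-nn.example.com:8020/data/in hdfs://target-nn.example.com:8020/data/in

# HDFS to S3, using the s3a connector
distcp_command hdfs://source-nn.example.com:8020/data/in s3a://example-bucket/data/in
```

<p>The helper only prints the command, so it can be run safely on a machine without Hadoop installed.</p></div>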
<div class="section">
<h3>Use Case<a name="Use_Case"></a></h3>
<ul>
<li>Copy directories between HDFS clusters without dated partitions</li>
<li>Archive directories from HDFS to the cloud, for example S3 or Azure WASB</li></ul></div>
<div class="section">
<h3>Limitations<a name="Limitations"></a></h3>
<p>As the data volume and the number of files grow, mirroring with this extension can become inefficient.</p></div>
<div class="section">
<h3>Usage<a name="Usage"></a></h3></div>
<div class="section">
<h4>Setup source and destination clusters<a name="Setup_source_and_destination_clusters"></a></h4>
<div class="source">
<pre>
$FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
</pre></div></div>
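<div class="section">
<p>Since both the source and the destination cluster have to be submitted, the command above is typically run once per cluster definition. A minimal sketch, assuming one definition file per cluster; the function name &quot;submit_clusters&quot;, the &quot;FALCON_BIN&quot; variable, and the file names in the usage comment are hypothetical.</p>

```shell
# Path to the Falcon CLI. FALCON_BIN is a hypothetical override, handy for dry runs.
FALCON_BIN="${FALCON_BIN:-${FALCON_HOME:-}/bin/falcon}"

# Submit each cluster definition in turn and stop at the first failure.
submit_clusters() {
  for definition in "$@"; do
    "$FALCON_BIN" entity -submit -type cluster -file "$definition" || return 1
  done
}

# Usage (hypothetical file names):
# submit_clusters /cluster/source-definition.xml /cluster/target-definition.xml
```

<p>Setting &quot;FALCON_BIN=echo&quot; before calling the function prints the arguments instead of contacting a Falcon server.</p></div>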
<div class="section">
<h4>HDFS mirroring extension properties<a name="HDFS_mirroring_extension_properties"></a></h4>
<p>Extension artifacts are expected to be installed on HDFS at the path specified by &quot;extension.store.uri&quot; in the startup properties. The hdfs-mirroring-properties.json file, located at &quot;&lt;extension.store.uri&gt;/hdfs-mirroring/META/hdfs-mirroring-properties.json&quot;, lists all the required and optional parameters for scheduling an HDFS mirroring job.</p></div>
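<div class="section">
<p>One way to review those parameters is to print the properties file straight from HDFS. The sketch below builds the file location from the value of &quot;extension.store.uri&quot; and reads it with &quot;hadoop fs -cat&quot;. The function names and the store URI in the usage comment are hypothetical; substitute the value from your own startup properties.</p>

```shell
# Build the properties-file location from the value of extension.store.uri.
mirroring_properties_path() {
  # ${1%/} drops one trailing slash so the joined path has no "//" in the middle.
  printf '%s/hdfs-mirroring/META/hdfs-mirroring-properties.json\n' "${1%/}"
}

# Print the file so the required and optional parameters can be reviewed.
show_mirroring_properties() {
  hadoop fs -cat "$(mirroring_properties_path "$1")"
}

# Usage (hypothetical store URI):
# show_mirroring_properties hdfs://nn.example.com:8020/apps/falcon/extensions
```
</div>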
<div class="section">
<h4>Submit and schedule HDFS mirroring extension<a name="Submit_and_schedule_HDFS_mirroring_extension"></a></h4>
<div class="source">
<pre>
$FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hdfs-mirroring -file /process/definition.xml
</pre></div>
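<p>In scripts it can help to check that the file passed with -file is readable before calling Falcon. A minimal sketch of such a wrapper around the command above; the function name &quot;schedule_hdfs_mirroring&quot; and the &quot;FALCON_BIN&quot; variable are hypothetical.</p>

```shell
# Path to the Falcon CLI. FALCON_BIN is a hypothetical override, handy for dry runs.
FALCON_BIN="${FALCON_BIN:-${FALCON_HOME:-}/bin/falcon}"

schedule_hdfs_mirroring() {
  # Refuse to call Falcon when the given file is missing or unreadable.
  if [ ! -r "$1" ]; then
    return 1
  fi
  "$FALCON_BIN" extension -submitAndSchedule -extensionName hdfs-mirroring -file "$1"
}

# Usage:
# schedule_hdfs_mirroring /process/definition.xml
```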
<p>Refer to the <a href="./Falconcli/FalconCLI.html">Falcon CLI</a> and <a href="./Restapi/ResourceList.html">REST API</a> documentation for more details on using the CLI and the REST APIs.</p></div>
</div>
</div>
<hr/>
<footer>
<div class="container">
<div class="row span12">Copyright &copy; 2013-2018
<a href="http://www.apache.org">Apache Software Foundation</a>.
All Rights Reserved.
</div>
<p id="poweredBy" class="pull-right">
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
</a>
</p>
</div>
</footer>
</body>
</html>