blob: 4bae9145121fea70aa60f46dcd87b160d4060c2c [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Hadoop Ozone Documentation">
<title>Documentation for Apache Hadoop Ozone</title>
<link href="../css/bootstrap.min.css" rel="stylesheet">
<link href="../css/ozonedoc.css" rel="stylesheet">
</head>
<body>
<nav class="navbar navbar-inverse navbar-fixed-top">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#sidebar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="#" class="navbar-left" style="height: 50px; padding: 5px 5px 5px 0;">
<img src="../ozone-logo-small.png" width="40"/>
</a>
<a class="navbar-brand hidden-xs" href="#">
Apache Hadoop Ozone/HDDS documentation
</a>
<a class="navbar-brand visible-xs-inline" href="#">Hadoop Ozone</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li><a href="https://github.com/apache/hadoop-ozone">Source</a></li>
<li><a href="https://hadoop.apache.org">Apache Hadoop</a></li>
<li><a href="https://apache.org">ASF</a></li>
</ul>
</div>
</div>
</nav>
<div class="container-fluid">
<div class="row">
<div class="col-sm-2 col-md-2 sidebar" id="sidebar">
<ul class="nav nav-sidebar">
<li class="">
<a href="../index.html">
<span>Overview</span>
</a>
</li>
<li class="">
<a href="../start.html">
<span>Getting Started</span>
</a>
</li>
<li class="">
<a href="../concept.html">
<span>Architecture</span>
</a>
<ul class="nav">
<li class="">
<a href="../concept/overview.html">Overview</a>
</li>
<li class="">
<a href="../concept/ozonemanager.html">Ozone Manager</a>
</li>
<li class="">
<a href="../concept/storagecontainermanager.html">Storage Container Manager</a>
</li>
<li class="">
<a href="../concept/containers.html">Containers</a>
</li>
<li class="">
<a href="../concept/datanodes.html">Datanodes</a>
</li>
</ul>
</li>
<li class="">
<a href="../feature.html">
<span>Features</span>
</a>
<ul class="nav">
<li class="">
<a href="../feature/ha.html">High Availability</a>
</li>
<li class="active">
<a href="../feature/topology.html">Topology awareness</a>
</li>
<li class="">
<a href="../feature/gdpr.html">GDPR in Ozone</a>
</li>
<li class="">
<a href="../feature/recon.html">Recon</a>
</li>
<li class="">
<a href="../feature/observability.html">Observability</a>
</li>
</ul>
</li>
<li class="">
<a href="../interface.html">
<span>Client Interfaces</span>
</a>
<ul class="nav">
<li class="">
<a href="../interface/ofs.html">Ofs (Hadoop compatible)</a>
</li>
<li class="">
<a href="../interface/o3fs.html">O3fs (Hadoop compatible)</a>
</li>
<li class="">
<a href="../interface/s3.html">S3 Protocol</a>
</li>
<li class="">
<a href="../interface/cli.html">Command Line Interface</a>
</li>
<li class="">
<a href="../interface/javaapi.html">Java API</a>
</li>
<li class="">
<a href="../interface/csi.html">CSI Protocol</a>
</li>
</ul>
</li>
<li class="">
<a href="../security.html">
<span>Security</span>
</a>
<ul class="nav">
<li class="">
<a href="../security/secureozone.html">Securing Ozone</a>
</li>
<li class="">
<a href="../security/securingtde.html">Transparent Data Encryption</a>
</li>
<li class="">
<a href="../security/securingdatanodes.html">Securing Datanodes</a>
</li>
<li class="">
<a href="../security/securingozonehttp.html">Securing HTTP</a>
</li>
<li class="">
<a href="../security/securings3.html">Securing S3</a>
</li>
<li class="">
<a href="../security/securityacls.html">Ozone ACLs</a>
</li>
<li class="">
<a href="../security/securitywithranger.html">Apache Ranger</a>
</li>
</ul>
</li>
<li class="">
<a href="../tools.html">
<span>Tools</span>
</a>
</li>
<li class="">
<a href="../recipe.html">
<span>Recipes</span>
</a>
</li>
<li><a href="../design.html"><span><b>Design docs</b></span></a></li>
<li class="visible-xs"><a href="#">References</a>
<ul class="nav">
<li><a href="https://github.com/apache/hadoop"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> Source</a></li>
<li><a href="https://hadoop.apache.org"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> Apache Hadoop</a></li>
<li><a href="https://apache.org"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> ASF</a></li>
</ul></li>
</ul>
</div>
<div class="col-sm-10 col-sm-offset-2 col-md-10 col-md-offset-2 main">
<div class="col-md-9">
<nav aria-label="breadcrumb">
<ol class="breadcrumb">
<li class="breadcrumb-item"><a href="../">Home</a></li>
<li class="breadcrumb-item" aria-current="page"><a href="../feature.html">Features</a></li>
<li class="breadcrumb-item active" aria-current="page">Topology awareness</li>
</ol>
</nav>
<div class="pull-right">
</div>
<div class="col-md-9">
<h1>Topology awareness</h1>
<!---
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Ozone can use topology related information (for example rack placement) to optimize read and write pipelines. To get full rack-aware cluster, Ozone requires three different configuration.</p>
<ol>
<li>The topology information should be configured by Ozone.</li>
<li>Topology related information should be used when Ozone chooses 3 different datanodes for a specific pipeline/container. (WRITE)</li>
<li>When Ozone reads a Key it should prefer to read from the closest node.</li>
</ol>
<div class="alert alert-warning" role="alert">
<p>Ozone uses RAFT replication for Open containers (write), and an async replication for closed, immutable containers (cold data). As RAFT requires low-latency network, topology awareness placement is available only for closed containers. See the <a href="../concept/containers.html">page about Containers</a> about more information related to Open vs Closed containers.</p>
</div>
<h2 id="topology-hierarchy">Topology hierarchy</h2>
<p>Topology hierarchy can be configured with using <code>net.topology.node.switch.mapping.impl</code> configuration key. This configuration should define an implementation of the <code>org.apache.hadoop.net.CachedDNSToSwitchMapping</code>. As this is a Hadoop class, the configuration is exactly the same as the Hadoop Configuration</p>
<h3 id="static-list">Static list</h3>
<p>Static list can be configured with the help of <code>TableMapping</code>:</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-XML" data-lang="XML"><span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>net.topology.node.switch.mapping.impl<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>org.apache.hadoop.net.TableMapping<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
<span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>net.topology.table.file.name<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>/opt/hadoop/compose/ozone-topology/network-config<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
</code></pre></div><p>The second configuration option should point to a text file. The file format is a two column text file, with columns separated by whitespace. The first column is a DNS or IP address and the second column specifies the rack where the address maps. If no entry corresponding to a host in the cluster is found, then <code>/default-rack</code> is assumed.</p>
<h3 id="dynamic-list">Dynamic list</h3>
<p>Rack information can be identified with the help of an external script:</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-XML" data-lang="XML"><span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>net.topology.node.switch.mapping.impl<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>org.apache.hadoop.net.TableMapping<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
<span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>org.apache.hadoop.net.ScriptBasedMapping<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>/usr/local/bin/rack.sh<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
</code></pre></div><p>If implementing an external script, it will be specified with the <code>net.topology.script.file.name</code> parameter in the configuration files. Unlike the java class, the external topology script is not included with the Ozone distribution and is provided by the administrator. Ozone will send multiple IP addresses to ARGV when forking the topology script. The number of IP addresses sent to the topology script is controlled with <code>net.topology.script.number.args</code> and defaults to 100. If <code>net.topology.script.number.args</code> was changed to 1, a topology script would get forked for each IP submitted.</p>
<h2 id="write-path">Write path</h2>
<p>Placement of the closed containers can be configured with <code>ozone.scm.container.placement.impl</code> configuration key. The available container placement policies can be found in the <code>org.apache.hdds.scm.container.placement</code> <a href="https://github.com/apache/hadoop-ozone/tree/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/placement/algorithms">package</a>.</p>
<p>By default the <code>SCMContainerPlacementRandom</code> is used for topology-awareness the <code>SCMContainerPlacementRackAware</code> can be used:</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-XML" data-lang="XML"><span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>ozone.scm.container.placement.impl<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
</code></pre></div><p>This placement policy complies with the algorithm used in HDFS. With default 3 replica, two replicas will be on the same rack, the third one will on a different rack.</p>
<p>This implementation applies to network topology like &ldquo;/rack/node&rdquo;. Don&rsquo;t recommend to use this if the network topology has more layers.</p>
<h2 id="read-path">Read path</h2>
<p>Finally the read path also should be configured to read the data from the closest pipeline.</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-XML" data-lang="XML"><span style="color:#f92672">&lt;property&gt;</span>
<span style="color:#f92672">&lt;name&gt;</span>ozone.network.topology.aware.read<span style="color:#f92672">&lt;/name&gt;</span>
<span style="color:#f92672">&lt;value&gt;</span>true<span style="color:#f92672">&lt;/value&gt;</span>
<span style="color:#f92672">&lt;/property&gt;</span>
</code></pre></div><h2 id="references">References</h2>
<ul>
<li>Hadoop documentation about <code>net.topology.node.switch.mapping.impl</code>: <a href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/RackAwareness.html">https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/RackAwareness.html</a></li>
<li><a href="../design/topology.html">Design doc</a></li>
</ul>
<a class="btn btn-success btn-lg" href="../feature/gdpr.html">Next >></a>
</div>
</div>
</div>
</div>
</div>
<script src="../js/jquery-3.5.1.min.js"></script>
<script src="../js/ozonedoc.js"></script>
<script src="../js/bootstrap.min.js"></script>
</body>
</html>