blob: 1fe473afb79ca2336d628982c9e4139cb794bd4b [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Apache Ozone Documentation">
<title>Documentation for Apache Ozone</title>
<link href="../css/bootstrap.min.css" rel="stylesheet">
<link href="../css/ozonedoc.css" rel="stylesheet">
</head>
<body>
<nav class="navbar navbar-inverse navbar-fixed-top">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#sidebar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="../index.html" class="navbar-left ozone-logo">
<img src="../ozone-logo-small.png"/>
</a>
<a class="navbar-brand hidden-xs" href="../index.html">
Apache Ozone/HDDS documentation
</a>
<a class="navbar-brand visible-xs-inline" href="#">Apache Ozone</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li><a href="https://github.com/apache/hadoop-ozone">Source</a></li>
<li><a href="https://hadoop.apache.org">Apache Hadoop</a></li>
<li><a href="https://apache.org">ASF</a></li>
</ul>
</div>
</div>
</nav>
<div class="wrapper">
<div class="container-fluid">
<div class="row">
<div class="col-sm-2 col-md-2 sidebar" id="sidebar">
<ul class="nav nav-sidebar">
<li class="">
<a href="../index.html">
<span>Overview</span>
</a>
</li>
<li class="">
<a href="../start.html">
<span>Getting Started</span>
</a>
</li>
<li class="">
<a href="../concept.html">
<span>Architecture</span>
</a>
<ul class="nav">
<li class="">
<a href="../concept/overview.html">Overview</a>
</li>
<li class="">
<a href="../concept/ozonemanager.html">Ozone Manager</a>
</li>
<li class="">
<a href="../concept/storagecontainermanager.html">Storage Container Manager</a>
</li>
<li class="">
<a href="../concept/containers.html">Containers</a>
</li>
<li class="active">
<a href="../concept/datanodes.html">Datanodes</a>
</li>
<li class="">
<a href="../concept/recon.html">Recon</a>
</li>
</ul>
</li>
<li class="">
<a href="../feature.html">
<span>Features</span>
</a>
<ul class="nav">
<li class="">
<a href="../feature/ha.html">High Availability</a>
</li>
<li class="">
<a href="../feature/topology.html">Topology awareness</a>
</li>
<li class="">
<a href="../feature/quota.html">Quota in Ozone</a>
</li>
<li class="">
<a href="../feature/recon.html">Recon Server</a>
</li>
<li class="">
<a href="../feature/observability.html">Observability</a>
</li>
</ul>
</li>
<li class="">
<a href="../interface.html">
<span>Client Interfaces</span>
</a>
<ul class="nav">
<li class="">
<a href="../interface/ofs.html">Ofs (Hadoop compatible)</a>
</li>
<li class="">
<a href="../interface/o3fs.html">O3fs (Hadoop compatible)</a>
</li>
<li class="">
<a href="../interface/s3.html">S3 Protocol</a>
</li>
<li class="">
<a href="../interface/cli.html">Command Line Interface</a>
</li>
<li class="">
<a href="../interface/reconapi.html">Recon API</a>
</li>
<li class="">
<a href="../interface/javaapi.html">Java API</a>
</li>
<li class="">
<a href="../interface/csi.html">CSI Protocol</a>
</li>
</ul>
</li>
<li class="">
<a href="../security.html">
<span>Security</span>
</a>
<ul class="nav">
<li class="">
<a href="../security/secureozone.html">Securing Ozone</a>
</li>
<li class="">
<a href="../security/securingtde.html">Transparent Data Encryption</a>
</li>
<li class="">
<a href="../security/gdpr.html">GDPR in Ozone</a>
</li>
<li class="">
<a href="../security/securingdatanodes.html">Securing Datanodes</a>
</li>
<li class="">
<a href="../security/securingozonehttp.html">Securing HTTP</a>
</li>
<li class="">
<a href="../security/securings3.html">Securing S3</a>
</li>
<li class="">
<a href="../security/securityacls.html">Ozone ACLs</a>
</li>
<li class="">
<a href="../security/securitywithranger.html">Apache Ranger</a>
</li>
</ul>
</li>
<li class="">
<a href="../tools.html">
<span>Tools</span>
</a>
</li>
<li class="">
<a href="../recipe.html">
<span>Recipes</span>
</a>
</li>
<li><a href="../design.html"><span><b>Design docs</b></span></a></li>
<li class="visible-xs"><a href="#">References</a>
<ul class="nav">
<li><a href="https://github.com/apache/hadoop"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> Source</a></li>
<li><a href="https://hadoop.apache.org"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> Apache Hadoop</a></li>
<li><a href="https://apache.org"><span class="glyphicon glyphicon-new-window" aria-hidden="true"></span> ASF</a></li>
</ul></li>
</ul>
</div>
<div class="col-sm-10 col-sm-offset-2 col-md-10 col-md-offset-2 main">
<div class="col-md-9">
<nav aria-label="breadcrumb">
<ol class="breadcrumb">
<li class="breadcrumb-item"><a href="../index.html">Home</a></li>
<li class="breadcrumb-item" aria-current="page"><a href="../concept.html">Architecture</a></li>
<li class="breadcrumb-item active" aria-current="page">Datanodes</li>
</ol>
</nav>
<div class="pull-right">
<a href="../zh/concept/datanodes.html"><span class="label label-success">中文</span></a>
</div>
<div class="col-md-9">
<h1>Datanodes</h1>
<!---
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Datanodes are the worker bees of Ozone. All data is stored on data nodes.
Clients write data in terms of blocks. Datanode aggregates these blocks into
a storage container. A storage container is the data streams and metadata
about the blocks written by the clients.</p>
<h2 id="storage-containers">Storage Containers</h2>
<p><img src="ContainerMetadata.png" alt="FunctionalOzone"></p>
<p>A storage container is a self-contained super block. It has a list of Ozone
blocks that reside inside it, as well as on-disk files which contain the
actual data streams. This is the default Storage container format. From
Ozone&rsquo;s perspective, container is a protocol spec, actual storage layouts
does not matter. In other words, it is trivial to extend or bring new
container layouts. Hence this should be treated as a reference implementation
of containers under Ozone.</p>
<h2 id="understanding-ozone-blocks-and-containers">Understanding Ozone Blocks and Containers</h2>
<p>When a client wants to read a key from Ozone, the client sends the name of
the key to the Ozone Manager. Ozone manager returns the list of Ozone blocks
that make up that key.</p>
<p>An Ozone block contains the container ID and a local ID. The figure below
shows the logical layout out of Ozone block.</p>
<p><img src="OzoneBlock.png" alt="OzoneBlock"></p>
<p>The container ID lets the clients discover the location of the container. The
authoritative information about where a container is located is with the
Storage Container Manager (SCM). In most cases, the container location will be
cached by Ozone Manager and will be returned along with the Ozone blocks.</p>
<p>Once the client is able to locate the container, that is, understand which
data nodes contain this container, the client will connect to the datanode
and read the data stream specified by <em>Container ID:Local ID</em>. In other
words, the local ID serves as index into the container which describes what
data stream we want to read from.</p>
<h3 id="discovering-the-container-locations">Discovering the Container Locations</h3>
<p>How does SCM know where the containers are located ? This is very similar to
what HDFS does; the data nodes regularly send container reports like block
reports. Container reports are far more concise than block reports. For
example, an Ozone deployment with a 196 TB data node will have around 40
thousand containers. Compare that with HDFS block count of million and half
blocks that get reported. That is a 40x reduction in the block reports.</p>
<p>This extra indirection helps tremendously with scaling Ozone. SCM has far
less block data to process and the namespace service (Ozone Manager) as a
different service are critical to scaling Ozone.</p>
<a class="btn btn-success btn-lg" href="../concept/recon.html">Next >></a>
</div>
</div>
</div>
</div>
</div>
<div class="push"></div>
</div>
<footer class="footer">
<div class="container">
<span class="small text-muted">
Version: 1.1.0, Last Modified: August 7, 2020 <a class="hide-child link primary-color" href="https://github.com/apache/ozone/commit/5ce6f0eab381389fd04c3130531b3ec626acbc65">5ce6f0eab</a>
</span>
</div>
</footer>
<script src="../js/jquery-3.5.1.min.js"></script>
<script src="../js/ozonedoc.js"></script>
<script src="../js/bootstrap.min.js"></script>
</body>
</html>