<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Hadoop Ozone Documentation">
<title>Documentation for Apache Hadoop Ozone</title>
<link href="css/bootstrap.min.css" rel="stylesheet">
<link href="css/ozonedoc.css" rel="stylesheet">
</head>
<body>
<nav class="navbar navbar-inverse navbar-fixed-top">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="#">Apache Hadoop Ozone/HDDS documentation</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li><a href="https://github.com/apache/hadoop">Source</a></li>
<li><a href="https://hadoop.apache.org">Apache Hadoop</a></li>
<li><a href="https://apache.org">ASF</a></li>
</ul>
</div>
</div>
</nav>
<div class="container-fluid">
<div class="row">
<div class="col-sm-3 col-md-2 sidebar">
<img src="ozone-logo.png" style="max-width: 100%;"/>
<ul class="nav nav-sidebar">
<li class="">
<a href="index.html">
<span>Ozone Overview</span>
</a>
</li>
<li class="">
<a href="runningviadocker.html">
<span>Getting Started</span>
</a>
<ul class="nav">
<li class="">
<a href="./runningviadocker.html">Alpha Cluster</a>
</li>
<li class="">
<a href="./settings.html">Configuration</a>
</li>
<li class="">
<a href="./realcluster.html">Starting an Ozone Cluster</a>
</li>
<li class="">
<a href="./runningwithhdfs.html">Running concurrently with HDFS</a>
</li>
<li class="">
<a href="./buildingsources.html">Building from Sources</a>
</li>
</ul>
</li>
<li class="">
<a href="commandshell.html">
<span>Client</span>
</a>
<ul class="nav">
<li class="">
<a href="./commandshell.html">Ozone CLI</a>
</li>
<li class="">
<a href="./volumecommands.html">Volume Commands</a>
</li>
<li class="">
<a href="./bucketcommands.html">Bucket Commands</a>
</li>
<li class="">
<a href="./keycommands.html">Key Commands</a>
</li>
<li class="">
<a href="./javaapi.html">Java API</a>
</li>
<li class="">
<a href="./ozonefs.html">Ozone File System</a>
</li>
<li class="">
<a href="./rest.html">REST API</a>
</li>
</ul>
</li>
<li class="">
<a href="dozone.html">
<span>Tools</span>
</a>
<ul class="nav">
<li class="">
<a href="./dozone.html">Dozone &amp; Dev Tools</a>
</li>
<li class="">
<a href="./freon.html">Freon</a>
</li>
<li class="">
<a href="./scmcli.html">SCMCLI</a>
</li>
</ul>
</li>
<li class="active">
<a href="./concepts.html">
<span>Architecture</span>
</a>
<ul class="nav">
<li class="">
<a href="./hdds.html">Hadoop Distributed Data Store</a>
</li>
<li class="">
<a href="./ozonemanager.html">Ozone Manager</a>
</li>
</ul>
</li>
</ul>
</div>
<div class="col-sm-9 col-sm-offset-3 col-md-10 col-md-offset-2 main">
<h1>Architecture</h1>
<div class="col-md-9">
<!---
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Ozone is a redundant, distributed object store built by
leveraging primitives present in HDFS. The primary design point of Ozone is scalability, and it aims to scale to billions of objects.</p>
<p>Ozone consists of volumes, buckets, and keys. A volume is similar to a home directory in the Ozone world. Only an administrator can create a volume. Volumes are used to store buckets. Once a volume is created, users can create as many buckets as needed. Ozone stores data as keys, which live inside these buckets.</p>
<p>The Ozone namespace is composed of many storage volumes. Storage volumes are also used as the basis for storage accounting.</p>
<p>To access a key, an Ozone URL has the following format:</p>
<pre><code>http://servername:port/volume/bucket/key
</code></pre>
<p>Here, the server name is the name of a data node and the port is the HTTP port of that data node. The volume is the name of the Ozone volume, the bucket is an Ozone bucket created by the user, and the key represents the file.</p>
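<p>As a minimal sketch of how this URL layout maps onto its parts, the following Python snippet splits a URL into server, port, volume, bucket, and key. The function name <code>parse_ozone_url</code>, the host name, and the port number are hypothetical and used only for illustration. The sketch also assumes that everything after the bucket belongs to the key.</p>
<pre><code>from urllib.parse import urlsplit


def parse_ozone_url(url):
    """Split an Ozone-style URL into server, port, volume, bucket, and key."""
    parts = urlsplit(url)
    # The path has the form /volume/bucket/key; maxsplit=2 keeps any "/"
    # that appears inside the key (an assumption of this sketch).
    segments = parts.path.lstrip("/").split("/", 2)
    if len(segments) != 3 or not all(segments):
        raise ValueError("expected /volume/bucket/key in " + url)
    volume, bucket, key = segments
    return {
        "server": parts.hostname,
        "port": parts.port,
        "volume": volume,
        "bucket": bucket,
        "key": key,
    }


# Hypothetical host, port, and names, used only for illustration.
info = parse_ozone_url("http://datanode1.example.com:9880/vol1/bucket1/logs/app.log")
print(info)
</code></pre>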
<p>See the <a href="./commandshell.html#shell">command line interface</a> for more information.</p>
<p>Ozone supports both REST and RPC protocols. Clients can choose either of these protocols to communicate with Ozone. Please see the <a href="./javaapi.html">client documentation</a> for more details.</p>
<p>Ozone separates namespace management from block space management; this separation helps
Ozone scale much better. The namespace is managed by a daemon called
<a href="./ozonemanager.html">Ozone Manager</a> (OM), and block space is
managed by the <a href="./hdds.html">Storage Container Manager</a> (SCM).</p>
<p>The data nodes provide replication and the ability to store blocks. These blocks are stored in groups to reduce the metadata pressure on SCM. These groups of blocks are called storage containers, which is why the block manager is called the Storage Container
Manager.</p>
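<p>The effect of grouping can be illustrated with a toy model. The sketch below groups block IDs into containers so that a manager only has to track one entry per container instead of one per block. The fixed count of blocks per container and the function name <code>group_into_containers</code> are assumptions made for this example; they do not describe how Ozone actually sizes containers.</p>
<pre><code>from collections import defaultdict

# Hypothetical value chosen for the example; not an Ozone setting.
BLOCKS_PER_CONTAINER = 4


def group_into_containers(block_ids, blocks_per_container=BLOCKS_PER_CONTAINER):
    """Assign consecutive blocks to numbered containers."""
    containers = defaultdict(list)
    for index, block_id in enumerate(block_ids):
        containers[index // blocks_per_container].append(block_id)
    return dict(containers)


blocks = ["block-%d" % i for i in range(10)]
containers = group_into_containers(blocks)
# The manager now tracks len(containers) entries rather than len(blocks).
print(len(blocks), "blocks tracked as", len(containers), "containers")
</code></pre>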
<h2 id="ozone-overview">Ozone Overview</h2>
<p>
The following diagram is a high-level overview of the core components of Ozone.

</p>
<p><img src="../../OzoneOverview.svg" alt="Architecture diagram" /></p>
<p>The main elements of Ozone are:</p>
<h3 id="ozone-manager">Ozone Manager
</h3>
<p><a href="./ozonemanager.html">Ozone Manager</a> (OM) takes care of Ozone&rsquo;s namespace.
All Ozone objects, such as volumes, buckets, and keys, are managed by OM. In short, OM is the metadata manager for Ozone.
OM talks to the block manager (SCM) to get blocks and passes them on to the Ozone
client. The Ozone client writes data to these blocks.
OM will eventually be replicated via Apache Ratis for high availability.
</p>
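<p>The write path described above can be sketched as a toy model. The classes <code>ToyScm</code>, <code>ToyOm</code>, and <code>ToyClient</code> and all of their methods are hypothetical stand-ins written for this example; they are not Ozone APIs, and a plain dictionary stands in for the data nodes.</p>
<pre><code>class ToyScm:
    """Stand-in for the block manager: hands out block IDs."""

    def __init__(self):
        self.next_block_id = 1

    def allocate_block(self):
        block_id = self.next_block_id
        self.next_block_id += 1
        return block_id


class ToyOm:
    """Stand-in for the namespace manager: maps keys to block IDs."""

    def __init__(self, scm):
        self.scm = scm
        self.key_to_block = {}

    def open_key(self, volume, bucket, key):
        # OM asks SCM for a block and records the key metadata.
        block_id = self.scm.allocate_block()
        self.key_to_block[(volume, bucket, key)] = block_id
        # OM passes the block on to the client.
        return block_id


class ToyClient:
    """Stand-in for the client: writes data to the block it was given."""

    def __init__(self, om, block_store):
        self.om = om
        self.block_store = block_store  # a dict standing in for data nodes

    def put_key(self, volume, bucket, key, data):
        block_id = self.om.open_key(volume, bucket, key)
        self.block_store[block_id] = data
        return block_id


block_store = {}
om = ToyOm(ToyScm())
client = ToyClient(om, block_store)
client.put_key("vol1", "bucket1", "key1", b"hello")
print(om.key_to_block, block_store)
</code></pre>
<p>In this sketch OM holds only the mapping from key to block, while the data itself lives in the block store, which mirrors the split between metadata and data described above.</p>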
<h3 id="storage-container-manager">Storage Container Manager</h3>
<p><a href="./hdds.html">Storage Container Manager</a> (SCM) is the block and cluster manager for Ozone.
SCM, along with the data nodes, offers a service called &lsquo;storage containers&rsquo;.
A storage container is a group of unrelated blocks that are managed together as a single entity.</p>
<p>SCM offers the following abstractions.

</p>
<p><img src="../../SCMBlockDiagram.png" alt="SCM Abstractions" /></p>
<h3 id="blocks">Blocks</h3>
<p>Blocks are similar to blocks in HDFS. They are replicated stores of data. Clients write data to blocks.</p>
<h3 id="containers">Containers</h3>
<p>A container is a collection of blocks that are replicated and managed together.</p>
<h3 id="pipelines">Pipelines</h3>
<p>SCM allows each storage container to choose its method of replication.
For example, a storage container might decide that it needs only one copy of a block
and might choose a stand-alone pipeline. Another storage container might want a very high level of reliability and pick a Ratis-based pipeline. In other words, SCM allows different kinds of replication strategies to co-exist. While writing data, the client chooses a storage container with the required properties.</p>
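<p>As a rough illustration of a client picking a container by its replication properties, the sketch below selects the first container whose pipeline type matches the request. The dictionary layout, the pipeline labels, the copy count for the Ratis-based container, and the function <code>choose_container</code> are all assumptions made for this example, not Ozone data structures.</p>
<pre><code># Pipeline labels and copy counts are illustrative assumptions.
CONTAINERS = [
    {"id": 1, "pipeline": "STAND_ALONE", "copies": 1},
    {"id": 2, "pipeline": "RATIS", "copies": 3},
]


def choose_container(containers, pipeline):
    """Return the first container whose pipeline type matches the request."""
    for container in containers:
        if container["pipeline"] == pipeline:
            return container
    raise LookupError("no container with pipeline " + pipeline)


# A single-copy container and a replicated one can co-exist.
scratch = choose_container(CONTAINERS, "STAND_ALONE")
durable = choose_container(CONTAINERS, "RATIS")
print(scratch["id"], durable["id"])
</code></pre>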
<h3 id="pools">Pools</h3>
<p>A group of data nodes is called a pool. For scaling purposes,
we define a pool as a set of machines. This makes management of data nodes easier.</p>
<h3 id="nodes">Nodes</h3>
<p>A node is a data node where data is stored. SCM monitors these nodes via heartbeats.</p>
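<p>Heartbeat monitoring in general can be sketched as follows: the monitor remembers when each node last reported in and treats a node as stale once it has been silent for longer than a timeout. The class <code>HeartbeatMonitor</code>, the node names, and the timeout value are hypothetical; this is a generic illustration, not how SCM is implemented.</p>
<pre><code>class HeartbeatMonitor:
    """Track the last heartbeat time of each node."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, node, now):
        # Remember the most recent time this node reported in.
        self.last_seen[node] = now

    def stale_nodes(self, now):
        # A node is stale once it has been silent longer than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


# Times are passed in explicitly (in seconds) to keep the example deterministic.
monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record_heartbeat("datanode1", now=0)
monitor.record_heartbeat("datanode2", now=50)
print(monitor.stale_nodes(now=60))
</code></pre>
<p>In a running service the timestamps would typically come from a monotonic clock rather than being supplied by hand.</p>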
<h3 id="clients">Clients</h3>
<p>Ozone ships with a set of clients. The Ozone <a href="./commandshell.html#shell">CLI</a> is the command line interface, similar to the &lsquo;hdfs&rsquo; command.
<a href="./freon.html">Freon</a> is a load generation tool for Ozone.
</p>
<h3 id="rest-handler">REST Handler</h3>
<p>Ozone provides an RPC (Remote Procedure Call) as well as a REST (Representational State Transfer) interface. This allows clients to be written quickly in many languages. Ozone strives to maintain API compatibility between REST and RPC.
For most purposes, a client can make a one-line change to switch from REST to RPC or vice versa.
</p>
<h3 id="ozone-file-system">Ozone File System</h3>
<p>The <a href="./ozonefs.html">Ozone file system</a> is a Hadoop-compatible file system. This allows Hadoop services and applications like Hive and Spark to run against
Ozone without any changes.</p>
<h3 id="ozone-client">Ozone Client</h3>
<p>The Ozone client is similar to DFSClient in HDFS. It is the standard client for talking to Ozone. All other components discussed so far rely on the Ozone client. The Ozone client supports both RPC and REST protocols.</p>
</div>
</div>
</div>
</div>
<script src="./js/jquery.min.js"></script>
<script src="./js/ozonedoc.js"></script>
<script src="./js/bootstrap.min.js"></script>
</body>
</html>