blob: b08cf799b816991b2ba14c07d687f4f0bf95681a [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Apache DistributedLog</title>
<meta name="description" content="Apache DistributedLog is an high performance replicated log.
">
<link rel="stylesheet" href="/docs/latest/styles/site.css">
<link rel="stylesheet" href="/docs/latest/css/theme.css">
<!-- JQuery -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script>
<script src="/docs/latest/js/bootstrap.min.js"></script>
<link rel="canonical" href="http://bookkeeper.apache.org/distributedlog/docs/latest/admin_guide/hardware.html" data-proofer-ignore>
<link rel="alternate" type="application/rss+xml" title="Apache DistributedLog" href="http://bookkeeper.apache.org/distributedlog/docs/latest/feed.xml">
<!-- Font Awesome -->
<script src="//cdnjs.cloudflare.com/ajax/libs/anchor-js/3.2.0/anchor.min.js"></script>
<!-- Google Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-83870961-1', 'auto');
ga('send', 'pageview');
</script>
<!-- End Google Analytics -->
<link rel="shortcut icon" type="image/x-icon" href="/images/favicon.ico">
</head>
<body role="document">
<nav class="navbar navbar-default navbar-fixed-top">
<div class="container">
<div class="navbar-header">
<a href="/" class="navbar-brand" >
<img alt="Brand" style="height: 28px" src="/docs/latest/images/distributedlog_logo_navbar.png">
</a>
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<!-- Overview -->
<li><a href="/docs/latest/">V0.6.0</a></li>
<!-- Concepts -->
<li><a href="/docs/latest/basics/introduction">Concepts</a></li>
<!-- Quick Start -->
<li>
<a href="/docs/latest/start" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Start<span class="caret"></span></a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="/docs/latest/start/building.html">
Build DistributedLog from Source
</a>
</li>
<li>
<a href="/docs/latest/start/download.html">
Download Releases
</a>
</li>
<li role="separator" class="divider"></li>
<li class="dropdown-header"><strong>Quickstart</strong></li>
<li>
<a href="/docs/latest/start/quickstart.html">
Setup & Run Example
</a>
</li>
<li>
<a href="/docs/latest/tutorials/basic-1.html">
API - Write Records (via core library)
</a>
</li>
<li>
<a href="/docs/latest/tutorials/basic-2.html">
API - Write Records (via write proxy)
</a>
</li>
<li>
<a href="/docs/latest/tutorials/basic-5.html">
API - Read Records
</a>
</li>
<li role="separator" class="divider"></li>
<li class="dropdown-header"><strong>Deployment</strong></li>
<li>
<a href="/docs/latest/deployment/cluster.html">
Cluster Setup
</a>
</li>
<li>
<a href="/docs/latest/deployment/global-cluster.html">
Global Cluster Setup
</a>
</li>
<li>
<a href="/docs/latest/deployment/kubernetes.html">
Kubernetes
</a>
</li>
</ul>
</li>
<!-- API -->
<li>
<a href="/docs/latest/start" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">API<span class="caret"></span></a>
<ul class="dropdown-menu" role="menu">
<li><a href="/docs/latest/api/java">Java</a></li>
</ul>
</li>
<!-- User Guide -->
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">User Guide<span class="caret"></span></a>
<ul class="dropdown-menu">
<li>
<a href="/docs/latest/basics/introduction.html">
Introduction
</a>
</li>
<li>
<a href="/docs/latest/user_guide/considerations/main.html">
Considerations
</a>
</li>
<li>
<a href="/docs/latest/user_guide/architecture/main.html">
Architecture
</a>
</li>
<li>
<a href="/docs/latest/user_guide/api/main.html">
API
</a>
</li>
<li>
<a href="/docs/latest/user_guide/configuration/main.html">
Configuration
</a>
</li>
<li>
<a href="/docs/latest/user_guide/design/main.html">
Detail Design
</a>
</li>
<li>
<a href="/docs/latest/user_guide/globalreplicatedlog/main.html">
Global Replicated Log
</a>
</li>
<li>
<a href="/docs/latest/user_guide/implementation/main.html">
Implementation
</a>
</li>
<li>
<a href="/docs/latest/user_guide/references/main.html">
References
</a>
</li>
</ul>
</li>
<!-- Admin Guide -->
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Admin Guide<span class="caret"></span></a>
<ul class="dropdown-menu">
<li><a href="/docs/latest/deployment/cluster">Cluster Setup</a></li>
<li>
<a href="/docs/latest/admin_guide/operations.html">
Operations
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/loadtest.html">
Load Test
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/performance.html">
Performance Tuning
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/hardware.html">
Hardware
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/monitoring.html">
Monitoring
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/zookeeper.html">
ZooKeeper
</a>
</li>
<li>
<a href="/docs/latest/admin_guide/bookkeeper.html">
BookKeeper
</a>
</li>
</ul>
</li>
<!-- Tutorials -->
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Tutorials<span class="caret"></span></a>
<ul class="dropdown-menu">
<li class="dropdown-header"><strong>Basic</strong></li>
<li><a href="/docs/latest/tutorials/basic-1">Write Records (via Core Library)</a></li>
<li><a href="/docs/latest/tutorials/basic-2">Write Records (via Write Proxy)</a></li>
<li><a href="/docs/latest/tutorials/basic-3">Write Records to multiple streams</a></li>
<li><a href="/docs/latest/tutorials/basic-4">Atomic Write Records</a></li>
<li><a href="/docs/latest/tutorials/basic-5">Tailing Read Records</a></li>
<li><a href="/docs/latest/tutorials/basic-6">Rewind Read Records</a></li>
<li role="separator" class="divider"></li>
<li class="dropdown-header"><strong>Messaging</strong></li>
<li>
<a href="/docs/latest/tutorials/messaging-1.html">
Write records to partitioned streams
</a>
</li>
<li>
<a href="/docs/latest/tutorials/messaging-2.html">
Write records to multiple streams (load balancer)
</a>
</li>
<li>
<a href="/docs/latest/tutorials/messaging-3.html">
At-least-once Processing
</a>
</li>
<li>
<a href="/docs/latest/tutorials/messaging-4.html">
Exact-Once Processing
</a>
</li>
<li>
<a href="/docs/latest/tutorials/messaging-5.html">
Implement a kafka-like pub/sub system
</a>
</li>
<li role="separator" class="divider"></li>
<li class="dropdown-header"><strong>Replicated State Machines</strong></li>
<li>
<a href="/docs/latest/tutorials/replicatedstatemachines.html">
Build replicated state machines
</a>
</li>
<li role="separator" class="divider"></li>
<li class="dropdown-header"><strong>Analytics</strong></li>
<li><a href="/docs/latest/tutorials/analytics-mapreduce">Process log streams using MapReduce</a></li>
</ul>
</li>
</ul>
</div><!--/.nav-collapse -->
</div>
</nav>
<link rel="stylesheet" href="">
<div class="container" role="main">
<div class="row">
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<div class="row">
<!-- Sub Navigation -->
<div class="col-sm-3">
<ul id="sub-nav">
<li><a href="/docs/latest/admin_guide/main.html" class="">Admin Guide</a>
<ul>
<li>
<a href="/docs/latest/deployment/cluster.html" class="">
Cluster Setup
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/deployment/global-cluster.html" class="">
Global Cluster Setup
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/operations.html" class="">
Operations
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/loadtest.html" class="">
Load Test
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/performance.html" class="">
Performance Tuning
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/hardware.html" class="active">
Hardware
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/monitoring.html" class="">
Monitoring
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/zookeeper.html" class="">
ZooKeeper
</a>
<ul>
</ul>
</li>
<li>
<a href="/docs/latest/admin_guide/bookkeeper.html" class="">
BookKeeper
</a>
<ul>
</ul>
</li>
</ul>
</li>
</ul>
</div>
<!-- Main -->
<div class="col-sm-9">
<!-- Top anchor -->
<a href="#top"></a>
<!-- Breadcrumbs above the main heading -->
<ol class="breadcrumb">
<li><a href="/docs/latest/admin_guide/main.html">Admin Guide</a></li>
<li class="active">Hardware</li>
</ol>
<div class="text">
<!-- Content -->
<div class="contents topic" id="hardware">
<p class="topic-title first">Hardware</p>
<ul class="simple">
<li><a class="reference internal" href="#id1" id="id2">Hardware</a><ul>
<li><a class="reference internal" href="#metrics" id="id3">Metrics</a></li>
<li><a class="reference internal" href="#write-proxy" id="id4">Write Proxy</a><ul>
<li><a class="reference internal" href="#cpus" id="id5">CPUs</a></li>
<li><a class="reference internal" href="#memories" id="id6">Memories</a></li>
<li><a class="reference internal" href="#disks" id="id7">Disks</a></li>
<li><a class="reference internal" href="#network" id="id8">Network</a></li>
</ul>
</li>
<li><a class="reference internal" href="#bookkeeper" id="id9">BookKeeper</a></li>
<li><a class="reference internal" href="#read-proxy" id="id10">Read Proxy</a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="id1">
<h2><a class="toc-backref" href="#id2">Hardware</a></h2>
<p>Figure 1 describes the data flow of DistributedLog. Write traffic comes to <cite>Write Proxy</cite>
and the data is replicated in <cite>RF</cite> (replication factor) ways to <cite>BookKeeper</cite>. BookKeeper
stores the replicated data and keeps the data for a given retention period. The data is
read by <cite>Read Proxy</cite> and fanout to readers.</p>
<p>In such layered architecture, each layer has its own responsibilities and different resource
requirements. It makes the capacity and cost model much clear and users could scale
different layers independently.</p>
<div class="figure align-center">
<img alt="../images/costmodel.png" src="../images/costmodel.png" />
<p class="caption">Figure 1. DistributedLog Cost Model</p>
</div>
<div class="section" id="metrics">
<h3><a class="toc-backref" href="#id3">Metrics</a></h3>
<p>There are different metrics measuring the capability of a service instance in each layer
(e.g a <cite>write proxy</cite> node, a <cite>bookie</cite> storage node, a <cite>read proxy</cite> node and such). These metrics
can be <cite>rps</cite> (requests per second), <cite>bps</cite> (bits per second), <cite>number of streams</cite> that a instance
can support, and latency requirements. <cite>bps</cite> is the best and simple factor on measuring the
capability of current distributedlog architecture.</p>
</div>
<div class="section" id="write-proxy">
<h3><a class="toc-backref" href="#id4">Write Proxy</a></h3>
<p>Write Proxy (WP) is a stateless serving service that writes and replicates fan-in traffic into BookKeeper.
The capability of a write proxy instance is purely dominated by the <em>OUTBOUND</em> network bandwidth,
which is reflected as incoming <cite>Write Throughput</cite> and <cite>Replication Factor</cite>.</p>
<p>Calculating the capacity of Write Proxy (number of instances of write proxies) is pretty straightforward.
The formula is listed as below.</p>
<pre class="literal-block">
Number of Write Proxies = (Write Throughput) * (Replication Factor) / (Write Proxy Outbound Bandwidth)
</pre>
<p>As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC).</p>
<p>The cost estimation is also straightforward.</p>
<pre class="literal-block">
Bandwidth TCO ($/day/MB) = (Write Proxy TCO) / (Write Proxy Outbound Bandwidth)
Cost of write proxies = (Write Throughput) * (Replication Factor) / (Bandwidth TCO)
</pre>
<div class="section" id="cpus">
<h4><a class="toc-backref" href="#id5">CPUs</a></h4>
<p>DistributedLog is not CPU bound. You can run an instance with 8 or 12 cores just fine.</p>
</div>
<div class="section" id="memories">
<h4><a class="toc-backref" href="#id6">Memories</a></h4>
<p>There's a fair bit of caching. Consider running with at least 8GB of memory.</p>
</div>
<div class="section" id="disks">
<h4><a class="toc-backref" href="#id7">Disks</a></h4>
<p>This is a stateless process, disk performances are not relevant.</p>
</div>
<div class="section" id="network">
<h4><a class="toc-backref" href="#id8">Network</a></h4>
<p>Depending on your throughput, you might be better off running this with 10Gb NIC. In this scenario, you can easily achieves 350MBps of writes.</p>
</div>
</div>
<div class="section" id="bookkeeper">
<h3><a class="toc-backref" href="#id9">BookKeeper</a></h3>
<p>BookKeeper is the log segment store, which is a stateful service. There are two factors to measure the
capability of a Bookie instance: <cite>bandwidth</cite> and <cite>storage</cite>. The bandwidth is majorly dominated by the
outbound traffic from write proxy, which is <cite>(Write Throughput) * (Replication Factor)</cite>. The storage is
majorly dominated by the traffic and also <cite>Retention Period</cite>.</p>
<p>Calculating the capacity of BookKeeper (number of instances of bookies) is a bit more complicated than Write
Proxy. The total number of instances is the maximum number of the instances of bookies calculated using
<cite>bandwidth</cite> and <cite>storage</cite>.</p>
<pre class="literal-block">
Number of bookies based on bandwidth = (Write Throughput) * (Replication Factor) / (Bookie Inbound Bandwidth)
Number of bookies based on storage = (Write Throughput) * (Replication Factor) * (Replication Factor) / (Bookie disk space)
Number of bookies = maximum((number of bookies based on bandwidth), (number of bookies based on storage))
</pre>
<p>We should consider both bandwidth and storage when choosing the hardware for bookies. There are several rules to follow:
- A bookie should have multiple disks.
- The number of disks used as journal disks should have similar I/O bandwidth as its <em>INBOUND</em> network bandwidth. For example, if you plan to use a disk for journal which I/O bandwidth is around 100MBps, a 1Gb NIC is a better choice than 10Gb NIC.
- The number of disks used as ledger disks should be large enough to hold data if retention period is typical long.</p>
<p>The cost estimation is straightforward based on the number of bookies estimated above.</p>
<pre class="literal-block">
Cost of bookies = (Number of bookies) * (Bookie TCO)
</pre>
</div>
<div class="section" id="read-proxy">
<h3><a class="toc-backref" href="#id10">Read Proxy</a></h3>
<p>Similar as Write Proxy, Read Proxy is also dominated by <em>OUTBOUND</em> bandwidth, which is reflected as incoming <cite>Write Throughput</cite> and <cite>Fanout Factor</cite>.</p>
<p>Calculating the capacity of Read Proxy (number of instances of read proxies) is also pretty straightforward.
The formula is listed as below.</p>
<pre class="literal-block">
Number of Read Proxies = (Write Throughput) * (Fanout Factor) / (Read Proxy Outbound Bandwidth)
</pre>
<p>As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC).</p>
<p>The cost estimation is also straightforward.</p>
<pre class="literal-block">
Bandwidth TCO ($/day/MB) = (Read Proxy TCO) / (Read Proxy Outbound Bandwidth)
Cost of read proxies = (Write Throughput) * (Fanout Factor) / (Bandwidth TCO)
</pre>
</div>
</div>
</div>
</div>
</div>
</div>
<hr>
<div class="row">
<div class="col-xs-12">
<footer>
<p class="text-center">&copy; Copyright 2016
<a href="http://www.apache.org">The Apache Software Foundation.</a> All Rights Reserved.
</p>
<p class="text-center">
<a href="/docs/latest/feed.xml">RSS Feed</a>
</p>
</footer>
</div>
</div>
<!-- container div end -->
</div>
<script>
(function () {
'use strict';
anchors.options.placement = 'right';
anchors.add();
})();
</script>
</body>
</html>