| <!DOCTYPE html> |
| <html lang="en"> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| |
| <title>Apache DistributedLog</title> |
| <meta name="description" content="Apache DistributedLog is an high performance replicated log. |
| "> |
| |
| <link rel="stylesheet" href="/docs/latest/styles/site.css"> |
| <link rel="stylesheet" href="/docs/latest/css/theme.css"> |
| <!-- JQuery --> |
| <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script> |
| <script src="/docs/latest/js/bootstrap.min.js"></script> |
| <link rel="canonical" href="http://bookkeeper.apache.org/distributedlog/docs/latest/admin_guide/hardware.html" data-proofer-ignore> |
| <link rel="alternate" type="application/rss+xml" title="Apache DistributedLog" href="http://bookkeeper.apache.org/distributedlog/docs/latest/feed.xml"> |
| <!-- Font Awesome --> |
| <script src="//cdnjs.cloudflare.com/ajax/libs/anchor-js/3.2.0/anchor.min.js"></script> |
| <!-- Google Analytics --> |
| <script> |
| (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ |
| (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), |
| m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) |
| })(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); |
| |
| ga('create', 'UA-83870961-1', 'auto'); |
| ga('send', 'pageview'); |
| </script> |
| <!-- End Google Analytics --> |
| <link rel="shortcut icon" type="image/x-icon" href="/images/favicon.ico"> |
| </head> |
| |
| |
| <body role="document"> |
| |
| |
| <nav class="navbar navbar-default navbar-fixed-top"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <a href="/" class="navbar-brand" > |
| <img alt="Brand" style="height: 28px" src="/docs/latest/images/distributedlog_logo_navbar.png"> |
| </a> |
| <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| </div> |
| <div id="navbar" class="navbar-collapse collapse"> |
| <ul class="nav navbar-nav"> |
| <!-- Overview --> |
| <li><a href="/docs/latest/">V0.6.0</a></li> |
| <!-- Concepts --> |
| <li><a href="/docs/latest/basics/introduction">Concepts</a></li> |
| <!-- Quick Start --> |
| <li> |
| <a href="/docs/latest/start" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Start<span class="caret"></span></a> |
| <ul class="dropdown-menu" role="menu"> |
| |
| |
| <li> |
| <a href="/docs/latest/start/building.html"> |
| Build DistributedLog from Source |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/start/download.html"> |
| Download Releases |
| </a> |
| </li> |
| |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header"><strong>Quickstart</strong></li> |
| |
| |
| <li> |
| <a href="/docs/latest/start/quickstart.html"> |
| Setup & Run Example |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/basic-1.html"> |
| API - Write Records (via core library) |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/basic-2.html"> |
| API - Write Records (via write proxy) |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/basic-5.html"> |
| API - Read Records |
| </a> |
| </li> |
| |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header"><strong>Deployment</strong></li> |
| |
| |
| <li> |
| <a href="/docs/latest/deployment/cluster.html"> |
| Cluster Setup |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/deployment/global-cluster.html"> |
| Global Cluster Setup |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/deployment/kubernetes.html"> |
| Kubernetes |
| </a> |
| </li> |
| |
| </ul> |
| </li> |
| <!-- API --> |
| <li> |
| <a href="/docs/latest/start" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">API<span class="caret"></span></a> |
| <ul class="dropdown-menu" role="menu"> |
| <li><a href="/docs/latest/api/java">Java</a></li> |
| </ul> |
| </li> |
| <!-- User Guide --> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">User Guide<span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| |
| |
| <li> |
| <a href="/docs/latest/basics/introduction.html"> |
| Introduction |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/considerations/main.html"> |
| Considerations |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/architecture/main.html"> |
| Architecture |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/api/main.html"> |
| API |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/configuration/main.html"> |
| Configuration |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/design/main.html"> |
| Detail Design |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/globalreplicatedlog/main.html"> |
| Global Replicated Log |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/implementation/main.html"> |
| Implementation |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/user_guide/references/main.html"> |
| References |
| </a> |
| </li> |
| |
| </ul> |
| </li> |
| <!-- Admin Guide --> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Admin Guide<span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li><a href="/docs/latest/deployment/cluster">Cluster Setup</a></li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/operations.html"> |
| Operations |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/loadtest.html"> |
| Load Test |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/performance.html"> |
| Performance Tuning |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/hardware.html"> |
| Hardware |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/monitoring.html"> |
| Monitoring |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/zookeeper.html"> |
| ZooKeeper |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/admin_guide/bookkeeper.html"> |
| BookKeeper |
| </a> |
| </li> |
| |
| </ul> |
| </li> |
| <!-- Tutorials --> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Tutorials<span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li class="dropdown-header"><strong>Basic</strong></li> |
| <li><a href="/docs/latest/tutorials/basic-1">Write Records (via Core Library)</a></li> |
| <li><a href="/docs/latest/tutorials/basic-2">Write Records (via Write Proxy)</a></li> |
| <li><a href="/docs/latest/tutorials/basic-3">Write Records to multiple streams</a></li> |
| <li><a href="/docs/latest/tutorials/basic-4">Atomic Write Records</a></li> |
| <li><a href="/docs/latest/tutorials/basic-5">Tailing Read Records</a></li> |
| <li><a href="/docs/latest/tutorials/basic-6">Rewind Read Records</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header"><strong>Messaging</strong></li> |
| |
| |
| <li> |
| <a href="/docs/latest/tutorials/messaging-1.html"> |
| Write records to partitioned streams |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/messaging-2.html"> |
| Write records to multiple streams (load balancer) |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/messaging-3.html"> |
| At-least-once Processing |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/messaging-4.html"> |
| Exact-Once Processing |
| </a> |
| </li> |
| |
| <li> |
| <a href="/docs/latest/tutorials/messaging-5.html"> |
| Implement a kafka-like pub/sub system |
| </a> |
| </li> |
| |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header"><strong>Replicated State Machines</strong></li> |
| |
| |
| <li> |
| <a href="/docs/latest/tutorials/replicatedstatemachines.html"> |
| Build replicated state machines |
| </a> |
| </li> |
| |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header"><strong>Analytics</strong></li> |
| <li><a href="/docs/latest/tutorials/analytics-mapreduce">Process log streams using MapReduce</a></li> |
| </ul> |
| </li> |
| </ul> |
| </div><!--/.nav-collapse --> |
| </div> |
| </nav> |
| |
| |
| <link rel="stylesheet" href=""> |
| |
| |
| <div class="container" role="main"> |
| |
| <div class="row"> |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| |
| <div class="row"> |
| <!-- Sub Navigation --> |
| <div class="col-sm-3"> |
| <ul id="sub-nav"> |
| |
| |
| |
| |
| <li><a href="/docs/latest/admin_guide/main.html" class="">Admin Guide</a> |
| |
| <ul> |
| |
| |
| <li> |
| <a href="/docs/latest/deployment/cluster.html" class=""> |
| Cluster Setup |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/deployment/global-cluster.html" class=""> |
| Global Cluster Setup |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/operations.html" class=""> |
| Operations |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/loadtest.html" class=""> |
| Load Test |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/performance.html" class=""> |
| Performance Tuning |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/hardware.html" class="active"> |
| Hardware |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/monitoring.html" class=""> |
| Monitoring |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/zookeeper.html" class=""> |
| ZooKeeper |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| |
| <li> |
| <a href="/docs/latest/admin_guide/bookkeeper.html" class=""> |
| BookKeeper |
| </a> |
| |
| <ul> |
| |
| </ul> |
| |
| </li> |
| |
| </ul> |
| |
| </li> |
| |
| </ul> |
| </div> |
| <!-- Main --> |
| <div class="col-sm-9"> |
| <!-- Top anchor --> |
| <a href="#top"></a> |
| |
| <!-- Breadcrumbs above the main heading --> |
| <ol class="breadcrumb"> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li><a href="/docs/latest/admin_guide/main.html">Admin Guide</a></li> |
| |
| |
| <li class="active">Hardware</li> |
| </ol> |
| |
| <div class="text"> |
| <!-- Content --> |
| <div class="contents topic" id="hardware"> |
| <p class="topic-title first">Hardware</p> |
| <ul class="simple"> |
| <li><a class="reference internal" href="#id1" id="id2">Hardware</a><ul> |
| <li><a class="reference internal" href="#metrics" id="id3">Metrics</a></li> |
| <li><a class="reference internal" href="#write-proxy" id="id4">Write Proxy</a><ul> |
| <li><a class="reference internal" href="#cpus" id="id5">CPUs</a></li> |
| <li><a class="reference internal" href="#memories" id="id6">Memories</a></li> |
| <li><a class="reference internal" href="#disks" id="id7">Disks</a></li> |
| <li><a class="reference internal" href="#network" id="id8">Network</a></li> |
| </ul> |
| </li> |
| <li><a class="reference internal" href="#bookkeeper" id="id9">BookKeeper</a></li> |
| <li><a class="reference internal" href="#read-proxy" id="id10">Read Proxy</a></li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| <div class="section" id="id1"> |
| <h2><a class="toc-backref" href="#id2">Hardware</a></h2> |
| <p>Figure 1 describes the data flow of DistributedLog. Write traffic comes to <cite>Write Proxy</cite> |
| and the data is replicated in <cite>RF</cite> (replication factor) ways to <cite>BookKeeper</cite>. BookKeeper |
| stores the replicated data and keeps the data for a given retention period. The data is |
| read by <cite>Read Proxy</cite> and fanout to readers.</p> |
| <p>In such layered architecture, each layer has its own responsibilities and different resource |
| requirements. It makes the capacity and cost model much clear and users could scale |
| different layers independently.</p> |
| <div class="figure align-center"> |
| <img alt="../images/costmodel.png" src="../images/costmodel.png" /> |
| <p class="caption">Figure 1. DistributedLog Cost Model</p> |
| </div> |
| <div class="section" id="metrics"> |
| <h3><a class="toc-backref" href="#id3">Metrics</a></h3> |
| <p>There are different metrics measuring the capability of a service instance in each layer |
| (e.g a <cite>write proxy</cite> node, a <cite>bookie</cite> storage node, a <cite>read proxy</cite> node and such). These metrics |
| can be <cite>rps</cite> (requests per second), <cite>bps</cite> (bits per second), <cite>number of streams</cite> that a instance |
| can support, and latency requirements. <cite>bps</cite> is the best and simple factor on measuring the |
| capability of current distributedlog architecture.</p> |
| </div> |
| <div class="section" id="write-proxy"> |
| <h3><a class="toc-backref" href="#id4">Write Proxy</a></h3> |
| <p>Write Proxy (WP) is a stateless serving service that writes and replicates fan-in traffic into BookKeeper. |
| The capability of a write proxy instance is purely dominated by the <em>OUTBOUND</em> network bandwidth, |
| which is reflected as incoming <cite>Write Throughput</cite> and <cite>Replication Factor</cite>.</p> |
| <p>Calculating the capacity of Write Proxy (number of instances of write proxies) is pretty straightforward. |
| The formula is listed as below.</p> |
| <pre class="literal-block"> |
| Number of Write Proxies = (Write Throughput) * (Replication Factor) / (Write Proxy Outbound Bandwidth) |
| </pre> |
| <p>As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC).</p> |
| <p>The cost estimation is also straightforward.</p> |
| <pre class="literal-block"> |
| Bandwidth TCO ($/day/MB) = (Write Proxy TCO) / (Write Proxy Outbound Bandwidth) |
| Cost of write proxies = (Write Throughput) * (Replication Factor) / (Bandwidth TCO) |
| </pre> |
| <div class="section" id="cpus"> |
| <h4><a class="toc-backref" href="#id5">CPUs</a></h4> |
| <p>DistributedLog is not CPU bound. You can run an instance with 8 or 12 cores just fine.</p> |
| </div> |
| <div class="section" id="memories"> |
| <h4><a class="toc-backref" href="#id6">Memories</a></h4> |
| <p>There's a fair bit of caching. Consider running with at least 8GB of memory.</p> |
| </div> |
| <div class="section" id="disks"> |
| <h4><a class="toc-backref" href="#id7">Disks</a></h4> |
| <p>This is a stateless process, disk performances are not relevant.</p> |
| </div> |
| <div class="section" id="network"> |
| <h4><a class="toc-backref" href="#id8">Network</a></h4> |
| <p>Depending on your throughput, you might be better off running this with 10Gb NIC. In this scenario, you can easily achieves 350MBps of writes.</p> |
| </div> |
| </div> |
| <div class="section" id="bookkeeper"> |
| <h3><a class="toc-backref" href="#id9">BookKeeper</a></h3> |
| <p>BookKeeper is the log segment store, which is a stateful service. There are two factors to measure the |
| capability of a Bookie instance: <cite>bandwidth</cite> and <cite>storage</cite>. The bandwidth is majorly dominated by the |
| outbound traffic from write proxy, which is <cite>(Write Throughput) * (Replication Factor)</cite>. The storage is |
| majorly dominated by the traffic and also <cite>Retention Period</cite>.</p> |
| <p>Calculating the capacity of BookKeeper (number of instances of bookies) is a bit more complicated than Write |
| Proxy. The total number of instances is the maximum number of the instances of bookies calculated using |
| <cite>bandwidth</cite> and <cite>storage</cite>.</p> |
| <pre class="literal-block"> |
| Number of bookies based on bandwidth = (Write Throughput) * (Replication Factor) / (Bookie Inbound Bandwidth) |
| Number of bookies based on storage = (Write Throughput) * (Replication Factor) * (Replication Factor) / (Bookie disk space) |
| Number of bookies = maximum((number of bookies based on bandwidth), (number of bookies based on storage)) |
| </pre> |
| <p>We should consider both bandwidth and storage when choosing the hardware for bookies. There are several rules to follow: |
| - A bookie should have multiple disks. |
| - The number of disks used as journal disks should have similar I/O bandwidth as its <em>INBOUND</em> network bandwidth. For example, if you plan to use a disk for journal which I/O bandwidth is around 100MBps, a 1Gb NIC is a better choice than 10Gb NIC. |
| - The number of disks used as ledger disks should be large enough to hold data if retention period is typical long.</p> |
| <p>The cost estimation is straightforward based on the number of bookies estimated above.</p> |
| <pre class="literal-block"> |
| Cost of bookies = (Number of bookies) * (Bookie TCO) |
| </pre> |
| </div> |
| <div class="section" id="read-proxy"> |
| <h3><a class="toc-backref" href="#id10">Read Proxy</a></h3> |
| <p>Similar as Write Proxy, Read Proxy is also dominated by <em>OUTBOUND</em> bandwidth, which is reflected as incoming <cite>Write Throughput</cite> and <cite>Fanout Factor</cite>.</p> |
| <p>Calculating the capacity of Read Proxy (number of instances of read proxies) is also pretty straightforward. |
| The formula is listed as below.</p> |
| <pre class="literal-block"> |
| Number of Read Proxies = (Write Throughput) * (Fanout Factor) / (Read Proxy Outbound Bandwidth) |
| </pre> |
| <p>As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC).</p> |
| <p>The cost estimation is also straightforward.</p> |
| <pre class="literal-block"> |
| Bandwidth TCO ($/day/MB) = (Read Proxy TCO) / (Read Proxy Outbound Bandwidth) |
| Cost of read proxies = (Write Throughput) * (Fanout Factor) / (Bandwidth TCO) |
| </pre> |
| </div> |
| </div> |
| |
| |
| </div> |
| </div> |
| </div> |
| |
| |
| |
| </div> |
| |
| |
| <hr> |
| <div class="row"> |
| <div class="col-xs-12"> |
| <footer> |
| <p class="text-center">© Copyright 2016 |
| <a href="http://www.apache.org">The Apache Software Foundation.</a> All Rights Reserved. |
| </p> |
| <p class="text-center"> |
| <a href="/docs/latest/feed.xml">RSS Feed</a> |
| </p> |
| </footer> |
| </div> |
| </div> |
| <!-- container div end --> |
| </div> |
| |
| |
| <script> |
| (function () { |
| 'use strict'; |
| anchors.options.placement = 'right'; |
| anchors.add(); |
| })(); |
| </script> |
| |
| </body> |
| |
| </html> |