| --- |
| layout: docpage |
| |
| title: "Documentation" |
| |
| is_homepage: false |
| is_sphinx_doc: true |
| |
| doc-parent: "Operating Cassandra" |
| |
| doc-title: "Hardware Choices" |
| doc-header-links: ' |
| <link rel="top" title="Apache Cassandra Documentation v3.11.5" href="../index.html"/> |
| <link rel="up" title="Operating Cassandra" href="index.html"/> |
| <link rel="next" title="Cassandra Tools" href="../tools/index.html"/> |
| <link rel="prev" title="Security" href="security.html"/> |
| ' |
| doc-search-path: "../search.html" |
| |
| extra-footer: ' |
| <script type="text/javascript"> |
| var DOCUMENTATION_OPTIONS = { |
| URL_ROOT: "", |
| VERSION: "", |
| COLLAPSE_INDEX: false, |
| FILE_SUFFIX: ".html", |
| HAS_SOURCE: false, |
| SOURCELINK_SUFFIX: ".txt" |
| }; |
| </script> |
| ' |
| |
| --- |
| <div class="container-fluid"> |
| <div class="row"> |
| <div class="col-md-3"> |
| <div class="doc-navigation"> |
| <div class="doc-menu" role="navigation"> |
| <div class="navbar-header"> |
| <button type="button" class="pull-left navbar-toggle" data-toggle="collapse" data-target=".sidebar-navbar-collapse"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| </div> |
| <div class="navbar-collapse collapse sidebar-navbar-collapse"> |
| <form id="doc-search-form" class="navbar-form" action="../search.html" method="get" role="search"> |
| <div class="form-group"> |
| <input type="text" size="30" class="form-control input-sm" name="q" placeholder="Search docs"> |
| <input type="hidden" name="check_keywords" value="yes" /> |
| <input type="hidden" name="area" value="default" /> |
| </div> |
| </form> |
| |
| |
| |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../getting_started/index.html">Getting Started</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../architecture/index.html">Architecture</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../data_modeling/index.html">Data Modeling</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../cql/index.html">The Cassandra Query Language (CQL)</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../configuration/index.html">Configuring Cassandra</a></li> |
| <li class="toctree-l1 current"><a class="reference internal" href="index.html">Operating Cassandra</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="snitch.html">Snitch</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="topo_changes.html">Adding, replacing, moving and removing nodes</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="repair.html">Repair</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="read_repair.html">Read repair</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="hints.html">Hints</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="compaction.html">Compaction</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="bloom_filters.html">Bloom Filters</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="compression.html">Compression</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="cdc.html">Change Data Capture</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="backups.html">Backups</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="bulk_loading.html">Bulk Loading</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="metrics.html">Monitoring</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="security.html">Security</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">Hardware Choices</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="#cpu">CPU</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#memory">Memory</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#disks">Disks</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#common-cloud-choices">Common Cloud Choices</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li class="toctree-l1"><a class="reference internal" href="../tools/index.html">Cassandra Tools</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../troubleshooting/index.html">Troubleshooting</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../development/index.html">Cassandra Development</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../faq/index.html">Frequently Asked Questions</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../bugs.html">Reporting Bugs and Contributing</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../contactus.html">Contact us</a></li> |
| </ul> |
| |
| |
| |
| </div><!--/.nav-collapse --> |
| </div> |
| </div> |
| </div> |
| <div class="col-md-8"> |
| <div class="content doc-content"> |
| <div class="content-container"> |
| |
| <div class="section" id="hardware-choices"> |
| <h1>Hardware Choices<a class="headerlink" href="#hardware-choices" title="Permalink to this headline">¶</a></h1> |
| <p>Like most databases, Cassandra delivers higher throughput with more CPU cores, more RAM, and faster disks. While Cassandra can |
| be made to run on small servers for testing or development environments (including Raspberry Pis), a minimal production |
| server requires at least 2 cores and at least 8GB of RAM. Typical production servers have 8 or more cores and at least |
| 32GB of RAM.</p> |
| <div class="section" id="cpu"> |
| <h2>CPU<a class="headerlink" href="#cpu" title="Permalink to this headline">¶</a></h2> |
| <p>Cassandra is highly concurrent, handling many simultaneous requests (both read and write) using multiple threads running |
| on as many CPU cores as possible. The Cassandra write path tends to be heavily optimized (writing to the commitlog and |
| then inserting the data into the memtable), so writes in particular tend to be CPU bound. Consequently, adding |
| CPU cores often increases the throughput of both reads and writes.</p> |
| </div> |
| <div class="section" id="memory"> |
| <h2>Memory<a class="headerlink" href="#memory" title="Permalink to this headline">¶</a></h2> |
| <p>Cassandra runs within a Java VM, which pre-allocates a fixed-size heap (set with Java’s <code class="docutils literal notranslate"><span class="pre">-Xmx</span></code> option). In addition to |
| the heap, Cassandra uses significant amounts of off-heap RAM for compression metadata, bloom filters, the row, key, and |
| counter caches, and an in-process page cache. Finally, Cassandra takes advantage of the operating system’s page |
| cache, which keeps recently accessed portions of files in RAM for rapid reuse.</p> |
| <p>For optimal performance, operators should benchmark and tune their clusters based on their individual workload. However, |
| basic guidelines suggest:</p> |
| <ul class="simple"> |
| <li>ECC RAM should always be used, as Cassandra has few internal safeguards against bit-level corruption</li> |
| <li>The Cassandra heap should be no less than 2GB and no more than 50% of system RAM</li> |
| <li>For heaps smaller than 12GB, consider ParNew/ConcurrentMarkSweep garbage collection</li> |
| <li>For heaps larger than 12GB, consider G1GC</li> |
| </ul> |
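| <p>As a rough illustration, the heap guidelines above can be expressed as a small Python sketch. The function names are |
| hypothetical and not part of Cassandra, and treating a heap of exactly 12GB as a ParNew/ConcurrentMarkSweep case is an |
| assumption made here, since the guidelines do not cover that boundary. It is a starting point for benchmarking, not a |
| substitute for it.</p> |

```python
MIN_HEAP_GB = 2          # guideline: heap no smaller than 2GB
MAX_HEAP_FRACTION = 0.5  # guideline: heap no larger than 50% of system RAM
GC_PIVOT_GB = 12         # guideline: CMS below this heap size, G1GC above it


def heap_bounds_gb(system_ram_gb):
    """Return the (lower, upper) heap size range in GB for a given amount of RAM."""
    upper = system_ram_gb * MAX_HEAP_FRACTION
    if upper < MIN_HEAP_GB:
        # Half of RAM is below the 2GB floor, so both guidelines cannot be met.
        raise ValueError("system RAM is too small to satisfy both heap guidelines")
    return (MIN_HEAP_GB, upper)


def suggested_collector(heap_gb):
    """Return the collector the guidelines suggest considering for a heap size.

    Assumption: a heap of exactly 12GB is treated as the smaller-heap case.
    """
    if heap_gb > GC_PIVOT_GB:
        return "G1GC"
    return "ParNew/ConcurrentMarkSweep"
```

| <p>For a typical 32GB production server, <code class="docutils literal notranslate"><span class="pre">heap_bounds_gb(32)</span></code> gives a range of 2GB to 16GB, and a 16GB heap falls on |
| the G1GC side of the 12GB threshold.</p> |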
| </div> |
| <div class="section" id="disks"> |
| <h2>Disks<a class="headerlink" href="#disks" title="Permalink to this headline">¶</a></h2> |
| <p>Cassandra persists data to disk for two very different purposes. The first is to the commitlog when a new write is made |
| so that it can be replayed after a crash or system shutdown. The second is to the data directory when thresholds are |
| exceeded and memtables are flushed to disk as SSTables.</p> |
| <p>Commitlogs receive every write made to a Cassandra node and have the potential to block client operations, but they are |
| only ever read on node start-up. SSTable (data file) writes, on the other hand, occur asynchronously, but SSTables are read to |
| satisfy client look-ups. SSTables are also periodically merged and rewritten in a process called compaction. The commitlog |
| directory holds data that has not yet been permanently saved to the SSTable data directories; it is |
| purged periodically once that data has been flushed to the SSTable data files.</p> |
| <p>Cassandra performs very well on both spinning hard drives and solid state disks. In both cases, Cassandra’s sorted |
| immutable SSTables allow for linear reads, few seeks, and few overwrites, maximizing throughput for HDDs and lifespan of |
| SSDs by avoiding write amplification. However, when using spinning disks, it’s important that the commitlog |
| (<code class="docutils literal notranslate"><span class="pre">commitlog_directory</span></code>) be on one physical disk (not simply a partition, but a physical disk), and the data files |
| (<code class="docutils literal notranslate"><span class="pre">data_file_directories</span></code>) be set to a separate physical disk. By separating the commitlog from the data directory, |
| writes can benefit from sequential appends to the commitlog without having to seek around the platter as reads request |
| data from various SSTables on disk.</p> |
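| <p>One way to catch an obvious violation of this separation is to compare the device IDs of the configured directories. |
| The Python sketch below is a minimal example under stated assumptions: the function name is hypothetical, and the paths |
| are expected to be the values of <code class="docutils literal notranslate"><span class="pre">commitlog_directory</span></code> and <code class="docutils literal notranslate"><span class="pre">data_file_directories</span></code> read from your configuration. |
| A shared device ID means the directories are on the same filesystem. Differing device IDs do not prove separate |
| physical disks, because two partitions of one disk also report different IDs.</p> |

```python
import os


def data_dirs_sharing_commitlog_device(commitlog_directory, data_file_directories):
    """Return the data directories that live on the same device as the commitlog.

    Compares st_dev from os.stat(), which identifies the filesystem device.
    An empty result is necessary but not sufficient for physical separation:
    separate partitions on a single disk also have different st_dev values.
    """
    commitlog_dev = os.stat(commitlog_directory).st_dev
    return [
        data_dir
        for data_dir in data_file_directories
        if os.stat(data_dir).st_dev == commitlog_dev
    ]
```

| <p>Any directory returned by this check shares a filesystem with the commitlog and is worth moving when running on |
| spinning disks.</p> |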
| <p>In most cases, Cassandra is designed to provide redundancy via multiple independent, inexpensive servers. For this |
| reason, using NFS or a SAN for data directories is an antipattern and should typically be avoided. Similarly, servers |
| with multiple disks are often better served by RAID0 or JBOD than by RAID1 or RAID5. Replication provided by |
| Cassandra removes the need for replication at the disk layer, so it’s typically recommended that operators take |
| advantage of the additional throughput of RAID0 rather than protecting against failures with RAID1 or RAID5.</p> |
| </div> |
| <div class="section" id="common-cloud-choices"> |
| <h2>Common Cloud Choices<a class="headerlink" href="#common-cloud-choices" title="Permalink to this headline">¶</a></h2> |
| <p>Many large users of Cassandra run in various clouds, including AWS, Azure, and GCE, and Cassandra will happily run in any |
| of these environments. Users should choose hardware similar to what they would need on physical servers. In EC2, popular |
| options include:</p> |
| <ul class="simple"> |
| <li>m1.xlarge instances, which provide 1.6TB of local ephemeral spinning storage and sufficient RAM to run moderate |
| workloads</li> |
| <li>i2 instances, which provide both a high RAM:CPU ratio and local ephemeral SSDs</li> |
| <li>m4.2xlarge / c4.4xlarge instances, which provide modern CPUs, enhanced networking and work well with EBS GP2 (SSD) |
| storage</li> |
| </ul> |
| <p>Generally, disk and network performance increases with instance size and generation, so newer generations of instances |
| and larger instance types within each family often perform better than their smaller or older alternatives.</p> |
| </div> |
| </div> |
| |
| |
| |
| |
| <div class="doc-prev-next-links" role="navigation" aria-label="footer navigation"> |
| |
| <a href="../tools/index.html" class="btn btn-default pull-right " role="button" title="Cassandra Tools" accesskey="n">Next <span class="glyphicon glyphicon-circle-arrow-right" aria-hidden="true"></span></a> |
| |
| |
| <a href="security.html" class="btn btn-default" role="button" title="Security" accesskey="p"><span class="glyphicon glyphicon-circle-arrow-left" aria-hidden="true"></span> Previous</a> |
| |
| </div> |
| |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |