.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
..
.. http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.
Production Recommendations
----------------------------
The ``cassandra.yaml`` and ``jvm.options`` files have a number of notes and recommendations for production usage. This page
expands on some of the notes in these files with additional information.
Tokens
^^^^^^^
Using more than one token per node (referred to as vnodes) allows for more flexible expansion and more streaming peers when
bootstrapping new nodes into the cluster. This can limit the negative impact of streaming (I/O and CPU overhead)
as well as allow for incremental cluster expansion.
As a tradeoff, more tokens will lead to sharing data with more peers, which can result in decreased availability. To learn more about this we
recommend reading `this paper <https://github.com/jolynch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf>`_.
The number of tokens can be changed using the following setting:
``num_tokens: 16``
Here are the most common token counts with a brief explanation of when and why you would use each one.
+-------------+---------------------------------------------------------------------------------------------------+
| Token Count | Description |
+=============+===================================================================================================+
| 1           | Maximum availability, maximum cluster size, fewest peers,                                        |
| | but inflexible expansion. Must always |
| | double size of cluster to expand and remain balanced. |
+-------------+---------------------------------------------------------------------------------------------------+
| 4 | A healthy mix of elasticity and availability. Recommended for clusters which will eventually |
| | reach over 30 nodes. Requires adding approximately 20% more nodes to remain balanced. |
| | Shrinking a cluster may result in cluster imbalance. |
+-------------+---------------------------------------------------------------------------------------------------+
| 16          | Best for heavily elastic clusters which expand and shrink regularly, but may have availability    |
|             | issues with larger clusters. Not recommended for clusters over 50 nodes.                          |
+-------------+---------------------------------------------------------------------------------------------------+
In addition to setting the token count, it's extremely important that ``allocate_tokens_for_local_replication_factor`` be
set as well, to ensure even token allocation.
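For example, a node intended to hold 16 tokens in a cluster whose keyspaces use a replication factor of 3 might be
configured in ``cassandra.yaml`` as follows (the replication factor of 3 is an assumption here; use the value your
keyspaces actually use)::

    # Number of tokens (vnodes) this node will own
    num_tokens: 16
    # Allocate tokens to balance ownership for keyspaces with this replication factor
    allocate_tokens_for_local_replication_factor: 3
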
.. _read-ahead:
Read Ahead
^^^^^^^^^^^
Read ahead is an operating system feature that attempts to keep as much data loaded in the page cache as possible. The
goal is to decrease latency by using additional throughput on reads where the latency penalty is high due to seek times
on spinning disks. By leveraging read ahead, the OS can pull additional data into memory without the cost of additional
seeks. This works well when available RAM is greater than the size of the hot dataset, but can be problematic when the
hot dataset is much larger than available RAM. The benefit of read ahead decreases as the size of your hot dataset gets
bigger in proportion to available memory.
With small partitions (usually tables without clustering columns, but not limited to this case) and solid state drives, read
ahead can increase disk usage without any of the latency benefits, and in some cases can result in up to
a 5x latency and throughput performance penalty. Read-heavy, key/value tables with small (under 1KB) rows are especially
prone to this problem.
We recommend the following read ahead settings:
+----------------+-------------------------+
| Hardware       | Initial Recommendation  |
+================+=========================+
| Spinning Disks | 64KB                    |
+----------------+-------------------------+
| SSD            | 4KB                     |
+----------------+-------------------------+
Read ahead can be adjusted on Linux systems by using the ``blockdev`` tool.
For example, we can set the read ahead of ``/dev/sda1`` to 4KB by doing the following::
blockdev --setra 8 /dev/sda1
**Note**: blockdev accepts the number of 512-byte sectors to read ahead. The argument of 8 above is equivalent to 4KB.
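To verify the current value (also reported in 512-byte sectors), the same tool can be used::

    # Print the current read ahead value for /dev/sda1 in 512-byte sectors
    blockdev --getra /dev/sda1
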
Since each system is different, use the above recommendations as a starting point and tune based on your SLA and
throughput requirements. To understand how read ahead impacts disk resource usage, we recommend carefully reading through the
:ref:`troubleshooting <use-os-tools>` portion of the documentation.
Compression
^^^^^^^^^^^^
Compressed data is stored by compressing fixed-size byte buffers and writing the data to disk. The buffer size is
determined by the ``chunk_length_in_kb`` element in the compression map of the schema settings.
The default setting is 16KB starting with Cassandra 4.0.
Since the entire compressed buffer must be read off disk, using too high of a compression chunk length can lead to
significant overhead when reading small records. Combined with the default read ahead setting this can result in massive
read amplification for certain workloads.
LZ4Compressor is the default and recommended compression algorithm.
There is additional information on this topic on `The Last Pickle Blog <https://thelastpickle.com/blog/2018/08/08/compression_performance.html>`_.
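The chunk length is set per table in the schema. As a sketch, for a hypothetical table ``mykeyspace.mytable``::

    ALTER TABLE mykeyspace.mytable
        WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 16};

Note that this only affects SSTables written after the change; existing SSTables are rewritten as they are compacted,
or can be rewritten explicitly with ``nodetool upgradesstables -a``.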
Compaction
^^^^^^^^^^^^
There are different :ref:`compaction <compaction>` strategies available for different workloads.
We recommend reading up on the different strategies to understand which is the best for your environment. Different tables
may (and frequently do) use different compaction strategies on the same cluster.
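As an illustration, switching a hypothetical table to ``LeveledCompactionStrategy`` looks like the following (whether
LCS is appropriate depends on your workload)::

    ALTER TABLE mykeyspace.mytable
        WITH compaction = {'class': 'LeveledCompactionStrategy'};
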
Encryption
^^^^^^^^^^^
It is significantly easier to set up peer-to-peer and client-to-server encryption when initially provisioning your production
cluster than it is once the cluster is already serving production traffic. If you are planning on using network encryption
eventually (in any form), we recommend setting it up now. Changing these configurations down the line is not impossible,
but mistakes can result in downtime or data loss.
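As a minimal sketch, internode and client encryption are controlled by the ``server_encryption_options`` and
``client_encryption_options`` sections of ``cassandra.yaml``; the keystore and truststore paths and passwords below
are placeholders::

    server_encryption_options:
        # Encrypt all internode (peer-to-peer) traffic
        internode_encryption: all
        keystore: /path/to/keystore.jks
        keystore_password: keystore_password
        truststore: /path/to/truststore.jks
        truststore_password: truststore_password

    client_encryption_options:
        # Encrypt client-to-server traffic
        enabled: true
        keystore: /path/to/keystore.jks
        keystore_password: keystore_password
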
Ensure Keyspaces are Created with NetworkTopologyStrategy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Production clusters should never use SimpleStrategy. Production keyspaces should use the NetworkTopologyStrategy (NTS).
For example::
    CREATE KEYSPACE mykeyspace WITH replication =
        {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
NetworkTopologyStrategy allows Cassandra to take advantage of multiple racks and data centers.
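With multiple data centers, the replication factor is specified per data center; the data center names below are
hypothetical and must match the names your snitch reports::

    ALTER KEYSPACE mykeyspace WITH replication =
        {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};
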
Configure Racks and Snitch
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**Correctly configuring or changing racks after a cluster has been provisioned is an unsupported process**. Migrating from
a single rack to multiple racks is also unsupported and can result in data loss.
Using ``GossipingPropertyFileSnitch`` is the most flexible solution for on-premises or mixed cloud environments. ``Ec2Snitch``
is reliable for AWS EC2-only environments.
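As a sketch, a node using ``GossipingPropertyFileSnitch`` declares its own data center and rack in
``cassandra-rackdc.properties`` (the values below are placeholders)::

    dc=dc1
    rack=rack1

The snitch itself is selected in ``cassandra.yaml``::

    endpoint_snitch: GossipingPropertyFileSnitch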