.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
..
.. http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.

.. highlight:: none

Compression
-----------

Cassandra offers operators the ability to configure compression on a per-table basis. Compression reduces the size of
data on disk by compressing each SSTable in user-configurable compression chunks (see ``chunk_length_in_kb``). As
Cassandra SSTables are immutable, the CPU cost of compression is only paid when the SSTable is written - subsequent
updates to data will land in different SSTables, so Cassandra does not need to decompress, overwrite, and recompress
data when ``UPDATE`` commands are issued. On reads, Cassandra locates the relevant compressed chunks on disk,
decompresses each full chunk, and then proceeds with the remainder of the read path (merging data from disk and
memtables, read repair, and so on).

Compression algorithms typically trade off between the following three areas:

- **Compression speed**: How quickly the compression algorithm compresses data. This is critical in the flush and
  compaction paths because data must be compressed before it is written to disk.
- **Decompression speed**: How quickly the compression algorithm decompresses data. This is critical in the read
  and compaction paths as data must be read off disk in a full chunk and decompressed before it can be returned.
- **Ratio**: By what ratio the uncompressed data is reduced. Cassandra typically measures this as the size of data
  on disk relative to the uncompressed size. For example, a ratio of ``0.5`` means that the data on disk is 50% the
  size of the uncompressed data. Cassandra exposes this ratio per table as the ``SSTable Compression Ratio`` field of
  ``nodetool tablestats``.

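
For example, the measured ratio for an existing table can be read from the ``nodetool tablestats`` output; the
keyspace, table, and value below are purely illustrative:

::

    $ nodetool tablestats keyspace.table | grep "SSTable Compression Ratio"
        SSTable Compression Ratio: 0.52
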
Cassandra offers five compression algorithms by default that make different tradeoffs in these areas. While
benchmarking compression algorithms depends on many factors (algorithm parameters such as compression level,
the compressibility of the input data, underlying processor class, etc.), the following table should help you pick
a starting point based on your application's requirements, with an extremely rough grading of the different choices
by their performance in these areas (A is relatively good, F is relatively bad):

+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+
| Compression Algorithm | Cassandra Class | Compression | Decompression | Ratio | C* Version |
+=============================================+=======================+=============+===============+=======+=============+
| `LZ4 <https://lz4.github.io/lz4/>`_ | ``LZ4Compressor`` | A+ | A+ | C+ | ``>=1.2.2`` |
+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+
| `LZ4HC <https://lz4.github.io/lz4/>`_ | ``LZ4Compressor`` | C+ | A+ | B+ | ``>= 3.6`` |
+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+
| `Zstd <https://facebook.github.io/zstd/>`_ | ``ZstdCompressor`` | A- | A- | A+ | ``>= 4.0`` |
+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+
| `Snappy <http://google.github.io/snappy/>`_ | ``SnappyCompressor`` | A- | A | C | ``>= 1.0`` |
+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+
| `Deflate (zlib) <https://zlib.net>`_ | ``DeflateCompressor`` | C | C | A | ``>= 1.0`` |
+---------------------------------------------+-----------------------+-------------+---------------+-------+-------------+

Generally speaking, for a performance-critical (latency or throughput) application ``LZ4`` is the right choice, as it
gets excellent ratio per CPU cycle spent. This is why it is the default choice in Cassandra.

For storage-critical applications (disk footprint), however, ``Zstd`` may be a better choice, as it can achieve
significantly better ratio than ``LZ4``.

``Snappy`` is kept for backwards compatibility and ``LZ4`` will typically be preferable.

``Deflate`` is kept for backwards compatibility and ``Zstd`` will typically be preferable.

Configuring Compression
^^^^^^^^^^^^^^^^^^^^^^^

Compression is configured on a per-table basis as an optional argument to ``CREATE TABLE`` or ``ALTER TABLE``. Three
options are available for all compressors:

- ``class`` (default: ``LZ4Compressor``): specifies the compression class to use. The two "fast"
  compressors are ``LZ4Compressor`` and ``SnappyCompressor``, and the two "good ratio" compressors are
  ``ZstdCompressor`` and ``DeflateCompressor``.
- ``chunk_length_in_kb`` (default: ``16KiB``): specifies the number of kilobytes of data per compression chunk. The
  main tradeoff here is that larger chunk sizes give compression algorithms more context and improve their ratio, but
  require reads to read and decompress more data off disk.
- ``crc_check_chance`` (default: ``1.0``): determines how likely Cassandra is to verify the checksum on each
  compression chunk during reads to protect against data corruption. Unless you have profiling data indicating that
  this is a performance problem, it is highly encouraged not to turn this off, as it is Cassandra's only protection
  against bitrot.

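
For illustration, a ``CREATE TABLE`` statement can spell out all three options explicitly; the values below are the
defaults described above, and the keyspace, table, and column names are placeholders:

::

    CREATE TABLE keyspace.table (id int PRIMARY KEY, payload text)
        WITH compression = {'class': 'LZ4Compressor',
                            'chunk_length_in_kb': 16,
                            'crc_check_chance': 1.0};
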
The ``LZ4Compressor`` supports the following additional options:

- ``lz4_compressor_type`` (default ``fast``): specifies whether to use the ``high`` (a.k.a. ``LZ4HC``) ratio version
  or the ``fast`` (a.k.a. ``LZ4``) version of ``LZ4``. The ``high`` mode supports a configurable level, which allows
  operators to tune the performance <-> ratio tradeoff via the ``lz4_high_compressor_level`` option. Note that in
  ``4.0`` and above it may be preferable to use the ``Zstd`` compressor.
- ``lz4_high_compressor_level`` (default ``9``): A number between ``1`` and ``17`` inclusive that represents how much
  CPU time to spend trying to get more compression ratio. Generally, lower levels are "faster" but get less ratio,
  and higher levels are slower but get more compression ratio.

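
As a sketch, a table could opt into the high-ratio ``LZ4HC`` mode as follows; the keyspace and table names are
placeholders:

::

    ALTER TABLE keyspace.table WITH compression = {'class': 'LZ4Compressor',
                                                   'lz4_compressor_type': 'high',
                                                   'lz4_high_compressor_level': 9};
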
The ``ZstdCompressor`` additionally supports the following option:

- ``compression_level`` (default ``3``): A number between ``-131072`` and ``22`` inclusive that represents how much
  CPU time to spend trying to get more compression ratio. The lower the level, the faster the speed (at the cost of
  ratio). Values from 20 to 22 are called "ultra levels" and should be used with caution, as they require more memory.
  The default of ``3`` is a good choice for competing with ``Deflate`` ratios and ``1`` is a good choice for competing
  with ``LZ4``.

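
For example, a storage-critical table might switch to ``Zstd`` at the default level; the keyspace and table names are
placeholders:

::

    ALTER TABLE keyspace.table WITH compression = {'class': 'ZstdCompressor',
                                                   'compression_level': 3};
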
Users can set compression using the following syntax:

::

    CREATE TABLE keyspace.table (id int PRIMARY KEY) WITH compression = {'class': 'LZ4Compressor'};

Or

::

    ALTER TABLE keyspace.table WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 64, 'crc_check_chance': 0.5};

Once enabled, compression can be disabled with ``ALTER TABLE`` setting ``enabled`` to ``false``:

::

    ALTER TABLE keyspace.table WITH compression = {'enabled':'false'};

Operators should be aware, however, that changing compression is not immediate. The data is compressed when the
SSTable is written, and as SSTables are immutable, the compression will not be modified until the table is compacted.
Upon issuing a change to the compression options via ``ALTER TABLE``, the existing SSTables will not be modified until
they are compacted - if an operator needs compression changes to take effect immediately, they can trigger an SSTable
rewrite using ``nodetool scrub`` or ``nodetool upgradesstables -a``, both of which will rebuild the SSTables on disk,
re-compressing the data in the process.

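
For example, after altering the compression options, either of the following commands will rewrite the existing
SSTables with the new settings; the keyspace and table names are placeholders:

::

    $ nodetool upgradesstables -a keyspace table
    $ nodetool scrub keyspace table
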
Benefits and Uses
^^^^^^^^^^^^^^^^^

Compression's primary benefit is that it reduces the amount of data written to disk. Not only does the reduced size
save in storage requirements, it often increases read and write throughput, as the CPU cost of compressing the data is
typically smaller than the cost of reading or writing the larger volume of uncompressed data from disk.

Compression is most useful for tables consisting of many rows where the rows are similar in nature. Tables containing
similar text columns (such as repeated JSON blobs) often compress very well. Tables containing data that has already
been compressed, or random data (e.g. benchmark datasets), do not typically compress well.

Operational Impact
^^^^^^^^^^^^^^^^^^

- Compression metadata is stored off-heap and scales with data on disk. This often requires 1-3GB of off-heap RAM per
  terabyte of data on disk, though the exact usage varies with ``chunk_length_in_kb`` and compression ratios.
- Streaming operations involve compressing and decompressing data on compressed tables - in some code paths (such as
  non-vnode bootstrap), the CPU overhead of compression can be a limiting factor.
- To prevent slow compressors (``Zstd``, ``Deflate``, ``LZ4HC``) from blocking flushes for too long, all three
  flush with the default fast ``LZ4`` compressor and then rely on normal compaction to re-compress the data into the
  desired compression strategy. See `CASSANDRA-15379 <https://issues.apache.org/jira/browse/CASSANDRA-15379>`_ for
  more details.
- The compression path checksums data to ensure correctness - while the traditional Cassandra read path does not have
  a way to ensure correctness of data on disk, compressed tables allow the user to set ``crc_check_chance`` (a float
  from 0.0 to 1.0) to allow Cassandra to probabilistically validate chunks on read to verify bits on disk are not
  corrupt.

Advanced Use
^^^^^^^^^^^^

Advanced users can provide their own compression class by implementing the interface at
``org.apache.cassandra.io.compress.ICompressor``.
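
Once such a class is available on the server's classpath, it can be referenced by its fully-qualified name in the
``class`` option. A minimal sketch, where ``com.example.compress.MyCompressor`` is a hypothetical class name:

::

    ALTER TABLE keyspace.table WITH compression = {'class': 'com.example.compress.MyCompressor'};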