blob: 162d2fec5e568f51214a7bf20bbda4cc596e37a7 [file] [log] [blame]
= Size Tiered Compaction Strategy
[[stcs]]
The basic idea of `SizeTieredCompactionStrategy` (STCS) is to merge
sstables of approximately the same size. All sstables are put in
different buckets depending on their size. An sstable is added to the
bucket if size of the sstable is within `bucket_low` and `bucket_high`
of the current average size of the sstables already in the bucket. This
will create several buckets and the most interesting of those buckets
will be compacted. The most interesting one is decided by figuring out
which bucket's sstables takes the most reads.
== Major compaction
When running a major compaction with STCS you will end up with two
sstables per data directory (one for repaired data and one for
unrepaired data). There is also an option (-s) to do a major compaction
that splits the output into several sstables. The sizes of the sstables
are approximately 50%, 25%, 12.5%... of the total size.
[[stcs_options]]
== STCS options
`min_sstable_size` (default: 50MB)::
Sstables smaller than this are put in the same bucket.
`bucket_low` (default: 0.5)::
How much smaller than the average size of a bucket a sstable should be
before not being included in the bucket. That is, if
`bucket_low * avg_bucket_size < sstable_size` (and the `bucket_high`
condition holds, see below), then the sstable is added to the bucket.
`bucket_high` (default: 1.5)::
How much bigger than the average size of a bucket a sstable should be
before not being included in the bucket. That is, if
`sstable_size < bucket_high * avg_bucket_size` (and the `bucket_low`
condition holds, see above), then the sstable is added to the bucket.
== Defragmentation
Defragmentation is done when many sstables are touched during a read.
The result of the read is put in to the memtable so that the next read
will not have to touch as many sstables. This can cause writes on a
read-only-cluster.