blob: 6684946b47e9902daf5f401934dafac6136d95ab [file] [log] [blame]
= sstablepartitions
Identifies large partitions of SSTables and outputs the partition size in bytes, row count, cell count, and tombstone count.
You can supply any number of sstables file paths, or directories containing sstables. Each sstable will be analyzed separately.
If a metrics threshold such as `--min-size`, `--min-rows`, `--min-cells` or `--min-tombstones` is provided,
then the partition keys exceeding of the threshold will be printed in the output.
It also prints a summary of metrics for the table. The percentiles in the metrics are estimates,
while the min/max/count metrics are accurate.
The default output of this tool is meant to be read by human eyes.
Future versions might include small formatting changes or present new data that can fool scripts reading it.
Scripts or other automatic tools should use the `--csv` flag to produce machine-readable output.
Future versions will not change the format of the CSV output except for maybe adding new columns,
so a proper CSV parser consuming the output should keep working.
Cassandra doesn't need to be running before this tool is executed.
== Usage
sstablepartitions <options> <sstable files or directories>
[cols=",",]
|===
|-t, --min-size <arg> |Partition size threshold, expressed as either the number of bytes or a size with unit of the form 10KiB, 20MiB, 30GiB, etc.
|-w, --min-rows <arg> |Partition row count threshold.
|-c, --min-cells <arg> |Partition cell count threshold
|-o, --min-tombstones <arg> |Partition tombstone count threshold.
|-k, --key <arg> |Partition keys to include, instead of scanning all partitions.
|-x, --exclude-key <arg> |Partition keys to exclude.
|-r, --recursive |Scan for sstables recursively
|-b, --backups |Include backups present in data directories when scanning directories
|-s, --snaphsots |Include snapshots present in data directories when scanning directories
|-u, --current-timestamp <arg> |Timestamp (seconds since epoch, unit time) for TTL expired calculation.
|-y, --partitions-only |Only brief partition information. Exclude per-partition detailed row/cell/tombstone information from process and output.
|-m, --csv |Produced CSV output (machine readable)
|===
== Examples
=== Analyze partition statistics for a single SSTable
Use the path to the SSTable file as the only argument.
Example:
....
sstablepartitions data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db
Processing k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc) (1.368 GiB uncompressed, 534.979 MiB on disk)
Partition size Row count Cell count Tombstone count
~p50 767.519 KiB 770 1916 0
~p75 2.238 MiB 2299 5722 0
~p90 3.867 MiB 3311 9887 50
~p95 16.629 MiB 14237 42510 446
~p99 148.267 MiB 126934 379022 1331
~p999 368.936 MiB 315852 943127 2759
min 56.854 KiB 100 150 0
max 356.067 MiB 310706 932118 2450
count 210
....
=== Analyze partition statistics for all SSTables in a directory
Use the path to the SSTables directory as the only argument.
Example:
....
sstablepartitions data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb
Processing k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc) (1.368 GiB uncompressed, 534.979 MiB on disk)
Partition size Row count Cell count Tombstone count
~p50 767.519 KiB 770 1916 0
~p75 2.238 MiB 2299 5722 0
~p90 3.867 MiB 3311 9887 50
~p95 16.629 MiB 14237 42510 446
~p99 148.267 MiB 126934 379022 1331
~p999 368.936 MiB 315852 943127 2759
min 56.854 KiB 100 150 0
max 356.067 MiB 310706 932118 2450
count 210
Processing k.t-d7be5e90e90111ed8b54efe3c39cb0bb #9 (big-nc) (457.540 MiB uncompressed, 174.880 MiB on disk)
Partition size Row count Cell count Tombstone count
~p50 1.865 MiB 1597 4768 0
~p75 13.858 MiB 14237 42510 0
~p90 28.735 MiB 29521 73457 50
~p95 34.482 MiB 29521 88148 8239
~p99 49.654 MiB 42510 126934 14237
~p999 49.654 MiB 42510 126934 14237
min 47.272 KiB 100 150 0
max 45.133 MiB 39429 118287 13030
count 57
....
=== Output only partitions over 100MiB in size
Use the `--min-size` option to specify the minimum size a partition must have to be included in the output.
Example:
....
sstablepartitions data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db --min-size 100MiB
Processing k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc) (1.368 GiB uncompressed, 534.979 MiB on disk)
Partition: '13' (0000000d) live, size: 105.056 MiB, rows: 91490, cells: 274470, tombstones: 50 (row:50, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '1' (00000001) live, size: 127.241 MiB, rows: 111065, cells: 333195, tombstones: 50 (row:50, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '8' (00000008) live, size: 356.067 MiB, rows: 310706, cells: 932118, tombstones: 0 (row:0, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '2' (00000002) live, size: 213.341 MiB, rows: 186582, cells: 559125, tombstones: 978 (row:978, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Summary of k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc):
File: /Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db
4 partitions match
Keys: 13 1 8 2
Partition size Row count Cell count Tombstone count
~p50 767.519 KiB 770 1916 0
~p75 2.238 MiB 2299 5722 0
~p90 3.867 MiB 3311 9887 50
~p95 16.629 MiB 14237 42510 446
~p99 148.267 MiB 126934 379022 1331
~p999 368.936 MiB 315852 943127 2759
min 56.854 KiB 100 150 0
max 356.067 MiB 310706 932118 2450
count 210
....
=== Output only partitions with more than 1000 tombstones
Use the `--min-tombstones` option to specify the minimum number of tombstones a partition must have to be included in the output.
Example:
....
sstablepartitions data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db --min-tombstones 1000
Processing k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc) (1.368 GiB uncompressed, 534.979 MiB on disk)
Partition: '55' (00000037) live, size: 1.290 MiB, rows: 2317, cells: 3474, tombstones: 1159 (row:1159, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '28' (0000001c) live, size: 1.198 MiB, rows: 2099, cells: 3147, tombstones: 1050 (row:1050, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '89' (00000059) live, size: 1.346 MiB, rows: 2226, cells: 3339, tombstones: 1113 (row:1113, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Partition: '21' (00000015) live, size: 3.853 MiB, rows: 4900, cells: 9927, tombstones: 2450 (row:2450, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Summary of k.t-d7be5e90e90111ed8b54efe3c39cb0bb #8 (big-nc):
File: /Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db
4 partitions match
Keys: 55 28 89 21
Partition size Row count Cell count Tombstone count
~p50 767.519 KiB 770 1916 0
~p75 2.238 MiB 2299 5722 0
~p90 3.867 MiB 3311 9887 50
~p95 16.629 MiB 14237 42510 446
~p99 148.267 MiB 126934 379022 1331
~p999 368.936 MiB 315852 943127 2759
min 56.854 KiB 100 150 0
max 356.067 MiB 310706 932118 2450
count 210
....
=== Output CSV machine-readable output
Use the `--csv` option to output a CSV machine-readable output, combined with any threshold value.
Example:
....
sstablepartitions data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db --min-size 100MiB --csv
key,keyBinary,live,offset,size,rowCount,cellCount,tombstoneCount,rowTombstoneCount,rangeTombstoneCount,complexTombstoneCount,cellTombstoneCount,rowTtlExpired,cellTtlExpired,directory,keyspace,table,index,snapshot,backup,generation,format,version
"13",0000000d,true,186403543,110158965,91490,274470,50,50,0,0,0,0,0,/Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db,k,t,,,,8,big,nc
"1",00000001,true,325141542,133422183,111065,333195,50,50,0,0,0,0,0,/Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db,k,t,,,,8,big,nc
"8",00000008,true,477133752,373362819,310706,932118,0,0,0,0,0,0,0,/Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db,k,t,,,,8,big,nc
"2",00000002,true,851841363,223704192,186582,559125,978,978,0,0,0,0,0,/Users/adelapena/src/cassandra/trunk/data/data/k/t-d7be5e90e90111ed8b54efe3c39cb0bb/nc-8-big-Data.db,k,t,,,,8,big,nc
....