blob: 1826e9ea8f7d5c3dce378d70796a162f55626e30 [file] [log] [blame]
= sstablescrub
Fix a broken sstable. The scrub process rewrites the sstable, skipping
any corrupted rows. Because these rows are lost, follow this process
with a repair.
ref: https://issues.apache.org/jira/browse/CASSANDRA-4321
Cassandra must be stopped before this tool is executed, or unexpected
results will occur. Note: the script does not verify that Cassandra is
stopped.
== Usage
sstablescrub <options> <keyspace> <table>
[cols=",",]
|===
|--debug |display stack traces
|-h,--help |display this help message
|-m,--manifest-check |only check and repair the leveled manifest,
without actually scrubbing the sstables
|-n,--no-validate |do not validate columns using column validator
|-r,--reinsert-overflowed-ttl |Rewrites rows with overflowed expiration
date affected by CASSANDRA-14092 with the maximum supported expiration
date of 2038-01-19T03:14:06+00:00. The rows are rewritten with the
original timestamp incremented by one millisecond to override/supersede
any potential tombstone that may have been generated during compaction
of the affected rows.
|-s,--skip-corrupted |skip corrupt rows in counter tables
|-v,--verbose |verbose output
|===
== Basic Scrub
The scrub without options will do a snapshot first, then write all
non-corrupted files to a new sstable.
Example:
....
sstablescrub keyspace1 standard1
Pre-scrub sstables snapshotted into snapshot pre-scrub-1534424070883
Scrubbing BigTableReader(path='/var/lib/cassandra/data/keyspace1/standard1-6365332094dd11e88f324f9c503e4753/mc-5-big-Data.db') (17.142MiB)
Scrub of BigTableReader(path='/var/lib/cassandra/data/keyspace1/standard1-6365332094dd11e88f324f9c503e4753/mc-5-big-Data.db') complete: 73367 rows in new sstable and 0 empty (tombstoned) rows dropped
Checking leveled manifest
....
== Scrub without Validation
ref: https://issues.apache.org/jira/browse/CASSANDRA-9406
Use the --no-validate option to retain data that may be misrepresented
(e.g., an integer stored in a long field) but not corrupt. This data
usually doesn not present any errors to the client.
Example:
....
sstablescrub --no-validate keyspace1 standard1
Pre-scrub sstables snapshotted into snapshot pre-scrub-1536243158517
Scrubbing BigTableReader(path='/var/lib/cassandra/data/keyspace1/standard1-bc9cf530b1da11e886c66d2c86545d91/mc-2-big-Data.db') (4.482MiB)
Scrub of BigTableReader(path='/var/lib/cassandra/data/keyspace1/standard1-bc9cf530b1da11e886c66d2c86545d91/mc-2-big-Data.db') complete; looks like all 0 rows were tombstoned
....
== Skip Corrupted Counter Tables
ref: https://issues.apache.org/jira/browse/CASSANDRA-5930
If counter tables are corrupted in a way that prevents sstablescrub from
completing, you can use the --skip-corrupted option to skip scrubbing
those counter tables. This workaround is not necessary in versions 2.0+.
Example:
....
sstablescrub --skip-corrupted keyspace1 counter1
....
== Dealing with Overflow Dates
ref: https://issues.apache.org/jira/browse/CASSANDRA-14092
Using the option --reinsert-overflowed-ttl allows a rewriting of rows
that had a max TTL going over the maximum (causing an overflow).
Example:
....
sstablescrub --reinsert-overflowed-ttl keyspace1 counter1
....
== Manifest Check
As of Cassandra version 2.0, this option is no longer relevant, since
level data was moved from a separate manifest into the sstable metadata.