= Use Nodetool

Cassandra's `nodetool` allows you to narrow problems from the cluster
down to a particular node and gives a lot of insight into the state of
the Cassandra process itself. There are dozens of useful commands (see
`nodetool help` for the full list), but some of the most useful ones for
troubleshooting are covered below.

[[nodetool-status]]
== Cluster Status

You can use `nodetool status` to assess the status of the cluster:

[source, bash]
----
$ nodetool status <optional keyspace>

Datacenter: dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.1.1  4.69 GiB  1       100.0%            35ea8c9f-b7a2-40a7-b9c5-0ee8b91fdd0e  r1
UN  127.0.1.2  4.71 GiB  1       100.0%            752e278f-b7c5-4f58-974b-9328455af73f  r2
UN  127.0.1.3  4.69 GiB  1       100.0%            9dc1a293-2cc0-40fa-a6fd-9e6054da04a7  r3
----

In this case we can see that we have three nodes in one datacenter with
about 4.7 GiB of data each and they are all "up". The up/down status of
a node is independently determined by every node in the cluster, so you
may have to run `nodetool status` on multiple nodes in a cluster to see
the full view.
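
If you want the view from several nodes at once, a small shell loop
works well. This is only a sketch: it assumes passwordless SSH access
and hypothetical hostnames (`cassandra1` through `cassandra3`), so adapt
it to however you reach your nodes:

[source, bash]
----
# Ask each node for its own view of the cluster; disagreements between
# nodes can point at gossip or network problems.
for host in cassandra1 cassandra2 cassandra3; do
    echo "=== ${host} ==="
    ssh "${host}" nodetool status
done
----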

You can use `nodetool status` plus a little grep to see which nodes are
down:

[source, bash]
----
$ nodetool status | grep -v '^UN'
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
DN  127.0.0.5  105.73 KiB  1       33.3%             df303ac7-61de-46e9-ac79-6e630115fd75  r1
----

In this case there are two datacenters and there is one node down in
datacenter `dc2` and rack `r1`. This may indicate an issue on
`127.0.0.5` warranting investigation.
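
If you just want the addresses of the down nodes, for example to feed
into other tooling, a little awk over the same output works. This is
only a sketch based on the status line format shown above (the first
column is `D` for down followed by the state letter):

[source, bash]
----
# Print the address and rack of every node reported as down.
nodetool status | awk '/^D[NLJM]/ {print $2, $NF}'
----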

[[nodetool-proxyhistograms]]
== Coordinator Query Latency

You can view the latency distributions of coordinator reads and writes
to help narrow down latency issues using `nodetool proxyhistograms`:

[source, bash]
----
$ nodetool proxyhistograms
Percentile       Read Latency      Write Latency      Range Latency   CAS Read Latency  CAS Write Latency  View Write Latency
                     (micros)           (micros)           (micros)           (micros)           (micros)             (micros)
50%                    454.83             219.34               0.00               0.00               0.00                 0.00
75%                    545.79             263.21               0.00               0.00               0.00                 0.00
95%                    654.95             315.85               0.00               0.00               0.00                 0.00
98%                    785.94             379.02               0.00               0.00               0.00                 0.00
99%                   3379.39            2346.80               0.00               0.00               0.00                 0.00
Min                     42.51             105.78               0.00               0.00               0.00                 0.00
Max                  25109.16           43388.63               0.00               0.00               0.00                 0.00
----

Here you can see the full latency distribution of reads, writes, range
requests (e.g. `select * from keyspace.table`), CAS read (compare phase
of CAS) and CAS write (set phase of compare and set). These can be
useful for narrowing down high-level latency problems. For example, if a
client had a 20 millisecond timeout on its reads, it might experience
the occasional timeout from this node (the maximum read latency is ~25
milliseconds) but on less than 1% of requests, since the 99th percentile
read latency is ~3.4 milliseconds, well under 20 milliseconds.
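
If you want to see how coordinator latency evolves while you reproduce a
problem, a simple loop that samples the histograms periodically can
help; this is just a sketch, adjust the interval to taste:

[source, bash]
----
# Sample coordinator latencies every 10 seconds while reproducing an issue.
while true; do
    date
    nodetool proxyhistograms
    sleep 10
done
----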

[[nodetool-tablehistograms]]
== Local Query Latency

If you know which table is having latency/error issues, you can use
`nodetool tablehistograms` to get a better idea of what is happening
locally on a node:

[source, bash]
----
$ nodetool tablehistograms keyspace table
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0.00             73.46            182.79             17084               103
75%             1.00             88.15            315.85             17084               103
95%             2.00            126.93            545.79             17084               103
98%             2.00            152.32            654.95             17084               103
99%             2.00            182.79            785.94             17084               103
Min             0.00             42.51             24.60             14238                87
Max             2.00          12108.97          17436.92             17084               103
----

This shows you percentile breakdowns of several particularly critical
metrics.

The first column shows how many sstables were read per logical read. A
very high number here indicates that you may have chosen the wrong
compaction strategy: for update-heavy workloads,
`SizeTieredCompactionStrategy` typically touches many more sstables per
read than `LeveledCompactionStrategy` does.
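
If you want to confirm which compaction strategy the table is actually
using, you can inspect the schema. A quick sketch using `cqlsh` (the
keyspace/table names and any host or credential flags are placeholders):

[source, bash]
----
# Dump the table definition and pull out its compaction settings.
cqlsh -e "DESCRIBE TABLE keyspace.table" | grep compaction
----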

The second column shows a breakdown of _local_ write latency. In this
case we see that while the p50 is quite good at 73 microseconds, the
maximum latency is quite slow at 12 milliseconds. High maximum write
latencies often indicate a slow commitlog volume (slow to fsync) or
large writes that quickly saturate commitlog segments.
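
If you suspect the commitlog, it can be worth double-checking how it is
configured. A sketch that assumes the config lives at the common default
location `/etc/cassandra/cassandra.yaml` (yours may differ):

[source, bash]
----
# Show the commitlog sync mode, sync period and segment sizing options.
grep -E '^\s*commitlog_' /etc/cassandra/cassandra.yaml
----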

The third column shows a breakdown of _local_ read latency. We can see
that local Cassandra reads are (as expected) slower than local writes,
and that read speed correlates strongly with the number of sstables read
per read.

The fourth and fifth columns show distributions of partition size and
cell count per partition. These are useful for determining if the table
has on average skinny or wide partitions and can help you isolate bad
data patterns. For example, if you have a single cell that is 2
megabytes, that is probably going to cause some heap pressure when it is
read.
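
For a quick look at partition sizes without the percentile breakdown,
`nodetool tablestats` reports the compacted partition minimum, mean and
maximum bytes for a table; a small sketch to pull out just those lines:

[source, bash]
----
# Show the compacted partition size summary for a single table.
nodetool tablestats keyspace.table | grep -i 'compacted partition'
----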

[[nodetool-tpstats]]
== Threadpool State

You can use `nodetool tpstats` to view the current outstanding requests
on a particular node. This is useful for trying to find out which
resource (read threads, write threads, compaction, request response
threads) the Cassandra process lacks. For example:

[source, bash]
----
$ nodetool tpstats
Pool Name                         Active   Pending  Completed   Blocked  All time blocked
ReadStage                              2         0         12         0                 0
MiscStage                              0         0          0         0                 0
CompactionExecutor                     0         0       1940         0                 0
MutationStage                          0         0          0         0                 0
GossipStage                            0         0      10293         0                 0
Repair-Task                            0         0          0         0                 0
RequestResponseStage                   0         0         16         0                 0
ReadRepairStage                        0         0          0         0                 0
CounterMutationStage                   0         0          0         0                 0
MemtablePostFlush                      0         0         83         0                 0
ValidationExecutor                     0         0          0         0                 0
MemtableFlushWriter                    0         0         30         0                 0
ViewMutationStage                      0         0          0         0                 0
CacheCleanupExecutor                   0         0          0         0                 0
MemtableReclaimMemory                  0         0         30         0                 0
PendingRangeCalculator                 0         0         11         0                 0
SecondaryIndexManagement               0         0          0         0                 0
HintsDispatcher                        0         0          0         0                 0
Native-Transport-Requests              0         0        192         0                 0
MigrationStage                         0         0         14         0                 0
PerDiskMemtableFlushWriter_0           0         0         30         0                 0
Sampler                                0         0          0         0                 0
ViewBuildExecutor                      0         0          0         0                 0
InternalResponseStage                  0         0          0         0                 0
AntiEntropyStage                       0         0          0         0                 0

Message type        Dropped  Latency waiting in queue (micros)
                                  50%      95%      99%      Max
READ                      0       N/A      N/A      N/A      N/A
RANGE_SLICE               0      0.00     0.00     0.00     0.00
_TRACE                    0       N/A      N/A      N/A      N/A
HINT                      0       N/A      N/A      N/A      N/A
MUTATION                  0       N/A      N/A      N/A      N/A
COUNTER_MUTATION          0       N/A      N/A      N/A      N/A
BATCH_STORE               0       N/A      N/A      N/A      N/A
BATCH_REMOVE              0       N/A      N/A      N/A      N/A
REQUEST_RESPONSE          0      0.00     0.00     0.00     0.00
PAGED_RANGE               0       N/A      N/A      N/A      N/A
READ_REPAIR               0       N/A      N/A      N/A      N/A
----

This command shows you all kinds of interesting statistics. The first
section shows a detailed breakdown of threadpools for each Cassandra
stage, including how many threads are currently executing (Active) and
how many are waiting to run (Pending). Typically, if you see pending
executions in a particular threadpool, that indicates a problem
localized to that type of operation. For example, if the
`RequestResponseStage` queue is backing up, that means that the
coordinators are waiting on a lot of downstream replica requests, which
may indicate a lack of token awareness or very high consistency levels
being used on read requests (for example, reading at `ALL` ties up RF
`RequestResponseStage` threads whereas `LOCAL_ONE` only uses a single
thread in the `ReadStage` threadpool). On the other hand, if you see a
lot of pending compactions, that may indicate that your compaction
threads cannot keep up with the volume of writes and you may need to
tune either the compaction strategy or the `concurrent_compactors` or
`compaction_throughput` options.
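
When you are hunting a transient backlog it helps to watch the pools
over time and only surface the ones with queued work. This is just a
sketch; the column positions can shift slightly between Cassandra
versions, so check them against your own `tpstats` output:

[source, bash]
----
# Every 5 seconds, print any threadpool with a non-zero Pending count.
while true; do
    date
    nodetool tpstats | awk 'NR > 1 && $3 ~ /^[0-9]+$/ && $3 > 0 {print $1, "pending:", $3}'
    sleep 5
done
----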

The second section shows drops (errors) and latency distributions for
all the major request types. Drops are cumulative since process start,
but having any at all can indicate a serious problem, as the default
timeouts to qualify as a drop are quite high (~5-10 seconds). Dropped
messages often warrant further investigation.
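
A quick way to spot non-zero drop counters in that second section (again
only a sketch, since the exact layout varies a little between versions):

[source, bash]
----
# Print any message type with a non-zero Dropped count.
nodetool tpstats | awk '/^[A-Z_]+ / && $2 > 0 {print $1, "dropped:", $2}'
----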

[[nodetool-compactionstats]]
== Compaction State

As Cassandra is an LSM datastore, it sometimes has to compact sstables
together, which can have adverse effects on performance. In particular,
compaction uses a reasonable quantity of CPU resources, invalidates
large quantities of the OS
https://en.wikipedia.org/wiki/Page_cache[page cache], and can put a lot
of load on your disk drives. There are great OS tools, such as `iostat`,
to determine if this is the case, but often it's a good idea to check
whether compactions are even running using `nodetool compactionstats`:

[source, bash]
----
$ nodetool compactionstats
pending tasks: 2
- keyspace.table: 2

id                                    compaction type  keyspace  table  completed  total     unit   progress
2062b290-7f3a-11e8-9358-cd941b956e60  Compaction       keyspace  table  21848273   97867583  bytes  22.32%
Active compaction remaining time : 0h00m04s
----

In this case there is a single compaction running on the
`keyspace.table` table, which has completed 21.8 megabytes of 97.9, and
Cassandra estimates (based on the configured compaction throughput) that
this will take 4 seconds. You can also pass `-H` to get the units in a
human readable format.
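
For example, the same view with sizes reported in KiB/MiB/GiB instead of
raw bytes:

[source, bash]
----
# Same information, but sizes in human readable units.
nodetool compactionstats -H
----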

Generally each running compaction can consume a single core, but the
more you do in parallel the faster data compacts. Compaction is crucial
to ensuring good read performance, so it is important to strike a
balance: enough concurrent compactions that they complete quickly, but
not so many that they take resources away from query threads. If you
notice compaction is unable to keep up, try tuning Cassandra's
`concurrent_compactors` or `compaction_throughput` options, as sketched
below.
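
Both settings live in `cassandra.yaml`, but they can also be adjusted at
runtime via `nodetool` without a restart (the change does not persist
across restarts, and which subcommands are available depends on your
Cassandra version). A sketch with example values:

[source, bash]
----
# Check the current compaction throughput cap, then raise it to 64 MB/s.
nodetool getcompactionthroughput
nodetool setcompactionthroughput 64

# On newer versions you can also adjust the number of compaction threads.
nodetool setconcurrentcompactors 4
----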