 = Hints

 Hinting is a data repair technique applied during write operations. When
 replica nodes are unavailable to accept a mutation, due either to
 failure or, more commonly, to routine maintenance, coordinators attempting to
 write to those replicas store temporary hints on their local filesystem
 for later application to the unavailable replica. Hints are an important
 way to help reduce the duration of data inconsistency. Coordinators
 replay hints quickly after unavailable replica nodes return to the ring.
 Hints are best effort, however, and do not guarantee eventual
 consistency like xref:operating/repair.adoc[`anti-entropy repair`] does.

 Hints are useful because of how Apache Cassandra replicates data to
 provide fault tolerance, high availability and durability. Cassandra
 xref:architecture/dynamo.adoc#consistent-hashing-using-a-token-ring[`partitions data across the cluster`] using
 consistent hashing, and then replicates keys to multiple nodes along the
 hash ring. To guarantee availability, all replicas of a key can accept
 mutations without consensus, but this means it is possible for some
 replicas to accept a mutation while others do not. When this happens an
 inconsistency is introduced.

 Hints are one of three mechanisms, alongside read-repair and
 full/incremental anti-entropy repair, by which Cassandra implements the
 eventual consistency guarantee that all updates are eventually received
 by all replicas. Hints, like read-repair, are best effort and not an
 alternative to performing full repair, but in practice they do help
 reduce the duration of inconsistency between replicas.

 == Hinted Handoff

 Hinted handoff is the process by which Cassandra applies hints to
 unavailable nodes.

 For example, consider a mutation made at `Consistency Level`
 `LOCAL_QUORUM` against a keyspace with a `Replication Factor` of `3`.
 Normally the client sends the mutation to a single coordinator, which then
 sends the mutation to all three replicas, and when two of the three
 replicas acknowledge the mutation the coordinator responds successfully
 to the client. If a replica node is unavailable, however, the
 coordinator stores a hint locally to the filesystem for later
 application. New hints will be retained for up to
 `max_hint_window_in_ms` of downtime (defaults to `3 hours`). If the
 unavailable replica does return to the cluster before the window
 expires, the coordinator applies any pending hinted mutations against
 the replica to ensure that eventual consistency is maintained.

 image::hints.svg[Hinted Handoff in Action]

 * (`t0`): The write is sent by the client, and the coordinator sends it
 to the three replicas. Unfortunately `replica_2` is restarting and
 cannot receive the mutation.
 * (`t1`): The client receives a quorum acknowledgement from the
 coordinator. At this point the client believes the write to be durable
 and visible to reads (which it is).
 * (`t2`): After the write timeout (default `2s`), the coordinator
 decides that `replica_2` is unavailable and stores a hint to its local
 disk.
 * (`t3`): Later, when `replica_2` starts back up it sends a gossip
 message to all nodes, including the coordinator.
 * (`t4`): The coordinator replays hints including the missed mutation
 against `replica_2`.

 If the node does not return in time, the destination replica will
 remain out of sync until either read-repair or full/incremental
 anti-entropy repair propagates the mutation.
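
 The timeline above can be modelled with a small toy simulation. This is
 only a sketch: the `Coordinator` and `Replica` classes, their methods, and
 the in-memory hint list are hypothetical names invented for illustration
 and do not mirror Cassandra's internal classes or on-disk format.

```python
from dataclasses import dataclass, field

MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000  # default max_hint_window_in_ms (3 hours)


@dataclass
class Replica:
    name: str
    up: bool = True
    down_since_ms: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        self.data[key] = value


@dataclass
class Coordinator:
    replicas: list
    hints: list = field(default_factory=list)  # stands in for the hints directory

    def write(self, key, value, now_ms, quorum=2):
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.apply(key, value)
                acks += 1
            elif now_ms - replica.down_since_ms <= MAX_HINT_WINDOW_MS:
                # Replica is down but still inside the hint window: keep a hint.
                self.hints.append((replica.name, key, value))
        return acks >= quorum

    def replay_hints(self, replica):
        # Called once gossip reports that the replica is back.
        remaining = []
        for target, key, value in self.hints:
            if target == replica.name:
                replica.apply(key, value)
            else:
                remaining.append((target, key, value))
        self.hints = remaining


replicas = [Replica("replica_1"), Replica("replica_2"), Replica("replica_3")]
coordinator = Coordinator(replicas)

replicas[1].up = False                      # t0: replica_2 is restarting
ok = coordinator.write("k", "v", now_ms=0)  # t1/t2: quorum met, hint stored
replicas[1].up = True                       # t3: replica_2 returns
coordinator.replay_hints(replicas[1])       # t4: hint replayed
```

 In this model the write succeeds with two acknowledgements, and
 `replica_2` only receives the value once the stored hint is replayed.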

 === Application of Hints

 Hints are streamed in bulk, a segment at a time, to the target replica
 node and the target node replays them locally. After the target node has
 replayed a segment it deletes the segment and receives the next segment.
 This continues until all hints are drained.
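
 The segment-by-segment draining described above can be pictured as a
 simple loop. The sketch below is an assumption-laden illustration: the
 `drain_hints` function and the representation of a segment as a Python
 list are hypothetical, and no networking or file handling is modelled.

```python
from collections import deque


def drain_hints(segments, replay):
    """Deliver hint segments one at a time until none remain."""
    pending = deque(segments)
    delivered = 0
    while pending:
        segment = pending[0]      # take the oldest remaining segment
        for mutation in segment:
            replay(mutation)      # the target replays each hint locally
        pending.popleft()         # the segment is dropped only after replay
        delivered += 1
    return delivered


applied = []
count = drain_hints([["m1", "m2"], ["m3"]], applied.append)
```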

 === Storage of Hints on Disk

 Hints are stored in flat files in the coordinator node’s
 `$CASSANDRA_HOME/data/hints` directory. A hint includes a hint id, the
 target replica node on which the mutation is meant to be stored, the
 serialized mutation (stored as a blob) that couldn't be delivered to the
 replica node, the mutation timestamp, and the Cassandra version used to
 serialize the mutation. By default hints are compressed using
 `LZ4Compressor`. Multiple hints are appended to the same hints file.

 Since hints contain the original unmodified mutation timestamp, hint
 application is idempotent and cannot overwrite a future mutation.
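
 A minimal sketch of why the preserved timestamp makes replay safe,
 assuming a simplified last-write-wins rule keyed on the timestamp alone.
 The `apply_mutation` function and the dictionary-based store are
 hypothetical, and tie-breaking between equal timestamps is ignored.

```python
def apply_mutation(store, key, value, timestamp):
    """Last-write-wins: keep the value with the highest timestamp."""
    current = store.get(key)
    if current is None or timestamp > current[1]:
        store[key] = (value, timestamp)


replica = {}
apply_mutation(replica, "k", "new", timestamp=200)  # later write arrived first
apply_mutation(replica, "k", "old", timestamp=100)  # hint replayed afterwards
apply_mutation(replica, "k", "old", timestamp=100)  # replaying twice is harmless
```

 Because the hinted mutation carries its original timestamp of `100`, it
 loses to the newer write no matter how often it is replayed.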

 === Hints for Timed Out Write Requests

 Hints are also stored for write requests that time out. The
 `write_request_timeout_in_ms` setting in `cassandra.yaml` configures the
 timeout for write requests.

 [source,yaml]
 ----
 write_request_timeout_in_ms: 2000
 ----

 The coordinator waits up to the configured amount of time for a write
 request to complete. If the request has not completed by then, the
 coordinator times it out and generates a hint for it. The lowest
 acceptable value for `write_request_timeout_in_ms` is 10 ms.

 == Configuring Hints

 Hints are enabled by default as they are critical for data consistency.
 The `cassandra.yaml` configuration file provides several settings for
 configuring hints:

 .Settings for Hints
 [width="100%",cols="38%,36%,26%",]
 |===
 |Setting |Description |Default Value

 |`hinted_handoff_enabled` |Enables/Disables hinted handoffs |`true`

 |`hinted_handoff_disabled_datacenters` a|
 A list of data centers that do not perform hinted handoffs even when
 handoff is otherwise enabled. Example:

 [source,yaml]
 ----
 hinted_handoff_disabled_datacenters:
   - DC1
   - DC2
 ----

 |`unset`

 |`max_hint_window_in_ms` |Maximum amount of time, in milliseconds, for
 which hints are generated for a node after it has failed. |`10800000` (3 hours)

 |`hinted_handoff_throttle_in_kb` |Maximum throttle in KB per second,
 per delivery thread. This is reduced proportionally to the number
 of nodes in the cluster. (If there are two nodes in the cluster, each
 delivery thread uses the maximum rate; if there are three, each
 throttles to half of the maximum, since two nodes are expected to be
 delivering hints simultaneously.) |`1024`

 |`max_hints_delivery_threads` |Number of threads with which to deliver
 hints. Consider increasing this number for multi-DC
 deployments, since cross-DC handoff tends to be slower. |`2`

 |`hints_directory` |Directory where Cassandra stores hints.
 |`$CASSANDRA_HOME/data/hints`

 |`hints_flush_period_in_ms` |How often hints should be flushed from the
 internal buffers to disk. Will _not_ trigger fsync. |`10000`

 |`max_hints_file_size_in_mb` |Maximum size for a single hints file, in
 megabytes. |`128`

 |`hints_compression` |Compression to apply to the hint files. If
 omitted, hints files will be written uncompressed. LZ4, Snappy, and
 Deflate compressors are supported. |`LZ4Compressor`
 |===
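
 The scaling rule described for `hinted_handoff_throttle_in_kb` can be
 written out as a short calculation. This sketch assumes the reduction is
 a plain division by the number of other nodes in the cluster, which
 matches the two-node and three-node examples in the table; the function
 name `effective_throttle_kb` is hypothetical.

```python
def effective_throttle_kb(throttle_kb, nodes_in_cluster):
    """Per-thread hint delivery rate after scaling by cluster size."""
    # Every node other than the target may be delivering hints at once.
    senders = max(1, nodes_in_cluster - 1)
    return throttle_kb / senders


two = effective_throttle_kb(1024, 2)     # full rate
three = effective_throttle_kb(1024, 3)   # half of the maximum
ten = effective_throttle_kb(1024, 10)    # roughly a ninth of the maximum
```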

 == Configuring Hints at Runtime with `nodetool`

 `nodetool` provides several commands for configuring hints or getting
 hint-related information. The nodetool commands override the
 corresponding settings, if any, in `cassandra.yaml` for the node running
 the command.

 .Nodetool Commands for Hints
 [width="100%",cols="43%,57%",]
 |===
 |Command |Description

 |`nodetool disablehandoff` |Disables storing and delivering hints

 |`nodetool disablehintsfordc` |Disables storing and delivering hints to
 a data center

 |`nodetool enablehandoff` |Re-enables future hints storing and delivery
 on the current node

 |`nodetool enablehintsfordc` |Enables hints for a data center that was
 previously disabled

 |`nodetool getmaxhintwindow` |Prints the max hint window in ms. New in
 Cassandra 4.0.

 |`nodetool handoffwindow` |Prints current hinted handoff window

 |`nodetool pausehandoff` |Pauses hints delivery process

 |`nodetool resumehandoff` |Resumes hints delivery process

 |`nodetool sethintedhandoffthrottlekb` |Sets hinted handoff throttle in
 KB per second, per delivery thread

 |`nodetool setmaxhintwindow` |Sets the specified max hint window in ms

 |`nodetool statushandoff` |Status of storing future hints on the current
 node

 |`nodetool truncatehints` |Truncates all hints on the local node, or
 truncates hints for the endpoint(s) specified.
 |===

 === Make Hints Play Faster at Runtime

 The default handoff throttle of `1024 KB/s` is conservative for most
 modern networks, and even a simple node restart can accumulate many
 gigabytes of hints that take hours to play back. For example, if you
 are ingesting `100 Mbps` of data per node, a single 10 minute restart
 will create `10 minutes * (100 megabit / second) ~= 7 GiB` of data,
 which at `(1024 KiB / second)` would take
 `7 GiB / (1024 KiB / second) ~= 2 hours` to play back. The exact
 math depends on the load balancing strategy (round robin is better than
 token aware), number of tokens per node (more tokens is better than
 fewer), and naturally the cluster's write rate, but regardless you may
 find yourself wanting to increase this throttle at runtime.

 If you find yourself in such a situation, consider raising
 `hinted_handoff_throttle_in_kb` dynamically via the
 `nodetool sethintedhandoffthrottlekb` command.
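
 The playback estimate can be reproduced with a few lines of arithmetic.
 This is a rough sketch that assumes the whole ingest rate turns into
 hints for the down node and that a single delivery thread runs at the
 full throttle; the function names and the raised value of `8192` are
 hypothetical choices for illustration.

```python
def hint_backlog_bytes(ingest_megabits_per_s, downtime_s):
    """Bytes of hints accumulated while a replica is down."""
    return ingest_megabits_per_s * 1_000_000 / 8 * downtime_s


def replay_hours(backlog_bytes, throttle_kib_per_s):
    """Hours needed to replay the backlog at a fixed throttle."""
    return backlog_bytes / (throttle_kib_per_s * 1024) / 3600


backlog = hint_backlog_bytes(100, 10 * 60)     # 7.5e9 bytes, about 7 GiB
default_hours = replay_hours(backlog, 1024)    # roughly two hours
raised_hours = replay_hours(backlog, 8192)     # one eighth of that
```

 Under these assumptions, raising the throttle by a factor of eight cuts
 the estimated playback time from about two hours to about fifteen minutes.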

 === Allow a Node to be Down Longer at Runtime

 Sometimes a node may be down for longer than the normal
 `max_hint_window_in_ms` (three hours by default) while its hardware and
 data remain accessible. In such a case, consider raising
 `max_hint_window_in_ms` dynamically via the
 `nodetool setmaxhintwindow` command added in Cassandra 4.0
 (https://issues.apache.org/jira/browse/CASSANDRA-11720[CASSANDRA-11720]).
 This instructs Cassandra to continue holding hints for the down
 endpoint for a longer amount of time.

 This command should be applied on all nodes in the cluster that may be
 holding hints. If needed, the change can be made permanent by
 setting `max_hint_window_in_ms` in `cassandra.yaml`, followed
 by a rolling restart.
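
 Since the window is expressed in milliseconds, a small helper can make
 the conversion less error-prone. The sketch below is illustrative only:
 `hours_to_ms` and `still_hinting` are hypothetical helpers, the exact
 boundary behaviour at the end of the window is an assumption, and the
 command string is merely printed-ready text, not executed.

```python
DEFAULT_MAX_HINT_WINDOW_MS = 10_800_000  # 3 hours


def hours_to_ms(hours):
    """Convert hours to the millisecond value the setting expects."""
    return int(hours * 60 * 60 * 1000)


def still_hinting(downtime_ms, max_hint_window_ms=DEFAULT_MAX_HINT_WINDOW_MS):
    """True while new hints are still generated for the down node."""
    return downtime_ms <= max_hint_window_ms


downtime = hours_to_ms(5)
with_default = still_hinting(downtime)                  # window already expired
with_raised = still_hinting(downtime, hours_to_ms(6))   # still inside the window
command = f"nodetool setmaxhintwindow {hours_to_ms(6)}"
```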

 == Monitoring Hint Delivery

 Cassandra 4.0 adds histograms that show how long it takes to deliver
 hints, which helps operators identify problems
 (https://issues.apache.org/jira/browse/CASSANDRA-13234[CASSANDRA-13234]).

 Metrics are also available for tracking
 Hinted Handoff (`handoff-metrics`) and the
 Hints Service (`hintsservice-metrics`).