Metrics

Ratis Server

StateMachine Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | state_machine | appliedIndex | Gauge | Applied index of the state machine |
| ratis | state_machine | applyCompletedIndex | Gauge | Last log index that has been completely applied to the state machine |
| ratis | state_machine | takeSnapshot | Timer | Time taken for the state machine to take a snapshot |

Leader Election Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | leader_election | electionCount | Counter | Number of leader elections of this group |
| ratis | leader_election | timeoutCount | Counter | Number of election timeouts of this peer |
| ratis | leader_election | electionTime | Timer | Time spent on leader election |
| ratis | leader_election | lastLeaderElapsedTime | Gauge | Time elapsed since last hearing from an active leader |
| ratis | leader_election | transferLeadershipCount | Counter | Number of transferLeader requests |
| ratis | leader_election | lastLeaderElectionElapsedTime | Gauge | Time elapsed since last leader election |

Log Appender Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | log_appender | follower_{peer}_next_index | Gauge | Next index of peer |
| ratis | log_appender | follower_{peer}_match_index | Gauge | Match index of peer |
| ratis | log_appender | follower_{peer}_rpc_response_time | Gauge | Time elapsed since the peer's last RPC response |
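
The `{peer}` placeholder in these names is filled in per follower, so a consumer has to extract the peer id from the metric name. The following is a minimal sketch, not part of Ratis, of how a monitoring script might derive a per-follower lag from the `follower_{peer}_match_index` gauges. The function name, the dict-of-gauges input, and the `leader_last_index` parameter are assumptions made for illustration.

```python
def follower_lag(gauges, leader_last_index):
    """Return {peer: number of entries behind} for each follower.

    gauges: hypothetical mapping of metric name -> current gauge value.
    leader_last_index: hypothetical last log index on the leader.
    """
    prefix, suffix = "follower_", "_match_index"
    lag = {}
    for name, value in gauges.items():
        if name.startswith(prefix) and name.endswith(suffix):
            # Strip the fixed prefix and suffix; what remains is the peer id.
            peer = name[len(prefix):-len(suffix)]
            lag[peer] = max(0, leader_last_index - value)
    return lag
```

Stripping the fixed prefix and suffix, rather than splitting on underscores, keeps the sketch working when a peer id itself contains underscores.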

Raft Log Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | log_worker | metadataLogEntryCount | Counter | Number of metadata (term-index) log entries |
| ratis | log_worker | configLogEntryCount | Counter | Number of configuration log entries |
| ratis | log_worker | stateMachineLogEntryCount | Counter | Number of state machine log entries |
| ratis | log_worker | flushTime | Timer | Time taken to flush the log |
| ratis | log_worker | flushCount | Counter | Number of times a log flush is invoked |
| ratis | log_worker | syncTime | Timer | Time taken to sync the log (fsync) |
| ratis | log_worker | dataQueueSize | Gauge | Raft log data queue size, which at any time gives the number of log-related operations in the queue |
| ratis | log_worker | workerQueueSize | Gauge | Raft log worker queue size, which at any time gives the number of committed entries that are to be synced |
| ratis | log_worker | syncBatchSize | Gauge | Number of Raft log entries synced in each flush call |
| ratis | log_worker | cacheMissCount | Counter | Number of RaftLogCache misses |
| ratis | log_worker | cacheHitCount | Counter | Number of RaftLogCache hits |
| ratis | log_worker | closedSegmentsNum | Gauge | Number of closed Raft log segments |
| ratis | log_worker | closedSegmentsSizeInBytes | Gauge | Size of closed Raft log segments in bytes |
| ratis | log_worker | openSegmentSizeInBytes | Gauge | Size of the open Raft log segment in bytes |
| ratis | log_worker | appendEntryLatency | Timer | Total time taken to append a Raft log entry |
| ratis | log_worker | enqueuedTime | Timer | Time spent by a Raft log operation in the queue |
| ratis | log_worker | queueingDelay | Timer | Time taken for a Raft log operation to get into the queue after being requested, while waiting for the queue to be non-full |
| ratis | log_worker | {operation}ExecutionTime | Timer | Time taken for a Raft log operation (open/close/flush/write/purge) to complete execution |
| ratis | log_worker | appendEntryCount | Counter | Number of entries appended to the Raft log |
| ratis | log_worker | purgeLog | Timer | Time taken for a Raft log purge operation to complete execution |
| ratis | log_worker | numStateMachineDataWriteTimeout | Counter | Number of state machine dataApi write timeouts |
| ratis | log_worker | numStateMachineDataReadTimeout | Counter | Number of state machine dataApi read timeouts |
| ratis | log_worker | readEntryLatency | Timer | Time required to read a Raft log entry from the actual Raft log file and create a Raft log entry |
| ratis | log_worker | segmentLoadLatency | Timer | Time required to load and process Raft log segments during restart |
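
The `cacheHitCount` and `cacheMissCount` counters are often easier to read as a single ratio. The sketch below shows one way to combine them, assuming both values are sampled at the same moment. The function name is hypothetical and not part of Ratis.

```python
def cache_hit_ratio(hit_count, miss_count):
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = hit_count + miss_count
    if total == 0:
        # No lookups yet, so a ratio is undefined; avoid dividing by zero.
        return None
    return hit_count / total
```

Returning `None` for an idle cache keeps "no lookups yet" distinct from a genuine 0% hit ratio.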

Raft Server Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | server | {peer}_lastHeartbeatElapsedTime | Gauge | Time elapsed since the last heartbeat RPC response |
| ratis | server | follower_append_entry_latency | Timer | Time taken for followers to append log entries |
| ratis | server | {peer}_peerCommitIndex | Gauge | Commit index of peer |
| ratis | server | clientReadRequest | Timer | Time taken to process read requests from clients |
| ratis | server | clientStaleReadRequest | Timer | Time taken to process stale-read requests from clients |
| ratis | server | clientWriteRequest | Timer | Time taken to process write requests from clients |
| ratis | server | clientWatch{level}Request | Timer | Time taken to process watch (replication_level) requests from clients |
| ratis | server | numRequestQueueLimitHits | Counter | Number of times the limit on the total number of client requests in the queue was hit |
| ratis | server | numRequestsByteSizeLimitHits | Counter | Number of times the limit on the total size of client requests in the queue was hit |
| ratis | server | numResourceLimitHits | Counter | Sum of numRequestQueueLimitHits and numRequestsByteSizeLimitHits |
| ratis | server | numPendingRequestInQueue | Gauge | Number of pending client requests in the queue |
| ratis | server | numPendingRequestMegaByteSize | Gauge | Total size of pending client requests in the queue |
| ratis | server | retryCacheEntryCount | Gauge | Number of entries in the retry cache |
| ratis | server | retryCacheHitCount | Gauge | Number of retry cache hits |
| ratis | server | retryCacheHitRate | Gauge | Retry cache hit rate |
| ratis | server | retryCacheMissCount | Gauge | Number of retry cache misses |
| ratis | server | retryCacheMissRate | Gauge | Retry cache miss rate |
| ratis | server | numFailedClientStaleReadOnServer | Counter | Number of failed stale-read requests |
| ratis | server | numFailedClientReadOnServer | Counter | Number of failed read requests |
| ratis | server | numFailedClientWriteOnServer | Counter | Number of failed write requests |
| ratis | server | numFailedClientWatchOnServer | Counter | Number of failed watch requests |
| ratis | server | numFailedClientStreamOnServer | Counter | Number of failed stream requests |
| ratis | server | numInstallSnapshot | Counter | Number of install-snapshot requests |
| ratis | server | numWatch{level}RequestTimeout | Counter | Number of watch (replication_level) request timeouts |
| ratis | server | numWatch{level}RequestInQueue | Gauge | Number of watch (replication_level) requests in the queue |
| ratis | server | numWatch{level}RequestQueueLimitHits | Counter | Number of times the limit on the total number of watch requests in the queue was hit |
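
The table above defines `numResourceLimitHits` as the sum of the two limit-hit counters. The sketch below checks that relationship on a metrics snapshot. The dict-based snapshot format and the function name are assumptions, and values scraped at slightly different times may disagree briefly.

```python
def check_resource_limit_hits(snapshot):
    """Return (expected_sum, matches) for a hypothetical metrics snapshot.

    snapshot: mapping of metric name -> counter value.
    """
    expected = (snapshot["numRequestQueueLimitHits"]
                + snapshot["numRequestsByteSizeLimitHits"])
    # Compare the reported total with the sum of its two components.
    return expected, expected == snapshot["numResourceLimitHits"]
```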

Ratis Netty Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis_netty | stream_server | {request}_latency | Timer | Time taken to process a data stream request |
| ratis_netty | stream_server | {request}_success_reply_count | Counter | Number of successful replies to the request |
| ratis_netty | stream_server | {request}_fail_reply_count | Counter | Number of failure replies to the request |
| ratis_netty | stream_server | num_requests_{request} | Counter | Total number of data stream requests |
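
The success and failure counters are reported separately for each `{request}` type. The sketch below is one way to turn them into a per-request success ratio. The function name and the dict-of-counters input are assumptions, as is the request type `header` used in the usage example.

```python
def reply_success_ratio(counters):
    """Return {request: success / (success + fail)} per request type.

    counters: hypothetical mapping of metric name -> counter value.
    Request types with no replies at all are omitted from the result.
    """
    suffixes = {"_success_reply_count": "ok", "_fail_reply_count": "fail"}
    totals = {}
    for name, value in counters.items():
        for suffix, key in suffixes.items():
            if name.endswith(suffix):
                # The part before the suffix is the {request} placeholder value.
                request = name[:-len(suffix)]
                totals.setdefault(request, {"ok": 0, "fail": 0})[key] += value
    return {
        request: t["ok"] / (t["ok"] + t["fail"])
        for request, t in totals.items()
        if t["ok"] + t["fail"] > 0
    }
```

For example, `reply_success_ratio({"header_success_reply_count": 8, "header_fail_reply_count": 2})` yields a ratio of 0.8 for the hypothetical `header` request type.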

Ratis gRPC Metrics

Message Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis | client_message_metrics | {method}_started_total | Counter | Total number of {method} messages started |
| ratis | client_message_metrics | {method}_completed_total | Counter | Total number of {method} messages completed |
| ratis | client_message_metrics | {method}_received_executed | Counter | Total number of {method} messages received and executed |
| ratis | server_message_metrics | {method}_started_total | Counter | Total number of {method} messages started |
| ratis | server_message_metrics | {method}_completed_total | Counter | Total number of {method} messages completed |
| ratis | server_message_metrics | {method}_received_executed | Counter | Total number of {method} messages received and executed |
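
One possible use of the started and completed counters is to estimate how many messages of a method are still outstanding. The sketch below assumes that every started message is eventually counted as completed and that both counters are read at the same moment. Neither assumption is confirmed by this document. The function name, the input format, and the method name in the usage example are hypothetical.

```python
def outstanding_messages(counters, method):
    """Estimate started-but-not-completed messages for one method.

    counters: hypothetical mapping of metric name -> counter value.
    Missing counters are treated as zero.
    """
    started = counters.get(f"{method}_started_total", 0)
    completed = counters.get(f"{method}_completed_total", 0)
    # Clamp at zero in case the two counters were sampled slightly apart.
    return max(0, started - completed)
```

For example, with `{"m_started_total": 12, "m_completed_total": 9}` and the hypothetical method name `m`, the estimate is 3.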

gRPC Log Appender Metrics

| Application | Component | Name | Type | Description |
|---|---|---|---|---|
| ratis_grpc | log_appender | {appendEntries}_latency | Timer | Latency of method (appendEntries/heartbeat) |
| ratis_grpc | log_appender | {follower}_success_reply_count | Counter | Number of success replies |
| ratis_grpc | log_appender | {follower}_not_leader_reply_count | Counter | Number of NotLeader replies |
| ratis_grpc | log_appender | {follower}_inconsistency_reply_count | Counter | Number of Inconsistency replies |
| ratis_grpc | log_appender | {follower}_append_entry_timeout_count | Counter | Number of appendEntries timeouts |
| ratis_grpc | log_appender | {follower}_pending_log_requests_count | Counter | Number of pending requests |
| ratis_grpc | log_appender | num_retries | Counter | Number of request retries |
| ratis_grpc | log_appender | num_requests | Counter | Number of requests in total |
| ratis_grpc | log_appender | num_install_snapshot | Counter | Number of install snapshot requests |