| // Licensed to the Apache Software Foundation (ASF) under one or more |
| // contributor license agreements. See the NOTICE file distributed with |
| // this work for additional information regarding copyright ownership. |
| // The ASF licenses this file to You under the Apache License, Version 2.0 |
| // (the "License"); you may not use this file except in compliance with |
| // the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, software |
| // distributed under the License is distributed on an "AS IS" BASIS, |
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| // See the License for the specific language governing permissions and |
| // limitations under the License. |
| = Metrics |
| |
| This page describes metrics registers (categories) and the metrics available in each register. |
| |
| |
| == System |
| |
| |
| System metrics such as JVM or CPU metrics. |
| |
| Register name: `sys` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name |Type| Description |
| |CpuLoad| double| CPU load. |
| |CurrentThreadCpuTime | long| ThreadMXBean.getCurrentThreadCpuTime() |
| |CurrentThreadUserTime| long | ThreadMXBean.getCurrentThreadUserTime() |
| |DaemonThreadCount| integer| ThreadMXBean.getDaemonThreadCount() |
| |GcCpuLoad |double| GC CPU load. |
| |PeakThreadCount |integer| ThreadMXBean.getPeakThreadCount |
| |SystemLoadAverage| java.lang.Double| OperatingSystemMXBean.getSystemLoadAverage() |
| |ThreadCount |integer| ThreadMXBean.getThreadCount |
| |TotalExecutedTasks |long| Total executed tasks. |
| |TotalStartedThreadCount |long| ThreadMXBean.getTotalStartedThreadCount |
| |UpTime| long | RuntimeMxBean.getUptime() |
| |memory.heap.committed| long| MemoryUsage.getHeapMemoryUsage().getCommitted() |
| |memory.heap.init | long| MemoryUsage.getHeapMemoryUsage().getInit() |
| |memory.heap.used |long| MemoryUsage.getHeapMemoryUsage().getUsed() |
| |memory.nonheap.committed| long| MemoryUsage.getNonHeapMemoryUsage().getCommitted() |
| |memory.nonheap.init |long | MemoryUsage.getNonHeapMemoryUsage().getInit() |
| |memory.nonheap.max |long | MemoryUsage.getNonHeapMemoryUsage().getMax() |
| |memory.nonheap.used |long | MemoryUsage.getNonHeapMemoryUsage().getUsed() |
| |=== |
| |
| |
| == Caches |
| |
| Cache metrics. |
| |
| Register name: `cache.{cache_name}.{near}` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |CacheEvictions | long|The total number of evictions from the cache. |
| |CacheGets |long|The total number of gets to the cache. |
| |CacheHits |long|The number of get requests that were satisfied by the cache. |
| |CacheMisses |long|A miss is a get request that is not satisfied. |
| |CachePuts |long|The total number of puts to the cache. |
| |CacheRemovals | long|The total number of removals from the cache. |
| |CacheTxCommits | long|Total number of transaction commits. |
| |CacheTxRollbacks |long|Total number of transaction rollbacks. |
| |CommitTime |histogram | Commit time in nanoseconds. |
| |CommitTimeTotal |long| The total time of commit, in nanoseconds. |
| |EntryProcessorHits | long|The total number of invocations on keys, which exist in cache. |
| |EntryProcessorInvokeTimeNanos | long|The total time of cache invocations, in nanoseconds. |
| |EntryProcessorMaxInvocationTime |long|So far, the maximum time to execute cache invokes. |
| |EntryProcessorMinInvocationTime |long|So far, the minimum time to execute cache invokes. |
| |EntryProcessorMisses |long|The total number of invocations on keys, which don't exist in cache. |
| |EntryProcessorPuts |long|The total number of cache invocations, caused update. |
| |EntryProcessorReadOnlyInvocations |long|The total number of cache invocations, caused no updates. |
| |EntryProcessorRemovals |long|The total number of cache invocations, caused removals. |
| |EstimatedRebalancingKeys|long|Number estimated to rebalance keys. |
| |GetTime |histogram| Get time in nanoseconds. |
| |GetTimeTotal|long|The total time of cache gets, in nanoseconds. |
| |IsIndexRebuildInProgress|boolean | True if index rebuild is in progress. |
| |OffHeapEvictions|long|The total number of evictions from the off-heap memory. |
| |OffHeapGets |long|The total number of get requests to the off-heap memory. |
| |OffHeapHits |long|The number of get requests that were satisfied by the off-heap memory. |
| |OffHeapMisses |long|A miss is a get request that is not satisfied by off-heap memory. |
| |OffHeapPuts |long|The total number of put requests to the off-heap memory. |
| |OffHeapRemovals |long|The total number of removals from the off-heap memory. |
| |PutTime | histogram| Put time in nanoseconds. |
| |PutTimeTotal|long|The total time of cache puts, in nanoseconds. |
| |QueryCompleted |long|Count of completed queries. |
| |QueryExecuted |long|Count of executed queries. |
| |QueryFailed |long|Count of failed queries. |
| |QueryMaximumTime |long| Maximum query execution time. |
| |QueryMinimalTime |long| Minimum query execution time. |
| |QuerySumTime |long| Query summary time. |
| |RebalanceClearingPartitionsLeft |long| Number of partitions need to be cleared before actual rebalance start. |
| |RebalanceStartTime |long| Rebalance start time. |
| |RebalancedKeys |long| Number of already rebalanced keys. |
| |RebalancingBytesRate|long|Estimated rebalancing speed in bytes. |
| |RebalancingKeysRate |long|Estimated rebalancing speed in keys. |
| |RemoveTime |histogram| Remove time in nanoseconds. |
| |RemoveTimeTotal |long|The total time of cache removal, in nanoseconds. |
| |RollbackTime|histogram| Rollback time in nanoseconds. |
| |RollbackTimeTotal |long|The total time of rollback, in nanoseconds. |
| |TotalRebalancedBytes|long|Number of already rebalanced bytes. |
| |=== |
| |
| == Cache Groups |
| |
| |
| Register name: `cacheGroups.{group_name}` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |AffinityPartitionsAssignmentMap |java.util.Map| Affinity partitions assignment map. |
| |Caches |java.util.ArrayList| List of caches |
| |IndexBuildCountPartitionsLeft | long| Number of partitions need processed for finished indexes create or rebuilding. |
| |LocalNodeMovingPartitionsCount |integer| Count of partitions with state MOVING for this cache group located on this node. |
| |LocalNodeOwningPartitionsCount |integer| Count of partitions with state OWNING for this cache group located on this node. |
| |LocalNodeRentingEntriesCount | long| Count of entries remains to evict in RENTING partitions located on this node for this cache group. |
| |LocalNodeRentingPartitionsCount |integer| Count of partitions with state RENTING for this cache group located on this node. |
| |MaximumNumberOfPartitionCopies | integer| Maximum number of partition copies for all partitions of this cache group. |
| |MinimumNumberOfPartitionCopies |integer| Minimum number of partition copies for all partitions of this cache group. |
| |MovingPartitionsAllocationMap |java.util.Map| Allocation map of partitions with state MOVING in the cluster. |
| |OwningPartitionsAllocationMap |java.util.Map | Allocation map of partitions with state OWNING in the cluster. |
| |PartitionIds |java.util.ArrayList| Local partition ids. |
| |SparseStorageSize | long| Storage space allocated for group adjusted for possible sparsity, in bytes. |
| |StorageSize |long| Storage space allocated for group, in bytes. |
| |TotalAllocatedPages |long| Cache group total allocated pages. |
| |TotalAllocatedSize |long| Total size of memory allocated for group, in bytes. |
| |=== |
| |
| |
| == Transactions |
| |
| Transaction metrics. |
| |
| Register name: `tx` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |AllOwnerTransactions| java.util.HashMap| Map of local node owning transactions. |
| |LockedKeysNumber | long| The number of keys locked on the node. |
| |OwnerTransactionsNumber |long| The number of active transactions for which this node is the initiator. |
| |TransactionsHoldingLockNumber | long| The number of active transactions holding at least one key lock. |
| |LastCommitTime |long| Last commit time. |
| |nodeSystemTimeHistogram| histogram| Transactions system times on node represented as histogram. |
| |nodeUserTimeHistogram| histogram| Transactions user times on node represented as histogram. |
| |LastRollbackTime| long| Last rollback time. |
| |totalNodeSystemTime |long| Total transactions system time on node. |
| |totalNodeUserTime |long| Total transactions user time on node. |
| |txCommits |integer| Number of transaction commits. |
| |txRollbacks |integer| Number of transaction rollbacks. |
| |=== |
| |
| |
| == Partition Map Exchange |
| |
| Partition map exchange metrics. |
| |
| Register name: `pme` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name |Type | Description |
| |CacheOperationsBlockedDuration |long | Current PME cache operations blocked duration in milliseconds. |
| |CacheOperationsBlockedDurationHistogram |histogram | Histogram of cache operations blocked PME durations in milliseconds. |
| |Duration |long | Current PME duration in milliseconds. |
| |DurationHistogram | histogram | Histogram of PME durations in milliseconds. |
| |=== |
| |
| |
| == Compute Jobs |
| |
| Register name: `compute.jobs` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name| Type| Description |
| |compute.jobs.Active |long| Number of active jobs currently executing. |
| |compute.jobs.Canceled |long| Number of cancelled jobs that are still running. |
| |compute.jobs.ExecutionTime |long| Total execution time of jobs. |
| |compute.jobs.Finished |long| Number of finished jobs. |
| |compute.jobs.Rejected |long| Number of jobs rejected after more recent collision resolution operation. |
| |compute.jobs.Started |long| Number of started jobs. |
| |compute.jobs.Waiting |long| Number of currently queued jobs waiting to be executed. |
| |compute.jobs.WaitingTime |long| Total time jobs spent on waiting queue. |
| |=== |
| |
| == Thread Pools |
| |
| Register name: `threadPools.{thread_pool_name}` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |ActiveCount |long | Approximate number of threads that are actively executing tasks. |
| |CompletedTaskCount| long | Approximate total number of tasks that have completed execution. |
| |CorePoolSize |long | The core number of threads. |
| |KeepAliveTime| long | Thread keep-alive time, which is the amount of time which threads in excess of the core pool size may remain idle before being terminated. |
| |LargestPoolSize| long | Largest number of threads that have ever simultaneously been in the pool. |
| |MaximumPoolSize |long | The maximum allowed number of threads. |
| |PoolSize |long| Current number of threads in the pool. |
| |QueueSize |long | Current size of the execution queue. |
| |RejectedExecutionHandlerClass| string | Class name of current rejection handler. |
| |Shutdown | boolean| True if this executor has been shut down. |
| |TaskCount | long | Approximate total number of tasks that have been scheduled for execution. |
| |Terminated |boolean| True if all tasks have completed following shut down. |
| |Terminating |long| True if terminating but not yet terminated. |
| |ThreadFactoryClass| string| Class name of thread factory used to create new threads. |
| |=== |
| |
| |
| == Cache Group IO |
| |
| Register name: `io.statistics.cacheGroups.{group_name}` |
| |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |LOGICAL_READS | long | Number of logical reads |
| |PHYSICAL_READS | long | Number of physical reads |
| |grpId | integer | Group id |
| |name | string | Name of the index |
| |startTime | long | Statistics collect start time |
| |=== |
| |
| |
| == Sorted Indexes |
| |
| Register name: `io.statistics.sortedIndexes.{cache_name}.{index_name}` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |LOGICAL_READS_INNER |long| Number of logical reads for inner tree node |
| |LOGICAL_READS_LEAF | long | Number of logical reads for leaf tree node |
| |PHYSICAL_READS_INNER| long| Number of physical reads for inner tree node |
| |PHYSICAL_READS_LEAF| long| Number of physical reads for leaf tree node |
| |indexName| string| Name of the index |
| |name| string| Name of the cache |
| |startTime| long| Statistics collection start time |
| |=== |
| |
| |
| == Hash Indexes |
| |
| Register name: `io.statistics.hashIndexes.{cache_name}.{index_name}` |
| |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type| Description |
| |LOGICAL_READS_INNER| long| Number of logical reads for inner tree node |
| |LOGICAL_READS_LEAF| long| Number of logical reads for leaf tree node |
| |PHYSICAL_READS_INNER| long| Number of physical reads for inner tree node |
| |PHYSICAL_READS_LEAF| long| Number of physical reads for leaf tree node |
| |indexName| string| Name of the index |
| |name| string| Name of the cache |
| |startTime| long| Statistics collection start time |
| |=== |
| |
| |
| == Communication IO |
| |
| Register name: `io.communication` |
| |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name| Type| Description |
| |OutboundMessagesQueueSize| integer| Outbound messages queue size. |
| |SentMessagesCount | integer| Sent messages count. |
| |SentBytesCount | long | Sent bytes count. |
| |ReceivedBytesCount| long| Received bytes count. |
| |ReceivedMessagesCount| integer| Received messages count. |
| |=== |
| |
| |
| == Data Region IO |
| |
| Register name: `io.dataregion.{data_region_name}` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |AllocationRate | long| Allocation rate (pages per second) averaged across rateTimeInternal. |
| |CheckpointBufferSize | long | Checkpoint buffer size in bytes. |
| |DirtyPages | long| Number of pages in memory not yet synchronized with persistent storage. |
| |EmptyDataPages| long| Calculates empty data pages count for region. It counts only totally free pages that can be reused (e. g. pages that are contained in reuse bucket of free list). |
| |EvictionRate| long| Eviction rate (pages per second). |
| |LargeEntriesPagesCount| long| Count of pages that fully ocupied by large entries that go beyond page size |
| |OffHeapSize| long| Offheap size in bytes. |
| |OffheapUsedSize| long| Offheap used size in bytes. |
| |PagesFillFactor| double| The percentage of the used space. |
| |PagesRead| long| Number of pages read from last restart. |
| |PagesReplaceAge| long| Average age at which pages in memory are replaced with pages from persistent storage (milliseconds). |
| |PagesReplaceRate| long| Rate at which pages in memory are replaced with pages from persistent storage (pages per second). |
| |PagesReplaced| long| Number of pages replaced from last restart. |
| |PagesWritten| long| Number of pages written from last restart. |
| |PhysicalMemoryPages| long| Number of pages residing in physical RAM. |
| |PhysicalMemorySize | long| Gets total size of pages loaded to the RAM, in bytes |
| |TotalAllocatedPages |long| Total number of allocated pages. |
| |TotalAllocatedSize| long | Gets a total size of memory allocated in the data region, in bytes |
| |TotalThrottlingTime| long| Total throttling threads time in milliseconds. The Ignite throttles threads that generate dirty pages during the ongoing checkpoint. |
| |UsedCheckpointBufferSize | long| Gets used checkpoint buffer size in bytes |
| |
| |=== |
| |
| |
| == Data Storage |
| |
| Data Storage metrics. |
| |
| Register name: `io.datastorage` |
| |
| [cols="2,1,3",opts="header"] |
| |=== |
| |Name | Type | Description |
| |CheckpointTotalTime| long | Total duration of checkpoint |
| |LastCheckpointCopiedOnWritePagesNumber| long | Number of pages copied to a temporary checkpoint buffer during the last checkpoint. |
| |LastCheckpointDataPagesNumber| long | Total number of data pages written during the last checkpoint. |
| |LastCheckpointDuration | long | Duration of the last checkpoint in milliseconds. |
| |LastCheckpointFsyncDuration| long | Duration of the sync phase of the last checkpoint in milliseconds. |
| |LastCheckpointLockWaitDuration| long| Duration of the checkpoint lock wait in milliseconds. |
| |LastCheckpointMarkDuration | long | Duration of the checkpoint lock wait in milliseconds. |
| |LastCheckpointPagesWriteDuration| long| Duration of the checkpoint pages write in milliseconds. |
| |LastCheckpointTotalPagesNumber| long| Total number of pages written during the last checkpoint. |
| |SparseStorageSize | long| Storage space allocated adjusted for possible sparsity, in bytes. |
| |StorageSize | long| Storage space allocated, in bytes. |
| |WalArchiveSegments | integer| Current number of WAL segments in the WAL archive. |
| |WalBuffPollSpinsRate| long | WAL buffer poll spins number over the last time interval. |
| |WalFsyncTimeDuration | long | Total duration of fsync |
| |WalFsyncTimeNum |long | Total count of fsync |
| |WalLastRollOverTime |long | Time of the last WAL segment rollover. |
| |WalLoggingRate | long| Average number of WAL records per second written during the last time interval. |
| |WalTotalSize| long | Total size in bytes for storage wal files. |
| |WalWritingRate| long | Average number of bytes per second written during the last time interval. |
| |=== |