| import{_ as s,c as i,b as a,d as e,t as n,o as r}from"./app-W3EENNaa.js";const d={};function l(o,t){return r(),i("div",null,[t[102]||(t[102]=a('<p>While IoTDB is running, we want to observe its status so that we can troubleshoot system problems and discover potential risks in time. The metrics that <strong>reflect the operating status of the system</strong> are called system monitoring metrics.</p><h2 id="_1-when-to-use-metric-framework" tabindex="-1"><a class="header-anchor" href="#_1-when-to-use-metric-framework"><span>1. When to use the metric framework?</span></a></h2><p>Below are some typical application scenarios:</p><ol><li><p>The system is running slowly</p><p>When the system is running slowly, we want as much detail as possible about the running status of the system, such as:</p><ul><li>JVM: Is there a Full GC (FGC)? How long does it take? How much does memory usage decrease after GC? Are there many threads?</li><li>System: Is the CPU usage too high? Are there many disk I/O operations?</li><li>Connections: How many connections are there at the moment?</li><li>Interface: What are the TPS and latency of each interface?</li><li>Thread Pool: Are there many pending tasks?</li><li>Cache Hit Ratio</li></ul></li><li><p>No space left on device</p><p>When we encounter a "no space left on device" error, we want to know which kind of data file has grown rapidly in the past few hours.</p></li><li><p>Is the system running in an abnormal state?</p><p>We can use the number of error logs, the alive status of the nodes in the cluster, and so on, to determine whether the system is running abnormally.</p></li></ol><h2 id="_2-who-will-use-metric-framework" tabindex="-1"><a class="header-anchor" href="#_2-who-will-use-metric-framework"><span>2. 
Who will use the metric framework?</span></a></h2><p>Anyone who cares about the system's status, including but not limited to RD, QA, SRE, and DBA, can use the metrics to work more efficiently.</p><h2 id="_3-what-is-metrics" tabindex="-1"><a class="header-anchor" href="#_3-what-is-metrics"><span>3. What are metrics?</span></a></h2><h3 id="_3-1-key-concept" tabindex="-1"><a class="header-anchor" href="#_3-1-key-concept"><span>3.1. Key Concept</span></a></h3><p>In IoTDB's metric module, each metric is uniquely identified by its <code>Metric Name</code> and <code>Tags</code>.</p><ul><li><code>Metric Name</code>: The name of the metric type. For example, <code>logback_events</code> denotes log events.</li><li><code>Tags</code>: The classification of the metric, in the form of Key-Value pairs. Each metric can have zero or more tags. Common Key-Value pairs: <ul><li><code>name = xxx</code>: The name of the monitored object, which describes the <strong>business logic</strong>. For example, for a metric with <code>Metric Name = entry_seconds_count</code>, name refers to the monitored business interface.</li><li><code>type = xxx</code>: The subtype of the metric, which describes the <strong>monitoring indicator</strong> itself. For example, for a metric with <code>Metric Name = point</code>, type refers to the specific type of monitored points.</li><li><code>status = xxx</code>: The status of the monitored object, which describes the <strong>business logic</strong>. For example, for a metric with <code>Metric Name = Task</code>, this tag distinguishes the status of the monitored object.</li><li><code>user = xxx</code>: The user associated with the monitored object, which describes the <strong>business logic</strong>. 
For example, count the total points written by the <code>root</code> user.</li><li>Custom tags for specific situations: For example, logback_events_total has a level tag, which indicates the number of logs at a specific level.</li></ul></li><li><code>Metric Level</code>: The management level of the metric. The default startup level is <code>Core</code>, the recommended startup level is <code>Important</code>, and the audit strictness is <code>Core > Important > Normal > All</code><ul><li><code>Core</code>: Core metrics of the system, used by <strong>operation and maintenance personnel</strong>, which relate to the <strong>performance, stability, and security</strong> of the system, such as the status of the instance and the load of the system.</li><li><code>Important</code>: Important metrics of a module, used by <strong>operation and maintenance personnel and testers</strong>, which relate directly to <strong>the running status of each module</strong>, such as the number of merged files and the execution status.</li><li><code>Normal</code>: Normal metrics of a module, used by <strong>developers</strong> to help <strong>locate the module</strong> when problems occur, such as the status of specific key operations in a merge.</li><li><code>All</code>: All metrics of a module, used by <strong>module developers</strong>, typically when reproducing a problem so that it can be solved quickly.</li></ul></li></ul><h3 id="_3-2-external-data-format-for-metrics" tabindex="-1"><a class="header-anchor" href="#_3-2-external-data-format-for-metrics"><span>3.2. 
External data format for metrics</span></a></h3><ul><li>IoTDB provides metrics in JMX, Prometheus, and IoTDB formats: <ul><li>For JMX, metrics can be obtained through <code>org.apache.iotdb.metrics</code>.</li><li>For Prometheus, metric values can be obtained through the externally exposed port.</li><li>For IoTDB, metrics can be obtained by executing IoTDB queries.</li></ul></li></ul><h2 id="_4-the-detail-of-metrics" tabindex="-1"><a class="header-anchor" href="#_4-the-detail-of-metrics"><span>4. The details of metrics</span></a></h2><p>Currently, IoTDB exposes metrics for some of its main modules. As new functions are developed and the system is optimized or refactored, metrics will be added and updated accordingly.</p><p>If you want to add your own metrics to IoTDB, please see the <a href="https://github.com/apache/iotdb/tree/master/metrics" target="_blank" rel="noopener noreferrer">IoTDB Metric Framework</a> document.</p><h3 id="_4-1-core-level-metrics" tabindex="-1"><a class="header-anchor" href="#_4-1-core-level-metrics"><span>4.1. Core level metrics</span></a></h3><p>Core-level metrics are enabled by default during system operation. Each new Core-level metric must be carefully evaluated before it is added. The current Core-level metrics are as follows:</p><h4 id="_4-1-1-cluster" tabindex="-1"><a class="header-anchor" href="#_4-1-1-cluster"><span>4.1.1. 
Cluster</span></a></h4>',18)),e("table",null,[t[5]||(t[5]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[t[3]||(t[3]=e("tr",null,[e("td",null,"config_node"),e("td",null,'name="total",status="Registered/Online/Unknown"'),e("td",null,"AutoGauge"),e("td",null,"The number of registered/online/unknown confignodes")],-1)),t[4]||(t[4]=e("tr",null,[e("td",null,"data_node"),e("td",null,'name="total",status="Registered/Online/Unknown"'),e("td",null,"AutoGauge"),e("td",null,"The number of registered/online/unknown datanodes")],-1)),e("tr",null,[t[0]||(t[0]=e("td",null,"points",-1)),e("td",null,'database="'+n(o.database)+'", type="flush"',1),t[1]||(t[1]=e("td",null,"Gauge",-1)),t[2]||(t[2]=e("td",null,"The point number of last flushed memtable",-1))])])]),t[103]||(t[103]=a('<h4 id="_4-1-2-iotdb-process" tabindex="-1"><a class="header-anchor" href="#_4-1-2-iotdb-process"><span>4.1.2. IoTDB process</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>process_cpu_load</td><td>name="process"</td><td>AutoGauge</td><td>The current CPU usage of IoTDB process, Unit: %</td></tr><tr><td>process_cpu_time</td><td>name="process"</td><td>AutoGauge</td><td>The total CPU time occupied of IoTDB process, Unit: ns</td></tr><tr><td>process_max_mem</td><td>name="memory"</td><td>AutoGauge</td><td>The maximum available memory of IoTDB process</td></tr><tr><td>process_total_mem</td><td>name="memory"</td><td>AutoGauge</td><td>The current requested memory for IoTDB process</td></tr><tr><td>process_free_mem</td><td>name="memory"</td><td>AutoGauge</td><td>The free available memory of IoTDB process</td></tr></tbody></table><h4 id="_4-1-3-system" tabindex="-1"><a class="header-anchor" href="#_4-1-3-system"><span>4.1.3. 
System</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>sys_cpu_load</td><td>name="system"</td><td>AutoGauge</td><td>The current CPU usage of system, Unit: %</td></tr><tr><td>sys_cpu_cores</td><td>name="system"</td><td>Gauge</td><td>The available number of CPU cores</td></tr><tr><td>sys_total_physical_memory_size</td><td>name="memory"</td><td>Gauge</td><td>The maximum physical memory of system</td></tr><tr><td>sys_free_physical_memory_size</td><td>name="memory"</td><td>AutoGauge</td><td>The current available memory of system</td></tr><tr><td>sys_total_swap_space_size</td><td>name="memory"</td><td>AutoGauge</td><td>The maximum swap space of system</td></tr><tr><td>sys_free_swap_space_size</td><td>name="memory"</td><td>AutoGauge</td><td>The available swap space of system</td></tr><tr><td>sys_committed_vm_size</td><td>name="memory"</td><td>AutoGauge</td><td>The space of virtual memory available to running processes</td></tr><tr><td>sys_disk_total_space</td><td>name="disk"</td><td>AutoGauge</td><td>The total disk space</td></tr><tr><td>sys_disk_free_space</td><td>name="disk"</td><td>AutoGauge</td><td>The available disk space</td></tr></tbody></table><h3 id="_4-2-important-level-metrics" tabindex="-1"><a class="header-anchor" href="#_4-2-important-level-metrics"><span>4.2. Important level metrics</span></a></h3><h4 id="_4-2-1-cluster" tabindex="-1"><a class="header-anchor" href="#_4-2-1-cluster"><span>4.2.1. 
Cluster</span></a></h4>',6)),e("table",null,[t[12]||(t[12]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[e("tr",null,[t[6]||(t[6]=e("td",null,"cluster_node_leader_count",-1)),e("td",null,'name="'+n(o.ip)+":"+n(o.port)+'"',1),t[7]||(t[7]=e("td",null,"Gauge",-1)),t[8]||(t[8]=e("td",null,"The number of consensus group leaders on each node",-1))]),e("tr",null,[t[9]||(t[9]=e("td",null,"cluster_node_status",-1)),e("td",null,'name="'+n(o.ip)+":"+n(o.port)+'",type="ConfigNode/DataNode"',1),t[10]||(t[10]=e("td",null,"Gauge",-1)),t[11]||(t[11]=e("td",null,"The current node status, 0=Unknown, 1=Online",-1))])])]),t[104]||(t[104]=e("h4",{id:"_4-2-2-node",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-2-2-node"},[e("span",null,"4.2.2. Node")])],-1)),e("table",null,[t[24]||(t[24]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[t[19]||(t[19]=e("tr",null,[e("td",null,"quantity"),e("td",null,'name="database"'),e("td",null,"AutoGauge"),e("td",null,"The number of databases")],-1)),t[20]||(t[20]=e("tr",null,[e("td",null,"quantity"),e("td",null,'name="timeSeries"'),e("td",null,"AutoGauge"),e("td",null,"The number of timeseries")],-1)),t[21]||(t[21]=e("tr",null,[e("td",null,"quantity"),e("td",null,'name="pointsIn"'),e("td",null,"Counter"),e("td",null,"The number of points written")],-1)),t[22]||(t[22]=e("tr",null,[e("td",null,"region"),e("td",null,'name="total",type="SchemaRegion"'),e("td",null,"AutoGauge"),e("td",null,"The total number of SchemaRegion in PartitionTable")],-1)),t[23]||(t[23]=e("tr",null,[e("td",null,"region"),e("td",null,'name="total",type="DataRegion"'),e("td",null,"AutoGauge"),e("td",null,"The total number of DataRegion in 
PartitionTable")],-1)),e("tr",null,[t[13]||(t[13]=e("td",null,"region",-1)),e("td",null,'name="'+n(o.ip)+":"+n(o.port)+'",type="SchemaRegion"',1),t[14]||(t[14]=e("td",null,"Gauge",-1)),t[15]||(t[15]=e("td",null,"The number of SchemaRegion in PartitionTable of specific node",-1))]),e("tr",null,[t[16]||(t[16]=e("td",null,"region",-1)),e("td",null,'name="'+n(o.ip)+":"+n(o.port)+'",type="DataRegion"',1),t[17]||(t[17]=e("td",null,"Gauge",-1)),t[18]||(t[18]=e("td",null,"The number of DataRegion in PartitionTable of specific node",-1))])])]),t[105]||(t[105]=e("h4",{id:"_4-2-3-iotconsensus",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-2-3-iotconsensus"},[e("span",null,"4.2.3. IoTConsensus")])],-1)),e("table",null,[t[58]||(t[58]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[e("tr",null,[t[25]||(t[25]=e("td",null,"mutli_leader",-1)),e("td",null,'name="logDispatcher-'+n(o.IP)+":"+n(o.Port)+'", region="'+n(o.region)+'", type="currentSyncIndex"',1),t[26]||(t[26]=e("td",null,"AutoGauge",-1)),t[27]||(t[27]=e("td",null,"The sync index of synchronization thread in replica group",-1))]),e("tr",null,[t[28]||(t[28]=e("td",null,"mutli_leader",-1)),e("td",null,'name="logDispatcher-'+n(o.IP)+":"+n(o.Port)+'", region="'+n(o.region)+'", type="cachedRequestInMemoryQueue"',1),t[29]||(t[29]=e("td",null,"AutoGauge",-1)),t[30]||(t[30]=e("td",null,"The size of cache requests of synchronization thread in replica group",-1))]),e("tr",null,[t[31]||(t[31]=e("td",null,"mutli_leader",-1)),e("td",null,'name="IoTConsensusServerImpl", region="'+n(o.region)+'", type="searchIndex"',1),t[32]||(t[32]=e("td",null,"AutoGauge",-1)),t[33]||(t[33]=e("td",null,"The write process of main process in replica group",-1))]),e("tr",null,[t[34]||(t[34]=e("td",null,"mutli_leader",-1)),e("td",null,'name="IoTConsensusServerImpl", region="'+n(o.region)+'", 
type="safeIndex"',1),t[35]||(t[35]=e("td",null,"AutoGauge",-1)),t[36]||(t[36]=e("td",null,"The sync index of replica group",-1))]),e("tr",null,[t[37]||(t[37]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="getStateMachineLock"',1),t[38]||(t[38]=e("td",null,"Histogram",-1)),t[39]||(t[39]=e("td",null,"The time consumed to get statemachine lock in main process",-1))]),e("tr",null,[t[40]||(t[40]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="checkingBeforeWrite"',1),t[41]||(t[41]=e("td",null,"Histogram",-1)),t[42]||(t[42]=e("td",null,"The time consumed to precheck before write in main process",-1))]),e("tr",null,[t[43]||(t[43]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="writeStateMachine"',1),t[44]||(t[44]=e("td",null,"Histogram",-1)),t[45]||(t[45]=e("td",null,"The time consumed to write statemachine in main process",-1))]),e("tr",null,[t[46]||(t[46]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="offerRequestToQueue"',1),t[47]||(t[47]=e("td",null,"Histogram",-1)),t[48]||(t[48]=e("td",null,"The time consumed to try to offer request to queue in main process",-1))]),e("tr",null,[t[49]||(t[49]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="consensusWrite"',1),t[50]||(t[50]=e("td",null,"Histogram",-1)),t[51]||(t[51]=e("td",null,"The time consumed to the whole write in main process",-1))]),e("tr",null,[t[52]||(t[52]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", type="constructBatch"',1),t[53]||(t[53]=e("td",null,"Histogram",-1)),t[54]||(t[54]=e("td",null,"The time consumed to construct batch in synchronization thread",-1))]),e("tr",null,[t[55]||(t[55]=e("td",null,"stage",-1)),e("td",null,'name="iot_consensus", region="'+n(o.region)+'", 
type="syncLogTimePerRequest"',1),t[56]||(t[56]=e("td",null,"Histogram",-1)),t[57]||(t[57]=e("td",null,"The time consumed to sync log in asynchronous callback process",-1))])])]),t[106]||(t[106]=a('<h4 id="_4-2-4-cache" tabindex="-1"><a class="header-anchor" href="#_4-2-4-cache"><span>4.2.4. Cache</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>cache_hit</td><td>name="chunk"</td><td>AutoGauge</td><td>The cache hit ratio of ChunkCache, Unit: %</td></tr><tr><td>cache_hit</td><td>name="schema"</td><td>AutoGauge</td><td>The cache hit ratio of SchemaCache, Unit: %</td></tr><tr><td>cache_hit</td><td>name="timeSeriesMeta"</td><td>AutoGauge</td><td>The cache hit ratio of TimeseriesMetadataCache, Unit: %</td></tr><tr><td>cache_hit</td><td>name="bloomFilter"</td><td>AutoGauge</td><td>The interception rate of bloomFilter in TimeseriesMetadataCache, Unit: %</td></tr><tr><td>cache</td><td>name="Database", type="hit"</td><td>Counter</td><td>The number of Database Cache hits</td></tr><tr><td>cache</td><td>name="Database", type="all"</td><td>Counter</td><td>The number of Database Cache accesses</td></tr><tr><td>cache</td><td>name="SchemaPartition", type="hit"</td><td>Counter</td><td>The number of SchemaPartition Cache hits</td></tr><tr><td>cache</td><td>name="SchemaPartition", type="all"</td><td>Counter</td><td>The number of SchemaPartition Cache accesses</td></tr><tr><td>cache</td><td>name="DataPartition", type="hit"</td><td>Counter</td><td>The number of DataPartition Cache hits</td></tr><tr><td>cache</td><td>name="DataPartition", type="all"</td><td>Counter</td><td>The number of DataPartition Cache accesses</td></tr></tbody></table><h4 id="_4-2-5-interface" tabindex="-1"><a class="header-anchor" href="#_4-2-5-interface"><span>4.2.5. 
Interface</span></a></h4>',3)),e("table",null,[t[73]||(t[73]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[e("tr",null,[t[59]||(t[59]=e("td",null,"operation",-1)),e("td",null,'name = "'+n(o.name)+'"',1),t[60]||(t[60]=e("td",null,"Histogram",-1)),t[61]||(t[61]=e("td",null,"The time consumed of operations in client",-1))]),e("tr",null,[t[62]||(t[62]=e("td",null,"entry",-1)),e("td",null,'name="'+n(o.interface)+'"',1),t[63]||(t[63]=e("td",null,"Timer",-1)),t[64]||(t[64]=e("td",null,"The time consumed of thrift operations",-1))]),t[65]||(t[65]=e("tr",null,[e("td",null,"thrift_connections"),e("td",null,'name="ConfigNodeRPC"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift internal connections in ConfigNode")],-1)),t[66]||(t[66]=e("tr",null,[e("td",null,"thrift_connections"),e("td",null,'name="Internal"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift internal connections in DataNode")],-1)),t[67]||(t[67]=e("tr",null,[e("td",null,"thrift_connections"),e("td",null,'name="MPPDataExchange"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift internal connections in MPP")],-1)),t[68]||(t[68]=e("tr",null,[e("td",null,"thrift_connections"),e("td",null,'name="RPC"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift connections of Client")],-1)),t[69]||(t[69]=e("tr",null,[e("td",null,"thrift_active_threads"),e("td",null,'name="ConfigNodeRPC-Service"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift active internal connections in ConfigNode")],-1)),t[70]||(t[70]=e("tr",null,[e("td",null,"thrift_active_threads"),e("td",null,'name="DataNodeInternalRPC-Service"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift active internal connections in DataNode")],-1)),t[71]||(t[71]=e("tr",null,[e("td",null,"thrift_active_threads"),e("td",null,'name="MPPDataExchangeRPC-Service"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift 
active internal connections in MPP")],-1)),t[72]||(t[72]=e("tr",null,[e("td",null,"thrift_active_threads"),e("td",null,'name="ClientRPC-Service"'),e("td",null,"AutoGauge"),e("td",null,"The number of thrift active connections of client")],-1))])]),t[107]||(t[107]=e("h4",{id:"_4-2-6-memory",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-2-6-memory"},[e("span",null,"4.2.6. Memory")])],-1)),e("table",null,[t[83]||(t[83]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[e("tr",null,[t[74]||(t[74]=e("td",null,"mem",-1)),e("td",null,'name="database_'+n(o.name)+'"',1),t[75]||(t[75]=e("td",null,"AutoGauge",-1)),t[76]||(t[76]=e("td",null,"The memory usage of DataRegion in DataNode, Unit: byte",-1))]),e("tr",null,[t[77]||(t[77]=e("td",null,"mem",-1)),e("td",null,'name="chunkMetaData_'+n(o.name)+'"',1),t[78]||(t[78]=e("td",null,"AutoGauge",-1)),t[79]||(t[79]=e("td",null,"The memory usage of chunkMetaData when writing TsFile, Unit: byte",-1))]),t[80]||(t[80]=e("tr",null,[e("td",null,"mem"),e("td",null,'name="IoTConsensus"'),e("td",null,"AutoGauge"),e("td",null,"The memory usage of IoTConsensus, Unit: byte")],-1)),t[81]||(t[81]=e("tr",null,[e("td",null,"mem"),e("td",null,'name="schema_region_total_usage"'),e("td",null,"AutoGauge"),e("td",null,"The memory usage of all SchemaRegion, Unit: byte")],-1)),t[82]||(t[82]=e("tr",null,[e("td",null,"mem"),e("td",null,'name="schema_region_total_remaining"'),e("td",null,"AutoGauge"),e("td",null,"The memory remaining for all SchemaRegion, Unit: byte")],-1))])]),t[108]||(t[108]=a('<h4 id="_4-2-7-task" tabindex="-1"><a class="header-anchor" href="#_4-2-7-task"><span>4.2.7. 
Task</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>queue</td><td>name="compaction_inner", status="running/waiting"</td><td>Gauge</td><td>The number of inner compaction tasks</td></tr><tr><td>queue</td><td>name="compaction_cross", status="running/waiting"</td><td>Gauge</td><td>The number of cross compaction tasks</td></tr><tr><td>cost_task</td><td>name="inner_compaction/cross_compaction/flush"</td><td>Gauge</td><td>The time consumed by compaction and flush tasks</td></tr><tr><td>queue</td><td>name="flush",status="running/waiting"</td><td>AutoGauge</td><td>The number of flush tasks</td></tr><tr><td>queue</td><td>name="Sub_RawQuery",status="running/waiting"</td><td>AutoGauge</td><td>The number of Sub_RawQuery</td></tr></tbody></table><h4 id="_4-2-8-compaction" tabindex="-1"><a class="header-anchor" href="#_4-2-8-compaction"><span>4.2.8. Compaction</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>data_written</td><td>name="compaction", type="aligned/not-aligned/total"</td><td>Counter</td><td>The size of data written by compaction</td></tr><tr><td>data_read</td><td>name="compaction"</td><td>Counter</td><td>The size of data read by compaction</td></tr><tr><td>compaction_task_count</td><td>name = "inner_compaction", type="sequence"</td><td>Counter</td><td>The number of inner sequence compaction tasks</td></tr><tr><td>compaction_task_count</td><td>name = "inner_compaction", type="unsequence"</td><td>Counter</td><td>The number of inner unsequence compaction tasks</td></tr><tr><td>compaction_task_count</td><td>name = "cross_compaction", type="cross"</td><td>Counter</td><td>The number of cross compaction tasks</td></tr></tbody></table><h4 id="_4-2-9-file" tabindex="-1"><a class="header-anchor" href="#_4-2-9-file"><span>4.2.9. 
File</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>file_size</td><td>name="wal"</td><td>AutoGauge</td><td>The size of WAL files, Unit: byte</td></tr><tr><td>file_size</td><td>name="seq"</td><td>AutoGauge</td><td>The size of sequence TsFiles, Unit: byte</td></tr><tr><td>file_size</td><td>name="unseq"</td><td>AutoGauge</td><td>The size of unsequence TsFiles, Unit: byte</td></tr><tr><td>file_size</td><td>name="inner-seq-temp"</td><td>AutoGauge</td><td>The size of inner sequence space compaction temporary files</td></tr><tr><td>file_size</td><td>name="inner-unseq-temp"</td><td>AutoGauge</td><td>The size of inner unsequence space compaction temporary files</td></tr><tr><td>file_size</td><td>name="cross-temp"</td><td>AutoGauge</td><td>The size of cross space compaction temporary files</td></tr><tr><td>file_size</td><td>name="mods"</td><td>AutoGauge</td><td>The size of modification files</td></tr><tr><td>file_count</td><td>name="wal"</td><td>AutoGauge</td><td>The count of WAL files</td></tr><tr><td>file_count</td><td>name="seq"</td><td>AutoGauge</td><td>The count of sequence TsFiles</td></tr><tr><td>file_count</td><td>name="unseq"</td><td>AutoGauge</td><td>The count of unsequence TsFiles</td></tr><tr><td>file_count</td><td>name="inner-seq-temp"</td><td>AutoGauge</td><td>The count of inner sequence space compaction temporary files</td></tr><tr><td>file_count</td><td>name="inner-unseq-temp"</td><td>AutoGauge</td><td>The count of inner unsequence space compaction temporary files</td></tr><tr><td>file_count</td><td>name="cross-temp"</td><td>AutoGauge</td><td>The count of cross space compaction temporary files</td></tr><tr><td>file_count</td><td>name="open_file_handlers"</td><td>AutoGauge</td><td>The count of open files of the IoTDB process, only supports Linux and MacOS</td></tr><tr><td>file_count</td><td>name="mods"</td><td>AutoGauge</td><td>The count of modification files</td></tr></tbody></table><h4 
id="_4-2-10-iotdb-process" tabindex="-1"><a class="header-anchor" href="#_4-2-10-iotdb-process"><span>4.2.10. IoTDB Process</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>process_used_mem</td><td>name="memory"</td><td>AutoGauge</td><td>The used memory of IoTDB process</td></tr><tr><td>process_mem_ratio</td><td>name="memory"</td><td>AutoGauge</td><td>The used memory ratio of IoTDB process</td></tr><tr><td>process_threads_count</td><td>name="process"</td><td>AutoGauge</td><td>The number of thread of IoTDB process</td></tr><tr><td>process_status</td><td>name="process"</td><td>AutoGauge</td><td>The status of IoTDB process, 1=live, 0=dead</td></tr></tbody></table><h4 id="_4-2-11-log" tabindex="-1"><a class="header-anchor" href="#_4-2-11-log"><span>4.2.11. Log</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>logback_events</td><td>level="trace/debug/info/warn/error"</td><td>Counter</td><td>The number of log events</td></tr></tbody></table><h4 id="_4-2-12-jvm-thread" tabindex="-1"><a class="header-anchor" href="#_4-2-12-jvm-thread"><span>4.2.12. JVM Thread</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>jvm_threads_live_threads</td><td></td><td>AutoGauge</td><td>The number of live thread</td></tr><tr><td>jvm_threads_daemon_threads</td><td></td><td>AutoGauge</td><td>The number of daemon thread</td></tr><tr><td>jvm_threads_peak_threads</td><td></td><td>AutoGauge</td><td>The number of peak thread</td></tr><tr><td>jvm_threads_states_threads</td><td>state="runnable/blocked/waiting/timed-waiting/new/terminated"</td><td>AutoGauge</td><td>The number of thread in different states</td></tr></tbody></table><h4 id="_4-2-13-jvm-gc" tabindex="-1"><a class="header-anchor" href="#_4-2-13-jvm-gc"><span>4.2.13. 
JVM GC</span></a></h4>',13)),e("table",null,[t[94]||(t[94]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[t[87]||(t[87]=e("tr",null,[e("td",null,"jvm_gc_pause"),e("td",null,'action="end of major GC/end of minor GC",cause="xxxx"'),e("td",null,"Timer"),e("td",null,"The number and time consumed of Young GC/Full Gc caused by different reason")],-1)),t[88]||(t[88]=e("tr",null,[e("td"),e("td"),e("td"),e("td")],-1)),e("tr",null,[t[84]||(t[84]=e("td",null,"jvm_gc_concurrent_phase_time",-1)),e("td",null,'action="'+n(o.action)+'",cause="'+n(o.cause)+'"',1),t[85]||(t[85]=e("td",null,"Timer",-1)),t[86]||(t[86]=e("td",null,"The number and time consumed of Young GC/Full Gc caused by different",-1))]),t[89]||(t[89]=e("tr",null,[e("td"),e("td"),e("td"),e("td")],-1)),t[90]||(t[90]=e("tr",null,[e("td",null,"jvm_gc_max_data_size_bytes"),e("td"),e("td",null,"AutoGauge"),e("td",null,"The historical maximum value of old memory")],-1)),t[91]||(t[91]=e("tr",null,[e("td",null,"jvm_gc_live_data_size_bytes"),e("td"),e("td",null,"AutoGauge"),e("td",null,"The usage of old memory")],-1)),t[92]||(t[92]=e("tr",null,[e("td",null,"jvm_gc_memory_promoted_bytes"),e("td"),e("td",null,"Counter"),e("td",null,"The accumulative value of positive memory growth of old memory")],-1)),t[93]||(t[93]=e("tr",null,[e("td",null,"jvm_gc_memory_allocated_bytes"),e("td"),e("td",null,"Counter"),e("td",null,"The accumulative value of positive memory growth of allocated memory")],-1))])]),t[109]||(t[109]=e("h4",{id:"_4-2-14-jvm-memory",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-2-14-jvm-memory"},[e("span",null,"4.2.14. 
JVM Memory")])],-1)),t[110]||(t[110]=e("table",null,[e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])]),e("tbody",null,[e("tr",null,[e("td",null,"jvm_buffer_memory_used_bytes"),e("td",null,'id="direct/mapped"'),e("td",null,"AutoGauge"),e("td",null,"The used size of buffer")]),e("tr",null,[e("td",null,"jvm_buffer_total_capacity_bytes"),e("td",null,'id="direct/mapped"'),e("td",null,"AutoGauge"),e("td",null,"The max size of buffer")]),e("tr",null,[e("td",null,"jvm_buffer_count_buffers"),e("td",null,'id="direct/mapped"'),e("td",null,"AutoGauge"),e("td",null,"The number of buffer")]),e("tr",null,[e("td",null,"jvm_memory_committed_bytes"),e("td",{area:'heap/nonheap,id="xxx",'}),e("td",null,"AutoGauge"),e("td",null,"The committed memory of JVM")]),e("tr",null,[e("td",null,"jvm_memory_max_bytes"),e("td",{area:'heap/nonheap,id="xxx",'}),e("td",null,"AutoGauge"),e("td",null,"The max memory of JVM")]),e("tr",null,[e("td",null,"jvm_memory_used_bytes"),e("td",{area:'heap/nonheap,id="xxx",'}),e("td",null,"AutoGauge"),e("td",null,"The used memory of JVM")])])],-1)),t[111]||(t[111]=a('<h4 id="_4-2-15-jvm-class" tabindex="-1"><a class="header-anchor" href="#_4-2-15-jvm-class"><span>4.2.15. JVM Class</span></a></h4><table><thead><tr><th>Metric</th><th>Tags</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td>jvm_classes_unloaded_classes</td><td></td><td>AutoGauge</td><td>The number of unloaded class</td></tr><tr><td>jvm_classes_loaded_classes</td><td></td><td>AutoGauge</td><td>The number of loaded class</td></tr></tbody></table><h4 id="_4-2-16-jvm-compilation" tabindex="-1"><a class="header-anchor" href="#_4-2-16-jvm-compilation"><span>4.2.16. 
JVM Compilation</span></a></h4>',3)),t[112]||(t[112]=e("table",null,[e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])]),e("tbody",null,[e("tr",null,[e("td",null,"jvm_compilation_time_ms"),e("td",{compiler:"HotSpot 64-Bit Tiered Compilers,"}),e("td",null,"AutoGauge"),e("td",null,"The time consumed in compilation")])])],-1)),t[113]||(t[113]=e("h3",{id:"_4-3-normal-level-metrics",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-3-normal-level-metrics"},[e("span",null,"4.3. Normal level Metrics")])],-1)),t[114]||(t[114]=e("h4",{id:"_4-3-1-cluster",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#_4-3-1-cluster"},[e("span",null,"4.3.1. Cluster")])],-1)),e("table",null,[t[101]||(t[101]=e("thead",null,[e("tr",null,[e("th",null,"Metric"),e("th",null,"Tags"),e("th",null,"Type"),e("th",null,"Description")])],-1)),e("tbody",null,[e("tr",null,[t[95]||(t[95]=e("td",null,"region",-1)),e("td",null,'name="'+n(o.DatabaseName)+'",type="SchemaRegion/DataRegion"',1),t[96]||(t[96]=e("td",null,"AutoGauge",-1)),t[97]||(t[97]=e("td",null,"The number of DataRegion/SchemaRegion of database in specific node",-1))]),e("tr",null,[t[98]||(t[98]=e("td",null,"slot",-1)),e("td",null,'name="'+n(o.DatabaseName)+'",type="schemaSlotNumber/dataSlotNumber"',1),t[99]||(t[99]=e("td",null,"AutoGauge",-1)),t[100]||(t[100]=e("td",null,"The number of DataSlot/SchemaSlot of database in specific node",-1))])])]),t[115]||(t[115]=a(`<h3 id="_4-4-all-metric" tabindex="-1"><a class="header-anchor" href="#_4-4-all-metric"><span>4.4. All Metric</span></a></h3><p>Currently there is no All level metrics, and it will continue to be added in the future.</p><h2 id="_5-how-to-get-these-metrics" tabindex="-1"><a class="header-anchor" href="#_5-how-to-get-these-metrics"><span>5. 
How to get these metrics?</span></a></h2><p>The metric module is configured in <code>conf/iotdb-{datanode/confignode}.properties</code>, and all configuration items support hot loading through the <code>load configuration</code> command.</p><h3 id="_5-1-jmx" tabindex="-1"><a class="header-anchor" href="#_5-1-jmx"><span>5.1. JMX</span></a></h3><p>Metrics exposed through JMX can be viewed with JConsole. After opening the JConsole monitoring<br> page, you first see an overview of the running state of IoTDB, including heap memory information,<br> thread information, class information, and the server's CPU usage.</p><h4 id="_5-1-1-obtain-metric-data" tabindex="-1"><a class="header-anchor" href="#_5-1-1-obtain-metric-data"><span>5.1.1. Obtain metric data</span></a></h4><p>After connecting to JMX, you can find the MBean named "org.apache.iotdb.metrics" on the "MBeans" tab and<br> view the specific values of all monitoring metrics in the sidebar.</p><img style="width:100%;max-width:800px;max-height:600px;margin-left:auto;margin-right:auto;display:block;" alt="metric-jmx" src="https://alioss.timecho.com/docs/img/github/204018765-6fda9391-ebcf-4c80-98c5-26f34bd74df0.png"><h4 id="_5-1-2-get-other-relevant-data" tabindex="-1"><a class="header-anchor" href="#_5-1-2-get-other-relevant-data"><span>5.1.2. Get other relevant data</span></a></h4><p>After connecting to JMX, you can find the MBean named "org.apache.iotdb.service" on the "MBeans" tab, as shown in<br> the image below, to check the basic status of the service.</p><p><img style="width:100%;max-width:800px;max-height:600px;margin-left:auto;margin-right:auto;display:block;" src="https://alioss.timecho.com/docs/img/github/149951720-707f1ee8-32ee-4fde-9252-048caebd232e.png"> <br></p><p>To improve query performance, IoTDB caches ChunkMetaData and TsFileMetaData. 
Users can view the cache hit ratio through the MXBean by expanding <code>org.apache.iotdb.db.service</code> in the sidebar:</p><img style="width:100%;max-width:800px;max-height:600px;margin-left:auto;margin-right:auto;display:block;" src="https://alioss.timecho.com/docs/img/github/112426760-73e3da80-8d73-11eb-9a8f-9232d1f2033b.png"><h3 id="_5-2-prometheus" tabindex="-1"><a class="header-anchor" href="#_5-2-prometheus"><span>5.2. Prometheus</span></a></h3><h4 id="_5-2-1-the-mapping-from-metric-type-to-prometheus-forma" tabindex="-1"><a class="header-anchor" href="#_5-2-1-the-mapping-from-metric-type-to-prometheus-forma"><span>5.2.1. The mapping from metric type to Prometheus format</span></a></h4><blockquote><p>For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a<br> specific value.</p></blockquote><table><thead><tr><th>Metric Type</th><th>Mapping</th></tr></thead><tbody><tr><td>Counter</td><td>name_total{K1="V1", ..., Kn="Vn"} value</td></tr><tr><td>AutoGauge, Gauge</td><td>name{K1="V1", ..., Kn="Vn"} value</td></tr><tr><td>Histogram</td><td>name_max{K1="V1", ..., Kn="Vn"} value <br> name_sum{K1="V1", ..., Kn="Vn"} value <br> name_count{K1="V1", ..., Kn="Vn"} value <br> name{K1="V1", ..., Kn="Vn", quantile="0.0"} value <br> name{K1="V1", ..., Kn="Vn", quantile="0.25"} value <br> name{K1="V1", ..., Kn="Vn", quantile="0.5"} value <br> name{K1="V1", ..., Kn="Vn", quantile="0.75"} value <br> name{K1="V1", ..., Kn="Vn", quantile="1.0"} value</td></tr><tr><td>Rate</td><td>name_total{K1="V1", ..., Kn="Vn"} value <br> name_total{K1="V1", ..., Kn="Vn", rate="m1"} value <br> name_total{K1="V1", ..., Kn="Vn", rate="m5"} value <br> name_total{K1="V1", ..., Kn="Vn", rate="m15"} value <br> name_total{K1="V1", ..., Kn="Vn", rate="mean"} 
value</td></tr><tr><td>Timer</td><td>name_seconds_max{K1="V1", ..., Kn="Vn"} value <br> name_seconds_sum{K1="V1", ..., Kn="Vn"} value <br> name_seconds_count{K1="V1", ..., Kn="Vn"} value <br> name_seconds{K1="V1", ..., Kn="Vn", quantile="0.0"} value <br> name_seconds{K1="V1", ..., Kn="Vn", quantile="0.25"} value <br> name_seconds{K1="V1", ..., Kn="Vn", quantile="0.5"} value <br> name_seconds{K1="V1", ..., Kn="Vn", quantile="0.75"} value <br> name_seconds{K1="V1", ..., Kn="Vn", quantile="1.0"} value</td></tr></tbody></table><h4 id="_5-2-2-config-file" tabindex="-1"><a class="header-anchor" href="#_5-2-2-config-file"><span>5.2.2. Config File</span></a></h4><ol><li>Taking DataNode as an example, modify the iotdb-datanode.properties configuration file as follows:</li></ol><div class="language-properties line-numbers-mode" data-highlighter="prismjs" data-ext="properties" data-title="properties"><pre><code><span class="line"><span class="token key attr-name">dn_metric_reporter_list</span><span class="token punctuation">=</span><span class="token value attr-value">PROMETHEUS</span></span> |