knox-site/docs/dev-guide/knox_monitoring_api.md - knox - Git at Google

 <!--
    Licensed to the Apache Software Foundation (ASF) under one or more
    contributor license agreements.  See the NOTICE file distributed with
    this work for additional information regarding copyright ownership.
    The ASF licenses this file to You under the Apache License, Version 2.0
    (the "License"); you may not use this file except in compliance with
    the License.  You may obtain a copy of the License at

        https://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
 -->
 Health Monitoring REST API
 ===

 Knox provides REST-ful API for monitoring the core service. It primarily exposes the health of the Knox service that includes service status (up/down) as well as other health metrics. This is a work-in-progress feature, which started with an extensible framework to support basic functionalities. In particular, it currently supports the API to  A) *ping* the service and B) time-based statistics related to all API calls.

 #### Health Monitoring Setup
 The basic setup includes two major steps A) add configurations to enable the metrics collection and reporting B) write a topology file and upload it into *topologies* directory.

 ##### Service Configurations
 At first, we need to make sure the gateway configurations to gather and report to JMX are turned on in *gateway-site.xml*. The following two configurations into *gateway-site.xml* will serve the purpose.

 ```
 <property>
    <name>gateway.metrics.enabled</name>
    <value>true</value>
    <description>Boolean flag indicates whether to enable the metrics collection</description>
 </property>
 <property>
    <name>gateway.jmx.metrics.reporting.enabled</name>
    <value>true</value>
    <description>Boolean flag indicates whether to enable the metrics reporting using JMX</description>
 </property>

 ```

 ##### health.xml Topology
 In order to enable health monitoring REST service, you need to add a new topology file (i.e. *health.xml*). The following is an example that is configured to test the basic functionalities of Knox service. It is highly recommended using more restricted authentication mechanism.

 ```
 <topology>

     <gateway>

         <provider>
             <role>authentication</role>
             <name>ShiroProvider</name>
             <enabled>true</enabled>
             <param>
                 <!--
                 session timeout in minutes,  this is really idle timeout,
                 defaults to 30 mins, if the property value is not defined,,
                 current client authentication would expire if client idles continuously for more than this value
                 -->
                 <name>sessionTimeout</name>
                 <value>30</value>
             </param>
             <param>
                 <name>main.ldapRealm</name>
                 <value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value>
             </param>
             <param>
                 <name>main.ldapContextFactory</name>
                 <value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value>
             </param>
             <param>
                 <name>main.ldapRealm.contextFactory</name>
                 <value>$ldapContextFactory</value>
             </param>
             <param>
                 <name>main.ldapRealm.userDnTemplate</name>
                 <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
             </param>
             <param>
                 <name>main.ldapRealm.contextFactory.url</name>
                 <value>ldap://localhost:33389</value>
             </param>
             <param>
                 <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
                 <value>simple</value>
             </param>
             <param>
                 <name>urls./**</name>
                 <value>authcBasic</value>
             </param>
         </provider>

         <provider>
             <role>authorization</role>
             <name>AclsAuthz</name>
             <enabled>false</enabled>
             <param>
                 <name>knox.acl</name>
                 <value>admin;*;*</value>
             </param>
         </provider>

         <provider>
             <role>identity-assertion</role>
             <name>Default</name>
             <enabled>false</enabled>
         </provider>

         <provider>
             <role>hostmap</role>
             <name>static</name>
             <enabled>true</enabled>
             <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
         </provider>

     </gateway>

     <service>
         <role>HEALTH</role>
     </service>

 </topology>
 ```

 Just as with any Knox service, the gateway providers protect the health monitoring REST service defined above it. In this case, the ShiroProvider is taking care of HTTP Basic Auth using LDAP. Once the user authenticates with LDAP, the request processing continues to the *Health* service that will perform the necessary actions.

 The authenticate/federation provider can be swapped out to fit your deployment environment.

 After creating the file health.xml with above contents, you need to copy the file to *KNOX_HOME/conf/topologies* directory. If Knox/gateway service is not running, you can start it using "*bin/gateway.sh start*". Otherwise the service would automatically pick this new '*health*' service. When gateway service registers the new service, it displays the following log messages in *log/gateway.log*.

 ```
 2017-08-22 03:44:25,045 INFO  knox.gateway (GatewayServer.java:handleCreateDeployment(677)) - Deploying topology health to /home/joe/knox/knox-0.12.0/bin/../data/deployments/health.topo.15e080a91c0
 2017-08-22 03:44:25,045 INFO  knox.gateway (GatewayServer.java:internalDeactivateTopology(596)) - Deactivating topology health
 2017-08-22 03:44:25,119 INFO  knox.gateway (DefaultGatewayServices.java:initializeContribution(197)) - Creating credential store for the cluster: health
 2017-08-22 03:44:25,142 INFO  knox.gateway (GatewayServer.java:internalActivateTopology(566)) - Activating topology health
 2017-08-22 03:44:25,142 INFO  knox.gateway (GatewayServer.java:internalActivateArchive(576)) - Activating topology health archive %2F

 ```
 ##### Verify

 Once the health service is active, you can verify it by using the following *curl* command. The '*ping*' end point displays if the service is up. This end point can be utilized for monitoring the basic health of a Knox service.

 ```
 $ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/ping'
 HTTP/1.1 200 OK
 Date: Tue, 22 Aug 2017 07:09:37 GMT
 Set-Cookie: JSESSIONID=1o82bcvoqbhbb1apt7zs8ubybb;Path=/gateway/health;Secure;HttpOnly
 Expires: Thu, 01 Jan 1970 00:00:00 GMT
 Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:09:37 GMT
 Cache-Control: must-revalidate,no-cache,no-store
 Content-Type: text/plain; charset=ISO-8859-1
 Content-Length: 3
 Server: Jetty(9.2.15.v20160210)

 OK
 ```

 To retrieve the meaningful metrics details of various service calls, you may need to run multiple REST calls such as the followings. After that, execute the metrics REST call as shown below with a sample output. As shown, metrics output is returned in JSON format.

 ```
 curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'
 ```
 ```
 $ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/metrics?pretty=true'
 HTTP/1.1 200 OK
 Date: Tue, 22 Aug 2017 07:10:44 GMT
 Set-Cookie: JSESSIONID=kqntcdaje9uai3pup7ffvfw4;Path=/gateway/health;Secure;HttpOnly
 Expires: Thu, 01 Jan 1970 00:00:00 GMT
 Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:10:44 GMT
 Content-Type: application/json
 Cache-Control: must-revalidate,no-cache,no-store
 Transfer-Encoding: chunked
 Server: Jetty(9.2.15.v20160210)

 {
   "version" : "3.0.0",
   "gauges" : { },
   "counters" : { },
   "histograms" : { },
   "meters" : { },
   "timers" : {
     "client./gateway/health/v1/metrics.GET-requests" : {
       "count" : 5,
       "max" : 0.624587973,
       "mean" : 0.027655743001736188,
       "min" : 0.006145587,
       "p50" : 0.010020548,
       "p75" : 0.010020548,
       "p95" : 0.074454725,
       "p98" : 0.624587973,
       "p99" : 0.624587973,
       "p999" : 0.624587973,
       "stddev" : 0.0929226225229978,
       "m15_rate" : 2.657500857422334E-7,
       "m1_rate" : 5.770087852901534E-89,
       "m5_rate" : 4.769163772973399E-19,
       "mean_rate" : 4.0952378345310894E-4,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/health/v1/ping.GET-requests" : {
       "count" : 1,
       "max" : 0.017257638000000002,
       "mean" : 0.017257638000000002,
       "min" : 0.017257638000000002,
       "p50" : 0.017257638000000002,
       "p75" : 0.017257638000000002,
       "p95" : 0.017257638000000002,
       "p98" : 0.017257638000000002,
       "p99" : 0.017257638000000002,
       "p999" : 0.017257638000000002,
       "stddev" : 0.0,
       "m15_rate" : 0.18710139700632353,
       "m1_rate" : 0.0735758882342885,
       "m5_rate" : 0.1637461506155964,
       "mean_rate" : 0.014990517517814805,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/sandbox/health/v1/.GET-requests" : {
       "count" : 1,
       "max" : 4.01873E-4,
       "mean" : 4.01873E-4,
       "min" : 4.01873E-4,
       "p50" : 4.01873E-4,
       "p75" : 4.01873E-4,
       "p95" : 4.01873E-4,
       "p98" : 4.01873E-4,
       "p99" : 4.01873E-4,
       "p999" : 4.01873E-4,
       "stddev" : 0.0,
       "m15_rate" : 2.536740427767808E-7,
       "m1_rate" : 7.074903404511115E-90,
       "m5_rate" : 4.081014139447941E-19,
       "mean_rate" : 8.179827684854002E-5,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/sandbox/v1/health/.GET-requests" : {
       "count" : 1,
       "max" : 5.470700000000001E-4,
       "mean" : 5.470700000000001E-4,
       "min" : 5.470700000000001E-4,
       "p50" : 5.470700000000001E-4,
       "p75" : 5.470700000000001E-4,
       "p95" : 5.470700000000001E-4,
       "p98" : 5.470700000000001E-4,
       "p99" : 5.470700000000001E-4,
       "p999" : 5.470700000000001E-4,
       "stddev" : 0.0,
       "m15_rate" : 2.413022137213267E-7,
       "m1_rate" : 3.341947732164585E-90,
       "m5_rate" : 3.512561421726287E-19,
       "mean_rate" : 8.149518570285245E-5,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
       "count" : 4,
       "max" : 0.463745401,
       "mean" : 0.024924118143299912,
       "min" : 0.016542244,
       "p50" : 0.024799078000000002,
       "p75" : 0.033933548,
       "p95" : 0.033933548,
       "p98" : 0.033933548,
       "p99" : 0.033933548,
       "p999" : 0.033933548,
       "stddev" : 0.007284773511002474,
       "m15_rate" : 2.120680068580741E-8,
       "m1_rate" : 4.7541228609699333E-91,
       "m5_rate" : 1.5806080232092864E-20,
       "mean_rate" : 2.7314359915623396E-4,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "service./gateway/sandbox/webhdfs/v1/.get-requests" : {
       "count" : 3,
       "max" : 0.014635496000000001,
       "mean" : 0.00342438191233768,
       "min" : 0.0020088890000000002,
       "p50" : 0.0020088890000000002,
       "p75" : 0.005144646,
       "p95" : 0.005144646,
       "p98" : 0.005144646,
       "p99" : 0.005144646,
       "p999" : 0.005144646,
       "stddev" : 0.0015604555820128599,
       "m15_rate" : 1.9913776931949195E-8,
       "m1_rate" : 3.1334281325640874E-91,
       "m5_rate" : 1.055281734633953E-20,
       "mean_rate" : 2.0486339070804923E-4,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     }
   }
 }
 ```
 #### REST End Points
 As mentioned above, currently Knox provides a few monitoring APIs to start with. The list will gradually grow to support new use-cases.
 ##### /ping
 This end-point can be used to determine if a Knox gateway service is alive or not. It is useful for basic health monitoring of the core service. Although most of the results of REST calls are in JSON format, this one (*/ping*) is in plain text.

 Sample response

 ```
 OK
 ```

 ##### /metrics
 This end-point returns all Knox metrics grouped by individual call type. For example, timer metrics for all *webhdfs* calls are aggregated into one set of metrics and then returned in a separate JSON element. This end-point also supports an option (*/metrics?pretty=true*) to pretty print the metrics output.

 A sample response with *pretty=true* is shown below:

 ```
 {
   "version" : "3.0.0",
   "gauges" : { },
   "counters" : { },
   "histograms" : { },
   "meters" : { },
   "timers" : {
     "client./gateway/health/v1/ping.GET-requests" : {
       "count" : 1,
       "max" : 0.017257638000000002,
       "mean" : 0.017257638000000002,
       "min" : 0.017257638000000002,
       "p50" : 0.017257638000000002,
       "p75" : 0.017257638000000002,
       "p95" : 0.017257638000000002,
       "p98" : 0.017257638000000002,
       "p99" : 0.017257638000000002,
       "p999" : 0.017257638000000002,
       "stddev" : 0.0,
       "m15_rate" : 0.18710139700632353,
       "m1_rate" : 0.0735758882342885,
       "m5_rate" : 0.1637461506155964,
       "mean_rate" : 0.014990517517814805,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/sandbox/v1/health/.GET-requests" : {
       "count" : 1,
       "max" : 5.470700000000001E-4,
       "mean" : 5.470700000000001E-4,
       "min" : 5.470700000000001E-4,
       "p50" : 5.470700000000001E-4,
       "p75" : 5.470700000000001E-4,
       "p95" : 5.470700000000001E-4,
       "p98" : 5.470700000000001E-4,
       "p99" : 5.470700000000001E-4,
       "p999" : 5.470700000000001E-4,
       "stddev" : 0.0,
       "m15_rate" : 2.413022137213267E-7,
       "m1_rate" : 3.341947732164585E-90,
       "m5_rate" : 3.512561421726287E-19,
       "mean_rate" : 8.149518570285245E-5,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     },
     "client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
       "count" : 4,
       "max" : 0.463745401,
       "mean" : 0.024924118143299912,
       "min" : 0.016542244,
       "p50" : 0.024799078000000002,
       "p75" : 0.033933548,
       "p95" : 0.033933548,
       "p98" : 0.033933548,
       "p99" : 0.033933548,
       "p999" : 0.033933548,
       "stddev" : 0.007284773511002474,
       "m15_rate" : 2.120680068580741E-8,
       "m1_rate" : 4.7541228609699333E-91,
       "m5_rate" : 1.5806080232092864E-20,
       "mean_rate" : 2.7314359915623396E-4,
       "duration_units" : "seconds",
       "rate_units" : "calls/second"
     }
   }
 }
 ```
	<!--
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	https://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->
	Health Monitoring REST API
	===

	Knox provides REST-ful API for monitoring the core service. It primarily exposes the health of the Knox service that includes service status (up/down) as well as other health metrics. This is a work-in-progress feature, which started with an extensible framework to support basic functionalities. In particular, it currently supports the API to A) ping the service and B) time-based statistics related to all API calls.

	#### Health Monitoring Setup
	The basic setup includes two major steps A) add configurations to enable the metrics collection and reporting B) write a topology file and upload it into topologies directory.

	##### Service Configurations
	At first, we need to make sure the gateway configurations to gather and report to JMX are turned on in gateway-site.xml. The following two configurations into gateway-site.xml will serve the purpose.

	```
	<property>
	<name>gateway.metrics.enabled</name>
	<value>true</value>
	<description>Boolean flag indicates whether to enable the metrics collection</description>
	</property>
	<property>
	<name>gateway.jmx.metrics.reporting.enabled</name>
	<value>true</value>
	<description>Boolean flag indicates whether to enable the metrics reporting using JMX</description>
	</property>

	```

	##### health.xml Topology
	In order to enable health monitoring REST service, you need to add a new topology file (i.e. health.xml). The following is an example that is configured to test the basic functionalities of Knox service. It is highly recommended using more restricted authentication mechanism.

	```
	<topology>

	<gateway>

	<provider>
	<role>authentication</role>
	<name>ShiroProvider</name>
	<enabled>true</enabled>
	<param>
	<!--
	session timeout in minutes, this is really idle timeout,
	defaults to 30 mins, if the property value is not defined,,
	current client authentication would expire if client idles continuously for more than this value
	-->
	<name>sessionTimeout</name>
	<value>30</value>
	</param>
	<param>
	<name>main.ldapRealm</name>
	<value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value>
	</param>
	<param>
	<name>main.ldapContextFactory</name>
	<value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value>
	</param>
	<param>
	<name>main.ldapRealm.contextFactory</name>
	<value>$ldapContextFactory</value>
	</param>
	<param>
	<name>main.ldapRealm.userDnTemplate</name>
	<value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
	</param>
	<param>
	<name>main.ldapRealm.contextFactory.url</name>
	<value>ldap://localhost:33389</value>
	</param>
	<param>
	<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
	<value>simple</value>
	</param>
	<param>
	<name>urls./**</name>
	<value>authcBasic</value>
	</param>
	</provider>

	<provider>
	<role>authorization</role>
	<name>AclsAuthz</name>
	<enabled>false</enabled>
	<param>
	<name>knox.acl</name>
	<value>admin;;</value>
	</param>
	</provider>

	<provider>
	<role>identity-assertion</role>
	<name>Default</name>
	<enabled>false</enabled>
	</provider>

	<provider>
	<role>hostmap</role>
	<name>static</name>
	<enabled>true</enabled>
	<param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
	</provider>

	</gateway>

	<service>
	<role>HEALTH</role>
	</service>

	</topology>
	```

	Just as with any Knox service, the gateway providers protect the health monitoring REST service defined above it. In this case, the ShiroProvider is taking care of HTTP Basic Auth using LDAP. Once the user authenticates with LDAP, the request processing continues to the Health service that will perform the necessary actions.

	The authenticate/federation provider can be swapped out to fit your deployment environment.

	After creating the file health.xml with above contents, you need to copy the file to KNOX_HOME/conf/topologies directory. If Knox/gateway service is not running, you can start it using "bin/gateway.sh start". Otherwise the service would automatically pick this new 'health' service. When gateway service registers the new service, it displays the following log messages in log/gateway.log.

	```
	2017-08-22 03:44:25,045 INFO knox.gateway (GatewayServer.java:handleCreateDeployment(677)) - Deploying topology health to /home/joe/knox/knox-0.12.0/bin/../data/deployments/health.topo.15e080a91c0
	2017-08-22 03:44:25,045 INFO knox.gateway (GatewayServer.java:internalDeactivateTopology(596)) - Deactivating topology health
	2017-08-22 03:44:25,119 INFO knox.gateway (DefaultGatewayServices.java:initializeContribution(197)) - Creating credential store for the cluster: health
	2017-08-22 03:44:25,142 INFO knox.gateway (GatewayServer.java:internalActivateTopology(566)) - Activating topology health
	2017-08-22 03:44:25,142 INFO knox.gateway (GatewayServer.java:internalActivateArchive(576)) - Activating topology health archive %2F

	```
	##### Verify

	Once the health service is active, you can verify it by using the following curl command. The 'ping' end point displays if the service is up. This end point can be utilized for monitoring the basic health of a Knox service.

	```
	$ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/ping'
	HTTP/1.1 200 OK
	Date: Tue, 22 Aug 2017 07:09:37 GMT
	Set-Cookie: JSESSIONID=1o82bcvoqbhbb1apt7zs8ubybb;Path=/gateway/health;Secure;HttpOnly
	Expires: Thu, 01 Jan 1970 00:00:00 GMT
	Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:09:37 GMT
	Cache-Control: must-revalidate,no-cache,no-store
	Content-Type: text/plain; charset=ISO-8859-1
	Content-Length: 3
	Server: Jetty(9.2.15.v20160210)

	OK
	```

	To retrieve the meaningful metrics details of various service calls, you may need to run multiple REST calls such as the followings. After that, execute the metrics REST call as shown below with a sample output. As shown, metrics output is returned in JSON format.

	```
	curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'
	```
	```
	$ curl -i -k -u guest:guest-password -X GET 'https://localhost:8445/gateway/health/v1/metrics?pretty=true'
	HTTP/1.1 200 OK
	Date: Tue, 22 Aug 2017 07:10:44 GMT
	Set-Cookie: JSESSIONID=kqntcdaje9uai3pup7ffvfw4;Path=/gateway/health;Secure;HttpOnly
	Expires: Thu, 01 Jan 1970 00:00:00 GMT
	Set-Cookie: rememberMe=deleteMe; Path=/gateway/health; Max-Age=0; Expires=Mon, 21-Aug-2017 07:10:44 GMT
	Content-Type: application/json
	Cache-Control: must-revalidate,no-cache,no-store
	Transfer-Encoding: chunked
	Server: Jetty(9.2.15.v20160210)

	{
	"version" : "3.0.0",
	"gauges" : { },
	"counters" : { },
	"histograms" : { },
	"meters" : { },
	"timers" : {
	"client./gateway/health/v1/metrics.GET-requests" : {
	"count" : 5,
	"max" : 0.624587973,
	"mean" : 0.027655743001736188,
	"min" : 0.006145587,
	"p50" : 0.010020548,
	"p75" : 0.010020548,
	"p95" : 0.074454725,
	"p98" : 0.624587973,
	"p99" : 0.624587973,
	"p999" : 0.624587973,
	"stddev" : 0.0929226225229978,
	"m15_rate" : 2.657500857422334E-7,
	"m1_rate" : 5.770087852901534E-89,
	"m5_rate" : 4.769163772973399E-19,
	"mean_rate" : 4.0952378345310894E-4,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/health/v1/ping.GET-requests" : {
	"count" : 1,
	"max" : 0.017257638000000002,
	"mean" : 0.017257638000000002,
	"min" : 0.017257638000000002,
	"p50" : 0.017257638000000002,
	"p75" : 0.017257638000000002,
	"p95" : 0.017257638000000002,
	"p98" : 0.017257638000000002,
	"p99" : 0.017257638000000002,
	"p999" : 0.017257638000000002,
	"stddev" : 0.0,
	"m15_rate" : 0.18710139700632353,
	"m1_rate" : 0.0735758882342885,
	"m5_rate" : 0.1637461506155964,
	"mean_rate" : 0.014990517517814805,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/sandbox/health/v1/.GET-requests" : {
	"count" : 1,
	"max" : 4.01873E-4,
	"mean" : 4.01873E-4,
	"min" : 4.01873E-4,
	"p50" : 4.01873E-4,
	"p75" : 4.01873E-4,
	"p95" : 4.01873E-4,
	"p98" : 4.01873E-4,
	"p99" : 4.01873E-4,
	"p999" : 4.01873E-4,
	"stddev" : 0.0,
	"m15_rate" : 2.536740427767808E-7,
	"m1_rate" : 7.074903404511115E-90,
	"m5_rate" : 4.081014139447941E-19,
	"mean_rate" : 8.179827684854002E-5,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/sandbox/v1/health/.GET-requests" : {
	"count" : 1,
	"max" : 5.470700000000001E-4,
	"mean" : 5.470700000000001E-4,
	"min" : 5.470700000000001E-4,
	"p50" : 5.470700000000001E-4,
	"p75" : 5.470700000000001E-4,
	"p95" : 5.470700000000001E-4,
	"p98" : 5.470700000000001E-4,
	"p99" : 5.470700000000001E-4,
	"p999" : 5.470700000000001E-4,
	"stddev" : 0.0,
	"m15_rate" : 2.413022137213267E-7,
	"m1_rate" : 3.341947732164585E-90,
	"m5_rate" : 3.512561421726287E-19,
	"mean_rate" : 8.149518570285245E-5,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
	"count" : 4,
	"max" : 0.463745401,
	"mean" : 0.024924118143299912,
	"min" : 0.016542244,
	"p50" : 0.024799078000000002,
	"p75" : 0.033933548,
	"p95" : 0.033933548,
	"p98" : 0.033933548,
	"p99" : 0.033933548,
	"p999" : 0.033933548,
	"stddev" : 0.007284773511002474,
	"m15_rate" : 2.120680068580741E-8,
	"m1_rate" : 4.7541228609699333E-91,
	"m5_rate" : 1.5806080232092864E-20,
	"mean_rate" : 2.7314359915623396E-4,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"service./gateway/sandbox/webhdfs/v1/.get-requests" : {
	"count" : 3,
	"max" : 0.014635496000000001,
	"mean" : 0.00342438191233768,
	"min" : 0.0020088890000000002,
	"p50" : 0.0020088890000000002,
	"p75" : 0.005144646,
	"p95" : 0.005144646,
	"p98" : 0.005144646,
	"p99" : 0.005144646,
	"p999" : 0.005144646,
	"stddev" : 0.0015604555820128599,
	"m15_rate" : 1.9913776931949195E-8,
	"m1_rate" : 3.1334281325640874E-91,
	"m5_rate" : 1.055281734633953E-20,
	"mean_rate" : 2.0486339070804923E-4,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	}
	}
	}
	```
	#### REST End Points
	As mentioned above, currently Knox provides a few monitoring APIs to start with. The list will gradually grow to support new use-cases.
	##### /ping
	This end-point can be used to determine if a Knox gateway service is alive or not. It is useful for basic health monitoring of the core service. Although most of the results of REST calls are in JSON format, this one (/ping) is in plain text.

	Sample response

	```
	OK
	```

	##### /metrics
	This end-point returns all Knox metrics grouped by individual call type. For example, timer metrics for all webhdfs calls are aggregated into one set of metrics and then returned in a separate JSON element. This end-point also supports an option (/metrics?pretty=true) to pretty print the metrics output.

	A sample response with pretty=true is shown below:

	```
	{
	"version" : "3.0.0",
	"gauges" : { },
	"counters" : { },
	"histograms" : { },
	"meters" : { },
	"timers" : {
	"client./gateway/health/v1/ping.GET-requests" : {
	"count" : 1,
	"max" : 0.017257638000000002,
	"mean" : 0.017257638000000002,
	"min" : 0.017257638000000002,
	"p50" : 0.017257638000000002,
	"p75" : 0.017257638000000002,
	"p95" : 0.017257638000000002,
	"p98" : 0.017257638000000002,
	"p99" : 0.017257638000000002,
	"p999" : 0.017257638000000002,
	"stddev" : 0.0,
	"m15_rate" : 0.18710139700632353,
	"m1_rate" : 0.0735758882342885,
	"m5_rate" : 0.1637461506155964,
	"mean_rate" : 0.014990517517814805,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/sandbox/v1/health/.GET-requests" : {
	"count" : 1,
	"max" : 5.470700000000001E-4,
	"mean" : 5.470700000000001E-4,
	"min" : 5.470700000000001E-4,
	"p50" : 5.470700000000001E-4,
	"p75" : 5.470700000000001E-4,
	"p95" : 5.470700000000001E-4,
	"p98" : 5.470700000000001E-4,
	"p99" : 5.470700000000001E-4,
	"p999" : 5.470700000000001E-4,
	"stddev" : 0.0,
	"m15_rate" : 2.413022137213267E-7,
	"m1_rate" : 3.341947732164585E-90,
	"m5_rate" : 3.512561421726287E-19,
	"mean_rate" : 8.149518570285245E-5,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	},
	"client./gateway/sandbox/webhdfs/v1/.GET-requests" : {
	"count" : 4,
	"max" : 0.463745401,
	"mean" : 0.024924118143299912,
	"min" : 0.016542244,
	"p50" : 0.024799078000000002,
	"p75" : 0.033933548,
	"p95" : 0.033933548,
	"p98" : 0.033933548,
	"p99" : 0.033933548,
	"p999" : 0.033933548,
	"stddev" : 0.007284773511002474,
	"m15_rate" : 2.120680068580741E-8,
	"m1_rate" : 4.7541228609699333E-91,
	"m5_rate" : 1.5806080232092864E-20,
	"mean_rate" : 2.7314359915623396E-4,
	"duration_units" : "seconds",
	"rate_units" : "calls/second"
	}
	}
	}
	```