<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Traffic Monitor &mdash; Traffic Control 1.2.1 documentation </title>
<link rel="shortcut icon" href="../_static/favicon.ico"/>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<link rel="stylesheet" href="../_static/theme_overrides.css" type="text/css" />
<link rel="top" title="Traffic Control 1.2.1 documentation" href="../index.html"/>
<link rel="up" title="Traffic Control Overview" href="index.html"/>
<link rel="next" title="Traffic Stats" href="traffic_stats.html"/>
<link rel="prev" title="Traffic Router" href="traffic_router.html"/>
<script src="_static/js/modernizr.min.js"></script>
<body class="wy-body-for-nav" role="document">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-nav-search">
<a href="/" class="icon icon-home"> Traffic Control
<img src="../_static/tc_logo.png" class="logo" />
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
<li class="toctree-l1"><a class="reference internal" href="../basics/index.html">CDN Basics</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../basics/content_delivery_networks.html">Content Delivery Networks</a></li>
<li class="toctree-l2"><a class="reference internal" href="../basics/http_11.html">HTTP 1.1</a></li>
<li class="toctree-l2"><a class="reference internal" href="../basics/caching_proxies.html">Caching Proxies</a></li>
<li class="toctree-l2"><a class="reference internal" href="../basics/cache_revalidation.html">Cache Control Headers and Revalidation</a></li>
<ul class="current">
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Traffic Control Overview</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="introduction.html">Introduction</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_ops.html">Traffic Ops</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_router.html">Traffic Router</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="">Traffic Monitor</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_stats.html">Traffic Stats</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_portal.html">Traffic Portal</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_server.html">Traffic Server</a></li>
<li class="toctree-l2"><a class="reference internal" href="traffic_vault.html">Traffic Vault</a></li>
<li class="toctree-l1"><a class="reference internal" href="../admin/index.html">Administrator&#8217;s Guide</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_ops_install.html">Installing Traffic Ops</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_ops_config.html">Configuring Traffic Ops</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_ops_using.html">Using Traffic Ops</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_ops_extensions.html">Managing Traffic Ops Extensions</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_monitor.html">Traffic Monitor Administration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_router.html">Traffic Router Administration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_stats.html">Traffic Stats Administration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_server.html">Traffic Server Administration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/traffic_vault.html">Traffic Vault Administration</a></li>
<li class="toctree-l2"><a class="reference internal" href="../admin/quick_howto/index.html">Quick How To Guides</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/index.html">Developer&#8217;s Guide</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../development/traffic_ops.html">Traffic Ops</a></li>
<li class="toctree-l2"><a class="reference internal" href="../development/traffic_router.html">Traffic Router</a></li>
<li class="toctree-l2"><a class="reference internal" href="../development/traffic_monitor.html">Traffic Monitor</a></li>
<li class="toctree-l2"><a class="reference internal" href="../development/traffic_stats.html">Traffic Stats</a></li>
<li class="toctree-l2"><a class="reference internal" href="../development/traffic_server.html">Traffic Server</a></li>
<li class="toctree-l1"><a class="reference internal" href="../faq/index.html">FAQ</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../faq/general.html">General</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/development.html">Development</a></li>
<li class="toctree-l2"><a class="reference internal" href="../faq/administration.html">Running a Traffic Control CDN</a></li>
<li class="toctree-l1"><a class="reference internal" href="../glossary.html">Glossary</a></li>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
<nav class="wy-nav-top" role="navigation" aria-label="top navigation">
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">Traffic Control</a>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html">Traffic Control 1.2.1</a> &raquo;</li>
<li><a href="index.html">Traffic Control Overview</a> &raquo;</li>
<li>Traffic Monitor</li>
<li class="wy-breadcrumbs-aside">
<a href="../_sources/overview/traffic_monitor.txt" rel="nofollow"> View page source</a>
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
<a href="traffic_stats.html" class="btn btn-neutral float-right" title="Traffic Stats">Next <span class="fa fa-arrow-circle-right"></span></a>
<a href="traffic_router.html" class="btn btn-neutral" title="Traffic Router"><span class="fa fa-arrow-circle-left"></span> Previous</a>
<div role="main" class="document">
<span class="target" id="reference-label-tc-tm"></span><span class="target" id="index-0"></span><div class="section" id="traffic-monitor">
<h1>Traffic Monitor<a class="headerlink" href="#traffic-monitor" title="Permalink to this headline"></a></h1>
<p>Traffic Monitor is a Java/Tomcat application that monitors the caches in a CDN for a variety of metrics. These metrics are for use in determining the overall health of a given cache and the related delivery services. A given CDN can operate a number of Traffic Monitors, from a number of geographically diverse locations, to prevent false positives caused by network problems at a given site.</p>
<p>Traffic Monitors operate independently, but each one uses the state of the other Traffic Monitors in conjunction with its own state to provide a consistent view of CDN cache health to upstream applications such as Traffic Router. The Health Protocol governs cache and Delivery Service availability.</p>
<p>Traffic Monitor provides a view into CDN health using several RESTful JSON endpoints, which are consumed by other Traffic Monitors and upstream components such as Traffic Router. Traffic Monitor is also responsible for serving the overall CDN configuration to Traffic Router, which ensures that the configuration of these two critical components remains synchronized as operational and health-related changes propagate through the CDN.</p>
<div class="section" id="arrow-cache-monitoring">
<span id="rl-astats"></span><h2><img alt="arrow" src="../_images/fwda1.png" /> Cache Monitoring<a class="headerlink" href="#arrow-cache-monitoring" title="Permalink to this headline"></a></h2>
<div><p>Traffic Monitor polls all caches configured with a status of <code class="docutils literal"><span class="pre">REPORTED</span></code> or <code class="docutils literal"><span class="pre">ADMIN_DOWN</span></code> at an interval specified as a configuration parameter in Traffic Ops. If the cache is set to <code class="docutils literal"><span class="pre">ADMIN_DOWN</span></code>, it is marked as unavailable but is still polled for availability and statistics. If the cache is explicitly configured with a status of <code class="docutils literal"><span class="pre">ONLINE</span></code> or <code class="docutils literal"><span class="pre">OFFLINE</span></code>, it is not polled by Traffic Monitor and is presented to Traffic Router as configured, regardless of actual availability.</p>
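<p>The status rules above can be sketched as a small decision function. This is an illustrative sketch only, not Traffic Monitor&#8217;s actual code: the names <code class="docutils literal"><span class="pre">CacheStatus</span></code>, <code class="docutils literal"><span class="pre">is_polled</span></code> and <code class="docutils literal"><span class="pre">is_available</span></code> are hypothetical, and it assumes that &#8220;presented as configured&#8221; means <code class="docutils literal"><span class="pre">ONLINE</span></code> is always reported as available and <code class="docutils literal"><span class="pre">OFFLINE</span></code> as unavailable.</p>

```python
from enum import Enum


class CacheStatus(Enum):
    """The four cache statuses named in the text above."""
    REPORTED = "REPORTED"
    ADMIN_DOWN = "ADMIN_DOWN"
    ONLINE = "ONLINE"
    OFFLINE = "OFFLINE"


def is_polled(status: CacheStatus) -> bool:
    """Only REPORTED and ADMIN_DOWN caches are polled."""
    return status in (CacheStatus.REPORTED, CacheStatus.ADMIN_DOWN)


def is_available(status: CacheStatus, healthy_when_polled: bool) -> bool:
    """Availability as it would be presented to Traffic Router.

    healthy_when_polled is the outcome of the latest poll (hypothetical
    input); it only matters for REPORTED caches.
    """
    if status is CacheStatus.ONLINE:
        return True    # assumed: presented as configured, never polled
    if status is CacheStatus.OFFLINE:
        return False   # assumed: presented as configured, never polled
    if status is CacheStatus.ADMIN_DOWN:
        return False   # still polled for statistics, but always unavailable
    return healthy_when_polled  # REPORTED: follows the measured health
```

<p>In this sketch, only a <code class="docutils literal"><span class="pre">REPORTED</span></code> cache has its availability driven by what Traffic Monitor measures; every other status is decided by configuration alone.</p>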
<p>Traffic Monitor makes HTTP requests at regular intervals to a special URL on each EDGE cache and consumes the JSON output. The special URL is served by astats, a plugin running on the Apache Traffic Server (ATS) caches, and access to it is restricted to Traffic Monitor. The astats plugin provides insight into application and system performance, such as:</p>
<ul class="simple">
<li>Throughput (e.g. bytes in, bytes out).</li>
<li>Transactions (e.g. number of 2xx, 3xx, 4xx responses).</li>
<li>Connections (e.g. from clients, to parents, to origins).</li>
<li>Cache performance (e.g. hits, misses, refreshes).</li>
<li>Storage performance (e.g. writes, reads, frags, directories).</li>
<li>System performance (e.g. load average, network interface throughput).</li>
<p>Many of the application-level statistics are available at the global or aggregate level, and some are available at the Delivery Service (remap rule) level. Traffic Monitor uses the system-level performance statistics to determine the overall health of the cache by evaluating network throughput and load against values configured in Traffic Ops. Traffic Monitor also uses throughput and transaction statistics at the remap rule level to determine Delivery Service health.</p>
<p>If astats is unavailable due to a network-related issue, or the system statistics have exceeded the configured thresholds, Traffic Monitor marks the cache as unavailable. If the delivery service statistics exceed the configured thresholds, the delivery service is marked as unavailable and Traffic Router starts sending clients to the overflow destinations for that delivery service, but the cache remains available to serve other content.</p>
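<p>A minimal sketch of the threshold check described above, under stated assumptions: the class and field names (<code class="docutils literal"><span class="pre">SystemStats</span></code>, <code class="docutils literal"><span class="pre">Thresholds</span></code>, <code class="docutils literal"><span class="pre">load_average</span></code>, <code class="docutils literal"><span class="pre">interface_kbps</span></code>) are hypothetical and do not reflect the real astats JSON layout or the Traffic Ops parameter names. A failed astats request is modelled as <code class="docutils literal"><span class="pre">None</span></code>.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemStats:
    """System-level values taken from one poll (hypothetical field names)."""
    load_average: float
    interface_kbps: float


@dataclass
class Thresholds:
    """Limits configured in Traffic Ops (hypothetical field names)."""
    max_load_average: float
    max_interface_kbps: float


def cache_is_available(stats: Optional[SystemStats], limits: Thresholds) -> bool:
    """Return False when astats was unreachable or any threshold is exceeded."""
    if stats is None:  # the poll failed, e.g. because of a network issue
        return False
    return (stats.load_average <= limits.max_load_average
            and stats.interface_kbps <= limits.max_interface_kbps)


def delivery_service_is_available(ds_kbps: float, max_ds_kbps: float) -> bool:
    """A delivery service is judged separately from the cache that serves it."""
    return ds_kbps <= max_ds_kbps
```

<p>Keeping the two checks separate mirrors the text: a delivery service can exceed its own threshold and overflow while the cache itself stays available for other content.</p>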
<div class="admonition seealso">
<p class="first admonition-title">See also</p>
<p class="last">For more information on ATS Statistics, see the <a class="reference external" href="">ATS documentation</a></p>
<div class="section" id="arrow-health-protocol">
<span id="rl-health-proto"></span><h2><img alt="arrow" src="../_images/fwda1.png" /> Health Protocol<a class="headerlink" href="#arrow-health-protocol" title="Permalink to this headline"></a></h2>
<div><p>Redundant Traffic Monitor servers operate independently from each other but take the state of other Traffic Monitors into account when asked for health state information. In the above overview of cache monitoring, the behavior of Traffic Monitor pertains only to how an individual instance detects and handles failures. The Health Protocol adds another dimension to the health state of the CDN by merging the states of all Traffic Monitors into one, and then taking the <em>optimistic</em> approach when dealing with a cache or Delivery Service that might have been marked as unavailable by this particular instance or a peer instance of Traffic Monitor.</p>
<p>Upon startup or a configuration change in Traffic Ops, Traffic Monitor begins polling, in addition to the caches, its peer Traffic Monitors whose state is set to <code class="docutils literal"><span class="pre">ONLINE</span></code>. Each <code class="docutils literal"><span class="pre">ONLINE</span></code> Traffic Monitor polls all of its peers at a configurable interval and saves each peer&#8217;s state for later use. When polling its peers, Traffic Monitor asks for the raw health state from each respective peer, which is strictly that instance&#8217;s view of the CDN&#8217;s health. When any <code class="docutils literal"><span class="pre">ONLINE</span></code> Traffic Monitor is asked for CDN health by an upstream component, such as Traffic Router, the component gets the version of CDN health that has been influenced by the Health Protocol (the non-raw view).</p>
<p>When the Health Protocol is in operation, Traffic Monitor takes the health states from all peers, together with the locally known health state, and serves an optimistic outlook to the requesting client. This means that, for example, if three of the four Traffic Monitors see a given cache or Delivery Service as exceeding its thresholds and unavailable, it is still considered available. Only if all Traffic Monitors agree that the given object is unavailable is that state propagated to upstream components. This optimistic approach to the Health Protocol runs counter to the &#8220;fail fast&#8221; philosophy, but serves well for large networks with complicated geography and/or routing. The optimistic Health Protocol allows network failures or latency to occur without affecting overall traffic routing, as Traffic Monitors can and do have a different view of the network when deployed in geographically diverse locations. Short polling intervals for both the caches and the Traffic Monitor peers help to reduce the customer impact of outages.</p>
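<p>The optimistic merge can be illustrated with a short sketch. It assumes each Traffic Monitor&#8217;s raw view is a simple mapping from object name to an availability flag; this is a simplification chosen for illustration, not the format of the real JSON endpoints, and <code class="docutils literal"><span class="pre">combine_optimistic</span></code> is a hypothetical name.</p>

```python
from typing import Dict, Iterable


def combine_optimistic(views: Iterable[Dict[str, bool]]) -> Dict[str, bool]:
    """Merge raw views: an object is unavailable only if every view says so."""
    combined: Dict[str, bool] = {}
    for view in views:
        for name, available in view.items():
            # A single "available" vote is enough to keep the object available.
            combined[name] = combined.get(name, False) or available
    return combined


# Three of four monitors see cache-03 as unavailable; all four agree on cache-07.
raw_views = [
    {"cache-03": False, "cache-07": False},  # local raw view
    {"cache-03": False, "cache-07": False},  # peer 1
    {"cache-03": False, "cache-07": False},  # peer 2
    {"cache-03": True, "cache-07": False},   # peer 3
]
result = combine_optimistic(raw_views)
```

<p>Here <code class="docutils literal"><span class="pre">result</span></code> keeps cache-03 available because one monitor still sees it as healthy, while cache-07 is reported as unavailable because every monitor agrees.</p>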
<p>It is not uncommon for a cache to be marked unavailable by Traffic Monitor; in fact, it is business as usual for many CDNs. A hot video asset may push a single cache (say cache-03) close to its interface capacity, at which point the health protocol &#8220;kicks in&#8221; and Traffic Monitor marks cache-03 as unavailable. Traffic Router then sends new clients requesting the same asset to another cache (say cache-01) in the same cachegroup, so the load is shared between cache-01 and cache-03. As clients finish watching the asset on cache-03, the cache drops below the threshold and is marked available again, and new clients are once more sent to cache-03.</p>
<p>It is less common for a delivery service to be marked unavailable by Traffic Monitor. The delivery service thresholds are usually used for overflow situations at extreme peaks, to protect other delivery services in the CDN from being impacted.</p>
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
<div role="contentinfo">
Built with <a href="">Sphinx</a> using a <a href="">theme</a> provided by <a href="">Read the Docs</a>.
<script type="text/javascript">
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/js/theme.js"></script>
<script type="text/javascript">
jQuery(function () {