
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->

 <html>
   <head>
     <meta name="viewport" content="width=device-width, initial-scale=1">
     <title>Configuring Downstream Caches</title>
     <link rel="stylesheet" href="doc.css">
   </head>
   <body>
 <!--#include virtual="_header.html" -->


   <div id=content>
 <h1>Configuring Downstream Caches</h1>
 <style>
 .diff-del { color: darkred }
 .diff-add { color: green }
 </style>

 <h2 id="standard">Standard Configuration</h2>
    <p class="warning"><strong>Note: This feature is currently experimental.
      Options and configuration described here are subject to change in future
      releases. Please subscribe to the <a href="mailing-lists#announcements"
      >announcements mailing list</a> to keep yourself informed of updates to
      this feature.</strong></p>

    <p>
      By default PageSpeed serves HTML files with <code>Cache-Control: no-cache,
      max-age=0</code> so that changes to the HTML and its resources are sent
      fresh on each request.  The HTML can be cached, however, if you:
      <ul>
        <li> Set up a <code>PURGE</code> handler in your cache.
        <li> Tell PageSpeed the URL for the <code>PURGE</code> handler.
        <li> Have the cache set the <code>PS-CapabilityList</code> header
             so PageSpeed will emit HTML that can be sent to any browser.
        <li> Have the cache occasionally pass through requests to the origin with
             the <code>PS-ShouldBeacon</code> header set.
      </ul>
    </p>

    <p>
      For example, if you're running a cache on port 80 that reverse proxies to
      your site on port 8080, then you'd need to tell PageSpeed to send
      its <code>PURGE</code> requests to port 80:
    </p>
 <dl>
   <dt>Apache:<dd><pre class="prettyprint">
 ModPagespeedDownstreamCachePurgeLocationPrefix http://localhost:80</pre>
   <dt>Nginx:<dd><pre class="prettyprint">
 pagespeed DownstreamCachePurgeLocationPrefix http://localhost:80;</pre>
 </dl>
    <p>
      You also need to give PageSpeed a key so it can allow the cache to request
      rebeaconing without allowing external entities to do so:
    </p>
 <dl>
   <dt>Apache:<dd><pre class="prettyprint">
 ModPagespeedDownstreamCacheRebeaconingKey "&lt;your-secret-key&gt;"</pre>
   <dt>Nginx:<dd><pre class="prettyprint">
 pagespeed DownstreamCacheRebeaconingKey "&lt;your-secret-key&gt;";</pre>
 </dl>
    <p>
      These are the only changes you need to make to the PageSpeed configuration
      file, but before you restart you also need to make some changes to your
      cache configuration.  These vary by cache; below are configurations for
      Varnish 3.x, Varnish 4.x, and Nginx's proxy_cache:
    </p>
 <dl>
   <dt>Varnish 3.x:<dd><pre class="prettyprint">
 # The std.random() calls below require "import std;" at the top of your VCL.
 acl purge {
   # If PageSpeed isn't running on the same server as your cache, list the IP(s)
   # of the PageSpeed machine(s) here.
   "127.0.0.1";
 }
 sub vcl_recv {
   # Tell PageSpeed not to use optimizations specific to this request.
   set req.http.PS-CapabilityList = "fully general optimizations only";

   # Don't allow external entities to force beaconing.
   remove req.http.PS-ShouldBeacon;

   # Authenticate the purge request by IP.
   if (req.request == "PURGE") {
     if (!client.ip ~ purge) {
       error 405 "Not allowed.";
     }
     return (lookup);
   }
 }

 # Mark HTML as uncacheable for browsers and any caches beyond this one.  We
 # can't send them purge requests, so they can't cache our HTML.
 sub vcl_fetch {
    if (beresp.http.Content-Type ~ "text/html") {
      remove beresp.http.Cache-Control;
      set beresp.http.Cache-Control = "no-cache, max-age=0";
    }
    return (deliver);
 }

 sub vcl_hit {
   # Make purging happen in response to a PURGE request.  This happens
   # automatically in Varnish 4.x so we don't need it there.
   if (req.request == "PURGE") {
     purge;
     error 200 "Purged.";
   }

   # 5% of the time ignore that we got a cache hit and send the request to the
   # backend anyway for instrumentation.
   if (std.random(0, 100) < 5) {
     set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
     return (pass);
   }
 }

 sub vcl_miss {
   # Make purging happen in response to a PURGE request.  This happens
   # automatically in Varnish 4.x so we don't need it there.
   if (req.request == "PURGE") {
     purge;
     error 200 "Purged.";
   }

   # Instrument 25% of cache misses.
   if (std.random(0, 100) < 25) {
     set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
     return (pass);
   }
 }</pre>
   <dt>Varnish 4.x:<dd><pre class="prettyprint">
 # The std.random() calls below require "import std;" at the top of your VCL.
 acl purge {
   # If PageSpeed isn't running on the same server as your cache, list the IP(s)
   # of the PageSpeed machine(s) here.
   "127.0.0.1";
 }

 sub vcl_recv {
   # Tell PageSpeed not to use optimizations specific to this request.
   set req.http.PS-CapabilityList = "fully general optimizations only";

   # Don't allow external entities to force beaconing.
   unset req.http.PS-ShouldBeacon;

   # Authenticate the purge request by IP.
   if (req.method == "PURGE") {
     if (!client.ip ~ purge) {
       return (synth(405,"Not allowed."));
     }
     return (purge);
   }
 }

 # Mark HTML as uncacheable for browsers and any caches beyond this one.  We
 # can't send them purge requests, so they can't cache our HTML.
 sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "text/html") {
      unset beresp.http.Cache-Control;
      set beresp.http.Cache-Control = "no-cache, max-age=0";
    }
    return (deliver);
 }

 sub vcl_hit {
   # 5% of the time ignore that we got a cache hit and send the request to the
   # backend anyway for instrumentation.
   if (std.random(0, 100) < 5) {
     set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
     return (pass);
   }
 }
 sub vcl_miss {
   # Instrument 25% of cache misses.
   if (std.random(0, 100) < 25) {
     set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
     return (pass);
   }
 }</pre>
   <dt>Nginx proxy_cache:<dd><pre class="prettyprint">
 http {
   # Define a mapping used to mark HTML as uncacheable.
   map $upstream_http_content_type $new_cache_control_header_val {
     default $upstream_http_cache_control;
     "~*text/html" "no-cache, max-age=0";
   }

   server {
     # PageSpeed's beacon dependent filters need the cache to let some requests
     # through to the backend.  This code below depends on the <a href="http://wiki.nginx.org/HttpSetMiscModule">ngx_set_misc</a>
     # module and randomly passes 5% of traffic to the backend for rebeaconing.
     set $should_beacon_header_val "";
     set_random $rand 0 100;
     if ($rand ~* "^[0-4]$") {
       set $should_beacon_header_val "&lt;your-secret-key&gt;";
       set $bypass_cache 1;
     }

     location / {
       # existing proxy_pass
       # existing proxy_cache
       # existing proxy_cache_key

       # What servers should we accept PURGE requests from?  If PageSpeed isn't
       # running on the same server as your cache, list the IP(s) of the
       # PageSpeed machine(s) here.
       #
       # This requires rebuilding with the ngx_cache_purge module:
       #   https://github.com/FRiCKLE/ngx_cache_purge
       proxy_cache_purge PURGE from 127.0.0.1;

       # Mark HTML as uncacheable for browsers and any caches beyond this one.
       # We can't send them purge requests, so they can't cache our HTML.  Uses
       # the map defined above.
       proxy_hide_header Cache-Control;
       add_header Cache-Control $new_cache_control_header_val;

       # Tell PageSpeed not to use optimizations specific to this request.
       proxy_set_header PS-CapabilityList "fully general optimizations only";

       # See discussion of rebeaconing above.
       proxy_cache_bypass $bypass_cache;
       proxy_hide_header PS-ShouldBeacon;
       proxy_set_header PS-ShouldBeacon $should_beacon_header_val;
     }
   }
 }</pre>
 </dl>

 <p>
   When running with downstream caching, all resources referenced from the HTML
   are cache-extended as usual, so resources that need a short cache lifetime
   can end up being served stale.  If that is a problem,
   either <code>Disallow</code> those resources, so PageSpeed doesn't inline or
   cache-extend them, or decrease the cache lifetime on your HTML.
 </p>

 <h2 id="additional">Additional Options</h2>

 <p>
   The configuration above should be a good fit for most sites, but PageSpeed's
   downstream caching has several options that let you tune it for your
   particular setup.
 </p>

 <h3 id="beaconing">Beaconing</h3>
 <p>
   Several filters such as
   <a href="reference-image-optimize#inline_images">inline_images</a>,
   <a href="filter-inline-preview-images">inline_preview_images</a>,
   <a href="filter-lazyload-images">lazyload_images</a> and
   <a href="filter-prioritize-critical-css">prioritize_critical_css</a>
   depend extensively on client beacons to determine critical images and
   CSS. When such filters are enabled, pages periodically have beaconing
   JavaScript inserted as part of the rewriting process.
   The <a href="#standard">standard configuration</a> passes through 5% of cache
   hits to the backend with a <code>PS-ShouldBeacon</code> header set, so that
   these filters can continue to receive the beacons they need.
 </p>
 <p>
   If you have a high-traffic site, 5% is probably a larger share than PageSpeed
   needs to receive sufficient beacons.  In that case you can decrease
   the percentage of traffic passed through.  For example, here's how you'd
   decrease it to 2%:
 </p>
 <dl>
   <dt>Varnish 3.x or 4.x:<dd><pre>
 <span class=diff-del>-  if (std.random(0, 100) < 5) {</span>
 <span class=diff-add>+  if (std.random(0, 100) < 2) {</span></pre>
   <dt>Nginx proxy_cache<dd><pre>
 <span class=diff-del>-  if ($rand ~* "^[0-4]$") {</span>
 <span class=diff-add>+  if ($rand ~* "^[01]$") {</span></pre>
 </dl>
 <p>
   Alternatively, you may be willing to give up the benefit of the
   beaconing-dependent filters in exchange for never intentionally bypassing the
   cache.  If so, you should <a href="config_filters#beacons">turn off
   beaconing</a> and beacon-dependent filters in PageSpeed:
 </p>
 <dl>
   <dt>Apache:<dd><pre class="prettyprint"
 >ModPagespeedCriticalImagesBeaconEnabled false
 ModPagespeedDisableFilters prioritize_critical_css</pre>
   <dt>Nginx:<dd><pre class="prettyprint"
 >pagespeed CriticalImagesBeaconEnabled false;
 pagespeed DisableFilters prioritize_critical_css;</pre>
 </dl>
 <p>
   Additionally you should remove the proxy config that handles beaconing:
 </p>
 <dl>
   <dt>Varnish 3.x:<dd><pre><span class="diff-del"
 >-  remove req.http.PS-ShouldBeacon;
 ...
 -  if (std.random(0, 100) < 5) {
 -    set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
 -    return (pass);
 -  }
 ...
 -  if (std.random(0, 100) < 25) {
 -    set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
 -    return (pass);
 -  }</span></pre>
   <dt>Varnish 4.x:<dd><pre><span class="diff-del"
 >-  unset req.http.PS-ShouldBeacon;
 ...
 -  sub vcl_hit {
 -    if (std.random(0, 100) < 5) {
 -      set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
 -      return (pass);
 -    }
 -  }
 -  sub vcl_miss {
 -    if (std.random(0, 100) < 25) {
 -      set req.http.PS-ShouldBeacon = "&lt;your-secret-key&gt;";
 -      return (pass);
 -    }
 -  }</span></pre>
   <dt>Nginx proxy_cache<dd><pre><span class="diff-del"
 >-  set $should_beacon_header_val "";
 -  set_random $rand 0 100;
 -  if ($rand ~* "^[0-4]$") {
 -    set $should_beacon_header_val "&lt;your-secret-key&gt;";
 -    set $bypass_cache 1;
 -  }
 ...
 -  proxy_cache_bypass $bypass_cache;
 -  proxy_hide_header PS-ShouldBeacon;
 -  proxy_set_header PS-ShouldBeacon $should_beacon_header_val;</span></pre>
 </dl>


 <h3 id="ps-resource">PageSpeed Resources</h3>
 <p>
   Because PageSpeed already caches its optimized resources, you may want to
   exclude them from caching by the downstream cache.  If so, you can set:

 <dl>
   <dt>Varnish 3.x and 4.x:<dd><pre><span class="diff-add"
 >+  if (req.url ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+") {
 +    return (pass);
 +  }</span></pre>
   <dt>Nginx proxy_cache<dd><pre><span class="diff-add"
 >+  if ($uri ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+") {
 +    set $bypass_cache "1";
 +  }</span></pre>
 </dl>

 <p>
   If you have enabled <a href="restricting_urls#url_signatures">URL signing</a>,
   change the <code>10</code> in the regexp to <code>20</code> to account for the
   additional characters in the hash.
 </p>

 <h3 id="ps-capabilitylist">PS-CapabilityList</h3>

 <p>
   Typically PageSpeed will produce different HTML for different browsers.  For
   example, when responding to a request that has <code>Accept:
   image/webp</code>, PageSpeed knows the requesting browser supports WebP and so
   it can send these images, while if the <code>Accept</code> header doesn't
   mention WebP then it will send JPEG or PNG.  To suppress this behavior,
   the <a href="#standard">standard configuration</a> above sets a header:
 </p>
 <pre>
   PS-CapabilityList: fully general optimizations only
 </pre>
 <p>
   This header can also be used to tell PageSpeed to make specific optimizations.
   There are six capabilities PageSpeed can take advantage of that aren't
   supported in all browsers, and it gives them each a code:
 </p>
 <table>
   <tr><th>Capability<th>Code
   <tr><td>Inline Images<td><code>ii</code>
   <tr><td>Lazyload Images<td><code>ll</code>
   <tr><td>WebP Images<td><code>jw</code>
   <tr><td>Lossless WebP Images<td><code>ws</code>
   <tr><td>Animated WebP Images<td><code>wa</code>
   <tr><td>Defer JavaScript<td><code>dj</code>
 </table>
 <p>
   For example, you could include whether the <code>Accept</code> header
   includes <code>image/webp</code> in your cache key, and then for the
   fraction of traffic that claims WebP support send:
 </p>
 <pre>
   PS-CapabilityList: jw:
 </pre>
 <p>
   Every page would go through to your origin twice and be cached twice, once
   processed with WebP support and once without.
 </p>
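 <p>
   As a rough sketch of that split in Varnish 4.x VCL (an illustration, not part
   of the standard configuration), the cache could choose the header value from
   the <code>Accept</code> header and add it to the cache key.  This assumes the
   rest of the <a href="#standard">standard configuration</a> is in place and
   that the <code>if</code> block replaces the line in <code>vcl_recv</code>
   that always sets <code>PS-CapabilityList</code>:
 </p>
 <pre class="prettyprint">
 sub vcl_recv {
   # Choose a capability list per request instead of always sending the
   # fully general one.
   if (req.http.Accept ~ "image/webp") {
     set req.http.PS-CapabilityList = "jw:";
   } else {
     set req.http.PS-CapabilityList = "fully general optimizations only";
   }
 }

 sub vcl_hash {
   # Store the WebP and non-WebP variants as separate cache objects.  There is
   # no return here, so Varnish's built-in vcl_hash still adds the URL and host.
   hash_data(req.http.PS-CapabilityList);
 }</pre>
 <p>
   Because the header value is part of the cache key, a purge in Varnish only
   removes the variant whose key the purge request itself hashes to, so check
   that purges reach both variants in your setup.
 </p>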
 <p>
   You can combine multiple capabilities with commas.  For example, if
   you decided to make a cache fragment for Chrome 30+, which supports all of
   these except animated WebP, for that fragment you would send:
 </p>
 <pre>
   PS-CapabilityList: ll,ii,dj,jw,ws:
 </pre>
 <p>
   For Firefox 4+, which supports all of these but WebP, you would send:
 </p>
 <pre>
   PS-CapabilityList: ll,ii,dj:
 </pre>
 <p>
   To use this header properly, however, you have to know which capabilities are
   supported by which browsers in the version of PageSpeed you're using, and
   craft regular expressions that match exactly those browsers.  This is very
   difficult to do in general because it means duplicating the logic in
   <a href="https://github.com/apache/incubator-pagespeed-mod/blob/latest-stable/pagespeed/kernel/http/user_agent_matcher.cc"><code>user_agent_matcher.cc</code></a>
   as regexes, but a simple division is:
   <ul>
     <li>Chrome 32+: <code>ll,ii,dj,jw,wa,ws</code>
     <li>Firefox 4+, Safari, IE10 (but not IE11): <code>ll,ii,dj</code>
     <li>Everything else: <code>fully general optimizations only</code>
   </ul>
 </p>

 <h3 id="purge-get">Purging with GET</h3>

 <p>
   If you're integrating PageSpeed with a cache that doesn't
   support <code>PURGE</code> requests but does support purging in response to a
   prefixed <code>GET</code> request, PageSpeed can support that.  You would
   configure your cache to treat a <code>GET</code> to
   <code>/purge/foo/bar</code> as a request to purge <code>/foo/bar</code> and
   configure PageSpeed as:
 </p>
 <dl>
   <dt>Apache:<dd><pre class="prettyprint">
 ModPagespeedDownstreamCachePurgeLocationPrefix http://CACHE-HOST:PORT/purge
 ModPagespeedDownstreamCachePurgeMethod GET</pre>
   <dt>Nginx:<dd><pre class="prettyprint">
 pagespeed DownstreamCachePurgeLocationPrefix http://CACHE-HOST:PORT/purge;
 pagespeed DownstreamCachePurgeMethod GET;</pre>
 </dl>
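 <p>
   On the cache side, the mapping might look like the following sketch, written
   in Varnish 4.x VCL purely for illustration (Varnish itself
   supports <code>PURGE</code>, so it doesn't need this).  It assumes
   the <code>purge</code> ACL from the <a href="#standard">standard
   configuration</a> and the <code>/purge</code> prefix used above:
 </p>
 <pre class="prettyprint">
 sub vcl_recv {
   # Treat GET /purge/foo/bar as a request to purge /foo/bar.
   if (req.method == "GET" &amp;&amp; req.url ~ "^/purge/") {
     # Authenticate the purge request by IP, as for PURGE.
     if (!client.ip ~ purge) {
       return (synth(405, "Not allowed."));
     }
     # Strip the prefix so the purge applies to the original URL.
     set req.url = regsub(req.url, "^/purge", "");
     return (purge);
   }
 }</pre>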

 <h3 id="purge-threshold">Purge Threshold</h3>
 <p>
   Whenever PageSpeed serves an HTML response that is not fully optimized, it
   continues rewriting in the background.  When it finishes, if the HTML it
   served was less than 95% optimized, it sends a purge request to the downstream
   cache.  The next request for that page will then miss the cache and come back
   to PageSpeed, which can serve the now more highly optimized page.  If you
   want to change the point at which PageSpeed considers a page done and stops
   optimizing, you can set a different value for this threshold.  For example, to
   lower it to 80%, so that PageSpeed is satisfied with a page that is only 80%
   optimized, you would set:
 </p>
 <dl>
   <dt>Apache:<dd><pre class="prettyprint">
 ModPagespeedDownstreamCacheRewrittenPercentageThreshold 80</pre>
   <dt>Nginx:<dd><pre class="prettyprint">
 pagespeed DownstreamCacheRewrittenPercentageThreshold 80;</pre>
 </dl>

 <h3 id="script-variables">Script Variables</h3>
 <p class="note"><strong>Note: Nginx-only</strong></p>
 <p class="note"><strong>Note: New feature as of 1.10.33.0</strong></p>

 <p>
   In ngx_pagespeed <code>DownstreamCachePurgeLocationPrefix</code>,
   <code>DownstreamCachePurgeMethod</code>, and
   <code>DownstreamCacheRewrittenPercentageThreshold</code> support script
   variables, so it's possible to set them on a per-request basis.  Turn this on
   with:
 <pre class="prettyprint">
 http {
   pagespeed ProcessScriptVariables on;
   ...
 }</pre>
   You can then use script variables in arguments for these commands:
 <pre class="prettyprint">
   pagespeed DownstreamCachePurgeLocationPrefix "$purge_location";
   pagespeed DownstreamCachePurgeMethod "$cache_purge_method";
   pagespeed DownstreamCacheRewrittenPercentageThreshold "$rewrite_threshold";</pre>
 </p>
 <p>
   For more details on script variables, including how to handle dollar signs,
   see <a href="system#nginx_script_variables">Script Variable Support</a>.
 </p>
 <h2 id="implementation">Implementation Details</h2>
 <p>
   To support downstream caching, PageSpeed sends a purge request to the caching
   layer whenever it identifies an opportunity for more rewriting to be done on
   content that was just served.  Such opportunities arise when, for example,
   resources become available in the PageSpeed cache or an image compression
   operation completes.  The cache purge forces the next request for
   the HTML file to come all the way to the backend PageSpeed server and obtain
   better rewritten content, which is then stored in the cache.  This interaction
   between the PageSpeed server and the downstream caching layer is depicted in
   the diagram below.
 </p>
   <img src="images/downstream_caching.png" >
 <p>
   In the interaction depicted above, note that the partially optimized HTML will
   be served from the cache until the PageSpeed server sends a purge request.
   It is recommended to set up PageSpeed and the downstream caching
   layer servers on a one-to-one basis so that the purges can be sent to the
   correct downstream server.
 </p>
   </div>
   <!--#include virtual="_footer.html" -->
   </body>
 </html>
	<!--
	Licensed to the Apache Software Foundation (ASF) under one
	or more contributor license agreements. See the NOTICE file
	distributed with this work for additional information
	regarding copyright ownership. The ASF licenses this file
	to you under the Apache License, Version 2.0 (the
	"License"); you may not use this file except in compliance
	with the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing,
	software distributed under the License is distributed on an
	"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	KIND, either express or implied. See the License for the
	specific language governing permissions and limitations
	under the License.
	-->

	<html>
	<head>
	<meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Configuring Downstream Caches</title>
	<link rel="stylesheet" href="doc.css">
	</head>
	<body>
	<!--#include virtual="_header.html" -->


	<div id=content>
	<h1>Configuring Downstream Caches</h1>
	<style>
	.diff-del { color: darkred }
	.diff-add { color: green }
	</style>

	<h2 id="standard">Standard Configuration</h2>
	<p class="warning"><strong>Note: This feature is currently experimental.
	Options and configuration described here are subject to change in future
	releases. Please subscribe to the <a href="mailing-lists#announcements"
	>announcements mailing list</a> to keep yourself informed of updates to
	this feature.</strong></p>

	<p>
	By default PageSpeed serves HTML files with <code>Cache-Control: no-cache,
	max-age=0</code> so that changes to the HTML and its resources are sent
	fresh on each request. The HTML can be cached, however, if you:
	<ul>
	<li> Set up a <code>PURGE</code> handler in your cache.
	<li> Tell PageSpeed the url for the <code>PURGE</code> handler.
	<li> Have the cache set the <code>PS-CapabilityList</code> header
	so PageSpeed will emit HTML that can be sent to any browser.
	<li> Have the cache occasionally pass through requests to the origin with
	the <code>PS-ShouldBeacon</code> header set.
	</ul>
	</p>

	<p>
	For example, if you're running a cache on port 80 that reverse proxies to
	your site on port 8080, then you'd need to tell PageSpeed to send
	its <code>PURGE</code> requests to port 80:
	</p>
	<dl>
	<dt>Apache:<dd><pre class="prettyprint">
	ModPagespeedDownstreamCachePurgeLocationPrefix http://localhost:80</pre>
	<dt>Nginx:<dd><pre class="prettyprint">
	pagespeed DownstreamCachePurgeLocationPrefix http://localhost:80;</pre>
	</dl>
	<p>
	You also need to give PageSpeed a key so it can allow the cache to request
	rebeaconing without allowing external entities to do so:
	</p>
	<dl>
	<dt>Apache:<dd><pre class="prettyprint">
	ModPagespeedDownstreamCacheRebeaconingKey "<your-secret-key>"</pre>
	<dt>Nginx:<dd><pre class="prettyprint">
	pagespeed DownstreamCacheRebeaconingKey "<your-secret-key>";</pre>
	</dl>
	<p>
	These are the only changes you need to make to the PageSpeed configuration
	file, but before you restart you also need to make some changes to your
	cache configuration. These vary by cache; below are configurations for
	Varnish 3.x, Varnish 4.x, and Nginx's proxy_cache:
	</p>
	<dl>
	<dt>Varnish 3.x:<dd><pre class="prettyprint">
	acl purge {
	# If PageSpeed isn't running on the same server as your cache, list the IP(s)
	# of the PageSpeed machine(s) here.
	"127.0.0.1";
	}
	sub vcl_recv {
	# Tell PageSpeed not to use optimizations specific to this request.
	set req.http.PS-CapabilityList = "fully general optimizations only";

	# Don't allow external entities to force beaconing.
	remove req.http.PS-ShouldBeacon;

	# Authenticate the purge request by IP.
	if (req.request == "PURGE") {
	if (!client.ip ~ purge) {
	error 405 "Not allowed.";
	}
	return (lookup);
	}
	}

	# Mark HTML as uncacheable. If we can't send them purge requests they can't
	# cache our html.
	sub vcl_fetch {
	if (beresp.http.Content-Type ~ "text/html") {
	remove beresp.http.Cache-Control;
	set beresp.http.Cache-Control = "no-cache, max-age=0";
	}
	return (deliver);
	}

	sub vcl_hit {
	# Make purging happen in response to a PURGE request. This happens
	# automatically in Varnish 4.x so we don't need it there.
	if (req.request == "PURGE") {
	purge;
	error 200 "Purged.";
	}

	# 5% of the time ignore that we got a cache hit and send the request to the
	# backend anyway for instrumentation.
	if (std.random(0, 100) < 5) {
	set req.http.PS-ShouldBeacon = "<your-secret-key>";
	return (pass);
	}
	}

	sub vcl_miss {
	# Make purging happen in response to a PURGE request. This happens
	# automatically in Varnish 4.x so we don't need it there.
	if (req.request == "PURGE") {
	purge;
	error 200 "Purged.";
	}

	# Instrument 25% of cache misses.
	if (std.random(0, 100) < 25) {
	set req.http.PS-ShouldBeacon = "<your-secret-key>";
	return (pass);
	}
	}</pre>
	<dt>Varnish 4.x:<dd><pre class="prettyprint">
	acl purge {
	# If PageSpeed isn't running on the same server as your cache, list the IP(s)
	# of the PageSpeed machine(s) here.
	"127.0.0.1";
	}

	sub vcl_recv {
	# Tell PageSpeed not to use optimizations specific to this request.
	set req.http.PS-CapabilityList = "fully general optimizations only";

	# Don't allow external entities to force beaconing.
	unset req.http.PS-ShouldBeacon;

	# Authenticate the purge request by IP.
	if (req.method == "PURGE") {
	if (!client.ip ~ purge) {
	return (synth(405,"Not allowed."));
	}
	return (purge);
	}
	}

	# Mark HTML as uncacheable. If we can't send them purge requests they can't
	# cache our html.
	sub vcl_backend_response {
	if (beresp.http.Content-Type ~ "text/html") {
	unset beresp.http.Cache-Control;
	set beresp.http.Cache-Control = "no-cache, max-age=0";
	}
	return (deliver);
	}

	sub vcl_hit {
	# 5% of the time ignore that we got a cache hit and send the request to the
	# backend anyway for instrumentation.
	if (std.random(0, 100) < 5) {
	set req.http.PS-ShouldBeacon = "<your-secret-key>";
	return (pass);
	}
	}
	sub vcl_miss {
	# Instrument 25% of cache misses.
	if (std.random(0, 100) < 25) {
	set req.http.PS-ShouldBeacon = "<your-secret-key>";
	return (pass);
	}
	}</pre>
	<dt>Nginx proxy_cache:<dd><pre class="prettyprint">
	http {
	# Define a mapping used to mark HTML as uncacheable.
	map $upstream_http_content_type $new_cache_control_header_val {
	default $upstream_http_cache_control;
	"~*text/html" "no-cache, max-age=0";
	}

	server {
	# PageSpeed's beacon dependent filters need the cache to let some requests
	# through to the backend. This code below depends on the <a href="http://wiki.nginx.org/HttpSetMiscModule">ngx_set_misc</a>
	# module and randomly passes 5% of traffic to the backend for rebeaconing.
	set $should_beacon_header_val "";
	set_random $rand 0 100;
	if ($rand ~* "^[0-4]$") {
	set $should_beacon_header_val "<your-secret-key>";
	set $bypass_cache 1;
	}

	location / {
	# existing proxy_pass
	# existing proxy_cache
	# existing proxy_cache_key

	# What servers should we accept PURGE requests from? If PageSpeed isn't
	# running on the same server as your cache, list the IP(s) of the
	# PageSpeed machine(s) here.
	#
	# This requires rebuilding with the ngx_cache_purge module:
	# https://github.com/FRiCKLE/ngx_cache_purge
	proxy_cache_purge PURGE from 127.0.0.1;

	# Mark HTML as uncacheable. If we can't send them purge requests they
	# can't cache our html. Uses the map defined above.
	proxy_hide_header Cache-Control;
	add_header Cache-Control $new_cache_control_header_val;

	# Tell PageSpeed not to use optimizations specific to this request.
	proxy_set_header PS-CapabilityList "fully general optimizations only";

	# See discussion of rebeaconing above.
	proxy_cache_bypass $bypass_cache;
	proxy_hide_header PS-ShouldBeacon;
	proxy_set_header PS-ShouldBeacon $should_beacon_header_val;
	}
	}
	}</pre>
	</dl>

	<p>
	When running with downstream caching all resources referenced from the HTML
	will be cache-extended as usual, so if you have resources that need to be
	cached for a short time then they can be stale. If so,
	either <code>Disallow</code> those resources, so PageSpeed doesn't inline or
	cache-extend them, or decrease the cache lifetime on your HTML.
	</p>

	<h2 id="additional">Additional Options</h2>

	The configuration above should be a good fit for most sites, but PageSpeed's
	downstream caching is highly configurable with many options that allow you to
	tweak it for your particular setup.

	<h3 id="beaconing">Beaconing</h3>
	<p>
	Several filters such as
	<a href="reference-image-optimize#inline_images">inline_images</a>,
	<a href="filter-inline-preview-images">inline_preview_images</a>,
	<a href="filter-lazyload-images">lazyload_images</a> and
	<a href="filter-prioritize-critical-css">prioritize_critical_css</a>
	depend extensively on client beacons to determine critical images and
	CSS. When such filters are enabled, pages periodically have beaconing
	JavaScript inserted as part of the rewriting process.
	The <a href="#standard">standard configuration</a> passes through 5% of cache
	hits to the backend with a <code>PS-ShouldBeacon</code> header set, so that
	these filters can continue to receive the beacons they need.
	</p>
	<p>
	If you have a high traffic site, 5% is probably a larger share than you need
	for PageSpeed to receive sufficient beacons. In that case you can decrease
	the percentage of traffic to pass through. For example, here's how you'd
	decrease it to 2%:
	</p>
	<dl>
	<dt>Varnish 3.x or 4.x:<dd><pre>
	<span class=diff-del>- if (std.random(0, 100) < 5) {</span>
	<span class=diff-add>+ if (std.random(0, 100) < 2) {</span></pre>
	<dt>Nginx proxy_cache<dd><pre>
	<span class=diff-del>- if ($rand ~* "^[0-4]$") {</span>
	<span class=diff-add>+ if ($rand ~* "^[01]$") {</span></pre>
	</dl>
	<p>
	Alternatively, you may be willing to give up the benefit of the
	beaconing-dependent filters in exchange for never intentionally bypassing the
	cache. If so, you should <a href="config_filters#beacons">turn off
	beaconing</a> and beacon-dependent filters in PageSpeed:
	</p>
	<dl>
	<dt>Apache:<dd><pre class="prettyprint"
	>ModPagespeedCriticalImagesBeaconEnabled false
	ModPagespeedDisableFilters prioritize_critical_css</pre>
	<dt>Nginx:<dd><pre class="prettyprint"
	>pagespeed CriticalImagesBeaconEnabled false;
	pagespeed DisableFilters prioritize_critical_css;</pre>
	</dl>
	<p>
	Additionally you should remove the proxy config that handles beaconing:
	</p>
	<dl>
	<dt>Varnish 3.x:<dd><pre><span class="diff-del"
	>- remove req.http.PS-ShouldBeacon;
	...
	- if (std.random(0, 100) < 5) {
	- set req.http.PS-ShouldBeacon = "<your-secret-key>";
	- return (pass);
	- }
	...
	- if (std.random(0, 100) < 25) {
	- set req.http.PS-ShouldBeacon = "<your-secret-key>";
	- return (pass);
	- }</span></pre>
	<dt>Varnish 4.x:<dd><pre><span class="diff-del"
	>- unset req.http.PS-ShouldBeacon;
	...
	- sub vcl_hit {
	- if (std.random(0, 100) < 5) {
	- set req.http.PS-ShouldBeacon = "<your-secret-key>";
	- return (pass);
	- }
	- }
	- sub vcl_miss {
	- if (std.random(0, 100) < 25) {
	- set req.http.PS-ShouldBeacon = "<your-secret-key>";
	- return (pass);
	- }
	- }</span></pre>
	<dt>Nginx proxy_cache<dd><pre><span class="diff-del"
	>- set $should_beacon_header_val "";
	- set_random $rand 0 100;
	- if ($rand ~* "^[0-4]$") {
	- set $should_beacon_header_val "<your-secret-key>";
	- set $bypass_cache 1;
	- }
	...
	- proxy_cache_bypass $bypass_cache;
	- proxy_hide_header PS-ShouldBeacon;
	- proxy_set_header PS-ShouldBeacon $should_beacon_header_val;</span></pre>
	</dl>


	<h3 id="ps-resource">PageSpeed Resources</h3>
	<p>
	Because PageSpeed already caches its optimized resources, you may want to
	exclude them caching by the downstream cache. If so, you can set:

	<dl>
	<dt>Varnish 3.x and 4.x:<dd><pre><span class="diff-add"
	>+ if (req.url ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+") {
	+ return (pass);
	+ }</span></pre>
	<dt>Nginx proxy_cache<dd><pre><span class="diff-add"
	>+ if ($uri ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+") {
	+ set $bypass_cache "1";
	+ }</span></pre>
	</dl>

	<p>
	If you have enabled <a href="restricting_urls#url_signatures">URL signing</a>,
	change the <code>10</code> in the regexp to <code>20</code> to account for the
	additional characters in the hash.
	</p>

	<h3 id="ps-capabilitylist">PS-CapabilityList</h3>

	<p>
	Typically PageSpeed will produce different HTML for different browsers. For
	example, when responding to a request that has <code>Accept:
	image/webp</code>, PageSpeed knows the requesting browser supports WebP and so
	it can send these images, while if the <code>Accept</code> header doesn't
	mention WebP then it will send JPEG or PNG. To suppress this behavior,
	the <a href="#standard">standard configuration</a> above sets a header:
	</p>
	<pre>
	PS-CapabilityList: fully general optimizations only
	</pre>
	<p>
This header can also be used to tell PageSpeed to make specific optimizations.
There are six capabilities PageSpeed can take advantage of that aren't
supported in all browsers, and it gives them each a code:
	</p>
	<table>
	<tr><th>Capability<th>Code
	<tr><td>Inline Images<td><code>ii</code>
	<tr><td>Lazyload Images<td><code>ll</code>
	<tr><td>WebP Images<td><code>jw</code>
	<tr><td>Lossless WebP Images<td><code>ws</code>
	<tr><td>Animated WebP Images<td><code>wa</code>
	<tr><td>Defer Javascript<td><code>dj</code>
	</table>
	<p>
	For example, you could include whether the <code>Accept</code> header
	includes <code>image/webp</code> in your cache key, and then for the
	fraction of traffic that claimed webp support send:
	</p>
	<pre>
	PS-CapabilityList: jw:
	</pre>
	<p>
	Every page would go through to your origin twice and be cached twice, once
	processed with WebP support and once without.
	</p>
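<p>
A minimal Nginx sketch of this split is shown below. The variable name
<code>$ps_caps_by_accept</code> and the cache key layout are illustrative
assumptions, not part of the standard configuration. The
<code>proxy_set_header</code> line would take the place of the fixed
<code>PS-CapabilityList</code> header, and the key should be adapted to the
<code>proxy_cache_key</code> you already use:
</p>
<pre class="prettyprint">
# http context: choose the capability list from the Accept header.
map $http_accept $ps_caps_by_accept {
  default         "fully general optimizations only";
  "~*image/webp"  "jw:";
}

server {
  location / {
    # Cache the WebP and non-WebP variants under separate keys.
    proxy_cache_key "$scheme$host$request_uri$ps_caps_by_accept";
    # Tell PageSpeed which optimizations this cache fragment may receive.
    proxy_set_header PS-CapabilityList $ps_caps_by_accept;
    ...
  }
}</pre>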
	<p>
You can combine multiple capabilities together with a comma. For example, if
you decided to make a cache fragment for Chrome 30+, which supports all of
these except animated WebP, for that fragment you would send:
	</p>
	<pre>
	PS-CapabilityList: ll,ii,dj,jw,ws:
	</pre>
	<p>
	For Firefox 4+, which supports all of these but WebP, you would send:
	</p>
	<pre>
	PS-CapabilityList: ll,ii,dj:
	</pre>
	<p>
To use this header properly, however, you have to know which capabilities are
supported by which browsers in the version of PageSpeed you're using, and craft
regular expressions that match exactly those browsers. This is very difficult
to do in general because it involves duplicating the code in
<a href="https://github.com/apache/incubator-pagespeed-mod/blob/latest-stable/pagespeed/kernel/http/user_agent_matcher.cc"><code>user_agent_matcher.cc</code></a>
as regexes, but a simple division is:
</p>
<ul>
<li>Chrome 32+: <code>ll,ii,dj,jw,wa,ws</code>
<li>Firefox 4+, Safari, IE10 (but not IE11): <code>ll,ii,dj</code>
<li>Everything else: <code>fully general optimizations only</code>
</ul>
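<p>
The simple division above could be approximated in Nginx with a
<code>map</code> on the <code>User-Agent</code> header. This is only a rough
sketch: the variable name <code>$ps_caps_by_ua</code> and the regular
expressions are assumptions written for illustration, they are not derived from
<code>user_agent_matcher.cc</code>, and other browsers whose
<code>User-Agent</code> contains <code>Chrome/</code> or <code>Safari/</code>
will fall into the same fragments:
</p>
<pre class="prettyprint">
# http context. Regular expressions are tried in the order they appear.
map $http_user_agent $ps_caps_by_ua {
  default "fully general optimizations only";

  # Chrome 32 and later. Must come before the Safari rule, because Chrome's
  # User-Agent also contains "Safari/".
  "~Chrome/(3[2-9]|[4-9][0-9]|[1-9][0-9][0-9]+)\."  "ll,ii,dj,jw,wa,ws:";
  # Older Chrome versions fall back to the general fragment.
  "~Chrome/"                                        "fully general optimizations only";
  # Firefox 4 and later.
  "~Firefox/([4-9]|[1-9][0-9]+)\."                  "ll,ii,dj:";
  # IE10. IE11 does not send "MSIE" and so gets the default.
  "~MSIE 10\."                                      "ll,ii,dj:";
  "~Safari/"                                        "ll,ii,dj:";
}</pre>
<p>
As with any cache fragment, the chosen value would need to be both part of the
cache key and sent to PageSpeed as the <code>PS-CapabilityList</code> header.
</p>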

	<h3 id="purge-get">Purging with GET</h3>

	<p>
	If you're integrating PageSpeed with a cache that doesn't
	support <code>PURGE</code> requests but does support purging in response to a
	prefixed <code>GET</code> request, PageSpeed can support that. You would
	configure your cache to treat a <code>GET</code> to
	<code>/purge/foo/bar</code> as a request to purge <code>/foo/bar</code> and
	configure PageSpeed as:
	</p>
	<dl>
	<dt>Apache:<dd><pre class="prettyprint">
	ModPagespeedDownstreamCachePurgeLocationPrefix http://CACHE-HOST:PORT/purge
	ModPagespeedDownstreamCachePurgeMethod GET</pre>
	<dt>Nginx:<dd><pre class="prettyprint">
	pagespeed DownstreamCachePurgeLocationPrefix http://CACHE-HOST:PORT/purge;
	pagespeed DownstreamCachePurgeMethod GET;</pre>
	</dl>
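<p>
On the cache side, a sketch for an Nginx <code>proxy_cache</code> might look
like the following. It assumes the third-party <code>ngx_cache_purge</code>
module is compiled in; the zone name <code>htmlcache</code>, the allowed
address, and the key layout are placeholders, and the key must be built the
same way as your <code>proxy_cache_key</code>:
</p>
<pre class="prettyprint">
# Treat GET /purge/foo/bar as a request to purge the cached entry for /foo/bar.
location ~ ^/purge(/.*) {
  # Only accept purge requests from the PageSpeed server.
  allow 127.0.0.1;
  deny all;
  # $1 is the path captured after the /purge prefix.
  proxy_cache_purge htmlcache $scheme$host$1$is_args$args;
}</pre>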

	<h3 id="purge-threshold">Purge Threshold</h3>
	<p>
	Whenever PageSpeed serves an HTML response that is not fully optimized it
	continues rewriting in the background. When it finishes, if the HTML it
	served was less than 95% optimized, it sends a purge request to the downstream
cache. The next request to come in will bypass the cache and come back to
PageSpeed, which can then serve the now more highly optimized page. If you
want to change the point at which PageSpeed considers the page done and stops
optimizing, you can set a different value for this threshold. For example, to
lower it to 80%, so that PageSpeed is satisfied with a page that is only 80%
optimized, you would set:
	</p>
	<dl>
	<dt>Apache:<dd><pre class="prettyprint">
	ModPagespeedDownstreamCacheRewrittenPercentageThreshold 80</pre>
	<dt>Nginx:<dd><pre class="prettyprint">
	pagespeed DownstreamCacheRewrittenPercentageThreshold 80;</pre>
	</dl>

	<h3 id="script-variables">Script Variables</h3>
	<p class="note"><strong>Note: Nginx-only</strong></p>
	<p class="note"><strong>Note: New feature as of 1.10.33.0</strong></p>

	<p>
In ngx_pagespeed, <code>DownstreamCachePurgeLocationPrefix</code>,
<code>DownstreamCachePurgeMethod</code>, and
<code>DownstreamCacheRewrittenPercentageThreshold</code> support script
variables, so you can set them on a per-request basis. Turn this on
with:
	<pre class="prettyprint">
	http {
	pagespeed ProcessScriptVariables on;
	...
	}</pre>
	You can then use script variables in arguments for these commands:
	<pre class="prettyprint">
	pagespeed DownstreamCachePurgeLocationPrefix "$purge_location";
	pagespeed DownstreamCachePurgeMethod "$cache_purge_method";
	pagespeed DownstreamCacheRewrittenPercentageThreshold "$rewrite_threshold";</pre>
	</p>
	<p>
	For more details on script variables, including how to handle dollar signs,
	see <a href="system#nginx_script_variables">Script Variable Support</a>.
	</p>
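<p>
As a sketch of where such variables might come from, the example below uses a
standard Nginx <code>map</code> and <code>set</code> to define them. The
hostname, port, and values are hypothetical, and it assumes that variables
defined this way are visible to PageSpeed like any other Nginx variable:
</p>
<pre class="prettyprint">
http {
  pagespeed ProcessScriptVariables on;

  # Hypothetical per-host threshold: accept 80% on one host, 95% elsewhere.
  map $host $rewrite_threshold {
    default             95;
    static.example.com  80;
  }

  server {
    # Placeholder purge endpoint and method for this server.
    set $purge_location "http://127.0.0.1:8020/purge";
    set $cache_purge_method "GET";

    pagespeed DownstreamCachePurgeLocationPrefix "$purge_location";
    pagespeed DownstreamCachePurgeMethod "$cache_purge_method";
    pagespeed DownstreamCacheRewrittenPercentageThreshold "$rewrite_threshold";
    ...
  }
}</pre>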
	<h2 id="implementation">Implementation Details</h2>
	<p>
To support downstream caching, PageSpeed sends a purge request to the caching
layer whenever it identifies an opportunity for more rewriting to be done on
content that was just served. Such opportunities arise, for example, when
resources become available in the PageSpeed cache or when an image
compression operation completes. The cache purge forces the next request for
the HTML file to come all the way to the backend PageSpeed server and obtain
better rewritten content, which is then stored in the cache. This interaction
between the PageSpeed server and the downstream caching layer is shown in
the diagram below.
	</p>
	<img src="images/downstream_caching.png" >
	<p>
In the interaction depicted above, note that the partially optimized HTML will
be served from the cache until the PageSpeed server sends a purge request.
It is recommended to set up PageSpeed and the downstream caching
layer servers on a one-to-one basis so that purges are sent to the
correct downstream server.
	</p>
	</div>
	<!--#include virtual="_footer.html" -->
	</body>
	</html>