blob: 18399f7d353f898063e5e5f268f45e07c117552e [file] [log] [blame]
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
.. include:: ../../common.defs
.. _developer-cache-tiered-storage:
Tiered Storage
**************
Tiered storage is an attempt to allow |TS| to take advantage of physical storage
with different properties. This design concerns only mechanism. Policies to take
advantage of these are outside of the scope of this document. Instead we will
presume an *oracle* which implements this policy and describe the queries that
must be answered by the oracle and the effects of the answers.
Beyond avoiding question of tier policy, the design is also intended to be
effectively identical to current operations for the case where there is only
one tier.
The most common case for tiers is an ordered list of tiers, where higher tiers
are presumed faster but more expensive (or more limited in capacity). This is
not required. It might be that different tiers are differentiated by other
properties (such as expected persistence). The design here is intended to
handle both cases.
The design presumes that if a user has multiple tiers of storage and an ordering
for those tiers, they will usually want content stored at one tier level to also
be stored at every other lower level as well, so that it does not have to be
copied if evicted from a higher tier.
Configuration
=============
Each :term:`storage unit` in :file:`storage.config` can be marked with a
*quality* value which is 32 bit number. Storage units that are not marked are
all assigned the same value which is guaranteed to be distinct from all explicit
values. The quality value is arbitrary from the point of view of this design,
serving as a tag rather than a numeric value. The user (via the oracle) can
impose what ever additional meaning is useful on this value (rating, bit
slicing, etc.).
In such cases, all :term:`volumes <cache volume>` should be explicitly assigned
a value, as the default (unmarked) value is not guaranteed to have any
relationship to explicit values. The unmarked value is intended to be useful in
situations where the user has no interest in tiered storage and so wants to let
|TS| automatically handle all volumes as a single tier.
Operations
==========
After a client request is received and processed, volume assignment is done. For
each tier, the oracle would return one of four values along with a volume
pointer:
``READ``
The tier appears to have the object and can serve it.
``WRITE``
The object is not in this tier and should be written to this tier if
possible.
``RW``
Treat as ``READ`` if possible, but if the object turns out to not in the
cache treat as ``WRITE``.
``NO_SALE``
Do not interact with this tier for this object.
The :term:`volume <cache volume>` returned for the tier must be a volume with
the corresponding tier quality value. In effect, the current style of volume
assignment is done for each tier, by assigning one volume out of all of the
volumes of the same quality and returning one of ``RW`` or ``WRITE``, depending
on whether the initial volume directory lookup succeeds. Note that as with
current volume assignment, it is presumed this can be done from in memory
structures (no disk I/O required).
If the oracle returns ``READ`` or ``RW`` for more than one tier, it must also
return an ordering for those tiers (it may return an ordering for all tiers,
ones that are not readable will be ignored). For each tier, in that order, a
read of cache storage is attempted for the object. A successful read locks that
tier as the provider of cached content. If no tier has a successful read, or no
tier is marked ``READ`` or ``RW`` then it is a cache miss. Any tier marked
``RW`` that fails the read test is demoted to ``WRITE``.
If the object is cached, every tier that returns ``WRITE`` receives the object
to store in the selected volume (this includes ``RW`` returns that are demoted
to ``WRITE``). This is a cache to cache copy, not from the :term:`origin server`.
In this case, tiers marked ``RW`` that are not tested for read will not receive
any data and will not be further involved in the request processing.
For a cache miss, all tiers marked ``WRITE`` will receive data from the origin
server connection (if successful).
This means, among other things, that if there is a tier with the object all
other tiers that are written will get a local copy of the object, and the origin
server will not be used. In terms of implementation, currently a cache write to
a volume is done via the construction of an instance of :cpp:class:`CacheVC`
which receives the object stream. For tiered storage, the same thing is done
for each target volume.
For cache volume overrides (via :file:`hosting.config`) this same process is
used except with only the volumes stripes contained within the specified cache
volume.
Copying
=======
It may be necessary to provide a mechanism to copy objects between tiers outside
of a client originated transaction. In terms of implementation, this is straight
forward using :cpp:class:`HttpTunnel` as if in a transaction, only using a
:cpp:class:`CacheVC` instance for both the producer and consumer. The more
difficult question is what event would trigger a possible copy. A signal could
be provided whenever a volume directory entry is deleted, although it should be
noted that the object in question may have already been evicted when this event
happens.
Additional Notes
================
As an example use, it would be possible to have only one cache volume that uses
tiered storage for a particular set of domains using volume tagging.
:file:`hosting.config` would be used to direct those domains to the selected
cache volume. The oracle would check the URL in parallel and return ``NO_SALE``
for the tiers in the target cache volume for other domains. For the other tier
(that of the unmarked storage units), the oracle would return ``RW`` for the
tier in all cases as that tier would not be queried for the target domains.