<?xml version="1.0"?>

<document>
	<properties>
		<title>Indexed Disk Auxiliary Cache</title>
		<author email="ASmuts@apache.org">Aaron Smuts</author>
	</properties>

	<body>
		<section name="Indexed Disk Auxiliary Cache">
			<p>
				The Indexed Disk Auxiliary Cache is an optional plugin
				for JCS. It is primarily intended to provide a
				secondary store that eases the memory burden of the
				cache. When the memory cache exceeds its maximum size,
				it tells the cache hub that the item being removed from
				memory should be spooled to disk. The cache checks
				whether any auxiliaries of type "disk" have been
				configured for the region. If the Indexed Disk Auxiliary
				Cache is configured, the item is spooled to disk.
			</p>

			<subsection name="Disk Indexing">
				<p>
					The Indexed Disk Auxiliary Cache uses a simple, fast
					pattern of disk caching. Items are stored at the end
					of a file dedicated to the cache region. Each disk
					entry begins with a header that records the length
					of the entry. The start position of the entry in the
					file is kept in memory, referenced by the item's
					key. This index still consumes memory, but the cost
					is small relative to the performance gained.
					Depending on the key size, 500,000 disk entries will
					probably require only about 1 MB of memory. Locating
					an item is as fast as a map lookup, and retrieving
					it requires only two disk accesses.
				</p>
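				<p>
					The layout described above can be sketched in a few
					lines of Java. This is a hypothetical illustration,
					not the JCS implementation: the class name
					<code>AppendOnlyIndexedFile</code>, the 4-byte
					length header, and the use of String keys are all
					assumptions made for the example.
				</p>
				<source>
					<![CDATA[
```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not JCS code: length-prefixed entries appended to a
// data file, with the start position of each entry kept in an in-memory map.
class AppendOnlyIndexedFile implements AutoCloseable {
    private final RandomAccessFile file;
    // key -> start position of the entry in the data file
    private final Map<String, Long> index = new HashMap<>();

    AppendOnlyIndexedFile(Path path) throws IOException {
        this.file = new RandomAccessFile(path.toFile(), "rw");
    }

    void put(String key, byte[] value) throws IOException {
        long start = file.length();   // new entries go at the end of the file
        file.seek(start);
        file.writeInt(value.length);  // header: entry length (4 bytes, assumed)
        file.write(value);
        index.put(key, start);        // remember where the entry begins
    }

    byte[] get(String key) throws IOException {
        Long start = index.get(key);  // map lookup, no disk access
        if (start == null) {
            return null;
        }
        file.seek(start);
        int length = file.readInt();  // first disk access: read the length
        byte[] value = new byte[length];
        file.readFully(value);        // second disk access: read the entry
        return value;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```
        ]]>
				</source>
				<p>
					The only per-entry memory cost in this sketch is the
					key and one position value, which is why the index
					stays small compared with the cached values.
				</p>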
				<p>
					When items are removed from the disk cache, the
					location of the available block on the storage file
					is recorded in a sorted preferential array of a size
					not to exceed the maximum number of keys allowed in
					memory. This allows the disk cache to reuse empty
					spots, thereby keeping the file size to a minimum.
				</p>
			</subsection>

			<subsection name="Purgatory">
				<p>
					Writing to the disk cache is asynchronous and made
					efficient by using a memory staging area called
					purgatory. Retrievals check purgatory then disk for
					an item. When items are sent to purgatory they are
					simultaneously queued to be put to disk. If an item
					is retrieved from purgatory it will no longer be
					written to disk, since the cache hub will move it
					back to memory. Using purgatory ensures that there
					is no wait for disk writes, that unnecessary disk
					writes are avoided for borderline items, and that
					the items are always available.
				</p>
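				<p>
					The purgatory behavior can be sketched as follows.
					This is a simplified, hypothetical model rather than
					the JCS code: the class
					<code>PurgatorySketch</code> is invented for the
					example, a map stands in for the data file, and
					<code>drainToDisk</code> is called by hand where the
					real cache would use a background thread.
				</p>
				<source>
					<![CDATA[
```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not JCS code: a staging map (purgatory) plus a queue
// of pending writes. A map stands in for the data file on disk.
class PurgatorySketch {
    private final Map<String, String> purgatory = new ConcurrentHashMap<>();
    private final Queue<String> writeQueue = new ConcurrentLinkedQueue<>();
    private final Map<String, String> disk = new ConcurrentHashMap<>();

    // Stage the item and queue it for writing; the caller never waits on disk.
    void put(String key, String value) {
        purgatory.put(key, value);
        writeQueue.add(key);
    }

    // Check purgatory first, then disk. An item found in purgatory is removed,
    // so its queued write is skipped: the caller is taking it back to memory.
    String get(String key) {
        String staged = purgatory.remove(key);
        if (staged != null) {
            return staged;
        }
        return disk.get(key);
    }

    // Processes the pending writes and returns how many items reached "disk".
    // In a real cache this work would run on a background thread.
    int drainToDisk() {
        int written = 0;
        String key;
        while ((key = writeQueue.poll()) != null) {
            String value = purgatory.remove(key);
            if (value != null) {       // null means it was retrieved meanwhile
                disk.put(key, value);
                written++;
            }
        }
        return written;
    }
}
```
        ]]>
				</source>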
			</subsection>

			<subsection name="Persistence">
				<p>
					When the disk cache is shut down properly, the
					memory index is written to disk and the value file
					is defragmented. When the cache starts up, the disk
					cache can be configured to read or delete the index
					file. This provides a persistence mechanism, though
					not a reliable one.
				</p>
			</subsection>

			<subsection name="Configuration">
				<p>
					Configuration is simple and is done in the
					auxiliary cache section of the
					<code>cache.ccf</code>
					configuration file. In the example below, I created
					an Indexed Disk Auxiliary Cache referenced by
					<code>DC</code>. It uses files located in the
					"DiskPath" directory.
				</p>
				<p>
					The disk indexes are equipped with an LRU storage
					limit. The maximum number of keys is configured by
					the MaxKeySize parameter. If MaxKeySize is less
					than 0, no limit is placed on the number of keys.
					By default, MaxKeySize is 5000.
				</p>
				<source>
					<![CDATA[
jcs.auxiliary.DC=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
# Use forward slashes: a backslash is an escape character in a properties file.
jcs.auxiliary.DC.attributes.DiskPath=g:/dev/jakarta-turbine-stratum/raf
jcs.auxiliary.DC.attributes.MaxKeySize=100000
        ]]>
				</source>
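				<p>
					The effect of an LRU limit on the key index can be
					sketched with a standard
					<code>LinkedHashMap</code>. This is only an
					illustration of the idea and not how JCS implements
					it; the class name <code>LruKeyIndex</code> is
					hypothetical, and the limit is passed in the way
					MaxKeySize would be, with a negative value meaning
					no limit.
				</p>
				<source>
					<![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not JCS code: a key -> file position index that drops
// the least recently used key once the configured limit is exceeded.
class LruKeyIndex extends LinkedHashMap<String, Long> {
    private final int maxKeySize;

    LruKeyIndex(int maxKeySize) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.maxKeySize = maxKeySize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        // A negative limit means the number of keys is unbounded.
        return maxKeySize >= 0 && size() > maxKeySize;
    }
}
```
        ]]>
				</source>
				<p>
					Note that once a key falls out of the index, the
					entry it pointed to can no longer be found, even
					though its bytes may still be in the data file.
				</p>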
			</subsection>

			<subsection name="Additional Configuration Options">
				<p>
					The indexed disk cache provides some additional
					configuration options.
				</p>
				<p>
					The purgatory of the disk cache is equipped
					with an LRU storage limit. The maximum number of
					elements allowed in purgatory is configured by the
					MaxPurgatorySize parameter. By default, the max
					purgatory size is 5000.
				</p>
				<p>
					Initial testing indicates that the disk cache
					performs better when the key and purgatory sizes are
					limited.
				</p>
				<source>
					<![CDATA[
jcs.auxiliary.DC.attributes.MaxPurgatorySize=10000
        ]]>
				</source>
				<p>
					Slots in the data file become empty when items are
					removed from the disk cache. The indexed disk cache
					keeps track of empty slots in the data file, so they
					can be reused. The slot locations are stored in a
					sorted preferential array -- the recycle bin. The
					smallest items are removed from the recycle bin when
					it reaches the specified limit. The
					MaxRecycleBinSize cannot be larger than the
					MaxKeySize. If the MaxKeySize is less than 0, the
					recycle bin will default to 5000.
				</p>
				<source>
					<![CDATA[
jcs.auxiliary.DC.attributes.MaxRecycleBinSize=10000
        ]]>
				</source>
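				<p>
					A bounded, size-sorted list of free slots might look
					roughly like the sketch below. It is a hypothetical
					model and not the JCS data structure: the names
					<code>RecycleBin</code> and <code>Slot</code>, and
					the choice to hand out the smallest slot that fits,
					are assumptions made for illustration.
				</p>
				<source>
					<![CDATA[
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not JCS code: free slots kept sorted by length, with
// the smallest slot discarded when the bin grows past its limit.
class RecycleBin {
    static final class Slot {
        final long position; // where the free block starts in the data file
        final int length;    // how many bytes it can hold

        Slot(long position, int length) {
            this.position = position;
            this.length = length;
        }
    }

    private static final Comparator<Slot> BY_LENGTH =
            Comparator.comparingInt((Slot s) -> s.length);

    private final int maxSize;
    private final List<Slot> slots = new ArrayList<>(); // ascending by length

    RecycleBin(int maxSize) {
        this.maxSize = maxSize;
    }

    void add(Slot slot) {
        int i = Collections.binarySearch(slots, slot, BY_LENGTH);
        if (i < 0) {
            i = -i - 1;      // convert "not found" result to insertion point
        }
        slots.add(i, slot);
        if (slots.size() > maxSize) {
            slots.remove(0); // over the limit: drop the smallest slot
        }
    }

    // Returns the smallest slot that can hold the entry, or null if none fits
    // (in which case the entry would go at the end of the file).
    Slot take(int needed) {
        for (int i = 0; i < slots.size(); i++) {
            if (slots.get(i).length >= needed) {
                return slots.remove(i);
            }
        }
        return null;
    }

    int size() {
        return slots.size();
    }
}
```
        ]]>
				</source>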
				<p>
					The disk cache can be configured to defragment the
					data file at runtime. Since defragmentation is only
					necessary if items have been removed, the
					defragmentation interval is determined by the number
					of removes. Currently there is no way to schedule
					defragmentation to run at a set time. If you set
					OptimizeAtRemoveCount to -1, the data file is not
					optimized until shutdown. The default value is -1.
				</p>
				<source>
					<![CDATA[
jcs.auxiliary.DC.attributes.OptimizeAtRemoveCount=30000
        ]]>
				</source>
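				<p>
					The remove-count trigger amounts to a simple
					counter, sketched below. This is a hypothetical
					illustration and not the JCS code: the class name
					<code>RemoveCountOptimizer</code> is invented, the
					defragmentation step is passed in as a
					<code>Runnable</code>, and treating every
					non-positive setting as "disabled" is an assumption
					(the documentation above only specifies -1).
				</p>
				<source>
					<![CDATA[
```java
// Hypothetical sketch, not JCS code: run a defragmentation callback after
// every N removes, where N plays the role of OptimizeAtRemoveCount.
class RemoveCountOptimizer {
    private final int optimizeAtRemoveCount;
    private final Runnable defragment;
    private int removesSinceLastOptimization;

    RemoveCountOptimizer(int optimizeAtRemoveCount, Runnable defragment) {
        this.optimizeAtRemoveCount = optimizeAtRemoveCount;
        this.defragment = defragment;
    }

    // Call once for every item removed from the disk cache.
    void onRemove() {
        if (optimizeAtRemoveCount <= 0) {
            return; // assumed: non-positive disables runtime optimization
        }
        removesSinceLastOptimization++;
        if (removesSinceLastOptimization >= optimizeAtRemoveCount) {
            removesSinceLastOptimization = 0;
            defragment.run();
        }
    }
}
```
        ]]>
				</source>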
			</subsection>

			<subsection name="A Complete Configuration Example">
				<p>
					In this sample cache.ccf file, I configured the
					cache to use a disk cache, called DC, by default.
					Also, I explicitly set a cache region called
					myRegion1 to use DC. I specified custom settings for
					all of the Indexed Disk Cache configuration
					parameters.
				</p>
				<source>
					<![CDATA[        
##############################################################
##### Default Region Configuration
jcs.default=DC
jcs.default.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=100
jcs.default.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache

##############################################################
##### CACHE REGIONS
jcs.region.myRegion1=DC
jcs.region.myRegion1.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.region.myRegion1.cacheattributes.MaxObjects=1000
jcs.region.myRegion1.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache

##############################################################
##### AUXILIARY CACHES
# Indexed Disk Cache
jcs.auxiliary.DC=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC.attributes.DiskPath=target/test-sandbox/indexed-disk-cache
jcs.auxiliary.DC.attributes.MaxPurgatorySize=10000
jcs.auxiliary.DC.attributes.MaxKeySize=10000
jcs.auxiliary.DC.attributes.OptimizeAtRemoveCount=300000
jcs.auxiliary.DC.attributes.MaxRecycleBinSize=7500
        ]]>
				</source>
			</subsection>

			<subsection name="Using Thread Pools to Reduce Threads">
				<p>
					The Indexed Disk Cache allows you to use fewer
					threads than active regions. By default the disk
					cache will use the standard cache event queue which
					has a dedicated thread. Although the standard queue
					kills its worker thread after a minute of
					inactivity, you may want to restrict the total
					number of threads. You can accomplish this by using
					a pooled event queue.
				</p>
				<p>
					The configuration file below defines a disk cache
					called DC2. It uses an event queue of type POOLED.
					The queue is named disk_cache_event_queue, and it
					is defined at the bottom of the file.
				</p>
				<source>
					<![CDATA[ 
##############################################################
################## DEFAULT CACHE REGION  #####################
# sets the default aux value for any non configured caches
jcs.default=DC2
jcs.default.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=200001
jcs.default.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache
jcs.default.cacheattributes.UseMemoryShrinker=false
jcs.default.cacheattributes.MaxMemoryIdleTimeSeconds=3600
jcs.default.cacheattributes.ShrinkerIntervalSeconds=60
jcs.default.elementattributes=org.apache.jcs.engine.ElementAttributes
jcs.default.elementattributes.IsEternal=false
jcs.default.elementattributes.MaxLifeSeconds=700
jcs.default.elementattributes.IdleTime=1800
jcs.default.elementattributes.IsSpool=true
jcs.default.elementattributes.IsRemote=true
jcs.default.elementattributes.IsLateral=true

##############################################################
################## AUXILIARY CACHES AVAILABLE ################

# Disk Cache Using a Pooled Event Queue -- this allows you
# to control the maximum number of threads it will use.
# Each region uses 1 thread by default in the SINGLE model.
# adding more threads than regions does not help performance.
# If you want to use a separate pool for each disk cache, either use
# the single model or define a different auxiliary for each region and use the Pooled type.
# SINGLE is generally best unless you have a huge # of regions.
jcs.auxiliary.DC2=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC2.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC2.attributes.DiskPath=target/test-sandbox/raf
jcs.auxiliary.DC2.attributes.MaxPurgatorySize=10000
jcs.auxiliary.DC2.attributes.MaxKeySize=10000
jcs.auxiliary.DC2.attributes.MaxRecycleBinSize=5000
jcs.auxiliary.DC2.attributes.OptimizeAtRemoveCount=300000
jcs.auxiliary.DC2.attributes.EventQueueType=POOLED
jcs.auxiliary.DC2.attributes.EventQueuePoolName=disk_cache_event_queue

##############################################################
################## OPTIONAL THREAD POOL CONFIGURATION ########

# Disk Cache Event Queue Pool
thread_pool.disk_cache_event_queue.useBoundary=false
thread_pool.disk_cache_event_queue.maximumPoolSize=15
thread_pool.disk_cache_event_queue.minimumPoolSize=1
thread_pool.disk_cache_event_queue.keepAliveTime=3500
thread_pool.disk_cache_event_queue.startUpSize=1
        ]]>
				</source>
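				<p>
					The idea behind the pooled queue, several regions
					sharing a capped set of worker threads, can be
					sketched with the standard
					<code>java.util.concurrent</code> classes. This is
					not how JCS builds its event queues; the class
					<code>SharedDiskEventPool</code> and its methods are
					hypothetical and exist only to show that the number
					of threads is set by the pool, not by the number of
					regions.
				</p>
				<source>
					<![CDATA[
```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not JCS code: disk events from any number of regions
// are run by one shared pool with a fixed upper bound on threads.
class SharedDiskEventPool {
    private final ExecutorService pool;
    // Names of the worker threads that actually ran an event.
    private final Set<String> workers = ConcurrentHashMap.newKeySet();

    SharedDiskEventPool(int maxThreads) {
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    // Any region may submit a disk event; it runs on one of the shared threads.
    void submit(Runnable diskEvent) {
        pool.execute(() -> {
            workers.add(Thread.currentThread().getName());
            diskEvent.run();
        });
    }

    // Stops accepting events, waits up to 10 seconds for queued events to
    // finish, and reports how many distinct threads were used.
    int shutdownAndCountWorkers() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return workers.size();
    }
}
```
        ]]>
				</source>
				<p>
					With the SINGLE model each region would have its own
					thread. Here, five regions submitting events to a
					pool created with two threads would never use more
					than two.
				</p>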
			</subsection>
		</section>
	</body>
</document>
