<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
<link rel="icon" href="/favicon.ico" type="image/x-icon">
<title>Storm Distributed Cache API</title>
<!-- Bootstrap core CSS -->
<link href="/assets/css/bootstrap.min.css" rel="stylesheet">
<!-- Bootstrap theme -->
<link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
<link href="/css/style.css" rel="stylesheet">
<link href="/assets/css/owl.theme.css" rel="stylesheet">
<link href="/assets/css/owl.carousel.css" rel="stylesheet">
<script type="text/javascript" src="/assets/js/jquery.min.js"></script>
<script type="text/javascript" src="/assets/js/bootstrap.min.js"></script>
<script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script>
<script type="text/javascript" src="/assets/js/storm.js"></script>
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<header>
<div class="container-fluid">
<div class="row">
<div class="col-md-5">
<a href="/index.html"><img src="/images/logo.png" class="logo" /></a>
</div>
<div class="col-md-5">
<h1>Version: 2.3.0</h1>
</div>
<div class="col-md-2">
<a href="/downloads.html" class="btn-std btn-block btn-download">Download</a>
</div>
</div>
</div>
</header>
<!--Header End-->
<!--Navigation Begin-->
<div class="navbar" role="banner">
<div class="container-fluid">
<div class="navbar-header">
<button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation">
<ul class="nav navbar-nav">
<li><a href="/index.html" id="home">Home</a></li>
<li><a href="/getting-help.html" id="getting-help">Getting Help</a></li>
<li><a href="/about/integrates.html" id="project-info">Project Information</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" id="documentation">Documentation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="/releases/2.3.0/index.html">2.3.0</a></li>
<li><a href="/releases/2.2.0/index.html">2.2.0</a></li>
<li><a href="/releases/2.1.0/index.html">2.1.0</a></li>
<li><a href="/releases/2.0.0/index.html">2.0.0</a></li>
<li><a href="/releases/1.2.3/index.html">1.2.3</a></li>
</ul>
</li>
<li><a href="/talksAndVideos.html">Talks and Slideshows</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li>
<li><a href="/contribute/People.html">People</a></li>
<li><a href="/contribute/BYLAWS.html">ByLaws</a></li>
</ul>
</li>
<li><a href="/2021/09/27/storm230-released.html" id="news">News</a></li>
</ul>
</nav>
</div>
</div>
<div class="container-fluid">
<h1 class="page-title">Storm Distributed Cache API</h1>
<div class="row">
<div class="col-md-12">
<!-- Documentation -->
<p class="post-meta"></p>
<div class="documentation-content"><h1 id="storm-distributed-cache-api">Storm Distributed Cache API</h1>
<p>The distributed cache feature in Storm is used to efficiently distribute files
(or blobs, which is the equivalent terminology for a file in the distributed
cache and is used interchangeably in this document) that are large and can
change during the lifetime of a topology, such as geo-location data,
dictionaries, etc. Typical use cases include phrase recognition, entity
extraction, document classification, URL re-writing, location/address detection
and so forth. Such files may be several KB to several GB in size. For small
datasets that don&#39;t need dynamic updates, including them in the topology jar
could be fine. But for large files, the startup times could become very large.
In these cases, the distributed cache feature can provide fast topology startup,
especially if the files were previously downloaded for the same submitter and
are still in the cache. This is useful with frequent deployments, sometimes a few
times a day with updated jars, because the large cached blobs that do not change
frequently remain available in the distributed cache.</p>
<p>When a topology starts, the user specifies the set of files the
topology needs. Once a topology is running, the user can at any time request that
any file in the distributed cache be updated with a newer version. The
updating of blobs follows an eventual consistency model. If the topology
needs to know what version of a file it has access to, it is the responsibility
of the user to find this information out. The files are stored in a cache with a
Least-Recently Used (LRU) eviction policy, where the supervisor decides which
cached files are no longer needed and can delete them to free disk space. The
blobs can be compressed, and the user can request that the blobs be uncompressed
before the topology accesses them.</p>
<h2 id="motivation-for-distributed-cache">Motivation for Distributed Cache</h2>
<ul>
<li>Allows sharing blobs among topologies.</li>
<li>Allows updating the blobs from the command line.</li>
</ul>
<h2 id="distributed-cache-implementations">Distributed Cache Implementations</h2>
<p>The current BlobStore interface has the following two implementations:</p>
<ul>
<li>LocalFsBlobStore</li>
<li>HdfsBlobStore</li>
</ul>
<p>Appendix A contains the interface for blobstore implementation.</p>
<h2 id="localfsblobstore">LocalFsBlobStore</h2>
<p><img src="images/local_blobstore.png" alt="LocalFsBlobStore"></p>
<p>The timeline diagram above depicts the local file system implementation of the blobstore.</p>
<p>There are several stages from blob creation to blob download and the corresponding execution of a topology.
The main stages are described below.</p>
<h3 id="blob-creation-command">Blob Creation Command</h3>
<p>Blobs in the blobstore can be created through command line using the following command.</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore create --file README.txt --acl o::rwa --replication-factor 4 key1
</code></pre></div>
<p>The above command creates a blob with the key name “key1” corresponding to the file README.txt.
All users are given read, write and admin access, and the replication factor is set to 4.</p>
<h3 id="topology-submission-and-blob-mapping">Topology Submission and Blob Mapping</h3>
<p>Users can submit their topology with the following command. The command includes the
topology map configuration. The configuration holds two keys, “key1” and “key2”. The
key “key1” is mapped to the local file name “blob_file” and is not uncompressed.</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm jar /home/y/lib/storm-starter/current/storm-starter-jar-with-dependencies.jar \
org.apache.storm.starter.clj.word_count test_topo -c topology.blobstore.map='{"key1":{"localname":"blob_file", "uncompress":false},"key2":{}}'
</code></pre></div>
<h3 id="blob-creation-process">Blob Creation Process</h3>
<p>The creation of the blob takes place through the interface “ClientBlobStore”. Appendix B contains the “ClientBlobStore” interface.
The concrete implementation of this interface is the “NimbusBlobStore”. In the case of local file system the client makes a
call to the nimbus to create the blobs within the local file system. The nimbus uses the local file system implementation to create these blobs.
When a user submits a topology, the jar, configuration and code files are uploaded as blobs with the help of blobstore.
Also, all the other blobs specified by the topology are mapped to it with the help of topology.blobstore.map configuration.</p>
<h3 id="blob-download-by-the-supervisor">Blob Download by the Supervisor</h3>
<p>Finally, the blobs corresponding to a topology are downloaded by the supervisor once it receives the assignments from the nimbus through
the same “NimbusBlobStore” thrift client that uploaded the blobs. The supervisor downloads the code, jar and conf blobs by calling the
“NimbusBlobStore” client directly while the blobs specified in the topology.blobstore.map are downloaded and mapped locally with the help
of the Localizer. The Localizer talks to the “NimbusBlobStore” thrift client to download the blobs and adds the blob compression and local
blob name mapping logic to suit the implementation of a topology. Once all the blobs have been downloaded the workers are launched to run
the topologies.</p>
<h2 id="hdfsblobstore">HdfsBlobStore</h2>
<p><img src="images/hdfs_blobstore.png" alt="HdfsBlobStore"></p>
<p>The HdfsBlobStore has a similar implementation and a similar blob creation and download procedure; the two blobstore
implementations differ in how replication is handled. Replication in the HDFS blobstore is straightforward, as HDFS is equipped to handle replication
and requires no state to be stored inside the zookeeper. On the other hand, the local file system blobstore requires the state to be
stored on the zookeeper in order for it to work with nimbus HA. Nimbus HA allows the local file system to implement the replication feature
seamlessly by storing the state about the running topologies in the zookeeper and syncing the blobs on the various nimbuses. On the supervisor’s
end, the supervisor and localizer talk to the HdfsBlobStore through the “HdfsClientBlobStore” implementation.</p>
<h2 id="additional-features-and-documentation">Additional Features and Documentation</h2>
<div class="highlight"><pre><code class="language-" data-lang="">storm jar /home/y/lib/storm-starter/current/storm-starter-jar-with-dependencies.jar org.apache.storm.starter.clj.word_count test_topo \
-c topology.blobstore.map='{"key1":{"localname":"blob_file", "uncompress":false},"key2":{}}'
</code></pre></div>
<h3 id="compression">Compression</h3>
<p>The blobstore allows the user to set the “uncompress” configuration to true or false. This configuration is specified
in the topology.blobstore.map shown in the above command. This allows the user to upload a compressed file such as a tarball or zip.
In the local file system blobstore, the compressed blobs are stored on the nimbus node. The localizer code takes responsibility for
uncompressing the blob and storing it on the supervisor node. Symbolic links to the blobs on the supervisor node are created within the worker
before the execution starts.</p>
<h3 id="local-file-name-mapping">Local File Name Mapping</h3>
<p>Apart from compression the blobstore helps to give the blob a name that can be used by the workers. The localizer takes
the responsibility of mapping the blob to a local name on the supervisor node.</p>
<h2 id="additional-blobstore-implementation-details">Additional Blobstore Implementation Details</h2>
<p>The blobstore uses a hashing function to create the blobs based on the key. The blobs are generally stored inside the directory specified by
the blobstore.dir configuration. By default, they are stored under “storm.local.dir/blobs” for the local file system and a similar path on
the HDFS file system.</p>
<p>Once a file is submitted, the blobstore reads the configs and creates metadata for the blob with all the access control details. The metadata
is generally used for authorization while accessing the blobs. The blob key and version contribute to the hash code and thereby to the directory
under “storm.local.dir/blobs/data” where the data is placed. The blobs are generally placed in a directory named with a positive number, such as 193 or 822.</p>
<p>Once the topology is launched and the relevant blobs have been created, the supervisor first downloads the blobs related to storm.conf, storm.ser
and storm.code. It then separately downloads all the blobs uploaded through the command line, using the localizer to uncompress them and map them to the local name
specified in the topology.blobstore.map configuration. The supervisor periodically updates blobs by checking for a change of version.
This allows the blobs to be updated on the fly, which makes this a very useful feature.</p>
<p>For a local file system, the distributed cache on the supervisor node is set to 10240 MB as a soft limit and the clean up code attempts
to clean anything over the soft limit every 600 seconds based on LRU policy.</p>
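<p>The soft-limit clean-up described above can be pictured with a small sketch. This is not Storm&#39;s localizer code: the class
<code>LruCacheSketch</code> and its methods are hypothetical names chosen for illustration, sizes are tracked in MB only, and the sketch ignores
whether a blob is still in use by a running topology. It shows an access-ordered map from which the least recently used entries are removed
until the total size is back at or under the soft limit.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of soft-limit LRU clean-up; not Storm's implementation.
class LruCacheSketch {
    // accessOrder=true keeps the least recently used entry first in iteration order.
    private final LinkedHashMap&lt;String, Long&gt; sizesMb = new LinkedHashMap&lt;&gt;(16, 0.75f, true);
    private final long softLimitMb;

    LruCacheSketch(long softLimitMb) {
        this.softLimitMb = softLimitMb;
    }

    // Record that a blob was downloaded or used; the cache may exceed the soft limit here.
    void touch(String key, long sizeMb) {
        sizesMb.put(key, sizeMb);
    }

    long totalMb() {
        long total = 0;
        for (long size : sizesMb.values()) {
            total += size;
        }
        return total;
    }

    boolean contains(String key) {
        return sizesMb.containsKey(key);
    }

    // Meant to be called periodically: evicts least recently used blobs
    // until the total is at or under the soft limit. Returns the number evicted.
    int cleanup() {
        long total = totalMb();
        int evicted = 0;
        Iterator&lt;Map.Entry&lt;String, Long&gt;&gt; it = sizesMb.entrySet().iterator();
        while (total &gt; softLimitMb &amp;&amp; it.hasNext()) {
            Map.Entry&lt;String, Long&gt; oldest = it.next();
            total -= oldest.getValue();
            it.remove();
            evicted++;
        }
        return evicted;
    }
}
</code></pre></div>
<p>Because the limit is soft, <code>touch</code> never rejects a blob; the cache is only trimmed when <code>cleanup</code> runs.</p>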
<p>The HDFS blobstore implementation handles load better by removing the burden on the nimbus to store the blobs, which avoids it becoming a bottleneck. Moreover, it provides seamless replication of blobs. On the other hand, the local file system blobstore is not very efficient in
replicating the blobs and is limited by the number of nimbuses. Moreover, the supervisor talks to the HDFS blobstore directly without the
involvement of the nimbus and thereby reduces the load and dependency on nimbus.</p>
<h2 id="highly-available-nimbus">Highly Available Nimbus</h2>
<h3 id="problem-statement">Problem Statement:</h3>
<p>Currently the storm master, aka nimbus, is a process that runs on a single machine under supervision. In most cases, a
nimbus failure is transient and nimbus is restarted by the process that does supervision. However, when disks fail or network
partitions occur, nimbus goes down. Under these circumstances, the topologies run normally but no new topologies can be
submitted, no existing topologies can be killed/deactivated/activated and, if a supervisor node fails, the
reassignments are not performed, resulting in performance degradation or topology failures. With this project we intend
to resolve this problem by running nimbus in a primary backup mode to guarantee that even if a nimbus server fails one
of the backups will take over. </p>
<h3 id="requirements-for-highly-available-nimbus">Requirements for Highly Available Nimbus:</h3>
<ul>
<li>Increase overall availability of nimbus.</li>
<li>Allow nimbus hosts to leave and join the cluster at any time. A newly joined host should catch up and join
the list of potential leaders automatically. </li>
<li>No topology resubmissions required in case of nimbus fail overs.</li>
<li>No active topology should ever be lost. </li>
</ul>
<h4 id="leader-election">Leader Election:</h4>
<p>The nimbus server will use the following interface:</p>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">ILeaderElector</span> <span class="o">{</span>
<span class="cm">/**
* queue up for leadership lock. The call returns immediately and the caller
* must check isLeader() to perform any leadership action.
*/</span>
<span class="kt">void</span> <span class="nf">addToLeaderLockQueue</span><span class="o">();</span>
<span class="cm">/**
* Removes the caller from the leader lock queue. If the caller is leader
* also releases the lock.
*/</span>
<span class="kt">void</span> <span class="nf">removeFromLeaderLockQueue</span><span class="o">();</span>
<span class="cm">/**
*
* @return true if the caller currently has the leader lock.
*/</span>
<span class="kt">boolean</span> <span class="nf">isLeader</span><span class="o">();</span>
<span class="cm">/**
*
* @return the current leader's address; throws an exception if no one has the lock.
*/</span>
<span class="n">InetSocketAddress</span> <span class="nf">getLeaderAddress</span><span class="o">();</span>
<span class="cm">/**
*
* @return list of current nimbus addresses, includes leader.
*/</span>
<span class="n">List</span><span class="o">&lt;</span><span class="n">InetSocketAddress</span><span class="o">&gt;</span> <span class="nf">getAllNimbusAddresses</span><span class="o">();</span>
<span class="o">}</span>
</code></pre></div>
<p>Once a nimbus comes up, it calls the addToLeaderLockQueue() function. The leader election code selects a leader from the queue.
If the topology code, jar or config blobs are missing, the nimbus downloads the blobs from any other nimbus which is up and running.</p>
<p>The first implementation will be Zookeeper based. If the zookeeper connection is lost or reset, resulting in loss of the lock
or of the spot in the queue, the implementation will take care of updating the state such that isLeader() reflects the
current status. Leader-like actions must finish in less than minimumOf(connectionTimeout, SessionTimeout) to ensure
the lock was held by nimbus for the entire duration of the action. (It is not yet decided whether to just state this expectation
and ensure that the zk configurations are set high enough, which will result in a higher failover time, or to
create some sort of rollback mechanism for all actions; the second option needs a lot of code.) If a nimbus that is not the
leader receives a request that only a leader can perform, it will throw a RuntimeException.</p>
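<p>The contract above can be illustrated with a toy, single-process sketch. It does not implement <code>ILeaderElector</code> and
involves no Zookeeper; the class <code>LeaderQueueSketch</code>, the string nimbus ids and the <code>requireLeader</code> guard are
hypothetical names introduced here. It only shows the intended behaviour: queueing returns immediately, the head of the queue
is the leader, leadership moves on when the leader leaves, and a non-leader attempting a leader-only action gets a RuntimeException.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.ArrayDeque;
import java.util.Deque;

// Toy in-memory stand-in for a leader lock queue; hypothetical, for illustration only.
class LeaderQueueSketch {
    private final Deque&lt;String&gt; queue = new ArrayDeque&lt;&gt;();

    // Queue up for the leadership lock; returns immediately.
    synchronized void addToLeaderLockQueue(String nimbusId) {
        if (!queue.contains(nimbusId)) {
            queue.addLast(nimbusId);
        }
    }

    // Leave the queue; if the caller was the leader, the next entry becomes leader.
    synchronized void removeFromLeaderLockQueue(String nimbusId) {
        queue.remove(nimbusId);
    }

    // The head of the queue holds the lock.
    synchronized boolean isLeader(String nimbusId) {
        return nimbusId.equals(queue.peekFirst());
    }

    // Guard for operations that only the leader may perform.
    synchronized void requireLeader(String nimbusId) {
        if (!isLeader(nimbusId)) {
            throw new RuntimeException(nimbusId + " is not the leader");
        }
    }
}
</code></pre></div>
<p>A caller would invoke <code>addToLeaderLockQueue</code> once at startup and then check <code>isLeader</code> before every leadership action, since the answer can change at any time.</p>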
<h3 id="nimbus-state-store">Nimbus state store:</h3>
<p>To achieve fail over from primary to backup servers, the nimbus state/data needs to be replicated across all nimbus hosts or
needs to be stored in a distributed storage. Replicating the data correctly involves state management and consistency checks,
and it is hard to test for correctness. However, many storm users do not want to take an extra dependency on another replicated
storage system like HDFS and still need high availability. The blobstore implementation along with the state storage helps
to handle the failover scenarios in case a leader nimbus goes down.</p>
<p>To support replication we will allow the user to define a code replication factor which reflects the number of nimbus
hosts to which the code must be replicated before starting the topology. With replication comes the issue of consistency.
The topology is launched once the code, jar and conf blob files are replicated based on the &quot;topology.min.replication.count&quot; config.
Maintaining state for failover scenarios is important for the local file system. The current implementation makes sure one of the
available nimbuses is elected as a leader in the case of a failure. If the topology specific blobs are missing, the leader nimbus
tries to download them as and when they are needed. With this architecture, a nimbus does not have to download all the blobs
required for a topology in order to accept leadership. This helps when the blobs are very large and avoids causing any
inadvertent delays in electing a leader.</p>
<p>The state for every blob is relevant for the local blobstore implementation. For the HDFS blobstore the replication
is taken care of by HDFS. To handle the fail over scenarios for a local blobstore we need to store the state of the leader and
non-leader nimbuses within the zookeeper.</p>
<p>The state is stored under /storm/blobstore/key/nimbusHostPort:SequenceNumber so that the blobstore can make nimbus highly available.
This state is used in the local file system blobstore to support replication. The HDFS blobstore does not have to store the state inside the
zookeeper.</p>
<ul>
<li><p>NimbusHostPort: This piece of information generally contains the parsed string holding the hostname and port of the nimbus.
It uses the same class “NimbusHostPortInfo” used earlier by the code-distributor interface to store the state and parse the data.</p></li>
<li><p>SequenceNumber: This is the blob sequence number information. The SequenceNumber information is implemented by a KeySequenceNumber class.
The sequence numbers are generated for every key. For every update, the sequence numbers are assigned based on a global sequence number
stored under /storm/blobstoremaxsequencenumber/key. For more details about how the numbers are generated, see the Javadoc for KeySequenceNumber.</p></li>
</ul>
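<p>As a rough illustration of how such state could be read, the sketch below splits child node names into the nimbus host/port and
the sequence number, and lists the nimbuses holding the highest sequence number for a key. The helper <code>BlobStateSketch</code> is
hypothetical, and it assumes that names look like <code>host:port:sequence</code> and that a higher sequence number means a newer copy; the exact
encoding used by Storm may differ.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes child names of the form "host:port:sequence".
class BlobStateSketch {
    // Everything before the last ':' is treated as the nimbus host and port.
    static String hostPort(String node) {
        return node.substring(0, node.lastIndexOf(':'));
    }

    // Everything after the last ':' is treated as the sequence number.
    static int sequence(String node) {
        return Integer.parseInt(node.substring(node.lastIndexOf(':') + 1));
    }

    // Returns the host:port of every nimbus holding the highest sequence number.
    static List&lt;String&gt; upToDateNimbuses(List&lt;String&gt; nodes) {
        int max = Integer.MIN_VALUE;
        for (String node : nodes) {
            max = Math.max(max, sequence(node));
        }
        List&lt;String&gt; result = new ArrayList&lt;&gt;();
        for (String node : nodes) {
            if (sequence(node) == max) {
                result.add(hostPort(node));
            }
        }
        return result;
    }
}
</code></pre></div>
<p>Under these assumptions, a nimbus that is missing a blob could pick any entry returned by <code>upToDateNimbuses</code> as a download source.</p>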
<p><img src="images/nimbus_ha_blobstore.png" alt="Nimbus High Availability - BlobStore"></p>
<p>The sequence diagram shows how the blobstore works and how the state stored inside the zookeeper makes the nimbus highly available.
Currently, the thread that syncs the blobs on a non-leader lives within the nimbus. In the future, it would be good to move this thread
into the blobstore so that the blobstore coordinates the state change and blob download as per the sequence diagram.</p>
<h2 id="thrift-and-rest-api">Thrift and REST API</h2>
<p>In order to avoid workers/supervisors/ui talking to zookeeper for getting master nimbus address we are going to modify the
<code>getClusterInfo</code> API so it can also return nimbus information. getClusterInfo currently returns <code>ClusterSummary</code> instance
which has a list of <code>supervisorSummary</code> and a list of <code>topologySummary</code> instances. We will add a list of <code>NimbusSummary</code>
to the <code>ClusterSummary</code>. See the structures below:</p>
<div class="highlight"><pre><code class="language-" data-lang="">struct ClusterSummary {
1: required list&lt;SupervisorSummary&gt; supervisors;
3: required list&lt;TopologySummary&gt; topologies;
4: required list&lt;NimbusSummary&gt; nimbuses;
}
struct NimbusSummary {
1: required string host;
2: required i32 port;
3: required i32 uptime_secs;
4: required bool isLeader;
5: required string version;
}
</code></pre></div>
<p>This will be used by StormSubmitter, Nimbus clients, supervisors and ui to discover the current leaders and participating
nimbus hosts. Any nimbus host will be able to respond to these requests. The nimbus hosts can read this information once
from zookeeper, cache it, and keep updating the cache when the watchers are fired to indicate any changes, which should
be rare in the general case.</p>
<p>Note: All nimbus hosts have watchers on zookeeper so that they are notified as soon as a new blob is available for download; the callback may or may not download
the code. Therefore, a background thread is triggered to download the respective blobs to run the topologies. The replication is achieved when the blobs are downloaded
onto non-leader nimbuses. So you should expect your topology submission time to be somewhere between 0 and (2 * nimbus.code.sync.freq.secs) for any
topology.min.replication.count &gt; 1.</p>
<h2 id="configuration">Configuration</h2>
<div class="highlight"><pre><code class="language-" data-lang="">blobstore.dir: The directory where all blobs are stored. For the local file system it represents the directory on the nimbus
node and for the HDFS file system it represents the HDFS file system path.

supervisor.blobstore.class: This configuration sets the client the supervisor uses to talk to the blobstore.
For a local file system blobstore it is set to “org.apache.storm.blobstore.NimbusBlobStore” and for the HDFS blobstore it is set
to “org.apache.storm.blobstore.HdfsClientBlobStore”.

supervisor.blobstore.download.thread.count: This configuration spawns multiple threads on the supervisor in order to download
blobs concurrently. The default is set to 5.

supervisor.blobstore.download.max_retries: This configuration is set to allow the supervisor to retry the blob download.
By default it is set to 3.

supervisor.localizer.cache.target.size.mb: The distributed cache target size in MB. This is a soft limit to the size
of the distributed cache contents. It is set to 10240 MB.

supervisor.localizer.cleanup.interval.ms: The distributed cache cleanup interval. Controls how often it scans to attempt to
clean up anything over the cache target size. By default it is set to 300000 milliseconds.

supervisor.localizer.update.blob.interval.secs: The distributed cache interval for checking for blobs to update. By
default it is set to 30 seconds.

nimbus.blobstore.class: Sets the blobstore implementation nimbus uses. It is set to "org.apache.storm.blobstore.LocalFsBlobStore".

nimbus.blobstore.expiration.secs: During operations with the blobstore, via master, how long a connection is idle before nimbus
considers it dead and drops the session and any associated connections. The default is set to 600.

storm.blobstore.inputstream.buffer.size.bytes: The buffer size it uses for blobstore upload. It is set to 65536 bytes.

client.blobstore.class: The blobstore implementation the storm client uses. The current implementation uses the default
config "org.apache.storm.blobstore.NimbusBlobStore".

blobstore.replication.factor: It sets the replication for each blob within the blobstore. The “topology.min.replication.count”
ensures the minimum replication the topology specific blobs are set to before launching the topology. You might want to set
“topology.min.replication.count &lt;= blobstore.replication.factor”. The default is set to 3.

topology.min.replication.count: Minimum number of nimbus hosts where the code must be replicated before the leader nimbus
can mark the topology as active and create assignments. Default is 1.

topology.max.replication.wait.time.sec: Maximum wait time for the nimbus host replication to achieve the topology.min.replication.count.
Once this time has elapsed, nimbus will go ahead and perform topology activation tasks even if the required topology.min.replication.count is not achieved.
The default is 60 seconds; a value of -1 indicates to wait forever.

nimbus.code.sync.freq.secs: Frequency at which the background thread on nimbus syncs code for locally missing blobs. Default is 2 minutes.
</code></pre></div>
<p>Additionally, if you want to access a secure HDFS blobstore, you also need to set the following configs.</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm.hdfs.login.keytab or blobstore.hdfs.keytab (deprecated)
storm.hdfs.login.principal or blobstore.hdfs.principal (deprecated)
</code></pre></div>
<p>For example:</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm.hdfs.login.keytab: /etc/keytab
storm.hdfs.login.principal: primary/instance@REALM
</code></pre></div>
<h2 id="using-the-distributed-cache-api-command-line-interface-cli">Using the Distributed Cache API, Command Line Interface (CLI)</h2>
<h3 id="creating-blobs">Creating blobs</h3>
<p>To use the distributed cache feature, the user first has to &quot;introduce&quot; files
that need to be cached and bind them to key strings. To achieve this, the user
uses the &quot;blobstore create&quot; command of the storm executable, as follows:</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore create [-f|--file FILE] [-a|--acl ACL1,ACL2,...] [--replication-factor NUMBER] [keyname]
</code></pre></div>
<p>The contents come from a FILE, if provided by the -f or --file option, otherwise
from STDIN.<br>
The ACL, which can also be a comma-separated list of several ACLs, has the
following format:</p>
<div class="highlight"><pre><code class="language-" data-lang="">&gt; [u|o]:[username]:[r-|w-|a-|_]
</code></pre></div>
<p>where: </p>
<ul>
<li>u = user</li>
<li>o = other</li>
<li>username = user for this particular ACL</li>
<li>r = read access</li>
<li>w = write access</li>
<li>a = admin access</li>
<li>_ = ignored</li>
</ul>
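<p>To make the ACL format concrete, here is a minimal parsing sketch. It is not the parser Storm itself uses; the class
<code>AclSketch</code> and its fields are hypothetical, and it simply treats the presence of the letters <code>r</code>, <code>w</code> and <code>a</code> in the
third field as granting the corresponding access.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for ACL strings such as "u:alice:rwa" or "o::r"; not Storm's own code.
class AclSketch {
    final boolean isUser;   // true for "u", false for "o" (other)
    final String username;  // empty for entries such as "o::rwa"
    final boolean read;
    final boolean write;
    final boolean admin;

    AclSketch(String spec) {
        // The -1 limit keeps the empty username field in "o::rwa".
        String[] parts = spec.split(":", -1);
        if (parts.length != 3 || !(parts[0].equals("u") || parts[0].equals("o"))) {
            throw new IllegalArgumentException("Bad ACL: " + spec);
        }
        isUser = parts[0].equals("u");
        username = parts[1];
        read = parts[2].indexOf('r') &gt;= 0;
        write = parts[2].indexOf('w') &gt;= 0;
        admin = parts[2].indexOf('a') &gt;= 0;
    }

    // Parses a comma-separated list of ACL entries.
    static List&lt;AclSketch&gt; parseList(String commaSeparated) {
        List&lt;AclSketch&gt; acls = new ArrayList&lt;&gt;();
        for (String spec : commaSeparated.split(",")) {
            acls.add(new AclSketch(spec.trim()));
        }
        return acls;
    }
}
</code></pre></div>
<p>For instance, <code>AclSketch.parseList("u:alice:rwa,u:bob:rw,o::r")</code> yields three entries: full access for alice, read/write for bob, and read-only for everyone else.</p>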
<p>The replication factor can be set to a value greater than 1 using --replication-factor.</p>
<p>Note: The replication is currently configurable for an HDFS blobstore, but for a
local blobstore the replication always stays at 1. For an HDFS blobstore
the default replication is set to 3.</p>
<h6 id="example">Example:</h6>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore create --file README.txt --acl o::rwa --replication-factor 4 key1
</code></pre></div>
<p>In the above example, the <em>README.txt</em> file is added to the distributed cache.
It can be accessed using the key string &quot;<em>key1</em>&quot; by any topology that needs
it. The file is set to have read/write/admin access for others (that is, for
everyone), and the replication is set to 4.</p>
<h6 id="example">Example:</h6>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore create mytopo:data.tgz -f data.tgz -a u:alice:rwa,u:bob:rw,o::r
</code></pre></div>
<p>The above example creates a mytopo:data.tgz key using the data stored in
data.tgz. User alice would have full access, bob would have read/write access
and everyone else would have read access.</p>
<h3 id="making-dist-cache-files-accessible-to-topologies">Making dist. cache files accessible to topologies</h3>
<p>Once a blob is created, we can use it for topologies. This is generally achieved
by including the key string among the configurations of a topology, with the
following format. A shortcut is to add the configuration item on the command
line when starting a topology by using the <strong>-c</strong> option:</p>
<div class="highlight"><pre><code class="language-" data-lang="">-c topology.blobstore.map='{"[KEY]":{"localname":"[VALUE]", "uncompress":[true|false]}}'
</code></pre></div>
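<p>Note: Please take care of the quotes. The JSON value is wrapped in single quotes so that the shell passes the inner double quotes through unchanged.</p>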
<p>The cache file would then be accessible to the topology as a local file with the
name [VALUE].<br>
The localname parameter is optional; if omitted, the local cached file will have
the same name as [KEY].<br>
The uncompress parameter is optional; if omitted, the local cached file will not
be uncompressed. Note that the key string needs to have the appropriate
file-name-like format and extension, so it can be uncompressed correctly.</p>
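<p>The defaulting rules for the two optional parameters can be sketched as follows. The class <code>BlobMapEntrySketch</code> is a
hypothetical illustration, not part of Storm; it takes one key of topology.blobstore.map together with its already-parsed options
and resolves the local name and the uncompress flag as described above.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.Map;

// Hypothetical helper showing how the optional fields of one map entry default.
class BlobMapEntrySketch {
    final String localName;
    final boolean uncompress;

    BlobMapEntrySketch(String key, Map&lt;String, Object&gt; options) {
        Object name = options.get("localname");
        Object flag = options.get("uncompress");
        // If localname is omitted, the local file gets the same name as the key.
        localName = (name != null) ? name.toString() : key;
        // If uncompress is omitted, the blob is not uncompressed.
        uncompress = (flag != null) &amp;&amp; Boolean.parseBoolean(flag.toString());
    }
}
</code></pre></div>
<p>With the entry <code>"key2":{}</code>, for example, the sketch resolves the local name to <code>key2</code> and leaves the blob compressed.</p>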
<h6 id="example">Example:</h6>
<div class="highlight"><pre><code class="language-" data-lang="">storm jar /home/y/lib/storm-starter/current/storm-starter-jar-with-dependencies.jar org.apache.storm.starter.clj.word_count test_topo -c topology.blobstore.map='{"key1":{"localname":"blob_file", "uncompress":false},"key2":{}}'
</code></pre></div>
<p>In the above example, we start the <em>word_count</em> topology (stored in the
<em>storm-starter-jar-with-dependencies.jar</em> file), and ask it to have access
to the cached file stored with key string = <em>key1</em>. This file would then be
accessible to the topology as a local file called <em>blob_file</em>, and the
supervisor will not try to uncompress the file. Note that in our example, the
file&#39;s content originally came from <em>README.txt</em>. We also ask for the file
stored with the key string = <em>key2</em> to be accessible to the topology. Since
both the optional parameters are omitted, this file will get the local name =
<em>key2</em>, and will not be uncompressed.</p>
<h3 id="updating-a-cached-file">Updating a cached file</h3>
<p>It is possible for the cached files to be updated while topologies are running.
The update happens in an eventual consistency model, where the supervisors poll
Nimbus every 30 seconds, and update their local copies. In the current version,
it is the user&#39;s responsibility to check whether a new file is available.</p>
<p>To update a cached file, use the following command. Contents come from a FILE or
STDIN. Write access is required to be able to update a cached file.</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore update [-f|--file NEW_FILE] [KEYSTRING]
</code></pre></div>
<h6 id="example">Example:</h6>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore update -f updates.txt key1
</code></pre></div>
<p>In the above example, the topologies will see the contents of
<em>updates.txt</em> instead of <em>README.txt</em> (from the previous example), even
though they still access the blob through a local file called
<em>blob_file</em>.</p>
<h3 id="removing-a-cached-file">Removing a cached file</h3>
<p>To remove a file from the distributed cache, use the following command. Removing
a file requires write access.</p>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore delete [KEYSTRING]
</code></pre></div>
<h3 id="listing-blobs-currently-in-the-distributed-cache-blobstore">Listing Blobs currently in the distributed cache blobstore</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore list [KEY...]
</code></pre></div>
<p>Lists the blobs currently in the blobstore.</p>
<h3 id="reading-the-contents-of-a-blob">Reading the contents of a blob</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore cat [-f|--file FILE] KEY
</code></pre></div>
<p>Reads a blob and writes it either to a file or to STDOUT. Reading a blob
requires read access.</p>
<h3 id="setting-the-access-control-for-a-blob">Setting the access control for a blob</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore set-acl [-s ACL] KEY
</code></pre></div>
<p>ACL is in the form [uo]:[username]:[r-][w-][a-] and can be a comma-separated list.
Setting the ACL requires admin access.</p>
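<p>To make the ACL format concrete, the following sketch converts a single ACL entry into the access bitmask listed in Appendix C (READ=0x1, WRITE=0x2, ADMIN=0x4). It is an illustration only; Storm&#39;s own parser is <em>BlobStoreAclHandler.parseAccessControl</em>, and the class <em>AclSketch</em> here is a hypothetical name.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">// Hypothetical illustration of the ACL string format, not Storm's parser.
class AclSketch {
    static final int READ = 0x1;   // bit values as listed in Appendix C
    static final int WRITE = 0x2;
    static final int ADMIN = 0x4;

    // Converts the permission part, for example "rw-", into a bitmask.
    static int parseAccess(String perms) {
        if (perms.length() != 3) {
            throw new IllegalArgumentException("expected 3 characters: " + perms);
        }
        int mask = 0;
        mask |= flag(perms.charAt(0), 'r', READ);
        mask |= flag(perms.charAt(1), 'w', WRITE);
        mask |= flag(perms.charAt(2), 'a', ADMIN);
        return mask;
    }

    // Returns the bit if the letter is present, 0 if the position holds '-'.
    private static int flag(char actual, char letter, int bit) {
        if (actual == letter) return bit;
        if (actual == '-') return 0;
        throw new IllegalArgumentException("unexpected character: " + actual);
    }

    // Splits one entry such as "u:alice:rwa" and returns its access bitmask.
    static int accessOf(String acl) {
        String[] parts = acl.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected type:name:access: " + acl);
        }
        if (!(parts[0].equals("u") || parts[0].equals("o"))) {
            throw new IllegalArgumentException("type must be u or o: " + parts[0]);
        }
        return parseAccess(parts[2]);
    }

    public static void main(String[] args) {
        System.out.println(accessOf("u:alice:rwa"));
    }
}
</code></pre></div>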
<h3 id="update-the-replication-factor-for-a-blob">Update the replication factor for a blob</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore replication --update --replication-factor 5 key1
</code></pre></div>
<h3 id="read-the-replication-factor-of-a-blob">Read the replication factor of a blob</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm blobstore replication --read key1
</code></pre></div>
<h3 id="command-line-help">Command line help</h3>
<div class="highlight"><pre><code class="language-" data-lang="">storm help blobstore
</code></pre></div>
<h2 id="using-the-distributed-cache-api-from-java">Using the Distributed Cache API from Java</h2>
<p>We start by obtaining a ClientBlobStore object as follows:</p>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">Config</span> <span class="n">theconf</span> <span class="o">=</span> <span class="k">new</span> <span class="n">Config</span><span class="o">();</span>
<span class="n">theconf</span><span class="o">.</span><span class="na">putAll</span><span class="o">(</span><span class="n">Utils</span><span class="o">.</span><span class="na">readStormConfig</span><span class="o">());</span>
<span class="n">ClientBlobStore</span> <span class="n">clientBlobStore</span> <span class="o">=</span> <span class="n">Utils</span><span class="o">.</span><span class="na">getClientBlobStore</span><span class="o">(</span><span class="n">theconf</span><span class="o">);</span>
</code></pre></div>
<p>The required Utils class can be imported with:</p>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">import</span> <span class="nn">org.apache.storm.utils.Utils</span><span class="o">;</span>
</code></pre></div>
<p>ClientBlobStore and other blob-related classes can be imported by:</p>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">import</span> <span class="nn">org.apache.storm.blobstore.ClientBlobStore</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.storm.blobstore.AtomicOutputStream</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.storm.blobstore.InputStreamWithMeta</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.storm.blobstore.BlobStoreAclHandler</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.storm.generated.*</span><span class="o">;</span>
</code></pre></div>
<h3 id="creating-acls-to-be-used-for-blobs">Creating ACLs to be used for blobs</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">stringBlobACL</span> <span class="o">=</span> <span class="s">"u:username:rwa"</span><span class="o">;</span>
<span class="n">AccessControl</span> <span class="n">blobACL</span> <span class="o">=</span> <span class="n">BlobStoreAclHandler</span><span class="o">.</span><span class="na">parseAccessControl</span><span class="o">(</span><span class="n">stringBlobACL</span><span class="o">);</span>
<span class="n">List</span><span class="o">&lt;</span><span class="n">AccessControl</span><span class="o">&gt;</span> <span class="n">acls</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LinkedList</span><span class="o">&lt;</span><span class="n">AccessControl</span><span class="o">&gt;();</span>
<span class="n">acls</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">blobACL</span><span class="o">);</span> <span class="c1">// more ACLs can be added here</span>
<span class="n">SettableBlobMeta</span> <span class="n">settableBlobMeta</span> <span class="o">=</span> <span class="k">new</span> <span class="n">SettableBlobMeta</span><span class="o">(</span><span class="n">acls</span><span class="o">);</span>
<span class="n">settableBlobMeta</span><span class="o">.</span><span class="na">set_replication_factor</span><span class="o">(</span><span class="mi">4</span><span class="o">);</span> <span class="c1">// Here we can set the replication factor</span>
</code></pre></div>
<p>The settableBlobMeta object is what we need to create a blob in the next step. </p>
<h3 id="creating-a-blob">Creating a blob</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">AtomicOutputStream</span> <span class="n">blobStream</span> <span class="o">=</span> <span class="n">clientBlobStore</span><span class="o">.</span><span class="na">createBlob</span><span class="o">(</span><span class="s">"some_key"</span><span class="o">,</span> <span class="n">settableBlobMeta</span><span class="o">);</span>
<span class="n">blobStream</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="s">"Some String or input data"</span><span class="o">.</span><span class="na">getBytes</span><span class="o">());</span>
<span class="n">blobStream</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
</code></pre></div>
<p>Note that the settableBlobMeta object here comes from the previous step, creating ACLs.
For very large files, it is recommended to write the bytes in smaller chunks (for example, 64 KB to 1 MB each).</p>
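<p>Chunked writing can be sketched with standard Java streams. The helper below is hypothetical (it is not part of Storm) and works on any <em>OutputStream</em>; it assumes the AtomicOutputStream returned by <em>createBlob</em> can be used as one, as the snippet above suggests.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of Storm: copies a stream in fixed-size chunks.
class ChunkedCopy {
    static final int CHUNK_SIZE = 64 * 1024; // 64 KB, the smaller size mentioned above

    // Reads up to CHUNK_SIZE bytes at a time from in and writes them to out.
    // Returns the total number of bytes copied. The caller closes both streams.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[CHUNK_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for a large source file and the blob stream.
        byte[] data = new byte[200_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
</code></pre></div>
<p>With a blob, <em>out</em> would be the stream returned by <em>createBlob</em>, and the caller would still call <em>close()</em> on it afterwards as shown above.</p>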
<h3 id="updating-a-blob">Updating a blob</h3>
<p>Updating is similar to creating a blob, except that the AtomicOutputStream is obtained differently:</p>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">blobKey</span> <span class="o">=</span> <span class="s">"some_key"</span><span class="o">;</span>
<span class="n">AtomicOutputStream</span> <span class="n">blobStream</span> <span class="o">=</span> <span class="n">clientBlobStore</span><span class="o">.</span><span class="na">updateBlob</span><span class="o">(</span><span class="n">blobKey</span><span class="o">);</span>
</code></pre></div>
<p>Write the new contents to the returned AtomicOutputStream and close it, as before.</p>
<h3 id="updating-the-acls-of-a-blob">Updating the ACLs of a blob</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">blobKey</span> <span class="o">=</span> <span class="s">"some_key"</span><span class="o">;</span>
<span class="n">AccessControl</span> <span class="n">updateAcl</span> <span class="o">=</span> <span class="n">BlobStoreAclHandler</span><span class="o">.</span><span class="na">parseAccessControl</span><span class="o">(</span><span class="s">"u:USER:--a"</span><span class="o">);</span>
<span class="n">List</span><span class="o">&lt;</span><span class="n">AccessControl</span><span class="o">&gt;</span> <span class="n">updateAcls</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LinkedList</span><span class="o">&lt;</span><span class="n">AccessControl</span><span class="o">&gt;();</span>
<span class="n">updateAcls</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">updateAcl</span><span class="o">);</span>
<span class="n">SettableBlobMeta</span> <span class="n">modifiedSettableBlobMeta</span> <span class="o">=</span> <span class="k">new</span> <span class="n">SettableBlobMeta</span><span class="o">(</span><span class="n">updateAcls</span><span class="o">);</span>
<span class="n">clientBlobStore</span><span class="o">.</span><span class="na">setBlobMeta</span><span class="o">(</span><span class="n">blobKey</span><span class="o">,</span> <span class="n">modifiedSettableBlobMeta</span><span class="o">);</span>
<span class="c1">//Now set write only</span>
<span class="n">updateAcl</span> <span class="o">=</span> <span class="n">BlobStoreAclHandler</span><span class="o">.</span><span class="na">parseAccessControl</span><span class="o">(</span><span class="s">"u:USER:-w-"</span><span class="o">);</span>
<span class="n">updateAcls</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LinkedList</span><span class="o">&lt;</span><span class="n">AccessControl</span><span class="o">&gt;();</span>
<span class="n">updateAcls</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">updateAcl</span><span class="o">);</span>
<span class="n">modifiedSettableBlobMeta</span> <span class="o">=</span> <span class="k">new</span> <span class="n">SettableBlobMeta</span><span class="o">(</span><span class="n">updateAcls</span><span class="o">);</span>
<span class="n">clientBlobStore</span><span class="o">.</span><span class="na">setBlobMeta</span><span class="o">(</span><span class="n">blobKey</span><span class="o">,</span> <span class="n">modifiedSettableBlobMeta</span><span class="o">);</span>
</code></pre></div>
<h3 id="updating-and-reading-the-replication-of-a-blob">Updating and Reading the replication of a blob</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">blobKey</span> <span class="o">=</span> <span class="s">"some_key"</span><span class="o">;</span>
<span class="n">BlobReplication</span> <span class="n">replication</span> <span class="o">=</span> <span class="n">clientBlobStore</span><span class="o">.</span><span class="na">updateBlobReplication</span><span class="o">(</span><span class="n">blobKey</span><span class="o">,</span> <span class="mi">5</span><span class="o">);</span>
<span class="kt">int</span> <span class="n">replication_factor</span> <span class="o">=</span> <span class="n">replication</span><span class="o">.</span><span class="na">get_replication</span><span class="o">();</span>
</code></pre></div>
<p>Note: The replication factor is updated and reflected only for the HDFS blobstore.</p>
<h3 id="reading-a-blob">Reading a blob</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">blobKey</span> <span class="o">=</span> <span class="s">"some_key"</span><span class="o">;</span>
<span class="n">InputStreamWithMeta</span> <span class="n">blobInputStream</span> <span class="o">=</span> <span class="n">clientBlobStore</span><span class="o">.</span><span class="na">getBlob</span><span class="o">(</span><span class="n">blobKey</span><span class="o">);</span>
<span class="n">BufferedReader</span> <span class="n">r</span> <span class="o">=</span> <span class="k">new</span> <span class="n">BufferedReader</span><span class="o">(</span><span class="k">new</span> <span class="n">InputStreamReader</span><span class="o">(</span><span class="n">blobInputStream</span><span class="o">));</span>
<span class="n">String</span> <span class="n">blobContents</span> <span class="o">=</span> <span class="n">r</span><span class="o">.</span><span class="na">readLine</span><span class="o">();</span>
</code></pre></div>
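<p>The snippet above reads only the first line of the blob. A minimal sketch for reading all lines from any <em>InputStream</em> follows; it assumes the blob holds UTF-8 text, and <em>BlobText</em> is a hypothetical helper rather than a Storm class.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Storm: reads a whole text stream.
class BlobText {
    // Reads every line, joins the lines with '\n', and closes the stream.
    // Assumption: the content is UTF-8 encoded text.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line = r.readLine();
            while (line != null) {
                sb.append(line);
                line = r.readLine();
                if (line != null) {
                    sb.append('\n'); // separator only between lines
                }
            }
        }
        return sb.toString();
    }
}
</code></pre></div>
<p>The try-with-resources block also closes the underlying stream, so the stream returned by <em>getBlob</em> would not need a separate <em>close()</em> call when passed to this method.</p>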
<h3 id="deleting-a-blob">Deleting a blob</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">blobKey</span> <span class="o">=</span> <span class="s">"some_key"</span><span class="o">;</span>
<span class="n">clientBlobStore</span><span class="o">.</span><span class="na">deleteBlob</span><span class="o">(</span><span class="n">blobKey</span><span class="o">);</span>
</code></pre></div>
<h3 id="getting-a-list-of-blob-keys-already-in-the-blobstore">Getting a list of blob keys already in the blobstore</h3>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">Iterator</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">stringIterator</span> <span class="o">=</span> <span class="n">clientBlobStore</span><span class="o">.</span><span class="na">listKeys</span><span class="o">();</span>
</code></pre></div>
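<p>Because <em>listKeys</em> returns an iterator, the keys can be consumed only once. The following sketch copies them into a sorted list for repeated use; <em>KeyListing</em> is a hypothetical helper and works with any <em>Iterator&lt;String&gt;</em>.</p>
<div class="highlight"><pre><code class="language-java" data-lang="java">import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Storm: collects iterator elements.
class KeyListing {
    // Copies all remaining elements of the iterator into a sorted list.
    static List&lt;String&gt; sortedKeys(Iterator&lt;String&gt; keys) {
        List&lt;String&gt; result = new ArrayList&lt;&gt;();
        while (keys.hasNext()) {
            result.add(keys.next());
        }
        Collections.sort(result);
        return result;
    }
}
</code></pre></div>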
<h2 id="appendix-a">Appendix A</h2>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">prepare</span><span class="o">(</span><span class="n">Map</span> <span class="n">conf</span><span class="o">,</span> <span class="n">String</span> <span class="n">baseDir</span><span class="o">);</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">AtomicOutputStream</span> <span class="nf">createBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">SettableBlobMeta</span> <span class="n">meta</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyAlreadyExistsException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">AtomicOutputStream</span> <span class="nf">updateBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">ReadableBlobMeta</span> <span class="nf">getBlobMeta</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">setBlobMeta</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">SettableBlobMeta</span> <span class="n">meta</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">deleteBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">InputStreamWithMeta</span> <span class="nf">getBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">Iterator</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="nf">listKeys</span><span class="o">(</span><span class="n">Subject</span> <span class="n">who</span><span class="o">);</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">BlobReplication</span> <span class="nf">getBlobReplication</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Exception</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">BlobReplication</span> <span class="nf">updateBlobReplication</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="kt">int</span> <span class="n">replication</span><span class="o">,</span> <span class="n">Subject</span> <span class="n">who</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">,</span> <span class="n">IOException</span><span class="o">;</span>
</code></pre></div>
<h2 id="appendix-b">Appendix B</h2>
<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">prepare</span><span class="o">(</span><span class="n">Map</span> <span class="n">conf</span><span class="o">);</span>
<span class="kd">protected</span> <span class="kd">abstract</span> <span class="n">AtomicOutputStream</span> <span class="nf">createBlobToExtend</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">SettableBlobMeta</span> <span class="n">meta</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyAlreadyExistsException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">AtomicOutputStream</span> <span class="nf">updateBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">ReadableBlobMeta</span> <span class="nf">getBlobMeta</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">protected</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">setBlobMetaToExtend</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">SettableBlobMeta</span> <span class="n">meta</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">deleteBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">InputStreamWithMeta</span> <span class="nf">getBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">Iterator</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="nf">listKeys</span><span class="o">();</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">watchBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="n">IBlobWatcher</span> <span class="n">watcher</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">void</span> <span class="nf">stopWatchingBlob</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">BlobReplication</span> <span class="nf">getBlobReplication</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="n">BlobReplication</span> <span class="nf">updateBlobReplication</span><span class="o">(</span><span class="n">String</span> <span class="n">key</span><span class="o">,</span> <span class="kt">int</span> <span class="n">replication</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">AuthorizationException</span><span class="o">,</span> <span class="n">KeyNotFoundException</span><span class="o">;</span>
</code></pre></div>
<h2 id="appendix-c">Appendix C</h2>
<div class="highlight"><pre><code class="language-" data-lang="">service Nimbus {
...
string beginCreateBlob(1: string key, 2: SettableBlobMeta meta) throws (1: AuthorizationException aze, 2: KeyAlreadyExistsException kae);
string beginUpdateBlob(1: string key) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
void uploadBlobChunk(1: string session, 2: binary chunk) throws (1: AuthorizationException aze);
void finishBlobUpload(1: string session) throws (1: AuthorizationException aze);
void cancelBlobUpload(1: string session) throws (1: AuthorizationException aze);
ReadableBlobMeta getBlobMeta(1: string key) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
void setBlobMeta(1: string key, 2: SettableBlobMeta meta) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
BeginDownloadResult beginBlobDownload(1: string key) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
binary downloadBlobChunk(1: string session) throws (1: AuthorizationException aze);
void deleteBlob(1: string key) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
ListBlobsResult listBlobs(1: string session);
BlobReplication getBlobReplication(1: string key) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
BlobReplication updateBlobReplication(1: string key, 2: i32 replication) throws (1: AuthorizationException aze, 2: KeyNotFoundException knf);
...
}
struct BlobReplication {
1: required i32 replication;
}
exception AuthorizationException {
1: required string msg;
}
exception KeyNotFoundException {
1: required string msg;
}
exception KeyAlreadyExistsException {
1: required string msg;
}
enum AccessControlType {
OTHER = 1,
USER = 2
//eventually ,GROUP=3
}
struct AccessControl {
1: required AccessControlType type;
2: optional string name; //Name of user or group in ACL
3: required i32 access; //bitmasks READ=0x1, WRITE=0x2, ADMIN=0x4
}
struct SettableBlobMeta {
1: required list&lt;AccessControl&gt; acl;
2: optional i32 replication_factor
}
struct ReadableBlobMeta {
1: required SettableBlobMeta settable;
//This is some indication of a version of a BLOB. The only guarantee is
// if the data changed in the blob the version will be different.
2: required i64 version;
}
struct ListBlobsResult {
1: required list&lt;string&gt; keys;
2: required string session;
}
struct BeginDownloadResult {
//Same version as in ReadableBlobMeta
1: required i64 version;
2: required string session;
3: optional i64 data_size;
}
</code></pre></div></div>
</div>
</div>
</div>
<footer>
<div class="container-fluid">
<div class="row">
<div class="col-md-3">
<div class="footer-widget">
<h5>Meetups</h5>
<ul class="latest-news">
<li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm &amp; Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li>
<li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm &amp; Kafka Users</a> <span class="small">(Seattle, WA)</span></li>
<li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li>
<li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li>
<li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li>
<li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li>
</ul>
</div>
</div>
<div class="col-md-3">
<div class="footer-widget">
<h5>About Apache Storm</h5>
<p>Apache Storm integrates with any queueing system and any database system. Apache Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Apache Storm with database systems is easy.</p>
</div>
</div>
<div class="col-md-3">
<div class="footer-widget">
<h5>First Look</h5>
<ul class="footer-list">
<li><a href="/releases/current/Rationale.html">Rationale</a></li>
<li><a href="/releases/current/Tutorial.html">Tutorial</a></li>
<li><a href="/releases/current/Setting-up-development-environment.html">Setting up development environment</a></li>
<li><a href="/releases/current/Creating-a-new-Storm-project.html">Creating a new Apache Storm project</a></li>
</ul>
</div>
</div>
<div class="col-md-3">
<div class="footer-widget">
<h5>Documentation</h5>
<ul class="footer-list">
<li><a href="/releases/current/index.html">Index</a></li>
<li><a href="/releases/current/javadocs/index.html">Javadoc</a></li>
<li><a href="/releases/current/FAQ.html">FAQ</a></li>
</ul>
</div>
</div>
</div>
<hr/>
<div class="row">
<div class="col-md-12">
<p align="center">Copyright © 2019 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved.
<br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation.
<br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p>
</div>
</div>
</div>
</footer>
<!--Footer End-->
<!-- Scroll to top -->
<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span>
</body>
</html>