blob: 64cd01ec5df1ac8752395f12fb24eb7920e03e4b [file] [log] [blame]
CachePipe is a simple-to-use caching pipeline. By this we mean:
- *simple-to-use*: It is a simple wrapper around existing commands, and
doesn't require you to learn an extensive system
- *caching*: commands are cached based on the contents of an
explicitly-provided dependency list, and future invocations will
not re-run the command if any of them hav changed
- *pipeline*: CachePipe is originally designed to be used as part of a
long pipeline of many different steps
CachePipe uses git-style SHA-1 hashes of the dependencies and the
command invocation itself to determine when changes have been made.
This improves upon existing caching systems, which make use of
unreliable indicators of command completion such as timestamps or even
file existence. It is implemented as a Perl module, and provides both
a Perl API and a command-line interface.
CachePipe supports a number of options. See below for more detail.
-- EXAMPLE -----------------------------------------------------------
# Put this in your init file
export CACHEPIPE=/path/to/cachepipe
export PERL5LIB+=:$CACHEPIPE
. $CACHEPIPE/bashrc
You can now do the following:
- From the command line:
$ cachecmd copy-file "cp a b" a b
[copy-file] rebuilding...
dep=a [CHANGED]
dep=b [NOT FOUND]
cmd=cp a b
took 0 seconds (0s)
$ cachecmd copy-file "cp a b" a b
[copy-file] cached, skipping...
- From a Perl script:
use CachePipe;
my $pipe = new CachePipe();
$pipe->cmd("copy-file","cp a b",["a","b"]);
-- OPTIONS -----------------------------------------------------------
You can tell the command to build the cache files without actually
running the command using the --cache-only flag between the command
name and the command, i.e.
$ cachecmd copy-file --cache-only "cp a b" a b
This will compute the hash over all the dependencies and the command
as if the current state were the desired state. Note that all the
dependencies must exist.
If you have already run a command, you can easily run it again:
$ cachecmd copy-file --rerun
Sometimes you wan to just recompute the hashes of a run, as if it had
just been successfully completed, but without actually re-running the
command. If the command has already been run once, you can type:
$ cachecmd copy-file --cache-only
This flag also works with a new command, e.g.,
$ cachecmd copy-file --cache-only "cp a b" a b
-- AUTHORS -----------------------------------------------------------
This was all Adam Lopez' idea.
Matt Post <post@jhu.edu>
Adam Lopez <alopez@cs.jhu.edu>