Add 'webindex/' from commit '91dc7cb6fc72c79a53c6b7d0a6c0599cd8eacb9b'
git-subtree-dir: webindex
git-subtree-mainline: f762da6d8f93dec655741632dd534d1287d1a6ec
git-subtree-split: 91dc7cb6fc72c79a53c6b7d0a6c0599cd8eacb9b
diff --git a/LICENSE b/LICENSE
index 8f71f43..d645695 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,3 +1,4 @@
+
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -178,7 +179,7 @@
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "{}"
+ boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
@@ -186,7 +187,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.
- Copyright {yyyy} {name of copyright owner}
+ Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -199,4 +200,3 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-
diff --git a/README.md b/README.md
index f869d20..6a9cfba 100644
--- a/README.md
+++ b/README.md
@@ -1,76 +1 @@
-![Webindex][logo]
----
-[![Build Status][ti]][tl] [![Apache License][li]][ll]
-
-Webindex is an example [Apache Fluo][fluo] application that incrementally indexes links to web pages
-in multiple ways. If you are new to Fluo, you may want to start with the [Fluo tour][tour], as the
-WebIndex application is more complicated. For more information on how the WebIndex application
-works, view the [tables](docs/tables.md) and [code](docs/code-guide.md) documentation.
-
-Webindex utilizes multiple projects. [Common Crawl][cc] web crawl data is used as the input.
-[Apache Spark][spark] is used to initialize Fluo and incrementally load data into Fluo. [Apache
-Accumulo][accumulo] is used to hold the indexes and Fluo's data. Fluo is used to continuously
-combine new and historical information about web pages and update an external index when changes
-occur. Webindex has a simple UI built using [Spark Java][sparkjava] that allows querying the indexes.
-
-Below is a video showing repeatedly querying stackoverflow.com while Webindex was running for three
-days on EC2. The video was made by querying the Webindex instance periodically and taking a
-screenshot. More details about this video are available in this [blog post][bp].
-
-[![Querying stackoverflow.com](http://img.youtube.com/vi/mJJNJbPN2EI/0.jpg)](http://www.youtube.com/watch?v=mJJNJbPN2EI)
-
-## Running WebIndex
-
-If you are new to WebIndex, the simplest way to run the application is to run the development
-server. First, clone the WebIndex repo:
-
- git clone https://github.com/astralway/webindex.git
-
-Next, on a machine where Java and Maven are installed, run the development server using the
-`webindex` command:
-
- cd webindex/
- ./bin/webindex dev
-
-This will build and start the development server, which logs to the console. The 'dev' command
-has several command-line options, which can be viewed by running it with `-h`. When you want to
-terminate the server, press `CTRL-c`.
-
-The development server starts a MiniAccumuloCluster and runs MiniFluo on top of it. It parses a
-CommonCrawl data file and creates a file at `data/1000-pages.txt` with 1000 pages that are loaded
-into MiniFluo. The number of pages loaded can be changed to 5000 by using the command below:
-
- ./bin/webindex dev --pages 5000
-
-The pages are processed by Fluo which exports indexes to Accumulo. The development server also
-starts a web application at [http://localhost:4567](http://localhost:4567) that queries indexes in
-Accumulo.
-
-If you would like to run WebIndex on a cluster, follow the [install] instructions.
-
-### Viewing metrics
-
-Metrics can be sent from the development server to InfluxDB and viewed in Grafana. You can either
-set up InfluxDB+Grafana on your own or use the [Uno] command `uno setup metrics`. After a metrics
-server is started, start the development server with the `--metrics` option to begin sending metrics:
-
- ./bin/webindex dev --metrics
-
-Fluo metrics can be viewed in Grafana. To view application-specific metrics for Webindex, import
-the WebIndex Grafana dashboard located at `contrib/webindex-dashboard.json`.
-
-[tour]: https://fluo.apache.org/tour/
-[sparkjava]: http://sparkjava.com/
-[spark]: https://spark.apache.org/
-[accumulo]: https://accumulo.apache.org/
-[fluo]: https://fluo.apache.org/
-[pc]: https://github.com/astralway/phrasecount
-[Uno]: https://github.com/astralway/uno
-[cc]: https://commoncrawl.org/
-[install]: docs/install.md
-[ti]: https://travis-ci.org/astralway/webindex.svg?branch=master
-[tl]: https://travis-ci.org/astralway/webindex
-[li]: http://img.shields.io/badge/license-ASL-blue.svg
-[ll]: https://github.com/astralway/webindex/blob/master/LICENSE
-[logo]: contrib/webindex.png
-[bp]: https://fluo.apache.org/blog/2016/01/11/webindex-long-run/#videos-from-run
+Examples for Apache Fluo
diff --git a/phrasecount/.gitignore b/phrasecount/.gitignore
new file mode 100644
index 0000000..93eea5d
--- /dev/null
+++ b/phrasecount/.gitignore
@@ -0,0 +1,6 @@
+.classpath
+.project
+.settings
+target
+.idea
+*.iml
diff --git a/phrasecount/.travis.yml b/phrasecount/.travis.yml
new file mode 100644
index 0000000..e36964e
--- /dev/null
+++ b/phrasecount/.travis.yml
@@ -0,0 +1,12 @@
+language: java
+jdk:
+ - oraclejdk8
+script: mvn verify
+notifications:
+ irc:
+ channels:
+ - "chat.freenode.net#fluo"
+ on_success: always
+ on_failure: always
+ use_notice: true
+ skip_join: true
diff --git a/phrasecount/LICENSE b/phrasecount/LICENSE
new file mode 100644
index 0000000..e06d208
--- /dev/null
+++ b/phrasecount/LICENSE
@@ -0,0 +1,202 @@
+Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "{}"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright {yyyy} {name of copyright owner}
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
diff --git a/phrasecount/README.md b/phrasecount/README.md
new file mode 100644
index 0000000..74a6509
--- /dev/null
+++ b/phrasecount/README.md
@@ -0,0 +1,164 @@
+# Phrase Count
+
+[![Build Status](https://travis-ci.org/astralway/phrasecount.svg?branch=master)](https://travis-ci.org/astralway/phrasecount)
+
+An example application that computes phrase counts for unique documents using Apache Fluo. Each
+unique document that is added causes phrase counts to be incremented. Unique documents have
+reference counts based on the number of locations that point to them. When a unique document is no
+longer referenced by any location, its phrase counts are decremented appropriately.
+
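The reference counting just described can be sketched in plain Java. This is an illustration only: the class and method names here are invented, and the real application performs these steps inside Fluo transactions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: tracks how many URIs reference each document hash.
public class RefCountSketch {
  private final Map<String, Integer> refCounts = new HashMap<>();

  // A URI now points at this document hash.
  void addReference(String hash) {
    refCounts.merge(hash, 1, Integer::sum);
  }

  // A URI no longer points at this document hash. Returns true when the
  // document became unreferenced, i.e. its phrase counts should be decremented.
  boolean removeReference(String hash) {
    int count = refCounts.merge(hash, -1, Integer::sum);
    if (count <= 0) {
      refCounts.remove(hash);
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    RefCountSketch rc = new RefCountSketch();
    rc.addReference("abc123");
    rc.addReference("abc123");
    System.out.println(rc.removeReference("abc123")); // false: still referenced
    System.out.println(rc.removeReference("abc123")); // true: now unreferenced
  }
}
```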
+After phrase counts are incremented, export transactions send phrase counts to an Accumulo table.
+The purpose of exporting data is to make it available for query. Fluo (like Percolator, on which
+it is based) is not designed to support queries, because its transactions are optimized for
+throughput rather than responsiveness.
+
+This example uses the Collision Free Map and Export Queue from [Apache Fluo Recipes][11]. A
+Collision Free Map is used to calculate phrase counts. An Export Queue is used to update the
+external Accumulo table in a fault-tolerant manner. Before using Fluo Recipes, this example was
+quite complex; switching to them simplified it dramatically.
+
+## Schema
+
+### Fluo Table Schema
+
+This example uses the following schema for the table used by Apache Fluo.
+
+Row | Column | Value | Purpose
+-------------|---------------|-------------------|---------------------------------------------------------------------
+uri:\<uri\> | doc:hash | \<hash\> | Contains the hash of the document found at the URI
+doc:\<hash\> | doc:content | \<document\> | The contents of the document
+doc:\<hash\> | doc:refCount | \<int\> | The number of URIs that reference this document
+doc:\<hash\> | index:check | empty | Setting this column triggers the observer that indexes the document
+doc:\<hash\> | index:status | INDEXED or empty | Tracks whether this document has been indexed
+
+Additionally, the two recipes used by the example store their data in the table
+under two row prefixes. Nothing else should be stored within these prefixes.
+The collision free map used to compute phrasecounts stores data within the row
+prefix `pcm:`. The export queue stores data within the row prefix `aeq:`.
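
As a sketch only (these helper methods are not part of the codebase), the row layout above composes as follows:

```java
// Illustrative helpers showing how the Fluo table's row keys compose.
// Document rows are keyed by content hash, so identical pages fetched from
// different URIs share a single doc:<hash> row.
public class SchemaSketch {
  static String uriRow(String uri) { return "uri:" + uri; }
  static String docRow(String hash) { return "doc:" + hash; }
  static String pcmRow(String key) { return "pcm:" + key; } // collision free map data
  static String aeqRow(String key) { return "aeq:" + key; } // export queue data

  public static void main(String[] args) {
    System.out.println(uriRow("http://example.com")); // uri:http://example.com
    System.out.println(docRow("1a2b3c")); // doc:1a2b3c
  }
}
```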
+
+### External Table Schema
+
+This example uses the following schema for the external Accumulo table.
+
+Row | Column | Value | Purpose
+-----------|-----------------|------------|---------------------------------------------------------------------
+\<phrase\> | stat:totalCount | \<count\> | For a given phrase, the value is the total number of times that phrase occurred in all documents.
+\<phrase\> | stat:docCount | \<count\> | For a given phrase, the value is the number of documents in which that phrase occurred.
+
+[PhraseCountTable][14] encapsulates all of the code for interacting with this
+external table.
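
For illustration, here is a hypothetical sketch (names invented, plain strings rather than Accumulo mutations) of the entries an export would write for a single phrase under this schema:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: formats the external table entries for one phrase as
// "<row> <family>:<qualifier>" -> value, following the schema above.
public class ExportSketch {
  static Map<String, String> entriesFor(String phrase, long totalCount, long docCount) {
    Map<String, String> entries = new LinkedHashMap<>();
    entries.put(phrase + " stat:totalCount", Long.toString(totalCount));
    entries.put(phrase + " stat:docCount", Long.toString(docCount));
    return entries;
  }

  public static void main(String[] args) {
    System.out.println(entriesFor("apache fluo", 5, 2));
  }
}
```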
+
+## Code Overview
+
+Documents are loaded into the Fluo table by [DocumentLoader][1] which is
+executed by [Load][2]. [DocumentLoader][1] handles reference counting of
+unique documents and may set a notification for [DocumentObserver][3].
+[DocumentObserver][3] increments or decrements global phrase counts by
+inserting `+1` or `-1` into a collision free map for each phrase in a document.
+[PhraseMap][4] contains the code called by the collision free map recipe. The
+code in [PhraseMap][4] does two things. First it computes the phrase counts by
+summing the updates. Second it places the newly computed phrase count on an
+export queue. [PhraseExporter][5] is called by the export queue recipe to
+generate mutations to update the external Accumulo table.
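
The combining step above can be illustrated standalone: updates of `+1` and `-1` for a phrase are summed into the current count. The real combiner is implemented against the Fluo Recipes CollisionFreeMap API; this sketch shows only the arithmetic.

```java
import java.util.List;

// Illustrative only: sums +1/-1 phrase count updates, as the collision free
// map combiner does when computing a new phrase count.
public class CombineSketch {
  static long combine(long current, List<Integer> updates) {
    long sum = current;
    for (int u : updates) {
      sum += u;
    }
    return sum;
  }

  public static void main(String[] args) {
    // Two documents add a phrase, then one document is dereferenced.
    System.out.println(combine(0, List.of(1, 1, -1))); // 1
  }
}
```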
+
+All observers and recipes are configured by code in [Application][10]. All
+observers are run by the Fluo worker processes when notifications trigger them.
+
+## Building
+
+After cloning this repository, build with following command.
+
+```
+mvn package
+```
+
+## Running via Maven
+
+If you do not have Accumulo, Hadoop, Zookeeper, and Fluo set up, then you can
+start a MiniFluo instance with the [mini.sh](bin/mini.sh) script. This script
+will run [Mini.java][12] using Maven. The command will create a
+`fluo.properties` file that can be used by the other commands in this section.
+
+```bash
+./bin/mini.sh /tmp/mac fluo.properties
+```
+
+Once the mini command prints `Wrote : fluo.properties`, it's ready to use. Run
+`tail -f mini.log` and watch for that message.
+
+This command will automatically configure [PhraseExporter][5] to export phrases
+to an Accumulo table named `pcExport`.
+
+The `-Dexec.classpathScope=test` option is set because it adds the test
+[log4j.properties][7] file to the classpath.
+
+### Adding documents
+
+The [load.sh](bin/load.sh) script runs [Load.java][2], which recursively scans
+the directory `$TXT_DIR` for `.txt` files to add.
+
+```bash
+./bin/load.sh fluo.properties $TXT_DIR
+```
+
+### Printing phrases
+
+After documents are added, [print.sh](bin/print.sh) runs [Print.java][13],
+which prints out phrase counts. Try modifying a document you added and running
+the load command again; you should eventually see the phrase counts change.
+
+```bash
+./bin/print.sh fluo.properties pcExport
+```
+
+The command will print out the number of unique documents and the number
+of processed documents. If the number of processed documents is less than the
+number of unique documents, then there is still work to do. After the load
+command runs, the documents will have been added or updated. However, the
+phrase counts will not update until the Observer runs in the background.
+
+### Killing mini
+
+Make sure to kill mini when finished testing. The following command will kill it.
+
+```bash
+pkill -f phrasecount.cmd.Mini
+```
+
+## Deploying example
+
+The following script runs this example on a cluster using the Fluo
+distribution and serves as executable documentation for deployment. The
+previous Maven commands using the exec plugin are convenient for a development
+environment; the following script shows how things would work in a production
+environment.
+
+ * [run.sh](bin/run.sh) : Runs this example with YARN using the Fluo tar
+ distribution. Running in this way requires setting up Hadoop, Zookeeper,
+ and Accumulo instances separately. The [Uno][8] and [Muchos][9]
+ projects were created to ease setting up these external dependencies.
+
+## Generating data
+
+Need some data? Use `elinks` to generate text files from web pages.
+
+```
+mkdir data
+elinks -dump 1 -no-numbering -no-references http://accumulo.apache.org > data/accumulo.txt
+elinks -dump 1 -no-numbering -no-references http://hadoop.apache.org > data/hadoop.txt
+elinks -dump 1 -no-numbering -no-references http://zookeeper.apache.org > data/zookeeper.txt
+```
+
+[1]: src/main/java/phrasecount/DocumentLoader.java
+[2]: src/main/java/phrasecount/cmd/Load.java
+[3]: src/main/java/phrasecount/DocumentObserver.java
+[4]: src/main/java/phrasecount/PhraseMap.java
+[5]: src/main/java/phrasecount/PhraseExporter.java
+[7]: src/test/resources/log4j.properties
+[8]: https://github.com/astralway/uno
+[9]: https://github.com/astralway/muchos
+[10]: src/main/java/phrasecount/Application.java
+[11]: https://github.com/apache/fluo-recipes
+[12]: src/main/java/phrasecount/cmd/Mini.java
+[13]: src/main/java/phrasecount/cmd/Print.java
+[14]: src/main/java/phrasecount/query/PhraseCountTable.java
diff --git a/phrasecount/bin/copy-jars.sh b/phrasecount/bin/copy-jars.sh
new file mode 100755
index 0000000..a92ac5f
--- /dev/null
+++ b/phrasecount/bin/copy-jars.sh
@@ -0,0 +1,24 @@
+#!/bin/bash
+
+#This script will copy the phrase count jar and its dependencies to the Fluo
+#application lib dir
+
+
+if [ "$#" -ne 2 ]; then
+ echo "Usage : $0 <FLUO_HOME> <PHRASECOUNT_HOME>"
+ exit 1
+fi
+
+FLUO_HOME=$1
+PC_HOME=$2
+
+PC_JAR=$PC_HOME/target/phrasecount-0.0.1-SNAPSHOT.jar
+
+#build and copy phrasecount jar
+(cd $PC_HOME; mvn package -DskipTests)
+
+FLUO_APP_LIB=$FLUO_HOME/apps/phrasecount/lib/
+
+cp $PC_JAR $FLUO_APP_LIB
+(cd $PC_HOME; mvn dependency:copy-dependencies -DoutputDirectory=$FLUO_APP_LIB)
+
diff --git a/phrasecount/bin/load.sh b/phrasecount/bin/load.sh
new file mode 100755
index 0000000..4c9a904
--- /dev/null
+++ b/phrasecount/bin/load.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+mvn exec:java -Dexec.mainClass=phrasecount.cmd.Load -Dexec.args="${*:1}" -Dexec.classpathScope=test
diff --git a/phrasecount/bin/mini.sh b/phrasecount/bin/mini.sh
new file mode 100755
index 0000000..b8b60a4
--- /dev/null
+++ b/phrasecount/bin/mini.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+
+mvn exec:java -Dexec.mainClass=phrasecount.cmd.Mini -Dexec.args="${*:1}" -Dexec.classpathScope=test &>mini.log &
+echo "Started Mini in background. Writing output to mini.log."
diff --git a/phrasecount/bin/print.sh b/phrasecount/bin/print.sh
new file mode 100755
index 0000000..198fad9
--- /dev/null
+++ b/phrasecount/bin/print.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+
+mvn exec:java -Dexec.mainClass=phrasecount.cmd.Print -Dexec.args="${*:1}" -Dexec.classpathScope=test
+
diff --git a/phrasecount/bin/run.sh b/phrasecount/bin/run.sh
new file mode 100755
index 0000000..8f6e46a
--- /dev/null
+++ b/phrasecount/bin/run.sh
@@ -0,0 +1,69 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+PC_HOME=$( cd "$( dirname "$BIN_DIR" )" && pwd )
+
+# stop if any command fails
+set -e
+
+if [ "$#" -ne 1 ]; then
+ echo "Usage : $0 <TXT FILES DIR>"
+ exit 1
+fi
+
+#set the following to a directory containing text files
+TXT_DIR=$1
+if [ ! -d $TXT_DIR ]; then
+ echo "Document directory $TXT_DIR does not exist"
+ exit 1
+fi
+
+#ensure $FLUO_HOME is set
+if [ -z "$FLUO_HOME" ]; then
+ echo '$FLUO_HOME must be set!'
+ exit 1
+fi
+
+#Set application name. $FLUO_APP_NAME is set by fluo-dev and zetten
+APP=${FLUO_APP_NAME:-phrasecount}
+
+#derived variables
+APP_PROPS=$FLUO_HOME/apps/$APP/conf/fluo.properties
+
+if [ ! -f $FLUO_HOME/conf/fluo.properties ]; then
+ echo "Fluo is not configured, exiting."
+ exit 1
+fi
+
+#remove application if it exists
+if [ -d $FLUO_HOME/apps/$APP ]; then
+ echo "Restarting '$APP' application. Errors may be printed if it's not running..."
+ $FLUO_HOME/bin/fluo kill $APP || true
+ rm -rf $FLUO_HOME/apps/$APP
+fi
+
+#create new application dir
+$FLUO_HOME/bin/fluo new $APP
+
+#copy phrasecount jars to Fluo application lib dir
+$PC_HOME/bin/copy-jars.sh $FLUO_HOME $PC_HOME
+
+#Create export table and output Fluo configuration
+$FLUO_HOME/bin/fluo exec $APP phrasecount.cmd.Setup $APP_PROPS pcExport >> $APP_PROPS
+
+$FLUO_HOME/bin/fluo init $APP -f
+$FLUO_HOME/bin/fluo exec $APP org.apache.fluo.recipes.accumulo.cmds.OptimizeTable
+$FLUO_HOME/bin/fluo start $APP
+$FLUO_HOME/bin/fluo info $APP
+
+#Load data
+$FLUO_HOME/bin/fluo exec $APP phrasecount.cmd.Load $APP_PROPS $TXT_DIR
+
+#wait for all notifications to be processed.
+$FLUO_HOME/bin/fluo wait $APP
+
+#print phrase counts
+$FLUO_HOME/bin/fluo exec $APP phrasecount.cmd.Print $APP_PROPS pcExport
+
+$FLUO_HOME/bin/fluo stop $APP
+
diff --git a/phrasecount/pom.xml b/phrasecount/pom.xml
new file mode 100644
index 0000000..bb9afde
--- /dev/null
+++ b/phrasecount/pom.xml
@@ -0,0 +1,98 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>io.github.astralway</groupId>
+ <artifactId>phrasecount</artifactId>
+ <version>0.0.1-SNAPSHOT</version>
+ <packaging>jar</packaging>
+
+ <name>phrasecount</name>
+ <url>https://github.com/astralway/phrasecount</url>
+
+ <properties>
+ <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+ <accumulo.version>1.7.2</accumulo.version>
+ <fluo.version>1.0.0-incubating</fluo.version>
+ <fluo-recipes.version>1.0.0-incubating</fluo-recipes.version>
+ </properties>
+
+ <build>
+ <plugins>
+ <plugin>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>3.1</version>
+ <configuration>
+ <source>1.8</source>
+ <target>1.8</target>
+ <optimize>true</optimize>
+ <encoding>UTF-8</encoding>
+ </configuration>
+ </plugin>
+ <plugin>
+ <artifactId>maven-dependency-plugin</artifactId>
+ <version>2.10</version>
+ <configuration>
+ <!--define the specific dependencies to copy into the Fluo application dir-->
+ <includeArtifactIds>fluo-recipes-core,fluo-recipes-accumulo,fluo-recipes-kryo,kryo,minlog,reflectasm,objenesis</includeArtifactIds>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+ <dependencies>
+ <dependency>
+ <groupId>junit</groupId>
+ <artifactId>junit</artifactId>
+ <version>4.11</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>com.beust</groupId>
+ <artifactId>jcommander</artifactId>
+ <version>1.32</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-api</artifactId>
+ <version>${fluo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-core</artifactId>
+ <version>${fluo.version}</version>
+ <scope>runtime</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-recipes-core</artifactId>
+ <version>${fluo-recipes.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-recipes-accumulo</artifactId>
+ <version>${fluo-recipes.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-recipes-kryo</artifactId>
+ <version>${fluo-recipes.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.accumulo</groupId>
+ <artifactId>accumulo-core</artifactId>
+ <version>${accumulo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-mini</artifactId>
+ <version>${fluo.version}</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.accumulo</groupId>
+ <artifactId>accumulo-minicluster</artifactId>
+ <version>${accumulo.version}</version>
+ </dependency>
+ </dependencies>
+</project>
diff --git a/phrasecount/src/main/java/phrasecount/Application.java b/phrasecount/src/main/java/phrasecount/Application.java
new file mode 100644
index 0000000..30d7c3a
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/Application.java
@@ -0,0 +1,71 @@
+package phrasecount;
+
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.config.ObserverSpecification;
+import org.apache.fluo.recipes.accumulo.export.AccumuloExporter;
+import org.apache.fluo.recipes.core.export.ExportQueue;
+import org.apache.fluo.recipes.core.map.CollisionFreeMap;
+import org.apache.fluo.recipes.kryo.KryoSimplerSerializer;
+import phrasecount.pojos.Counts;
+import phrasecount.pojos.PcKryoFactory;
+
+import static phrasecount.Constants.EXPORT_QUEUE_ID;
+import static phrasecount.Constants.PCM_ID;
+
+public class Application {
+
+ public static class Options {
+ public Options(int pcmBuckets, int eqBuckets, String instance, String zooKeepers, String user,
+ String password, String eTable) {
+ this.phraseCountMapBuckets = pcmBuckets;
+ this.exportQueueBuckets = eqBuckets;
+ this.instance = instance;
+ this.zookeepers = zooKeepers;
+ this.user = user;
+ this.password = password;
+ this.exportTable = eTable;
+
+ }
+
+ public int phraseCountMapBuckets;
+ public int exportQueueBuckets;
+
+ public String instance;
+ public String zookeepers;
+ public String user;
+ public String password;
+ public String exportTable;
+ }
+
+ /**
+ * Sets Fluo configuration needed to run the phrase count application
+ *
+ * @param fluoConfig FluoConfiguration
+ * @param opts Options
+ */
+ public static void configure(FluoConfiguration fluoConfig, Options opts) {
+ // set up an observer that watches the reference counts of documents. When a document is
+ // referenced or dereferenced, it will add or subtract phrase counts from a collision free map.
+ fluoConfig.addObserver(new ObserverSpecification(DocumentObserver.class.getName()));
+
+ // configure which KryoFactory the recipes should use
+ KryoSimplerSerializer.setKryoFactory(fluoConfig, PcKryoFactory.class);
+
+ // set up a collision free map to combine phrase counts
+ CollisionFreeMap.configure(fluoConfig,
+ new CollisionFreeMap.Options(PCM_ID, PhraseMap.PcmCombiner.class,
+ PhraseMap.PcmUpdateObserver.class, String.class, Counts.class,
+ opts.phraseCountMapBuckets));
+
+ AccumuloExporter.Configuration accumuloConfig =
+ new AccumuloExporter.Configuration(opts.instance, opts.zookeepers, opts.user, opts.password,
+ opts.exportTable);
+
+ // set up an Accumulo export queue to send phrase count updates to an Accumulo table
+ ExportQueue.Options exportQueueOpts =
+ new ExportQueue.Options(EXPORT_QUEUE_ID, PhraseExporter.class.getName(),
+ String.class.getName(), Counts.class.getName(),
+ opts.exportQueueBuckets).setExporterConfiguration(accumuloConfig);
+ ExportQueue.configure(fluoConfig, exportQueueOpts);
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/Constants.java b/phrasecount/src/main/java/phrasecount/Constants.java
new file mode 100644
index 0000000..1f73bee
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/Constants.java
@@ -0,0 +1,21 @@
+package phrasecount;
+
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.recipes.core.types.StringEncoder;
+import org.apache.fluo.recipes.core.types.TypeLayer;
+
+public class Constants {
+
+ // set the encoder to use in one place
+ public static final TypeLayer TYPEL = new TypeLayer(new StringEncoder());
+
+ public static final Column INDEX_CHECK_COL = TYPEL.bc().fam("index").qual("check").vis();
+ public static final Column INDEX_STATUS_COL = TYPEL.bc().fam("index").qual("status").vis();
+ public static final Column DOC_CONTENT_COL = TYPEL.bc().fam("doc").qual("content").vis();
+ public static final Column DOC_HASH_COL = TYPEL.bc().fam("doc").qual("hash").vis();
+ public static final Column DOC_REF_COUNT_COL = TYPEL.bc().fam("doc").qual("refCount").vis();
+
+ public static final String EXPORT_QUEUE_ID = "aeq";
+ // phrase count map id
+ public static final String PCM_ID = "pcm";
+}
diff --git a/phrasecount/src/main/java/phrasecount/DocumentLoader.java b/phrasecount/src/main/java/phrasecount/DocumentLoader.java
new file mode 100644
index 0000000..8384b35
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/DocumentLoader.java
@@ -0,0 +1,73 @@
+package phrasecount;
+
+import org.apache.fluo.api.client.Loader;
+import org.apache.fluo.api.client.TransactionBase;
+import org.apache.fluo.recipes.core.types.TypedTransactionBase;
+import phrasecount.pojos.Document;
+
+import static phrasecount.Constants.DOC_CONTENT_COL;
+import static phrasecount.Constants.DOC_HASH_COL;
+import static phrasecount.Constants.DOC_REF_COUNT_COL;
+import static phrasecount.Constants.INDEX_CHECK_COL;
+import static phrasecount.Constants.TYPEL;
+
+/**
+ * Executes document load transactions which dedupe and reference count documents. If needed, the
+ * observer that updates phrase counts is triggered.
+ */
+public class DocumentLoader implements Loader {
+
+ private Document document;
+
+ public DocumentLoader(Document doc) {
+ this.document = doc;
+ }
+
+ @Override
+ public void load(TransactionBase tx, Context context) throws Exception {
+
+ // TODO Need a strategy for dealing w/ large documents. If a worker processes many large
+ // documents concurrently, it could cause memory exhaustion. Could break up large documents
+ // into pieces; however, not sure if the example should be complicated with this.
+
+ TypedTransactionBase ttx = TYPEL.wrap(tx);
+ String storedHash = ttx.get().row("uri:" + document.getURI()).col(DOC_HASH_COL).toString();
+
+ if (storedHash == null || !storedHash.equals(document.getHash())) {
+
+ ttx.mutate().row("uri:" + document.getURI()).col(DOC_HASH_COL).set(document.getHash());
+
+ Integer refCount =
+ ttx.get().row("doc:" + document.getHash()).col(DOC_REF_COUNT_COL).toInteger();
+ if (refCount == null) {
+ // this document was never seen before
+ addNewDocument(ttx, document);
+ } else {
+ setRefCount(ttx, document.getHash(), refCount, refCount + 1);
+ }
+
+ if (storedHash != null) {
+ decrementRefCount(ttx, refCount, storedHash);
+ }
+ }
+ }
+
+ private void setRefCount(TypedTransactionBase tx, String hash, Integer prevRc, int rc) {
+ tx.mutate().row("doc:" + hash).col(DOC_REF_COUNT_COL).set(rc);
+
+ if (rc == 0 || (rc == 1 && (prevRc == null || prevRc == 0))) {
+ // setting this triggers DocumentObserver
+ tx.mutate().row("doc:" + hash).col(INDEX_CHECK_COL).set();
+ }
+ }
+
+ private void decrementRefCount(TypedTransactionBase tx, Integer prevRc, String hash) {
+ int rc = tx.get().row("doc:" + hash).col(DOC_REF_COUNT_COL).toInteger();
+ setRefCount(tx, hash, prevRc, rc - 1);
+ }
+
+ private void addNewDocument(TypedTransactionBase tx, Document doc) {
+ setRefCount(tx, doc.getHash(), null, 1);
+ tx.mutate().row("doc:" + doc.getHash()).col(DOC_CONTENT_COL).set(doc.getContent());
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/DocumentObserver.java b/phrasecount/src/main/java/phrasecount/DocumentObserver.java
new file mode 100644
index 0000000..1c50bfc
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/DocumentObserver.java
@@ -0,0 +1,102 @@
+package phrasecount;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Map.Entry;
+
+import org.apache.fluo.api.client.TransactionBase;
+import org.apache.fluo.api.data.Bytes;
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.api.observer.AbstractObserver;
+import org.apache.fluo.recipes.core.map.CollisionFreeMap;
+import org.apache.fluo.recipes.core.types.TypedTransactionBase;
+import phrasecount.pojos.Counts;
+import phrasecount.pojos.Document;
+
+import static phrasecount.Constants.DOC_CONTENT_COL;
+import static phrasecount.Constants.DOC_REF_COUNT_COL;
+import static phrasecount.Constants.INDEX_CHECK_COL;
+import static phrasecount.Constants.INDEX_STATUS_COL;
+import static phrasecount.Constants.PCM_ID;
+import static phrasecount.Constants.TYPEL;
+
+/**
+ * An Observer that updates phrase counts when a document is added or removed.
+ */
+public class DocumentObserver extends AbstractObserver {
+
+ private CollisionFreeMap<String, Counts> pcMap;
+
+ private enum IndexStatus {
+ INDEXED, UNINDEXED
+ }
+
+ @Override
+ public void init(Context context) throws Exception {
+ pcMap = CollisionFreeMap.getInstance(PCM_ID, context.getAppConfiguration());
+ }
+
+ @Override
+ public void process(TransactionBase tx, Bytes row, Column col) throws Exception {
+
+ TypedTransactionBase ttx = TYPEL.wrap(tx);
+
+ IndexStatus status = getStatus(ttx, row);
+ int refCount = ttx.get().row(row).col(DOC_REF_COUNT_COL).toInteger(0);
+
+ if (status == IndexStatus.UNINDEXED && refCount > 0) {
+ updatePhraseCounts(ttx, row, 1);
+ ttx.mutate().row(row).col(INDEX_STATUS_COL).set(IndexStatus.INDEXED.name());
+ } else if (status == IndexStatus.INDEXED && refCount == 0) {
+ updatePhraseCounts(ttx, row, -1);
+ deleteDocument(ttx, row);
+ }
+
+ // TODO modifying the trigger is currently broken; enable more than one observer to commit for a
+ // notification
+ // tx.delete(row, col);
+
+ }
+
+ @Override
+ public ObservedColumn getObservedColumn() {
+ return new ObservedColumn(INDEX_CHECK_COL, NotificationType.STRONG);
+ }
+
+ private void deleteDocument(TypedTransactionBase tx, Bytes row) {
+ // TODO it would probably be useful to have a deleteRow method on Transaction... this method
+ // could start off w/ a simple implementation and later be
+ // optimized... or could have a delete range option
+
+ // TODO this is brittle, this code assumes it knows all possible columns
+ tx.delete(row, DOC_CONTENT_COL);
+ tx.delete(row, DOC_REF_COUNT_COL);
+ tx.delete(row, INDEX_STATUS_COL);
+ }
+
+ private void updatePhraseCounts(TypedTransactionBase ttx, Bytes row, int multiplier) {
+ String content = ttx.get().row(row).col(Constants.DOC_CONTENT_COL).toString();
+
+ // this makes the assumption that the implementation of getPhrases is invariant. This is
+ // probably a bad assumption. A possible way to make this more robust
+ // is to store the output of getPhrases when indexing and use the stored output when unindexing.
+ // Alternatively, could store the version of Document used for
+ // indexing.
+ Map<String, Integer> phrases = new Document(null, content).getPhrases();
+ Map<String, Counts> updates = new HashMap<>(phrases.size());
+ for (Entry<String, Integer> entry : phrases.entrySet()) {
+ updates.put(entry.getKey(), new Counts(multiplier, entry.getValue() * multiplier));
+ }
+
+ pcMap.update(ttx, updates);
+ }
+
+ private IndexStatus getStatus(TypedTransactionBase tx, Bytes row) {
+ String status = tx.get().row(row).col(INDEX_STATUS_COL).toString();
+
+ if (status == null)
+ return IndexStatus.UNINDEXED;
+
+ return IndexStatus.valueOf(status);
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/PhraseExporter.java b/phrasecount/src/main/java/phrasecount/PhraseExporter.java
new file mode 100644
index 0000000..5aec44a
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/PhraseExporter.java
@@ -0,0 +1,24 @@
+package phrasecount;
+
+import java.util.function.Consumer;
+
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.fluo.recipes.accumulo.export.AccumuloExporter;
+import org.apache.fluo.recipes.core.export.SequencedExport;
+import phrasecount.pojos.Counts;
+import phrasecount.query.PhraseCountTable;
+
+/**
+ * Export code that converts {@link Counts} objects from the export queue to Mutations that are
+ * written to Accumulo.
+ */
+public class PhraseExporter extends AccumuloExporter<String, Counts> {
+
+ @Override
+ protected void translate(SequencedExport<String, Counts> export, Consumer<Mutation> consumer) {
+ String phrase = export.getKey();
+ long seq = export.getSequence();
+ Counts counts = export.getValue();
+ consumer.accept(PhraseCountTable.createMutation(phrase, seq, counts));
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/PhraseMap.java b/phrasecount/src/main/java/phrasecount/PhraseMap.java
new file mode 100644
index 0000000..01c3bfb
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/PhraseMap.java
@@ -0,0 +1,63 @@
+package phrasecount;
+
+import java.util.Iterator;
+import java.util.Optional;
+
+import com.google.common.collect.Iterators;
+import org.apache.fluo.api.client.TransactionBase;
+import org.apache.fluo.api.observer.Observer.Context;
+import org.apache.fluo.recipes.core.export.Export;
+import org.apache.fluo.recipes.core.export.ExportQueue;
+import org.apache.fluo.recipes.core.map.CollisionFreeMap;
+import org.apache.fluo.recipes.core.map.Combiner;
+import org.apache.fluo.recipes.core.map.Update;
+import org.apache.fluo.recipes.core.map.UpdateObserver;
+import phrasecount.pojos.Counts;
+
+import static phrasecount.Constants.EXPORT_QUEUE_ID;
+
+/**
+ * This class contains all of the code related to the {@link CollisionFreeMap} that keeps track of
+ * phrase counts.
+ */
+public class PhraseMap {
+
+ /**
+ * A combiner for the {@link CollisionFreeMap} that stores phrase counts. The
+ * {@link CollisionFreeMap} calls this combiner when it lazily updates the counts for a phrase.
+ */
+ public static class PcmCombiner implements Combiner<String, Counts> {
+
+ @Override
+ public Optional<Counts> combine(String key, Iterator<Counts> updates) {
+ Counts sum = new Counts(0, 0);
+ while (updates.hasNext()) {
+ sum = sum.add(updates.next());
+ }
+ return Optional.of(sum);
+ }
+ }
+
+ /**
+ * This class is notified when the {@link CollisionFreeMap} used to store phrase counts updates a
+ * phrase count. Updates are placed on an Accumulo export queue to be exported to the table storing
+ * phrase counts for query.
+ */
+ public static class PcmUpdateObserver extends UpdateObserver<String, Counts> {
+
+ private ExportQueue<String, Counts> pcEq;
+
+ @Override
+ public void init(String mapId, Context observerContext) throws Exception {
+ pcEq = ExportQueue.getInstance(EXPORT_QUEUE_ID, observerContext.getAppConfiguration());
+ }
+
+ @Override
+ public void updatingValues(TransactionBase tx, Iterator<Update<String, Counts>> updates) {
+ Iterator<Export<String, Counts>> exports =
+ Iterators.transform(updates, u -> new Export<>(u.getKey(), u.getNewValue().get()));
+ pcEq.addAll(tx, exports);
+ }
+ }
+
+}
diff --git a/phrasecount/src/main/java/phrasecount/cmd/Load.java b/phrasecount/src/main/java/phrasecount/cmd/Load.java
new file mode 100644
index 0000000..82e4e75
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/cmd/Load.java
@@ -0,0 +1,51 @@
+package phrasecount.cmd;
+
+import java.io.File;
+
+import com.google.common.base.Charsets;
+import com.google.common.io.Files;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.LoaderExecutor;
+import org.apache.fluo.api.config.FluoConfiguration;
+import phrasecount.DocumentLoader;
+import phrasecount.pojos.Document;
+
+public class Load {
+
+ public static void main(String[] args) throws Exception {
+
+ if (args.length != 2) {
+ System.err.println("Usage : " + Load.class.getName() + " <fluo props file> <txt file dir>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration config = new FluoConfiguration(new File(args[0]));
+ config.setLoaderThreads(20);
+ config.setLoaderQueueSize(40);
+
+ try (FluoClient fluoClient = FluoFactory.newClient(config);
+ LoaderExecutor le = fluoClient.newLoaderExecutor()) {
+ File[] files = new File(args[1]).listFiles();
+
+ if (files == null) {
+ System.out.println("Text file dir does not exist: " + args[1]);
+ } else {
+ for (File txtFile : files) {
+ if (txtFile.getName().endsWith(".txt")) {
+ String uri = txtFile.toURI().toString();
+ String content = Files.toString(txtFile, Charsets.UTF_8);
+
+ System.out.println("Processing : " + txtFile.toURI());
+ le.execute(new DocumentLoader(new Document(uri, content)));
+ } else {
+ System.out.println("Ignoring : " + txtFile.toURI());
+ }
+ }
+ }
+ }
+
+ // TODO figure out what threads are hanging around
+ System.exit(0);
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/cmd/Mini.java b/phrasecount/src/main/java/phrasecount/cmd/Mini.java
new file mode 100644
index 0000000..e43c1f5
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/cmd/Mini.java
@@ -0,0 +1,97 @@
+package phrasecount.cmd;
+
+import java.io.File;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import com.beust.jcommander.JCommander;
+import com.beust.jcommander.Parameter;
+import com.beust.jcommander.ParameterException;
+import org.apache.accumulo.core.conf.Property;
+import org.apache.accumulo.minicluster.MemoryUnit;
+import org.apache.accumulo.minicluster.MiniAccumuloCluster;
+import org.apache.accumulo.minicluster.MiniAccumuloConfig;
+import org.apache.accumulo.minicluster.ServerType;
+import org.apache.fluo.api.client.FluoAdmin.InitializationOptions;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.mini.MiniFluo;
+import phrasecount.Application;
+
+public class Mini {
+
+ static class Parameters {
+ @Parameter(names = {"-m", "--moreMemory"}, description = "Use more memory")
+ boolean moreMemory = false;
+
+ @Parameter(names = {"-w", "--workerThreads"}, description = "Number of worker threads")
+ int workerThreads = 5;
+
+ @Parameter(names = {"-t", "--tabletServers"}, description = "Number of tablet servers")
+ int tabletServers = 2;
+
+ @Parameter(names = {"-z", "--zookeeperPort"}, description = "Port to use for zookeeper")
+ int zookeeperPort = 0;
+
+ @Parameter(description = "<MAC dir> <output props file>")
+ List<String> args;
+ }
+
+ public static void main(String[] args) throws Exception {
+
+ Parameters params = new Parameters();
+ JCommander jc = new JCommander(params);
+
+ try {
+ jc.parse(args);
+ if (params.args == null || params.args.size() != 2)
+ throw new ParameterException("Expected two arguments");
+ } catch (ParameterException pe) {
+ System.out.println(pe.getMessage());
+ jc.setProgramName(Mini.class.getSimpleName());
+ jc.usage();
+ System.exit(-1);
+ }
+
+ MiniAccumuloConfig cfg = new MiniAccumuloConfig(new File(params.args.get(0)), "secret");
+ cfg.setZooKeeperPort(params.zookeeperPort);
+ cfg.setNumTservers(params.tabletServers);
+ if (params.moreMemory) {
+ cfg.setMemory(ServerType.TABLET_SERVER, 2, MemoryUnit.GIGABYTE);
+ Map<String, String> site = new HashMap<>();
+ site.put(Property.TSERV_DATACACHE_SIZE.getKey(), "768M");
+ site.put(Property.TSERV_INDEXCACHE_SIZE.getKey(), "256M");
+ cfg.setSiteConfig(site);
+ }
+
+ MiniAccumuloCluster cluster = new MiniAccumuloCluster(cfg);
+ cluster.start();
+
+ FluoConfiguration fluoConfig = new FluoConfiguration();
+
+ fluoConfig.setMiniStartAccumulo(false);
+ fluoConfig.setAccumuloInstance(cluster.getInstanceName());
+ fluoConfig.setAccumuloUser("root");
+ fluoConfig.setAccumuloPassword("secret");
+ fluoConfig.setAccumuloZookeepers(cluster.getZooKeepers());
+ fluoConfig.setInstanceZookeepers(cluster.getZooKeepers() + "/fluo");
+
+ fluoConfig.setAccumuloTable("data");
+ fluoConfig.setWorkerThreads(params.workerThreads);
+
+ fluoConfig.setApplicationName("phrasecount");
+
+ Application.configure(fluoConfig, new Application.Options(17, 17, cluster.getInstanceName(),
+ cluster.getZooKeepers(), "root", "secret", "pcExport"));
+
+ FluoFactory.newAdmin(fluoConfig).initialize(new InitializationOptions());
+
+ MiniFluo miniFluo = FluoFactory.newMiniFluo(fluoConfig);
+
+ miniFluo.getClientConfiguration().save(new File(params.args.get(1)));
+
+ System.out.println();
+ System.out.println("Wrote : " + params.args.get(1));
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/cmd/Print.java b/phrasecount/src/main/java/phrasecount/cmd/Print.java
new file mode 100644
index 0000000..79819b2
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/cmd/Print.java
@@ -0,0 +1,55 @@
+package phrasecount.cmd;
+
+import java.io.File;
+
+import com.google.common.collect.Iterables;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.Snapshot;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.api.data.Span;
+import phrasecount.Constants;
+import phrasecount.pojos.PhraseAndCounts;
+import phrasecount.query.PhraseCountTable;
+
+public class Print {
+
+ public static void main(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err
+ .println("Usage : " + Print.class.getName() + " <fluo props file> <export table name>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration fluoConfig = new FluoConfiguration(new File(args[0]));
+
+ PhraseCountTable pcTable = new PhraseCountTable(fluoConfig, args[1]);
+ for (PhraseAndCounts phraseCount : pcTable) {
+ System.out.printf("%7d %7d '%s'\n", phraseCount.docPhraseCount, phraseCount.totalPhraseCount,
+ phraseCount.phrase);
+ }
+
+ try (FluoClient fluoClient = FluoFactory.newClient(fluoConfig);
+ Snapshot snap = fluoClient.newSnapshot()) {
+
+ // TODO could precompute this using observers
+ int uriCount = count(snap, "uri:", Constants.DOC_HASH_COL);
+ int documentCount = count(snap, "doc:", Constants.DOC_REF_COUNT_COL);
+ int numIndexedDocs = count(snap, "doc:", Constants.INDEX_STATUS_COL);
+
+ System.out.println();
+ System.out.printf("# uris : %,d\n", uriCount);
+ System.out.printf("# unique documents : %,d\n", documentCount);
+ System.out.printf("# processed documents : %,d\n", numIndexedDocs);
+ System.out.println();
+ }
+
+ // TODO figure out what threads are hanging around
+ System.exit(0);
+ }
+
+ private static int count(Snapshot snap, String prefix, Column col) {
+ return Iterables.size(snap.scanner().over(Span.prefix(prefix)).fetch(col).byRow().build());
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/cmd/Setup.java b/phrasecount/src/main/java/phrasecount/cmd/Setup.java
new file mode 100644
index 0000000..9d27917
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/cmd/Setup.java
@@ -0,0 +1,38 @@
+package phrasecount.cmd;
+
+import java.io.File;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.fluo.api.config.FluoConfiguration;
+import phrasecount.Application;
+import phrasecount.Application.Options;
+
+public class Setup {
+
+ public static void main(String[] args) throws Exception {
+ FluoConfiguration config = new FluoConfiguration(new File(args[0]));
+
+ String exportTable = args[1];
+
+ Connector conn =
+ new ZooKeeperInstance(config.getAccumuloInstance(), config.getAccumuloZookeepers())
+ .getConnector("root", new PasswordToken("secret"));
+ try {
+ conn.tableOperations().delete(exportTable);
+ } catch (TableNotFoundException e) {
+ // ignore if table not found
+ }
+
+ conn.tableOperations().create(exportTable);
+
+ Options opts = new Options(103, 103, config.getAccumuloInstance(), config.getAccumuloZookeepers(),
+ config.getAccumuloUser(), config.getAccumuloPassword(), exportTable);
+
+ FluoConfiguration observerConfig = new FluoConfiguration();
+ Application.configure(observerConfig, opts);
+ observerConfig.save(System.out);
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/cmd/Split.java b/phrasecount/src/main/java/phrasecount/cmd/Split.java
new file mode 100644
index 0000000..cc9d145
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/cmd/Split.java
@@ -0,0 +1,40 @@
+package phrasecount.cmd;
+
+import java.io.File;
+import java.util.SortedSet;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.hadoop.io.Text;
+
+/**
+ * Utility to add splits to the Accumulo table used by Fluo.
+ */
+public class Split {
+ public static void main(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err.println("Usage : " + Split.class.getName() + " <fluo props file> <table name>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration fluoConfig = new FluoConfiguration(new File(args[0]));
+ ZooKeeperInstance zki =
+ new ZooKeeperInstance(fluoConfig.getAccumuloInstance(), fluoConfig.getAccumuloZookeepers());
+ Connector conn = zki.getConnector(fluoConfig.getAccumuloUser(),
+ new PasswordToken(fluoConfig.getAccumuloPassword()));
+
+ SortedSet<Text> splits = new TreeSet<>();
+
+ for (char c = 'b'; c < 'z'; c++) {
+ splits.add(new Text("phrase:" + c));
+ }
+
+ conn.tableOperations().addSplits(args[1], splits);
+
+ // TODO figure out what threads are hanging around
+ System.exit(0);
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/pojos/Counts.java b/phrasecount/src/main/java/phrasecount/pojos/Counts.java
new file mode 100644
index 0000000..d8e0829
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/pojos/Counts.java
@@ -0,0 +1,44 @@
+package phrasecount.pojos;
+
+import com.google.common.base.Objects;
+
+public class Counts {
+ // number of documents a phrase was seen in
+ public final long docPhraseCount;
+ // total times a phrase was seen in all documents
+ public final long totalPhraseCount;
+
+ public Counts() {
+ docPhraseCount = 0;
+ totalPhraseCount = 0;
+ }
+
+ public Counts(long docPhraseCount, long totalPhraseCount) {
+ this.docPhraseCount = docPhraseCount;
+ this.totalPhraseCount = totalPhraseCount;
+ }
+
+ public Counts add(Counts other) {
+ return new Counts(this.docPhraseCount + other.docPhraseCount, this.totalPhraseCount + other.totalPhraseCount);
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o instanceof Counts) {
+ Counts opc = (Counts) o;
+ return opc.docPhraseCount == docPhraseCount && opc.totalPhraseCount == totalPhraseCount;
+ }
+
+ return false;
+ }
+
+ @Override
+ public int hashCode() {
+ return (int) (993 * totalPhraseCount + 17 * docPhraseCount);
+ }
+
+ @Override
+ public String toString() {
+ return Objects.toStringHelper(this).add("documents", docPhraseCount).add("total", totalPhraseCount).toString();
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/pojos/Document.java b/phrasecount/src/main/java/phrasecount/pojos/Document.java
new file mode 100644
index 0000000..5fc0e70
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/pojos/Document.java
@@ -0,0 +1,59 @@
+package phrasecount.pojos;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import com.google.common.hash.Hasher;
+import com.google.common.hash.Hashing;
+
+public class Document {
+ // the location where the document came from. This is needed in order to detect when a document
+ // changes.
+ private String uri;
+
+ // the text of a document.
+ private String content;
+
+ private String hash = null;
+
+ public Document(String uri, String content) {
+ this.content = content;
+ this.uri = uri;
+ }
+
+ public String getURI() {
+ return uri;
+ }
+
+ public String getHash() {
+ if (hash != null)
+ return hash;
+
+ Hasher hasher = Hashing.sha1().newHasher();
+ String[] tokens = content.toLowerCase().split("[^\\p{Alnum}]+");
+
+ for (String token : tokens) {
+ hasher.putString(token);
+ }
+
+ return hash = hasher.hash().toString();
+ }
+
+ public Map<String, Integer> getPhrases() {
+ String[] tokens = content.toLowerCase().split("[^\\p{Alnum}]+");
+
+ Map<String, Integer> phrases = new HashMap<>();
+ for (int i = 3; i < tokens.length; i++) {
+ String phrase = tokens[i - 3] + " " + tokens[i - 2] + " " + tokens[i - 1] + " " + tokens[i];
+ Integer old = phrases.put(phrase, 1);
+ if (old != null)
+ phrases.put(phrase, 1 + old);
+ }
+
+ return phrases;
+ }
+
+ public String getContent() {
+ return content;
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/pojos/PcKryoFactory.java b/phrasecount/src/main/java/phrasecount/pojos/PcKryoFactory.java
new file mode 100644
index 0000000..3158f00
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/pojos/PcKryoFactory.java
@@ -0,0 +1,13 @@
+package phrasecount.pojos;
+
+import com.esotericsoftware.kryo.Kryo;
+import com.esotericsoftware.kryo.pool.KryoFactory;
+
+public class PcKryoFactory implements KryoFactory {
+ @Override
+ public Kryo create() {
+ Kryo kryo = new Kryo();
+ kryo.register(Counts.class, 9);
+ return kryo;
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/pojos/PhraseAndCounts.java b/phrasecount/src/main/java/phrasecount/pojos/PhraseAndCounts.java
new file mode 100644
index 0000000..d6ddc33
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/pojos/PhraseAndCounts.java
@@ -0,0 +1,24 @@
+package phrasecount.pojos;
+
+public class PhraseAndCounts extends Counts {
+ public String phrase;
+
+ public PhraseAndCounts(String phrase, int docPhraseCount, int totalPhraseCount) {
+ super(docPhraseCount, totalPhraseCount);
+ this.phrase = phrase;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o instanceof PhraseAndCounts) {
+ PhraseAndCounts op = (PhraseAndCounts) o;
+ return phrase.equals(op.phrase) && super.equals(op);
+ }
+ return false;
+ }
+
+ @Override
+ public int hashCode() {
+ return super.hashCode() + 31 * phrase.hashCode();
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/query/PhraseCountTable.java b/phrasecount/src/main/java/phrasecount/query/PhraseCountTable.java
new file mode 100644
index 0000000..f5f670a
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/query/PhraseCountTable.java
@@ -0,0 +1,107 @@
+package phrasecount.query;
+
+import java.util.Iterator;
+import java.util.Map.Entry;
+
+import com.google.common.collect.Iterators;
+import org.apache.accumulo.core.client.ClientConfiguration;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.RowIterator;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.ZooKeeperInstance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.hadoop.io.Text;
+import phrasecount.pojos.Counts;
+import phrasecount.pojos.PhraseAndCounts;
+
+/**
+ * All of the code for dealing with the Accumulo table that Fluo is exporting to
+ */
+public class PhraseCountTable implements Iterable<PhraseAndCounts> {
+
+ static final String STAT_CF = "stat";
+
+ //name of column qualifier used to store phrase count across all documents
+ static final String TOTAL_PC_CQ = "totalCount";
+
+ //name of column qualifier used to store number of documents containing a phrase
+ static final String DOC_PC_CQ = "docCount";
+
+ public static Mutation createMutation(String phrase, long seq, Counts pc) {
+ Mutation mutation = new Mutation(phrase);
+
+ // use the sequence number for the Accumulo timestamp; this will cause older updates to fall
+ // behind newer ones
+ if (pc.totalPhraseCount == 0)
+ mutation.putDelete(STAT_CF, TOTAL_PC_CQ, seq);
+ else
+ mutation.put(STAT_CF, TOTAL_PC_CQ, seq, pc.totalPhraseCount + "");
+
+ if (pc.docPhraseCount == 0)
+ mutation.putDelete(STAT_CF, DOC_PC_CQ, seq);
+ else
+ mutation.put(STAT_CF, DOC_PC_CQ, seq, pc.docPhraseCount + "");
+
+ return mutation;
+ }
+
+ private Connector conn;
+ private String table;
+
+ public PhraseCountTable(FluoConfiguration fluoConfig, String table) throws Exception {
+ ZooKeeperInstance zki = new ZooKeeperInstance(
+ new ClientConfiguration().withZkHosts(fluoConfig.getAccumuloZookeepers())
+ .withInstance(fluoConfig.getAccumuloInstance()));
+ this.conn = zki.getConnector(fluoConfig.getAccumuloUser(),
+ new PasswordToken(fluoConfig.getAccumuloPassword()));
+ this.table = table;
+ }
+
+ public PhraseCountTable(Connector conn, String table) {
+ this.conn = conn;
+ this.table = table;
+ }
+
+
+ public Counts getPhraseCounts(String phrase) throws Exception {
+ Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
+ scanner.setRange(new Range(phrase));
+
+ int sum = 0;
+ int docCount = 0;
+
+ for (Entry<Key, Value> entry : scanner) {
+ String cq = entry.getKey().getColumnQualifierData().toString();
+ if (cq.equals(TOTAL_PC_CQ)) {
+ sum = Integer.valueOf(entry.getValue().toString());
+ }
+
+ if (cq.equals(DOC_PC_CQ)) {
+ docCount = Integer.valueOf(entry.getValue().toString());
+ }
+ }
+
+ return new Counts(docCount, sum);
+ }
+
+ @Override
+ public Iterator<PhraseAndCounts> iterator() {
+ try {
+ Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
+ scanner.fetchColumn(new Text(STAT_CF), new Text(TOTAL_PC_CQ));
+ scanner.fetchColumn(new Text(STAT_CF), new Text(DOC_PC_CQ));
+
+ return Iterators.transform(new RowIterator(scanner), new RowTransform());
+ } catch (RuntimeException e) {
+ throw e;
+ } catch (Exception e) {
+ throw new RuntimeException(e);
+ }
+ }
+}
diff --git a/phrasecount/src/main/java/phrasecount/query/RowTransform.java b/phrasecount/src/main/java/phrasecount/query/RowTransform.java
new file mode 100644
index 0000000..e86439c
--- /dev/null
+++ b/phrasecount/src/main/java/phrasecount/query/RowTransform.java
@@ -0,0 +1,34 @@
+package phrasecount.query;
+
+import java.util.Iterator;
+import java.util.Map.Entry;
+
+import com.google.common.base.Function;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Value;
+import phrasecount.pojos.PhraseAndCounts;
+
+public class RowTransform implements Function<Iterator<Entry<Key, Value>>, PhraseAndCounts> {
+ @Override
+ public PhraseAndCounts apply(Iterator<Entry<Key, Value>> input) {
+ String phrase = null;
+
+ int totalPhraseCount = 0;
+ int docPhraseCount = 0;
+
+ while (input.hasNext()) {
+ Entry<Key, Value> colEntry = input.next();
+ String cq = colEntry.getKey().getColumnQualifierData().toString();
+
+ if (cq.equals(PhraseCountTable.TOTAL_PC_CQ))
+ totalPhraseCount = Integer.parseInt(colEntry.getValue().toString());
+ else
+ docPhraseCount = Integer.parseInt(colEntry.getValue().toString());
+
+ if (phrase == null)
+ phrase = colEntry.getKey().getRowData().toString();
+ }
+
+ return new PhraseAndCounts(phrase, docPhraseCount, totalPhraseCount);
+ }
+}
diff --git a/phrasecount/src/test/java/phrasecount/PhraseCounterTest.java b/phrasecount/src/test/java/phrasecount/PhraseCounterTest.java
new file mode 100644
index 0000000..5815883
--- /dev/null
+++ b/phrasecount/src/test/java/phrasecount/PhraseCounterTest.java
@@ -0,0 +1,215 @@
+package phrasecount;
+
+import java.util.Random;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.minicluster.MiniAccumuloCluster;
+import org.apache.accumulo.minicluster.MiniAccumuloConfig;
+import org.apache.fluo.api.client.FluoAdmin.InitializationOptions;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.LoaderExecutor;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.mini.MiniFluo;
+import org.apache.fluo.recipes.core.types.TypedSnapshot;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import phrasecount.pojos.Counts;
+import phrasecount.pojos.Document;
+import phrasecount.query.PhraseCountTable;
+
+import static phrasecount.Constants.DOC_CONTENT_COL;
+import static phrasecount.Constants.DOC_REF_COUNT_COL;
+import static phrasecount.Constants.TYPEL;
+
+// TODO make this an integration test
+
+public class PhraseCounterTest {
+ public static TemporaryFolder folder = new TemporaryFolder();
+ public static MiniAccumuloCluster cluster;
+ private static FluoConfiguration props;
+ private static MiniFluo miniFluo;
+ private static final PasswordToken password = new PasswordToken("secret");
+ private static AtomicInteger tableCounter = new AtomicInteger(1);
+ private PhraseCountTable pcTable;
+
+ @BeforeClass
+ public static void setUpBeforeClass() throws Exception {
+ folder.create();
+ MiniAccumuloConfig cfg = new MiniAccumuloConfig(folder.newFolder("miniAccumulo"),
+ new String(password.getPassword()));
+ cluster = new MiniAccumuloCluster(cfg);
+ cluster.start();
+ }
+
+ @AfterClass
+ public static void tearDownAfterClass() throws Exception {
+ cluster.stop();
+ folder.delete();
+ }
+
+ @Before
+ public void setUpFluo() throws Exception {
+
+ // Configure Fluo to use the mini instance. We could avoid all of this code and let MiniFluo
+ // create a MiniAccumulo instance. However, we need access to the MiniAccumulo instance in
+ // order to create the export/query table.
+ props = new FluoConfiguration();
+ props.setMiniStartAccumulo(false);
+ props.setApplicationName("phrasecount");
+ props.setAccumuloInstance(cluster.getInstanceName());
+ props.setAccumuloUser("root");
+ props.setAccumuloPassword("secret");
+ props.setInstanceZookeepers(cluster.getZooKeepers() + "/fluo");
+ props.setAccumuloZookeepers(cluster.getZooKeepers());
+ props.setAccumuloTable("data" + tableCounter.getAndIncrement());
+ props.setWorkerThreads(5);
+
+ // create the export/query table
+ String queryTable = "pcq" + tableCounter.getAndIncrement();
+ Connector conn = cluster.getConnector("root", "secret");
+ conn.tableOperations().create(queryTable);
+ pcTable = new PhraseCountTable(conn, queryTable);
+
+ // configure phrase count observers
+ Application.configure(props, new Application.Options(13, 13, cluster.getInstanceName(),
+ cluster.getZooKeepers(), "root", "secret", queryTable));
+
+ FluoFactory.newAdmin(props)
+ .initialize(new InitializationOptions().setClearTable(true).setClearZookeeper(true));
+
+ miniFluo = FluoFactory.newMiniFluo(props);
+ }
+
+ @After
+ public void tearDownFluo() throws Exception {
+ miniFluo.close();
+ }
+
+ private void loadDocument(FluoClient fluoClient, String uri, String content) {
+ try (LoaderExecutor le = fluoClient.newLoaderExecutor()) {
+ Document doc = new Document(uri, content);
+ le.execute(new DocumentLoader(doc));
+ }
+ miniFluo.waitForObservers();
+ }
+
+ @Test
+ public void test1() throws Exception {
+ try (FluoClient fluoClient = FluoFactory.newClient(props)) {
+
+ loadDocument(fluoClient, "/foo1", "This is only a test. Do not panic. This is only a test.");
+
+ Assert.assertEquals(new Counts(1, 2), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("test do not panic"));
+
+ // add a new document w/ different content and an overlapping phrase; should change some counts
+ loadDocument(fluoClient, "/foo2", "This is only a test");
+
+ Assert.assertEquals(new Counts(2, 3), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("test do not panic"));
+
+ // add new document w/ same content, should not change any counts
+ loadDocument(fluoClient, "/foo3", "This is only a test");
+
+ Assert.assertEquals(new Counts(2, 3), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("test do not panic"));
+
+ // change the content of /foo1, should change counts
+ loadDocument(fluoClient, "/foo1", "The test is over, for now.");
+
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("the test is over"));
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("test do not panic"));
+
+ // change content of foo2, should not change anything
+ loadDocument(fluoClient, "/foo2", "The test is over, for now.");
+
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("the test is over"));
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("test do not panic"));
+
+ String oldHash = new Document("/foo3", "This is only a test").getHash();
+ try (TypedSnapshot tsnap = TYPEL.wrap(fluoClient.newSnapshot())) {
+ Assert.assertNotNull(tsnap.get().row("doc:" + oldHash).col(DOC_CONTENT_COL).toString());
+ Assert.assertEquals(1, tsnap.get().row("doc:" + oldHash).col(DOC_REF_COUNT_COL).toInteger(0));
+ }
+ // dereference document that foo3 was referencing
+ loadDocument(fluoClient, "/foo3", "The test is over, for now.");
+
+ Assert.assertEquals(new Counts(1, 1), pcTable.getPhraseCounts("the test is over"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("is only a test"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("test do not panic"));
+
+ try (TypedSnapshot tsnap = TYPEL.wrap(fluoClient.newSnapshot())) {
+ Assert.assertNull(tsnap.get().row("doc:" + oldHash).col(DOC_CONTENT_COL).toString());
+ Assert.assertNull(tsnap.get().row("doc:" + oldHash).col(DOC_REF_COUNT_COL).toInteger());
+ }
+ }
+
+ }
+
+ @Test
+ public void testHighCardinality() throws Exception {
+ try (FluoClient fluoClient = FluoFactory.newClient(props)) {
+
+ Random rand = new Random();
+
+ loadDocsWithRandomWords(fluoClient, rand, "This is only a test", 0, 100);
+
+ Assert.assertEquals(new Counts(100, 100), pcTable.getPhraseCounts("this is only a"));
+ Assert.assertEquals(new Counts(100, 100), pcTable.getPhraseCounts("is only a test"));
+
+ loadDocsWithRandomWords(fluoClient, rand, "This is not a test", 0, 2);
+
+ Assert.assertEquals(new Counts(2, 2), pcTable.getPhraseCounts("this is not a"));
+ Assert.assertEquals(new Counts(2, 2), pcTable.getPhraseCounts("is not a test"));
+ Assert.assertEquals(new Counts(98, 98), pcTable.getPhraseCounts("this is only a"));
+ Assert.assertEquals(new Counts(98, 98), pcTable.getPhraseCounts("is only a test"));
+
+ loadDocsWithRandomWords(fluoClient, rand, "This is not a test", 2, 100);
+
+ Assert.assertEquals(new Counts(100, 100), pcTable.getPhraseCounts("this is not a"));
+ Assert.assertEquals(new Counts(100, 100), pcTable.getPhraseCounts("is not a test"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("this is only a"));
+ Assert.assertEquals(new Counts(0, 0), pcTable.getPhraseCounts("is only a test"));
+
+ loadDocsWithRandomWords(fluoClient, rand, "This is only a test", 0, 50);
+
+ Assert.assertEquals(new Counts(50, 50), pcTable.getPhraseCounts("this is not a"));
+ Assert.assertEquals(new Counts(50, 50), pcTable.getPhraseCounts("is not a test"));
+ Assert.assertEquals(new Counts(50, 50), pcTable.getPhraseCounts("this is only a"));
+ Assert.assertEquals(new Counts(50, 50), pcTable.getPhraseCounts("is only a test"));
+
+ }
+ }
+
+ void loadDocsWithRandomWords(FluoClient fluoClient, Random rand, String phrase, int start,
+ int end) {
+
+ try (LoaderExecutor le = fluoClient.newLoaderExecutor()) {
+ // load many documents that share the same phrase
+ for (int i = start; i < end; i++) {
+ String uri = "/foo" + i;
+ StringBuilder content = new StringBuilder(phrase);
+ // add a bunch of random words
+ for (int j = 0; j < 20; j++) {
+ content.append(' ');
+ content.append(Integer.toString(rand.nextInt(10000), 36));
+ }
+
+ Document doc = new Document(uri, content.toString());
+ le.execute(new DocumentLoader(doc));
+ }
+ }
+ miniFluo.waitForObservers();
+ }
+}
+
diff --git a/phrasecount/src/test/resources/log4j.properties b/phrasecount/src/test/resources/log4j.properties
new file mode 100644
index 0000000..1ed12ff
--- /dev/null
+++ b/phrasecount/src/test/resources/log4j.properties
@@ -0,0 +1,29 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+log4j.rootLogger=INFO, CA
+log4j.appender.CA=org.apache.log4j.ConsoleAppender
+log4j.appender.CA.layout=org.apache.log4j.PatternLayout
+log4j.appender.CA.layout.ConversionPattern=%d{ISO8601} [%c{2}] %-5p: %m%n
+
+#Uncomment to see debugging output for Fluo.
+#log4j.logger.org.apache.fluo=DEBUG
+
+#uncomment the following to see all transaction activity
+#log4j.logger.fluo.tx=TRACE
+
+log4j.logger.org.apache.zookeeper.ClientCnxn=FATAL
+log4j.logger.org.apache.zookeeper.ZooKeeper=WARN
+log4j.logger.org.apache.curator=WARN
diff --git a/stresso/.gitignore b/stresso/.gitignore
new file mode 100644
index 0000000..9233d7a
--- /dev/null
+++ b/stresso/.gitignore
@@ -0,0 +1,9 @@
+.classpath
+.project
+.settings
+target
+.DS_Store
+.idea
+*.iml
+git/
+logs/
diff --git a/stresso/.travis.yml b/stresso/.travis.yml
new file mode 100644
index 0000000..551c724
--- /dev/null
+++ b/stresso/.travis.yml
@@ -0,0 +1,25 @@
+# Copyright 2015 Stresso authors (see AUTHORS)
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+language: java
+jdk:
+ - oraclejdk8
+script: mvn verify
+notifications:
+ irc:
+ channels:
+ - "chat.freenode.net#fluo"
+ on_success: always
+ on_failure: always
+ use_notice: true
+ skip_join: true
diff --git a/stresso/AUTHORS b/stresso/AUTHORS
new file mode 100644
index 0000000..d413329
--- /dev/null
+++ b/stresso/AUTHORS
@@ -0,0 +1,5 @@
+AUTHORS
+-------
+
+Keith Turner - Peterson Technologies
+Mike Walch - Peterson Technologies
diff --git a/stresso/LICENSE b/stresso/LICENSE
new file mode 100644
index 0000000..37ec93a
--- /dev/null
+++ b/stresso/LICENSE
@@ -0,0 +1,191 @@
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+1. Definitions.
+
+"License" shall mean the terms and conditions for use, reproduction, and
+distribution as defined by Sections 1 through 9 of this document.
+
+"Licensor" shall mean the copyright owner or entity authorized by the copyright
+owner that is granting the License.
+
+"Legal Entity" shall mean the union of the acting entity and all other entities
+that control, are controlled by, or are under common control with that entity.
+For the purposes of this definition, "control" means (i) the power, direct or
+indirect, to cause the direction or management of such entity, whether by
+contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the
+outstanding shares, or (iii) beneficial ownership of such entity.
+
+"You" (or "Your") shall mean an individual or Legal Entity exercising
+permissions granted by this License.
+
+"Source" form shall mean the preferred form for making modifications, including
+but not limited to software source code, documentation source, and configuration
+files.
+
+"Object" form shall mean any form resulting from mechanical transformation or
+translation of a Source form, including but not limited to compiled object code,
+generated documentation, and conversions to other media types.
+
+"Work" shall mean the work of authorship, whether in Source or Object form, made
+available under the License, as indicated by a copyright notice that is included
+in or attached to the work (an example is provided in the Appendix below).
+
+"Derivative Works" shall mean any work, whether in Source or Object form, that
+is based on (or derived from) the Work and for which the editorial revisions,
+annotations, elaborations, or other modifications represent, as a whole, an
+original work of authorship. For the purposes of this License, Derivative Works
+shall not include works that remain separable from, or merely link (or bind by
+name) to the interfaces of, the Work and Derivative Works thereof.
+
+"Contribution" shall mean any work of authorship, including the original version
+of the Work and any modifications or additions to that Work or Derivative Works
+thereof, that is intentionally submitted to Licensor for inclusion in the Work
+by the copyright owner or by an individual or Legal Entity authorized to submit
+on behalf of the copyright owner. For the purposes of this definition,
+"submitted" means any form of electronic, verbal, or written communication sent
+to the Licensor or its representatives, including but not limited to
+communication on electronic mailing lists, source code control systems, and
+issue tracking systems that are managed by, or on behalf of, the Licensor for
+the purpose of discussing and improving the Work, but excluding communication
+that is conspicuously marked or otherwise designated in writing by the copyright
+owner as "Not a Contribution."
+
+"Contributor" shall mean Licensor and any individual or Legal Entity on behalf
+of whom a Contribution has been received by Licensor and subsequently
+incorporated within the Work.
+
+2. Grant of Copyright License.
+
+Subject to the terms and conditions of this License, each Contributor hereby
+grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free,
+irrevocable copyright license to reproduce, prepare Derivative Works of,
+publicly display, publicly perform, sublicense, and distribute the Work and such
+Derivative Works in Source or Object form.
+
+3. Grant of Patent License.
+
+Subject to the terms and conditions of this License, each Contributor hereby
+grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free,
+irrevocable (except as stated in this section) patent license to make, have
+made, use, offer to sell, sell, import, and otherwise transfer the Work, where
+such license applies only to those patent claims licensable by such Contributor
+that are necessarily infringed by their Contribution(s) alone or by combination
+of their Contribution(s) with the Work to which such Contribution(s) was
+submitted. If You institute patent litigation against any entity (including a
+cross-claim or counterclaim in a lawsuit) alleging that the Work or a
+Contribution incorporated within the Work constitutes direct or contributory
+patent infringement, then any patent licenses granted to You under this License
+for that Work shall terminate as of the date such litigation is filed.
+
+4. Redistribution.
+
+You may reproduce and distribute copies of the Work or Derivative Works thereof
+in any medium, with or without modifications, and in Source or Object form,
+provided that You meet the following conditions:
+
+You must give any other recipients of the Work or Derivative Works a copy of
+this License; and
+You must cause any modified files to carry prominent notices stating that You
+changed the files; and
+You must retain, in the Source form of any Derivative Works that You distribute,
+all copyright, patent, trademark, and attribution notices from the Source form
+of the Work, excluding those notices that do not pertain to any part of the
+Derivative Works; and
+If the Work includes a "NOTICE" text file as part of its distribution, then any
+Derivative Works that You distribute must include a readable copy of the
+attribution notices contained within such NOTICE file, excluding those notices
+that do not pertain to any part of the Derivative Works, in at least one of the
+following places: within a NOTICE text file distributed as part of the
+Derivative Works; within the Source form or documentation, if provided along
+with the Derivative Works; or, within a display generated by the Derivative
+Works, if and wherever such third-party notices normally appear. The contents of
+the NOTICE file are for informational purposes only and do not modify the
+License. You may add Your own attribution notices within Derivative Works that
+You distribute, alongside or as an addendum to the NOTICE text from the Work,
+provided that such additional attribution notices cannot be construed as
+modifying the License.
+You may add Your own copyright statement to Your modifications and may provide
+additional or different license terms and conditions for use, reproduction, or
+distribution of Your modifications, or for any such Derivative Works as a whole,
+provided Your use, reproduction, and distribution of the Work otherwise complies
+with the conditions stated in this License.
+
+5. Submission of Contributions.
+
+Unless You explicitly state otherwise, any Contribution intentionally submitted
+for inclusion in the Work by You to the Licensor shall be under the terms and
+conditions of this License, without any additional terms or conditions.
+Notwithstanding the above, nothing herein shall supersede or modify the terms of
+any separate license agreement you may have executed with Licensor regarding
+such Contributions.
+
+6. Trademarks.
+
+This License does not grant permission to use the trade names, trademarks,
+service marks, or product names of the Licensor, except as required for
+reasonable and customary use in describing the origin of the Work and
+reproducing the content of the NOTICE file.
+
+7. Disclaimer of Warranty.
+
+Unless required by applicable law or agreed to in writing, Licensor provides the
+Work (and each Contributor provides its Contributions) on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied,
+including, without limitation, any warranties or conditions of TITLE,
+NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are
+solely responsible for determining the appropriateness of using or
+redistributing the Work and assume any risks associated with Your exercise of
+permissions under this License.
+
+8. Limitation of Liability.
+
+In no event and under no legal theory, whether in tort (including negligence),
+contract, or otherwise, unless required by applicable law (such as deliberate
+and grossly negligent acts) or agreed to in writing, shall any Contributor be
+liable to You for damages, including any direct, indirect, special, incidental,
+or consequential damages of any character arising as a result of this License or
+out of the use or inability to use the Work (including but not limited to
+damages for loss of goodwill, work stoppage, computer failure or malfunction, or
+any and all other commercial damages or losses), even if such Contributor has
+been advised of the possibility of such damages.
+
+9. Accepting Warranty or Additional Liability.
+
+While redistributing the Work or Derivative Works thereof, You may choose to
+offer, and charge a fee for, acceptance of support, warranty, indemnity, or
+other liability obligations and/or rights consistent with this License. However,
+in accepting such obligations, You may act only on Your own behalf and on Your
+sole responsibility, not on behalf of any other Contributor, and only if You
+agree to indemnify, defend, and hold each Contributor harmless for any liability
+incurred by, or claims asserted against, such Contributor by reason of your
+accepting any such warranty or additional liability.
+
+END OF TERMS AND CONDITIONS
+
+APPENDIX: How to apply the Apache License to your work
+
+To apply the Apache License to your work, attach the following boilerplate
+notice, with the fields enclosed by brackets "[]" replaced with your own
+identifying information. (Don't include the brackets!) The text should be
+enclosed in the appropriate comment syntax for the file format. We also
+recommend that a file or class name and description of purpose be included on
+the same "printed page" as the copyright notice for easier identification within
+third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
diff --git a/stresso/README.md b/stresso/README.md
new file mode 100644
index 0000000..d3c2577
--- /dev/null
+++ b/stresso/README.md
@@ -0,0 +1,192 @@
+
+# Stresso
+
+[![Build Status](https://travis-ci.org/astralway/stresso.svg?branch=master)](https://travis-ci.org/astralway/stresso)
+
+An example application designed to stress Apache Fluo. This Fluo application computes the
+number of unique integers through the process of building a bitwise trie. New numbers
+are added to the trie as leaf nodes. Observers watch all nodes in the trie to create
+parents and percolate counts up to the root nodes such that each node in the trie keeps
+track of the number of leaf nodes below it. The count at the root nodes should equal
+the total number of leaf nodes. This makes it easy to verify if the test ran correctly.
+The test stresses Apache Fluo in that multiple transactions can operate on the same data
+as counts are percolated up the trie.
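
As a sketch of the percolation described above: each leaf is a 64-bit number, and a node's parent is obtained by chopping `nodeSize` bits off the end. The snippet below is an illustration only; the `parent` helper and the shifted-prefix representation of nodes are assumptions made for clarity, not the application's actual encoding.

```java
public class TrieSketch {
    // A node's parent: chop nodeSize bits off the end of the node's prefix.
    static long parent(long node, int nodeSize) {
        return node >>> nodeSize;
    }

    public static void main(String[] args) {
        int nodeSize = 8;               // bits chopped per level
        int deepest = 64 / nodeSize;    // leaves live at level 8
        int stopLevel = 5;              // percolate up to level 5
        long node = 1_000_000_000_000L; // a leaf value (10^12)
        for (int level = deepest; level > stopLevel; level--) {
            node = parent(node, nodeSize);
        }
        // prefix of the root node this leaf percolates into
        System.out.printf("0x%X%n", node); // prints 0xE8D4
    }
}
```

Here 10<sup>12</sup> is `0xE8D4A51000`; chopping three 8-bit levels leaves the prefix `0xE8D4`, a level-5 root node.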
+
+## Concepts and definitions
+
+This test has the following set of configurable parameters.
+
+ * **nodeSize** : The number of bits chopped off the end each time a number is
+   percolated up. Choose a nodeSize such that `64 % nodeSize == 0`.
+ * **stopLevel** : The number of levels in the tree is a function of the
+   nodeSize. The deepest possible level is `64 / nodeSize`. Levels are
+   decremented going up the tree. The stop level determines how far up
+   counts are percolated. The deeper the stop level, the more root nodes
+   there are. More root nodes mean fewer collisions, but all roots must be
+   scanned to get the count of unique numbers. Having ~64k root nodes is a
+   good choice.
+ * **max** : Random numbers are generated modulo the max.
+
+Setting the stop level such that you have ~64k root nodes depends on the max
+and nodeSize. For example, assume we choose a max of 10<sup>12</sup> and a
+node size of 8. The following table shows information about each level in the
+tree using this configuration. For a max of 10<sup>12</sup>, choosing a stop
+level of 5 would result in 59,604 root nodes. With this many root nodes there
+would not be many collisions, and scanning 59,604 nodes to compute the number
+of unique integers is a quick operation.
+
+|Level|Max Node |Number of possible Nodes|
+|:---:|---------------------|-----------------------:|
+| 0 |`0xXXXXXXXXXXXXXXXX` | 1 |
+| 1 |`0x00XXXXXXXXXXXXXX` | 1 |
+| 2 |`0x0000XXXXXXXXXXXX` | 1 |
+| 3 |`0x000000XXXXXXXXXX` | 1 |
+| 4 |`0x000000E8XXXXXXXX` | 232 |
+| 5 |`0x000000E8D4XXXXXX` | 59,604 |
+| 6 |`0x000000E8D4A5XXXX` | 15,258,789 |
+| 7 |`0x000000E8D4A510XX` | 3,906,250,000 |
+| 8 |`0x000000E8D4A51000` | 1,000,000,000,000 |
+
+In the table above, X indicates nibbles that are always zeroed out for every
+node at that level. You can easily view nodes at a level using a row prefix
+with the fluo scan command. For example `fluo scan -p 05` shows all nodes at
+level 5.
+
+For a small-scale test, a max of 10<sup>9</sup> and a stop level of 6 is a good
+choice.
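
The node counts in the table above can be reproduced with a few bit shifts. This is a minimal sketch (the `max >>> bits` approximation matches the table's values; the class and variable names are illustrative):

```java
public class LevelNodeCounts {
    public static void main(String[] args) {
        long max = 1_000_000_000_000L; // 10^12
        int nodeSize = 8;
        int deepest = 64 / nodeSize;   // level 8
        for (int level = 4; level <= deepest; level++) {
            // number of trailing bits zeroed out at this level
            int bits = (deepest - level) * nodeSize;
            System.out.printf("level %d : %,d possible nodes%n", level, max >>> bits);
        }
    }
}
```

Running this prints 232 possible nodes for level 4 and 59,604 for level 5, in agreement with the table.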
+
+## Building Stresso
+
+```
+mvn package
+```
+
+This will create a jar and shaded jar in target:
+
+```
+$ ls target/stresso-*
+target/stresso-0.0.1-SNAPSHOT.jar target/stresso-0.0.1-SNAPSHOT-shaded.jar
+```
+
+## Run Stresso using MiniFluo
+
+There are several integration tests that run Stresso on a MiniFluo instance.
+These tests can be run using `mvn verify`.
+
+## Run Stresso on cluster
+
+The [bin directory](/bin) contains a set of scripts to help run this test on a
+cluster. These scripts make the following assumptions:
+
+ * `FLUO_HOME` environment variable is set. If not set, then set it in `conf/env.sh`.
+ * Hadoop `yarn` command is on path.
+ * Hadoop `hadoop` command is on path.
+ * Accumulo `accumulo` command is on path.
+
+Before running any of the scripts, copy [conf/env.sh.example](/conf/env.sh.example)
+to `conf/env.sh`, then inspect and modify the file.
+
+Next, execute the [run-test.sh](/bin/run-test.sh) script. This script will create a
+new Apache Fluo app called `stresso` (which can be changed by `FLUO_APP_NAME` in your env.sh).
+It will modify the application's fluo.properties, copy the stresso jar to the `lib/`
+directory of the app and set the following in fluo.properties:
+
+```
+fluo.observer.0=stresso.trie.NodeObserver
+fluo.app.trie.nodeSize=X
+fluo.app.trie.stopLevel=Y
+```
+
+The `run-test.sh` script will then initialize and start the Stresso application.
+It will load a lot of data directly into Accumulo without transactions and then
+incrementally load smaller amounts of data using transactions. After incrementally
+loading some data, it computes the expected number of unique integers using map reduce.
+It then prints the number of unique integers computed by Apache Fluo.
+
+## Additional Scripts
+
+The script [generate.sh](/bin/generate.sh) starts a map reduce job to generate
+random integers.
+
+```
+generate.sh <num files> <num per file> <max> <out dir>
+
+where:
+
+num files = Number of files to generate (and the number of map tasks)
+num per file = Number of random numbers to generate per file
+max = Generate random numbers between 0 and max
+out dir = Output directory
+```
+
+The script [split.sh](/bin/split.sh) pre-splits the Accumulo table used by Apache
+Fluo. Consider running this command before loading data.
+
+```
+split.sh <num tablets> <max>
+
+where:
+
+num tablets = Number of tablets to create for the lowest level of the tree. Fewer tablets will be created for higher levels based on the max.
+```
+After generating random numbers, load them into Apache Fluo with one of the following
+commands. The script [init.sh](/bin/init.sh) initializes an empty table using
+map reduce. This simulates the case where a user has a lot of initial data to
+load into Fluo. This command should only be run when the table is empty
+because it writes directly to the Fluo table without using transactions.
+
+```
+init.sh <input dir> <tmp dir> <num reducers>
+
+where:
+
+input dir = A directory with files created by stresso.trie.Generate
+tmp dir = This command runs two map reduce jobs and needs an intermediate directory to store data.
+num reducers = Number of reduce tasks the map reduce job should run
+```
+
+Run the [load.sh](/bin/load.sh) script on a table with existing data. It starts
+a map reduce job that executes load transactions. Loading the same directory
+multiple times should not result in incorrect counts.
+
+```
+load.sh <input dir>
+```
+
+After loading data, run the [print.sh](/bin/print.sh) script to check the
+status of the computation of the number of unique integers within Apache Fluo. This
+command will print two numbers: the sum of the counts at the root nodes and the
+number of root nodes. If there are outstanding notifications to process, this
+count may not be accurate.
+
+```
+print.sh
+```
+
+In order to know how many unique numbers are expected, run the [unique.sh](/bin/unique.sh)
+script. This script runs a map reduce job that calculates the number of
+unique integers. It can take a list of directories created by
+multiple runs of [generate.sh](/bin/generate.sh).
+
+```
+unique.sh <num reducers> <input dir>{ <input dir>}
+```
+
+As transactions execute they leave a trail of history behind. The nodes in the
+lower levels of the tree are updated by many transactions and therefore have a
+long history trail. A long transactional history can slow down transactions.
+Forcing a compaction in Accumulo will clean up this history. However,
+compacting the entire table is expensive. To avoid this expense, compact only the
+lower levels of the tree. The following command will compact levels of the
+tree with a maximum number of nodes less than the specified cutoff.
+
+```
+compact-ll.sh <max> <cutoff>
+```
+
+where:
+
+```
+cutoff = Any level of the tree with a maximum number of nodes that is less than this cutoff will be compacted.
+```
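
As an illustration of which levels fall under a given cutoff, using the same `max >>> bits` approximation as the level table earlier in this README (a sketch only; the actual `stresso.trie.CompactLL` logic may differ):

```java
public class CutoffSketch {
    public static void main(String[] args) {
        long max = 1_000_000_000_000L; // 10^12
        long cutoff = 1_000_000;       // compact levels with < 1M possible nodes
        int nodeSize = 8;
        int deepest = 64 / nodeSize;
        for (int level = 1; level <= deepest; level++) {
            // approximate number of possible nodes at this level (at least 1)
            long nodes = Math.max(1, max >>> ((deepest - level) * nodeSize));
            if (nodes < cutoff) {
                System.out.println("compact level " + level + " (~" + nodes + " nodes)");
            }
        }
    }
}
```

With these numbers, levels 1 through 5 fall under the cutoff, so only a small fraction of the table is compacted.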
diff --git a/stresso/bin/compact-ll.sh b/stresso/bin/compact-ll.sh
new file mode 100755
index 0000000..5a98277
--- /dev/null
+++ b/stresso/bin/compact-ll.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+$FLUO_CMD exec $FLUO_APP_NAME stresso.trie.CompactLL $FLUO_PROPS $@
diff --git a/stresso/bin/diff.sh b/stresso/bin/diff.sh
new file mode 100755
index 0000000..5e36d95
--- /dev/null
+++ b/stresso/bin/diff.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+$FLUO_CMD exec $FLUO_APP_NAME stresso.trie.Diff $FLUO_PROPS $@
diff --git a/stresso/bin/generate.sh b/stresso/bin/generate.sh
new file mode 100755
index 0000000..622be8a
--- /dev/null
+++ b/stresso/bin/generate.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+yarn jar $STRESSO_JAR stresso.trie.Generate $@
diff --git a/stresso/bin/init.sh b/stresso/bin/init.sh
new file mode 100755
index 0000000..133ad10
--- /dev/null
+++ b/stresso/bin/init.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+if [ "$#" -ne 3 ]; then
+ echo "Usage : $0 <input dir> <work dir> <num reducers>"
+ exit 1
+fi
+
+yarn jar $STRESSO_SHADED_JAR stresso.trie.Init -Dmapreduce.job.reduces=$3 $FLUO_PROPS $1 $2
diff --git a/stresso/bin/load-env.sh b/stresso/bin/load-env.sh
new file mode 100644
index 0000000..5400fc2
--- /dev/null
+++ b/stresso/bin/load-env.sh
@@ -0,0 +1,44 @@
+if [ ! -f $BIN_DIR/../conf/env.sh ]
+then
+ . $BIN_DIR/../conf/env.sh.example
+else
+ . $BIN_DIR/../conf/env.sh
+fi
+
+# verify fluo configuration
+if [ ! -d "$FLUO_HOME" ]; then
+ echo "Problem with FLUO_HOME : $FLUO_HOME"
+ exit 1
+fi
+FLUO_CMD=$FLUO_HOME/bin/fluo
+if [ -z "$FLUO_APP_NAME" ]; then
+ echo "FLUO_APP_NAME is not set!"
+ exit 1
+fi
+FLUO_APP_LIB=$FLUO_HOME/apps/$FLUO_APP_NAME/lib
+FLUO_PROPS=$FLUO_HOME/apps/$FLUO_APP_NAME/conf/fluo.properties
+if [ ! -f "$FLUO_PROPS" ] && [ -z "$SKIP_FLUO_PROPS_CHECK" ]; then
+ echo "Fluo properties file not found : $FLUO_PROPS"
+ exit 1
+fi
+
+STRESSO_VERSION=0.0.1-SNAPSHOT
+STRESSO_JAR=$BIN_DIR/../target/stresso-$STRESSO_VERSION.jar
+STRESSO_SHADED_JAR=$BIN_DIR/../target/stresso-$STRESSO_VERSION-shaded.jar
+if [ ! -f "$STRESSO_JAR" ] && [ -z "$SKIP_JAR_CHECKS" ]; then
+ echo "Stresso jar not found : $STRESSO_JAR"
+ exit 1;
+fi
+if [ ! -f "$STRESSO_SHADED_JAR" ] && [ -z "$SKIP_JAR_CHECKS" ]; then
+ echo "Stresso shaded jar not found : $STRESSO_SHADED_JAR"
+ exit 1;
+fi
+
+command -v yarn >/dev/null 2>&1 || { echo >&2 "I require yarn but it's not installed. Aborting."; exit 1; }
+command -v hadoop >/dev/null 2>&1 || { echo >&2 "I require hadoop but it's not installed. Aborting."; exit 1; }
+
+if [[ "$OSTYPE" == "darwin"* ]]; then
+ export SED="sed -i .bak"
+else
+ export SED="sed -i"
+fi
diff --git a/stresso/bin/load.sh b/stresso/bin/load.sh
new file mode 100755
index 0000000..8cf2ac5
--- /dev/null
+++ b/stresso/bin/load.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+yarn jar $STRESSO_SHADED_JAR stresso.trie.Load $FLUO_PROPS $@
diff --git a/stresso/bin/print.sh b/stresso/bin/print.sh
new file mode 100755
index 0000000..2554c4c
--- /dev/null
+++ b/stresso/bin/print.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+$FLUO_CMD exec $FLUO_APP_NAME stresso.trie.Print $FLUO_PROPS $@
diff --git a/stresso/bin/run-test.sh b/stresso/bin/run-test.sh
new file mode 100755
index 0000000..a58dd6f
--- /dev/null
+++ b/stresso/bin/run-test.sh
@@ -0,0 +1,124 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+SKIP_JAR_CHECKS="true"
+SKIP_FLUO_PROPS_CHECK="true"
+
+. $BIN_DIR/load-env.sh
+
+unset SKIP_JAR_CHECKS
+unset SKIP_FLUO_PROPS_CHECK
+
+# stop if any command fails
+set -e
+
+if [ ! -d $FLUO_HOME/apps/$FLUO_APP_NAME ]; then
+ $FLUO_CMD new $FLUO_APP_NAME
+else
+ echo "Restarting '$FLUO_APP_NAME' application. Errors may be printed if it's not running..."
+ $FLUO_CMD stop $FLUO_APP_NAME || true
+ rm -rf $FLUO_HOME/apps/$FLUO_APP_NAME
+ $FLUO_CMD new $FLUO_APP_NAME
+fi
+
+# build stresso
+(cd $BIN_DIR/..;mvn package -Dfluo.version=$FLUO_VERSION -Daccumulo.version=$ACCUMULO_VERSION -DskipTests)
+
+if [[ $(accumulo version) == *1.6* ]]; then
+ # build stress balancer
+ (cd $BIN_DIR/..; mkdir -p git; cd git;git clone https://github.com/keith-turner/stress-balancer.git; cd stress-balancer; ./config-fluo.sh $FLUO_PROPS)
+fi
+
+if [ ! -f "$STRESSO_JAR" ]; then
+ echo "Stresso jar not found : $STRESSO_JAR"
+ exit 1
+fi
+if [ ! -d $FLUO_APP_LIB ]; then
+ echo "Fluo app lib $FLUO_APP_LIB does not exist"
+ exit 1
+fi
+cp $STRESSO_JAR $FLUO_APP_LIB
+mvn dependency:copy-dependencies -DincludeArtifactIds=fluo-recipes-core -DoutputDirectory=$FLUO_APP_LIB
+
+# determine a good stop level
+if (("$MAX" <= $((10**9)))); then
+ STOP=6
+elif (("$MAX" <= $((10**12)))); then
+ STOP=5
+else
+ STOP=4
+fi
+
+# delete existing config in fluo.properties if it exists
+$SED '/fluo.observer/d' $FLUO_PROPS
+$SED '/fluo.app.trie/d' $FLUO_PROPS
+
+# append stresso specific config
+echo "fluo.observer.0=stresso.trie.NodeObserver" >> $FLUO_PROPS
+echo "fluo.app.trie.nodeSize=8" >> $FLUO_PROPS
+echo "fluo.app.trie.stopLevel=$STOP" >> $FLUO_PROPS
+
+$FLUO_CMD init $FLUO_APP_NAME -f
+$FLUO_CMD start $FLUO_APP_NAME
+
+echo "Removing any previous logs in $LOG_DIR"
+mkdir -p $LOG_DIR
+rm -f $LOG_DIR/*
+
+# configure balancer for fluo table
+if [[ $(accumulo version) == *1.6* ]]; then
+ (cd $BIN_DIR/../git/stress-balancer; ./config-accumulo.sh $FLUO_PROPS)
+fi # TODO setup RegexGroupBalancer built into Accumulo 1.7.0... may be easier to do from java
+
+hadoop fs -rm -r -f /stresso/
+
+set -e
+
+# add splits to Fluo table
+echo "*****Presplitting table*****"
+$BIN_DIR/split.sh $SPLITS >$LOG_DIR/split.out 2>$LOG_DIR/split.err
+
+if (( GEN_INIT > 0 )); then
+ # generate and load initial data using MapReduce, writing directly to the table
+ echo "*****Generating and loading initial data set*****"
+ $BIN_DIR/generate.sh $MAPS $((GEN_INIT / MAPS)) $MAX /stresso/init >$LOG_DIR/generate_0.out 2>$LOG_DIR/generate_0.err
+ $BIN_DIR/init.sh /stresso/init /stresso/initTmp $REDUCES >$LOG_DIR/init.out 2>$LOG_DIR/init.err
+ hadoop fs -rm -r /stresso/initTmp
+fi
+
+# load data incrementally
+for i in $(seq 1 $ITERATIONS); do
+ echo "*****Generating and loading incremental data set $i*****"
+ $BIN_DIR/generate.sh $MAPS $((GEN_INCR / MAPS)) $MAX /stresso/$i >$LOG_DIR/generate_$i.out 2>$LOG_DIR/generate_$i.err
+ $BIN_DIR/load.sh /stresso/$i >$LOG_DIR/load_$i.out 2>$LOG_DIR/load_$i.err
+ # TODO could reload the same dataset sometimes, maybe when i%5 == 0 or something
+ $BIN_DIR/compact-ll.sh $MAX $COMPACT_CUTOFF >$LOG_DIR/compact-ll_$i.out 2>$LOG_DIR/compact-ll_$i.err
+ if ! ((i % WAIT_PERIOD)); then
+ $FLUO_CMD wait $FLUO_APP_NAME >$LOG_DIR/wait_$i.out 2>$LOG_DIR/wait_$i.err
+ else
+ sleep $SLEEP
+ fi
+done
+
+# print unique counts
+echo "*****Calculating # of unique integers using MapReduce*****"
+$BIN_DIR/unique.sh $REDUCES /stresso/* >$LOG_DIR/unique.out 2>$LOG_DIR/unique.err
+grep UNIQUE $LOG_DIR/unique.err
+
+echo "*****Wait for Fluo to finish processing*****"
+$FLUO_CMD wait $FLUO_APP_NAME
+
+echo "*****Printing # of unique integers calculated by Fluo*****"
+$BIN_DIR/print.sh >$LOG_DIR/print.out 2>$LOG_DIR/print.err
+cat $LOG_DIR/print.out
+
+echo "*****Verifying Fluo & MapReduce results match*****"
+MAPR_TOTAL=$(grep UNIQUE $LOG_DIR/unique.err | cut -d = -f 2)
+FLUO_TOTAL=$(grep "Total at root" $LOG_DIR/print.out | cut -d ' ' -f 5)
+if [ $MAPR_TOTAL -eq $FLUO_TOTAL ]; then
+ echo "Success! Fluo & MapReduce both calculated $FLUO_TOTAL unique integers"
+ exit 0
+else
+ echo "ERROR - Results do not match. Fluo calculated $FLUO_TOTAL unique integers while MapReduce calculated $MAPR_TOTAL integers"
+ exit 1
+fi
diff --git a/stresso/bin/split.sh b/stresso/bin/split.sh
new file mode 100755
index 0000000..225bef5
--- /dev/null
+++ b/stresso/bin/split.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+$FLUO_CMD exec $FLUO_APP_NAME stresso.trie.Split $FLUO_PROPS "$TABLE_PROPS" $@
diff --git a/stresso/bin/unique.sh b/stresso/bin/unique.sh
new file mode 100755
index 0000000..68c2a58
--- /dev/null
+++ b/stresso/bin/unique.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+BIN_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+. $BIN_DIR/load-env.sh
+
+if [ "$#" -lt 2 ]; then
+ echo "Usage : $0 <num reducers> <input dir> [<input dir> ...]"
+ exit 1
+fi
+
+yarn jar $STRESSO_JAR stresso.trie.Unique -Dmapreduce.job.reduces=$1 ${@:2}
diff --git a/stresso/conf/.gitignore b/stresso/conf/.gitignore
new file mode 100644
index 0000000..137e678
--- /dev/null
+++ b/stresso/conf/.gitignore
@@ -0,0 +1 @@
+env.sh
diff --git a/stresso/conf/env.sh.example b/stresso/conf/env.sh.example
new file mode 100644
index 0000000..77f9171
--- /dev/null
+++ b/stresso/conf/env.sh.example
@@ -0,0 +1,48 @@
+###############################
+# configuration for all scripts
+###############################
+# Fluo Home
+test -z "$FLUO_HOME" && FLUO_HOME=/path/to/fluo
+# Fluo application name
+FLUO_APP_NAME=stresso
+
+###############################
+# configuration for run-test.sh
+###############################
+# Place where logs from test are placed
+LOG_DIR=$BIN_DIR/../logs
+# Maximum number to generate
+MAX=$((10**9))
+# Number of splits to create in the table
+SPLITS=17
+# Number of mappers to run for data generation, which determines how many files
+# generation outputs. The number of files determines how many mappers loading
+# data will run.
+MAPS=17
+# Number of reduce tasks
+REDUCES=17
+# Number of random numbers to generate initially
+GEN_INIT=$((10**6))
+# Number of random numbers to generate for each incremental step.
+GEN_INCR=$((10**3))
+# Number of incremental steps.
+ITERATIONS=3
+# Seconds to sleep between incremental steps.
+SLEEP=30
+# Compact levels with less than the following possible nodes after loads
+COMPACT_CUTOFF=$((256**3 + 1))
+# The fluo wait command is executed after this many incremental load steps.
+WAIT_PERIOD=10
+# To run map reduce jobs, a shaded jar is built. The following properties
+# determine what versions of Fluo and Accumulo client libs end up in the shaded
+# jar.
+FLUO_VERSION=$($FLUO_HOME/bin/fluo version)
+ACCUMULO_VERSION=$(accumulo version)
+
+# The following Accumulo table properties will be set
+read -r -d '' TABLE_PROPS << EOM
+table.compaction.major.ratio=1.5
+table.file.compress.blocksize=8K
+table.file.compress.blocksize.index=32K
+table.file.compress.type=snappy
+EOM
diff --git a/stresso/conf/log4j.xml b/stresso/conf/log4j.xml
new file mode 100644
index 0000000..bd82a3a
--- /dev/null
+++ b/stresso/conf/log4j.xml
@@ -0,0 +1,39 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2015 Stresso authors (see AUTHORS)
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
+<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
+
+ <appender name="console" class="org.apache.log4j.ConsoleAppender">
+ <param name="Target" value="System.out"/>
+ <layout class="org.apache.log4j.PatternLayout">
+ <param name="ConversionPattern" value="%d{ISO8601} [%-8c{2}] %-5p: %m%n" />
+ </layout>
+ </appender>
+
+ <logger name="org.apache.zookeeper">
+ <level value="ERROR" />
+ </logger>
+
+ <logger name="org.apache.curator">
+ <level value="ERROR" />
+ </logger>
+
+ <root>
+ <level value="INFO" />
+ <appender-ref ref="console" />
+ </root>
+</log4j:configuration>
diff --git a/stresso/pom.xml b/stresso/pom.xml
new file mode 100644
index 0000000..9b514da
--- /dev/null
+++ b/stresso/pom.xml
@@ -0,0 +1,234 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2014 Stresso authors (see AUTHORS)
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>io.github.astralway</groupId>
+ <artifactId>stresso</artifactId>
+ <version>0.0.1-SNAPSHOT</version>
+ <packaging>jar</packaging>
+
+ <name>Stresso</name>
+ <description>This repo contains an example application designed to stress Apache Fluo</description>
+ <url>https://github.com/astralway/stresso</url>
+
+ <properties>
+ <accumulo.version>1.7.2</accumulo.version>
+ <hadoop.version>2.6.3</hadoop.version>
+ <fluo.version>1.0.0-incubating</fluo.version>
+ <fluo-recipes.version>1.0.0-incubating</fluo-recipes.version>
+ <slf4j.version>1.7.12</slf4j.version>
+ </properties>
+
+ <profiles>
+ <profile>
+ <id>mini-accumulo</id>
+ <activation>
+ <property>
+ <name>!skipTests</name>
+ </property>
+ </activation>
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.accumulo</groupId>
+ <artifactId>accumulo-maven-plugin</artifactId>
+ <version>${accumulo.version}</version>
+ <configuration>
+ <instanceName>it-instance-maven</instanceName>
+ <rootPassword>ITSecret</rootPassword>
+ </configuration>
+ <executions>
+ <execution>
+ <id>run-plugin</id>
+ <goals>
+ <goal>start</goal>
+ <goal>stop</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ </build>
+ </profile>
+ </profiles>
+
+ <build>
+ <plugins>
+ <plugin>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>3.1</version>
+ <configuration>
+ <source>1.8</source>
+ <target>1.8</target>
+ <optimize>true</optimize>
+ <encoding>UTF-8</encoding>
+ </configuration>
+ </plugin>
+
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-failsafe-plugin</artifactId>
+ <configuration>
+ <systemPropertyVariables>
+ <fluo.it.instance.name>it-instance-maven</fluo.it.instance.name>
+ <fluo.it.instance.clear>false</fluo.it.instance.clear>
+ </systemPropertyVariables>
+ </configuration>
+ <executions>
+ <execution>
+ <id>run-integration-tests</id>
+ <goals>
+ <goal>integration-test</goal>
+ <goal>verify</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-shade-plugin</artifactId>
+ <executions>
+ <execution>
+ <goals>
+ <goal>shade</goal>
+ </goals>
+ <phase>package</phase>
+ <configuration>
+ <shadedArtifactAttached>true</shadedArtifactAttached>
+ <shadedClassifierName>shaded</shadedClassifierName>
+ <filters>
+ <filter>
+ <artifact>*:*</artifact>
+ <excludes>
+ <exclude>META-INF/*.SF</exclude>
+ <exclude>META-INF/*.DSA</exclude>
+ <exclude>META-INF/*.RSA</exclude>
+ </excludes>
+ </filter>
+ </filters>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+
+ </plugins>
+ </build>
+
+ <!--
+ The provided scope is used for dependencies that should not end up in
+ the shaded jar. The shaded jar is used to run MapReduce jobs via the yarn
+ command. The yarn command provides the hadoop dependencies, so they are not
+ needed in the shaded jar.
+ -->
+
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-api</artifactId>
+ <version>${fluo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-core</artifactId>
+ <version>${fluo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-mapreduce</artifactId>
+ <version>${fluo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-recipes-core</artifactId>
+ <version>${fluo-recipes.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.accumulo</groupId>
+ <artifactId>accumulo-core</artifactId>
+ <version>${accumulo.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-client</artifactId>
+ <version>${hadoop.version}</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.slf4j</groupId>
+ <artifactId>slf4j-api</artifactId>
+ <version>${slf4j.version}</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.slf4j</groupId>
+ <artifactId>slf4j-log4j12</artifactId>
+ <version>${slf4j.version}</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>com.google.guava</groupId>
+ <artifactId>guava</artifactId>
+ <version>13.0.1</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>commons-configuration</groupId>
+ <artifactId>commons-configuration</artifactId>
+ <version>1.10</version>
+ <scope>provided</scope>
+ </dependency>
+ <dependency>
+ <groupId>commons-codec</groupId>
+ <artifactId>commons-codec</artifactId>
+ <version>1.10</version>
+ <scope>provided</scope>
+ </dependency>
+
+ <!-- Test Dependencies -->
+ <dependency>
+ <groupId>junit</groupId>
+ <artifactId>junit</artifactId>
+ <version>4.11</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.accumulo</groupId>
+ <artifactId>accumulo-minicluster</artifactId>
+ <version>${accumulo.version}</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-mini</artifactId>
+ <version>${fluo.version}</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.fluo</groupId>
+ <artifactId>fluo-recipes-test</artifactId>
+ <version>${fluo-recipes.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>commons-io</groupId>
+ <artifactId>commons-io</artifactId>
+ <version>2.4</version>
+ <scope>test</scope>
+ </dependency>
+ </dependencies>
+</project>
diff --git a/stresso/src/main/java/stresso/trie/CompactLL.java b/stresso/src/main/java/stresso/trie/CompactLL.java
new file mode 100644
index 0000000..1e0e421
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/CompactLL.java
@@ -0,0 +1,61 @@
+package stresso.trie;
+
+import java.io.File;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.core.util.AccumuloUtil;
+import org.apache.hadoop.io.Text;
+
+/**
+ * Compact the lower levels of the tree. The lower levels of the tree contain a small number of
+ * nodes that are frequently updated. Compacting these lower levels is a quick operation that
+ * causes the Fluo GC iterator to clean up past transactions.
+ */
+
+public class CompactLL {
+ public static void main(String[] args) throws Exception {
+
+ if (args.length != 3) {
+ System.err
+ .println("Usage: " + CompactLL.class.getSimpleName() + " <fluo props> <max> <cutoff>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration config = new FluoConfiguration(new File(args[0]));
+
+ long max = Long.parseLong(args[1]);
+
+ // compact levels that can contain fewer nodes than this
+ int cutoff = Integer.parseInt(args[2]);
+
+ int nodeSize;
+ int stopLevel;
+ try (FluoClient client = FluoFactory.newClient(config)) {
+ nodeSize = client.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+ stopLevel = client.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ }
+
+ int level = 64 / nodeSize;
+
+ while (level >= stopLevel) {
+ if (max < cutoff) {
+ break;
+ }
+
+ max = max >> 8;
+ level--;
+ }
+
+ String start = String.format("%02d", stopLevel);
+ String end = String.format("%02d:~", (level));
+
+ System.out.println("Compacting "+start+" to "+end);
+ Connector conn = AccumuloUtil.getConnector(config);
+ conn.tableOperations().compact(config.getAccumuloTable(), new Text(start), new Text(end), true, false);
+
+ System.exit(0);
+ }
+}
diff --git a/stresso/src/main/java/stresso/trie/Constants.java b/stresso/src/main/java/stresso/trie/Constants.java
new file mode 100644
index 0000000..7c8cf6c
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Constants.java
@@ -0,0 +1,32 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package stresso.trie;
+
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.recipes.core.types.StringEncoder;
+import org.apache.fluo.recipes.core.types.TypeLayer;
+
+/**
+ *
+ */
+public class Constants {
+
+ public static final TypeLayer TYPEL = new TypeLayer(new StringEncoder());
+
+ public static final Column COUNT_SEEN_COL = new Column("count", "seen");
+ public static final Column COUNT_WAIT_COL = new Column("count", "wait");
+
+ public static final String NODE_SIZE_PROP = "trie.nodeSize";
+ public static final String STOP_LEVEL_PROP = "trie.stopLevel";
+}
diff --git a/stresso/src/main/java/stresso/trie/Diff.java b/stresso/src/main/java/stresso/trie/Diff.java
new file mode 100644
index 0000000..f74521d
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Diff.java
@@ -0,0 +1,104 @@
+package stresso.trie;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.Snapshot;
+import org.apache.fluo.api.client.scanner.ColumnScanner;
+import org.apache.fluo.api.client.scanner.RowScanner;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.data.ColumnValue;
+import org.apache.fluo.api.data.Span;
+
+public class Diff {
+ public static Map<String, Long> getRootCount(FluoClient client, Snapshot snap, int level,
+ int stopLevel, int nodeSize) throws Exception {
+
+ HashMap<String, Long> counts = new HashMap<>();
+
+ RowScanner rows = snap.scanner().over(Span.prefix(String.format("%02d:", level)))
+ .fetch(Constants.COUNT_SEEN_COL, Constants.COUNT_WAIT_COL).byRow().build();
+
+ for (ColumnScanner columns : rows) {
+ String row = columns.getsRow();
+ Node node = new Node(row);
+
+ while (node.getLevel() > stopLevel) {
+ node = node.getParent();
+ }
+
+ String stopRow = node.getRowId();
+ long count = counts.getOrDefault(stopRow, 0L);
+
+ if (node.getNodeSize() == nodeSize) {
+ for (ColumnValue colVal : columns) {
+ count += Long.parseLong(colVal.getsValue());
+ }
+ } else {
+ throw new RuntimeException("TODO");
+ }
+
+ counts.put(stopRow, count);
+ }
+
+ return counts;
+ }
+
+ public static void main(String[] args) throws Exception {
+
+ if (args.length != 1) {
+ System.err.println("Usage: " + Diff.class.getSimpleName() + " <fluo props>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration config = new FluoConfiguration(new File(args[0]));
+
+ try (FluoClient client = FluoFactory.newClient(config); Snapshot snap = client.newSnapshot()) {
+
+ int stopLevel = client.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ int nodeSize = client.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+
+ Map<String, Long> rootCounts = getRootCount(client, snap, stopLevel, stopLevel, nodeSize);
+ ArrayList<String> rootRows = new ArrayList<>(rootCounts.keySet());
+ Collections.sort(rootRows);
+
+ // TODO 8
+ for (int level = stopLevel + 1; level <= 8; level++) {
+ System.out.printf("Level %d:\n", level);
+
+ Map<String, Long> counts = getRootCount(client, snap, level, stopLevel, nodeSize);
+
+ long sum = 0;
+
+ for (String row : rootRows) {
+ long c1 = rootCounts.get(row);
+ long c2 = counts.getOrDefault(row, -1L);
+
+ if (c1 != c2) {
+ System.out.printf("\tdiff: %s %d %d\n", row, c1, c2);
+ }
+
+ if (c2 > 0) {
+ sum += c2;
+ }
+ }
+
+ HashSet<String> extras = new HashSet<>(counts.keySet());
+ extras.removeAll(rootCounts.keySet());
+
+ for (String row : extras) {
+ long c = counts.get(row);
+ System.out.printf("\textra: %s %d\n", row, c);
+ }
+
+ System.out.println("\tsum " + sum);
+ }
+ }
+ }
+}
diff --git a/stresso/src/main/java/stresso/trie/Generate.java b/stresso/src/main/java/stresso/trie/Generate.java
new file mode 100644
index 0000000..55f9beb
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Generate.java
@@ -0,0 +1,176 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package stresso.trie;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.util.Random;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobClient;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapred.RunningJob;
+import org.apache.hadoop.mapred.SequenceFileOutputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class Generate extends Configured implements Tool {
+
+ private static final Logger log = LoggerFactory.getLogger(Generate.class);
+
+ public static final String TRIE_GEN_NUM_PER_MAPPER_PROP = "stresso.trie.numPerMapper";
+ public static final String TRIE_GEN_NUM_MAPPERS_PROP = "stresso.trie.numMappers";
+ public static final String TRIE_GEN_MAX_PROP = "stresso.trie.max";
+
+ public static class RandomSplit implements InputSplit {
+
+ @Override
+ public void write(DataOutput out) throws IOException {}
+
+ @Override
+ public void readFields(DataInput in) throws IOException {}
+
+ @Override
+ public long getLength() throws IOException {
+ return 0;
+ }
+
+ @Override
+ public String[] getLocations() throws IOException {
+ return new String[0];
+ }
+
+ }
+
+ public static class RandomLongInputFormat implements InputFormat<LongWritable,NullWritable> {
+
+ @Override
+ public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
+ InputSplit[] splits = new InputSplit[job.getInt(TRIE_GEN_NUM_MAPPERS_PROP, 1)];
+ for (int i = 0; i < splits.length; i++) {
+ splits[i] = new RandomSplit();
+ }
+ return splits;
+ }
+
+ @Override
+ public RecordReader<LongWritable,NullWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
+
+ final int numToGen = job.getInt(TRIE_GEN_NUM_PER_MAPPER_PROP, 1);
+ final long max = job.getLong(TRIE_GEN_MAX_PROP, Long.MAX_VALUE);
+
+ return new RecordReader<LongWritable,NullWritable>() {
+
+ private Random random = new Random();
+ private int count = 0;
+
+ @Override
+ public boolean next(LongWritable key, NullWritable value) throws IOException {
+
+ if (count == numToGen)
+ return false;
+
+ // mask the sign bit so the result is non-negative, then bound it by max
+ key.set((random.nextLong() & 0x7fffffffffffffffL) % max);
+ count++;
+ return true;
+ }
+
+ @Override
+ public LongWritable createKey() {
+ return new LongWritable();
+ }
+
+ @Override
+ public NullWritable createValue() {
+ return NullWritable.get();
+ }
+
+ @Override
+ public long getPos() throws IOException {
+ return count;
+ }
+
+ @Override
+ public void close() throws IOException {}
+
+ @Override
+ public float getProgress() throws IOException {
+ return (float) count / numToGen;
+ }
+ };
+ }
+ }
+
+ @Override
+ public int run(String[] args) throws Exception {
+
+ if (args.length != 4) {
+ log.error("Usage: " + this.getClass().getSimpleName() + " <numMappers> <numbersPerMapper> <max> <output dir>");
+ System.exit(-1);
+ }
+
+ int numMappers = Integer.parseInt(args[0]);
+ int numPerMapper = Integer.parseInt(args[1]);
+ long max = Long.parseLong(args[2]);
+ Path out = new Path(args[3]);
+
+ Preconditions.checkArgument(numMappers > 0, "numMappers <= 0");
+ Preconditions.checkArgument(numPerMapper > 0, "numPerMapper <= 0");
+ Preconditions.checkArgument(max > 0, "max <= 0");
+
+ JobConf job = new JobConf(getConf());
+
+ job.setJobName(this.getClass().getName());
+
+ job.setJarByClass(Generate.class);
+
+ job.setInt(TRIE_GEN_NUM_PER_MAPPER_PROP, numPerMapper);
+ job.setInt(TRIE_GEN_NUM_MAPPERS_PROP, numMappers);
+ job.setLong(TRIE_GEN_MAX_PROP, max);
+
+ job.setInputFormat(RandomLongInputFormat.class);
+
+ job.setNumReduceTasks(0);
+
+ job.setOutputKeyClass(LongWritable.class);
+ job.setOutputValueClass(NullWritable.class);
+
+ job.setOutputFormat(SequenceFileOutputFormat.class);
+ SequenceFileOutputFormat.setOutputPath(job, out);
+
+ RunningJob runningJob = JobClient.runJob(job);
+ runningJob.waitForCompletion();
+ return runningJob.isSuccessful() ? 0 : -1;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int ret = ToolRunner.run(new Generate(), args);
+ System.exit(ret);
+ }
+
+}
diff --git a/stresso/src/main/java/stresso/trie/Init.java b/stresso/src/main/java/stresso/trie/Init.java
new file mode 100644
index 0000000..d0f847f
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Init.java
@@ -0,0 +1,241 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package stresso.trie;
+
+import java.io.BufferedOutputStream;
+import java.io.File;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.Collection;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.admin.CompactionConfig;
+import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
+import org.apache.accumulo.core.client.mapreduce.lib.partition.RangePartitioner;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Value;
+import org.apache.commons.codec.binary.Base64;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.core.util.AccumuloUtil;
+import org.apache.fluo.mapreduce.FluoKeyValue;
+import org.apache.fluo.mapreduce.FluoKeyValueGenerator;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.Reducer;
+import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+public class Init extends Configured implements Tool {
+
+ public static final String TRIE_STOP_LEVEL_PROP = FluoConfiguration.FLUO_PREFIX + ".stress.trie.stopLevel";
+ public static final String TRIE_NODE_SIZE_PROP = FluoConfiguration.FLUO_PREFIX + ".stress.trie.node.size";
+
+ public static class UniqueReducer extends Reducer<LongWritable,NullWritable,LongWritable,NullWritable> {
+ @Override
+ protected void reduce(LongWritable key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
+ context.write(key, NullWritable.get());
+ }
+ }
+
+ public static class InitMapper extends Mapper<LongWritable,NullWritable,Text,LongWritable> {
+
+ private int stopLevel;
+ private int nodeSize;
+ private static final LongWritable ONE = new LongWritable(1);
+
+ private Text outputKey = new Text();
+
+ @Override
+ protected void setup(Context context) throws IOException, InterruptedException {
+ nodeSize = context.getConfiguration().getInt(TRIE_NODE_SIZE_PROP, 0);
+ stopLevel = context.getConfiguration().getInt(TRIE_STOP_LEVEL_PROP, 0);
+ }
+
+ @Override
+ protected void map(LongWritable key, NullWritable val, Context context) throws IOException, InterruptedException {
+ Node node = new Node(key.get(), 64 / nodeSize, nodeSize);
+ while (node != null) {
+ outputKey.set(node.getRowId());
+ context.write(outputKey, ONE);
+ if (node.getLevel() <= stopLevel)
+ node = null;
+ else
+ node = node.getParent();
+ }
+ }
+ }
+
+ public static class InitCombiner extends Reducer<Text,LongWritable,Text,LongWritable> {
+
+ private LongWritable outputVal = new LongWritable();
+
+ @Override
+ protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
+ long sum = 0;
+ for (LongWritable l : values) {
+ sum += l.get();
+ }
+
+ outputVal.set(sum);
+ context.write(key, outputVal);
+ }
+ }
+
+ public static class InitReducer extends Reducer<Text,LongWritable,Key,Value> {
+ private FluoKeyValueGenerator fkvg = new FluoKeyValueGenerator();
+
+ @Override
+ protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
+ long sum = 0;
+ for (LongWritable l : values) {
+ sum += l.get();
+ }
+
+ fkvg.setRow(key).setColumn(Constants.COUNT_SEEN_COL).setValue(sum + "");
+
+ FluoKeyValue[] kvs = fkvg.getKeyValues();
+ for (FluoKeyValue kv : kvs) {
+ context.write(kv.getKey(), kv.getValue());
+ }
+ }
+ }
+
+ @Override
+ public int run(String[] args) throws Exception {
+ if (args.length != 3) {
+ System.err.println("Usage: " + this.getClass().getSimpleName() + " <fluoProps> <input dir> <tmp dir>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration props = new FluoConfiguration(new File(args[0]));
+ Path input = new Path(args[1]);
+ Path tmp = new Path(args[2]);
+
+ int stopLevel;
+ int nodeSize;
+ try (FluoClient client = FluoFactory.newClient(props)) {
+ nodeSize = client.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+ stopLevel = client.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ }
+
+ int ret = unique(input, new Path(tmp, "nums"));
+ if (ret != 0)
+ return ret;
+
+ return buildTree(nodeSize, props, tmp, stopLevel);
+ }
+
+ private int unique(Path input, Path tmp) throws Exception {
+ Job job = Job.getInstance(getConf());
+ job.setJarByClass(Init.class);
+
+ job.setJobName(Init.class.getName() + "_unique");
+
+ job.setInputFormatClass(SequenceFileInputFormat.class);
+ SequenceFileInputFormat.addInputPath(job, input);
+
+ job.setReducerClass(UniqueReducer.class);
+
+ job.setOutputKeyClass(LongWritable.class);
+ job.setOutputValueClass(NullWritable.class);
+
+ job.setOutputFormatClass(SequenceFileOutputFormat.class);
+ SequenceFileOutputFormat.setOutputPath(job, tmp);
+
+ boolean success = job.waitForCompletion(true);
+ return success ? 0 : 1;
+
+ }
+
+ private int buildTree(int nodeSize, FluoConfiguration props, Path tmp, int stopLevel) throws Exception {
+ Job job = Job.getInstance(getConf());
+
+ job.setJarByClass(Init.class);
+
+ job.setJobName(Init.class.getName() + "_load");
+
+ job.setMapOutputKeyClass(Text.class);
+ job.setMapOutputValueClass(LongWritable.class);
+
+ job.getConfiguration().setInt(TRIE_NODE_SIZE_PROP, nodeSize);
+ job.getConfiguration().setInt(TRIE_STOP_LEVEL_PROP, stopLevel);
+
+ job.setInputFormatClass(SequenceFileInputFormat.class);
+ SequenceFileInputFormat.addInputPath(job, new Path(tmp, "nums"));
+
+ job.setMapperClass(InitMapper.class);
+ job.setCombinerClass(InitCombiner.class);
+ job.setReducerClass(InitReducer.class);
+
+ job.setOutputFormatClass(AccumuloFileOutputFormat.class);
+
+ job.setPartitionerClass(RangePartitioner.class);
+
+ FileSystem fs = FileSystem.get(job.getConfiguration());
+ Connector conn = AccumuloUtil.getConnector(props);
+
+ Path splitsPath = new Path(tmp, "splits.txt");
+
+ Collection<Text> splits1 = writeSplits(props, fs, conn, splitsPath);
+
+ RangePartitioner.setSplitFile(job, splitsPath.toString());
+ job.setNumReduceTasks(splits1.size() + 1);
+
+ Path outPath = new Path(tmp, "out");
+ AccumuloFileOutputFormat.setOutputPath(job, outPath);
+
+ boolean success = job.waitForCompletion(true);
+
+ if (success) {
+ Path failPath = new Path(tmp, "failures");
+ fs.mkdirs(failPath);
+ conn.tableOperations().importDirectory(props.getAccumuloTable(), outPath.toString(), failPath.toString(), false);
+
+      // Compacting makes the files local to each tablet and rewrites them using the table's settings.
+ conn.tableOperations().compact(props.getAccumuloTable(), new CompactionConfig().setWait(true));
+ }
+ return success ? 0 : 1;
+ }
+
+ private Collection<Text> writeSplits(FluoConfiguration props, FileSystem fs, Connector conn, Path splitsPath) throws Exception {
+ Collection<Text> splits1 = conn.tableOperations().listSplits(props.getAccumuloTable());
+ OutputStream out = new BufferedOutputStream(fs.create(splitsPath));
+ for (Text split : splits1) {
+ out.write(Base64.encodeBase64(split.copyBytes()));
+ out.write('\n');
+ }
+
+ out.close();
+ return splits1;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int ret = ToolRunner.run(new Init(), args);
+ System.exit(ret);
+ }
+
+}
diff --git a/stresso/src/main/java/stresso/trie/Load.java b/stresso/src/main/java/stresso/trie/Load.java
new file mode 100644
index 0000000..8e1ebfb
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Load.java
@@ -0,0 +1,86 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package stresso.trie;
+
+import java.io.File;
+import java.io.IOException;
+
+import org.apache.fluo.api.client.Loader;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.mapreduce.FluoOutputFormat;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class Load extends Configured implements Tool {
+
+ private static final Logger log = LoggerFactory.getLogger(Load.class);
+
+ public static class LoadMapper extends Mapper<LongWritable, NullWritable, Loader, NullWritable> {
+
+ @Override
+ protected void map(LongWritable key, NullWritable val, Context context)
+ throws IOException, InterruptedException {
+ context.write(new NumberLoader(key.get()), val);
+ }
+ }
+
+ @Override
+ public int run(String[] args) throws Exception {
+
+ if (args.length != 2) {
+      log.error("Usage: " + this.getClass().getSimpleName() + " <fluoProps> <input dir>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration props = new FluoConfiguration(new File(args[0]));
+ Path input = new Path(args[1]);
+
+ Job job = Job.getInstance(getConf());
+
+ job.setJobName(Load.class.getName());
+
+ job.setJarByClass(Load.class);
+
+ job.setInputFormatClass(SequenceFileInputFormat.class);
+ SequenceFileInputFormat.addInputPath(job, input);
+
+ job.setMapperClass(LoadMapper.class);
+
+ job.setNumReduceTasks(0);
+
+ job.setOutputFormatClass(FluoOutputFormat.class);
+ FluoOutputFormat.configure(job, props);
+
+ job.getConfiguration().setBoolean("mapreduce.map.speculative", false);
+
+ boolean success = job.waitForCompletion(true);
+ return success ? 0 : 1;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int ret = ToolRunner.run(new Load(), args);
+ System.exit(ret);
+ }
+
+}
diff --git a/stresso/src/main/java/stresso/trie/Node.java b/stresso/src/main/java/stresso/trie/Node.java
new file mode 100644
index 0000000..fb45a60
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Node.java
@@ -0,0 +1,117 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package stresso.trie;
+
+import com.google.common.base.Strings;
+import com.google.common.hash.Hashing;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+/** Utility class that represents a trie node. */
+public class Node {
+
+ private final Number number;
+ private final int level;
+ private final int nodeSize;
+
+  static final int HASH_LEN = 4;
+
+ public Node(Number number, int level, int nodeSize) {
+ this.number = number;
+ this.level = level;
+ this.nodeSize = nodeSize;
+ }
+
+ public Node(String rowId) {
+ String[] rowArgs = rowId.split(":");
+    checkArgument(validRowId(rowArgs), "Invalid row id - " + rowId);
+ this.level = Integer.parseInt(rowArgs[0]);
+ this.nodeSize = Integer.parseInt(rowArgs[2]);
+ this.number = parseNumber(rowArgs[3]);
+ }
+
+ public Number getNumber() {
+ return number;
+ }
+
+ public int getLevel() {
+ return level;
+ }
+
+ public boolean isRoot() {
+ return level == 0;
+ }
+
+ public int getNodeSize() {
+ return nodeSize;
+ }
+
+ private Number parseNumber(String numStr) {
+ if (numStr.equals("root")) {
+ return null;
+ } else if (numStr.length() == 16) {
+ return Long.parseLong(numStr, 16);
+ } else {
+ return Integer.parseInt(numStr, 16);
+ }
+ }
+
+ private String genHash(){
+    long num = (number == null) ? 0L : number.longValue();
+ int hash = Hashing.murmur3_32().newHasher().putInt(level).putInt(nodeSize).putLong(num).hash().asInt();
+ hash = hash & 0x7fffffff;
+    // base 36 gives a lot more bins in 4 bytes than hex, but it is still human-readable, which is nice for debugging.
+ String hashString = Strings.padStart(Integer.toString(hash, Character.MAX_RADIX), HASH_LEN, '0');
+ return hashString.substring(hashString.length() - HASH_LEN);
+ }
+
+ public String getRowId() {
+ if (level == 0) {
+ return String.format("00:%s:%02d:root", genHash(), nodeSize);
+ } else {
+ if (number instanceof Integer) {
+ return String.format("%02d:%s:%02d:%08x", level, genHash(), nodeSize, number);
+ } else {
+ return String.format("%02d:%s:%02d:%016x", level, genHash(), nodeSize, number);
+ }
+ }
+ }
+
+ public Node getParent() {
+ if (level == 1) {
+ return new Node(null, 0, nodeSize);
+ } else {
+ if (number instanceof Long) {
+ int shift = (((64 / nodeSize) - level) * nodeSize) + nodeSize;
+ Long parent = (number.longValue() >> shift) << shift;
+ return new Node(parent, level-1, nodeSize);
+ } else {
+ int shift = (((32 / nodeSize) - level) * nodeSize) + nodeSize;
+ Integer parent = (number.intValue() >> shift) << shift;
+ return new Node(parent, level-1, nodeSize);
+ }
+ }
+ }
+
+ private boolean validRowId(String[] rowArgs) {
+ return ((rowArgs.length == 4) && (rowArgs[0] != null) && (rowArgs[1] != null) && (rowArgs[2] != null) && (rowArgs[3] != null));
+ }
+
+ public static String generateRootId(int nodeSize) {
+ return (new Node(null, 0, nodeSize)).getRowId();
+ }
+}
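The bit-shift arithmetic in `getParent` above is easy to misread, so here is a standalone sketch of the 64-bit (`Long`) branch. `ParentShiftDemo` is a hypothetical helper written for illustration only; it assumes `nodeSize = 8` and mirrors the shift formula from `Node.getParent`:

```java
public class ParentShiftDemo {

    // Mirrors the Long branch of Node.getParent: clear the low-order
    // bits that belong to levels below the parent's level.
    static long parentOf(long num, int level, int nodeSize) {
        int shift = (((64 / nodeSize) - level) * nodeSize) + nodeSize;
        return (num >> shift) << shift;
    }

    public static void main(String[] args) {
        int nodeSize = 8;
        int leafLevel = 64 / nodeSize; // 8 levels for 8-bit chunks
        long num = 0x1122334455667788L;

        // Parent of a leaf-level node clears the lowest nodeSize bits.
        assert parentOf(num, leafLevel, nodeSize) == 0x1122334455667700L;

        // One level higher, two nodeSize-bit chunks are cleared.
        assert parentOf(num, leafLevel - 1, nodeSize) == 0x1122334455660000L;

        System.out.println("ok");
    }
}
```

Each call on a node at `level` yields the number prefix of its `level - 1` ancestor, which is why repeatedly calling `getParent` walks a number up toward the root.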
diff --git a/stresso/src/main/java/stresso/trie/NodeObserver.java b/stresso/src/main/java/stresso/trie/NodeObserver.java
new file mode 100644
index 0000000..d7356c5
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/NodeObserver.java
@@ -0,0 +1,78 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package stresso.trie;
+
+import java.util.Map;
+
+import org.apache.fluo.api.client.TransactionBase;
+import org.apache.fluo.api.data.Bytes;
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.api.observer.AbstractObserver;
+import org.apache.fluo.recipes.core.types.TypedSnapshotBase.Value;
+import org.apache.fluo.recipes.core.types.TypedTransactionBase;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Observer that watches the count:wait column of trie nodes. When a waiting count is found, it is
+ * folded into count:seen and added to count:wait of the parent node in the trie.
+ */
+public class NodeObserver extends AbstractObserver {
+
+ private static final Logger log = LoggerFactory.getLogger(NodeObserver.class);
+
+ private int stopLevel = 0;
+
+ @Override
+ public void process(TransactionBase tx, Bytes row, Column col) throws Exception {
+
+ final TypedTransactionBase ttx = Constants.TYPEL.wrap(tx);
+
+ Map<Column, Value> colVals =
+ ttx.get().row(row).columns(Constants.COUNT_SEEN_COL, Constants.COUNT_WAIT_COL);
+
+ final Integer childWait = colVals.get(Constants.COUNT_WAIT_COL).toInteger(0);
+
+ if (childWait > 0) {
+ Integer childSeen = colVals.get(Constants.COUNT_SEEN_COL).toInteger(0);
+
+ ttx.mutate().row(row).col(Constants.COUNT_SEEN_COL).set(childSeen + childWait);
+ ttx.mutate().row(row).col(Constants.COUNT_WAIT_COL).delete();
+
+ try {
+ Node node = new Node(row.toString());
+ if (node.getLevel() > stopLevel) {
+ Node parent = node.getParent();
+ Integer parentWait =
+ ttx.get().row(parent.getRowId()).col(Constants.COUNT_WAIT_COL).toInteger(0);
+ ttx.mutate().row(parent.getRowId()).col(Constants.COUNT_WAIT_COL)
+ .set(parentWait + childWait);
+ }
+ } catch (IllegalArgumentException e) {
+        // A malformed row id should not stop the observer; log it and move on.
+        log.error("Invalid node row id: " + row, e);
+ }
+ }
+ }
+
+ @Override
+ public void init(Context context) throws Exception {
+ stopLevel = context.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ }
+
+ @Override
+ public ObservedColumn getObservedColumn() {
+ return new ObservedColumn(Constants.COUNT_WAIT_COL, NotificationType.STRONG);
+ }
+}
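As a rough illustration of the rollup that NodeObserver performs, here is a toy in-memory model. `RollupDemo` is hypothetical and greatly simplified: integer levels stand in for full row ids, and there are no transactions, notifications, or collisions between siblings:

```java
import java.util.HashMap;
import java.util.Map;

public class RollupDemo {

    // level -> {seen, wait}; a stand-in for the count:seen / count:wait columns
    static Map<Integer, long[]> table = new HashMap<>();
    static final int STOP_LEVEL = 0;

    static long[] cell(int level) {
        return table.computeIfAbsent(level, l -> new long[2]);
    }

    // Loader side: each new unique number adds 1 to wait at the leaf level.
    static void load(int leafLevel) {
        cell(leafLevel)[1] += 1;
    }

    // Observer side: fold wait into seen, then push the delta to the parent.
    static void process(int level) {
        long[] c = cell(level);
        long wait = c[1];
        if (wait > 0) {
            c[0] += wait;
            c[1] = 0;
            if (level > STOP_LEVEL) {
                cell(level - 1)[1] += wait;
            }
        }
    }

    public static void main(String[] args) {
        int leaf = 8;
        load(leaf);
        load(leaf);
        // Drain notifications from the leaves down to the root.
        for (int lvl = leaf; lvl >= STOP_LEVEL; lvl--) {
            process(lvl);
        }
        long[] root = cell(STOP_LEVEL);
        // Root seen + wait equals the number of loaded values.
        System.out.println("root seen=" + root[0] + " wait=" + root[1]);
    }
}
```

The invariant the stress test later checks with Print is visible here: once all pending waits are processed, the root's seen + wait totals match the count of unique numbers loaded at the leaves.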
diff --git a/stresso/src/main/java/stresso/trie/NumberLoader.java b/stresso/src/main/java/stresso/trie/NumberLoader.java
new file mode 100644
index 0000000..961893a
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/NumberLoader.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package stresso.trie;
+
+import java.util.Map;
+
+import org.apache.fluo.api.client.Loader;
+import org.apache.fluo.api.client.TransactionBase;
+import org.apache.fluo.api.data.Column;
+import org.apache.fluo.recipes.core.types.TypedSnapshotBase.Value;
+import org.apache.fluo.recipes.core.types.TypedTransactionBase;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+/**
+ * Executes load transactions that insert numbers into the trie at the leaf node level
+ */
+public class NumberLoader implements Loader {
+
+ private final Number number;
+ private Integer nodeSize = null;
+
+ public NumberLoader(Integer num, int nodeSize) {
+ checkArgument(num >= 0, "Only positive numbers accepted");
+    checkArgument((nodeSize <= 32) && ((32 % nodeSize) == 0), "nodeSize must be a divisor of 32");
+ this.number = num;
+ this.nodeSize = nodeSize;
+ }
+
+ public NumberLoader(Long num) {
+ checkArgument(num >= 0, "Only positive numbers accepted");
+ this.number = num;
+ }
+
+ @Override
+ public void load(TransactionBase tx, Context context) throws Exception {
+
+ if (nodeSize == null) {
+ nodeSize = context.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+      checkArgument((nodeSize <= 64) && ((64 % nodeSize) == 0), "nodeSize must be a divisor of 64");
+ }
+ int level = 64 / nodeSize;
+
+ TypedTransactionBase ttx = Constants.TYPEL.wrap(tx);
+
+ String rowId = new Node(number, level, nodeSize).getRowId();
+
+ Map<Column, Value> colVals =
+ ttx.get().row(rowId).columns(Constants.COUNT_SEEN_COL, Constants.COUNT_WAIT_COL);
+
+ Integer seen = colVals.get(Constants.COUNT_SEEN_COL).toInteger(0);
+ if (seen == 0) {
+ Integer wait = colVals.get(Constants.COUNT_WAIT_COL).toInteger(0);
+ if (wait == 0) {
+ ttx.mutate().row(rowId).col(Constants.COUNT_WAIT_COL).set(1);
+ }
+ }
+ }
+}
diff --git a/stresso/src/main/java/stresso/trie/Print.java b/stresso/src/main/java/stresso/trie/Print.java
new file mode 100644
index 0000000..62c39a8
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Print.java
@@ -0,0 +1,125 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package stresso.trie;
+
+import java.io.File;
+
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.Snapshot;
+import org.apache.fluo.api.client.scanner.ColumnScanner;
+import org.apache.fluo.api.client.scanner.RowScanner;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.config.SimpleConfiguration;
+import org.apache.fluo.api.data.ColumnValue;
+import org.apache.fluo.api.data.Span;
+
+public class Print {
+
+ public static class Stats {
+ public long totalWait = 0;
+ public long totalSeen = 0;
+ public long nodes;
+ public boolean sawOtherNodes = false;
+
+ public Stats() {
+
+ }
+
+ public Stats(long tw, long ts, boolean son) {
+ this.totalWait = tw;
+ this.totalSeen = ts;
+ this.sawOtherNodes = son;
+ }
+
+ public Stats(long tw, long ts, long nodes, boolean son) {
+ this.totalWait = tw;
+ this.totalSeen = ts;
+ this.nodes = nodes;
+ this.sawOtherNodes = son;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o instanceof Stats) {
+ Stats os = (Stats) o;
+
+ return totalWait == os.totalWait && totalSeen == os.totalSeen
+ && sawOtherNodes == os.sawOtherNodes;
+ }
+
+ return false;
+ }
+ }
+
+ public static Stats getStats(SimpleConfiguration config) throws Exception {
+
+ try (FluoClient client = FluoFactory.newClient(config); Snapshot snap = client.newSnapshot()) {
+
+ int level = client.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ int nodeSize = client.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+
+ RowScanner rows = snap.scanner().over(Span.prefix(String.format("%02d:", level)))
+ .fetch(Constants.COUNT_SEEN_COL, Constants.COUNT_WAIT_COL).byRow().build();
+
+
+ long totalSeen = 0;
+ long totalWait = 0;
+
+ int otherNodeSizes = 0;
+
+ long nodes = 0;
+
+ for (ColumnScanner columns : rows) {
+ String row = columns.getsRow();
+ Node node = new Node(row);
+
+ if (node.getNodeSize() == nodeSize) {
+ for (ColumnValue cv : columns) {
+ if (cv.getColumn().equals(Constants.COUNT_SEEN_COL)) {
+ totalSeen += Long.parseLong(cv.getsValue());
+ } else {
+ totalWait += Long.parseLong(cv.getsValue());
+ }
+ }
+
+ nodes++;
+ } else {
+ otherNodeSizes++;
+ }
+ }
+
+ return new Stats(totalWait, totalSeen, nodes, otherNodeSizes != 0);
+ }
+
+ }
+
+ public static void main(String[] args) throws Exception {
+
+ if (args.length != 1) {
+ System.err.println("Usage: " + Print.class.getSimpleName() + " <fluo props>");
+ System.exit(-1);
+ }
+
+ Stats stats = getStats(new FluoConfiguration(new File(args[0])));
+
+ System.out.println("Total at root : " + (stats.totalSeen + stats.totalWait));
+ System.out.println("Nodes Scanned : " + stats.nodes);
+
+ if (stats.sawOtherNodes) {
+ System.err.println("WARN : Other node sizes were seen and ignored.");
+ }
+ }
+}
diff --git a/stresso/src/main/java/stresso/trie/Split.java b/stresso/src/main/java/stresso/trie/Split.java
new file mode 100644
index 0000000..df8e28f
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Split.java
@@ -0,0 +1,141 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package stresso.trie;
+
+import java.io.ByteArrayInputStream;
+import java.io.File;
+import java.nio.charset.StandardCharsets;
+import java.util.Map.Entry;
+import java.util.Properties;
+import java.util.Set;
+import java.util.TreeSet;
+
+import com.google.common.base.Strings;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.core.util.AccumuloUtil;
+import org.apache.hadoop.io.Text;
+
+public class Split {
+
+ private static final String RGB_CLASS =
+ "org.apache.accumulo.server.master.balancer.RegexGroupBalancer";
+ private static final String RGB_PATTERN_PROP = "table.custom.balancer.group.regex.pattern";
+ private static final String RGB_DEFAULT_PROP = "table.custom.balancer.group.regex.default";
+ private static final String TABLE_BALANCER_PROP = "table.balancer";
+
+ public static void main(String[] args) throws Exception {
+ if (args.length != 3) {
+ System.err.println("Usage: " + Split.class.getSimpleName()
+ + " <fluo props> <table props> <tablets per level>");
+ System.exit(-1);
+ }
+
+ FluoConfiguration config = new FluoConfiguration(new File(args[0]));
+
+ int maxTablets = Integer.parseInt(args[2]);
+
+ int nodeSize;
+ int stopLevel;
+ try (FluoClient client = FluoFactory.newClient(config)) {
+ nodeSize = client.getAppConfiguration().getInt(Constants.NODE_SIZE_PROP);
+ stopLevel = client.getAppConfiguration().getInt(Constants.STOP_LEVEL_PROP);
+ }
+
+ setupBalancer(config);
+
+ int level = 64 / nodeSize;
+
+ while (level >= stopLevel) {
+ int numTablets = maxTablets;
+ if (numTablets == 0)
+ break;
+
+ TreeSet<Text> splits = genSplits(level, numTablets);
+ addSplits(config, splits);
+ System.out.printf("Added %d tablets for level %d\n", numTablets, level);
+
+ level--;
+ }
+
+ optimizeAccumulo(config, args[1]);
+ }
+
+ private static void optimizeAccumulo(FluoConfiguration config, String tableProps)
+ throws Exception {
+ Connector conn = AccumuloUtil.getConnector(config);
+
+ Properties tprops = new Properties();
+ tprops.load(new ByteArrayInputStream(tableProps.getBytes(StandardCharsets.UTF_8)));
+
+ Set<Entry<Object, Object>> es = tprops.entrySet();
+ for (Entry<Object, Object> e : es) {
+ conn.tableOperations().setProperty(config.getAccumuloTable(), e.getKey().toString(),
+ e.getValue().toString());
+ }
+ try {
+ conn.instanceOperations().setProperty("table.durability", "flush");
+ conn.tableOperations().removeProperty("accumulo.metadata", "table.durability");
+ conn.tableOperations().removeProperty("accumulo.root", "table.durability");
+ } catch (AccumuloException e) {
+ System.err.println(
+ "Unable to set durability settings (error expected in Accumulo 1.6) : " + e.getMessage());
+ }
+ }
+
+ private static void setupBalancer(FluoConfiguration config) throws AccumuloSecurityException {
+ Connector conn = AccumuloUtil.getConnector(config);
+
+ try {
+ // setting this prop first intentionally because it should fail in 1.6
+ conn.tableOperations().setProperty(config.getAccumuloTable(), RGB_PATTERN_PROP, "(\\d\\d).*");
+ conn.tableOperations().setProperty(config.getAccumuloTable(), RGB_DEFAULT_PROP, "none");
+ conn.tableOperations().setProperty(config.getAccumuloTable(), TABLE_BALANCER_PROP, RGB_CLASS);
+ System.out.println("Setup tablet group balancer");
+ } catch (AccumuloException e) {
+ System.err.println(
+ "Unable to setup tablet balancer (error expected in Accumulo 1.6) : " + e.getMessage());
+ }
+ }
+
+ private static TreeSet<Text> genSplits(int level, int numTablets) {
+
+ TreeSet<Text> splits = new TreeSet<>();
+
+ String ls = String.format("%02d:", level);
+
+ int numSplits = numTablets - 1;
+ int distance = (((int) Math.pow(Character.MAX_RADIX, Node.HASH_LEN) - 1) / numTablets) + 1;
+ int split = distance;
+ for (int i = 0; i < numSplits; i++) {
+ splits.add(new Text(
+ ls + Strings.padStart(Integer.toString(split, Character.MAX_RADIX), Node.HASH_LEN, '0')));
+ split += distance;
+ }
+
+ splits.add(new Text(ls + "~"));
+
+ return splits;
+ }
+
+ private static void addSplits(FluoConfiguration config, TreeSet<Text> splits) throws Exception {
+ Connector conn = AccumuloUtil.getConnector(config);
+ conn.tableOperations().addSplits(config.getAccumuloTable(), splits);
+ }
+}
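The cut-point math in `genSplits` above can be sketched on its own. `SplitDemo` is a hypothetical class written for illustration; it mirrors the evenly spaced base-36 split calculation, assuming `HASH_LEN = 4` as in Node:

```java
import java.util.TreeSet;

public class SplitDemo {

    static final int HASH_LEN = 4; // matches Node.HASH_LEN

    // Mirrors Split.genSplits: evenly spaced base-36 cut points for one level.
    static TreeSet<String> genSplits(int level, int numTablets) {
        TreeSet<String> splits = new TreeSet<>();
        String ls = String.format("%02d:", level);

        // Divide the base-36 hash space (36^HASH_LEN values) into numTablets bins.
        int distance = (((int) Math.pow(Character.MAX_RADIX, HASH_LEN) - 1) / numTablets) + 1;
        int split = distance;
        for (int i = 0; i < numTablets - 1; i++) {
            String s = Integer.toString(split, Character.MAX_RADIX);
            while (s.length() < HASH_LEN) {
                s = "0" + s; // left-pad to HASH_LEN, like Strings.padStart
            }
            splits.add(ls + s);
            split += distance;
        }

        // '~' sorts after all base-36 digits, closing off the level's range.
        splits.add(ls + "~");
        return splits;
    }

    public static void main(String[] args) {
        // Four tablets for level 3 → three interior cut points plus the '~' cap.
        System.out.println(genSplits(3, 4));
        // → [03:9000, 03:i000, 03:r000, 03:~]
    }
}
```

Because row ids start with the zero-padded level followed by the base-36 hash, these splits keep each trie level in its own group of tablets, which is what the regex group balancer configured in `setupBalancer` relies on.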
diff --git a/stresso/src/main/java/stresso/trie/Unique.java b/stresso/src/main/java/stresso/trie/Unique.java
new file mode 100644
index 0000000..ef0d1cc
--- /dev/null
+++ b/stresso/src/main/java/stresso/trie/Unique.java
@@ -0,0 +1,104 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package stresso.trie;
+
+import java.io.IOException;
+import java.util.Iterator;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapred.JobClient;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.MapReduceBase;
+import org.apache.hadoop.mapred.OutputCollector;
+import org.apache.hadoop.mapred.Reducer;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapred.RunningJob;
+import org.apache.hadoop.mapred.SequenceFileInputFormat;
+import org.apache.hadoop.mapred.lib.NullOutputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class Unique extends Configured implements Tool {
+
+ private static final Logger log = LoggerFactory.getLogger(Unique.class);
+
+  public enum Stats {
+    UNIQUE
+  }
+
+ public static class UniqueReducer extends MapReduceBase implements Reducer<LongWritable,NullWritable,LongWritable,NullWritable> {
+ @Override
+ public void reduce(LongWritable key, Iterator<NullWritable> values, OutputCollector<LongWritable,NullWritable> output, Reporter reporter) throws IOException {
+ reporter.getCounter(Stats.UNIQUE).increment(1);
+ }
+ }
+
+
+ private static int numUnique = 0;
+
+ @VisibleForTesting
+ public static int getNumUnique() {
+ return numUnique;
+ }
+
+ @Override
+ public int run(String[] args) throws Exception {
+
+ if (args.length < 1) {
+      log.error("Usage: " + this.getClass().getSimpleName() + " <input dir> [<input dir> ...]");
+ System.exit(-1);
+ }
+
+ JobConf job = new JobConf(getConf());
+
+ job.setJobName(Unique.class.getName());
+ job.setJarByClass(Unique.class);
+
+ job.setInputFormat(SequenceFileInputFormat.class);
+ for (String arg : args) {
+ SequenceFileInputFormat.addInputPath(job, new Path(arg));
+ }
+
+ job.setMapOutputKeyClass(LongWritable.class);
+ job.setMapOutputValueClass(NullWritable.class);
+
+ job.setReducerClass(UniqueReducer.class);
+
+ job.setOutputFormat(NullOutputFormat.class);
+
+ RunningJob runningJob = JobClient.runJob(job);
+ runningJob.waitForCompletion();
+ numUnique = (int) runningJob.getCounters().getCounter(Stats.UNIQUE);
+
+    log.debug("numUnique : " + numUnique);
+
+ return runningJob.isSuccessful() ? 0 : -1;
+
+ }
+
+ public static void main(String[] args) throws Exception {
+ int ret = ToolRunner.run(new Unique(), args);
+ System.exit(ret);
+ }
+
+}
diff --git a/stresso/src/test/java/stresso/ITBase.java b/stresso/src/test/java/stresso/ITBase.java
new file mode 100644
index 0000000..99a0b02
--- /dev/null
+++ b/stresso/src/test/java/stresso/ITBase.java
@@ -0,0 +1,135 @@
+/*
+ * Copyright 2017 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+
+package stresso;
+
+import java.io.File;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.Instance;
+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
+import org.apache.accumulo.minicluster.MiniAccumuloCluster;
+import org.apache.accumulo.minicluster.MiniAccumuloConfig;
+import org.apache.accumulo.minicluster.MiniAccumuloInstance;
+import org.apache.commons.io.FileUtils;
+import org.apache.fluo.api.client.FluoAdmin;
+import org.apache.fluo.api.client.FluoAdmin.InitializationOptions;
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.mini.MiniFluo;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+
+public class ITBase {
+
+ protected final static String USER = "root";
+ protected final static String PASSWORD = "ITSecret";
+ protected final static String TABLE_BASE = "table";
+ protected final static String IT_INSTANCE_NAME_PROP = FluoConfiguration.FLUO_PREFIX
+ + ".it.instance.name";
+ protected final static String IT_INSTANCE_CLEAR_PROP = FluoConfiguration.FLUO_PREFIX
+ + ".it.instance.clear";
+
+ protected static String instanceName;
+ protected static Connector conn;
+ protected static Instance miniAccumulo;
+ private static MiniAccumuloCluster cluster;
+ private static boolean startedCluster = false;
+
+ private static AtomicInteger tableCounter = new AtomicInteger(1);
+ protected static AtomicInteger testCounter = new AtomicInteger();
+
+ protected FluoConfiguration config;
+ protected FluoClient client;
+ protected MiniFluo miniFluo;
+
+ @BeforeClass
+ public static void setUpAccumulo() throws Exception {
+ instanceName = System.getProperty(IT_INSTANCE_NAME_PROP, "it-instance-default");
+ File instanceDir = new File("target/accumulo-maven-plugin/" + instanceName);
+ boolean instanceClear =
+ System.getProperty(IT_INSTANCE_CLEAR_PROP, "true").equalsIgnoreCase("true");
+ if (instanceDir.exists() && instanceClear) {
+ FileUtils.deleteDirectory(instanceDir);
+ }
+ if (!instanceDir.exists()) {
+ MiniAccumuloConfig cfg = new MiniAccumuloConfig(instanceDir, PASSWORD);
+ cfg.setInstanceName(instanceName);
+ cluster = new MiniAccumuloCluster(cfg);
+ cluster.start();
+ startedCluster = true;
+ }
+ miniAccumulo = new MiniAccumuloInstance(instanceName, instanceDir);
+ conn = miniAccumulo.getConnector(USER, new PasswordToken(PASSWORD));
+ }
+
+
+ @AfterClass
+ public static void tearDownAccumulo() throws Exception {
+ if (startedCluster) {
+ cluster.stop();
+ }
+ }
+
+ protected void preInit(FluoConfiguration config){}
+
+ public String getCurTableName() {
+ return TABLE_BASE + tableCounter.get();
+ }
+
+ public String getNextTableName() {
+ return TABLE_BASE + tableCounter.incrementAndGet();
+ }
+
+ @Before
+ public void setUpFluo() throws Exception {
+
+ config = new FluoConfiguration();
+ config.setApplicationName("mini-test" + testCounter.getAndIncrement());
+ config.setAccumuloInstance(miniAccumulo.getInstanceName());
+ config.setAccumuloUser(USER);
+ config.setAccumuloPassword(PASSWORD);
+ config.setAccumuloZookeepers(miniAccumulo.getZooKeepers());
+ config.setInstanceZookeepers(miniAccumulo.getZooKeepers() + "/fluo");
+ config.setMiniStartAccumulo(false);
+ config.setAccumuloTable(getNextTableName());
+ config.setWorkerThreads(5);
+ preInit(config);
+
+ config.setTransactionRollbackTime(1, TimeUnit.SECONDS);
+
+ try (FluoAdmin admin = FluoFactory.newAdmin(config)) {
+ InitializationOptions opts =
+ new InitializationOptions().setClearZookeeper(true).setClearTable(true);
+ admin.initialize(opts);
+ }
+
+ config.getAppConfiguration().clear();
+
+ client = FluoFactory.newClient(config);
+ miniFluo = FluoFactory.newMiniFluo(config);
+ }
+
+ @After
+ public void tearDownFluo() throws Exception {
+ miniFluo.close();
+ client.close();
+ }
+}
diff --git a/stresso/src/test/java/stresso/TrieBasicIT.java b/stresso/src/test/java/stresso/TrieBasicIT.java
new file mode 100644
index 0000000..eba5d3e
--- /dev/null
+++ b/stresso/src/test/java/stresso/TrieBasicIT.java
@@ -0,0 +1,120 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package stresso;
+
+import java.util.HashSet;
+import java.util.Random;
+import java.util.Set;
+
+import org.apache.fluo.api.client.FluoClient;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.client.LoaderExecutor;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.config.ObserverSpecification;
+import org.apache.fluo.recipes.core.types.TypedSnapshot;
+import org.apache.fluo.recipes.test.FluoITHelper;
+import org.junit.Assert;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import stresso.trie.Constants;
+import stresso.trie.Node;
+import stresso.trie.NodeObserver;
+import stresso.trie.NumberLoader;
+
+import static stresso.trie.Constants.COUNT_SEEN_COL;
+import static stresso.trie.Constants.TYPEL;
+
+/**
+ * Tests Trie Stress Test using Basic Loader
+ */
+public class TrieBasicIT extends ITBase {
+
+ private static final Logger log = LoggerFactory.getLogger(TrieBasicIT.class);
+
+ @Override
+ protected void preInit(FluoConfiguration conf) {
+ conf.addObserver(new ObserverSpecification(NodeObserver.class.getName()));
+ conf.getAppConfiguration().setProperty(Constants.STOP_LEVEL_PROP, 0);
+ }
+
+ @Test
+ public void testBit32() throws Exception {
+ runTrieTest(20, Integer.MAX_VALUE, 32);
+ }
+
+ @Test
+ public void testBit8() throws Exception {
+ runTrieTest(25, Integer.MAX_VALUE, 8);
+ }
+
+ @Test
+ public void testBit4() throws Exception {
+ runTrieTest(10, Integer.MAX_VALUE, 4);
+ }
+
+ @Test
+ public void testBit() throws Exception {
+ runTrieTest(5, Integer.MAX_VALUE, 1);
+ }
+
+ @Test
+ public void testDuplicates() throws Exception {
+ runTrieTest(20, 10, 4);
+ }
+
+ private void runTrieTest(int ingestNum, int maxValue, int nodeSize) throws Exception {
+
+ log.info("Ingesting " + ingestNum + " unique numbers with a nodeSize of " + nodeSize + " bits");
+
+ config.setLoaderThreads(0);
+ config.setLoaderQueueSize(0);
+
+ try (FluoClient fluoClient = FluoFactory.newClient(config)) {
+
+ int uniqueNum;
+
+ try (LoaderExecutor le = client.newLoaderExecutor()) {
+ Random random = new Random();
+ Set<Integer> ingested = new HashSet<>();
+ for (int i = 0; i < ingestNum; i++) {
+ int num = Math.abs(random.nextInt(maxValue));
+ le.execute(new NumberLoader(num, nodeSize));
+ ingested.add(num);
+ }
+
+ uniqueNum = ingested.size();
+ log.info(
+ "Ingested " + uniqueNum + " unique numbers with a nodeSize of " + nodeSize + " bits");
+ }
+
+ miniFluo.waitForObservers();
+
+ try (TypedSnapshot tsnap = TYPEL.wrap(client.newSnapshot())) {
+ Integer result =
+ tsnap.get().row(Node.generateRootId(nodeSize)).col(COUNT_SEEN_COL).toInteger();
+        if (result == null) {
+          log.error("Could not find root node");
+          FluoITHelper.printFluoTable(client);
+          Assert.fail("Could not find root node");
+        } else if (!result.equals(uniqueNum)) {
+ log.error(
+ "Count (" + result + ") at root node does not match expected (" + uniqueNum + "):");
+ FluoITHelper.printFluoTable(client);
+ }
+ Assert.assertEquals(uniqueNum, result.intValue());
+ }
+ }
+ }
+}
diff --git a/stresso/src/test/java/stresso/TrieMapRedIT.java b/stresso/src/test/java/stresso/TrieMapRedIT.java
new file mode 100644
index 0000000..62afea1
--- /dev/null
+++ b/stresso/src/test/java/stresso/TrieMapRedIT.java
@@ -0,0 +1,136 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package stresso;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.Arrays;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.fluo.api.client.FluoFactory;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.config.ObserverSpecification;
+import org.apache.fluo.api.config.SimpleConfiguration;
+import org.apache.fluo.api.mini.MiniFluo;
+import org.apache.hadoop.util.ToolRunner;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import stresso.trie.Constants;
+import stresso.trie.Generate;
+import stresso.trie.Init;
+import stresso.trie.Load;
+import stresso.trie.NodeObserver;
+import stresso.trie.Print;
+import stresso.trie.Unique;
+
+/**
+ * Tests Trie Stress Test using MapReduce Ingest
+ */
+public class TrieMapRedIT extends ITBase {
+
+ @Override
+ protected void preInit(FluoConfiguration conf) {
+ conf.addObserver(new ObserverSpecification(NodeObserver.class.getName()));
+
+ SimpleConfiguration appCfg = conf.getAppConfiguration();
+ appCfg.setProperty(Constants.STOP_LEVEL_PROP, 0);
+ appCfg.setProperty(Constants.NODE_SIZE_PROP, 8);
+ }
+
+ static void generate(int numMappers, int numPerMapper, int max, File out1) throws Exception {
+ int ret = ToolRunner.run(new Generate(),
+ new String[] {"-D", "mapred.job.tracker=local", "-D", "fs.defaultFS=file:///",
+ "" + numMappers, numPerMapper + "", max + "", out1.toURI().toString()});
+ Assert.assertEquals(0, ret);
+ }
+
+ static void load(int nodeSize, File fluoPropsFile, File input) throws Exception {
+ int ret = ToolRunner.run(new Load(), new String[] {"-D", "mapred.job.tracker=local", "-D",
+ "fs.defaultFS=file:///", fluoPropsFile.getAbsolutePath(), input.toURI().toString()});
+ Assert.assertEquals(0, ret);
+ }
+
+ static void init(int nodeSize, File fluoPropsFile, File input, File tmp) throws Exception {
+ int ret = ToolRunner.run(new Init(),
+ new String[] {"-D", "mapred.job.tracker=local", "-D", "fs.defaultFS=file:///",
+ fluoPropsFile.getAbsolutePath(), input.toURI().toString(), tmp.toURI().toString()});
+ Assert.assertEquals(0, ret);
+ }
+
+ static int unique(File... dirs) throws Exception {
+
+ ArrayList<String> args = new ArrayList<>(
+ Arrays.asList("-D", "mapred.job.tracker=local", "-D", "fs.defaultFS=file:///"));
+ for (File dir : dirs) {
+ args.add(dir.toURI().toString());
+ }
+
+ int ret = ToolRunner.run(new Unique(), args.toArray(new String[args.size()]));
+ Assert.assertEquals(0, ret);
+ return Unique.getNumUnique();
+ }
+
+ @Test
+ public void testEndToEnd() throws Exception {
+ File testDir = new File("target/MRIT");
+ FileUtils.deleteQuietly(testDir);
+ testDir.mkdirs();
+ File fluoPropsFile = new File(testDir, "fluo.props");
+
+ config.save(fluoPropsFile);
+
+ File out1 = new File(testDir, "nums-1");
+
+ generate(2, 100, 500, out1);
+ init(8, fluoPropsFile, out1, new File(testDir, "initTmp"));
+ int ucount = unique(out1);
+
+ Assert.assertTrue(ucount > 0);
+
+ miniFluo.waitForObservers();
+
+ Assert.assertEquals(new Print.Stats(0, ucount, false), Print.getStats(config));
+
+ // reload same data
+ load(8, fluoPropsFile, out1);
+
+ miniFluo.waitForObservers();
+
+ Assert.assertEquals(new Print.Stats(0, ucount, false), Print.getStats(config));
+
+ // load some new data
+ File out2 = new File(testDir, "nums-2");
+ generate(2, 100, 500, out2);
+ load(8, fluoPropsFile, out2);
+ int ucount2 = unique(out1, out2);
+ Assert.assertTrue(ucount2 > ucount); // used > because the probability that no new numbers are
+ // chosen is exceedingly small
+
+ miniFluo.waitForObservers();
+
+ Assert.assertEquals(new Print.Stats(0, ucount2, false), Print.getStats(config));
+
+ File out3 = new File(testDir, "nums-3");
+ generate(2, 100, 500, out3);
+ load(8, fluoPropsFile, out3);
+ int ucount3 = unique(out1, out2, out3);
+ Assert.assertTrue(ucount3 > ucount2); // used > because the probability that no new numbers are
+ // chosen is exceedingly small
+
+ miniFluo.waitForObservers();
+
+ Assert.assertEquals(new Print.Stats(0, ucount3, false), Print.getStats(config));
+ }
+}
diff --git a/stresso/src/test/java/stresso/TrieStopLevelIT.java b/stresso/src/test/java/stresso/TrieStopLevelIT.java
new file mode 100644
index 0000000..9c85a27
--- /dev/null
+++ b/stresso/src/test/java/stresso/TrieStopLevelIT.java
@@ -0,0 +1,48 @@
+/*
+ * Copyright 2014 Stresso authors (see AUTHORS)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
+ * in compliance with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package stresso;
+
+import org.apache.fluo.api.client.Snapshot;
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.config.ObserverSpecification;
+import org.apache.fluo.api.config.SimpleConfiguration;
+import org.apache.fluo.api.data.Bytes;
+import org.junit.Assert;
+import org.junit.Test;
+import stresso.trie.Constants;
+import stresso.trie.Node;
+import stresso.trie.NodeObserver;
+
+public class TrieStopLevelIT extends TrieMapRedIT {
+
+ @Override
+ protected void preInit(FluoConfiguration conf) {
+ conf.addObserver(new ObserverSpecification(NodeObserver.class.getName()));
+
+ SimpleConfiguration appCfg = conf.getAppConfiguration();
+ appCfg.setProperty(Constants.STOP_LEVEL_PROP, 7);
+ appCfg.setProperty(Constants.NODE_SIZE_PROP, 8);
+ }
+
+ @Test
+ public void testEndToEnd() throws Exception {
+ super.testEndToEnd();
+ try (Snapshot snap = client.newSnapshot()) {
+ Bytes row = Bytes.of(Node.generateRootId(8));
+ Assert.assertNull(snap.get(row, Constants.COUNT_SEEN_COL));
+ Assert.assertNull(snap.get(row, Constants.COUNT_WAIT_COL));
+ }
+ }
+}
diff --git a/stresso/src/test/resources/log4j.properties b/stresso/src/test/resources/log4j.properties
new file mode 100644
index 0000000..adf8e2e
--- /dev/null
+++ b/stresso/src/test/resources/log4j.properties
@@ -0,0 +1,31 @@
+# Copyright 2014 Stresso authors (see AUTHORS)
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+log4j.rootLogger=INFO, CA
+log4j.appender.CA=org.apache.log4j.ConsoleAppender
+log4j.appender.CA.layout=org.apache.log4j.PatternLayout
+log4j.appender.CA.layout.ConversionPattern=%d{ISO8601} [%c{2}] %-5p: %m%n
+
+log4j.logger.org.apache.curator=ERROR
+log4j.logger.org.apache.accumulo=WARN
+log4j.logger.org.apache.commons.vfs2.impl.DefaultFileSystemManager=WARN
+log4j.logger.org.apache.fluo=WARN
+log4j.logger.org.apache.hadoop=WARN
+log4j.logger.org.apache.hadoop.conf=ERROR
+log4j.logger.org.apache.hadoop.mapred=ERROR
+log4j.logger.org.apache.hadoop.mapreduce=ERROR
+log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
+log4j.logger.org.apache.zookeeper.ClientCnxn=FATAL
+log4j.logger.org.apache.zookeeper.ZooKeeper=WARN
+log4j.logger.stresso=WARN
diff --git a/.gitignore b/webindex/.gitignore
similarity index 100%
rename from .gitignore
rename to webindex/.gitignore
diff --git a/.travis.yml b/webindex/.travis.yml
similarity index 100%
rename from .travis.yml
rename to webindex/.travis.yml
diff --git a/AUTHORS b/webindex/AUTHORS
similarity index 100%
rename from AUTHORS
rename to webindex/AUTHORS
diff --git a/webindex/LICENSE b/webindex/LICENSE
new file mode 100644
index 0000000..8f71f43
--- /dev/null
+++ b/webindex/LICENSE
@@ -0,0 +1,202 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+       boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
diff --git a/webindex/README.md b/webindex/README.md
new file mode 100644
index 0000000..f869d20
--- /dev/null
+++ b/webindex/README.md
@@ -0,0 +1,76 @@
+![Webindex][logo]
+---
+[![Build Status][ti]][tl] [![Apache License][li]][ll]
+
+Webindex is an example [Apache Fluo][fluo] application that incrementally indexes links to web pages
+in multiple ways. If you are new to Fluo, you may want to start with the [Fluo tour][tour] as the
+WebIndex application is more complicated. For more information on how the WebIndex application
+works, view the [tables](docs/tables.md) and [code](docs/code-guide.md) documentation.
+
+Webindex utilizes multiple projects. [Common Crawl][cc] web crawl data is used as the input.
+[Apache Spark][spark] is used to initialize Fluo and incrementally load data into Fluo. [Apache
+Accumulo][accumulo] is used to hold the indexes and Fluo's data. Fluo is used to continuously
+combine new and historical information about web pages and update an external index when changes
+occur. Webindex has a simple UI built using [Spark Java][sparkjava] that allows querying the indexes.
+
+Below is a video showing repeated queries of stackoverflow.com while Webindex ran for three days
+on EC2. The video was made by querying the Webindex instance periodically and taking a
+screenshot. More details about this video are available in this [blog post][bp].
+
+[![Querying stackoverflow.com](http://img.youtube.com/vi/mJJNJbPN2EI/0.jpg)](http://www.youtube.com/watch?v=mJJNJbPN2EI)
+
+## Running WebIndex
+
+If you are new to WebIndex, the simplest way to run the application is to run the development
+server. First, clone the WebIndex repo:
+
+ git clone https://github.com/astralway/webindex.git
+
+Next, on a machine where Java and Maven are installed, run the development server using the
+`webindex` command:
+
+ cd webindex/
+ ./bin/webindex dev
+
+This will build and start the development server, which logs to the console. This 'dev' command
+has several command line options that can be viewed by running it with `-h`. When you want to
+terminate the server, press `CTRL-c`.
+
+The development server starts a MiniAccumuloCluster and runs MiniFluo on top of it. It parses a
+CommonCrawl data file and creates a file at `data/1000-pages.txt` with 1000 pages that are loaded
+into MiniFluo. The number of pages loaded can be changed to 5000 by using the command below:
+
+ ./bin/webindex dev --pages 5000
+
+The pages are processed by Fluo which exports indexes to Accumulo. The development server also
+starts a web application at [http://localhost:4567](http://localhost:4567) that queries indexes in
+Accumulo.
+
+If you would like to run WebIndex on a cluster, follow the [install] instructions.
+
+### Viewing metrics
+
+Metrics can be sent from the development server to InfluxDB and viewed in Grafana. You can either
+set up InfluxDB+Grafana on your own or use the [Uno] command `uno setup metrics`. After a metrics
+server is started, start the development server with the `--metrics` option to begin sending metrics:
+
+ ./bin/webindex dev --metrics
+
+Fluo metrics can be viewed in Grafana. To view application-specific metrics for Webindex, import
+the WebIndex Grafana dashboard located at `contrib/webindex-dashboard.json`.
+
+[tour]: https://fluo.apache.org/tour/
+[sparkjava]: http://sparkjava.com/
+[spark]: https://spark.apache.org/
+[accumulo]: https://accumulo.apache.org/
+[fluo]: https://fluo.apache.org/
+[pc]: https://github.com/astralway/phrasecount
+[Uno]: https://github.com/astralway/uno
+[cc]: https://commoncrawl.org/
+[install]: docs/install.md
+[ti]: https://travis-ci.org/astralway/webindex.svg?branch=master
+[tl]: https://travis-ci.org/astralway/webindex
+[li]: http://img.shields.io/badge/license-ASL-blue.svg
+[ll]: https://github.com/astralway/webindex/blob/master/LICENSE
+[logo]: contrib/webindex.png
+[bp]: https://fluo.apache.org/blog/2016/01/11/webindex-long-run/#videos-from-run
diff --git a/bin/impl/base.sh b/webindex/bin/impl/base.sh
similarity index 100%
rename from bin/impl/base.sh
rename to webindex/bin/impl/base.sh
diff --git a/bin/impl/init.sh b/webindex/bin/impl/init.sh
similarity index 100%
rename from bin/impl/init.sh
rename to webindex/bin/impl/init.sh
diff --git a/bin/webindex b/webindex/bin/webindex
similarity index 100%
rename from bin/webindex
rename to webindex/bin/webindex
diff --git a/conf/.gitignore b/webindex/conf/.gitignore
similarity index 100%
rename from conf/.gitignore
rename to webindex/conf/.gitignore
diff --git a/conf/examples/log4j.properties b/webindex/conf/examples/log4j.properties
similarity index 100%
rename from conf/examples/log4j.properties
rename to webindex/conf/examples/log4j.properties
diff --git a/conf/examples/webindex-env.sh b/webindex/conf/examples/webindex-env.sh
similarity index 100%
rename from conf/examples/webindex-env.sh
rename to webindex/conf/examples/webindex-env.sh
diff --git a/conf/examples/webindex.yml b/webindex/conf/examples/webindex.yml
similarity index 100%
rename from conf/examples/webindex.yml
rename to webindex/conf/examples/webindex.yml
diff --git a/contrib/webindex-dashboard.json b/webindex/contrib/webindex-dashboard.json
similarity index 100%
rename from contrib/webindex-dashboard.json
rename to webindex/contrib/webindex-dashboard.json
diff --git a/contrib/webindex.png b/webindex/contrib/webindex.png
similarity index 100%
rename from contrib/webindex.png
rename to webindex/contrib/webindex.png
Binary files differ
diff --git a/contrib/webindex.svg b/webindex/contrib/webindex.svg
similarity index 100%
rename from contrib/webindex.svg
rename to webindex/contrib/webindex.svg
diff --git a/docs/code-guide.md b/webindex/docs/code-guide.md
similarity index 100%
rename from docs/code-guide.md
rename to webindex/docs/code-guide.md
diff --git a/docs/install.md b/webindex/docs/install.md
similarity index 100%
rename from docs/install.md
rename to webindex/docs/install.md
diff --git a/docs/tables.md b/webindex/docs/tables.md
similarity index 100%
rename from docs/tables.md
rename to webindex/docs/tables.md
diff --git a/docs/webindex_graphic.png b/webindex/docs/webindex_graphic.png
similarity index 100%
rename from docs/webindex_graphic.png
rename to webindex/docs/webindex_graphic.png
Binary files differ
diff --git a/modules/core/pom.xml b/webindex/modules/core/pom.xml
similarity index 100%
rename from modules/core/pom.xml
rename to webindex/modules/core/pom.xml
diff --git a/modules/core/src/main/java/webindex/core/Constants.java b/webindex/modules/core/src/main/java/webindex/core/Constants.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/Constants.java
rename to webindex/modules/core/src/main/java/webindex/core/Constants.java
diff --git a/modules/core/src/main/java/webindex/core/IndexClient.java b/webindex/modules/core/src/main/java/webindex/core/IndexClient.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/IndexClient.java
rename to webindex/modules/core/src/main/java/webindex/core/IndexClient.java
diff --git a/modules/core/src/main/java/webindex/core/WebIndexConfig.java b/webindex/modules/core/src/main/java/webindex/core/WebIndexConfig.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/WebIndexConfig.java
rename to webindex/modules/core/src/main/java/webindex/core/WebIndexConfig.java
diff --git a/modules/core/src/main/java/webindex/core/models/DomainStats.java b/webindex/modules/core/src/main/java/webindex/core/models/DomainStats.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/DomainStats.java
rename to webindex/modules/core/src/main/java/webindex/core/models/DomainStats.java
diff --git a/modules/core/src/main/java/webindex/core/models/Link.java b/webindex/modules/core/src/main/java/webindex/core/models/Link.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/Link.java
rename to webindex/modules/core/src/main/java/webindex/core/models/Link.java
diff --git a/modules/core/src/main/java/webindex/core/models/Links.java b/webindex/modules/core/src/main/java/webindex/core/models/Links.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/Links.java
rename to webindex/modules/core/src/main/java/webindex/core/models/Links.java
diff --git a/modules/core/src/main/java/webindex/core/models/Page.java b/webindex/modules/core/src/main/java/webindex/core/models/Page.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/Page.java
rename to webindex/modules/core/src/main/java/webindex/core/models/Page.java
diff --git a/modules/core/src/main/java/webindex/core/models/Pages.java b/webindex/modules/core/src/main/java/webindex/core/models/Pages.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/Pages.java
rename to webindex/modules/core/src/main/java/webindex/core/models/Pages.java
diff --git a/modules/core/src/main/java/webindex/core/models/TopResults.java b/webindex/modules/core/src/main/java/webindex/core/models/TopResults.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/TopResults.java
rename to webindex/modules/core/src/main/java/webindex/core/models/TopResults.java
diff --git a/modules/core/src/main/java/webindex/core/models/URL.java b/webindex/modules/core/src/main/java/webindex/core/models/URL.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/URL.java
rename to webindex/modules/core/src/main/java/webindex/core/models/URL.java
diff --git a/modules/core/src/main/java/webindex/core/models/UriInfo.java b/webindex/modules/core/src/main/java/webindex/core/models/UriInfo.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/UriInfo.java
rename to webindex/modules/core/src/main/java/webindex/core/models/UriInfo.java
diff --git a/modules/core/src/main/java/webindex/core/models/export/DomainUpdate.java b/webindex/modules/core/src/main/java/webindex/core/models/export/DomainUpdate.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/export/DomainUpdate.java
rename to webindex/modules/core/src/main/java/webindex/core/models/export/DomainUpdate.java
diff --git a/modules/core/src/main/java/webindex/core/models/export/IndexUpdate.java b/webindex/modules/core/src/main/java/webindex/core/models/export/IndexUpdate.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/export/IndexUpdate.java
rename to webindex/modules/core/src/main/java/webindex/core/models/export/IndexUpdate.java
diff --git a/modules/core/src/main/java/webindex/core/models/export/PageUpdate.java b/webindex/modules/core/src/main/java/webindex/core/models/export/PageUpdate.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/export/PageUpdate.java
rename to webindex/modules/core/src/main/java/webindex/core/models/export/PageUpdate.java
diff --git a/modules/core/src/main/java/webindex/core/models/export/UriUpdate.java b/webindex/modules/core/src/main/java/webindex/core/models/export/UriUpdate.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/models/export/UriUpdate.java
rename to webindex/modules/core/src/main/java/webindex/core/models/export/UriUpdate.java
diff --git a/modules/core/src/main/java/webindex/core/util/Pager.java b/webindex/modules/core/src/main/java/webindex/core/util/Pager.java
similarity index 100%
rename from modules/core/src/main/java/webindex/core/util/Pager.java
rename to webindex/modules/core/src/main/java/webindex/core/util/Pager.java
diff --git a/modules/core/src/test/java/webindex/core/WebIndexConfigTest.java b/webindex/modules/core/src/test/java/webindex/core/WebIndexConfigTest.java
similarity index 100%
rename from modules/core/src/test/java/webindex/core/WebIndexConfigTest.java
rename to webindex/modules/core/src/test/java/webindex/core/WebIndexConfigTest.java
diff --git a/modules/core/src/test/java/webindex/core/models/LinkTest.java b/webindex/modules/core/src/test/java/webindex/core/models/LinkTest.java
similarity index 100%
rename from modules/core/src/test/java/webindex/core/models/LinkTest.java
rename to webindex/modules/core/src/test/java/webindex/core/models/LinkTest.java
diff --git a/modules/core/src/test/java/webindex/core/models/PageTest.java b/webindex/modules/core/src/test/java/webindex/core/models/PageTest.java
similarity index 100%
rename from modules/core/src/test/java/webindex/core/models/PageTest.java
rename to webindex/modules/core/src/test/java/webindex/core/models/PageTest.java
diff --git a/modules/core/src/test/java/webindex/core/models/URLTest.java b/webindex/modules/core/src/test/java/webindex/core/models/URLTest.java
similarity index 100%
rename from modules/core/src/test/java/webindex/core/models/URLTest.java
rename to webindex/modules/core/src/test/java/webindex/core/models/URLTest.java
diff --git a/modules/core/src/test/resources/log4j.properties b/webindex/modules/core/src/test/resources/log4j.properties
similarity index 100%
rename from modules/core/src/test/resources/log4j.properties
rename to webindex/modules/core/src/test/resources/log4j.properties
diff --git a/modules/data/pom.xml b/webindex/modules/data/pom.xml
similarity index 100%
rename from modules/data/pom.xml
rename to webindex/modules/data/pom.xml
diff --git a/modules/data/src/main/java/webindex/data/CalcSplits.java b/webindex/modules/data/src/main/java/webindex/data/CalcSplits.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/CalcSplits.java
rename to webindex/modules/data/src/main/java/webindex/data/CalcSplits.java
diff --git a/modules/data/src/main/java/webindex/data/Configure.java b/webindex/modules/data/src/main/java/webindex/data/Configure.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/Configure.java
rename to webindex/modules/data/src/main/java/webindex/data/Configure.java
diff --git a/modules/data/src/main/java/webindex/data/Copy.java b/webindex/modules/data/src/main/java/webindex/data/Copy.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/Copy.java
rename to webindex/modules/data/src/main/java/webindex/data/Copy.java
diff --git a/modules/data/src/main/java/webindex/data/FluoApp.java b/webindex/modules/data/src/main/java/webindex/data/FluoApp.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/FluoApp.java
rename to webindex/modules/data/src/main/java/webindex/data/FluoApp.java
diff --git a/modules/data/src/main/java/webindex/data/Init.java b/webindex/modules/data/src/main/java/webindex/data/Init.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/Init.java
rename to webindex/modules/data/src/main/java/webindex/data/Init.java
diff --git a/modules/data/src/main/java/webindex/data/LoadHdfs.java b/webindex/modules/data/src/main/java/webindex/data/LoadHdfs.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/LoadHdfs.java
rename to webindex/modules/data/src/main/java/webindex/data/LoadHdfs.java
diff --git a/modules/data/src/main/java/webindex/data/LoadS3.java b/webindex/modules/data/src/main/java/webindex/data/LoadS3.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/LoadS3.java
rename to webindex/modules/data/src/main/java/webindex/data/LoadS3.java
diff --git a/modules/data/src/main/java/webindex/data/TestParser.java b/webindex/modules/data/src/main/java/webindex/data/TestParser.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/TestParser.java
rename to webindex/modules/data/src/main/java/webindex/data/TestParser.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/DomainCombineQ.java b/webindex/modules/data/src/main/java/webindex/data/fluo/DomainCombineQ.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/DomainCombineQ.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/DomainCombineQ.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/IndexUpdateTranslator.java b/webindex/modules/data/src/main/java/webindex/data/fluo/IndexUpdateTranslator.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/IndexUpdateTranslator.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/IndexUpdateTranslator.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/PageLoader.java b/webindex/modules/data/src/main/java/webindex/data/fluo/PageLoader.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/PageLoader.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/PageLoader.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/PageObserver.java b/webindex/modules/data/src/main/java/webindex/data/fluo/PageObserver.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/PageObserver.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/PageObserver.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/UriCombineQ.java b/webindex/modules/data/src/main/java/webindex/data/fluo/UriCombineQ.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/UriCombineQ.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/UriCombineQ.java
diff --git a/modules/data/src/main/java/webindex/data/fluo/WebindexObservers.java b/webindex/modules/data/src/main/java/webindex/data/fluo/WebindexObservers.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/fluo/WebindexObservers.java
rename to webindex/modules/data/src/main/java/webindex/data/fluo/WebindexObservers.java
diff --git a/modules/data/src/main/java/webindex/data/spark/IndexEnv.java b/webindex/modules/data/src/main/java/webindex/data/spark/IndexEnv.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/spark/IndexEnv.java
rename to webindex/modules/data/src/main/java/webindex/data/spark/IndexEnv.java
diff --git a/modules/data/src/main/java/webindex/data/spark/IndexStats.java b/webindex/modules/data/src/main/java/webindex/data/spark/IndexStats.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/spark/IndexStats.java
rename to webindex/modules/data/src/main/java/webindex/data/spark/IndexStats.java
diff --git a/modules/data/src/main/java/webindex/data/spark/IndexUtil.java b/webindex/modules/data/src/main/java/webindex/data/spark/IndexUtil.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/spark/IndexUtil.java
rename to webindex/modules/data/src/main/java/webindex/data/spark/IndexUtil.java
diff --git a/modules/data/src/main/java/webindex/data/util/ArchiveUtil.java b/webindex/modules/data/src/main/java/webindex/data/util/ArchiveUtil.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/util/ArchiveUtil.java
rename to webindex/modules/data/src/main/java/webindex/data/util/ArchiveUtil.java
diff --git a/modules/data/src/main/java/webindex/data/util/WARCFileInputFormat.java b/webindex/modules/data/src/main/java/webindex/data/util/WARCFileInputFormat.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/util/WARCFileInputFormat.java
rename to webindex/modules/data/src/main/java/webindex/data/util/WARCFileInputFormat.java
diff --git a/modules/data/src/main/java/webindex/data/util/WARCFileRecordReader.java b/webindex/modules/data/src/main/java/webindex/data/util/WARCFileRecordReader.java
similarity index 100%
rename from modules/data/src/main/java/webindex/data/util/WARCFileRecordReader.java
rename to webindex/modules/data/src/main/java/webindex/data/util/WARCFileRecordReader.java
diff --git a/modules/data/src/main/java/webindex/serialization/WebindexKryoFactory.java b/webindex/modules/data/src/main/java/webindex/serialization/WebindexKryoFactory.java
similarity index 100%
rename from modules/data/src/main/java/webindex/serialization/WebindexKryoFactory.java
rename to webindex/modules/data/src/main/java/webindex/serialization/WebindexKryoFactory.java
diff --git a/modules/data/src/main/resources/splits/accumulo-default.txt b/webindex/modules/data/src/main/resources/splits/accumulo-default.txt
similarity index 100%
rename from modules/data/src/main/resources/splits/accumulo-default.txt
rename to webindex/modules/data/src/main/resources/splits/accumulo-default.txt
diff --git a/modules/data/src/test/java/webindex/data/SparkTestUtil.java b/webindex/modules/data/src/test/java/webindex/data/SparkTestUtil.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/SparkTestUtil.java
rename to webindex/modules/data/src/test/java/webindex/data/SparkTestUtil.java
diff --git a/modules/data/src/test/java/webindex/data/fluo/it/IndexIT.java b/webindex/modules/data/src/test/java/webindex/data/fluo/it/IndexIT.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/fluo/it/IndexIT.java
rename to webindex/modules/data/src/test/java/webindex/data/fluo/it/IndexIT.java
diff --git a/modules/data/src/test/java/webindex/data/spark/Hex.java b/webindex/modules/data/src/test/java/webindex/data/spark/Hex.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/spark/Hex.java
rename to webindex/modules/data/src/test/java/webindex/data/spark/Hex.java
diff --git a/modules/data/src/test/java/webindex/data/spark/IndexEnvTest.java b/webindex/modules/data/src/test/java/webindex/data/spark/IndexEnvTest.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/spark/IndexEnvTest.java
rename to webindex/modules/data/src/test/java/webindex/data/spark/IndexEnvTest.java
diff --git a/modules/data/src/test/java/webindex/data/spark/IndexUtilTest.java b/webindex/modules/data/src/test/java/webindex/data/spark/IndexUtilTest.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/spark/IndexUtilTest.java
rename to webindex/modules/data/src/test/java/webindex/data/spark/IndexUtilTest.java
diff --git a/modules/data/src/test/java/webindex/data/util/ArchiveUtilTest.java b/webindex/modules/data/src/test/java/webindex/data/util/ArchiveUtilTest.java
similarity index 100%
rename from modules/data/src/test/java/webindex/data/util/ArchiveUtilTest.java
rename to webindex/modules/data/src/test/java/webindex/data/util/ArchiveUtilTest.java
diff --git a/modules/data/src/test/resources/data/set1/accumulo-data.txt b/webindex/modules/data/src/test/resources/data/set1/accumulo-data.txt
similarity index 100%
rename from modules/data/src/test/resources/data/set1/accumulo-data.txt
rename to webindex/modules/data/src/test/resources/data/set1/accumulo-data.txt
diff --git a/modules/data/src/test/resources/data/set1/fluo-data.txt b/webindex/modules/data/src/test/resources/data/set1/fluo-data.txt
similarity index 100%
rename from modules/data/src/test/resources/data/set1/fluo-data.txt
rename to webindex/modules/data/src/test/resources/data/set1/fluo-data.txt
diff --git a/modules/data/src/test/resources/log4j.properties b/webindex/modules/data/src/test/resources/log4j.properties
similarity index 100%
rename from modules/data/src/test/resources/log4j.properties
rename to webindex/modules/data/src/test/resources/log4j.properties
diff --git a/modules/data/src/test/resources/wat-18.warc b/webindex/modules/data/src/test/resources/wat-18.warc
similarity index 100%
rename from modules/data/src/test/resources/wat-18.warc
rename to webindex/modules/data/src/test/resources/wat-18.warc
diff --git a/modules/data/src/test/resources/wat.warc b/webindex/modules/data/src/test/resources/wat.warc
similarity index 100%
rename from modules/data/src/test/resources/wat.warc
rename to webindex/modules/data/src/test/resources/wat.warc
diff --git a/modules/integration/pom.xml b/webindex/modules/integration/pom.xml
similarity index 100%
rename from modules/integration/pom.xml
rename to webindex/modules/integration/pom.xml
diff --git a/modules/integration/src/main/java/webindex/integration/DevServer.java b/webindex/modules/integration/src/main/java/webindex/integration/DevServer.java
similarity index 100%
rename from modules/integration/src/main/java/webindex/integration/DevServer.java
rename to webindex/modules/integration/src/main/java/webindex/integration/DevServer.java
diff --git a/modules/integration/src/main/java/webindex/integration/DevServerOpts.java b/webindex/modules/integration/src/main/java/webindex/integration/DevServerOpts.java
similarity index 100%
rename from modules/integration/src/main/java/webindex/integration/DevServerOpts.java
rename to webindex/modules/integration/src/main/java/webindex/integration/DevServerOpts.java
diff --git a/modules/integration/src/main/java/webindex/integration/SampleData.java b/webindex/modules/integration/src/main/java/webindex/integration/SampleData.java
similarity index 100%
rename from modules/integration/src/main/java/webindex/integration/SampleData.java
rename to webindex/modules/integration/src/main/java/webindex/integration/SampleData.java
diff --git a/modules/integration/src/test/java/webindex/integration/DevServerIT.java b/webindex/modules/integration/src/test/java/webindex/integration/DevServerIT.java
similarity index 100%
rename from modules/integration/src/test/java/webindex/integration/DevServerIT.java
rename to webindex/modules/integration/src/test/java/webindex/integration/DevServerIT.java
diff --git a/modules/integration/src/test/resources/5-pages.txt b/webindex/modules/integration/src/test/resources/5-pages.txt
similarity index 100%
rename from modules/integration/src/test/resources/5-pages.txt
rename to webindex/modules/integration/src/test/resources/5-pages.txt
diff --git a/modules/integration/src/test/resources/log4j.properties b/webindex/modules/integration/src/test/resources/log4j.properties
similarity index 100%
rename from modules/integration/src/test/resources/log4j.properties
rename to webindex/modules/integration/src/test/resources/log4j.properties
diff --git a/modules/ui/.gitignore b/webindex/modules/ui/.gitignore
similarity index 100%
rename from modules/ui/.gitignore
rename to webindex/modules/ui/.gitignore
diff --git a/modules/ui/pom.xml b/webindex/modules/ui/pom.xml
similarity index 100%
rename from modules/ui/pom.xml
rename to webindex/modules/ui/pom.xml
diff --git a/modules/ui/src/main/java/webindex/ui/WebServer.java b/webindex/modules/ui/src/main/java/webindex/ui/WebServer.java
similarity index 100%
rename from modules/ui/src/main/java/webindex/ui/WebServer.java
rename to webindex/modules/ui/src/main/java/webindex/ui/WebServer.java
diff --git a/modules/ui/src/main/resources/assets/img/webindex.png b/webindex/modules/ui/src/main/resources/assets/img/webindex.png
similarity index 100%
rename from modules/ui/src/main/resources/assets/img/webindex.png
rename to webindex/modules/ui/src/main/resources/assets/img/webindex.png
Binary files differ
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/404.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/404.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/404.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/404.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/common/footer.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/common/footer.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/common/footer.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/common/footer.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/common/head.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/common/head.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/common/head.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/common/head.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/common/header.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/common/header.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/common/header.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/common/header.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/home.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/home.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/home.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/home.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/links.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/links.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/links.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/links.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/page.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/page.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/page.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/page.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/pages.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/pages.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/pages.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/pages.ftl
diff --git a/modules/ui/src/main/resources/spark/template/freemarker/top.ftl b/webindex/modules/ui/src/main/resources/spark/template/freemarker/top.ftl
similarity index 100%
rename from modules/ui/src/main/resources/spark/template/freemarker/top.ftl
rename to webindex/modules/ui/src/main/resources/spark/template/freemarker/top.ftl
diff --git a/pom.xml b/webindex/pom.xml
similarity index 100%
rename from pom.xml
rename to webindex/pom.xml