Update ci bulk ingest docs (#98)
diff --git a/docs/bulk-test.md b/docs/bulk-test.md
index df5dc45..da8f163 100644
--- a/docs/bulk-test.md
+++ b/docs/bulk-test.md
@@ -8,6 +8,12 @@
# create the ci table if necessary
./bin/cingest createtable
+# Optionally, lower the split threshold to make splits happen more frequently
+# while the test runs. Choose a threshold based on the amount of data being
+# imported and the desired number of splits.
+#
+# accumulo shell -u root -p secret -e 'config -t ci -s table.split.threshold=32M'
+
for i in $(seq 1 10); do
# run map reduce job to generate data for bulk import
./bin/cingest bulk /tmp/bt/$i
@@ -47,3 +53,13 @@
scan -t accumulo.metadata -c loaded
```
+The sum of the referenced and unreferenced counts output by `cingest verify` should equal:
+
+```
+test.ci.bulk.map.task * test.ci.bulk.map.nodes * num_bulk_generate_jobs
+```
+
+It's possible the counts could be slightly smaller because of collisions. However, collisions
+are unlikely with the default settings given that there are 63 bits of randomness in the row and
+30 bits in the column, for a total of 93 bits of randomness per key.
+
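As a rough sanity check, the expected count and the collision probability can be estimated with a short script. This is a sketch with hypothetical parameter values (the numbers below are examples, not the test's defaults), using a standard birthday-bound approximation rather than anything specific to `cingest`:

```python
# Hypothetical settings -- substitute the values used in your test run.
map_tasks = 10            # test.ci.bulk.map.task
nodes_per_task = 10**6    # test.ci.bulk.map.nodes
num_jobs = 10             # iterations of the bulk generate loop

# Expected count reported by `cingest verify` if no collisions occur.
n = map_tasks * nodes_per_task * num_jobs

# Birthday-bound upper estimate: P(any collision) <= n*(n-1) / 2^(93+1),
# using the 93 bits of randomness per key (63 row + 30 column).
p_collision = n * (n - 1) / 2**94

print(n)            # 100000000
print(p_collision)  # on the order of 1e-13 for these values
```

Even at a hundred million keys, the estimated collision probability is vanishingly small, which is why the verified counts normally match the formula exactly.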