Cleanup of README.

commit: f4a3657c01b7de8a7d545d77f3c81c7f2c3cb028
author: Knut Nesheim <knutin@gmail.com> Mon Jan 20 16:01:46 2014 +0100
committer: Knut Nesheim <knutin@gmail.com> Mon Jan 20 16:01:46 2014 +0100
tree: c4c83f91c23664a8260d36c0376b5fe4b84329c7
parent: f0cfcaa7ad26d56328ebf6f3ce033ac03a0d2ffd
diff --git a/README.md b/README.md
index dd65300..9be0212 100644
--- a/README.md
+++ b/README.md

@@ -15,8 +15,9 @@
 filter.
 
 In addition to the base algorithm, we have implemented the bias
-correction from HLL++ as the described in the excellent [paper by
-Google][].
+correction from HLL++ as the described in the excellent
+[paper by Google][]. Bias correction greatly improves the estimates
+for lower cardinalities.
 
 
 ## Usage
@@ -32,7 +33,7 @@
 2.136502281992361
 ```
 
-The error from estimations can be seen in this example:
+The errors introduced by estimations can be seen in this example:
 ```erlang
 3> random:seed(1,2,3).
 undefined
@@ -48,16 +49,31 @@
 
 A filter can be persisted and read later. The serialized struct is formatted for usage with jiffy:
 ```erlang
-8> Filter1 = hyper:insert(<<"foo">>, hyper:new(4)).
+8> Filter = hyper:insert(<<"foo">>, hyper:new(4)).
 {hyper,4,
        {hyper_binary,{dense,<<4,0,0,0,0,0,0,0,0,0,0,0>>,[],0,16}}}
-9> Filter2 = hyper:from_json(hyper:to_json(Filter1)).
-{hyper,4,
-       {hyper_binary,{dense,<<4,0,0,0,0,0,0,0,0,0,0,0>>,[],0,16}}}
-10> hyper:card(Filter1) =:= hyper:card(Filter2).
+9> Filter =:= hyper:from_json(hyper:to_json(Filter)).
 true
 ```
 
+You can select a different backend. See below for a description of why
+you might want to do so. They serialize in exactly the same way, but
+can't be mixed in memory.
+
+```erlang
+1> Gb = hyper:insert(<<"foo">>, hyper:new(4, hyper_gb)).
+{hyper,4,{hyper_gb,{{1,{0,1,nil,nil}},16}}}
+2> B = hyper:insert(<<"foo">>, hyper:new(4, hyper_binary)).
+{hyper,4,
+       {hyper_binary,{dense,<<4,0,0,0,0,0,0,0,0,0,0,0>>,[],0,16}}}
+3> hyper:to_json(Gb) =:= hyper:to_json(B).
+true
+4> hyper:union(Gb, B).
+** exception error: no case clause matching [{4,hyper_binary},{4,hyper_gb}]
+     in function  hyper:union/1 (src/hyper.erl, line 65)
+```
+
+
 ## Is it any good?
 
 Yes. At Game Analytics we use it extensively.
@@ -73,20 +89,20 @@
  * `hyper_binary`: Fixed memory usage (6 bits * 2^P), fastest on insert,
    union, cardinality and serialization. Best default choice.
 
- * hyper_bisect: Lower memory usage at lower fill rates (3 bytes per
+ * `hyper_bisect`: Lower memory usage at lower fill rates (3 bytes per
    used entry), slightly slower than hyper_binary for
    everything. Switches to a structure similar to hyper_binary when it
    would save memory. Room for further optimization.
 
- * hyper_gb: Fast inserts, very fast unions and reasonable memory
+ * `hyper_gb`: Fast inserts, very fast unions and reasonable memory
    usage at low fill rates. Unreasonable memory usage at high fill
    rates.
 
- * hyper_array: Cardinality estimation is constant, but slower than
+ * `hyper_array`: Cardinality estimation is constant, but slower than
    hyper_gb for low fill rates. Uses much more memory at lower fill
    rates, but stays constant from 25% and upwards.
 
- * hyper_binary_rle: Dud
+ * `hyper_binary_rle`: Dud
 
 You can also implement your own backend. In `hyper_test` theres a
 bunch of tests run for all backends, including some PropEr tests. The