blob: 40c1406ca0999495420e571f191168ca824921fa [file] [log] [blame]
<HTML>
<HEAD>
<title>Performance notes: sparse_hash, dense_hash, sparsetable</title>
</HEAD>
<BODY>
<H2>Performance Numbers</H2>
<p>Here are some performance numbers from an example desktop machine,
taken from a version of time_hash_map that was instrumented to also
report memory allocation information (this modification is not
included by default because it required a big hack to do, including
modifying the STL code to not try to do its own freelist management).</p>
<p>Note there are lots of caveats on these numbers: they may differ from
machine to machine and compiler to compiler, and they only test a very
particular usage pattern that may not match how you use hashtables --
for instance, they test hashtables with very small keys. However,
they're still useful for a baseline comparison of the various
hashtable implementations.</p>
<p>These figures are from a 2.80GHz Pentium 4 with 2G of memory. The
'standard' hash_map and map implementations are the SGI STL code
included with <tt>gcc2</tt>. Compiled with <tt>gcc2.95.3 -g
-O2</tt></p>
<pre>
======
Average over 10000000 iterations
Wed Dec 8 14:56:38 PST 2004
SPARSE_HASH_MAP:
map_grow 665 ns
map_predict/grow 303 ns
map_replace 177 ns
map_fetch 117 ns
map_remove 192 ns
memory used in map_grow 84.3956 Mbytes
DENSE_HASH_MAP:
map_grow 84 ns
map_predict/grow 22 ns
map_replace 18 ns
map_fetch 13 ns
map_remove 23 ns
memory used in map_grow 256.0000 Mbytes
STANDARD HASH_MAP:
map_grow 162 ns
map_predict/grow 107 ns
map_replace 44 ns
map_fetch 22 ns
map_remove 124 ns
memory used in map_grow 204.1643 Mbytes
STANDARD MAP:
map_grow 297 ns
map_predict/grow 282 ns
map_replace 113 ns
map_fetch 113 ns
map_remove 238 ns
memory used in map_grow 236.8081 Mbytes
</pre>
<H2><A name="hashfn">A Note on Hash Functions</A></H2>
<p>For good performance, the Google hash routines depend on a good
hash function: one that distributes data evenly. Many hashtable
implementations come with sub-optimal hash functions that can degrade
performance. For instance, the hash function given in Knuth's _Art of
Computer Programming_, and the default string hash function in SGI's
STL implementation, both distribute certain data sets unevenly,
leading to poor performance.</p>
<p>As an example, in one test of the default SGI STL string hash
function against the Hsieh hash function (see below), for a particular
set of string keys, the Hsieh function resulted in hashtable lookups
that were 20 times as fast as the STLPort hash function. The string
keys were chosen to be "hard" to hash well, so these results may not
be typical, but they are suggestive.</p>
<p>There has been much research over the years into good hash
functions. Here are some hash functions of note.</p>
<ul>
<li> Bob Jenkins: <A HREF="http://burtleburtle.net/bob/hash/">http://burtleburtle.net/bob/hash/</A>
<li> Paul Hsieh: <A HREF="http://www.azillionmonkeys.com/qed/hash.html">http://www.azillionmonkeys.com/qed/hash.html</A>
<li> Fowler/Noll/Vo (FNV): <A HREF="http://www.isthe.com/chongo/tech/comp/fnv/">http://www.isthe.com/chongo/tech/comp/fnv/</A>
<li> MurmurHash: <A HREF="http://murmurhash.googlepages.com/">http://murmurhash.googlepages.com/</A>
</ul>
</body>
</html>