| <HTML> |
| |
| <HEAD> |
| <title>Performance notes: sparse_hash, dense_hash, sparsetable</title> |
| </HEAD> |
| |
| <BODY> |
| |
| <H2>Performance Numbers</H2> |
| |
| <p>Here are some performance numbers from an example desktop machine, |
| taken from a version of time_hash_map that was instrumented to also |
| report memory allocation information (this modification is not |
| included by default because it required a big hack to do, including |
| modifying the STL code to not try to do its own freelist management).</p> |
| |
| <p>Note there are lots of caveats on these numbers: they may differ from |
| machine to machine and compiler to compiler, and they only test a very |
| particular usage pattern that may not match how you use hashtables -- |
| for instance, they test hashtables with very small keys. However, |
| they're still useful for a baseline comparison of the various |
| hashtable implementations.</p> |
| |
| <p>These figures are from a 2.80GHz Pentium 4 with 2G of memory. The |
| 'standard' hash_map and map implementations are the SGI STL code |
| included with <tt>gcc2</tt>. Compiled with <tt>gcc2.95.3 -g |
| -O2</tt></p> |
| |
| <pre> |
| ====== |
| Average over 10000000 iterations |
| Wed Dec 8 14:56:38 PST 2004 |
| |
| SPARSE_HASH_MAP: |
| map_grow 665 ns |
| map_predict/grow 303 ns |
| map_replace 177 ns |
| map_fetch 117 ns |
| map_remove 192 ns |
| memory used in map_grow 84.3956 Mbytes |
| |
| DENSE_HASH_MAP: |
| map_grow 84 ns |
| map_predict/grow 22 ns |
| map_replace 18 ns |
| map_fetch 13 ns |
| map_remove 23 ns |
| memory used in map_grow 256.0000 Mbytes |
| |
| STANDARD HASH_MAP: |
| map_grow 162 ns |
| map_predict/grow 107 ns |
| map_replace 44 ns |
| map_fetch 22 ns |
| map_remove 124 ns |
| memory used in map_grow 204.1643 Mbytes |
| |
| STANDARD MAP: |
| map_grow 297 ns |
| map_predict/grow 282 ns |
| map_replace 113 ns |
| map_fetch 113 ns |
| map_remove 238 ns |
| memory used in map_grow 236.8081 Mbytes |
| </pre> |
| |
| |
| <H2><A name="hashfn">A Note on Hash Functions</A></H2> |
| |
| <p>For good performance, the Google hash routines depend on a good |
| hash function: one that distributes data evenly. Many hashtable |
| implementations come with sub-optimal hash functions that can degrade |
| performance. For instance, the hash function given in Knuth's _Art of |
| Computer Programming_, and the default string hash function in SGI's |
| STL implementation, both distribute certain data sets unevenly, |
| leading to poor performance.</p> |
| |
| <p>As an example, in one test of the default SGI STL string hash |
| function against the Hsieh hash function (see below), for a particular |
| set of string keys, the Hsieh function resulted in hashtable lookups |
| that were 20 times as fast as the STLPort hash function. The string |
| keys were chosen to be "hard" to hash well, so these results may not |
| be typical, but they are suggestive.</p> |
| |
| <p>There has been much research over the years into good hash |
| functions. Here are some hash functions of note.</p> |
| |
| <ul> |
| <li> Bob Jenkins: <A HREF="http://burtleburtle.net/bob/hash/">http://burtleburtle.net/bob/hash/</A> |
| <li> Paul Hsieh: <A HREF="http://www.azillionmonkeys.com/qed/hash.html">http://www.azillionmonkeys.com/qed/hash.html</A> |
| <li> Fowler/Noll/Vo (FNV): <A HREF="http://www.isthe.com/chongo/tech/comp/fnv/">http://www.isthe.com/chongo/tech/comp/fnv/</A> |
| <li> MurmurHash: <A HREF="http://murmurhash.googlepages.com/">http://murmurhash.googlepages.com/</A> |
| </ul> |
| |
| </body> |
| </html> |