src/third_party/google-sparsehash/README - incubator-pagespeed-debian - Git at Google

 This directory contains several hash-map implementations, similar in
 API to SGI's hash_map class, but with different performance
 characteristics.  sparse_hash_map uses very little space overhead, 1-2
 bits per entry.  dense_hash_map is very fast, particulary on lookup.
 (sparse_hash_set and dense_hash_set are the set versions of these
 routines.)  On the other hand, these classes have requirements that
 may not make them appropriate for all applications.

 All these implementation use a hashtable with internal quadratic
 probing.  This method is space-efficient -- there is no pointer
 overhead -- and time-efficient for good hash functions.

 COMPILING
 ---------
 To compile test applications with these classes, run ./configure
 followed by make.  To install these header files on your system, run
 'make install'.  (On Windows, the instructions are different; see
 README.windows.)  See INSTALL for more details.

 This code should work on any modern C++ system.  It has been tested on
 Linux (Ubuntu, Fedora, RedHat, Debian), Solaris 10 x86, FreeBSD 6.0,
 OS X 10.3 and 10.4, and Windows under both VC++7 and VC++8.

 USING
 -----
 See the html files in the doc directory for small example programs
 that use these classes.  It's enough to just include the header file:

    #include <google/sparse_hash_map> // or sparse_hash_set, dense_hash_map, ...
    google::sparse_hash_set<int, int> number_mapper;

 and use the class the way you would other hash-map implementations.
 (Though see "API" below for caveats.)

 By default (you can change it via a flag to ./configure), these hash
 implementations are defined in the google namespace.

 API
 ---
 The API for sparse_hash_map, dense_hash_map, sparse_hash_set, and
 dense_hash_set, are a superset of the API of SGI's hash_map class.
 See doc/sparse_hash_map.html, et al., for more information about the
 API.

 The usage of these classes differ from SGI's hash_map, and other
 hashtable implementations, in the following major ways:

 1) dense_hash_map requires you to set aside one key value as the
    'empty bucket' value, set via the set_empty_key() method.  This
    *MUST* be called before you can use the dense_hash_map.  It is
    illegal to insert any elements into a dense_hash_map whose key is
    equal to the empty-key.

 2) For both dense_hash_map and sparse_hash_map, if you wish to delete
    elements from the hashtable, you must set aside a key value as the
    'deleted bucket' value, set via the set_deleted_key() method.  If
    your hash-map is insert-only, there is no need to call this
    method.  If you call set_deleted_key(), it is illegal to insert any
    elements into a dense_hash_map or sparse_hash_map whose key is
    equal to the deleted-key.

 3) These hash-map implementation support I/O.  See below.

 There are also some smaller differences:

 1) The constructor takes an optional argument that specifies the
    number of elements you expect to insert into the hashtable.  This
    differs from SGI's hash_map implementation, which takes an optional
    number of buckets.

 2) erase() does not immediately reclaim memory.  As a consequence,
    erase() does not invalidate any iterators, making loops like this
    correct:
       for (it = ht.begin(); it != ht.end(); ++it)
         if (...) ht.erase(it);
    As another consequence, a series of erase() calls can leave your
    hashtable using more memory than it needs to.  The hashtable will
    automatically compact() at the next call to insert(), but to
    manually compact a hashtable, you can call
       ht.resize(0)

 3) While sparse_hash_map et al. accept an Allocator template argument,
    they ignore it.  They use malloc() and free() for all memory
    allocations.

 4) sparse_hash_map et al. do not use exceptions.

 I/O
 ---
 In addition to the normal hash-map operations, sparse_hash_map can
 read and write hashtables to disk.  (dense_hash_map also has the API,
 but it has not yet been implemented, and writes will always fail.)

 In the simplest case, writing a hashtable is as easy as calling two
 methods on the hashtable:
    ht.write_metadata(fp);
    ht.write_nopointer_data(fp);

 Reading in this data is equally simple:
    google::sparse_hash_map<...> ht;
    ht.read_metadata(fp);
    ht.read_nopointer_data(fp);

 The above is sufficient if the key and value do not contain any
 pointers: they are basic C types or agglomorations of basic C types.
 If the key and/or value do contain pointers, you can still store the
 hashtable by replacing write_nopointer_data() with a custom writing
 routine.  See sparse_hash_map.html et al. for more information.

 SPARSETABLE
 -----------
 In addition to the hash-map and hash-set classes, this package also
 provides sparsetable.h, an array implementation that uses space
 proportional to the number of elements in the array, rather than the
 maximum element index.  It uses very little space overhead: 1 bit per
 entry.  See doc/sparsetable.html for the API.

 RESOURCE USAGE
 --------------
 * sparse_hash_map has memory overhead of about 2 bits per hash-map
   entry.
 * dense_hash_map has a factor of 2-3 memory overhead: if your
   hashtable data takes X bytes, dense_hash_map will use 3X-4X memory
   total.

 Hashtables tend to double in size when resizing, creating an
 additional 50% space overhead.  dense_hash_map does in fact have a
 significant "high water mark" memory use requirement.
 sparse_hash_map, however, is written to need very little space
 overhead when resizing: only a few bits per hashtable entry.

 PERFORMANCE
 -----------
 You can compile and run the included file time_hash_map.cc to examine
 the performance of sparse_hash_map, dense_hash_map, and your native
 hash_map implementation on your system.  One test against the
 SGI hash_map implementation gave the following timing information for
 a simple find() call:
    SGI hash_map:     22 ns
    dense_hash_map:   13 ns
    sparse_hash_map: 117 ns
    SGI map:         113 ns

 See doc/performance.html for more detailed charts on resource usage
 and performance data.

 ---
 16 March 2005
 (Last updated: 20 March 2007)
	This directory contains several hash-map implementations, similar in
	API to SGI's hash_map class, but with different performance
	characteristics. sparse_hash_map uses very little space overhead, 1-2
	bits per entry. dense_hash_map is very fast, particulary on lookup.
	(sparse_hash_set and dense_hash_set are the set versions of these
	routines.) On the other hand, these classes have requirements that
	may not make them appropriate for all applications.

	All these implementation use a hashtable with internal quadratic
	probing. This method is space-efficient -- there is no pointer
	overhead -- and time-efficient for good hash functions.

	COMPILING
	---------
	To compile test applications with these classes, run ./configure
	followed by make. To install these header files on your system, run
	'make install'. (On Windows, the instructions are different; see
	README.windows.) See INSTALL for more details.

	This code should work on any modern C++ system. It has been tested on
	Linux (Ubuntu, Fedora, RedHat, Debian), Solaris 10 x86, FreeBSD 6.0,
	OS X 10.3 and 10.4, and Windows under both VC++7 and VC++8.

	USING
	-----
	See the html files in the doc directory for small example programs
	that use these classes. It's enough to just include the header file:

	#include <google/sparse_hash_map> // or sparse_hash_set, dense_hash_map, ...
	google::sparse_hash_set<int, int> number_mapper;

	and use the class the way you would other hash-map implementations.
	(Though see "API" below for caveats.)

	By default (you can change it via a flag to ./configure), these hash
	implementations are defined in the google namespace.

	API
	---
	The API for sparse_hash_map, dense_hash_map, sparse_hash_set, and
	dense_hash_set, are a superset of the API of SGI's hash_map class.
	See doc/sparse_hash_map.html, et al., for more information about the
	API.

	The usage of these classes differ from SGI's hash_map, and other
	hashtable implementations, in the following major ways:

	1) dense_hash_map requires you to set aside one key value as the
	'empty bucket' value, set via the set_empty_key() method. This
	MUST be called before you can use the dense_hash_map. It is
	illegal to insert any elements into a dense_hash_map whose key is
	equal to the empty-key.

	2) For both dense_hash_map and sparse_hash_map, if you wish to delete
	elements from the hashtable, you must set aside a key value as the
	'deleted bucket' value, set via the set_deleted_key() method. If
	your hash-map is insert-only, there is no need to call this
	method. If you call set_deleted_key(), it is illegal to insert any
	elements into a dense_hash_map or sparse_hash_map whose key is
	equal to the deleted-key.

	3) These hash-map implementation support I/O. See below.

	There are also some smaller differences:

	1) The constructor takes an optional argument that specifies the
	number of elements you expect to insert into the hashtable. This
	differs from SGI's hash_map implementation, which takes an optional
	number of buckets.

	2) erase() does not immediately reclaim memory. As a consequence,
	erase() does not invalidate any iterators, making loops like this
	correct:
	for (it = ht.begin(); it != ht.end(); ++it)
	if (...) ht.erase(it);
	As another consequence, a series of erase() calls can leave your
	hashtable using more memory than it needs to. The hashtable will
	automatically compact() at the next call to insert(), but to
	manually compact a hashtable, you can call
	ht.resize(0)

	3) While sparse_hash_map et al. accept an Allocator template argument,
	they ignore it. They use malloc() and free() for all memory
	allocations.

	4) sparse_hash_map et al. do not use exceptions.

	I/O
	---
	In addition to the normal hash-map operations, sparse_hash_map can
	read and write hashtables to disk. (dense_hash_map also has the API,
	but it has not yet been implemented, and writes will always fail.)

	In the simplest case, writing a hashtable is as easy as calling two
	methods on the hashtable:
	ht.write_metadata(fp);
	ht.write_nopointer_data(fp);

	Reading in this data is equally simple:
	google::sparse_hash_map<...> ht;
	ht.read_metadata(fp);
	ht.read_nopointer_data(fp);

	The above is sufficient if the key and value do not contain any
	pointers: they are basic C types or agglomorations of basic C types.
	If the key and/or value do contain pointers, you can still store the
	hashtable by replacing write_nopointer_data() with a custom writing
	routine. See sparse_hash_map.html et al. for more information.

	SPARSETABLE
	-----------
	In addition to the hash-map and hash-set classes, this package also
	provides sparsetable.h, an array implementation that uses space
	proportional to the number of elements in the array, rather than the
	maximum element index. It uses very little space overhead: 1 bit per
	entry. See doc/sparsetable.html for the API.

	RESOURCE USAGE
	--------------
	* sparse_hash_map has memory overhead of about 2 bits per hash-map
	entry.
	* dense_hash_map has a factor of 2-3 memory overhead: if your
	hashtable data takes X bytes, dense_hash_map will use 3X-4X memory
	total.

	Hashtables tend to double in size when resizing, creating an
	additional 50% space overhead. dense_hash_map does in fact have a
	significant "high water mark" memory use requirement.
	sparse_hash_map, however, is written to need very little space
	overhead when resizing: only a few bits per hashtable entry.

	PERFORMANCE
	-----------
	You can compile and run the included file time_hash_map.cc to examine
	the performance of sparse_hash_map, dense_hash_map, and your native
	hash_map implementation on your system. One test against the
	SGI hash_map implementation gave the following timing information for
	a simple find() call:
	SGI hash_map: 22 ns
	dense_hash_map: 13 ns
	sparse_hash_map: 117 ns
	SGI map: 113 ns

	See doc/performance.html for more detailed charts on resource usage
	and performance data.

	---
	16 March 2005
	(Last updated: 20 March 2007)