There's a patch at mozilla bugzilla to improve cairo's hash table implementation:
Keith, can you comment please.
There is no evacuation of dead entries in cairo_hash_tables if the size of the
table never changes, so, when entries are regularly added and removed,
eventually the hash table has no free entries.
When there are no free entries to terminate a search, checking that a key is
not in the table requires probing every entry in the table, calling
cairo_hash_keys_equal_func_t on every live entry, so the lookup is O(n).
Bug 18165 or bug 18166 will reduce that speed at which hash_tables fill,
but, because entries are well distributed throughout the table (and
_cairo_hash_table_random_entry removes entries randomly), the tables
The scaled glyph caches are a common case where the table size remains
constant but entries are regularly added and removed.
Created attachment 19802 [details] [review]
reshuffle to remove dead entries
Ensure that at least 25% of the entries in the table are free (search
When the number of free entries becomes small, the table is reshuffled.
This reshuffling avoids the need to reallocate, but means an extra pass through the table entries. Tests indicate that this method works well, but I don't know whether or not a memory allocation would be preferable to the extra pass. I guess it depends on the size of the table, but let me know if you think a memory allocation would be better.
One iteration of reshuffling doesn't necessarily clear out all dead entries, but does clear out most of them, so the number of reshuffles remains small. The number of add and remove cycles between reshuffles should be O(n), where n is the table size, so the average lookup time remains O(1).
This should be fixed by:
Author: Andrea Canciani <firstname.lastname@example.org>
Date: Tue Aug 2 10:50:00 2011 +0200
hash: Improve handling of dead entries
When there are no free entries to terminate a search, checking that a
key is not in the table requires probing every entry in the table,
i.e. it degenerates in an O(n) operation.
Rehashing when the number of free entries is less than 25% makes the
expected lookup time O(1).
The hash-table micro benchmark become 4-6x faster.
The committed patch does not perform in-place reshuffling and simply rehashes to a separate table.
Even if this implies an additional memory allocation, it greatly simplifies the code, because it can directly reuse the rehash code used for changing size.