Introduction to HAMT
In the previous post I explained that a binary tree wasn't the best data structure for my needs - it wastes too much memory.
I looked for a memory-efficient data structures and I found a gem: Hash Array Mapped Trie (HAMT). The author, Phil Bagwell, wrote two papers related to the subject:
- Fast And Space Efficient Trie Searches, 2000 (pdf) (source)
- In this paper the author is comparing various implementations of Tries and introduces Array Mapped Trie.
- Ideal Hash Trees, 2001 (pdf) (source)
- This paper describes using Array Mapped Trie as a good data structure for a generic hash table.
Let's start with the basics.
Okay, let's not start with the basics. Tries (aka. prefix Trees) are nicely described on Wikipedia, go and read that first.
An exectuive summary: Tries are like normal Trees but, instead of
having a value stored on every node, it is only stored on the final leaves.
For comparison, here's the normal binary Tree storing values
And a Trie (prefix tree) whith prefix size of two bits:
Although it may look good, we haven't yet described the data structure in all details: how to store mapping between prefix and value or pointer on every node?
One obvious idea may be to just use an array for that. In our tree that would mean an array of length four for every node. This is indeed a good idea - no need to find the relevant mapping, just use the prefix as an array index:
The problem is that memory footprint is quite big. Even for our simple case there are a lot slots wasted by NULL values.
Mr Bagwell in his paper proposes compressing this array by using a bitmap to store which slots are not-NULL. For example, the map for the first leaf:
1001. Having this bitmap, we are able to compress the
array and store only something like that:
With the mask and a few bitwise operations (including the famous
popcount instruction) it is
possible to quickly say which position on the array holds which exact
prefix. For example in our case
00 is the 0th element of the array
and stores a value, and
11 is 1st element and stores the pointer to
Our Trie after compression will look like:
No bytes are now wasted and retrieving data still has complexity proportional to the prefix length.
Now you know how a compressed Trie works and that it keeps all the advantages of the Trie but having optimal memory footprint.
I described a simplified data structure, much more can be done to make it go really fast.
The uncompressed implementation has an advantage - every node was exactly the same size and memory management is trivial. The compressed variant on the other hand doesn't waste any memory, but it puts a lot of stress onto the "malloc(3)" implememtation.
Implementing a tuned for this data structure memory allocator is neccesary to gain an ultimate performance.
I showed examples of 2-bit dispatching, but original Hash Array Mapped Trie described in the paper is using 5 bits on each layer (bitmaps of size 32 bits). Of course, nothing is stopping from extending the schema to 64 bitmaps and 6 bits for each layer.
HAMT is a very nice data structure. Reasonably easy to work with, with very good performance characteristics when the keys are distributed uniformally. This suits ideally for a generic hash table implementation and will work well for my full-text search engine if I'll use md5 checksums instead of plaintext words for keys.
Finally, HAMT introduces very little memory overhead.