Realities of Hashing
An Experimental Comparison Of Modern Hashing Techniques
William GarrisonUniversity of Maryland Baltimore County
December 1998
Minimal perfect hash functions have been sought after in computer science as the optimal data structure for many applications. The established mindset is that these perfect functions are fast and efficient, but difficult to create. This experiment compares one implementation of a near-minimal perfect hashing with common methods of open addressing and chaining. It shows that at reasonable load factors, open addressing is nearly as efficient as perfect hashing, and that the costs of chaining may make it an acceptable method for applications where resizing the table is not feasible. Additionally, a comparison of hash functions shows that new, more uniformly distributed hash functions are not likely to have a significant payoff over the established division and multiplication methods. Finally, it proposes modifications to the near-minimal perfect hash algorithm used here to create perfect minimal hash tables for any data set.
The reader is assumed to have a basic understanding hash tables, and certain base concepts are not explained in detail.
Near Minimal Perfect Hashing | [HTML] [MS Word] |
Source code | [HTML] [Download Source] |
Data Set | [ZIP file] |
Results spreadsheet and charts | [Text] [Excel SS] [MS Works] |