# How can I test that my hash function is good in terms of max-load?

Filed under:
##### testing

I have read through various papers on the 'Balls and Bins' problem. It seems that if a hash function is working correctly (i.e. it behaves like a uniform random distribution), then the following should hold when I hash `n` values into a hash table with `n` slots (or bins):

1. The probability that a given bin is empty, for large `n`, is `1/e`.
2. The expected number of empty bins is about `n/e`.
3. The probability that a given bin has `k` collisions is `<= 1/k!`.
4. The probability that a given bin has at least `k` collisions is `<= (e/k)**k`.
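For what it's worth, the four checks above could be sketched like this. This is only a rough sketch under my own assumptions: `bin_loads`, `load_fractions` and the `hash_fn` hook are names I made up, the keys are simply the integers `0..n-1`, and a seeded `random.Random` stands in for an "ideal" hash as the baseline. It reads "`k` collisions" as "`k` items landing in the bin".

```python
import math
import random
from collections import Counter


def bin_loads(n, hash_fn=None, seed=0):
    """Place keys 0..n-1 into n bins and return the list of bin counts.

    hash_fn is a hypothetical hook (any callable int -> int); when it is
    None, a seeded uniform random placement serves as the ideal baseline.
    """
    rng = random.Random(seed)
    loads = [0] * n
    for key in range(n):
        slot = rng.randrange(n) if hash_fn is None else hash_fn(key) % n
        loads[slot] += 1
    return loads


def load_fractions(loads):
    """Return {k: (fraction of bins with exactly k, fraction with >= k)}."""
    n = len(loads)
    histogram = Counter(loads)
    fractions = {}
    at_least = n  # every bin holds at least 0 items
    for k in range(max(loads) + 1):
        exactly = histogram.get(k, 0)
        fractions[k] = (exactly / n, at_least / n)
        at_least -= exactly
    return fractions


if __name__ == "__main__":
    n = 100_000
    fractions = load_fractions(bin_loads(n))
    print(f"empty fraction {fractions[0][0]:.4f} vs 1/e = {1 / math.e:.4f}")
    for k in sorted(fractions):
        if k == 0:
            continue  # (e/k)**k is undefined for k = 0
        exactly, at_least = fractions[k]
        print(f"k={k}: exactly {exactly:.5f} <= {1 / math.factorial(k):.5f}, "
              f"at least {at_least:.5f} <= {(math.e / k) ** k:.5f}")
```

To test a real hash function, pass it as `hash_fn` and compare its numbers with the random baseline. The observed fractions for the largest `k` come from only a handful of bins, so they are noisy and can overshoot `1/k!` by chance even for a perfectly random placement.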

These look easy to check. But the `max-load` test (the maximum number of collisions with high probability) is usually stated vaguely.

Most texts state that the maximum number of collisions in any bin is `O( ln(n) / ln(ln(n)) )`. Some say it is `3*ln(n) / ln(ln(n))`. Other papers mix `ln` and `log`, usually without defining them, or state that `log` is log base `e` and then use `ln` elsewhere.
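A rough way to compare the two forms against observed values is sketched below. The helper names `max_load` and `max_load_bound` are mine, the placement is a seeded uniform random one rather than a real hash, and the sketch reads `ln` as the natural logarithm, which is what Python's `math.log` computes.

```python
import math
import random


def max_load(n, seed):
    """Throw n keys into n bins uniformly at random; return the fullest bin's count."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)


def max_load_bound(n, c=3.0):
    """c * ln(n) / ln(ln(n)); math.log is the natural logarithm. Requires n >= 3."""
    return c * math.log(n) / math.log(math.log(n))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        # A few independent trials per n, one seed each.
        observed = [max_load(n, seed) for seed in range(5)]
        print(f"n={n}: observed max loads {observed}, "
              f"ln(n)/ln(ln(n)) = {max_load_bound(n, 1.0):.2f}, "
              f"3*ln(n)/ln(ln(n)) = {max_load_bound(n):.2f}")
```

Running several seeds per `n` gives a feel for how far the observed maximum sits below the `3*ln(n) / ln(ln(n))` figure. Swapping the random placement for the hash function under test would be the actual check.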

Is `ln` the log to base `e` or base `2`? Is this `max-load` formula right? And how big should `n` be to run a test?

This lecture seems to cover it best, but I am no mathematician.

http://pages.cs.wisc.edu/~shuchi/courses/787-F07/scribe-notes/lecture07.pdf

BTW, `with high probability` seems to mean with probability at least `1 - 1/n`.

© Stack Overflow or respective owner
