Hashcode in Java
What is hashCode()
in Java?
Every object in Java has a method called hashCode()
(inherited from Object
). It returns an integer (32-bit signed number) that is supposed to represent the object in a way that helps quickly find it in a data structure like a HashMap
.
Why is hashCode()
important?
A HashMap
stores key-value pairs. To store or retrieve a value, it uses the key's hashCode()
to decide where in memory (what "bucket") to store or look for it.
This means:
- Fast insert and lookup
- Near constant-time performance (O(1)), if well distributed
How is hashCode()
computed?
For strings:
int hash = 0;
for (char c : value) {
hash = 31 * hash + c;
}
// This is called a polynomial rolling hash.
// Example:
"abc" = (('a'*31 + 'b')*31 + 'c')
= ((97*31 + 98)*31 + 99) = 96354
For integers:
Integer.hashCode(x) == x
it just returns the value itself
For Booleans (it’s hard coded):
true.hashCode() = 1231
false.hashCode() = 1237
For custom objects:
it usually combines the hashcodes of fields
public int hashCode() {
return Objects.hash(field1, field2);
}
What does HashMap do with the hashCode?
int h = hashCode;
h ^= (h >>> 16); // Mix high and low bits
// This improves the randomness so that keys are more evenly spread.
// >>> pushes the upper bits 16 bits to the right and adds zeros to the left
// Example:
// 00000011 00000001 01010101 00010101 becomes 00000000 00000000 00000011 00000001
If capacity is 16:
index = h & (16 - 1); // mask lower 4 bits
// So the map only looks in 1 of 16 slots.
The bitwise AND (&
) operation compares two binary numbers bit by bit, and returns 1 only if both bits are 1 — otherwise it returns 0.
4. Check the bucket
- If it’s empty, store the entry there.
- If not: compare hash and use
.equals()
to find the right key
Why mix the hash with hash ^ (hash >>> 16)
?
Java HashMap uses this mixing to avoid poor distribution.
Imagine two strings with only the first characters different:
- "abc": hash is 96354
- "xbc": hash is 120594
If the map only looks at low bits (due to capacity), they may collide. XORing with the high 16 bits spreads the differences better.
Bucket example (visual)
Let’s say we have:
- Capacity = 16
- Key: "mustafa"
hashCode("mustafa") = 1099491643
h = 1099491643 ^ (1099491643 >>> 16)
= 1099491643 ^ 16777 = 1099508290
index = 1099508290 & (16 - 1) = 1099508290 & 15 = 2
What if two keys have the same index?
This is called a collision. HashMap handles it by using a linked list (or a tree if many keys collide).
- It walks the list in the bucket
- Compares each node's hash and then
equals()
Why is HashMap faster than List?
- List: you must check each element one by one (O(n))
- HashMap: uses hashCode to jump directly to a likely place (O(1) on average)