Hashcode in Java

What is hashCode() in Java?

Every object in Java has a method called hashCode() (inherited from Object). It returns an integer (32-bit signed number) that is supposed to represent the object in a way that helps quickly find it in a data structure like a HashMap.

Why is hashCode() important?

A HashMap stores key-value pairs. To store or retrieve a value, it uses the key's hashCode() to decide where in memory (what "bucket") to store or look for it.

This means:

  • Fast insert and lookup
  • Near constant-time performance (O(1)), if well distributed

How is hashCode() computed?

For strings:

int hash = 0;
for (char c : value) {
    hash = 31 * hash + c;
}

// This is called a polynomial rolling hash.
// Example:
"abc" = (('a'*31 + 'b')*31 + 'c')
      = ((97*31 + 98)*31 + 99) = 96354

For integers:

Integer.hashCode(x) == x

it just returns the value itself

For Booleans (it’s hard coded):

true.hashCode() = 1231
false.hashCode() = 1237

For custom objects:

it usually combines the hashcodes of fields

public int hashCode() {
    return Objects.hash(field1, field2);
}

What does HashMap do with the hashCode?

  • Compute the hashcode to get an int
  • Mix the bits
    int h = hashCode;
    h ^= (h >>> 16); // Mix high and low bits
    
    // This improves the randomness so that keys are more evenly spread.
    // >>> pushes the upper bits 16 bits to the right and adds zeros to the left
    // Example:
    // 00000011 00000001 01010101 00010101 becomes 00000000 00000000 00000011 00000001
  • Find the bucket index

    If capacity is 16:

    index = h & (16 - 1); // mask lower 4 bits
    
    // So the map only looks in 1 of 16 slots.

    The bitwise AND (&) operation compares two binary numbers bit by bit, and returns 1 only if both bits are 1 — otherwise it returns 0.

  • 4. Check the bucket

    • If it’s empty, store the entry there.
    • If not: compare hash and use .equals() to find the right key

    Why mix the hash with hash ^ (hash >>> 16)?

    Java HashMap uses this mixing to avoid poor distribution.

    Imagine two strings with only the first characters different:

    • "abc": hash is 96354
    • "xbc": hash is 120594

    If the map only looks at low bits (due to capacity), they may collide. XORing with the high 16 bits spreads the differences better.

    Bucket example (visual)

    Let’s say we have:

    • Capacity = 16
    • Key: "mustafa"
    • hashCode("mustafa") = 1099491643
  • Mix:
  • h = 1099491643 ^ (1099491643 >>> 16)
      = 1099491643 ^ 16777 = 1099508290
  • Mask for bucket:
  • index = 1099508290 & (16 - 1) = 1099508290 & 15 = 2
  • So this key goes in bucket 2.
  • What if two keys have the same index?

    This is called a collision. HashMap handles it by using a linked list (or a tree if many keys collide).

    • It walks the list in the bucket
    • Compares each node's hash and then equals()

    Why is HashMap faster than List?

    • List: you must check each element one by one (O(n))
    • HashMap: uses hashCode to jump directly to a likely place (O(1) on average)