Question 1

How does a HashMap work internally?

Accepted Answer

A `HashMap` stores entries in an array of buckets (`Node<K,V>[] table`). Each bucket holds a chain of `Node` objects, where a `Node` packs the key's hash, the key, the value, and a **`next`** pointer to the following node in the same bucket. On `put()`, the key's hash picks a bucket index; the entry is stored there (or appended to the chain on a collision). On `get()`, the same index is computed and the chain is scanned with `equals()` to find the matching key. With a good hash, operations are average **O(1)**; the array is resized as it fills to keep chains short.

Question 2

What is the spread (perturbation) function and why does HashMap apply it?

Accepted Answer

The bucket index is computed as `(n - 1) & hash`, which only keeps the low bits of the hash (since `n` is a power of two). If two keys differ only in their high bits, they'd collide. To mix high bits down into the low ones, `HashMap` applies a spread function, `h ^ (h >>> 16)`, before indexing. XORing the hash with its own upper 16 bits is cheap (one shift, one XOR) yet meaningfully reduces collisions for hashes whose entropy lives in the high bits. Note that a null key is mapped to hash `0`, which is why it always lands in bucket 0.
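As a rough illustration of the spread step, the sketch below reimplements the mixing expression in a standalone class. The class and method names (`SpreadDemo`, `spread`, `index`) are made up for this example; only the expression `h ^ (h >>> 16)` and the mask `(n - 1) & hash` come from the answer above.

```java
final class SpreadDemo {
    // Fold the upper 16 bits of hashCode() into the lower 16; null maps to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a power-of-two table length n.
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        // Two keys whose hash codes differ only in bits above the low 4.
        Integer a = 0x10000;
        Integer b = 0x20000;
        // Without spreading, both raw hash codes mask to the same bucket.
        System.out.println(index(a.hashCode(), n) + " " + index(b.hashCode(), n));
        // With spreading, the high bits influence the index and the keys separate.
        System.out.println(index(spread(a), n) + " " + index(spread(b), n));
    }
}
```

With a 16-bucket table, the two raw hash codes both mask to bucket 0, while the spread hashes land in buckets 1 and 2.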

Question 3

How does HashMap convert a hash into a bucket index?

Accepted Answer

It uses **`(n - 1) & hash`**, where `n` is the table length. Because `n` is always a power of two, `n - 1` is a mask of all-ones in the low bits, so the bitwise `&` gives the same result as `hash % n` for non-negative hashes (more precisely, `Math.floorMod(hash, n)`) but is far faster than a modulo. This is the whole reason capacity must be a power of two: the `&` trick only distributes entries evenly across all buckets when `n - 1` is a clean low-bit mask. A non-power-of-two `n` would leave some buckets unreachable.
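A small sketch can make the mask-versus-modulo relationship concrete. The class name `IndexDemo` and the sample hash values are arbitrary choices for illustration. The loop prints the masked index next to `Math.floorMod` and the plain `%` operator, so the difference for negative hashes is visible.

```java
final class IndexDemo {
    // Index via bit mask; only valid when n is a power of two.
    static int maskIndex(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {0, 5, 17, 1_000_003, -1, -17, Integer.MIN_VALUE};
        for (int h : hashes) {
            // The mask always agrees with floorMod; plain % goes negative for negative hashes.
            System.out.println(h + " -> mask " + maskIndex(h, n)
                    + ", floorMod " + Math.floorMod(h, n)
                    + ", % " + (h % n));
        }
    }
}
```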

Question 4

Why must HashMap capacity always be a power of two?

Accepted Answer

Two reasons, both about the `(n - 1) & hash` indexing. First, speed: a bitwise AND replaces a slower modulo. Second, distribution: when `n` is a power of two, `n - 1` is an unbroken run of one bits (for `n = 16`, `n - 1` is `0b1111`), so the AND keeps a contiguous block of low bits and every bucket is reachable. If `n` weren't a power of two, the mask would have gaps and some buckets could never be hit. The constructor runs `tableSizeFor()` to round your requested initial capacity up to the nearest power of two, so you can never actually create a non-power-of-two table.
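The two claims above can be checked with a short sketch. `roundUpToPowerOfTwo` is an illustrative stand-in for the rounding step, not the JDK's private `tableSizeFor` (which also caps the result at 2^30), and `reachableBuckets` simply counts which indexes the mask can ever produce. Both names are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

final class PowerOfTwoDemo {
    // Round a requested capacity up to the next power of two (minimum 1).
    // Assumes cap is at most 2^30 so the shift cannot overflow.
    static int roundUpToPowerOfTwo(int cap) {
        if (cap <= 1) {
            return 1;
        }
        return Integer.highestOneBit(cap - 1) << 1;
    }

    // Count the distinct indexes that (n - 1) & hash can produce for table length n.
    static int reachableBuckets(int n) {
        Set<Integer> seen = new HashSet<>();
        for (int hash = 0; hash < 4 * n; hash++) {
            seen.add((n - 1) & hash);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(17));   // rounds up to 32
        System.out.println(reachableBuckets(16));      // all 16 buckets are reachable
        System.out.println(reachableBuckets(10));      // mask 0b1001 reaches only 4 buckets
    }
}
```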

Question 5

Walk through what happens on a put() call.

Accepted Answer

1. Compute the spread hash of the key.
2. Compute the bucket index `(n - 1) & hash`.
3. If the bucket is empty, drop the new node in.
4. If occupied, scan the chain: a node matching by `hash` and `equals()` means an existing key, so overwrite its value and return the old one.
5. Otherwise append a new node; if the chain length reaches 8 (and the table has at least 64 buckets), treeify the bucket.
6. Increment `size`; if `size > threshold` (capacity × load factor), resize.

The match step is why `equals()` is essential: same bucket does not mean same key, so the chain must be walked and compared.
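The steps above can be sketched as a toy chained map. This is a deliberately simplified illustration, not the JDK source: `TinyHashMap` is a hypothetical class with a fixed 16-bucket table, and it leaves out resizing and treeification entirely. It does show the cached-hash check followed by `equals()`, for both `put` and `get`.

```java
import java.util.Objects;

final class TinyHashMap<K, V> {
    // One entry in a bucket's chain: cached hash, key, value, link to the next node.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16]; // fixed power-of-two length
    private int size;

    private static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if the key was absent.
    V put(K key, V value) {
        int hash = spread(key);
        int i = (table.length - 1) & hash;
        Node<K, V> node = table[i];
        if (node == null) {                       // empty bucket: store directly
            table[i] = new Node<>(hash, key, value);
            size++;
            return null;
        }
        while (true) {                            // occupied bucket: walk the chain
            if (node.hash == hash && Objects.equals(node.key, key)) {
                V old = node.value;               // existing key: overwrite
                node.value = value;
                return old;
            }
            if (node.next == null) {
                break;
            }
            node = node.next;
        }
        node.next = new Node<>(hash, key, value); // new key: append at the tail
        size++;
        return null;
    }

    // Same index computation as put, then a cheap hash compare before equals().
    V get(Object key) {
        int hash = spread(key);
        for (Node<K, V> n = table[(table.length - 1) & hash]; n != null; n = n.next) {
            if (n.hash == hash && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

Storing `"Aa"` and `"BB"` (which share a `hashCode()`) in this sketch puts both in one chain, and `get` still returns the right value for each because the final comparison is on the key itself.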

Question 6

Walk through what happens on a get() call.

Accepted Answer

`get()` mirrors `put()`'s lookup half: spread the key's hash, compute the index, then scan the bucket. For each node it first compares the cached hash (a cheap int compare) and only then calls **`equals()`**; the hash check short-circuits most mismatches. In a linked-list bucket this is O(chain length); in a treeified bucket it's O(log n). A missing key returns `null`, which is **indistinguishable from a key mapped to** `null`, so use `containsKey()` when that difference matters.

Question 7

How does HashMap handle hash collisions?

Accepted Answer

When two keys map to the same bucket (a collision), `HashMap` chains them. Initially every bucket is a singly linked list of `Node`s. As of Java 8, once a single bucket's chain reaches 8 nodes (and the table has at least 64 buckets), the bucket is converted to a red-black tree so lookups within it stay logarithmic rather than linear.

bucket[5] -> Node(k1) -> Node(k2) -> Node(k3)   // linked list

// after treeification:
bucket[5] -> (balanced red-black tree of TreeNodes)

So heavy collisions degrade a bucket from O(1) toward O(log n) at worst once it is treeified (with a reasonable `hashCode()`), not toward O(n). Distinct keys can collide even with perfect `hashCode()` values simply because many hashes fold into the same small index.
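Both kinds of collision are easy to reproduce. In the sketch below, `CollisionDemo` and `bucketIndex` are hypothetical names; `bucketIndex` recomputes the index the way the answers above describe, for an assumed 16-bucket table. The strings `"Aa"` and `"BB"` have equal hash codes, while the integers 1 and 17 have different hash codes that fold into the same index.

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionDemo {
    // Spread the key's hash, then mask it to a table of length n (a power of two).
    static int bucketIndex(Object key, int n) {
        int h = key.hashCode();
        return (n - 1) & (h ^ (h >>> 16));
    }

    public static void main(String[] args) {
        // Case 1: identical hash codes for distinct keys.
        System.out.println("Aa".hashCode() == "BB".hashCode());
        // Case 2: distinct hash codes that share a bucket in a 16-slot table.
        System.out.println(bucketIndex(1, 16) == bucketIndex(17, 16));

        // The real HashMap keeps all four entries; collisions cost time, not correctness.
        Map<Object, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        map.put(1, "one");
        map.put(17, "seventeen");
        System.out.println(map.size() + " " + map.get("BB") + " " + map.get(17));
    }
}
```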

Question 8

What is treeification and when does it happen?

Accepted Answer

Treeification converts a bucket from a linked list to a red-black tree when it gets too long, bounding worst-case lookups at O(log n) instead of O(n). Two conditions must both hold: the chain must reach `TREEIFY_THRESHOLD` and the table must have at least `MIN_TREEIFY_CAPACITY` buckets.

| Constant | Value | Meaning |
| -------- | ----- | ------- |
| `TREEIFY_THRESHOLD` | 8 | chain length that triggers treeify |
| `MIN_TREEIFY_CAPACITY` | 64 | table must be at least this big |
| `UNTREEIFY_THRESHOLD` | 6 | tree shrinks back to a list at/below this |

If the table is smaller than 64, a long chain usually means the table is just too small, so `HashMap` resizes rather than treeifying. The gap between 8 and 6 (hysteresis) avoids flip-flopping between tree and list on repeated add/remove.

Question 9

What are the default initial capacity and load factor, and what do they mean?

Accepted Answer

The defaults are initial capacity 16 and load factor 0.75. The load factor is the fullness fraction at which the table grows: the threshold = capacity × load factor, so a default map resizes when it holds more than `16 × 0.75 = 12` entries. `0.75` is a deliberate time/space trade-off: lower factors waste memory but reduce collisions; higher factors save memory but lengthen chains. If you know the final size, presize with `new HashMap<>(initialCapacity)` to avoid repeated resizes.

Question 10

What happens during a HashMap resize?

Accepted Answer

When `size` exceeds the threshold, `resize()` doubles the capacity (a new power-of-two table) and redistributes every entry into the larger table. Doubling keeps the power-of-two invariant so `(n - 1) & hash` stays valid. Because only the single new high bit of the hash decides where an entry goes, an entry at old index `j` either stays at `j` or moves to `j + oldCap`. Java 8 therefore splits each old bucket's chain into exactly two ordered sub-chains without recomputing indexes from scratch. Resizing is O(n) and rebuilds the table, which is why presizing matters for large maps.
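The low/high split can be verified with a few lines of arithmetic. `ResizeSplitDemo` and its method names are hypothetical, and the sample hashes are chosen so that all of them share old index 5 in a 16-bucket table. The check `(hash & oldCap) == 0` is the single-bit test described above.

```java
final class ResizeSplitDemo {
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    // Where an entry lands after the table doubles from oldCap to 2 * oldCap:
    // the bit at position oldCap decides between "stay at j" and "move to j + oldCap".
    static int newIndex(int hash, int oldCap) {
        int j = index(hash, oldCap);
        return ((hash & oldCap) == 0) ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all map to old index 5
        for (int h : hashes) {
            System.out.println(h + ": old " + index(h, oldCap)
                    + ", new " + newIndex(h, oldCap)
                    + ", recomputed " + index(h, 2 * oldCap));
        }
    }
}
```

For these values, 5 and 37 stay in bucket 5 while 21 and 53 move to bucket 21, matching a full recomputation against the 32-bucket table.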

Question 11

What is the hashCode/equals contract and why must HashMap keys honor it?

Accepted Answer

The contract: if `a.equals(b)` then `a.hashCode() == b.hashCode()` must hold (the reverse need not: unequal objects may share a hash). `equals()` must also be reflexive, symmetric, transitive, and consistent. `HashMap` relies on this directly: it uses `hashCode()` to find the bucket and `equals()` to find the key within it. Override only one and a lookup with an equal but distinct key instance can miss: `get()` lands in the wrong bucket (or the right bucket but fails the `equals()` check) and returns `null`. Always override the two together.
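A minimal sketch of the failure mode, using two hypothetical key classes: `BrokenPoint` overrides only `equals()`, while `GoodPoint` overrides both methods. With `BrokenPoint`, the lookup key inherits an identity-based hash code, so the lookup will normally miss. That outcome depends on the JVM's identity hashes, so the comment hedges it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ContractDemo {
    // Overrides equals() but not hashCode(): violates the contract.
    static final class BrokenPoint {
        final int x, y;
        BrokenPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            return o instanceof BrokenPoint && ((BrokenPoint) o).x == x && ((BrokenPoint) o).y == y;
        }
    }

    // Overrides both, derived from the same fields.
    static final class GoodPoint {
        final int x, y;
        GoodPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodPoint && ((GoodPoint) o).x == x && ((GoodPoint) o).y == y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static String lookupBroken() {
        Map<BrokenPoint, String> map = new HashMap<>();
        map.put(new BrokenPoint(1, 2), "found");
        return map.get(new BrokenPoint(1, 2)); // usually null: the two instances hash differently
    }

    static String lookupGood() {
        Map<GoodPoint, String> map = new HashMap<>();
        map.put(new GoodPoint(1, 2), "found");
        return map.get(new GoodPoint(1, 2));   // equal keys share a hash, so the entry is found
    }

    public static void main(String[] args) {
        System.out.println(lookupBroken());
        System.out.println(lookupGood());
    }
}
```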

Question 12

How can a bad hashCode degrade HashMap to O(n)?

Accepted Answer

If `hashCode()` returns the same value for every key (e.g. a hard-coded `return 1;`), every entry hashes to one bucket. That single bucket becomes one giant chain, so `get`/`put` must scan it linearly: O(n) per operation, defeating the whole point of a hash map. Treeification softens but doesn't fix this: the bucket becomes a red-black tree, improving to O(log n), but only if the keys are also `Comparable`; otherwise lookups inside the tree can still degrade toward a linear scan. A well-distributed `hashCode()` is the real cure.

Question 13

Can a HashMap store null keys and null values?

Accepted Answer

Yes, `HashMap` allows one null key and any number of null values. The null key is special-cased to hash `0`, so it always sits in bucket 0. The ambiguity of a `null` return is the catch: `get()` returns `null` both for an absent key and a key mapped to `null`. Use **`containsKey()`** to tell them apart. Contrast with `Hashtable` and `ConcurrentHashMap`, which forbid nulls entirely.
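A short sketch of the null rules, assuming nothing beyond the standard `java.util` classes. `NullDemo` and `rejectsNullKey` are names invented for this example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NullDemo {
    // True if the given map implementation refuses a null key.
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.put(null, "x");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "null-key value");
        map.put("k", null);

        System.out.println(map.get(null));               // the null key is an ordinary entry
        System.out.println(map.get("k"));                // null: key present, value is null
        System.out.println(map.get("missing"));          // null: key absent
        System.out.println(map.containsKey("k"));        // true
        System.out.println(map.containsKey("missing"));  // false

        System.out.println(rejectsNullKey(new Hashtable<>()));          // true
        System.out.println(rejectsNullKey(new ConcurrentHashMap<>()));  // true
    }
}
```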

Question 14

What are fail-fast iterators and ConcurrentModificationException?

Accepted Answer

`HashMap`'s iterators are fail-fast: they track a `modCount` and throw **`ConcurrentModificationException`** if the map is structurally modified (add or remove) during iteration through any path other than the iterator itself. Fail-fast is a **best-effort bug detector**, not a guarantee: it's specified as "throws on a best-effort basis," so never write logic that depends on the exception. To mutate while iterating, use `Iterator.remove()`, collect-then-remove, or `removeIf()` on one of the map's collection views.
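The sketch below contrasts the unsafe pattern with two safe ones. `FailFastDemo` and its helper methods are hypothetical. The exception in the first method is what this small single-threaded example produces in practice, but as noted above it is best-effort behavior and should not be relied on.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.put("d", 4);
        return map;
    }

    // Removing through the map itself while a for-each iterator is active.
    static boolean unsafeRemoveThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                if (map.get(key) % 2 == 0) {
                    map.remove(key); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe: remove through the iterator, which keeps its expected modCount in sync.
    static Map<String, Integer> removeWithIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) {
                it.remove();
            }
        }
        return map;
    }

    // Safe: removeIf on a collection view writes through to the map.
    static Map<String, Integer> removeWithRemoveIf() {
        Map<String, Integer> map = sample();
        map.values().removeIf(v -> v % 2 == 0);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(unsafeRemoveThrows());
        System.out.println(removeWithIterator());
        System.out.println(removeWithRemoveIf());
    }
}
```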

Question 15

Why is HashMap not thread-safe, and what was the infinite-loop bug?

Accepted Answer

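`HashMap` has no synchronization, so concurrent `put`s can interleave and corrupt the table: lost updates, wrong `size`, or a thread seeing a half-built state. The most infamous case was an infinite loop on resize in pre-Java-8 JDKs: the old resize reversed each chain as it moved entries, so two threads resizing at once could link nodes into a cycle, and a later lookup in that bucket would spin forever. Java 8 changed resize to preserve chain order (the low/high split), which removed that specific cycle, but `HashMap` is still not thread-safe and can lose data or throw under concurrency. For shared maps use **`ConcurrentHashMap`**; `Collections.synchronizedMap()` works but locks the whole map.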
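As a sketch of the recommended alternative, the example below has several threads increment one shared counter through `ConcurrentHashMap.merge`, which performs the read-modify-write atomically. The class name, the `"hits"` key, and the thread counts are arbitrary choices for illustration. The same loop against a plain `HashMap` could lose updates.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrentCountDemo {
    // Each thread adds 1 to the same key incrementsPerThread times.
    static int countConcurrently(int threads, int incrementsPerThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        // With ConcurrentHashMap no increments are lost: 4 threads x 10000 each.
        System.out.println(countConcurrently(4, 10_000));
    }
}
```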

Question 16

Why should HashMap keys be immutable?

Accepted Answer

A key's hash is computed when it's inserted and used to pick its bucket. If you then mutate a field that affects `hashCode()`/`equals()`, the key's new hash points to a different bucket than where it's actually stored, so the map can no longer find it. The entry becomes a "ghost": present in the map but unreachable by `get` or `remove`, and possibly duplicated on re-insert. Prefer immutable keys (`String`, `Integer`, records, frozen value objects) so the hash can never drift.
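A small sketch of the ghost-entry problem, using a mutable `ArrayList` as the key purely for illustration (`MutableKeyDemo` is a made-up name). After the list is mutated, neither its new contents nor its original contents can reach the entry, even though the map still reports one element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MutableKeyDemo {
    static Map<List<Integer>, String> mapWithMutatedKey() {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>(Arrays.asList(1));
        map.put(key, "value");
        key.add(2); // changes hashCode() and equals() of a key already stored in the map
        return map;
    }

    public static void main(String[] args) {
        Map<List<Integer>, String> map = mapWithMutatedKey();
        System.out.println(map.size());                    // the entry is still counted
        System.out.println(map.get(Arrays.asList(1, 2)));  // new contents: different hash, not found
        System.out.println(map.get(Arrays.asList(1)));     // old contents: hash matches, equals() fails
    }
}
```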

Question 17

What is the difference between HashMap, LinkedHashMap and TreeMap?

Accepted Answer

All implement `Map`, but they differ in ordering and complexity:

| Map | Ordering | get/put | Backing structure |
| --- | -------- | ------- | ----------------- |
| `HashMap` | none (arbitrary) | O(1) avg | hash table |
| `LinkedHashMap` | insertion (or access) order | O(1) avg | hash table + linked list |
| `TreeMap` | sorted by key | O(log n) | red-black tree |

Pick `HashMap` by default, `LinkedHashMap` when you need stable iteration order or an LRU cache (access-order mode + `removeEldestEntry()`), and `TreeMap` when you need sorted keys or range queries (`headMap()`, `tailMap()`, `subMap()`).
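The LRU and range-query use cases can be sketched in a few lines. `OrderedMapsDemo` and `lruCache` are hypothetical names; the `LinkedHashMap` constructor with `accessOrder = true` and the `removeEldestEntry` hook are the standard API the answer refers to.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class OrderedMapsDemo {
    // LRU sketch: an access-ordered LinkedHashMap that evicts its eldest entry past maxEntries.
    static <K, V> Map<K, V> lruCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touching "a" makes "b" the least recently used entry
        cache.put("c", 3);  // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());

        TreeMap<Integer, String> sorted = new TreeMap<>();
        sorted.put(30, "thirty");
        sorted.put(10, "ten");
        sorted.put(20, "twenty");
        System.out.println(sorted.keySet());    // keys come back in ascending order
        System.out.println(sorted.headMap(20)); // keys strictly below 20
        System.out.println(sorted.tailMap(20)); // keys from 20 upward
    }
}
```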

Question 18

What is the difference between HashMap, Hashtable and ConcurrentHashMap?

Accepted Answer

| Feature | `HashMap` | `Hashtable` | `ConcurrentHashMap` |
| ------- | --------- | ----------- | ------------------- |
| Thread-safe | no | yes (whole-method `synchronized`) | yes (fine-grained) |
| Null key/values | 1 null key, null values | none | none |
| Locking | n/a | one lock for the whole table | per-bucket / CAS |
| Status | preferred single-thread | legacy | preferred concurrent |

`Hashtable` is a legacy class that locks every method, so it serializes all access: slow and effectively retired. **`ConcurrentHashMap`** locks at the bucket/bin level (with CAS for hot paths), so many threads can write to different buckets in parallel. Its iterators are **weakly consistent** (no `ConcurrentModificationException`).

Question 19

How does ConcurrentHashMap achieve thread safety without a global lock?

Accepted Answer

Modern `ConcurrentHashMap` (Java 8+) abandoned the old segment design for the same `Node[]` table as `HashMap`, plus per-bin locking and CAS. An empty bin is filled with a lock-free compare-and-swap; a non-empty bin is updated under a `synchronized` block on the bin's first node, so contention is limited to the keys colliding in that one bin. Reads are non-blocking (fields are `volatile`), so `get()` never locks. Resizes are cooperative: multiple threads help transfer bins concurrently. This is why it scales far better than `Hashtable`'s single lock while still being fully correct.

Question 20

What happens when treeified keys are not Comparable?

Accepted Answer

A treeified bucket is a red-black tree, which orders nodes by hash and, for equal hashes, by the key's `compareTo()`. If the keys don't implement **`Comparable`**, `HashMap` can't sort them, so insertion falls back to a tie-breaking routine (`tieBreakOrder()`, which compares class names and then identity hash codes) to keep the tree balanced; searches then effectively scan rather than binary-search within ties. The practical takeaway: treeification still bounds worst case better than a plain list, but you only get the full **O(log n)** lookup benefit when colliding keys are mutually `Comparable`. A good `hashCode()` (so buckets never treeify) is still the best defense.

Question 21

How is HashSet implemented on top of HashMap?

Accepted Answer

`HashSet` is a thin wrapper around a `HashMap`: it stores each element as a key, mapping every key to a single shared dummy value (a constant `PRESENT` object). So all of `HashSet`'s behavior (hashing, buckets, load factor, treeification) is just `HashMap`'s. This is why `HashSet` has the same O(1) average operations, the same null-element-allowed rule, and the same requirement that elements have a proper `hashCode()`/`equals()`. `LinkedHashSet` and `TreeSet` likewise wrap `LinkedHashMap` and `TreeMap`.
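The wrapper idea fits in a dozen lines. `TinyHashSet` below is a hypothetical, stripped-down illustration of the pattern (element as key, shared dummy object as value), not the JDK class, which implements the full `Set` interface.

```java
import java.util.HashMap;

final class TinyHashSet<E> {
    // Single shared placeholder stored as the value for every element.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null when the key was absent, so that means "newly added".
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // remove returns the old value, which is PRESENT only if the element existed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Because the dummy value is never `null`, a `null` result from `put` unambiguously means the element was not there before, which is what lets `add` report whether the set changed.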

Question 22

How should you size a HashMap to avoid resizing?

Accepted Answer

Each resize is an O(n) rehash, so for a map you'll fill to a known size, presize it. Because resize triggers at `capacity × loadFactor`, request an initial capacity of about `expected / 0.75 + 1` so the table never crosses its threshold during loading. Passing the raw expected size (for example `new HashMap<>(1000)`) is a common mistake: it rounds to capacity 1024 but still resizes at 768 entries. Java 19's `HashMap.newHashMap(n)` factory expresses "I will hold n entries" directly.
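The arithmetic behind that advice, worked for 1000 expected entries. All three helper names are hypothetical, the load factor is assumed to be the default 0.75, and `tableLengthFor` is an illustrative power-of-two rounding rather than the JDK's internal method.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizeDemo {
    // Initial capacity to request so that expectedEntries fits under the threshold.
    static int initialCapacityFor(int expectedEntries) {
        return (int) (expectedEntries / 0.75f) + 1;
    }

    // Table length the map would actually allocate: next power of two (assumes cap <= 2^30).
    static int tableLengthFor(int initialCapacity) {
        return (initialCapacity <= 1) ? 1 : Integer.highestOneBit(initialCapacity - 1) << 1;
    }

    // Entry count above which a table of this length resizes at load factor 0.75.
    static int threshold(int tableLength) {
        return (int) (tableLength * 0.75f);
    }

    public static void main(String[] args) {
        int expected = 1000;

        // Naive: request exactly the expected size.
        int naiveTable = tableLengthFor(expected);
        System.out.println(naiveTable + " buckets, resizes above " + threshold(naiveTable));

        // Presized: divide by the load factor first.
        int requested = initialCapacityFor(expected);
        int presizedTable = tableLengthFor(requested);
        System.out.println(requested + " requested, " + presizedTable
                + " buckets, resizes above " + threshold(presizedTable));

        Map<Integer, String> map = new HashMap<>(requested); // no resize while loading 1000 entries
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }
        System.out.println(map.size());
    }
}
```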

HashMap Internals Interview Questions & Answers

How does a HashMap work internally?

What is the spread (perturbation) function and why does HashMap apply it?

How does HashMap convert a hash into a bucket index?

Why must HashMap capacity always be a power of two?

Walk through what happens on a put() call.

Walk through what happens on a get() call.

How does HashMap handle hash collisions?

What is treeification and when does it happen?

What are the default initial capacity and load factor, and what do they mean?

What happens during a HashMap resize?

What is the hashCode/equals contract and why must HashMap keys honor it?

How can a bad hashCode degrade HashMap to O(n)?

Can a HashMap store null keys and null values?

What are fail-fast iterators and ConcurrentModificationException?

Why is HashMap not thread-safe, and what was the infinite-loop bug?

Why should HashMap keys be immutable?

What is the difference between HashMap, LinkedHashMap and TreeMap?

What is the difference between HashMap, Hashtable and ConcurrentHashMap?

How does ConcurrentHashMap achieve thread safety without a global lock?

What happens when treeified keys are not Comparable?

How is HashSet implemented on top of HashMap?

How should you size a HashMap to avoid resizing?
