Skip to content

HashMap Internals Interview Questions & Answers

22 questions Updated 2026-06-20 Share:

Java HashMap internals interview questions — buckets and hashing, load factor and resizing, treeification, collision handling, hashCode/equals contract, and HashMap vs ConcurrentHashMap vs Hashtable.

Read the in-depth guideHow Java HashMap Works Internally — Buckets, Hashing, Resizing & Treeification(opens in new tab)
22 of 22

A HashMap stores entries in an array of buckets (Node<K,V>[] table). Each bucket holds a chain of Node objects, where a Node packs the key's hash, the key, the value, and a next pointer to the following node in the same bucket.

static class Node<K,V> {
  final int hash;   // cached spread hash of the key
  final K key;
  V value;
  Node<K,V> next;   // chains entries in the same bucket
}

On put, the key's hash picks a bucket index; the entry is stored there (or appended to the chain on a collision). On get, the same index is computed and the chain is scanned with equals to find the matching key. With a good hash, operations are average O(1); the array is resized as it fills to keep chains short.

The bucket index is computed as (n - 1) & hash, which only keeps the low bits of the hash (since n is a power of two). If two keys differ only in their high bits, they'd collide. To mix high bits down into the low ones, HashMap applies a spread function before indexing.

static final int hash(Object key) {
  int h;
  return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

XORing the hash with its own upper 16 bits is cheap (one shift, one XOR) yet meaningfully reduces collisions for hashes whose entropy lives in the high bits. Note a null key is mapped to hash 0, which is why it always lands in bucket 0.

It uses index = (n - 1) & hash, where n is the table length. Because n is always a power of two, n - 1 is a mask of all-ones in the low bits, so the bitwise & is equivalent to hash % n but far faster than a modulo.

int n = 16;              // table length, a power of two
int idx = (n - 1) & hash; // 15 & hash  ->  keeps the low 4 bits

This is the whole reason capacity must be a power of two: the & trick only distributes entries evenly across all buckets when n - 1 is a clean low-bit mask. A non-power-of-two n would leave some buckets unreachable.

Two reasons, both about the (n - 1) & hash indexing. First, speed: a bitwise AND replaces an expensive modulo. Second, distribution: when n is a power of two, n - 1 is 0b...0111…1, so the AND keeps a contiguous block of low bits and every bucket is reachable. If n weren't a power of two, the mask would have gaps and some buckets could never be hit.

// HashMap rounds any requested capacity up to the next power of two
new HashMap<>(10);  // actual table capacity becomes 16
new HashMap<>(20);  // becomes 32

The constructor runs tableSizeFor to round your requested initial capacity up to the nearest power of two, so you can never actually create a non-power-of-two table.

  1. Compute the spread hash of the key.
  2. Compute the bucket index (n - 1) & hash.
  3. If the bucket is empty, drop the new node in.
  4. If occupied, scan the chain: a node matching by hash and equals means an existing key — overwrite its value and return the old one.
  5. Otherwise append a new node; if the chain length crosses 8 (and the table is ≥ 64), treeify the bucket.
  6. Increment size; if size > threshold (capacity × load factor), resize.
map.put("a", 1);  // new bucket
map.put("a", 2);  // same key -> value replaced, returns 1
map.put("b", 3);  // collision? append to chain

The match step is why equals is essential: same bucket does not mean same key — the chain must be walked and compared.

get mirrors put's lookup half: spread the key's hash, compute the index, then scan the bucket. For each node it first compares the cached hash (a cheap int compare) and only then calls equals — the hash check short-circuits most mismatches.

V get(Object key) {
  // hash -> index -> walk chain
  // return node where node.hash == h && (k == node.key || k.equals(node.key))
}

In a linked-list bucket this is O(chain length); in a treeified bucket it's O(log n). A missing key returns null — which is indistinguishable from a key mapped to null, so use containsKey when that difference matters.

When two keys map to the same bucket (a collision), HashMap chains them. Originally every bucket is a singly linked list of Nodes. As of Java 8, once a single bucket's chain grows past a threshold the bucket is converted to a red-black tree so lookups within it stay logarithmic rather than linear.

bucket[5] -> Node(k1) -> Node(k2) -> Node(k3)   // linked list

// after treeification:
bucket[5] -> (balanced red-black tree of TreeNodes)

So a collision degrades the bucket from O(1) toward O(log n) at worst (with a reasonable hashCode), not toward O(n). Distinct keys can collide even with perfect hashCodes simply because many hashes fold into the same small index.

Treeification converts a bucket from a linked list to a red-black tree when it gets too long, bounding worst-case lookups at O(log n) instead of O(n). Two conditions must both hold:

Constant Value Meaning
TREEIFY_THRESHOLD 8 chain length that triggers treeify
MIN_TREEIFY_CAPACITY 64 table must be at least this big
UNTREEIFY_THRESHOLD 6 tree shrinks back to a list at/below this
// if a bucket reaches 8 nodes but table < 64,
// HashMap resizes instead of treeifying

If the table is smaller than 64, a long chain usually means the table is just too small, so HashMap resizes rather than treeifying. The gap between 8 and 6 (hysteresis) avoids flip-flopping between tree and list on repeated add/remove.

The defaults are initial capacity 16 and load factor 0.75. The load factor is the fullness fraction at which the table grows: the threshold = capacity × load factor, so a default map resizes when it holds more than 16 × 0.75 = 12 entries.

new HashMap<>();            // capacity 16, threshold 12
new HashMap<>(32, 0.5f);    // capacity 32, threshold 16

0.75 is a deliberate time/space trade-off: lower factors waste memory but reduce collisions; higher factors save memory but lengthen chains. If you know the final size, presize with new HashMap<>(expected / 0.75 + 1) to avoid repeated resizes.

When size exceeds the threshold, HashMap doubles the capacity (a new power-of-two table) and rehashes every entry into the larger table. Doubling keeps the power-of-two invariant so (n - 1) & hash stays valid.

// Java 8 split trick: with capacity doubled, an entry either stays
// at index i, or moves to index i + oldCapacity — decided by one bit:
if ((node.hash & oldCapacity) == 0)  // stays in "low" bucket
else                                  // moves to "high" bucket

Because only the single new high bit of the hash decides where an entry goes, Java 8 splits each old bucket's chain into exactly two ordered sub-chains without recomputing indexes from scratch. Resizing is O(n) and rebuilds the table, which is why presizing matters for large maps.

The contract: if a.equals(b) then a.hashCode() == b.hashCode() must hold (the reverse need not — unequal objects may share a hash). equals must also be reflexive, symmetric, transitive, and consistent.

class Point {
  int x, y;
  @Override public boolean equals(Object o) { /* compare x,y */ }
  @Override public int hashCode() { return Objects.hash(x, y); }
}

HashMap relies on this directly: it uses hashCode to find the bucket and equals to find the key within it. Override only one and a key you stored becomes unfindable — get lands in the wrong bucket (or the right bucket but fails the equals check) and returns null. Always override the two together.

If hashCode returns the same value for every key (e.g. return 42; or return 0;), every entry hashes to one bucket. That single bucket becomes one giant chain, so get/put must scan it linearly — O(n) per operation, defeating the whole point of a hash map.

class Bad {
  @Override public int hashCode() { return 1; } // all collide!
}
// 10,000 Bad keys -> one bucket of 10,000 nodes -> O(n) lookups

Treeification softens but doesn't fix this: the bucket becomes a red-black tree, improving to O(log n) — and only if the keys are also Comparable, otherwise it falls back to a list. A well-distributed hashCode is the real cure.

Yes — HashMap allows one null key and any number of null values. The null key is special-cased to hash 0, so it always sits in bucket 0.

Map<String, String> m = new HashMap<>();
m.put(null, "x");   // legal — single null key
m.put("a", null);   // legal — null value
m.get("missing");   // null — but is the key absent or mapped to null?
System.out.println(m.containsKey("a")); // true, even though value is null

The ambiguity of a null return is the catch: get returns null both for an absent key and a key mapped to null. Use containsKey to tell them apart. Contrast with Hashtable and ConcurrentHashMap, which forbid nulls entirely.

HashMap's iterators are fail-fast: they track a modCount and throw ConcurrentModificationException if the map is structurally modified (add or remove) during iteration through any path other than the iterator itself.

for (String k : map.keySet()) {
  if (k.startsWith("x")) map.remove(k);  // throws CME
}

// safe: remove through the iterator
Iterator<String> it = map.keySet().iterator();
while (it.hasNext()) { if (it.next().startsWith("x")) it.remove(); }

Fail-fast is a best-effort bug detector, not a guarantee — it's specified as "throws on a best-effort basis," so never write logic that depends on the exception. To mutate while iterating, use iterator.remove(), collect-then- remove, or removeIf.

HashMap has no synchronization, so concurrent puts can interleave and corrupt the table — lost updates, wrong size, or a thread seeing a half-built state. The most infamous case was an infinite loop on resize in pre-Java-8 JDKs.

old resize() reversed each bucket's chain while transferring it;
two threads resizing at once could splice A -> B -> A into a cycle,
so a later get() would spin forever at 100% CPU.

Java 8 changed resize to preserve chain order (the low/high split), which removed that specific cycle — but HashMap is still not thread-safe and can lose data or throw under concurrency. For shared maps use ConcurrentHashMap; Collections.synchronizedMap works but locks the whole map.

A key's hash is computed when it's inserted and used to pick its bucket. If you then mutate a field that affects hashCode/equals, the key's new hash points to a different bucket than where it's actually stored — so the map can no longer find it.

Set<List<Integer>> s = new HashSet<>();
List<Integer> key = new ArrayList<>(List.of(1, 2));
s.add(key);
key.add(3);                     // mutates the key's hashCode!
System.out.println(s.contains(key)); // false — lost in the wrong bucket

The entry becomes a "ghost": present in the map but unreachable by get or contains, and possibly duplicated on re-insert. Prefer immutable keys (String, Integer, records, frozen value objects) so the hash can never drift.

All implement Map, but they differ in ordering and complexity:

Map Ordering get/put Backing structure
HashMap none (arbitrary) O(1) avg hash table
LinkedHashMap insertion (or access) order O(1) avg hash table + linked list
TreeMap sorted by key O(log n) red-black tree
new HashMap<>();        // fastest, no order guarantee
new LinkedHashMap<>();  // predictable iteration order
new TreeMap<>();        // keys come out sorted; needs Comparable/Comparator

Pick HashMap by default, LinkedHashMap when you need stable iteration order or an LRU cache (access-order mode + removeEldestEntry), and TreeMap when you need sorted keys or range queries (firstKey, headMap, subMap).

Feature HashMap Hashtable ConcurrentHashMap
Thread-safe no yes (whole-method synchronized) yes (fine-grained)
Null key/values 1 null key, null values none none
Locking one lock for the whole table per-bucket / CAS
Status preferred single-thread legacy preferred concurrent
Map<String,Integer> m = new ConcurrentHashMap<>();
m.compute("k", (k, v) -> v == null ? 1 : v + 1); // atomic, no external lock

Hashtable is a legacy class that locks every method, so it serializes all access — slow and effectively retired. ConcurrentHashMap locks at the bucket/bin level (with CAS for hot paths), so many threads can write to different buckets in parallel. Its iterators are weakly consistent (no ConcurrentModificationException).

Modern ConcurrentHashMap (Java 8+) abandoned the old segment design for the same Node[] table as HashMap, plus per-bin locking and CAS. An empty bin is filled with a lock-free compare-and-swap; a non-empty bin is updated under a synchronized block on the bin's first node — so contention is limited to the keys colliding in that one bin.

ConcurrentHashMap<String,Long> counts = new ConcurrentHashMap<>();
counts.merge("hit", 1L, Long::sum);   // atomic increment, thread-safe

Reads are non-blocking (fields are volatile), so get never locks. Resizes are cooperative: multiple threads help transfer bins concurrently. This is why it scales far better than Hashtable's single lock while still being fully correct.

A treeified bucket is a red-black tree, which normally orders nodes by the key's compareTo. If the keys don't implement Comparable, HashMap can't sort them, so it falls back to comparing identity hash codes and uses a tie-breaking routine (tieBreakOrder) to keep the tree balanced — searches then effectively scan rather than binary-search within ties.

// keys with equal hash but no Comparable order:
// tree falls back to System.identityHashCode() ordering for placement

The practical takeaway: treeification still bounds worst case better than a plain list, but you only get the full O(log n) lookup benefit when colliding keys are mutually Comparable. A good hashCode (so buckets never treeify) is still the best defense.

HashSet is a thin wrapper around a HashMap: it stores each element as a key, mapping every key to a single shared dummy value (a constant PRESENT object). So all of HashSet's behavior — hashing, buckets, load factor, treeification — is just HashMap's.

private static final Object PRESENT = new Object();
public boolean add(E e) {
  return map.put(e, PRESENT) == null;  // null means key was new
}

This is why HashSet has the same O(1) average operations, the same null-element-allowed rule, and the same requirement that elements have a proper hashCode/equals. LinkedHashSet and TreeSet likewise wrap LinkedHashMap and TreeMap.

Each resize is an O(n) rehash, so for a map you'll fill to a known size, presize it. Because resize triggers at capacity × 0.75, request an initial capacity of about expected / 0.75 + 1 so the table never crosses its threshold during loading.

int expected = 1000;
Map<String,Integer> m = new HashMap<>((int)(expected / 0.75f) + 1);

// Java 19+: clearer intent, does the math for you
Map<String,Integer> m2 = HashMap.newHashMap(expected);

Passing the raw expected size (new HashMap<>(1000)) is a common mistake — it rounds to capacity 1024 but still resizes at 768 entries. Java 19's HashMap.newHashMap(n) factory expresses "I will hold n entries" directly.

More ways to practice

The self-quiz is live. Get notified when mock interviews and new question packs drop.

or
Join our WhatsApp Channel