Skip to content

Set Implementations Interview Questions & Answers

22 questions Updated 2026-06-20 Share:

Java Set interview questions — HashSet vs LinkedHashSet vs TreeSet, how they work internally, ordering and complexity, NavigableSet/SortedSet, EnumSet and CopyOnWriteArraySet, and set operations.

Read the in-depth guideJava Set Explained — HashSet vs LinkedHashSet vs TreeSet(opens in new tab)
22 of 22

A Set is a Collection that holds no duplicate elements — adding an element already present is a no-op and add returns false. There's no index and (for the general contract) no positional access; you check membership with contains, which is the operation sets are built to make fast.

Set<String> tags = new HashSet<>();
tags.add("java");   // true  — added
tags.add("java");   // false — duplicate, ignored
tags.size();        // 1
tags.contains("java"); // true

"Duplicate" is defined by equals (and hashCode for hash-based sets, or compareTo/Comparator for sorted sets) — not by reference identity. That definition of equality is the heart of every Set question.

It depends on the implementation, and this is the single most important Set detail in interviews:

Set Duplicate test
HashSet, LinkedHashSet hashCode() to find the bucket, then equals() within it
TreeSet compareTo() (or the supplied Comparator) — equals is ignored
// Two elements are "equal" in a TreeSet when compare returns 0:
Set<String> ci = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
ci.add("Java");
ci.add("JAVA");   // compare == 0 -> treated as duplicate, not added
ci.size();        // 1

The trap: a TreeSet uses comparison, not equals, so an element can be dropped as a "duplicate" even though equals says it's different (and vice versa) — keep your compareTo consistent with equals to avoid surprises.

A HashSet is just a thin wrapper around a HashMap — each element is stored as a key, and all keys map to the same dummy PRESENT value object. So everything you know about HashMap (buckets, load factor, treeified bins) applies directly.

// Conceptually, inside HashSet:
private transient HashMap<E, Object> map;
private static final Object PRESENT = new Object();

public boolean add(E e) {
    return map.put(e, PRESENT) == null; // true if key was new
}

That's why HashSet gives average O(1) add/contains/remove, has no ordering, and depends entirely on good hashCode/equals. If hashes collide badly, performance degrades toward O(n) (or O(log n) once a bucket treeifies).

LinkedHashSet extends HashSet but is backed by a LinkedHashMap, so it keeps a doubly-linked list threading through the entries. The effect: iteration follows insertion order while keeping HashSet's O(1) operations.

Set<String> hs = new HashSet<>();
hs.add("c"); hs.add("a"); hs.add("b");
// iteration order: unspecified (e.g. a, b, c)

Set<String> lhs = new LinkedHashSet<>();
lhs.add("c"); lhs.add("a"); lhs.add("b");
// iteration order: c, a, b  — exactly as inserted

Cost is a slightly larger memory footprint (the extra prev/next links). Reach for it whenever you need dedup but predictable, reproducible ordering — e.g. removing duplicates from a list without scrambling it.

A TreeSet is backed by a TreeMap, a red-black tree (self-balancing binary search tree). Elements are kept in sorted order — natural ordering via Comparable, or a Comparator you pass to the constructor — and core operations are O(log n) rather than O(1).

Set<Integer> nums = new TreeSet<>();
nums.add(5); nums.add(1); nums.add(3);
System.out.println(nums);  // [1, 3, 5] — always sorted

// custom order:
Set<String> byLen = new TreeSet<>(Comparator.comparingInt(String::length));

Because it's a tree, you get range and neighbour queries for free (first, last, floor, ceiling, headSet…). The trade-off vs HashSet: slower point lookups but ordered iteration and rich navigation.

Elements must be mutually comparable — either they implement Comparable (natural ordering) or you supply a Comparator. Without one of those, the very first add of a non-comparable type throws ClassCastException at runtime.

class Point { int x, y; }              // no Comparable, no Comparator
Set<Point> s = new TreeSet<>();
s.add(new Point());                    // ClassCastException at runtime

// fix: give it an ordering
Set<Point> ok = new TreeSet<>(Comparator.comparingInt(p -> p.x));
ok.add(new Point());                   // fine

Note the failure is not at construction time — an empty TreeSet is happy; it only blows up when it needs to compare an element it can't order.

The classic side-by-side every interviewer expects:

Feature HashSet LinkedHashSet TreeSet
Backed by HashMap LinkedHashMap TreeMap (red-black tree)
Ordering none insertion order sorted (natural/comparator)
add/contains/remove O(1) avg O(1) avg O(log n)
null allowed one null one null no (NPE with natural order)
Needs equals/hashCode yes yes uses compareTo/Comparator
Extra navigation no no yes (first/floor/subSet…)
// pick by need:
new HashSet<>();        // fastest, order doesn't matter
new LinkedHashSet<>();  // dedup, keep insertion order
new TreeSet<>();        // need sorted iteration or range queries

None of the three is thread-safe. Default to HashSet; upgrade only when you specifically need order or sorting.

SortedSet is the older interface guaranteeing sorted iteration and offering first(), last(), headSet, tailSet, subSet, and comparator(). NavigableSet (Java 6+) extends it with neighbour lookups and reverse-order views. TreeSet implements both.

NavigableSet<Integer> s = new TreeSet<>(List.of(10, 20, 30, 40));
s.first();           // 10        (SortedSet)
s.last();            // 40
s.floor(25);         // 20  <= 25 (NavigableSet)
s.ceiling(25);       // 30  >= 25
s.descendingSet();   // [40, 30, 20, 10]
s.pollFirst();       // 10, and removes it

In practice you usually just declare the variable as NavigableSet (or TreeSet) to get the full method set — SortedSet alone lacks the handy floor/ceiling/higher/lower queries.

They find the closest element to a target. The distinction is inclusive vs exclusive of the target itself:

Method Returns
floor(e) greatest element ≤ e
ceiling(e) smallest element ≥ e
lower(e) greatest element strictly < e
higher(e) smallest element strictly > e
NavigableSet<Integer> s = new TreeSet<>(List.of(10, 20, 30));
s.floor(20);   // 20  (<=)
s.lower(20);   // 10  (strictly <)
s.ceiling(20); // 20  (>=)
s.higher(20);  // 30  (strictly >)
s.floor(5);    // null — nothing <= 5

All return null when no such element exists, so callers must null-check. These run in O(log n) and are exactly why you reach for a TreeSet over a HashSet for "nearest neighbour" or range problems.

They return range views of a sorted set — not copies. headSet(to) is everything below to, tailSet(from) is everything from from up, and subSet(from, to) is the half-open range [from, to). The NavigableSet overloads let you toggle the inclusive/exclusive bounds.

NavigableSet<Integer> s = new TreeSet<>(List.of(10, 20, 30, 40));
s.headSet(30);            // [10, 20]      (exclusive end)
s.tailSet(20);            // [20, 30, 40]  (inclusive start)
s.subSet(20, 40);         // [20, 30]      (half-open)
s.subSet(20, true, 40, true); // [20, 30, 40]

They are backed by the original set: changes flow both ways, and adding an element outside the view's bounds throws IllegalArgumentException. Wrap in a new TreeSet<>(view) if you need an independent snapshot.

Use descendingSet() (a reverse-ordered view) or descendingIterator(). Both walk the tree from largest to smallest without copying or re-sorting.

NavigableSet<Integer> s = new TreeSet<>(List.of(1, 2, 3));
for (int n : s.descendingSet()) {
    System.out.print(n + " ");   // 3 2 1
}

Iterator<Integer> it = s.descendingIterator();
while (it.hasNext()) { /* 3, then 2, then 1 */ }

Because descendingSet() is a live view, mutations on it affect the original set and vice versa. It's the idiomatic alternative to constructing a TreeSet with Comparator.reverseOrder() when you only need reverse iteration occasionally.

No — with natural ordering, adding null throws NullPointerException, because the set must call compareTo on the element and null.compareTo(...) can't happen. (A HashSet/LinkedHashSet, by contrast, allows a single null.)

Set<String> tree = new TreeSet<>();
tree.add(null);   // NullPointerException

Set<String> hash = new HashSet<>();
hash.add(null);   // fine — exactly one null allowed

Even a null-tolerant Comparator (e.g. Comparator.nullsFirst(...)) only helps for non-first elements; the first add(null) on an empty natural-order TreeSet still throws. Treat TreeSet as null-hostile.

EnumSet is a specialized Set for enum types only. Internally it's a bit vector — a single long (or array of longs for big enums) where each bit represents one constant. That makes operations near-instant and the memory footprint tiny.

enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

EnumSet<Day> work = EnumSet.range(Day.MON, Day.FRI);
EnumSet<Day> weekend = EnumSet.complementOf(work); // [SAT, SUN]
EnumSet<Day> none = EnumSet.noneOf(Day.class);
EnumSet<Day> all  = EnumSet.allOf(Day.class);

Iteration follows the enum's declaration order. It's created via factory methods (of, range, allOf, noneOf, complementOf) rather than new. Whenever your set elements are enum constants, EnumSet beats HashSet on every axis — speed, memory, and clarity.

The standard HashSet/TreeSet/LinkedHashSet are not synchronized. Options, from cheap to concurrent:

  • Collections.synchronizedSet(set) — wraps every method in a lock; you must still manually synchronize when iterating.
  • ConcurrentHashMap.newKeySet() — a concurrent hash set backed by ConcurrentHashMap; high-throughput, no external locking, weakly consistent iteration.
  • CopyOnWriteArraySet — copies the backing array on every write; great for tiny, read-heavy sets, terrible for write-heavy ones.
Set<String> sync = Collections.synchronizedSet(new HashSet<>());
synchronized (sync) {                // iteration needs the lock
    for (String s : sync) { /* ... */ }
}

Set<String> concurrent = ConcurrentHashMap.newKeySet(); // preferred default

For most concurrent code, ConcurrentHashMap.newKeySet() is the right choice — it scales far better than a globally locked wrapper.

CopyOnWriteArraySet (backed by a CopyOnWriteArrayList) makes a fresh copy of the whole array on every mutation. Reads and iteration are lock-free and see a stable snapshot, but each add/remove is O(n).

Set<Listener> listeners = new CopyOnWriteArraySet<>();
listeners.add(a);                 // copies the backing array
for (Listener l : listeners) {    // iterates a snapshot — no CME, ever
    l.onEvent();                  // safe even if another thread mutates
}

It shines for small, read-mostly, rarely-mutated sets — the textbook case being event-listener registries. Because it scans the array, contains is O(n), so it's a poor fit for large or write-heavy sets, where ConcurrentHashMap.newKeySet() wins.

Copy one set and addAll the other — duplicates are dropped automatically by the Set contract.

Set<Integer> a = new HashSet<>(List.of(1, 2, 3));
Set<Integer> b = new HashSet<>(List.of(3, 4, 5));

Set<Integer> union = new HashSet<>(a);
union.addAll(b);          // {1, 2, 3, 4, 5}

addAll is the union operator. Always copy first (new HashSet<>(a)) so you don't mutate the original a — a common bug is calling a.addAll(b) and silently changing the caller's set.

retainAll keeps only the elements present in both sets.

Set<Integer> a = new HashSet<>(List.of(1, 2, 3));
Set<Integer> b = new HashSet<>(List.of(3, 4, 5));

Set<Integer> intersection = new HashSet<>(a);
intersection.retainAll(b);   // {3}

retainAll is the intersection operator. For performance, copy and iterate the smaller set against the larger one — the cost is roughly O(min(a, b)) lookups, and lookups are O(1) on a HashSet.

removeAll strips out every element that also appears in the other set, leaving the difference (elements in a but not b).

Set<Integer> a = new HashSet<>(List.of(1, 2, 3));
Set<Integer> b = new HashSet<>(List.of(3, 4, 5));

Set<Integer> diff = new HashSet<>(a);
diff.removeAll(b);    // {1, 2}

So the trio is: addAll = union, retainAll = intersection, removeAll = difference. A symmetric difference (in either but not both) is just union minus intersection, or two removeAlls combined.

A HashSet places each element in a bucket chosen by its hashCode() at insertion time. If you then mutate a field that hashCode/equals depend on, the element's hash changes but it stays in the old bucket — so the set can no longer find it.

Set<Point> set = new HashSet<>();
Point p = new Point(1, 1);
set.add(p);
p.x = 99;                  // mutated a field used by hashCode

set.contains(p);           // false — looks in the new bucket, finds nothing
// p is now a "ghost": present but unreachable; even remove() may miss it

That's why Set (and Map-key) elements should be immutable, or at least never mutated on their equals/hashCode fields while stored. It's the same reason String and the wrapper types — immutable — make perfect set elements.

Set.of(...) (Java 9+) returns a small, immutable set. Any mutating call (add, remove, clear) throws UnsupportedOperationException, and — unlike a normal add that silently ignores dups — passing a duplicate to the factory throws IllegalArgumentException.

Set<String> s = Set.of("a", "b", "c");
s.add("d");                    // UnsupportedOperationException

Set<String> dup = Set.of("a", "a");  // IllegalArgumentException at creation!

It also rejects null (NPE) and has an unspecified iteration order that can vary between runs. Use it for compact, never-changing constant sets; use LinkedHashSet/TreeSet when you need order or mutability.

Each implementation makes a different promise — knowing them prevents a class of "works on my machine" bugs:

Set Iteration order
HashSet no guarantee (can even change between JVM runs)
LinkedHashSet insertion order
TreeSet sorted (natural or comparator)
EnumSet enum declaration order
Set.of(...) unspecified, may vary per run
// never rely on HashSet order:
for (String s : new HashSet<>(List.of("a", "b", "c"))) { /* any order */ }

The rule of thumb: if your output or tests depend on order, don't use HashSet — choose LinkedHashSet for insertion order or TreeSet for sorted order.

HashSet.contains hashes the element, jumps straight to one bucket, and checks only the few items there — average O(1). List.contains has no index into its data, so it scans from the front comparing each element — O(n).

List<Integer> list = new ArrayList<>(/* 1_000_000 items */);
Set<Integer> set  = new HashSet<>(list);

list.contains(999_999);   // O(n) — walks ~a million elements
set.contains(999_999);    // O(1) — one bucket lookup

This is why a frequent pattern is to dump a list into a HashSet first when you'll do many membership checks. The rule of thumb: need fast "is it in there?" — use a Set, not a List.

More ways to practice

The self-quiz is live. Get notified when mock interviews and new question packs drop.

or
Join our WhatsApp Channel