Collectors & Grouping Interview Questions & Answers

22 questions. Updated 2026-06-20.

Java Collectors interview questions — collect and the Collectors factory, toList/toMap/toSet, groupingBy and partitioningBy, downstream collectors, joining, counting and summing, and writing a custom collector.


collect is a mutable reduction terminal operation: it folds the stream's elements into a mutable result container (a List, Map, StringBuilder, etc.) by repeatedly accumulating into it. It takes a Collector, and the java.util.stream.Collectors class is a factory of ready-made ones.

List<String> upper = names.stream()
    .map(String::toUpperCase)
    .collect(Collectors.toList());

Unlike reduce, which combines immutable values, collect mutates a container in place, which is far more efficient for building collections and strings. Reach for the Collectors factory first — only write a custom collector when nothing there fits.

A Collector<T, A, R> is defined by four functions (T = input element, A = mutable accumulation type, R = final result):

| Component | Role |
| --- | --- |
| supplier | creates a new empty mutable container (A) |
| accumulator | folds one element into the container |
| combiner | merges two partial containers (used in parallel) |
| finisher | transforms the container A into the result R |

// toList conceptually: supplier=ArrayList::new,
// accumulator=List::add, combiner=addAll, finisher=identity

A fifth piece, characteristics, hints at optimizations. The combiner is what makes a collector parallel-safe; the finisher is skipped entirely when the container already is the result (IDENTITY_FINISH).
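To make the four roles concrete, the sketch below drives a collector by hand over two chunks, roughly the way a parallel pipeline would. The class and method names (CollectorParts, runByHand) are hypothetical, and this is an illustration of the contract rather than how the stream implementation is actually written.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorParts {
    // Hypothetical helper: applies the four functions manually over two chunks.
    static <T, A, R> R runByHand(Collector<T, A, R> c,
                                 List<? extends T> left,
                                 List<? extends T> right) {
        A a = c.supplier().get();                       // supplier: empty container
        for (T t : left) c.accumulator().accept(a, t);  // accumulator: fold elements in
        A b = c.supplier().get();                       // a second container for the other chunk
        for (T t : right) c.accumulator().accept(b, t);
        A merged = c.combiner().apply(a, b);            // combiner: merge partial results
        return c.finisher().apply(merged);              // finisher: A -> R
    }

    public static void main(String[] args) {
        String s = runByHand(Collectors.joining(", "),
                             List.of("a", "b"), List.of("c"));
        System.out.println(s); // a, b, c
    }
}
```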

All three gather elements into a collection, differing in the container:

  • toList(): accumulates into a List (an ArrayList in practice).
  • toSet(): accumulates into a Set (a HashSet in practice), dropping duplicates, with no order guarantee.
  • toCollection(supplier): accumulates into whatever collection you supply, when you need a specific type.

List<String> list = s.collect(Collectors.toList());
Set<String>  set  = s.collect(Collectors.toSet());
TreeSet<String> sorted =
    s.collect(Collectors.toCollection(TreeSet::new)); // sorted, no dups

Use toCollection when the default type won't do, e.g. a TreeSet for sorted order or a LinkedList when later code needs its Deque operations.

Both produce a List and both accept null elements; they differ in mutability and verbosity:

| | collect(Collectors.toList()) | stream.toList() (Java 16+) |
| --- | --- | --- |
| Mutability | modifiable (ArrayList) | unmodifiable |
| Allows null | yes | yes |
| Conciseness | verbose | one method |

List<Integer> a = nums.stream().collect(Collectors.toList());
a.add(99);                      // OK — mutable

List<Integer> b = nums.stream().toList();
b.add(99);                      // UnsupportedOperationException

Prefer the newer stream().toList() for read-only results; it's shorter and its immutability prevents accidental mutation. Use Collectors.toList() only when you genuinely need to modify the result afterward.

Java 10 added toUnmodifiableList, toUnmodifiableSet and toUnmodifiableMap, which return collections that throw UnsupportedOperationException on any mutation. They also reject null elements (throwing NullPointerException).

List<String> ro = names.stream()
    .filter(n -> n.length() > 3)
    .collect(Collectors.toUnmodifiableList());
ro.clear(); // UnsupportedOperationException

These are the collector equivalents of List.of/Set.of/Map.of. Since Java 16 you can also just use stream().toList() for an unmodifiable List, though unlike toUnmodifiableList it accepts null elements.
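The null behaviour is easy to get wrong, so here is a small sketch contrasting the two unmodifiable options. It assumes Java 16 or later (for Stream.toList()); the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NullHandlingDemo {
    // Returns true if toUnmodifiableList refuses a null element.
    static boolean rejectsNull() {
        try {
            Arrays.asList("a", null).stream()
                  .collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Stream.toList() is unmodifiable too, but keeps the null element.
    static List<String> viaStreamToList() {
        return Arrays.asList("a", null).stream().toList();
    }

    public static void main(String[] args) {
        System.out.println(rejectsNull());              // true
        System.out.println(viaStreamToList().size());   // 2
    }
}
```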

toMap builds a Map from each element using a key mapper and a value mapper — two functions that derive the key and value from each element.

Map<String, Integer> byName = people.stream()
    .collect(Collectors.toMap(
        Person::name,        // key mapper
        Person::age));       // value mapper

By default the result is a HashMap. The catch interviewers probe is duplicate keys: if two elements map to the same key, the two-argument toMap throws an IllegalStateException ("Duplicate key") — you must supply a merge function to resolve collisions.
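A minimal sketch of the duplicate-key failure, using a hypothetical byInitial helper that keys words by their first letter:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateKeyDemo {
    // Two-argument toMap: there is no merge function to resolve a clash.
    static Map<Character, String> byInitial(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(w -> w.charAt(0), Function.identity()));
    }

    public static void main(String[] args) {
        System.out.println(byInitial(List.of("apple", "banana")).size()); // 2
        try {
            byInitial(List.of("apple", "avocado"));   // both map to 'a'
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```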

The three-argument toMap takes a merge function (existing, new) -> result invoked whenever two elements produce the same key. Without it, duplicate keys throw IllegalStateException.

// two-arg: throws IllegalStateException on duplicate "Bob"
// three-arg: resolve the clash
Map<String, Integer> totals = orders.stream()
    .collect(Collectors.toMap(
        Order::customer,
        Order::amount,
        Integer::sum));        // merge: add amounts for same customer

A fourth argument supplies the map type (e.g. TreeMap::new) for ordering or a specific implementation. Always provide a merge function when keys aren't guaranteed unique — it's the single most common toMap bug.
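The four-argument form might look like the sketch below. The Order record is a stand-in invented for this example (records need Java 16+), not a type from any library.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedTotals {
    record Order(String customer, int amount) {}   // hypothetical element type

    static TreeMap<String, Integer> totals(List<Order> orders) {
        return orders.stream().collect(Collectors.toMap(
                Order::customer,     // key mapper
                Order::amount,       // value mapper
                Integer::sum,        // merge function for repeated customers
                TreeMap::new));      // map supplier: keys come out sorted
    }

    static TreeMap<String, Integer> demo() {
        return totals(List.of(new Order("Zoe", 5),
                              new Order("Bob", 10),
                              new Order("Zoe", 7)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // {Bob=10, Zoe=12}
    }
}
```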

groupingBy takes a classifier function and partitions elements into a Map<K, List<T>> keyed by the classifier's result — every element with the same key lands in the same list.

Map<Department, List<Employee>> byDept = employees.stream()
    .collect(Collectors.groupingBy(Employee::department));

It's the SQL GROUP BY of streams. The default value container is a List and the default map is a HashMap. This single-argument form is the gateway — the real power comes from adding a downstream collector to reshape each group.

The two-argument groupingBy(classifier, downstream) applies a second collector to each group instead of just collecting elements into a list. This lets you count, sum, average, or otherwise reduce each bucket.

// count per department
Map<Dept, Long> counts = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept, Collectors.counting()));

// sum of salaries per department
Map<Dept, Integer> totals = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.summingInt(Employee::salary)));

Common downstreams: counting, summingInt/Long/Double, averagingInt/Long/Double, mapping, toSet, joining, maxBy/minBy, reducing. Downstream collectors are the heart of expressive aggregation.

Collectors.mapping(mapper, downstream) applies a transform to each element before it reaches a further downstream collector — it adapts a collector to a different input type. It's how you collect a field of each group member rather than the whole object.

// names of employees in each department
Map<Dept, List<String>> names = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.mapping(Employee::name, Collectors.toList())));

Think of mapping as a map() embedded inside a collector. It pairs naturally with toList, toSet, or joining to project group members.
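As an illustration of the joining pairing, this sketch projects each group to a comma-separated string of names. The Employee record here is a simplified, hypothetical stand-in with string departments.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NamesByDept {
    record Employee(String name, String dept) {}   // hypothetical element type

    static Map<String, String> namesByDept(List<Employee> emps) {
        return emps.stream().collect(Collectors.groupingBy(
                Employee::dept,
                // mapping extracts the name, joining concatenates per group
                Collectors.mapping(Employee::name, Collectors.joining(", "))));
    }

    static Map<String, String> demo() {
        return namesByDept(List.of(new Employee("Ann", "ENG"),
                                   new Employee("Cy", "OPS"),
                                   new Employee("Bob", "ENG")));
    }

    public static void main(String[] args) {
        System.out.println(demo().get("ENG")); // Ann, Bob
        System.out.println(demo().get("OPS")); // Cy
    }
}
```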

Combine groupingBy with the counting() downstream collector. counting returns a Long, so the result is Map<K, Long>.

Map<String, Long> wordFreq = words.stream()
    .collect(Collectors.groupingBy(
        Function.identity(),     // group by the word itself
        Collectors.counting())); // count occurrences
// {"the"=4, "cat"=2, ...}

Function.identity() is the idiom for grouping elements by themselves — a frequency map. This is the canonical "count occurrences" stream pattern.

Because the downstream of groupingBy can be another groupingBy, you nest them to build a multi-level map — exactly like a SQL GROUP BY on two columns.

// group by department, then by city within each department
Map<Dept, Map<String, List<Employee>>> nested = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.groupingBy(Employee::city)));

You can keep nesting or end with an aggregating downstream (Map<Dept, Map<String, Long>> via counting()). The outer classifier forms the first key level; each inner collector handles the next.
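A sketch of the aggregating variant, ending the nesting with counting(). The Employee record and its string fields are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedCounts {
    record Employee(String name, String dept, String city) {}   // hypothetical

    // Outer key: department. Inner key: city. Leaf value: headcount.
    static Map<String, Map<String, Long>> headcount(List<Employee> emps) {
        return emps.stream().collect(Collectors.groupingBy(
                Employee::dept,
                Collectors.groupingBy(Employee::city, Collectors.counting())));
    }

    static Map<String, Map<String, Long>> demo() {
        return headcount(List.of(new Employee("Ann", "ENG", "Oslo"),
                                 new Employee("Bob", "ENG", "Oslo"),
                                 new Employee("Cy", "ENG", "Rome"),
                                 new Employee("Di", "OPS", "Rome")));
    }

    public static void main(String[] args) {
        System.out.println(demo().get("ENG").get("Oslo")); // 2
        System.out.println(demo().get("OPS").get("Rome")); // 1
    }
}
```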

partitioningBy splits a stream into exactly two groups using a predicate, returning a Map<Boolean, List<T>> with keys true and false. It's a specialized, optimized groupingBy for the boolean case.

Map<Boolean, List<Integer>> parts = nums.stream()
    .collect(Collectors.partitioningBy(n -> n % 2 == 0));
parts.get(true);   // evens
parts.get(false);  // odds

Key difference from groupingBy: the map always contains both keys, even when one partition is empty (groupingBy omits empty groups). It also accepts a downstream collector: partitioningBy(pred, counting()).
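The difference shows up when one side is empty. The sketch below runs both collectors with a counting() downstream over an all-odd list; the class and method names are illustrative only.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionVsGroup {
    static Map<Boolean, Long> partitioned(List<Integer> nums) {
        return nums.stream().collect(
                Collectors.partitioningBy(n -> n % 2 == 0, Collectors.counting()));
    }

    static Map<Boolean, Long> grouped(List<Integer> nums) {
        return nums.stream().collect(
                Collectors.groupingBy(n -> n % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> odds = List.of(1, 3, 5);
        System.out.println(partitioned(odds).get(true));      // 0
        System.out.println(grouped(odds).containsKey(true));  // false
    }
}
```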

joining concatenates the stream's CharSequence elements into one String. It has three forms: no-arg (concatenate), one-arg (delimiter), and three-arg (delimiter, prefix, suffix).

String csv = names.stream()
    .collect(Collectors.joining(", "));          // "Ann, Bob, Cy"

String list = names.stream()
    .collect(Collectors.joining(", ", "[", "]")); // "[Ann, Bob, Cy]"

It only accepts CharSequence, so map(Object::toString) first if your elements aren't strings. Internally it uses a StringBuilder, making it far more efficient than reducing with +.
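For non-string elements, the conversion step might look like this sketch (helper names are hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Integers are not CharSequences, so convert before joining.
    static String dashed(List<Integer> nums) {
        return nums.stream()
                .map(Object::toString)
                .collect(Collectors.joining("-"));
    }

    // Prefix and suffix are still emitted when the stream is empty.
    static String bracketed(List<Integer> nums) {
        return nums.stream()
                .map(Object::toString)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(dashed(List.of(1, 2, 3)));   // 1-2-3
        System.out.println(bracketed(List.of()));       // []
    }
}
```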

summingInt/Long/Double apply a value-extracting function and sum the results; averagingInt/Long/Double compute the mean. Each takes a ToIntFunction-style mapper.

int total = orders.stream()
    .collect(Collectors.summingInt(Order::quantity));      // sum -> Integer, unboxed to int

double avg = orders.stream()
    .collect(Collectors.averagingInt(Order::quantity));    // mean -> Double, unboxed to double

Note that summingInt, summingLong and summingDouble return the boxed type matching their mapper (Integer, Long, Double), while all averaging* variants return Double, since a mean is rarely integral. These shine as groupingBy downstreams for per-group totals and means.
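Used as groupingBy downstreams, the two families could be sketched as follows. The Order record is a hypothetical type defined just for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PerCustomerStats {
    record Order(String customer, int quantity) {}   // hypothetical element type

    static final List<Order> ORDERS = List.of(
            new Order("Ann", 2), new Order("Ann", 4), new Order("Bob", 5));

    // summingInt yields Integer values per group.
    static Map<String, Integer> totals() {
        return ORDERS.stream().collect(Collectors.groupingBy(
                Order::customer, Collectors.summingInt(Order::quantity)));
    }

    // averagingInt yields Double values per group.
    static Map<String, Double> means() {
        return ORDERS.stream().collect(Collectors.groupingBy(
                Order::customer, Collectors.averagingInt(Order::quantity)));
    }

    public static void main(String[] args) {
        System.out.println(totals().get("Ann")); // 6
        System.out.println(means().get("Ann"));  // 3.0
    }
}
```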

summarizingInt/Long/Double compute count, sum, min, max, and average in a single pass, returning an IntSummaryStatistics (or Long/Double variant) that exposes all five.

IntSummaryStatistics stats = employees.stream()
    .collect(Collectors.summarizingInt(Employee::salary));
stats.getCount();    // 50
stats.getSum();      // 3_200_000
stats.getMin();      // 40_000
stats.getMax();      // 180_000
stats.getAverage();  // 64000.0

Use it instead of running several collectors when you need multiple statistics — it traverses the stream once. (stream().mapToInt(...).summaryStatistics() is the equivalent without collect.)
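The two routes can be compared side by side. In this sketch the input is a plain List<Integer> rather than employees, and the helper names are made up.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryStatsDemo {
    // Collector form: usable as a downstream of groupingBy as well.
    static IntSummaryStatistics viaCollector(List<Integer> values) {
        return values.stream()
                .collect(Collectors.summarizingInt(Integer::intValue));
    }

    // Primitive-stream form: no collect call needed.
    static IntSummaryStatistics viaPrimitiveStream(List<Integer> values) {
        return values.stream()
                .mapToInt(Integer::intValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = viaCollector(List.of(10, 20, 30));
        System.out.println(s.getSum());      // 60
        System.out.println(s.getAverage());  // 20.0
    }
}
```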

Collectors.reducing is a collector-form of reduction. Standalone it duplicates Stream.reduce, so its real purpose is being a downstream collector inside groupingBy/partitioningBy, where you can't drop to Stream.reduce.

// highest-paid employee per department
Map<Dept, Optional<Employee>> top = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.reducing(BinaryOperator.maxBy(
                 Comparator.comparingInt(Employee::salary)))));

For top-level reductions prefer Stream.reduce — it's clearer. Reach for Collectors.reducing (or the dedicated maxBy/minBy) only as a downstream.

collectingAndThen(downstream, finisher) runs a collector, then applies a finishing transformation to its result. It's how you adapt a collector's output — most commonly to wrap a collection as unmodifiable or to extract a value from an Optional.

List<String> immutable = names.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toList(),
        Collections::unmodifiableList));

// unwrap the maxBy Optional per group
Map<Dept, Employee> top = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.collectingAndThen(
                 Collectors.maxBy(Comparator.comparingInt(Employee::salary)),
                 Optional::get)));

It effectively bolts a custom finisher onto an existing collector without writing one from scratch.

filtering and flatMapping were added in Java 9, and both are downstream collectors. filtering solves a real problem: applying the stream's filter() before grouping silently drops any group whose members are all filtered out.

  • filtering(predicate, downstream): keeps only matching elements per group, but preserves the group key even if the group becomes empty (unlike a pre-filter, which removes the whole bucket).
  • flatMapping(mapper, downstream): flattens each element to a stream and collects the results; it is the collector-level flatMap.

Map<Dept, List<Employee>> highEarners = emps.stream()
    .collect(Collectors.groupingBy(Employee::dept,
             Collectors.filtering(e -> e.salary() > 100_000,
                 Collectors.toList())));

Use filtering over an upstream filter whenever you need every group key present, even with no surviving members.
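The sketch below contrasts an upstream filter with the filtering downstream, and adds a flatMapping example. The Employee record, its skills field, and the sample data are all invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class FilteringDemo {
    record Employee(String name, String dept, int salary, List<String> skills) {}

    static final List<Employee> EMPS = List.of(
            new Employee("Ann", "ENG", 150_000, List.of("java", "sql")),
            new Employee("Bob", "ENG", 90_000, List.of("java", "go")),
            new Employee("Cy", "OPS", 80_000, List.of("bash")));

    // Upstream filter: OPS has no high earners, so its key never appears.
    static Map<String, List<Employee>> preFiltered() {
        return EMPS.stream()
                .filter(e -> e.salary() > 100_000)
                .collect(Collectors.groupingBy(Employee::dept));
    }

    // Downstream filtering: OPS is kept, mapped to an empty list.
    static Map<String, List<Employee>> downstreamFiltered() {
        return EMPS.stream().collect(Collectors.groupingBy(Employee::dept,
                Collectors.filtering(e -> e.salary() > 100_000,
                        Collectors.toList())));
    }

    // flatMapping: the union of every member's skills per department.
    static Map<String, Set<String>> skillsByDept() {
        return EMPS.stream().collect(Collectors.groupingBy(Employee::dept,
                Collectors.flatMapping(e -> e.skills().stream(),
                        Collectors.toSet())));
    }

    public static void main(String[] args) {
        System.out.println(preFiltered().containsKey("OPS"));        // false
        System.out.println(downstreamFiltered().get("OPS").size());  // 0
        System.out.println(skillsByDept().get("ENG").size());        // 3
    }
}
```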

teeing (Java 12) feeds each element to two downstream collectors at once, then merges their two results with a BiFunction. It computes two aggregates in a single pass.

// average = sum / count, in one traversal
double avg = nums.stream()
    .collect(Collectors.teeing(
        Collectors.summingDouble(n -> n),  // result 1: sum
        Collectors.counting(),             // result 2: count
        (sum, count) -> sum / count));     // merge

It's ideal when two statistics depend on the same stream and you want to avoid collecting to a list or streaming twice — e.g. min and max, or sum and count.
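The min-and-max case might be sketched like this. The Range record is a hypothetical result type, and the merge step assumes a non-empty input (orElseThrow fails on an empty stream).

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TeeingDemo {
    record Range(int min, int max) {}   // hypothetical result holder

    static Range range(List<Integer> nums) {
        return nums.stream().collect(Collectors.teeing(
                Collectors.minBy(Comparator.<Integer>naturalOrder()),  // result 1
                Collectors.maxBy(Comparator.<Integer>naturalOrder()),  // result 2
                // merge: assumes the stream had at least one element
                (min, max) -> new Range(min.orElseThrow(), max.orElseThrow())));
    }

    public static void main(String[] args) {
        System.out.println(range(List.of(3, 9, 1, 7))); // Range[min=1, max=9]
    }
}
```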

Characteristics are optimization hints in a collector's characteristics() set that tell the stream pipeline what shortcuts are safe:

  • UNORDERED: the result doesn't depend on encounter order (e.g. toSet), so the pipeline may reorder for speed.
  • CONCURRENT: the accumulator can be called on one shared container from multiple threads (e.g. groupingByConcurrent), avoiding merges.
  • IDENTITY_FINISH: the finisher is the identity, so the container is the result and the finisher step is skipped.

Collectors.toList();  // IDENTITY_FINISH
Collectors.toSet();   // UNORDERED, IDENTITY_FINISH

You rarely set these directly — they matter mostly when writing a custom collector or reasoning about parallel-stream performance.
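You can inspect the flags of any collector through characteristics(). The sketch below only reads them; the values shown are what the built-in collectors report in practice, and the helper names are hypothetical.

```java
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsDemo {
    static Set<Collector.Characteristics> ofToList() {
        return Collectors.<String>toList().characteristics();
    }

    static Set<Collector.Characteristics> ofToSet() {
        return Collectors.<String>toSet().characteristics();
    }

    static Set<Collector.Characteristics> ofConcurrentGrouping() {
        return Collectors.groupingByConcurrent(Function.<String>identity())
                .characteristics();
    }

    public static void main(String[] args) {
        System.out.println(ofToList());
        System.out.println(ofToSet());
        System.out.println(ofConcurrentGrouping());
    }
}
```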

Use Collector.of(supplier, accumulator, combiner, [finisher], characteristics...), passing the four functions directly. The combiner is mandatory so the collector works in parallel.

// a custom collector that joins names into a single uppercase CSV string
Collector<String, StringJoiner, String> upperCsv = Collector.of(
    () -> new StringJoiner(", "),          // supplier
    (j, s) -> j.add(s.toUpperCase()),      // accumulator
    StringJoiner::merge,                   // combiner
    StringJoiner::toString);               // finisher

String result = names.stream().collect(upperCsv);

Omit the finisher when the container already is the result (an IDENTITY_FINISH collector). In practice, prefer composing existing Collectors (mapping, collectingAndThen, teeing) — write a fully custom collector only when no combination fits.
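As a sketch of the finisher-less form, the collector below gathers strings into an ArrayDeque in reverse encounter order. The reversed() helper is hypothetical; note that the combiner has to respect the reversal so that parallel runs give the same answer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

class ReversedCollector {
    // Three-function Collector.of: no finisher, the ArrayDeque is the result.
    static Collector<String, ArrayDeque<String>, ArrayDeque<String>> reversed() {
        return Collector.of(
                ArrayDeque::new,          // supplier
                ArrayDeque::addFirst,     // accumulator: newest element goes first
                (left, right) -> {        // combiner: reverse(left + right) is
                    right.addAll(left);   // reverse(right) followed by reverse(left)
                    return right;
                });
    }

    static List<String> sequential(List<String> in) {
        return new ArrayList<>(in.stream().collect(reversed()));
    }

    static List<String> parallel(List<String> in) {
        return new ArrayList<>(in.parallelStream().collect(reversed()));
    }

    public static void main(String[] args) {
        System.out.println(sequential(List.of("a", "b", "c")));    // [c, b, a]
        System.out.println(parallel(List.of("a", "b", "c", "d"))); // [d, c, b, a]
    }
}
```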
