Why LINQ proficiency separates good C# developers from great ones
LINQ is one of the most distinctive features of C# — a declarative query model baked into the language itself. Used well, it makes code significantly more readable and concise. Used poorly, it is a source of N+1 database queries, hidden performance costs, and subtle bugs caused by deferred execution. Interviews probe both sides: can you write correct LINQ, and can you identify where it goes wrong?
What LINQ is (and isn't)
LINQ (Language Integrated Query) is three things working together:
- Language features: query syntax (from x in xs where … select …), the yield machinery powering iterators, and lambda expression support.
- Extension methods: Where, Select, GroupBy, etc., defined on IEnumerable<T> in System.Linq.Enumerable and on IQueryable<T> in System.Linq.Queryable.
- Provider model: any data source can expose IQueryable<T> and supply a LINQ provider that translates expression trees into native queries. EF Core translates to SQL; custom providers can target anything. LINQ to XML, by contrast, is not a translating provider: it runs ordinary in-memory LINQ over an XML tree already loaded into objects.
LINQ is not magic and is not always the right tool. For a simple loop over a
small local list, a foreach is often clearer and equally fast. Reach for LINQ when
the declarative composition adds clarity or when the query involves filtering, grouping,
projecting, or joining.
Query syntax vs method syntax — they are identical
Both syntaxes compile to the same method calls. The compiler translates query syntax into method syntax before emitting IL.
int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// Query syntax:
var q1 = from n in data
where n % 2 == 0
orderby n descending
select n * n;
// Method (fluent) syntax — identical IL after compilation:
var q2 = data
.Where(n => n % 2 == 0)
.OrderByDescending(n => n)
.Select(n => n * n);
// Both: 100, 64, 36, 16, 4
When to prefer query syntax: complex multi-source joins, let clauses (intermediate
variables), and group … by … into grouping. These read more cleanly in the SQL-like
form.
When to prefer method syntax: everything else. It covers every operator (many have no query-syntax equivalent), chains naturally with other method calls, and is what most .NET codebases use.
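To make the let case concrete, here is a small sketch that writes the same query both ways. The word list is invented for illustration, and the method-syntax version only approximates what the compiler generates (it carries the intermediate value along in an anonymous type).

```csharp
using System;
using System.Linq;

string[] words = { "apple", "fig", "banana", "kiwi", "cherry" };

// Query syntax: `let` names an intermediate value once and reuses it.
var viaQuery =
    from w in words
    let len = w.Length
    where len > 3
    orderby len, w
    select $"{w} ({len})";

// Method syntax: the intermediate value travels in an anonymous type.
var viaMethods = words
    .Select(w => new { w, len = w.Length })
    .Where(x => x.len > 3)
    .OrderBy(x => x.len).ThenBy(x => x.w)
    .Select(x => $"{x.w} ({x.len})");

Console.WriteLine(string.Join(", ", viaQuery));
// kiwi (4), apple (5), banana (6), cherry (6)
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```

The query-syntax form keeps len visible as a plain name, which is the readability gain the let clause offers.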
Deferred execution — the most important LINQ concept
The defining characteristic of LINQ is deferred execution: a LINQ query is a description of how to compute data, not the data itself. The computation happens only when you iterate the result.
var numbers = new List<int> { 1, 2, 3 };
// Defining the query — no iteration yet:
var query = numbers.Where(n => n > 1); // IEnumerable<int>
numbers.Add(10); // source modified after query defined
foreach (var n in query)
Console.Write(n + " "); // 2 3 10 — sees the added element!
This is useful for query composition — you can build a pipeline incrementally and the source is touched only once when you iterate:
IQueryable<Product> q = db.Products
.Where(p => p.IsActive);
if (filterCategory)
q = q.Where(p => p.Category == category); // add another condition
if (sortByPrice)
q = q.OrderBy(p => p.Price); // add ordering
var results = await q.ToListAsync(); // ONE SQL with all conditions
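The same incremental composition can be observed without a database. The sketch below uses a hypothetical Source() iterator that counts how often it is read, standing in for the real data source. It is an in-memory illustration, not EF Core behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int enumerations = 0;

// Hypothetical stand-in for a data source: counts how often it is read.
IEnumerable<int> Source()
{
    enumerations++;
    foreach (var n in new[] { 5, 3, 8, 1, 6 })
        yield return n;
}

bool filterLarge = true, sort = true;

IEnumerable<int> q = Source();            // nothing runs yet
if (filterLarge) q = q.Where(n => n > 2); // add a condition
if (sort) q = q.OrderBy(n => n);          // add ordering

int before = enumerations;                // still 0: only a description exists
var results = q.ToList();                 // the pipeline runs here, once
int after = enumerations;                 // 1

Console.WriteLine($"{before} -> {after}: {string.Join(" ", results)}");
// 0 -> 1: 3 5 6 8
```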
When deferred execution bites you
Multiple enumeration:
IEnumerable<User> users = db.Users.Where(u => u.IsActive); // deferred
int count = users.Count(); // hits the DB: query runs once, and the in-memory Count() pulls every active row
var first = users.First(); // hits the DB: query runs again!
// Fix: materialise once
var cached = users.ToList();
int count2 = cached.Count; // in-memory — no DB hit
var first2 = cached.First(); // in-memory
Source changes between iterations:
var list = new List<int> { 1, 2, 3 };
var q = list.Where(n => n > 1);
list.Clear(); // modify source
foreach (var n in q) // yields nothing — source is empty
Console.WriteLine(n);
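Lazy evaluation in loops:
var source = new[] { 0, 1, 2, 3 };
var queries = new List<IEnumerable<int>>();
for (int i = 0; i < 3; i++)
{
    int captured = i; // fresh copy for each iteration
    queries.Add(source.Where(n => n > captured)); // the lambda closes over the copy
}
// Without the copy (n => n > i): every lambda shares the single loop variable i.
// The queries run after the loop, when i == 3, so all three filter on n > 3.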
Rule: Call .ToList() or .ToArray() when you need a point-in-time snapshot,
will iterate more than once, or want to ensure a database query runs exactly once.
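A short sketch of the rule in action: the predicate counter below is an illustrative device, not part of LINQ. It shows that a snapshot is unaffected by later changes to the source, while the live query re-runs its predicate on each enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3 };
int predicateCalls = 0;

// Deferred: the lambda has not run yet.
var live = numbers.Where(n => { predicateCalls++; return n > 1; });

var snapshot = live.ToList();   // runs the predicate over 3 elements
numbers.Add(10);                // source changes after the snapshot

int liveCount = live.Count();   // re-runs the predicate over 4 elements

Console.WriteLine(snapshot.Count);  // 2  (2, 3)
Console.WriteLine(liveCount);       // 3  (2, 3, 10)
Console.WriteLine(predicateCalls);  // 7  (3 + 4)
```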
IEnumerable<T> vs IQueryable<T> — the most critical LINQ distinction
Confusing the two is one of the most common LINQ performance mistakes in EF Core codebases.
// IEnumerable: loads ALL rows from DB, filters in memory
IEnumerable<User> badApproach = db.Users; // no query runs yet, but the static type is now IEnumerable<User>,
// so every later operator binds to System.Linq.Enumerable and runs in memory
var activeUsers = badApproach.Where(u => u.IsActive).ToList();
// SQL: SELECT * FROM Users ← ALL users loaded, then C# filters
// IQueryable: filter is translated to SQL
IQueryable<User> goodApproach = db.Users;
var activeUsers2 = goodApproach.Where(u => u.IsActive).ToList();
// SQL: SELECT * FROM Users WHERE IsActive = 1 ← only matching rows fetched
How this works: the IQueryable<T> operators receive their lambdas as expression trees
(Expression<Func<…>>), data structures that represent the code as objects rather than
compiled delegates. The LINQ provider (EF Core) walks the tree and generates SQL. The
IEnumerable<T> operators receive compiled delegates (Func<…>), which a provider cannot
inspect. Once you are on IEnumerable<T>, the rest of the pipeline runs in C# memory
regardless of where the data came from.
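The difference between a delegate and an expression tree can be seen directly, with no database involved. This is a minimal sketch: the in-memory AsQueryable() source is only a stand-in for a real provider such as EF Core.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

Func<int, bool> asDelegate = n => n > 5;          // compiled code: callable, not inspectable
Expression<Func<int, bool>> asTree = n => n > 5;  // data describing the same code

Console.WriteLine(asDelegate(7));          // True
Console.WriteLine(asTree.Body);            // (n > 5)
Console.WriteLine(asTree.Body.NodeType);   // GreaterThan
Console.WriteLine(asTree.Compile()(7));    // True: a tree can still be compiled and run

// Queryable.Where records the call as a tree instead of executing it.
IQueryable<int> queryable = new[] { 3, 7, 9 }.AsQueryable().Where(n => n > 5);
Console.WriteLine(queryable.Expression.NodeType);  // Call
Console.WriteLine(string.Join(" ", queryable));    // 7 9
```

A provider reads nodes such as GreaterThan and Call to build its native query, which is why only the tree-based form can be translated.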
// AsEnumerable() switches from IQueryable to in-memory LINQ
// Use this when the provider cannot translate a particular operation:
var results = db.Users
.Where(u => u.IsActive) // translated to SQL: WHERE IsActive = 1
.AsEnumerable() // switch to in-memory from here
.Where(u => MyComplexCSharpFn(u)) // runs in C# — cannot be translated to SQL
.ToList();
// SQL: SELECT * FROM Users WHERE IsActive = 1
// Then: C# filters the result set with MyComplexCSharpFn
The most important LINQ operators
Select and SelectMany
Select is 1-to-1 projection. SelectMany is 1-to-many then flatten.
var sentences = new[] { "hello world", "foo bar baz" };
// Select — each string maps to a string array: IEnumerable<string[]>
var words1 = sentences.Select(s => s.Split(' '));
// [ ["hello","world"], ["foo","bar","baz"] ]
// SelectMany — flatten into IEnumerable<string>
var words2 = sentences.SelectMany(s => s.Split(' '));
// [ "hello", "world", "foo", "bar", "baz" ]
// With index and result selector:
var indexed = sentences.SelectMany(
(sentence, i) => sentence.Split(' '),
(sentence, word) => $"[{sentence[..3]}] {word}"
);
Use SelectMany when each element maps to a collection and you want a flat result.
It corresponds to nested foreach loops.
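The correspondence with nested loops can be checked side by side. The sketch below reuses the sentences from the example above and compares three equivalent forms.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sentences = new[] { "hello world", "foo bar baz" };

// Nested foreach: the imperative form.
var viaLoops = new List<string>();
foreach (var s in sentences)
    foreach (var w in s.Split(' '))
        viaLoops.Add(w);

// SelectMany: the same flattening, declaratively.
var viaLinq = sentences.SelectMany(s => s.Split(' ')).ToList();

// Query syntax: two from clauses compile to a SelectMany call.
var viaQuery = (from s in sentences
                from w in s.Split(' ')
                select w).ToList();

Console.WriteLine(string.Join(", ", viaLinq));       // hello, world, foo, bar, baz
Console.WriteLine(viaLoops.SequenceEqual(viaLinq));  // True
Console.WriteLine(viaLinq.SequenceEqual(viaQuery));  // True
```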
First, FirstOrDefault, Single, SingleOrDefault
var items = new[] { 3, 1, 4, 1, 5, 9 };
items.First(); // 3 — throws if empty
items.FirstOrDefault(); // 3 — returns default(int) = 0 if empty
items.First(x => x > 4); // 5 — first match; throws if no match
items.Single(x => x == 9); // 9 — throws if 0 or 2+ matches
items.SingleOrDefault(x => x > 100); // 0 — no match returns default; still throws if 2+ match
Use Single/SingleOrDefault when exactly one result is a business invariant (like
looking up by primary key). The exception it throws when multiple matches exist is a
useful contract assertion.
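The failure modes are easier to remember once seen. The sketch below uses a hypothetical Try helper, defined only for this example, to report which calls throw.

```csharp
using System;
using System.Linq;

var items = new[] { 3, 1, 4, 1, 5, 9 };

// Hypothetical helper for this sketch: returns the exception type name, or "ok".
string Try(Func<int> f)
{
    try { f(); return "ok"; }
    catch (Exception ex) { return ex.GetType().Name; }
}

string singleDup = Try(() => items.Single(x => x == 1));           // two matches
string singleOrDefaultDup = Try(() => items.SingleOrDefault(x => x == 1));
string firstNone = Try(() => items.First(x => x > 100));           // no match
string firstOrDefaultDup = Try(() => items.FirstOrDefault(x => x == 1));

Console.WriteLine(singleDup);           // InvalidOperationException
Console.WriteLine(singleOrDefaultDup);  // InvalidOperationException
Console.WriteLine(firstNone);           // InvalidOperationException
Console.WriteLine(firstOrDefaultDup);   // ok
```

Note that SingleOrDefault relaxes only the zero-match case. Duplicates still throw, which is what makes it usable as an invariant check.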
GroupBy
GroupBy partitions a sequence. Each group has a Key and is an IEnumerable<TElement>.
var words = new[] { "ant", "bear", "cat", "ape", "bat", "crow" };
var byFirstLetter = words.GroupBy(w => w[0]);
foreach (var g in byFirstLetter)
Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");
// a: ant, ape
// b: bear, bat
// c: cat, crow
// Group + aggregate (the common pattern):
var wordCountByLetter = words
.GroupBy(w => w[0])
.Select(g => new { Letter = g.Key, Count = g.Count() });
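In EF Core, GroupBy on IQueryable<T> translates to GROUP BY when it is followed by an
aggregate projection such as the one above. More complex GroupBy projections may not
translate. If EF Core throws at runtime, add .AsEnumerable() before GroupBy to force
in-memory grouping, and apply any Where filters first so that only the needed rows are loaded.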
Aggregate (fold/reduce)
var nums = new[] { 1, 2, 3, 4, 5 };
// Without seed — uses first element as initial accumulator:
int sum = nums.Aggregate((acc, n) => acc + n); // 15
int prod = nums.Aggregate((acc, n) => acc * n); // 120
// With seed:
int sumPlus100 = nums.Aggregate(100, (acc, n) => acc + n); // 115
// With result selector (seed, fold, transform):
string result = nums.Aggregate(
new List<string>(),
(list, n) => { list.Add(n.ToString()); return list; },
list => string.Join("-", list)
); // "1-2-3-4-5"
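It may help to see the loop that Aggregate abstracts, along with the one edge case where the seeded and unseeded overloads behave differently: an empty sequence. A minimal sketch:

```csharp
using System;
using System.Linq;

var nums = new[] { 1, 2, 3, 4, 5 };

// The loop that Aggregate(seed, func) stands for:
int acc = 100;
foreach (var n in nums)
    acc = acc + n;

int folded = nums.Aggregate(100, (sum, x) => sum + x);
Console.WriteLine(acc == folded);   // True (both are 115)

var empty = Array.Empty<int>();

// With a seed, an empty sequence simply returns the seed.
int seededEmpty = empty.Aggregate(0, (sum, x) => sum + x);
Console.WriteLine(seededEmpty);     // 0

// Without a seed there is no first element to start from.
bool threw = false;
try { empty.Aggregate((sum, x) => sum + x); }
catch (InvalidOperationException) { threw = true; }
Console.WriteLine(threw);           // True
```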
Zip
var names = new[] { "Alice", "Bob", "Carol" };
var scores = new[] { 92, 85, 78 };
// Pairs positional elements:
var paired = names.Zip(scores, (name, score) => $"{name}: {score}");
// [ "Alice: 92", "Bob: 85", "Carol: 78" ]
// .NET Core 3.0+ (tuple deconstruction needs C# 7+): no selector, returns tuples
foreach (var (name, score) in names.Zip(scores))
Console.WriteLine($"{name} scored {score}");
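One behaviour worth knowing is what happens when the two sequences differ in length. In the sketch below the scores array is deliberately one element short.

```csharp
using System;
using System.Linq;

var names = new[] { "Alice", "Bob", "Carol" };
var scores = new[] { 92, 85 };   // one score short

// Zip stops as soon as either sequence runs out; the extra name is dropped silently.
var paired = names.Zip(scores, (name, score) => $"{name}: {score}").ToList();

Console.WriteLine(paired.Count);               // 2
Console.WriteLine(string.Join(", ", paired));  // Alice: 92, Bob: 85
```

Because the truncation is silent, compare the lengths beforehand when a mismatch would indicate a bug.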
Common LINQ pitfalls
N+1 queries in EF Core
// N+1: one query for orders, then up to one more per order for the customer name
var orders = db.Orders.ToList();
foreach (var o in orders)
    Console.WriteLine(db.Customers.Find(o.CustomerId)?.Name); // a query for every customer not already tracked!
// Eager load with Include:
var orders2 = db.Orders.Include(o => o.Customer).ToList();
// One JOIN query fetches everything
Calling .Count() instead of .Any()
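// Enumerates the whole sequence just to compare with zero
// (unless the source implements ICollection<T>, where Count() is O(1)):
if (source.Count() > 0) { }
// Stops at the first element; on IQueryable<T>, EF Core emits an EXISTS query:
if (source.Any()) { }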
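The cost difference is easy to measure on a lazy sequence. The sketch below uses a hypothetical Numbers() iterator that records how many elements it has produced. The last lines show why the difference disappears for a List<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int yielded = 0;

// Hypothetical lazy source that records how many elements it has produced.
IEnumerable<int> Numbers()
{
    for (int i = 0; i < 1000; i++)
    {
        yielded++;
        yield return i;
    }
}

bool anyResult = Numbers().Any();
int afterAny = yielded;                     // 1: stopped at the first element

yielded = 0;
bool countResult = Numbers().Count() > 0;
int afterCount = yielded;                   // 1000: walked the whole sequence

Console.WriteLine($"Any pulled {afterAny}, Count pulled {afterCount}");
// Any pulled 1, Count pulled 1000

// For a List<T>, Enumerable.Count() checks for ICollection<T> and reads the Count property.
var list = new List<int> { 1, 2, 3 };
Console.WriteLine(list.Count() == list.Count);  // True
```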
Multiple enumeration of a deferred source
// Executes the query twice (two DB round trips):
void Process(IEnumerable<User> users)
{
Console.WriteLine(users.Count()); // query 1
foreach (var u in users) { } // query 2
}
// Materialise once at the call site:
Process(db.Users.Where(u => u.IsActive).ToList());
Recap
LINQ is a declarative query model built into C# through extension methods on
IEnumerable<T> and IQueryable<T>. Its most important characteristic is deferred
execution — queries describe computation but don't run until iterated; materialise
with .ToList() or .ToArray() when you need a snapshot or will iterate more than
once. IEnumerable<T> runs LINQ operators in memory (LINQ to Objects); IQueryable<T>
translates operators to the data source's query language (SQL via EF Core). Always
keep database-facing LINQ on IQueryable<T> to avoid loading entire tables. The three
operators to understand deeply are Select/SelectMany (projection and flattening),
GroupBy (partitioning), and Aggregate (the general fold). The three mistakes to
avoid are N+1 queries, multiple enumeration of deferred sources, and switching from
IQueryable to IEnumerable too early in the pipeline.