Java volatile & the Java Memory Model — Visibility, happens-before & Safe Publication

Why concurrency breaks your intuition

Single-threaded Java behaves the way you read it: statements run top to bottom, a write you just made is the value you read back. The moment a second thread touches the same fields, that intuition collapses. A write made by one thread might never become visible to another; instructions you wrote in one order might appear to run in another. None of this is a JVM bug — it is the Java Memory Model (JMM) deliberately leaving the JVM, JIT, and CPU free to optimize. To write correct concurrent code you stop reasoning about "what the CPU does" and start reasoning about the JMM's one core abstraction: happens-before. This guide builds up that model and shows where volatile, synchronized, and final fit.

The visibility problem

The classic motivating bug is a flag loop that never stops. One thread spins on a boolean; another sets it to true; the spinning thread keeps going forever. Nothing is "wrong" with the code — the reader is simply allowed to keep the flag in a register or CPU cache, or the JIT may hoist the read out of the loop entirely, so it never re-reads main memory.

class Worker {
  boolean stop = false;          // plain field — NOT volatile

  void run() {
    while (!stop) { /* busy */ } // JIT may read stop ONCE, then loop on a cached copy
  }
  void shutdown() { stop = true; } // a different thread sets it — write may never be seen
}

An invisible write is the heart of the problem: without a synchronization action, the JMM gives no promise that one thread's writes ever reach another. Marking stop as volatile is the smallest fix, but to understand why it works you need the model underneath it.
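
As a minimal sketch of that fix, the only substantive change from Worker is the volatile modifier. The class name StoppableWorker and the helper stopsWithin are hypothetical names introduced here purely for illustration.

```java
class StoppableWorker {
  private volatile boolean stop = false;   // volatile: the loop must observe the latest write

  void run() {
    while (!stop) { /* busy */ }           // the read can no longer be hoisted out of the loop
  }

  void shutdown() { stop = true; }         // volatile write, visible to the spinning thread

  // Hypothetical helper: starts the worker, requests shutdown,
  // and reports whether the worker thread exited within the timeout.
  static boolean stopsWithin(long millis) {
    StoppableWorker worker = new StoppableWorker();
    Thread t = new Thread(worker::run);
    t.setDaemon(true);                     // do not keep the JVM alive if something goes wrong
    t.start();
    worker.shutdown();
    try {
      t.join(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !t.isAlive();
  }
}
```

Calling StoppableWorker.stopsWithin(5000) should return true, because the volatile write in shutdown() is guaranteed to become visible to the loop in run().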

What the JMM is and why it exists

The Java Memory Model is the slice of the language spec that answers two questions: which writes is a thread guaranteed to see, and what reorderings are legal. It has to exist because real systems optimize aggressively — store buffers, multi-level caches, out-of-order execution, register allocation, JIT reordering. Without a portable contract, the same program would behave differently on every CPU and be impossible to reason about.

// The JMM does NOT promise a single global order of operations across threads.
// It promises a MINIMUM set of guarantees you can build on:
//   volatile, synchronized, final, Thread.start()/join(), and a few library tools.

Crucially, the JMM does not hand you a single tidy timeline of every operation. It gives you a partial order — happens-before — and tells you that anything not ordered by it may be reordered or read stale. Your job is to deliberately place happens-before edges where threads communicate.

The happens-before relationship

Happens-before is the JMM's promise: if action A happens-before B, then everything A wrote is visible to B, and A appears to run before B. The edges come from a fixed set of rules, and transitivity chains them together:

// Sources of happens-before edges:
//   Program order  — each action hb later actions in the SAME thread
//   Monitor lock   — unlocking a monitor hb a later lock of the SAME monitor
//   Volatile       — a write to a volatile field hb every later read of it
//   Thread start   — t.start() hb every action inside the started thread
//   Thread join    — every action in a thread hb another thread's t.join() returning
//   Transitivity   — if A hb B and B hb C, then A hb C

Notice what is missing: two plain reads/writes in different threads have no edge between them by default. Where there is no happens-before edge, the JMM owes you nothing: no visibility, no ordering. Every visibility or ordering bug is, at bottom, a missing edge.
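
The thread-start and thread-join rules can be sketched with two plain fields and no volatile or lock at all. StartJoinEdges and compute are hypothetical names; the sketch assumes only the rules listed above.

```java
class StartJoinEdges {
  static int input;    // plain fields: no volatile, no lock
  static int output;

  static int compute() {
    input = 20;                                   // written before start()
    Thread t = new Thread(() -> output = input + 22);
    t.start();                                    // start() hb every action in t, so t sees input == 20
    try {
      t.join();                                   // every action in t hb join() returning
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return output;                                // the write made inside t is visible here
  }
}
```

Both reads are safe only because start() and join() supply the edges; remove the join() and the final read of output becomes a data race.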

What volatile guarantees — and what it doesn't

volatile gives you two things: visibility and ordering. A read of a volatile field always observes the most recent write to that field, never a stale cached or register-held copy, and that write establishes a happens-before edge to every later read of the field. Because of that edge, plain writes that came before the volatile write ride along and become visible too.

volatile boolean ready = false;
int data;                 // plain field

// writer thread
data = 42;                // ordinary write
ready = true;             // volatile write — happens-before any later read of ready

// reader thread
if (ready) {              // volatile read
  use(data);              // guaranteed to see 42 (carried along by happens-before)
}

What volatile does not give you is atomicity. It makes each individual read and write fresh, but it cannot make a sequence of them indivisible — which is exactly the next trap.

Atomicity vs visibility vs ordering

Interviewers love these three because conflating them is the classic mistake. They are genuinely distinct concerns:

Atomicity — an operation completes all-or-nothing; no thread sees a half-done state.
Visibility — a write by one thread becomes observable to another.
Ordering — the order operations appear to execute in.

volatile int count = 0;
count++;   // really: int tmp = count; tmp = tmp + 1; count = tmp;  — THREE steps

count++ is a read-modify-write: two threads can both read the same value, both add one, both store the same result — a lost update — even though every step sees fresh memory. volatile covers visibility and ordering; it adds no atomicity. For an atomic counter use AtomicInteger.incrementAndGet() (built on hardware compare-and-swap), or a lock. The rule: volatile is correct only when a write doesn't depend on the current value, like a one-way status flag — never for read-modify-write.
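
A rough sketch of the difference, assuming a handful of threads hammering both kinds of counter. CounterDemo and run are hypothetical names; AtomicInteger is the standard java.util.concurrent.atomic class.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
  static volatile int racy = 0;                          // visible, but ++ is still three steps
  static final AtomicInteger safe = new AtomicInteger(); // atomic read-modify-write

  // Starts `threads` workers; each increments both counters `perThread` times.
  // Returns { final racy value, final safe value }.
  static int[] run(int threads, int perThread) {
    racy = 0;
    safe.set(0);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          racy++;                   // two threads may read the same value: lost update
          safe.incrementAndGet();   // never loses an update
        }
      });
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return new int[] { racy, safe.get() };
  }
}
```

The atomic counter always ends at exactly threads * perThread. The volatile counter can end at that value or anywhere below it, depending on how the threads interleave.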

volatile vs synchronized

Both establish happens-before edges and both fix visibility, but they solve different problems. synchronized also gives mutual exclusion and atomicity over a whole block; volatile is a single-field, never-blocking visibility tool. And synchronized provides visibility because releasing a monitor happens-before the next acquire of the same monitor — it is not only a lock.

volatile boolean flag;              // cheap, lock-free visibility for ONE field

Object lock = new Object();
synchronized (lock) {               // exclusive access + visibility for a region
  balance = balance - amount;       // compound action made atomic
}

Reach for volatile when you only need one variable's latest value visible. Reach for synchronized when you need an atomic compound operation or must keep multiple related fields consistent together — volatile is per-field and cannot maintain an invariant that spans two fields.
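
To illustrate the multi-field case, here is a small sketch in which one lock protects an invariant across two fields. AccountPair, its field names, and the starting balance of 100 are all assumptions made for the example.

```java
class AccountPair {
  private final Object lock = new Object();
  private long checking = 100;   // invariant: checking + savings == 100
  private long savings = 0;

  void moveToSavings(long amount) {
    synchronized (lock) {        // both fields change inside one critical section
      checking -= amount;
      savings += amount;
    }
  }

  long total() {
    synchronized (lock) {        // readers take the same lock, so they never see a half-done move
      return checking + savings;
    }
  }

  // Hypothetical check: one thread moves money while the caller keeps reading the total.
  static boolean invariantHolds(int moves) {
    AccountPair account = new AccountPair();
    Thread mover = new Thread(() -> {
      for (int i = 0; i < moves; i++) account.moveToSavings(1);
    });
    mover.start();
    boolean ok = true;
    while (mover.isAlive()) {
      ok &= account.total() == 100;
    }
    try {
      mover.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return ok && account.total() == 100;
  }
}
```

Making checking and savings volatile instead would not help: a reader could still observe the moment between the two writes.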

Instruction reordering

Compilers, the JIT, and CPUs may execute instructions out of program order as long as a single thread's result is unchanged — the as-if-serial guarantee. Reordering powers register allocation and latency hiding, so it is everywhere.

int a = 1;   // no dependency between these two,
int b = 2;   // so the JVM/CPU may run them in either order

int y = a;   // depends on a — can NEVER be reordered before 'a = 1'

Within one thread you never notice. Across threads you can: if a and b were shared fields rather than locals, a second thread might observe b set before a. The JMM forbids only those reorderings that would break a happens-before edge: a volatile access or a lock release/acquire acts as a barrier the optimizer may not freely cross. This is the precise reason the next pattern needs volatile.
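
One way to see the barrier effect is the classic store-buffering litmus test, sketched below with hypothetical names (StoreBufferLitmus, countBothZero). Each thread writes one field and then reads the other. With plain fields the JMM permits both reads to return 0; with volatile fields that outcome is forbidden, because all volatile accesses fall into a single order consistent with each thread's program order.

```java
class StoreBufferLitmus {
  static volatile int x, y;   // volatile: each write is ordered before the read that follows it
  static int r1, r2;          // plain results, read by the caller only after join()

  // Returns how many of `rounds` runs observed r1 == 0 && r2 == 0.
  static int countBothZero(int rounds) {
    int bothZero = 0;
    for (int i = 0; i < rounds; i++) {
      x = 0;
      y = 0;
      Thread a = new Thread(() -> { x = 1; r1 = y; });
      Thread b = new Thread(() -> { y = 1; r2 = x; });
      a.start();
      b.start();
      try {
        a.join();
        b.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
      if (r1 == 0 && r2 == 0) bothZero++;
    }
    return bothZero;
  }
}
```

If x and y were plain fields, a nonzero count would be a legal result, though whether it actually shows up on a given machine depends on timing and hardware.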

Double-checked locking needs volatile

The double-checked-locking (DCL) singleton checks the instance once without a lock and once inside, to avoid synchronizing on every call. Without volatile it is subtly broken, because instance = new Singleton() is not atomic: it allocates memory, runs the constructor, and assigns the reference — and those steps may be reordered. A second thread could see a non-null reference pointing at a partially constructed object.

class Singleton {
  private static volatile Singleton instance;   // volatile is REQUIRED, not optional

  static Singleton get() {
    if (instance == null) {                // 1st check, no lock — fast path
      synchronized (Singleton.class) {
        if (instance == null)              // 2nd check, under the lock
          instance = new Singleton();      // volatile forbids the dangerous reorder
      }
    }
    return instance;
  }
}

volatile bans the reordering and adds a happens-before edge, so any thread that sees the reference also sees a fully built object. If hand-writing DCL feels fragile, the holder-class idiom (a static nested class initialized lazily by the classloader) gets the same lazy, thread-safe result with no volatile at all.
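
The holder-class idiom mentioned above might look like the following sketch. LazySingleton, Holder, and the constructions counter are hypothetical names; the counter exists only so the lazy, once-only construction can be observed.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazySingleton {
  // Instrumentation for this sketch only: counts constructor calls.
  static final AtomicInteger constructions = new AtomicInteger();

  private LazySingleton() { constructions.incrementAndGet(); }

  private static class Holder {
    // Initialized once, under the JVM's class-initialization lock, on first use of Holder.
    static final LazySingleton INSTANCE = new LazySingleton();
  }

  static LazySingleton get() { return Holder.INSTANCE; }  // no volatile, no explicit lock
}
```

Nothing is constructed until the first call to get(), and every caller receives the same fully built instance, because class initialization is itself thread-safe.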

Safe publication

Publishing an object means making its reference visible to other threads. Safe publication additionally guarantees that a thread which sees the reference also sees the object's fully-initialized fields. Just assigning to a plain field is unsafe: another thread may see the reference but stale field values inside the object.

class Holder {
  private volatile Config config;        // volatile field = safe publication
  void set(Config c) { config = c; }     // readers of config see a COMPLETE Config
  Config get() { return config; }
}

Per Java Concurrency in Practice, the safe routes are: store the reference in a volatile field (or AtomicReference), in a final field set in a constructor, in a field guarded by a lock, or via a static initializer. Transitivity is what makes this work — the one volatile/lock edge carries along every plain write that preceded the publication.
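
As a sketch of the AtomicReference route, the example below publishes an object whose own fields are plain and mutable. PublishDemo, Settings, and the sample values are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class PublishDemo {
  static class Settings {      // hypothetical payload with plain, non-final fields
    String host;
    int timeoutMs;
  }

  static final AtomicReference<Settings> current = new AtomicReference<>();

  static String publishAndRead() {
    current.set(null);
    String[] seen = new String[1];
    Thread reader = new Thread(() -> {
      Settings got;
      while ((got = current.get()) == null) { /* spin until published */ }
      seen[0] = got.host + ":" + got.timeoutMs;   // plain reads, made safe by the edge from set()
    });
    reader.start();

    Settings fresh = new Settings();
    fresh.host = "example.org";                   // plain writes BEFORE publication
    fresh.timeoutMs = 250;
    current.set(fresh);                           // publication: has volatile-write semantics

    try {
      reader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return seen[0];
  }
}
```

The guarantee covers only writes made before the set() call; mutating fresh afterwards without further synchronization would reintroduce a race.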

Final fields and freeze semantics

final fields get special freeze semantics: when a constructor finishes, the JMM "freezes" the object's final fields, and any thread that obtains the object through a reference published after construction is guaranteed to see those fields correctly — with no extra synchronization.

class Point {
  final int x, y;                       // frozen when the constructor returns
  Point(int x, int y) { this.x = x; this.y = y; }
}
// Any thread that later sees a Point reference always sees the real x and y.

The one caveat: the object must not leak this during construction (registering a listener or storing this in a shared field before the constructor returns defeats the freeze). Final fields are precisely why immutable objects are inherently thread-safe and can be shared freely.
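
A common way to avoid leaking this is a private constructor plus a static factory that registers the object only after construction has finished. The sketch below assumes a hypothetical SafeListener class and a hypothetical shared REGISTRY list.

```java
import java.util.ArrayList;
import java.util.List;

class SafeListener {
  // Hypothetical shared registry; callers synchronize on the list itself.
  static final List<SafeListener> REGISTRY = new ArrayList<>();

  final int id;

  private SafeListener(int id) {
    // An unsafe variant would add `this` to REGISTRY here, before the constructor returns.
    this.id = id;
  }

  static SafeListener create(int id) {
    SafeListener listener = new SafeListener(id);  // constructor has fully returned...
    synchronized (REGISTRY) {
      REGISTRY.add(listener);                      // ...before any other thread can reach the object
    }
    return listener;
  }
}
```

Because registration happens outside the constructor, the final field id is already frozen by the time the reference becomes reachable through the registry.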

long and double tearing

The JMM guarantees atomic reads and writes only for 32-bit-or-smaller types and references. A non-volatile long or double (64 bits) may be written as two separate 32-bit stores, so another thread can read a torn value — the high half of one write spliced with the low half of another.

long balance;                         // NOT volatile — a 64-bit write may tear
// Thread A: balance = 0xFFFFFFFF_FFFFFFFFL;   (the field starts at 0)
// Thread B may read 0, the full value, or a torn half such as 0xFFFFFFFF_00000000L.

Declaring the field volatile makes 64-bit reads and writes atomic (and visible). Most modern 64-bit JVMs happen to write longs atomically, but the spec does not require it, so portable code that shares a writable long/double should mark it volatile or guard it with a lock.
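
The volatile guarantee can be sketched as a small stress check: a writer flips a volatile long between 0 and -1 (all 64 bits set) while a reader counts any value that is neither. NoTearing and countTornReads are hypothetical names.

```java
class NoTearing {
  static volatile long value = 0L;   // volatile: 64-bit reads and writes are atomic

  // Returns how many reads saw something other than 0 or -1.
  static int countTornReads(int writes) {
    value = 0L;
    Thread writer = new Thread(() -> {
      for (int i = 0; i < writes; i++) {
        value = (i % 2 == 0) ? -1L : 0L;   // alternate between all bits set and all bits clear
      }
    });
    writer.start();
    int torn = 0;
    while (writer.isAlive()) {
      long v = value;                      // one atomic 64-bit read
      if (v != 0L && v != -1L) torn++;     // a half-and-half value would land here
    }
    try {
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return torn;
  }
}
```

With the volatile modifier the count is always 0. Without it, a nonzero count is permitted by the spec, though many 64-bit JVMs will still never produce one.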

Recap

Concurrency intuition fails because the JMM lets the JVM and CPU cache, buffer, and reorder freely; the model's one guarantee is happens-before, and every visibility or ordering bug is a missing edge. volatile supplies visibility and ordering for a single field (perfect for a one-way flag) but not atomicity, so count++ still races; use AtomicInteger or a lock for read-modify-write, and a lock for invariants spanning multiple fields. synchronized adds mutual exclusion and atomicity on top of visibility. Reordering is legal within a thread (as-if-serial) but constrained across threads only by happens-before edges, which is why double-checked locking demands a volatile field. Lean on safe publication (volatile, final, locks, static init) and final-field freeze for immutable objects, and remember that shared 64-bit long/double can tear without volatile. Master happens-before and the rest of the JMM falls into place.
