Python identity, explained
is and == look interchangeable until they suddenly aren't. The difference comes down to
identity versus equality, and a CPython optimisation called interning that
quietly reuses certain objects. Once you understand both, the "random" behaviour of is
becomes completely predictable.
Identity vs equality
== asks "do these have the same value?" (via __eq__). is asks "are these the
same object in memory?" Two distinct objects can be equal, and an object is almost
always equal to itself (float("nan") is the well-known exception).
a = [1, 2, 3]
b = [1, 2, 3]
a == b # True — same contents
a is b # False — two separate list objects
c = a
a is c # True — same object
is never calls __eq__ and cannot be overloaded; in CPython it is a single pointer
comparison, which makes it fast and immune to a misbehaving __eq__.
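To make the contrast concrete, here is a minimal sketch. AlwaysEqual is a hypothetical class invented for this example; its __eq__ claims equality with everything, which is enough to mislead == but has no effect on is.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True

    # Defining __eq__ sets __hash__ to None; restore identity-based hashing.
    __hash__ = object.__hash__


p = AlwaysEqual()
q = AlwaysEqual()

equal_by_value = (p == q)   # True: __eq__ says so
equal_to_int = (p == 42)    # True: __eq__ accepts anything
same_object = (p is q)      # False: two separate instances

# The opposite oddity: one object that does not compare equal to itself.
nan = float("nan")
nan_is_itself = (nan is nan)       # True: a single object
nan_equals_itself = (nan == nan)   # False: NaN never compares equal
```

The takeaway is that == reports whatever the class decides, while is reports only whether two names refer to one object.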
id() reveals the object
id(obj) returns an integer that is unique among all objects alive at the same time and
constant for the object's lifetime; in CPython it is the object's memory address. For two
live objects, x is y is equivalent to id(x) == id(y).
a = [1, 2, 3]
b = a
id(a) == id(b) # True
id(a) == id([1, 2, 3]) # False — the literal is a different object
Note that ids can be reused after an object is freed, so don't store an id and compare it to a later object's id.
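One way to act on that warning is to keep a reference to each object next to its id, so the id cannot be recycled while you still rely on it. The sketch below assumes a hypothetical helper, count_distinct_objects, written only for illustration.

```python
def count_distinct_objects(items):
    """Count how many distinct objects (by identity, not value) items yields."""
    # Key by id() but store the object itself as the value. Holding the
    # reference keeps each object alive, so its id stays valid even when
    # items is a generator that yields temporaries.
    seen = {}
    for obj in items:
        seen[id(obj)] = obj
    return len(seen)


first = [1, 2]
second = [1, 2]   # equal to first, but a separate object
alias = first     # another name for the same object as first

n = count_distinct_objects([first, second, alias])   # 2
```

Here first and alias collapse into one entry because they share an id, while second counts separately even though it compares equal.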
Integer interning (the small-int cache)
CPython pre-creates and caches the integers -5 through 256. Any reference to a value in
that range returns the same cached object, so is returns True. Outside that range, equal
integers are usually distinct objects.
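a = 100; b = 100
a is b # True on CPython: both names refer to the cached 100
x = int("1000"); y = int("1000") # built at runtime, outside the cache
x is y # False on CPython: separate objects with the same value
This is the classic interview trap: is "works" for small numbers and breaks for large
ones. Literals muddy the picture further, because the compiler may merge equal constants
within one code block: x = 1000; y = 1000 can give True in a script and False when typed
line by line in a REPL. The lesson: never use is to compare numeric values.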
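A short sketch of how the trap shows up in ordinary code. Both functions are hypothetical and differ only in the operator; the identity results in the comments describe CPython's behaviour and are not guaranteed by the language.

```python
def counts_match_buggy(a, b):
    """Compare lengths with is: wrong, it only works by accident."""
    return len(a) is len(b)


def counts_match(a, b):
    """Compare lengths with ==: correct for every size."""
    return len(a) == len(b)


small_a, small_b = [0] * 10, [0] * 10
large_a, large_b = [0] * 1000, [0] * 1000

buggy_small = counts_match_buggy(small_a, small_b)   # True: 10 is a cached int
buggy_large = counts_match_buggy(large_a, large_b)   # False on CPython: two int objects

ok_small = counts_match(small_a, small_b)   # True
ok_large = counts_match(large_a, large_b)   # True
```

The buggy version passes a test written with ten items and fails with a thousand, even though nothing about the logic changed.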
String interning
As an implementation detail, CPython interns (deduplicates) string literals that look like identifiers (ASCII letters, digits, and underscores) at compile time, so equal literals share one object. Strings built at runtime usually aren't interned.
a = "hello"; b = "hello"
a is b # True — interned literal
x = "".join(["he", "llo"])
x == a # True
x is a # often False — built at runtime, not interned
You can force interning with sys.intern(s), which returns the interned copy, so keep its
return value. It is useful when you have many duplicate strings (e.g. parsed tokens) and
want fast is comparisons and lower memory. But for ordinary code, compare strings with ==.
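As a rough illustration of the token use case, the sketch below interns a list of strings built at runtime. intern_tokens is a hypothetical helper name; sys.intern is the standard-library function.

```python
import sys


def intern_tokens(tokens):
    """Return the tokens with every string replaced by its interned copy."""
    return [sys.intern(tok) for tok in tokens]


# Build the tokens at runtime, as a parser would.
raw = "get set get get set".split()
tokens = intern_tokens(raw)

# Equal interned strings are guaranteed to be the same object,
# so is can stand in for == among them.
gets = [t for t in tokens if t is tokens[0]]
distinct_objects = len({id(t) for t in tokens})   # 2: one "get", one "set"
```

Note that sys.intern accepts only exact str instances, and the identity guarantee holds only between strings that have both been passed through it.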
The golden rule: when to use is
Use is only for singletons and genuine identity checks — never for values. The
canonical cases are None, True, and False, which are unique objects.
if value is None: ... # correct
if flag is True: ... # only to tell True apart from other truthy values; usually just: if flag
if name == "admin": ... # value comparison → use ==, never is
Using is for value comparison may pass tests on small inputs and fail in production on
larger ones — exactly the kind of bug that hides until it's expensive.
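A common example of a genuine identity check beyond None is a private sentinel object. The sketch below is one possible shape for it; _MISSING and get_setting are hypothetical names chosen for illustration.

```python
_MISSING = object()   # hypothetical sentinel: no other object can be identical to it


def get_setting(settings, key, default=_MISSING):
    """Look up key; raise KeyError unless a default (even None) was supplied."""
    value = settings.get(key, _MISSING)
    if value is not _MISSING:      # identity check: a stored None still counts
        return value
    if default is _MISSING:        # caller passed no default at all
        raise KeyError(key)
    return default


settings = {"timeout": None, "retries": 0}

stored_none = get_setting(settings, "timeout")              # None, a real stored value
fallback = get_setting(settings, "debug", default=False)    # False, from the default
```

Using is here is deliberate: the question is whether the value is that one specific object, not whether it happens to compare equal to something.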
Mutability and identity
Identity matters for mutable objects: two names bound to the same object see each other's mutations, while equal-but-distinct objects don't.
a = [1, 2]; b = a # same object
b.append(3)
a # [1, 2, 3] — both names see it
c = [1, 2]; d = list(c) # d is a separate (shallow) copy
d.append(3)
c # [1, 2] — unaffected
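The same aliasing effect is behind a familiar surprise with nested lists. The sketch below contrasts list repetition, which copies references, with a comprehension, which builds a new inner list each time; the variable names are illustrative only.

```python
# Repeating a list copies references, so all three rows are the same object.
aliased = [[0] * 2] * 3
aliased[0][0] = 1
# aliased is now [[1, 0], [1, 0], [1, 0]]

# A comprehension evaluates [0] * 2 afresh on every iteration.
independent = [[0] * 2 for _ in range(3)]
independent[0][0] = 1
# independent is now [[1, 0], [0, 0], [0, 0]]

rows_shared = aliased[0] is aliased[1]             # True
rows_separate = independent[0] is independent[1]   # False
```

Checking with is before mutating is a quick way to find out whether two rows are aliases or merely equal.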
Recap
== compares value (calls __eq__); is compares identity (same object, equivalent to
id(x) == id(y) for live objects). CPython caches the integers -5..256 and interns
identifier-like string literals, so is returns True for those and usually False for equal
values created separately at runtime, which is why is on numbers and strings is a trap.
Use is only for None, True, False, and real identity checks; use == for everything
value-related. Identity also explains why aliases of a mutable object share mutations
while copies don't.