Compilers

LR(1) Parsing

A stack, a state table, and one token of lookahead — the parser that swallowed every serious language

LR(1) parsing builds a parse tree bottom-up using a shift-reduce stack driven by a precomputed state table, deciding each step from one lookahead token to recognize the largest class of deterministic context-free grammars in linear time.

Parse timeO(n)
Lookahead1 token
Directionbottom-up, rightmost
Table buildO(states × grammar)
Per-step decisionO(1) table lookup

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How LR(1) parsing works

A top-down parser reads a grammar rule and tries to guess which production to expand before it has seen the input that justifies the guess. An LR parser does the opposite: it pushes tokens onto a stack and waits until it has accumulated the entire right-hand side of a rule, then collapses that span into the left-hand side. Because it postpones every decision to the last possible moment — after seeing the whole production plus one lookahead token — it can parse grammars that a top-down parser chokes on, including left-recursive ones.

The name encodes the algorithm. L means it scans input Left-to-right. R means it constructs a Rightmost derivation, but in reverse — it discovers the last rule applied first. The (1) is the lookahead: one token of context is enough to decide what to do next.

At runtime the parser is almost embarrassingly simple. It maintains a stack of states (small integers) and consults two tables:

ACTION[state, terminal] tells it to shift (push the input token and move to a new state), reduce (pop a completed right-hand side and apply a rule), accept, or error.
GOTO[state, nonterminal] tells it which state to enter after a reduce produces a nonterminal.

The loop is just: look at the top-of-stack state and the current lookahead, fetch the action, do it, repeat. All the intelligence lives in the table, which is computed once at parser-generation time and then read-only forever. That separation — a dumb fast engine plus a precomputed brain — is exactly why generated LR parsers are both tiny and fast.

Where the state table comes from

The magic is the table, and it is built from LR(1) items. An item is a grammar rule with a dot marking how much has been recognized, plus a lookahead set saying which token may legally follow. Writing A → α • β, a means "we have seen α, expect β next, and after the whole rule completes we expect terminal a."

The generator computes the canonical collection of item sets — each item set is one state. Two operations drive it:

CLOSURE. If a state expects a nonterminal next (the dot sits before B), add every production of B with the dot at the start, and compute its lookahead from FIRST of what follows. This expands "we might see a B" into all the concrete ways a B can begin.
GOTO. Move the dot past a symbol to get the successor state. This is literally the transition function of a deterministic finite automaton over the grammar symbols.

The result is a recognizer for viable prefixes — the prefixes of right-sentential forms that can still lead to a valid parse. The reduce decisions fall out of items whose dot has reached the end: A → α •, a says "reduce by A → α when the lookahead is a." It is that per-item lookahead — carried through closure — that gives LR(1) its precision and separates it from the weaker SLR and LALR variants below.

When to reach for LR parsing

Programming-language compilers and interpreters. C, C++, Go (originally), Ruby, PHP, and Bash all ship with yacc/Bison-style LALR(1) grammars. The token stream is deterministic and the grammar is fixed, which is exactly LR's sweet spot.
Configuration and query languages. SQL dialects, expression evaluators, and DSLs benefit from declarative grammar files that a generator turns into a parser, instead of hand-maintained recursive descent.
Left-recursive grammars. Arithmetic with left-associative operators (E → E + T) is natural in LR and a chore in LL, where you must rewrite the recursion away.
You want conflict reports up front. A generator tells you at build time that your grammar is ambiguous, which is far better than discovering it from a mis-parsed input in production.

Reach for something else when the grammar must change at runtime, when you need brilliant error messages and recovery (hand-written recursive descent wins here, which is why Clang and GCC abandoned generated parsers), or when the language is genuinely ambiguous and you would rather use a GLR or PEG parser than fight conflicts.

LR(1) versus the other parsing families

	Canonical LR(1)	LALR(1)	SLR(1)	LL(1) / recursive descent	GLR	PEG / packrat
Direction	bottom-up	bottom-up	bottom-up	top-down	bottom-up	top-down
Grammar class	all det. CFGs	most det. CFGs	fewer	strict subset of LR	all CFGs (ambiguous too)	limited backtracking
Lookahead source	per-item set	merged per-item	FOLLOW set	FIRST/FOLLOW	splits on conflict	ordered choice
Left recursion	native	native	native	must rewrite	native	not allowed
Table / code size	largest	~10× smaller	smallest table	tiny (code)	LR table + stacks	memo table
Parse time	O(n)	O(n)	O(n)	O(n)	O(n³) worst	O(n) w/ memo
Typical tool	Menhir, IELR	yacc, Bison, GNU	old tutorials	ANTLR, by hand	Bison %glr, Elkhound	pest, PEG.js

The practical hierarchy is SLR(1) ⊂ LALR(1) ⊂ LR(1): each accepts strictly more grammars than the last using more lookahead precision. LL(1) is incomparable in shape but strictly weaker in power — it sits entirely inside LR(1). Bison's default is LALR(1) because it captures nearly everything people write at a fraction of the table size; you opt into IELR or canonical LR(1) only when a stubborn reduce-reduce conflict demands it.

What the numbers actually say

Parse time is strictly O(n). Each of the n tokens is shifted exactly once, and the number of reductions equals the number of internal parse-tree nodes, also O(n). With O(1) table lookups, generated LR parsers commonly hit 10–100 million tokens per second, and lexing — not parsing — is usually the bottleneck.
LR(1) tables are roughly 10× larger than LALR(1). For an ANSI C grammar, canonical LR(1) produces on the order of tens of thousands of states; the LALR(1) merge collapses that to roughly a few thousand. Knuth noted this state explosion in his 1965 paper, and it kept full LR(1) impractical for a decade.
The table is sparse. A naive ACTION/GOTO table is states × symbols, but most entries are error. Real generators compress it with row displacement or default reductions, often shrinking it by 80–95%, which is how a parser for a full language fits in a few tens of kilobytes.
Table construction is the expensive phase, not parsing. Building the canonical collection is roughly O(states × items × grammar size), polynomial but with a large constant. It runs once at build time, so the cost never touches your hot path.
LR was discovered in 1965 by Donald Knuth, who proved that one token of lookahead suffices to parse every deterministic context-free language bottom-up — a result that surprised people who assumed bottom-up parsing needed unbounded context.

JavaScript implementation

Here is the entire runtime engine. It is deliberately table-driven: the ACTION and GOTO tables below encode the classic left-recursive arithmetic-expression grammar E → E + T | T, T → id, which is left-recursive and therefore impossible for a naive top-down parser.

// Grammar (rule index → [lhs, rhsLength]):
//   0: E' -> E        (augmented start)
//   1: E  -> E + T
//   2: E  -> T
//   3: T  -> id
const RULES = [['E\'',1], ['E',3], ['E',1], ['T',1]];

// ACTION[state][terminal]: 's4' = shift to 4, 'r2' = reduce by rule 2, 'acc' = accept
const ACTION = {
  0:{id:'s3'},                 1:{'+':'s4', $:'acc'},
  2:{'+':'r2', $:'r2'},        3:{'+':'r3', $:'r3'},
  4:{id:'s3'},                 5:{'+':'r1', $:'r1'},
};
// GOTO[state][nonterminal]
const GOTO = { 0:{E:1, T:2}, 4:{T:5} };

function parse(tokens) {
  tokens = [...tokens, '$'];        // end marker
  const stack = [0];                // stack of states
  const trail = [];                 // reductions in the order discovered
  let ip = 0;

  while (true) {
    const state = stack[stack.length - 1];
    const tok = tokens[ip];
    const act = (ACTION[state] || {})[tok];
    if (!act) throw new SyntaxError(`unexpected '${tok}' in state ${state}`);

    if (act === 'acc') return trail;            // success
    if (act[0] === 's') {                        // SHIFT
      stack.push(parseInt(act.slice(1)));
      ip++;
    } else {                                     // REDUCE by rule act[1..]
      const [lhs, len] = RULES[parseInt(act.slice(1))];
      for (let i = 0; i < len; i++) stack.pop();  // pop the RHS
      const back = stack[stack.length - 1];
      stack.push(GOTO[back][lhs]);                // GOTO on the new nonterminal
      trail.push(`${lhs} -> (rule ${act.slice(1)})`);
    }
  }
}

// Parse  id + id + id  — note the left recursion folds left-to-right
console.log(parse(['id', '+', 'id', '+', 'id']));
// => [ 'T -> ...', 'E -> ...', 'T -> ...', 'E -> ...', 'T -> ...', 'E -> ...' ]

Notice what is and is not here. The engine has no recursion, no grammar logic, and no lookahead reasoning — all of that was baked into ACTION and GOTO when the tables were generated. The runtime is a 20-line loop. This is the property that makes LR parsers so fast and so portable: the same engine drives any grammar once you swap the tables.

Python implementation

The same engine in Python, plus a sketch of closure to show where the tables actually come from — closure over LR(1) items is the heart of table construction.

RULES = [("E'", 1), ('E', 3), ('E', 1), ('T', 1)]

ACTION = {
    0: {'id': ('s', 3)},
    1: {'+': ('s', 4), '$': ('acc', None)},
    2: {'+': ('r', 2), '$': ('r', 2)},
    3: {'+': ('r', 3), '$': ('r', 3)},
    4: {'id': ('s', 3)},
    5: {'+': ('r', 1), '$': ('r', 1)},
}
GOTO = {0: {'E': 1, 'T': 2}, 4: {'T': 5}}

def parse(tokens):
    tokens = list(tokens) + ['$']
    stack, ip, trail = [0], 0, []
    while True:
        state, tok = stack[-1], tokens[ip]
        act = ACTION.get(state, {}).get(tok)
        if act is None:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")
        kind, arg = act
        if kind == 'acc':
            return trail
        if kind == 's':                       # SHIFT
            stack.append(arg)
            ip += 1
        else:                                  # REDUCE by rule arg
            lhs, length = RULES[arg]
            del stack[-length:]                # pop the whole RHS
            stack.append(GOTO[stack[-1]][lhs]) # GOTO
            trail.append(f"reduce {lhs} (rule {arg})")

# --- where the table comes from: CLOSURE of LR(1) items ---
# An item is (lhs, rhs_tuple, dot_index, lookahead).
def closure(items, grammar, first):
    items = set(items)
    changed = True
    while changed:                              # fixpoint iteration
        changed = False
        for (lhs, rhs, dot, la) in list(items):
            if dot < len(rhs) and rhs[dot] in grammar:   # dot before a nonterminal B
                B = rhs[dot]
                beta = rhs[dot + 1:] + (la,)             # what follows B
                for prod in grammar[B]:                  # add B's productions
                    for tok in first(beta):              # with computed lookahead
                        new = (B, prod, 0, tok)
                        if new not in items:
                            items.add(new); changed = True
    return items

print(parse(['id', '+', 'id', '+', 'id']))

The closure function is the conceptual core that parse hides. It runs a fixpoint loop: whenever the dot sits before a nonterminal B, it pulls in every production of B, and — crucially for the (1) in LR(1) — it computes each new item's lookahead from FIRST of whatever follows. SLR and LALR cut corners exactly here; canonical LR(1) keeps every lookahead distinct, which is why it accepts the most grammars and produces the biggest tables.

Variants worth knowing

SLR(1) — Simple LR. Uses the grammar-wide FOLLOW set to decide reduces instead of per-state lookahead. Smallest tables, weakest power; it rejects many grammars that LALR accepts. Mostly a teaching stepping stone today.

LALR(1) — Look-Ahead LR. Build the canonical LR(1) automaton, then merge any two states whose item cores (rules and dot positions) match, unioning their lookaheads. This shrinks the table by roughly 10× at the cost of occasional spurious reduce-reduce conflicts. It is what yacc, Bison, and the original Go parser used.

IELR(1). A 2010 refinement that produces LALR-sized tables but recovers full canonical-LR(1) power, eliminating the merge-induced conflicts. Available in Bison via %define lr.type ielr.

GLR — Generalized LR. When the table has a conflict, fork the parse stack and pursue every interpretation in parallel using a graph-structured stack, pruning the ones that die. This parses any context-free grammar, including ambiguous ones like C++'s, at the price of up to O(n³) worst-case time. Bison's %glr-parser and Elkhound implement it.

Operator-precedence parsing. A lightweight cousin that resolves shift-reduce decisions purely from operator precedence and associativity tables. Not a full LR family but a common pragmatic shortcut for expression sublanguages.

Common bugs and edge cases

The dangling-else shift-reduce conflict. After if E then S, an else can either shift (attach here) or reduce (close the inner if). Generators silently default to shift, binding else to the nearest if — usually what you want, but worth declaring explicitly with precedence.
Reduce-reduce conflicts from LALR merging. A grammar that is conflict-free under canonical LR(1) can develop reduce-reduce conflicts after the LALR state merge unions incompatible lookaheads. Switching Bison to ielr or canonical LR(1) often fixes it.
Forgetting to augment the start symbol. The table needs a synthetic rule S' → S so the parser has a single, unambiguous accept action. Omit it and you cannot distinguish "done" from "could continue."
Popping the wrong stack count on reduce. You must pop exactly the length of the right-hand side, then GOTO from the uncovered state. Popping states and symbols inconsistently (in a two-stack design) is the classic off-by-one.
Poor error recovery. A bare LR engine just reports "syntax error" at the first bad token. Real parsers add error-token productions or panic-mode recovery; the table tells you nothing about how to resynchronize on its own.
Treating LALR conflicts as grammar bugs. Sometimes the grammar is fine and the parser model is the limitation. Before rewriting the grammar, check whether full LR(1) accepts it — you may just need a stronger generator.

Frequently asked questions

What does the (1) in LR(1) mean?

It means the parser reads input Left-to-right, builds a Rightmost derivation in reverse, and uses 1 token of lookahead to decide between shifting and reducing. LR(0) uses no lookahead and conflicts on most real grammars; LR(2) and higher add tokens but are almost never used because they blow up table size for little practical gain.

What is the difference between LR(1), LALR(1), and SLR(1)?

All three share the same shift-reduce engine; they differ only in how reduce actions get their lookahead. SLR(1) uses the grammar-wide FOLLOW set, LALR(1) merges canonical LR(1) states with identical cores and unions their lookaheads, and canonical LR(1) keeps every state distinct. LR(1) accepts the most grammars but produces the largest tables — often 10× more states than LALR for a real language.

Why do yacc and Bison use LALR(1) instead of full LR(1)?

Canonical LR(1) tables for a language like C can reach tens of thousands of states, which was prohibitive on 1970s hardware and is still wasteful today. LALR(1) collapses states with identical item cores, shrinking the table by roughly an order of magnitude while accepting nearly every grammar people actually write. The cost is rare mysterious reduce-reduce conflicts that pure LR(1) would not have.

What causes a shift-reduce conflict?

A shift-reduce conflict happens when a parser state can both shift the lookahead token and reduce by a completed rule, and the lookahead does not disambiguate. The classic example is the dangling else: after parsing 'if E then S', seeing 'else' could attach to this if or an outer one. Generators default to shift, which binds else to the nearest if, and let you resolve it explicitly with precedence declarations.

Is LR parsing more powerful than LL parsing?

Yes. Every LL(1) grammar is also LR(1), but not the reverse. Because LR delays its decision until it has seen the entire right-hand side plus one lookahead token, it handles left recursion natively and accepts a strictly larger class of grammars. LL parsers must rewrite left recursion away and commit to a production from the first tokens, which they cannot always do.

How fast is LR parsing?

Linear in the input length: O(n) for n tokens. Each token triggers exactly one shift, and the total number of reduces is bounded by the size of the parse tree, which is itself linear in the input. Table lookups are O(1) array indexing, so a generated LR parser typically runs at millions of tokens per second, with lexing usually the bottleneck rather than parsing.