Data Structures

Red-Black vs AVL Tree

Same asymptote, opposite philosophies — strict balance vs looser balance

Red-black and AVL are the two dominant self-balancing binary search trees, both guaranteeing O(log n) per operation. They differ in how strictly they balance: AVL keeps heights of sibling subtrees within +/-1 (max height ~1.44 log2 N), giving faster lookups but more rotations per mutation. Red-black tolerates a 2× height ratio (max height 2 log2(N+1)), giving slightly slower lookups but at most 2 rotations per insert. Standard libraries — std::map, TreeMap, Linux's task scheduler — overwhelmingly pick red-black because real-world workloads mix reads and writes, and red-black's mutation thrift wins overall.

  • AVL max height1.44 log₂ N
  • RB max height2 log₂(N+1)
  • AVL insert rotationsUp to O(log N)
  • RB insert rotationsAt most 2
  • RB delete rotationsAt most 3
  • Used byRB: std::map, TreeMap, Linux CFS

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

The balance invariants — side by side

AVL (Adelson-Velsky and Landis, 1962). For every node v, the heights of v's left and right subtrees differ by at most 1. The balance factor BF(v) = height(right) − height(left) is always in {−1, 0, +1}. Any insertion or deletion that breaks this triggers rebalancing rotations until the invariant is restored.

Red-Black (Bayer 1972 / Guibas-Sedgewick 1978). Five rules: (1) every node is red or black; (2) root is black; (3) NIL leaves are black; (4) red nodes have black children only; (5) every path from a node to its descendant NIL leaves has the same number of black nodes (black height). Together these guarantee max height ≤ 2 × log2(n+1).

The invariants imply very different shapes. AVL trees are nearly perfectly balanced: a chain of length 30 is impossible if AVL height is 30 — you need at least F(30) ≈ 832,040 nodes (where F is the Fibonacci sequence) to reach height 30 in AVL. Red-black trees can be more lopsided: a chain of black-red-black-red can reach height 30 with as few as 2^15 = 32,768 nodes.

Rotation counts — where red-black wins

For insert, both trees do a standard BST insert first, then rebalance. The rebalance cost differs sharply:

  • AVL insert. Walk up from the new node, updating balance factors. At the first node where |BF| reaches 2, perform a rotation (single or double, depending on whether the path is LL, RR, LR, or RL). This rotation always restores balance — but in the worst case, you may walk up to O(log n) levels updating balance factors before reaching that point.
  • RB insert. Color the new node red. If parent is black: done — no rebalancing needed. If parent is red: a "red-red violation" — fix by recoloring (case 1: uncle red — recolor parent/uncle black, grandparent red, recurse upward) or rotating (case 2/3: uncle black — at most 2 rotations and we're done permanently).
  • Worst-case rotations. RB insert: at most 2 rotations regardless of N. AVL insert: 1 single rotation or 1 double rotation per imbalanced ancestor — but those imbalances can cascade O(log n) levels up.
  • Recolorings. RB inserts may recolor up to O(log n) nodes (each is just one bit flip — O(1) work each). AVL updates O(log n) balance factors. Roughly equal mutation cost on recoloring/updating; the rotation savings on RB are the differentiator.

Lookup speed — where AVL wins

Lookup walks from root to target, comparing keys at each step. Average comparisons ≈ tree height. Since AVL is shorter (1.44 log2 N) than RB (2 log2 N), AVL averages fewer comparisons per lookup.

For n = 1,000,000:

  • Perfect tree height: 20 (2^20 ≈ 1M)
  • AVL max height: ~29 (1.44 × 20)
  • RB max height: ~40 (2 × 20)

In practice, for random insertions, both trees stay close to the perfect height. The worst-case gap matters only on adversarial input — and benchmarks consistently show AVL is 5–10% faster on read-only workloads. For 90% read / 10% write workloads, the AVL lookup advantage starts to pay back its higher mutation cost; somewhere around 80% read it crosses over and AVL becomes competitive overall. For 50/50 or write-heavy workloads, RB clearly wins.

Side-by-side comparison

Red-black treeAVL tree
Balance invariantPath-length ratio ≤ 2Height difference ≤ 1
Max height2 log₂(N+1)1.44 log₂(N+2) − 0.328
Min nodes for height h2^(h/2)F(h+2) − 1 (Fibonacci)
Lookup speedSlower (taller tree)Faster (shorter tree)
Insert rotationsAt most 2Up to O(log N)
Delete rotationsAt most 3Up to O(log N)
Per-node metadata1 color bitBalance factor (2-3 bits)
Recoloring on insertO(log N) (cheap, bit flips)O(log N) BF updates
Best workloadMixed reads + writesRead-heavy
Used bystd::map, TreeMap, Linux CFSDatabase indexes (less common)
Inventor / yearBayer 1972, Guibas-Sedgewick 1978Adelson-Velsky & Landis 1962

How to choose

  • Use red-black when reads and writes are mixed in your workload, when you're using a standard library implementation (it's almost certainly RB), or when you need to interop with std::map / TreeMap.
  • Use AVL when reads dominate (95%+ read), when you need tight latency bounds and can pay extra on writes, when the data structure is in a hot read path and the 5–10% lookup speedup matters.
  • Use neither when you don't need order — use a hash table. When you need persistent/immutable trees — use functional finger trees. When you need range queries on disk — use B-trees or B+-trees. When you need lock-free concurrency — use skip lists.

Worked example: 7 inserts

Insert the keys [10, 20, 30, 40, 50, 25, 15] in order into both an empty AVL and empty RB tree:

  • AVL trajectory. Insert 10 (root). Insert 20 right of 10 — BF = +1, OK. Insert 30 right of 20 — BF at 10 becomes +2 — rotate left at 10. Now tree is 20/10\30. Insert 40 right of 30 — BF at 20 becomes +1. Insert 50 right of 40 — BF at 30 becomes +2 — rotate left at 30. Tree: 20/10\40-30,50. Insert 25 left of 30 — no imbalance, BF stays under 2. Insert 15 left of 20 — no imbalance. Final tree height: 3. Total rotations: 2. Total BF updates: ~10.
  • RB trajectory. Insert 10 (root, black). Insert 20 (red, right of 10) — parent black, OK. Insert 30 (red, right of 20) — red-red violation — uncle is NIL (black) — rotate left at 10, recolor. Tree: 20(B)/10(R)\30(R). Insert 40 (red, right of 30) — uncle 10 is red — recolor both children of 20 black, recolor 20 red. But 20 is root — make it black. Insert 50 (red, right of 40) — uncle is NIL, rotate left at 30, recolor. Insert 25 (red, left of 30) — no violation. Insert 15 (red, left of 10) — no violation. Final height: 3. Total rotations: 2. Total recolorings: ~4.

Both trees do 2 rotations on this sequence — a tie. For adversarial sequences (sorted input), AVL still rebalances at every level, while RB does at most 2 rotations regardless. RB's mutation thrift is most visible on long insertion sequences.

Real-world usage

  • std::map and std::set (libstdc++, libc++). Red-black tree. Reasonable balance, cheap mutations.
  • java.util.TreeMap and TreeSet. Red-black tree.
  • Linux kernel's Completely Fair Scheduler (CFS). Red-black tree, keyed by virtual runtime. Picks the leftmost (lowest vruntime) process to run next. Insert/remove on every context switch — extremely mutation-heavy, hence RB.
  • Linux kernel's epoll, virtual memory areas (vma_struct). Red-black tree.
  • Some database B-tree variants. Effectively RB on a per-page basis when leaf data is in-memory.
  • AVL trees in practice. Some embedded systems libraries (with strict latency bounds favoring shorter trees), some database catalog indexes where reads dominate, some algorithmic libraries for theoretical work. AVL is rarely the default in modern standard libraries.

Common misconceptions

  • "AVL is always faster than RB." Only for lookups. Mutations are cheaper on RB. Mixed workloads favor RB.
  • "RB is always faster than AVL." Only for writes. Lookup-heavy workloads favor AVL by 5–10%.
  • "AVL's height bound is 1.5 log N." Close — the precise bound is 1.4404 log2(N+2) − 0.328. The 1.44 coefficient is the golden-ratio-related constant 1/log2(phi).
  • "RB does at most 2 rotations per delete." Insert yes, delete is up to 3. Different operations have different rotation budgets.
  • "AVL's balance factor must be stored as integer." No — it fits in 3 bits ({-2, -1, 0, +1, +2} during rebalance). Some implementations store the full subtree height (1 byte) instead for simpler code.
  • "Red-black trees are 'order' trees by definition." No — they're a coloring scheme on top of an otherwise plain BST. The colors enforce balance; ordering is the BST invariant, unrelated.

Frequently asked questions

Which is faster — red-black or AVL?

It depends on the workload. AVL is faster for read-heavy workloads because it keeps the tree shorter — typically 5–10% fewer comparisons per lookup. Red-black is faster for write-heavy workloads because it does fewer rotations per insert (at most 2 vs AVL's potential O(log n) rebalancing). For mixed workloads where reads and writes are roughly balanced, red-black usually wins overall because the rotation savings on writes dominate the small lookup penalty.

What are the height bounds?

AVL: at most 1.4404 * log2(n + 2) - 0.328 ≈ 1.44 log2 n. Red-black: at most 2 * log2(n + 1). For n = 1,000,000, AVL allows height up to about 30; red-black up to about 40. Both are O(log n), but the AVL constant is tighter. Red-black trees can be up to 2x as tall, but in practice the difference for random keys is small.

Why does red-black do fewer rotations?

AVL's tight ±1 balance means even minor imbalances trigger rotation — and rotations may cascade up the tree, since rebalancing the parent can imbalance the grandparent. Red-black's looser invariant (path lengths within 2x) tolerates more imbalance before forcing a rotation. RB inserts do at most 2 rotations regardless of tree size; AVL inserts may rotate at every level on the way up.

What's stored per node — the metadata cost?

AVL stores either a balance factor (-2 to +2, fits in 3 bits) or the full subtree height (typically 1 byte = 8 bits since heights stay under 64 for any practical n). Red-black stores 1 color bit. Both round up to 1 byte in practice due to padding.

Are there workloads where AVL clearly wins?

Yes. Read-mostly databases with a static-ish index — say, 95% lookups and 5% inserts — benefit from AVL's shorter tree height. Some embedded systems also choose AVL for the predictability of its bounds: AVL's worst-case lookup is at 1.44 log2 N, vs red-black's 2 log2 N, so AVL has a 30% tighter latency bound.

What about other balanced BSTs?

Many alternatives exist. WAVL trees (Haeupler-Sen-Tarjan 2015) get AVL-like balance with red-black-like rotation counts. Scapegoat trees rebuild subtrees instead of rotating. Treaps mix BST and heap properties for randomized balance. Skip lists offer O(log n) probabilistically with no rotations whatsoever, naturally lock-free.