Data Structures

Heap

A priority queue in O(log n) — the array that thinks it's a tree

A binary heap is a complete binary tree stored in a flat array, where each parent is smaller (min-heap) or larger (max-heap) than its children. Insert and extract-min both run in O(log n); peek-min runs in O(1). It's the standard implementation of a priority queue and the backbone of heap sort, Dijkstra's algorithm, and event simulators.

  • InsertO(log n)
  • Extract-min / extract-maxO(log n)
  • PeekO(1)
  • Build heap from arrayO(n) — heapify
  • SpaceO(n) — flat array
  • Tree propertyComplete binary tree (level-by-level)

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How a binary heap works

Two rules define a binary heap:

  1. Shape: the tree is complete — every level is filled left-to-right, last level may be partial. This is what allows array storage with no gaps.
  2. Heap property: for a min-heap, every parent ≤ its children. For a max-heap, every parent ≥ its children. Note: the heap property is local — siblings have no required relationship.

The shape rule lets you store the tree in a flat array indexed level by level. The heap property gets maintained by two helper operations:

  • Sift up (used after insert at the end): compare the new node with its parent; if the heap property is violated, swap. Repeat upward until the parent is smaller (min-heap) or until you reach the root.
  • Sift down (used after extract-min, when the last element is moved to the root): compare with the smaller of the two children; if the heap property is violated, swap with that child. Repeat downward until both children are larger or you reach a leaf.

Both sift operations run in O(log n) because the tree height is log₂(n) for a complete binary tree of n nodes.

Array representation

For a node stored at array index i:

  • Left child is at 2i + 1
  • Right child is at 2i + 2
  • Parent is at floor((i - 1) / 2)

So the heap [1, 3, 5, 7, 9, 8, 6] represents this tree:

        1            (root, index 0)
       / \
      3   5          (indices 1, 2)
     /|   |\
    7 9   8 6        (indices 3, 4, 5, 6)

Every parent ≤ its children, and the array fills level by level. No pointers, perfect cache locality.

When to use a heap

  • Priority queue. Whenever you need "give me the smallest/largest item available" repeatedly while items keep arriving. Task schedulers, event simulators, OS process priority, network packet QoS.
  • Dijkstra's algorithm and Prim's MST. Both repeatedly extract the next-closest unvisited node — a min-heap operation.
  • K-th largest / smallest. Maintain a heap of size k; for the k-th largest, use a min-heap of size k and pop the smallest whenever a new element exceeds the root. O(n log k), beats sorting (O(n log n)) when k ≪ n.
  • Median maintenance. Two heaps — a max-heap of the lower half and a min-heap of the upper half. Median is at the top of one of them. O(log n) per insert.
  • Heap sort. Build a max-heap, then repeatedly extract-max into the back of the array. In-place, O(n log n) worst case.
  • Event-driven simulation. Simulate events in chronological order: insert events with their timestamp as priority, repeatedly extract-min to get the next event to process.

Heap vs BST

HeapBST (balanced)
InsertO(log n)O(log n)
Find min / maxO(1)O(log n)
Extract min / maxO(log n)O(log n)
Search arbitrary valueO(n)O(log n)
Sorted iterationO(n log n)O(n) in-order
Memory layoutFlat array (cache-friendly)Pointer-heavy nodes
Best forPriority queue, partial orderOrdered map, range queries

A heap is the right answer when you only ever look at the min or max. A BST is right when you also need ordered iteration, range queries, or "find next-larger key X."

JavaScript implementation

class MinHeap {
  constructor() { this.heap = []; }
  size() { return this.heap.length; }
  peek() { return this.heap[0]; }

  push(value) {
    this.heap.push(value);
    this._siftUp(this.heap.length - 1);
  }

  pop() {
    if (!this.heap.length) return undefined;
    const min = this.heap[0];
    const last = this.heap.pop();
    if (this.heap.length) {
      this.heap[0] = last;
      this._siftDown(0);
    }
    return min;
  }

  _siftUp(i) {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.heap[parent] <= this.heap[i]) break;
      [this.heap[parent], this.heap[i]] = [this.heap[i], this.heap[parent]];
      i = parent;
    }
  }

  _siftDown(i) {
    const n = this.heap.length;
    while (true) {
      const left = 2 * i + 1, right = 2 * i + 2;
      let smallest = i;
      if (left  < n && this.heap[left]  < this.heap[smallest]) smallest = left;
      if (right < n && this.heap[right] < this.heap[smallest]) smallest = right;
      if (smallest === i) break;
      [this.heap[smallest], this.heap[i]] = [this.heap[i], this.heap[smallest]];
      i = smallest;
    }
  }

  // Build a heap from an array in O(n) instead of n inserts (O(n log n))
  static heapify(arr) {
    const heap = new MinHeap();
    heap.heap = [...arr];
    for (let i = (arr.length >> 1) - 1; i >= 0; i--) heap._siftDown(i);
    return heap;
  }
}

Python implementation

import heapq

# Python's heapq is a min-heap built on a regular list
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 3)
heapq.heappush(heap, 8)
heapq.heappop(heap)  # → 3

# Build a heap from a list in O(n)
data = [9, 4, 1, 7, 2, 8, 5]
heapq.heapify(data)  # data is now a valid min-heap

# For a max-heap, negate values:
maxheap = []
heapq.heappush(maxheap, -5)
heapq.heappush(maxheap, -3)
-heapq.heappop(maxheap)  # → 5

Python's heapq is implemented in C and runs ~10× faster than any pure-Python heap. For production Python code, always use heapq.

Famous heap problems

K-th largest element in an array

function kthLargest(nums, k) {
  // Maintain a min-heap of size k. The root is always the k-th largest seen.
  const heap = new MinHeap();
  for (const n of nums) {
    heap.push(n);
    if (heap.size() > k) heap.pop();
  }
  return heap.peek();
}

O(n log k) time, O(k) space. Beats sorting (O(n log n), O(n) space) when k ≪ n.

Merge k sorted lists

function mergeKLists(lists) {
  const heap = new MinHeap();
  // Heap stores [value, listIndex, elementIndex]
  for (let i = 0; i < lists.length; i++) {
    if (lists[i].length) heap.push([lists[i][0], i, 0]);
  }
  const out = [];
  while (heap.size()) {
    const [val, listIdx, elemIdx] = heap.pop();
    out.push(val);
    if (elemIdx + 1 < lists[listIdx].length) {
      heap.push([lists[listIdx][elemIdx + 1], listIdx, elemIdx + 1]);
    }
  }
  return out;
}

O(n log k) where n is total elements and k is number of lists. The heap holds at most k elements at a time.

Common bugs and edge cases

  • Forgetting to sift after pop. Replacing the root with the last element doesn't preserve the heap property — you must sift down. Skipping this step silently produces wrong answers.
  • Building a heap with n inserts. O(n log n). Use heapify instead — O(n). Easy mistake when the input is a static array; an n=10⁶ array can take seconds vs milliseconds.
  • Comparing complex objects with the default comparator. Heaps need a total ordering. For tuples, the lexicographic default works. For custom objects, supply a comparator or wrap in a tuple (priority, value).
  • Using a heap when you need a sorted structure. Heaps are partially ordered — heap[1] and heap[2] have no defined relationship. Iterating the array doesn't give sorted output. Either pop everything (O(n log n)) or use a BST.
  • Decrease-key without an index map. A flat heap has no efficient way to find the node holding a particular value. If you need decrease-key (Dijkstra-style), maintain a separate hash map from value/id → array index, updated on every swap.

Frequently asked questions

How is a heap stored as an array?

The complete binary tree is laid out level-by-level in an array. For a node at index i, the left child is at 2i+1 and the right child is at 2i+2. The parent of index i is at floor((i-1)/2). No pointers, no allocation per node — just arithmetic on array indices. Cache-friendly and fast.

What's the difference between a heap and a BST?

A heap maintains a partial order — only the parent-child relationship. Siblings can be in any order. A BST maintains a total order — for every node, all of left subtree is smaller, all of right subtree is larger. The heap property is weaker, which is why heap operations are O(log n) without rotations. The cost: a heap can extract min/max in O(log n) but can't search for an arbitrary value in better than O(n).

How does heapify build a heap from an array in O(n)?

Start from the last non-leaf node (index n/2 - 1) and sift down each node toward the leaves. The sift-down at each level does work proportional to its height. The total work sums to O(n) because most nodes are near the bottom (low height) and only one node sifts log n levels. Counter-intuitive but mathematically tight — this is faster than n inserts (which would be O(n log n)).

Why does Dijkstra's algorithm use a heap?

Dijkstra repeatedly picks the next-closest unvisited node. With a heap, that's O(log V) per pick instead of O(V) with a flat array. For dense graphs the heap version isn't always faster, but for sparse graphs it gives Dijkstra its O((V + E) log V) bound. Fibonacci heaps drop it further to O(E + V log V) but the constants are huge — binary heaps win in practice almost always.

Is heap sort fast?

O(n log n) worst case, in-place, not stable. The constants are higher than quicksort and merge sort because of cache-unfriendly access patterns (each sift-down jumps around the array). Used when worst-case guarantees matter and you can't afford merge sort's O(n) extra memory — introsort uses heap sort as a fallback when quicksort recursion gets too deep.

Can a heap support decrease-key operation?

Yes, in O(log n) — sift the changed node up. But you need a way to find the node by value, which a flat heap doesn't support natively. The fix is a separate hash map from value-or-id to heap-array-index, updated on every swap. Used by Dijkstra and Prim's algorithm with priority updates.

What's a Fibonacci heap and is it ever worth using?

A Fibonacci heap is a more complex priority queue with O(1) amortized insert and decrease-key, O(log n) amortized extract-min. Theoretically gives the best Dijkstra bound. In practice, the constant factors are so high that binary heaps beat it for almost all real graph sizes. Used in theory papers, rarely in production code.