Algorithms

Heapsort

O(n log n) guaranteed, O(1) extra space — the safety net under quicksort

Heapsort builds a max-heap from the array in O(n), then repeatedly extracts the max and places it at the end. It guarantees O(n log n) worst-case time, sorts in-place, and never falls into quicksort's pathological cases.

Time (best)O(n log n)
Time (average)O(n log n)
Time (worst)O(n log n)
SpaceO(1)
StableNo
In-placeYes
AdaptiveNo

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How heapsort works

Heapsort is selection sort done correctly. Selection sort scans the unsorted region linearly to find the next smallest — O(n) work per pick, O(n²) total. Heapsort organizes the unsorted region as a heap, finding the next largest in O(log n) — O(n log n) total.

The algorithm has two phases:

Build a max-heap in place over the entire array. After this phase, arr[0] contains the largest element, and the array satisfies the heap property: every parent is ≥ both children.
Sort by repeated extract-max. Swap arr[0] (the max) with the last unsorted slot. Shrink the heap region by one. Sift down the new root to restore the heap property. Repeat until one element remains.

After phase 2, the original array region is filled left-to-right with extracted maxes — placed at progressively earlier positions — yielding ascending order.

Storing a heap in an array

The heap is conceptually a complete binary tree, but lives in the same array. Index arithmetic encodes the tree:

For a node at index i (0-based):
  parent      = (i - 1) / 2
  left child  = 2i + 1
  right child = 2i + 2

Tree:                Array layout:
       16            [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
      /  \
    14    10
    / \   / \
   8   7 9   3
  /\  /
 2  4 1

No pointers, no allocations, perfect cache density when the heap is small. The trade-off: parent and child indices grow exponentially apart, so once the heap exceeds cache size, accesses become semi-random.

Trace through 5 elements

Sorting [3, 1, 4, 1, 5]:

=== Phase 1: build max-heap ===

Start tree:        3
                  / \
                 1   4
                / \
               1   5

heapify from i=1 (last non-leaf in 0-indexed):
  arr[1] = 1, children = arr[3]=1, arr[4]=5
  largest child = 5; swap arr[1] ↔ arr[4]
  → [3, 5, 4, 1, 1]
              5
             / \
            1   4    (now sift down at i=4: leaf, stop)
           / \
          1   1

heapify from i=0:
  arr[0] = 3, children = arr[1]=5, arr[2]=4
  largest child = 5; swap arr[0] ↔ arr[1]
  → [5, 3, 4, 1, 1]
  sift down at i=1: arr[1]=3, children arr[3]=1, arr[4]=1
  3 ≥ both children, stop.

Heap built: [5, 3, 4, 1, 1]   (max=5 at root)

=== Phase 2: extract-max repeatedly ===

Iter 1: swap arr[0] ↔ arr[4]:    [1, 3, 4, 1 | 5]
        heap size = 4, sift down arr[0]:
        arr[0]=1, children arr[1]=3, arr[2]=4 → swap with 4
        [4, 3, 1, 1 | 5]
        sift down at i=2: leaf, stop.

Iter 2: swap arr[0] ↔ arr[3]:    [1, 3, 1 | 4, 5]
        heap size = 3, sift down arr[0]:
        arr[0]=1, children arr[1]=3, arr[2]=1 → swap with 3
        [3, 1, 1 | 4, 5]

Iter 3: swap arr[0] ↔ arr[2]:    [1, 1 | 3, 4, 5]
        heap size = 2, sift down arr[0]:
        arr[0]=1, child arr[1]=1, no swap.
        [1, 1 | 3, 4, 5]

Iter 4: swap arr[0] ↔ arr[1]:    [1 | 1, 3, 4, 5]
        heap size = 1, done.

Final: [1, 1, 3, 4, 5] ✓

When to use heapsort

Worst-case guarantees matter. Quicksort's O(n²) worst case can be triggered by adversarial inputs. Heapsort never gets worse than O(n log n), no matter the input.
Memory budgets are tight. Merge sort needs O(n) auxiliary space; heapsort needs O(1). Useful in embedded systems, kernel code, or any context where you can't allocate.
You need a priority queue anyway. If you already have a heap data structure, heapsort is essentially free — repeated extract-min/max into an output array.
You want a top-k extraction. Heapsort can be stopped after k extracts, giving you the k largest in O(n + k log n) — useful for partial sorts.

The most common heapsort sighting in production code: as the fallback inside introsort. Quicksort runs by default, but if recursion depth exceeds 2·log₂(n), the algorithm switches to heapsort to avoid the quadratic blowup. C++'s std::sort, .NET's Array.Sort, and Linux kernel's sorting all use this pattern.

Heapsort vs other O(n log n) sorts

	Heapsort	Quicksort	Merge sort	Timsort
Best case	O(n log n)	O(n log n)	O(n log n)	O(n)
Average	O(n log n)	O(n log n)	O(n log n)	O(n log n)
Worst case	O(n log n)	O(n²)	O(n log n)	O(n log n)
Space	O(1)	O(log n) stack	O(n)	O(n)
Stable	No	No	Yes	Yes
Adaptive	No	No	No	Yes
Cache locality	Poor	Excellent	OK	Good
Comparisons (avg)	~2n log n	~1.39n log n	~n log n	~n log n
Best for	Worst-case bound	Random data	Linked lists, stable	Real-world data

Heapsort has the best worst case among comparison sorts but the worst constant factor. Quicksort runs faster on random data because of cache locality. Merge sort is the only O(n log n) stable sort. Timsort dominates real-world (mostly-sorted) data because it's adaptive.

What "O(n log n)" actually costs

Heapsort's inner loop does two integer multiplies (or shifts), two array reads, a compare, and a conditional swap per level. About 8-15 ns per sift-down step on modern hardware:

n	Comparisons (~2n log₂ n)	Wall time (≈)	Quicksort (≈)
1,000	~20,000	0.2 ms	0.08 ms
10,000	~265,000	3 ms	1 ms
100,000	~3.3 million	40 ms	14 ms
1,000,000	~40 million	500 ms	180 ms
10,000,000	~466 million	6 s	2.2 s

The 2-3× gap with quicksort is the cache-locality penalty in numbers. Once n exceeds L2 cache (~256K-1M ints), every sift-down jumps across cache lines: a parent at index i and its child at index 2i+1 are 4·(i+1) bytes apart. For i = 100,000, that's 400KB — well outside L1.

JavaScript implementation

function heapsort(arr) {
  const n = arr.length;

  // Phase 1: build max-heap with bottom-up sift-down (Floyd's method).
  // Start from the last non-leaf node: index ⌊n/2⌋ - 1.
  for (let i = (n >> 1) - 1; i >= 0; i--) {
    siftDown(arr, i, n);
  }

  // Phase 2: repeatedly swap root (max) with the last unsorted slot,
  // shrink the heap, and restore the heap property.
  for (let end = n - 1; end > 0; end--) {
    [arr[0], arr[end]] = [arr[end], arr[0]];
    siftDown(arr, 0, end);
  }
  return arr;
}

function siftDown(arr, root, end) {
  while (true) {
    const left = 2 * root + 1;
    const right = 2 * root + 2;
    let largest = root;
    if (left < end && arr[left] > arr[largest]) largest = left;
    if (right < end && arr[right] > arr[largest]) largest = right;
    if (largest === root) return;
    [arr[root], arr[largest]] = [arr[largest], arr[root]];
    root = largest;
  }
}

Note that siftDown is iterative, not recursive — recursive sift-down adds 1.5-2× overhead from stack frames in tight loops. The end parameter gives sift-down the heap-region boundary; everything from index end onward is the sorted suffix.

Python implementation

def heapsort(arr):
    n = len(arr)
    # Phase 1: build max-heap (bottom-up)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    # Phase 2: extract-max repeatedly
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(arr, 0, end)
    return arr

def sift_down(arr, root, end):
    while True:
        left = 2 * root + 1
        right = 2 * root + 2
        largest = root
        if left < end and arr[left] > arr[largest]:
            largest = left
        if right < end and arr[right] > arr[largest]:
            largest = right
        if largest == root:
            return
        arr[root], arr[largest] = arr[largest], arr[root]
        root = largest

# Note: Python's heapq module gives min-heap operations.
# For sorting, you'd negate keys or do:
import heapq
def heapsort_via_heapq(arr):
    h = list(arr)
    heapq.heapify(h)        # O(n) min-heap construction
    return [heapq.heappop(h) for _ in range(len(h))]  # O(n log n)

Why heap-build is O(n), not O(n log n)

The phase-1 loop runs n/2 times, each calling sift-down. A naive analysis says each sift-down is O(log n) so total is O(n log n). But that's loose. Sift-down at depth d costs O(h − d), where h = log₂(n) is the tree height:

Level 0 (root, 1 node): up to log n swaps
Level 1 (2 nodes): up to log n − 1 swaps each
Level 2 (4 nodes): up to log n − 2 swaps each
...
Level log n − 1 (n/2 leaves): 0 swaps each

Sum: Σ (n / 2^(d+1)) · (h − d) = O(n). The leaves dominate the count but do zero work; the root does O(log n) work but there's only one of it. The series telescopes to a constant times n.

Top-down vs bottom-up heapify

Two ways to build a heap:

Top-down (sift-up): Insert elements one at a time, calling sift-up after each insertion. Each insertion is O(log n), total O(n log n).
Bottom-up (sift-down, Floyd's method): Place all elements first, then sift-down from the last non-leaf to the root. Total O(n).

Floyd's method is the standard for heapsort because it's a constant factor faster: O(n) vs O(n log n) for the build phase. The phase-2 extract loop is O(n log n) either way, so the build savings is real but proportionally small (~10-15% of total runtime).

The intuition for why bottom-up is faster: top-down asks every node to walk up to the root (long path); bottom-up asks every node to walk down to the leaves (and most nodes are already close to the leaves).

Heapsort as the safety net: introsort

Most production sorts use introsort: quicksort by default, switching to heapsort if recursion depth gets dangerous. The pseudocode:

function introsort(arr, lo = 0, hi = arr.length - 1, depthLimit = 2 * Math.log2(arr.length)) {
  if (hi - lo < 16) {
    insertionSort(arr, lo, hi);  // small subarrays
    return;
  }
  if (depthLimit === 0) {
    heapsort(arr.slice(lo, hi + 1));  // worst-case fallback
    return;
  }
  const p = partition(arr, lo, hi);
  introsort(arr, lo, p - 1, depthLimit - 1);
  introsort(arr, p + 1, hi, depthLimit - 1);
}

The 2·log₂(n) threshold is the depth a healthy quicksort should reach. Crossing it implies pathologically bad pivots — adversarial input or all-duplicates — and triggers the switch. std::sort in C++ and Array.Sort in .NET both implement this pattern.

Common bugs and edge cases

Off-by-one in heap-build start. The first non-leaf in a 0-indexed array is at (n >> 1) - 1, not n / 2. Off-by-one here misses the topmost layer of internal nodes, leaving the heap broken.
Forgetting the end bound in sift-down. Sift-down must respect the shrinking heap boundary; reaching into the sorted suffix overwrites already-placed elements.
Recursive sift-down on deep heaps. Python's recursion limit (1000 by default) can blow up on heaps with n > 2¹⁰⁰⁰. Use iterative sift-down.
Min-heap/max-heap confusion. Heapsort needs a max-heap to sort ascending. If you reach for heapq in Python, that's a min-heap — you'll get descending order or need to negate keys.
Stability assumption. Code that relies on stable order (sorting by secondary keys via a sort chain) breaks silently when switched from merge sort to heapsort.
Trying to use it on linked lists. Heapsort needs O(1) random access — index arithmetic for parent/child. Linked lists make every access O(n), turning the algorithm into O(n² log n).

Frequently asked questions

Why is heapsort slower than quicksort if both are O(n log n)?

Heapsort's memory access pattern jumps between parent and child indices that are far apart in the array (parent of index i is at i/2, child at 2i). This thrashes the cache. Quicksort's partition step reads sequential memory — perfect for prefetchers. In practice quicksort is 2-3× faster on random data despite identical asymptotics.

Why is heap construction O(n) and not O(n log n)?

Floyd's bottom-up heapify works on the leaves first, where sift-down does zero work. Each level up doubles the work per node but halves the node count. The sum n/4·1 + n/8·2 + n/16·3 + ... converges to O(n) — leaves dominate the count, internal nodes the work, and they balance out.

Is heapsort stable?

No. Equal elements can swap positions during heap construction or extraction — sift-down picks one of two equal children arbitrarily. To make it stable you must augment keys with their original indices, costing O(n) extra space and turning equal-key comparisons into tie-breaks by index.

What is the difference between sift-up and sift-down?

Sift-up (or 'percolate up') moves a too-large element toward the root by repeated parent swaps. Sift-down ('percolate down') moves a too-small root toward the leaves by repeated child swaps. Heap construction uses sift-down; insertion uses sift-up. Heapsort uses sift-down exclusively.

Why is heapsort the textbook example for O(n log n) lower bound?

Because it achieves the comparison-sort lower bound (proven via decision trees: any comparison sort needs ⌈log₂(n!)⌉ ≈ n log n − 1.44n comparisons). Heapsort does ~2n log n comparisons, within a constant factor of optimal — and unlike merge sort, it does it in O(1) extra space.

When should I pick heapsort over quicksort or merge sort?

When you need a worst-case guarantee in O(1) extra space. Embedded systems with strict memory budgets, real-time systems where O(n²) quicksort blowup is unacceptable, and security-sensitive contexts where adversarial inputs could trigger quicksort's worst case all favor heapsort. Linux's introsort falls back to heapsort when quicksort recursion depth exceeds 2·log₂(n).