Algorithms
Heapsort
O(n log n) guaranteed, O(1) extra space — the safety net under quicksort
Heapsort builds a max-heap from the array in O(n), then repeatedly extracts the max and places it at the end. It guarantees O(n log n) worst-case time, sorts in-place, and never falls into quicksort's pathological cases.
- Time (best)O(n log n)
- Time (average)O(n log n)
- Time (worst)O(n log n)
- SpaceO(1)
- StableNo
- In-placeYes
- AdaptiveNo
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How heapsort works
Heapsort is selection sort done correctly. Selection sort scans the unsorted region linearly to find the next smallest — O(n) work per pick, O(n²) total. Heapsort organizes the unsorted region as a heap, finding the next largest in O(log n) — O(n log n) total.
The algorithm has two phases:
- Build a max-heap in place over the entire array. After this phase,
arr[0]contains the largest element, and the array satisfies the heap property: every parent is ≥ both children. - Sort by repeated extract-max. Swap
arr[0](the max) with the last unsorted slot. Shrink the heap region by one. Sift down the new root to restore the heap property. Repeat until one element remains.
After phase 2, the original array region is filled left-to-right with extracted maxes — placed at progressively earlier positions — yielding ascending order.
Storing a heap in an array
The heap is conceptually a complete binary tree, but lives in the same array. Index arithmetic encodes the tree:
For a node at index i (0-based):
parent = (i - 1) / 2
left child = 2i + 1
right child = 2i + 2
Tree: Array layout:
16 [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
/ \
14 10
/ \ / \
8 7 9 3
/\ /
2 4 1
No pointers, no allocations, perfect cache density when the heap is small. The trade-off: parent and child indices grow exponentially apart, so once the heap exceeds cache size, accesses become semi-random.
Trace through 5 elements
Sorting [3, 1, 4, 1, 5]:
=== Phase 1: build max-heap ===
Start tree: 3
/ \
1 4
/ \
1 5
heapify from i=1 (last non-leaf in 0-indexed):
arr[1] = 1, children = arr[3]=1, arr[4]=5
largest child = 5; swap arr[1] ↔ arr[4]
→ [3, 5, 4, 1, 1]
5
/ \
1 4 (now sift down at i=4: leaf, stop)
/ \
1 1
heapify from i=0:
arr[0] = 3, children = arr[1]=5, arr[2]=4
largest child = 5; swap arr[0] ↔ arr[1]
→ [5, 3, 4, 1, 1]
sift down at i=1: arr[1]=3, children arr[3]=1, arr[4]=1
3 ≥ both children, stop.
Heap built: [5, 3, 4, 1, 1] (max=5 at root)
=== Phase 2: extract-max repeatedly ===
Iter 1: swap arr[0] ↔ arr[4]: [1, 3, 4, 1 | 5]
heap size = 4, sift down arr[0]:
arr[0]=1, children arr[1]=3, arr[2]=4 → swap with 4
[4, 3, 1, 1 | 5]
sift down at i=2: leaf, stop.
Iter 2: swap arr[0] ↔ arr[3]: [1, 3, 1 | 4, 5]
heap size = 3, sift down arr[0]:
arr[0]=1, children arr[1]=3, arr[2]=1 → swap with 3
[3, 1, 1 | 4, 5]
Iter 3: swap arr[0] ↔ arr[2]: [1, 1 | 3, 4, 5]
heap size = 2, sift down arr[0]:
arr[0]=1, child arr[1]=1, no swap.
[1, 1 | 3, 4, 5]
Iter 4: swap arr[0] ↔ arr[1]: [1 | 1, 3, 4, 5]
heap size = 1, done.
Final: [1, 1, 3, 4, 5] ✓
When to use heapsort
- Worst-case guarantees matter. Quicksort's O(n²) worst case can be triggered by adversarial inputs. Heapsort never gets worse than O(n log n), no matter the input.
- Memory budgets are tight. Merge sort needs O(n) auxiliary space; heapsort needs O(1). Useful in embedded systems, kernel code, or any context where you can't allocate.
- You need a priority queue anyway. If you already have a heap data structure, heapsort is essentially free — repeated extract-min/max into an output array.
- You want a top-k extraction. Heapsort can be stopped after k extracts, giving you the k largest in O(n + k log n) — useful for partial sorts.
The most common heapsort sighting in production code: as the fallback inside introsort. Quicksort runs by default, but if recursion depth exceeds 2·log₂(n), the algorithm switches to heapsort to avoid the quadratic blowup. C++'s std::sort, .NET's Array.Sort, and Linux kernel's sorting all use this pattern.
Heapsort vs other O(n log n) sorts
| Heapsort | Quicksort | Merge sort | Timsort | |
|---|---|---|---|---|
| Best case | O(n log n) | O(n log n) | O(n log n) | O(n) |
| Average | O(n log n) | O(n log n) | O(n log n) | O(n log n) |
| Worst case | O(n log n) | O(n²) | O(n log n) | O(n log n) |
| Space | O(1) | O(log n) stack | O(n) | O(n) |
| Stable | No | No | Yes | Yes |
| Adaptive | No | No | No | Yes |
| Cache locality | Poor | Excellent | OK | Good |
| Comparisons (avg) | ~2n log n | ~1.39n log n | ~n log n | ~n log n |
| Best for | Worst-case bound | Random data | Linked lists, stable | Real-world data |
Heapsort has the best worst case among comparison sorts but the worst constant factor. Quicksort runs faster on random data because of cache locality. Merge sort is the only O(n log n) stable sort. Timsort dominates real-world (mostly-sorted) data because it's adaptive.
What "O(n log n)" actually costs
Heapsort's inner loop does two integer multiplies (or shifts), two array reads, a compare, and a conditional swap per level. About 8-15 ns per sift-down step on modern hardware:
| n | Comparisons (~2n log₂ n) | Wall time (≈) | Quicksort (≈) |
|---|---|---|---|
| 1,000 | ~20,000 | 0.2 ms | 0.08 ms |
| 10,000 | ~265,000 | 3 ms | 1 ms |
| 100,000 | ~3.3 million | 40 ms | 14 ms |
| 1,000,000 | ~40 million | 500 ms | 180 ms |
| 10,000,000 | ~466 million | 6 s | 2.2 s |
The 2-3× gap with quicksort is the cache-locality penalty in numbers. Once n exceeds L2 cache (~256K-1M ints), every sift-down jumps across cache lines: a parent at index i and its child at index 2i+1 are 4·(i+1) bytes apart. For i = 100,000, that's 400KB — well outside L1.
JavaScript implementation
function heapsort(arr) {
const n = arr.length;
// Phase 1: build max-heap with bottom-up sift-down (Floyd's method).
// Start from the last non-leaf node: index ⌊n/2⌋ - 1.
for (let i = (n >> 1) - 1; i >= 0; i--) {
siftDown(arr, i, n);
}
// Phase 2: repeatedly swap root (max) with the last unsorted slot,
// shrink the heap, and restore the heap property.
for (let end = n - 1; end > 0; end--) {
[arr[0], arr[end]] = [arr[end], arr[0]];
siftDown(arr, 0, end);
}
return arr;
}
function siftDown(arr, root, end) {
while (true) {
const left = 2 * root + 1;
const right = 2 * root + 2;
let largest = root;
if (left < end && arr[left] > arr[largest]) largest = left;
if (right < end && arr[right] > arr[largest]) largest = right;
if (largest === root) return;
[arr[root], arr[largest]] = [arr[largest], arr[root]];
root = largest;
}
}
Note that siftDown is iterative, not recursive — recursive sift-down adds 1.5-2× overhead from stack frames in tight loops. The end parameter gives sift-down the heap-region boundary; everything from index end onward is the sorted suffix.
Python implementation
def heapsort(arr):
n = len(arr)
# Phase 1: build max-heap (bottom-up)
for i in range(n // 2 - 1, -1, -1):
sift_down(arr, i, n)
# Phase 2: extract-max repeatedly
for end in range(n - 1, 0, -1):
arr[0], arr[end] = arr[end], arr[0]
sift_down(arr, 0, end)
return arr
def sift_down(arr, root, end):
while True:
left = 2 * root + 1
right = 2 * root + 2
largest = root
if left < end and arr[left] > arr[largest]:
largest = left
if right < end and arr[right] > arr[largest]:
largest = right
if largest == root:
return
arr[root], arr[largest] = arr[largest], arr[root]
root = largest
# Note: Python's heapq module gives min-heap operations.
# For sorting, you'd negate keys or do:
import heapq
def heapsort_via_heapq(arr):
h = list(arr)
heapq.heapify(h) # O(n) min-heap construction
return [heapq.heappop(h) for _ in range(len(h))] # O(n log n)
Why heap-build is O(n), not O(n log n)
The phase-1 loop runs n/2 times, each calling sift-down. A naive analysis says each sift-down is O(log n) so total is O(n log n). But that's loose. Sift-down at depth d costs O(h − d), where h = log₂(n) is the tree height:
- Level 0 (root, 1 node): up to log n swaps
- Level 1 (2 nodes): up to log n − 1 swaps each
- Level 2 (4 nodes): up to log n − 2 swaps each
- ...
- Level log n − 1 (n/2 leaves): 0 swaps each
Sum: Σ (n / 2^(d+1)) · (h − d) = O(n). The leaves dominate the count but do zero work; the root does O(log n) work but there's only one of it. The series telescopes to a constant times n.
Top-down vs bottom-up heapify
Two ways to build a heap:
- Top-down (sift-up): Insert elements one at a time, calling sift-up after each insertion. Each insertion is O(log n), total O(n log n).
- Bottom-up (sift-down, Floyd's method): Place all elements first, then sift-down from the last non-leaf to the root. Total O(n).
Floyd's method is the standard for heapsort because it's a constant factor faster: O(n) vs O(n log n) for the build phase. The phase-2 extract loop is O(n log n) either way, so the build savings is real but proportionally small (~10-15% of total runtime).
The intuition for why bottom-up is faster: top-down asks every node to walk up to the root (long path); bottom-up asks every node to walk down to the leaves (and most nodes are already close to the leaves).
Heapsort as the safety net: introsort
Most production sorts use introsort: quicksort by default, switching to heapsort if recursion depth gets dangerous. The pseudocode:
function introsort(arr, lo = 0, hi = arr.length - 1, depthLimit = 2 * Math.log2(arr.length)) {
if (hi - lo < 16) {
insertionSort(arr, lo, hi); // small subarrays
return;
}
if (depthLimit === 0) {
heapsort(arr.slice(lo, hi + 1)); // worst-case fallback
return;
}
const p = partition(arr, lo, hi);
introsort(arr, lo, p - 1, depthLimit - 1);
introsort(arr, p + 1, hi, depthLimit - 1);
}
The 2·log₂(n) threshold is the depth a healthy quicksort should reach. Crossing it implies pathologically bad pivots — adversarial input or all-duplicates — and triggers the switch. std::sort in C++ and Array.Sort in .NET both implement this pattern.
Common bugs and edge cases
- Off-by-one in heap-build start. The first non-leaf in a 0-indexed array is at
(n >> 1) - 1, notn / 2. Off-by-one here misses the topmost layer of internal nodes, leaving the heap broken. - Forgetting the
endbound in sift-down. Sift-down must respect the shrinking heap boundary; reaching into the sorted suffix overwrites already-placed elements. - Recursive sift-down on deep heaps. Python's recursion limit (1000 by default) can blow up on heaps with n > 2¹⁰⁰⁰. Use iterative sift-down.
- Min-heap/max-heap confusion. Heapsort needs a max-heap to sort ascending. If you reach for
heapqin Python, that's a min-heap — you'll get descending order or need to negate keys. - Stability assumption. Code that relies on stable order (sorting by secondary keys via a sort chain) breaks silently when switched from merge sort to heapsort.
- Trying to use it on linked lists. Heapsort needs O(1) random access — index arithmetic for parent/child. Linked lists make every access O(n), turning the algorithm into O(n² log n).
Frequently asked questions
Why is heapsort slower than quicksort if both are O(n log n)?
Heapsort's memory access pattern jumps between parent and child indices that are far apart in the array (parent of index i is at i/2, child at 2i). This thrashes the cache. Quicksort's partition step reads sequential memory — perfect for prefetchers. In practice quicksort is 2-3× faster on random data despite identical asymptotics.
Why is heap construction O(n) and not O(n log n)?
Floyd's bottom-up heapify works on the leaves first, where sift-down does zero work. Each level up doubles the work per node but halves the node count. The sum n/4·1 + n/8·2 + n/16·3 + ... converges to O(n) — leaves dominate the count, internal nodes the work, and they balance out.
Is heapsort stable?
No. Equal elements can swap positions during heap construction or extraction — sift-down picks one of two equal children arbitrarily. To make it stable you must augment keys with their original indices, costing O(n) extra space and turning equal-key comparisons into tie-breaks by index.
What is the difference between sift-up and sift-down?
Sift-up (or 'percolate up') moves a too-large element toward the root by repeated parent swaps. Sift-down ('percolate down') moves a too-small root toward the leaves by repeated child swaps. Heap construction uses sift-down; insertion uses sift-up. Heapsort uses sift-down exclusively.
Why is heapsort the textbook example for O(n log n) lower bound?
Because it achieves the comparison-sort lower bound (proven via decision trees: any comparison sort needs ⌈log₂(n!)⌉ ≈ n log n − 1.44n comparisons). Heapsort does ~2n log n comparisons, within a constant factor of optimal — and unlike merge sort, it does it in O(1) extra space.
When should I pick heapsort over quicksort or merge sort?
When you need a worst-case guarantee in O(1) extra space. Embedded systems with strict memory budgets, real-time systems where O(n²) quicksort blowup is unacceptable, and security-sensitive contexts where adversarial inputs could trigger quicksort's worst case all favor heapsort. Linux's introsort falls back to heapsort when quicksort recursion depth exceeds 2·log₂(n).