Algorithms
Insertion Sort
The sort everyone learns first — and the one your standard library secretly uses
Insertion sort builds a sorted array one element at a time by shifting larger items right and slotting each new element into place. It runs in O(n²) worst case but O(n) on nearly-sorted data, and is the fastest sort known for arrays under ~30 elements.
- Time (best)O(n)
- Time (average)O(n²)
- Time (worst)O(n²)
- SpaceO(1)
- StableYes
- In-placeYes
- AdaptiveYes
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How insertion sort works
Picture sorting a hand of playing cards. You pick them up one at a time and slot each new card into its correct position among the cards already in your hand. Cards larger than the new one shift right to make room. That's insertion sort, exactly.
The algorithm walks the array from left to right. At index i, the prefix arr[0..i-1] is already sorted. The job is to extend that sorted prefix by one — by taking arr[i] and shifting it left past every larger element until it reaches its rightful place.
- Loop
ifrom 1 ton-1. - Save
key = arr[i]. - Set
j = i - 1. - While
j ≥ 0andarr[j] > key: shiftarr[j]right by one, decrementj. - Place
keyatarr[j+1].
Trace through 5 elements
Sorting [5, 2, 4, 6, 1] step by step. The bracket marks the sorted prefix:
Start: [5] 2 4 6 1 (one element is trivially sorted)
i = 1, key = 2:
[5] 2 4 6 1 2 < 5, shift 5 right
[5 5] 4 6 1 place 2 at index 0
[2 5] 4 6 1 ✓
i = 2, key = 4:
[2 5] 4 6 1 4 < 5, shift 5 right
[2 5 5] 6 1 4 > 2, stop
[2 4 5] 6 1 place 4 at index 1 ✓
i = 3, key = 6:
[2 4 5] 6 1 6 > 5, no shift
[2 4 5 6] 1 place 6 at index 3 ✓
i = 4, key = 1:
[2 4 5 6] 1 1 < 6, shift
[2 4 5 6 6] 1 < 5, shift
[2 4 5 5 6] 1 < 4, shift
[2 4 4 5 6] 1 < 2, shift
[2 2 4 5 6] j = -1, place 1 at index 0
[1 2 4 5 6] ✓
Total: 7 comparisons, 5 shifts. The last element triggered a full-length shift — the worst case for any single insertion is O(i).
When to use insertion sort
- Small arrays. Below ~16-32 elements, insertion sort beats quicksort and merge sort because its constants are 5-10× smaller. Java's
Arrays.sort, V8'sArray.prototype.sort, and Rust'sslice::sortall switch to insertion sort below their threshold. - Nearly-sorted data. If only k elements are out of place, insertion sort runs in O(n + k) time. A sorted array is processed in n−1 comparisons and zero shifts.
- Streaming input. Insertion sort is online — it can sort data as it arrives without seeing the full input first.
- Tight memory budgets. O(1) extra space, no recursion stack, no auxiliary array. It runs comfortably in 512 bytes of RAM.
Insertion sort is what Timsort (Python's and Java's default) does on every "run" of less than 32 elements. The O(n log n) outer machinery runs less often than you'd think.
Insertion sort vs other sorts
| Insertion | Selection | Bubble | Merge | |
|---|---|---|---|---|
| Best case | O(n) | O(n²) | O(n) | O(n log n) |
| Average | O(n²) | O(n²) | O(n²) | O(n log n) |
| Worst case | O(n²) | O(n²) | O(n²) | O(n log n) |
| Space | O(1) | O(1) | O(1) | O(n) |
| Stable | Yes | No | Yes | Yes |
| In-place | Yes | Yes | Yes | No |
| Adaptive | Yes | No | Yes | No |
| Cache locality | Excellent | Excellent | Excellent | Poor (merge step) |
| Best for | Small/sorted arrays | Minimizing writes | Teaching only | Linked lists, stable sort guarantee |
Among the O(n²) sorts, insertion sort dominates. Bubble sort has the same asymptotics but does roughly twice the swaps on random input. Selection sort minimizes writes (useful for flash memory) but is the only one that's not adaptive.
What "O(n²)" actually costs
Concrete numbers help. On modern x86 hardware, insertion sort on integers runs at about 5-10 ns per comparison-swap pair (the inner loop is two reads, a compare, and a conditional write — fully pipelined and cache-resident).
| n | Worst-case ops | Wall time (≈) |
|---|---|---|
| 10 | 45 | 0.5 µs |
| 100 | 4,950 | 50 µs |
| 1,000 | 499,500 | 5 ms |
| 10,000 | 49,995,000 | 500 ms |
| 100,000 | ~5×10⁹ | ~50 s |
That cliff between 10,000 (half a second) and 100,000 (nearly a minute) is why insertion sort isn't your top-level choice for large n. But notice the small-n end: 100 elements in 50 microseconds. Quicksort on the same input pays its recursion overhead and lands at roughly 80 µs — slower.
JavaScript implementation
function insertionSort(arr) {
for (let i = 1; i < arr.length; i++) {
const key = arr[i];
let j = i - 1;
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
return arr;
}
Note the assignment chain rather than three-way swaps — each element shifts exactly once per inner-loop iteration. Naive implementations with swap() calls do 3× the writes for no benefit.
Sort-already-sorted edge case
// On already-sorted input, the inner while-loop's condition fails
// immediately on every i, so insertion sort does n-1 comparisons
// and 0 shifts — true O(n) behavior.
const sorted = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
insertionSort(sorted); // 9 comparisons, 0 shifts
// Reverse-sorted is the worst case: every key shifts to position 0.
const reversed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];
insertionSort(reversed); // 45 comparisons, 45 shifts
Python implementation
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i - 1
while j >= 0 and arr[j] > key:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
return arr
# Adaptive demo: nearly-sorted input is processed in near-linear time
nearly_sorted = list(range(1000))
nearly_sorted[500], nearly_sorted[501] = nearly_sorted[501], nearly_sorted[500]
insertion_sort(nearly_sorted) # ~1000 comparisons, 1 shift
Python's built-in list.sort() uses Timsort, which falls back to insertion sort for runs under 32 elements. So you're using insertion sort whether you call it directly or not.
Binary insertion sort
The vanilla algorithm uses linear scan to find each insertion point — O(i) comparisons per element. If comparisons are expensive (e.g., string compares with long shared prefixes), you can binary-search the sorted prefix instead, dropping comparisons to O(log i):
function binaryInsertionSort(arr) {
for (let i = 1; i < arr.length; i++) {
const key = arr[i];
// Binary search for insertion point in arr[0..i-1]
let low = 0, high = i;
while (low < high) {
const mid = (low + high) >>> 1;
if (arr[mid] <= key) low = mid + 1;
else high = mid;
}
// Shift everything from `low` to `i-1` right by one
for (let j = i; j > low; j--) arr[j] = arr[j - 1];
arr[low] = key;
}
return arr;
}
Total comparisons drop from ~n²/4 to ~n log n. The shifts still cost O(n²) total, so this matters only when comparison is the bottleneck. Java's Timsort uses binary insertion sort for runs of length 7-32.
Shellsort: insertion sort with gaps
Shellsort (1959) was the first sub-quadratic sort. It runs insertion sort on subarrays formed by elements at gap distances — first far apart, then closer, finally adjacent:
function shellsort(arr) {
// Ciura's gap sequence (empirically near-optimal)
const gaps = [701, 301, 132, 57, 23, 10, 4, 1];
for (const gap of gaps) {
if (gap >= arr.length) continue;
for (let i = gap; i < arr.length; i++) {
const key = arr[i];
let j = i;
while (j >= gap && arr[j - gap] > key) {
arr[j] = arr[j - gap];
j -= gap;
}
arr[j] = key;
}
}
return arr;
}
The big gaps quickly move elements close to their final positions; the small gaps clean up. Final pass (gap=1) is plain insertion sort on a near-sorted array — O(n) territory. Best-known gap sequences give O(n log² n) average performance, beating insertion sort handily on n > 1000 while keeping in-place and stable-ish behavior.
Common bugs and edge cases
- Off-by-one on the outer loop. Start at
i = 1, noti = 0— index 0 is trivially a sorted prefix of length 1. - Forgetting
j ≥ 0in the inner condition. Without the bounds check, the smallest element walks off the front of the array and readsarr[-1](segfault in C, undefined in JS). - Using
>=instead of>. Strict inequality is what makes the sort stable — equal elements keep their original relative order. Replacing with>=reverses the order of duplicates. - Three-way swaps instead of shifts. Writing
swap(arr, j, j+1)in the inner loop does 3 assignments per step instead of 1. Always shift then place once. - Calling on a linked list. Insertion sort is conceptually fine on linked lists, but the random-access shift becomes O(n) per element, ruining the constant. Use merge sort for linked lists.
Frequently asked questions
Why is insertion sort used inside quicksort and Timsort?
On small subarrays (typically under 16-32 elements), insertion sort beats quicksort and merge sort despite its O(n²) class. The constant factor is tiny — no recursion overhead, no extra allocations, and excellent cache locality. Most production sorts switch to insertion sort below a threshold.
What does it mean that insertion sort is adaptive?
An adaptive sort runs faster on data that is already partially sorted. Insertion sort takes O(n + k) time where k is the number of inversions — pairs of elements out of order. On already-sorted input it makes only n−1 comparisons and zero swaps.
Is insertion sort or selection sort faster?
Insertion sort is almost always faster in practice. Selection sort always performs n²/2 comparisons regardless of input order, while insertion sort can finish in O(n) on sorted data. Insertion sort is also stable; selection sort is not (in its standard form).
What is binary insertion sort?
A variant that uses binary search to find the insertion point, reducing comparisons from O(n) to O(log n) per element. The number of shifts stays O(n) per element though, so total work is still O(n²) — just with fewer comparisons. Useful when comparisons are expensive (e.g., comparing long strings).
How is shellsort related to insertion sort?
Shellsort runs insertion sort on subarrays of decreasing gap size — first elements 5 apart, then 3 apart, then 1 apart. The early passes move elements long distances cheaply, leaving the final pass with very few inversions. Total complexity drops to O(n log² n) or better depending on gap sequence.
Why does insertion sort have such good cache behavior?
Every read and write is to adjacent memory locations — perfect for hardware prefetchers. There is no recursion, no auxiliary array, no random access. On a modern CPU, insertion sort on 32 ints fits entirely in L1 cache and executes at near-pipeline-throughput rates.