Data Structures
Implicit Treap
A treap that represents a sequence — split, merge, insert, reverse, all in O(log n)
An implicit treap stores a sequence using a treap where the in-order position is the key. No explicit key value — each node carries random priority and subtree size. Supports split, merge, insert at any position, delete range, and reverse range, all in O(log n) expected.
- Split / MergeO(log n) expected
- Insert / Delete at any iO(log n) expected
- Range reverseO(log n) lazy
- Range sum / minO(log n)
- Random accessO(log n)
- Used inropes, ICPC seqs, editor buffers
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
Why "implicit"?
A regular treap is a balanced BST keyed by a value. Each node stores both a key (BST-ordered) and a priority (heap-ordered, drawn at random). The structure supports search, insert, delete in expected O(log n).
An implicit treap drops the key field. Instead, each node's "key" is its in-order rank in the tree — equivalently, its position in the sequence the tree represents. To find the node at position k, you don't compare keys; you use subtree sizes to navigate. Each node stores size = 1 + left.size + right.size, and lookup walks the tree like this:
function find_at_position(node, k):
if node is null: return null
left_size = node.left.size if node.left else 0
if k == left_size: return node # current is at position k
if k < left_size: return find_at_position(node.left, k)
return find_at_position(node.right, k - left_size - 1)
This costs O(log n) expected — the same as a regular treap, but now the position acts as the implicit key. The treap represents a sequence, not a set.
Split and merge — the only two primitives you need
Almost every operation on an implicit treap reduces to two primitives: split and merge.
Split(T, k) divides T into two treaps: L containing positions 0..k−1, and R containing positions k..n−1. It walks down the tree once, redirecting child pointers based on whether each subtree falls left or right of the split point. Cost: O(log n) expected, the height of the tree.
Merge(L, R) concatenates two treaps, given that every position in L comes before every position in R in the desired sequence. Compare root priorities. The larger-priority root becomes the new root; recurse on the side that needs merging with the other treap. Cost: O(log n) expected.
struct Node {
int value;
int priority;
int size;
Node *left = nullptr, *right = nullptr;
Node(int v) : value(v), priority(rand()), size(1) {}
};
int get_size(Node* n) { return n ? n->size : 0; }
void update(Node* n) {
if (n) n->size = 1 + get_size(n->left) + get_size(n->right);
}
// Split T so the first k elements go to L, the rest to R.
void split(Node* t, int k, Node*& l, Node*& r) {
if (!t) { l = r = nullptr; return; }
int left_size = get_size(t->left);
if (k <= left_size) {
split(t->left, k, l, t->left);
r = t;
} else {
split(t->right, k - left_size - 1, t->right, r);
l = t;
}
update(t);
}
// Merge L and R (every pos in L < every pos in R).
Node* merge(Node* l, Node* r) {
if (!l || !r) return l ? l : r;
if (l->priority > r->priority) {
l->right = merge(l->right, r);
update(l);
return l;
} else {
r->left = merge(l, r->left);
update(r);
return r;
}
}
Everything else is split + merge
- Insert value v at position i. Split T at i → (L, R). Create a single-node treap N with value v. Return Merge(Merge(L, N), R). O(log n) expected.
- Delete range [l, r]. Split T at l → (A, B). Split B at r − l + 1 → (M, C). Discard M. Return Merge(A, C). O(log n).
- Random access at position i. Walk the tree following subtree sizes. O(log n).
- Range sum / min / max. Each node stores the aggregate of its subtree. Split out the [l, r] range, read the new root's aggregate, merge back. O(log n).
- Reverse range [l, r]. Split out [l, r] into a middle treap M. Set a lazy
reverseflag on M's root. Merge back. Lazy propagation lazily applies the reverse on descent. O(log n) per reverse, O(log n) amortized per access. - Cyclic shift [l, r] by k positions. Split, split, merge in the new order. O(log n).
- Concatenate two sequences. Single Merge call. O(log(|A| + |B|)) expected.
Lazy range reverse — the cool one
Reversing a range in O(log n) sounds impossible — you can't actually permute O(n) elements in less than O(n) time. The trick is laziness: you defer the work and only do it on demand.
Each node has a boolean reverse flag. When set, it means "everything in this subtree's in-order has been logically reversed, but we haven't pushed it down yet." To apply the lazy flag: swap the node's left and right children, then flip the reverse flag of both new children. The next time you descend into a node, you push down its flag first.
struct Node {
int value, priority, size;
bool rev = false;
Node *left = nullptr, *right = nullptr;
};
void push(Node* n) {
if (n && n->rev) {
swap(n->left, n->right);
if (n->left) n->left->rev ^= true;
if (n->right) n->right->rev ^= true;
n->rev = false;
}
}
void reverse_range(Node*& t, int l, int r) {
Node *a, *b, *c;
split(t, l, a, b);
split(b, r - l + 1, b, c);
if (b) b->rev ^= true;
t = merge(merge(a, b), c);
}
Every other operation (split, merge, find) starts by calling push on the current node so the lazy reverse is applied before any pointer manipulation. The amortized cost stays O(log n) — you do at most one push per node visit, and visits are bounded by O(log n) per operation.
Worked example: split [a, b, c, d, e, f, g] at position 3
Start with the sequence [a, b, c, d, e, f, g]. The implicit treap holds these as in-order. Let's say the tree has shape:
priorities d (p=90, size=7)
/ \
b f
/ \ / \
a c e g
in-order: a b c d e f g
Split at position 3 (zero-indexed) — first three elements [a, b, c] go to L, last four [d, e, f, g] go to R. Walk from root:
- At root d, left subtree (rooted at b) has size 3. We want first 3 to go to L; 3 ≤ left_size (3). So split left subtree at k=3 to get (LL, LR), set root.left = LR, return (LL, root).
- At b, left subtree (a) has size 1. We want first 3 to L; 3 > 1 (left_size). So split right subtree (c) at k = 3 − 1 − 1 = 1 to get (RL, RR), set b.right = RL, return (b, RR).
- At c (a leaf), we want first 1 element to L; 1 > 0 (left_size of leaf = 0). So split right (null) at k = 0 to get (nullptr, nullptr), set c.right = null, return (c, null).
- Unwind: at b, RR = null, b.right = c, return (b, null) → at root, LL = b (with c as right child), set root.left = null (LR), return (b, root). L = b (with subtree {a, c}), R = root d (with subtree {e, f, g}).
The two resulting trees contain [a, b, c] and [d, e, f, g] respectively. Each step descended one level — total work O(log 7) = 3 levels, matching the O(log n) expected.
Implicit treap vs alternatives
| Implicit treap | Array | Linked list | Rope (balanced) | |
|---|---|---|---|---|
| Random access | O(log n) | O(1) | O(n) | O(log n) |
| Insert at position | O(log n) | O(n) | O(n) find + O(1) insert | O(log n) |
| Delete at position | O(log n) | O(n) | O(n) find + O(1) delete | O(log n) |
| Range reverse | O(log n) lazy | O(n) | O(n) | O(log n) |
| Range sum | O(log n) | O(n) or O(log n) with BIT | O(n) | O(log n) |
| Code length | ~80-150 lines | 1 line | ~30 lines | ~200 lines |
| Best for | ICPC, ropes, dynamic seq | Static, sequential | Stream insert at known node | Text editor buffers |
| Constant factor | ~1-2 μs/op (10⁵ elems) | ~1-10 ns | ~1-10 ns/step | ~1-5 μs/op |
Python implementation
import random
class Node:
__slots__ = ('value', 'priority', 'size', 'left', 'right', 'rev')
def __init__(self, value):
self.value = value
self.priority = random.random()
self.size = 1
self.left = self.right = None
self.rev = False
def size(n): return n.size if n else 0
def push(n):
if n and n.rev:
n.left, n.right = n.right, n.left
if n.left: n.left.rev = not n.left.rev
if n.right: n.right.rev = not n.right.rev
n.rev = False
def update(n):
if n: n.size = 1 + size(n.left) + size(n.right)
def split(t, k):
if not t: return (None, None)
push(t)
if k <= size(t.left):
l, t.left = split(t.left, k)
update(t)
return (l, t)
else:
t.right, r = split(t.right, k - size(t.left) - 1)
update(t)
return (t, r)
def merge(l, r):
if not l or not r: return l or r
push(l); push(r)
if l.priority > r.priority:
l.right = merge(l.right, r)
update(l)
return l
else:
r.left = merge(l, r.left)
update(r)
return r
def insert(t, i, value):
l, r = split(t, i)
return merge(merge(l, Node(value)), r)
def delete_range(t, l, r):
a, bc = split(t, l)
b, c = split(bc, r - l + 1)
return merge(a, c)
def reverse_range(t, l, r):
a, bc = split(t, l)
b, c = split(bc, r - l + 1)
if b: b.rev = not b.rev
return merge(merge(a, b), c)
# Demo
t = None
for x in ['a','b','c','d','e','f','g']:
t = merge(t, Node(x))
# Now: a b c d e f g
t = reverse_range(t, 1, 4) # reverse positions 1..4 -> a e d c b f g
Variants and refinements
Persistent implicit treap
Path-copy each split and merge. Old versions remain queryable. Cost: O(log n) extra nodes per op, same as persistent segment tree. Used in some functional-programming libraries for purely-functional ropes.
Cartesian tree
A cartesian tree is a deterministic version of a treap where priorities are determined by input order — equivalent to an implicit treap built from a sequence by repeated rightmost-merge. Useful when randomness is unavailable.
Splay tree alternative
Splay trees give the same operations with amortized O(log n) instead of expected. Simpler code in some implementations; harder to reason about persistence.
SBT (size-balanced tree)
Deterministic O(log n) worst-case alternative. Uses subtree-size invariants to maintain balance. Common in Chinese competitive programming community.
Common pitfalls
- Forgetting to push before split/merge. If a node has a pending lazy flag and you descend without pushing, the reverse never gets applied and queries return wrong answers silently. Push at the top of every split and merge call.
- Off-by-one in split. Split(T, k) puts the first k elements in L and the rest in R. If you write Split(T, k) expecting "elements 0..k inclusive in L", you've split one element too many.
- Mutating shared nodes. If you implement a persistent version, every split and merge must allocate new nodes — never modify a node already referenced by another version.
- 32-bit priority collisions. For n > 65k nodes, priority collisions are likely with a 32-bit RNG (birthday bound). Use 64-bit priorities or tie-break with a secondary criterion.
- Recursion depth. For n = 10⁶, expected depth is ~28. Worst case (vanishingly rare with random priorities) can be much deeper. C++ default stack handles it; Python's 1000-frame limit is fine.
- Adversarial inputs leaking priority. If priorities are derivable from the input (predictable seed, hash of insertion time), an attacker can construct degenerate trees. Use cryptographically unpredictable seeds for adversarial workloads.
Frequently asked questions
What does "implicit key" mean in an implicit treap?
In a regular treap each node has an explicit key (a value the BST is sorted by). In an implicit treap, the key is the node's position in the in-order traversal of the tree. To compute the position of any node, you walk from the root carrying a position counter: descending left keeps the counter; descending right adds (current.left.size + 1). Each node stores only its subtree size, not a key. The implicit-key trick lets you use a treap as a sequence (array-like structure) rather than as a set.
How do split and merge work on an implicit treap?
Split(T, k) returns two treaps: L containing positions 0..k-1, R containing positions k..n-1. Walk from root: let leftSize = root.left.size. If k <= leftSize, recursively split root.left at k to get (LL, LR); set root.left = LR; return (LL, root). Otherwise recursively split root.right at k - leftSize - 1 to get (RL, RR); set root.right = RL; return (combined-with-root, RR). Each step descends one level, total O(log n) expected. Merge(L, R) compares root priorities: the larger-priority root becomes the merge root, and you recurse on the appropriate side. Same O(log n) expected. Together they implement everything else.
How does range reverse work in O(log n)?
Lazy propagation. To reverse positions [l, r]: split at l → (A, B), split B at r-l+1 → (M, C), set a "reverse" lazy flag on M's root, merge back as A + M + C. The flag doesn't immediately reverse anything — it's stored. When you later descend into a node with a pending reverse flag, you swap that node's children and propagate the flag to both children. The reverse becomes visible only when needed, which keeps every operation O(log n) expected.
Is an implicit treap really an array?
Logically yes — it supports random access at any position in O(log n), insert/delete at any position in O(log n), and range operations (sum, min, reverse) in O(log n). The trade-off versus a plain array: random access is O(log n) instead of O(1), but insert and delete are O(log n) instead of O(n). For workloads where you mostly read sequentially or in ranges, an implicit treap costs roughly the same as an array. For workloads dominated by middle-insertions or rotations (text editors, log queues, ICPC problems), it's orders of magnitude faster.
How fast is an implicit treap in practice?
On modern x86, an implicit treap of 100,000 elements does roughly 1-2 million operations per second per core. Each operation costs about 500-1000 nanoseconds. For 1 million elements the constant grows to ~1-2 microseconds per op because of cache misses on tree traversal. Competitive programming submissions typically use implicit treaps for problems with n ≤ 10^5 and q ≤ 10^5 operations, fitting comfortably in a 1-second time limit. For bulk array operations on n > 10^7, a flat array is still faster.
When should I use an implicit treap?
Three classic cases. First: dynamic-sequence problems in competitive programming — insert at position i, delete range, reverse range, sum/min over range. Second: rope-style text editors where you need fast insert/delete in the middle of a long document. Third: log buffers or queue structures where you need to reorder ranges (e.g., move messages around in a chat history). Skip the implicit treap for static or append-only sequences (plain array wins), for ordered sets where you need search by value (regular treap wins), or for production databases (B-trees with version history win on durability).