Distributed Patterns

Idempotency Key

Safe retries for payments and APIs — the same key returns the same response

An idempotency key is a client-generated unique ID per request. Servers cache the response and return it on retry — so a double-clicked Pay button charges exactly once, even across timeouts and crashes.

Pattern: Client UUID + server cache
Stripe TTL: 24 hours
Header: Idempotency-Key
Concurrent retry: 409 Conflict during processing
Used in: Stripe, Square, Adyen, AWS, GCP
Storage: Redis or Postgres + TTL

The problem: networks lie about success

A client sends POST /charges with { amount: 5000, card: "tok_…" }. Five seconds later the request times out. Three possibilities:

The request never reached the server. Safe to retry.
The request reached the server, the server charged the card, the response was lost in transit. Not safe to retry — you'd double-charge.
The request reached the server but the server crashed mid-processing. State is unknown.

The client cannot distinguish these from the outside. Retrying blindly is correct in case 1, catastrophic in case 2. Not retrying is wrong in case 1 (lost charge) and right in case 2. There is no client-only strategy that handles all three correctly. You need the server's cooperation.

The solution: client picks the key, server remembers

The client generates a UUID once per logical operation and sends it in a header:

POST /v1/charges HTTP/1.1
Idempotency-Key: 8f14e45f-ce91-4e2a-89be-5b6a8d4d99c1
Content-Type: application/json

{ "amount": 5000, "currency": "usd", "card": "tok_..." }

On first receipt, the server does the work, persists the result, stores (key, status, response, request_hash), and returns 200. On any retry with the same key, the server returns the cached 200 without re-processing.

The retry can come seconds, minutes, or hours later. It can come over a different connection, after a client crash and restart. As long as the client uses the same key, the server returns the original result. This is the heart of safe retries.
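The contract described above can be sketched in a few lines. This is a toy illustration only: `IdempotentServer` and its `_charge_card` handler are hypothetical names, a plain dict stands in for the durable store, and nothing here is safe under concurrency.

```python
import itertools


class IdempotentServer:
    """Toy server: remembers the first response for each idempotency key."""

    def __init__(self):
        self._responses = {}               # key -> cached response (stand-in for a durable store)
        self._charge_ids = itertools.count(1)
        self.charges_made = 0              # how many times the handler really ran

    def _charge_card(self, amount):
        # Hypothetical handler with a side effect that must not be repeated.
        self.charges_made += 1
        return {"id": f"ch_{next(self._charge_ids)}", "amount": amount}

    def post_charge(self, key, amount):
        if key in self._responses:
            return self._responses[key]    # retry: replay the stored response
        response = self._charge_card(amount)
        self._responses[key] = response
        return response


server = IdempotentServer()
first = server.post_charge("8f14e45f", 5000)
retry = server.post_charge("8f14e45f", 5000)    # e.g. after a client timeout
other = server.post_charge("a-new-key", 5000)   # a genuinely new payment
```

The retry receives the same charge ID as the first call, and the handler runs only twice in total: once per distinct key.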

The Stripe pattern in detail

Stripe published their idempotency model in 2017 and it became the industry default. Five elements:

Header name: Idempotency-Key: <uuid>. Client picks the value — any string up to 255 chars; UUIDs are conventional.
Scope: per (account, endpoint, key). Same key under a different API account is a different request.
TTL: 24 hours from first use. After 24h the key can be reused for a different request.
Body hash check: the server stores SHA-256 of the request body. Retry with the same key but a different body → 400 with a clear error. Prevents accidental and malicious key reuse.
In-flight handling: a retry that arrives while the first request is still processing gets 409 Conflict. The client backs off and tries again.
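The TTL, body-hash, and in-flight rules above can be expressed as one small decision function. This is a sketch under stated assumptions, not Stripe's implementation: `KeyRecord`, `decide`, and the string outcomes are hypothetical names, and the caller is assumed to have already looked the record up by (account, endpoint, key).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # 24 hours, as described above


@dataclass
class KeyRecord:
    body_hash: str
    status: str          # "in_progress" or "completed"
    created_at: float    # seconds since epoch at first use


def body_hash(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()


def decide(record: Optional[KeyRecord], body: bytes, now: float) -> str:
    """Return what to do with a request that carries an idempotency key."""
    if record is None or now - record.created_at >= TTL_SECONDS:
        return "execute"         # unseen or expired key: treat as a new request
    if record.body_hash != body_hash(body):
        return "reject_400"      # same key, different parameters
    if record.status == "in_progress":
        return "conflict_409"    # first attempt is still running
    return "replay"              # completed: return the stored response


done = KeyRecord(body_hash(b'{"amount": 5000}'), "completed", created_at=1000.0)
running = KeyRecord(body_hash(b'{"amount": 5000}'), "in_progress", created_at=1000.0)
```

Checking the body hash before the in-flight status means a mismatched body is rejected even while the original request is still processing; the opposite order would also be defensible.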

Stripe's official client libraries auto-retry idempotent requests with the same key on network errors and 5xx responses. That makes "set the idempotency key, retry on failure" the default API code shape — no manual retry loops needed.

Server-side implementation

The middleware sits in front of business logic and handles four cases per incoming request:

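import hashlib

# Assumed to be provided by the application: `db` (a thin SQL wrapper),
# `handler` (the business logic), and `Response(status, body)`.
# A production table would also scope the key by account and endpoint.

def idempotency_middleware(request):
    key = request.headers.get('Idempotency-Key')
    if key is None:
        # No key supplied: this sketch rejects the request outright.
        return Response(400, "Missing Idempotency-Key header")
    body_hash = hashlib.sha256(request.body).hexdigest()   # request.body is bytes

    # Atomic: try to claim the key. Only one concurrent request can win.
    row = db.execute("""
        INSERT INTO idempotency_keys (key, body_hash, status)
        VALUES (?, ?, 'in_progress')
        ON CONFLICT (key) DO NOTHING
        RETURNING key
    """, key, body_hash)

    if row is None:
        # Key already exists; load the stored record.
        existing = db.fetch_one(
            "SELECT body_hash, status, response FROM idempotency_keys WHERE key = ?",
            key,
        )
        if existing.body_hash != body_hash:
            return Response(400, "Idempotency key reused with different parameters")
        if existing.status == 'in_progress':
            return Response(409, "Request already in flight")
        # Completed earlier: replay the stored response without re-running the handler.
        return Response(200, existing.response)

    # First time. Run the actual handler.
    try:
        response = handler(request)
        db.execute("""
            UPDATE idempotency_keys
            SET status = 'completed', response = ?
            WHERE key = ?
        """, response, key)
        return response
    except Exception:
        # Release the key so a retry can run again. This is only safe if the
        # handler rolled back its own side effects before raising.
        db.execute("DELETE FROM idempotency_keys WHERE key = ?", key)
        raise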

The atomic INSERT...ON CONFLICT is the critical primitive. Without it, two concurrent retries both pass the "does the key exist?" check and both run the handler — defeating the whole purpose. The unique-key constraint serializes concurrent retries at the database level.
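The claim step can be tried in isolation. The sketch below uses Python's built-in `sqlite3` module, with SQLite's `INSERT OR IGNORE` standing in for Postgres's `ON CONFLICT DO NOTHING`; the `claim_key` helper is a hypothetical name, and the table is reduced to the columns the claim needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_keys ("
    " key TEXT PRIMARY KEY, body_hash TEXT NOT NULL, status TEXT NOT NULL)"
)


def claim_key(conn, key, body_hash):
    """Return True if this call inserted the row, False if the key was already claimed."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO idempotency_keys (key, body_hash, status) "
        "VALUES (?, ?, 'in_progress')",
        (key, body_hash),
    )
    conn.commit()
    return cur.rowcount == 1   # 1 row inserted means we won; 0 means someone else did


first = claim_key(conn, "8f14e45f", "abc")    # wins the claim
second = claim_key(conn, "8f14e45f", "abc")   # loses: the row already exists
row_count = conn.execute("SELECT COUNT(*) FROM idempotency_keys").fetchone()[0]
```

Because the insert and the existence check are a single statement guarded by the primary key, there is no window in which two callers can both believe they were first.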

Storage choices

Three production-grade options:

Redis with TTL. 1-2 ms read/write, simple. Risk: Redis is typically not durable enough for payments — a node failure that loses a few seconds of writes can re-execute a charge. Use only when the consumer of the cached response can also tolerate re-execution (e.g., idempotent endpoints layered on top of idempotent business logic).
Postgres / MySQL with TTL cleanup. 5-20 ms write, durable. The standard choice for payment APIs. Pair a unique index on the key with a nightly DELETE of rows older than the TTL.
DynamoDB with TTL attribute. Built-in TTL expiry, single-digit ms latency, virtually unlimited scale. AWS's own request-token idempotency is implemented this way.
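For the relational option, TTL cleanup amounts to a scheduled delete on a timestamp column. A minimal sketch, assuming a hypothetical `created_at` column holding epoch seconds and using in-memory SQLite as a stand-in for Postgres or MySQL:

```python
import sqlite3

TTL_SECONDS = 24 * 60 * 60


def purge_expired(conn, now):
    """Delete idempotency rows older than the TTL; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM idempotency_keys WHERE created_at < ?",
        (now - TTL_SECONDS,),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, created_at REAL NOT NULL)"
)
now = 1_700_000_000.0
conn.executemany(
    "INSERT INTO idempotency_keys VALUES (?, ?)",
    [("old", now - TTL_SECONDS - 1), ("fresh", now - 60)],
)
removed = purge_expired(conn, now)
remaining = [row[0] for row in conn.execute("SELECT key FROM idempotency_keys")]
```

On a large table, an index on the timestamp column and deletion in bounded batches would likely be needed to keep the cleanup from competing with live traffic.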

Sizing: 10,000 requests/sec × 86,400 sec × 1 KB per record ≈ 864 GB of idempotency data. A modest Postgres or one DynamoDB table handles this comfortably with TTL-based pruning.
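The sizing figure is a steady-state estimate: every record written during one TTL window is still live. A small worked version, with `idempotency_storage_bytes` as a hypothetical helper and 1 KB taken as 1,000 bytes:

```python
def idempotency_storage_bytes(requests_per_sec, ttl_seconds, bytes_per_record):
    """Steady-state size: all records written within one TTL window are retained."""
    return requests_per_sec * ttl_seconds * bytes_per_record


live_records = 10_000 * 86_400                              # records alive at any moment
total = idempotency_storage_bytes(10_000, 86_400, 1_000)    # bytes
total_gb = total / 1e9
```

This gives 864 million live records and 864 GB. The figure scales linearly in each factor, so a 72-hour TTL at the same rate and record size would need three times as much.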

Idempotency key vs other patterns

	Idempotency key	Naive retry	At-most-once (no retry)	UPSERT (database-only)
Safe for state-changing ops	Yes	No (duplicates)	No (lost ops)	Yes for one resource
Works across services	Yes	No	—	No
Returns original response on retry	Yes	—	—	No (re-runs body)
Client effort	Generate UUID, send header	None	None	—
Server storage	Key table + TTL	None	None	Native primary key
Common usage	Stripe, AWS, GitHub APIs	Anti-pattern	Logging endpoints	Single-row mutations
Concurrent retry handling	409 Conflict	Race, double-execute	—	Last write wins

Client-side: when and how to generate keys

The client generates a key per logical operation, not per HTTP request. Critical distinction: if the user clicks "Pay" once, you generate one key and reuse it across all retries of that payment, even after a process restart. If the user clicks "Pay" twice with the intention of paying twice, you generate two keys.

Best practice: persist the key alongside the operation. On mobile, save the key in local storage when the user taps Pay; on retry, load the same key. On a server-side worker queuing payments, generate the key when the job is enqueued and pass it through every retry.
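One way to persist the key alongside the operation is a small lookup keyed by the operation's own ID. The sketch below is illustrative only: `get_or_create_key` and the `order-…` IDs are hypothetical, and a JSON file stands in for mobile local storage or a job payload (a real implementation would want an atomic write).

```python
import json
import os
import tempfile
import uuid


def get_or_create_key(store_path, operation_id):
    """Return the idempotency key for an operation, creating and saving it on first use."""
    keys = {}
    if os.path.exists(store_path):
        with open(store_path) as f:
            keys = json.load(f)
    if operation_id not in keys:
        keys[operation_id] = str(uuid.uuid4())
        with open(store_path, "w") as f:
            json.dump(keys, f)           # persist before the first network attempt
    return keys[operation_id]


store = os.path.join(tempfile.mkdtemp(), "pending_payments.json")
k1 = get_or_create_key(store, "order-1001")   # user taps Pay
k2 = get_or_create_key(store, "order-1001")   # retry after an app restart
k3 = get_or_create_key(store, "order-1002")   # a different payment
```

The key is written to disk before it is ever sent, so a crash between generating the key and receiving the response still leaves the same key available for the retry.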

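import random, time, uuid
import requests

def charge(amount, card_token, key=None):
    # One key per logical charge. Pass in a persisted key to resume after a restart.
    key = key or str(uuid.uuid4())
    body = {"amount": amount, "card": card_token}
    for attempt in range(5):
        try:
            r = requests.post(
                "https://api.stripe.com/v1/charges",
                data=body,                      # Stripe's v1 API expects form-encoded bodies
                headers={"Idempotency-Key": key},
                timeout=10,                     # authentication omitted for brevity
            )
            # 409 means the first attempt is still in flight; 5xx is retryable.
            if r.status_code != 409 and r.status_code < 500:
                return r.json()
        except (requests.Timeout, requests.ConnectionError):
            pass                                # outcome unknown; safe to retry with the same key
        time.sleep(2 ** attempt + random.random())   # exponential backoff plus jitter
    raise RuntimeError("Exhausted retries")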

Common misconceptions and traps

"I'll just generate a new key on each retry." That is not a retry; it is a new request. Same key across retries is the entire point.
"GET requests need idempotency keys too." GETs are already idempotent (they don't change state), and HTTP defines PUT and DELETE as idempotent as well. The header matters most for POST and PATCH, though some APIs accept it on every state-changing verb.
"My ON CONFLICT DO NOTHING in Postgres is enough; I don't need a key." Only true for a single-row mutation. The moment your handler does two writes, calls a third-party API, or sends an email, you need an idempotency key around the whole handler.
"Idempotency keys give me exactly-once." They give effectively-once: at-least-once delivery (the client retries) plus deduplication (the server's key cache). True exactly-once delivery over a fallible network is impossible (Two Generals Problem).
"I can use auto-incrementing IDs as keys." No. Client-generated UUIDs need no coordination between clients and don't leak ordering. Server-generated IDs require the client to know the ID before retrying, which is a chicken-and-egg problem.
"Keys never need to be invalidated." If a handler partially succeeds and you decide to retry differently (e.g., refund + new charge), pick a new key. Reusing a key after a partial failure produces undefined behavior.
"Storage cost is a problem." At 10k req/sec × 24h, ~864 GB. Negligible for the value protected (zero duplicate payments).

Production deployments

Stripe. 24h TTL. Body hash check. Every state-changing endpoint accepts the header. Official SDKs auto-retry idempotent requests on 5xx.
Square Payments. 72h TTL via idempotency_key field in the request body.
AWS. Various APIs accept a ClientToken or ClientRequestToken parameter (for example, EC2 RunInstances); backed by DynamoDB with 24h TTL.
GitHub API. Some POST endpoints accept an idempotency token (per their REST guidelines).
Cloudflare API. Adopted the Stripe-style header in 2023.
IETF. Internet-Draft "The Idempotency-Key HTTP Header Field" (draft-ietf-httpapi-idempotency-key-header), still a draft and not yet an RFC.

Performance and correctness numbers

Overhead per request: 5-20 ms for a Postgres key check, 1-3 ms for DynamoDB or Redis. Stripe absorbs this inside ~100ms p50 for /v1/charges.
Storage: ~1 KB per record (key, body hash, status, cached response, timestamp). 864 GB at 10k req/sec / 24h TTL.
Concurrency: ON CONFLICT primitives serialize concurrent retries at ~100k inserts/sec/Postgres; DynamoDB scales horizontally with no upper bound.
Effectively-once correctness: without keys, the double-execution rate tracks the retry rate (typically percent-level); with keys it drops to near zero, limited only by storage durability and the body-hash check.
Client retry pattern: exponential backoff with jitter, max 5-10 retries, total budget 60-120 seconds. Same key throughout.
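The client retry pattern in the last item can be sketched as a delay schedule. This uses the "full jitter" variant (each delay is drawn uniformly below an exponentially growing ceiling), which is one common choice rather than something the figures above prescribe; `backoff_delays` and its parameter names are hypothetical.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=30.0, budget=60.0, rng=random.random):
    """Yield sleep times: rng() scaled by min(cap, base * 2**attempt), within a total budget."""
    spent = 0.0
    for attempt in range(max_retries):
        delay = rng() * min(cap, base * 2 ** attempt)
        if spent + delay > budget:
            return                       # total retry budget exhausted
        spent += delay
        yield delay


# Pinning rng to 1.0 shows the ceiling of each delay.
ceilings = list(backoff_delays(rng=lambda: 1.0))
# With ten retries allowed, the 60-second budget cuts the schedule short.
budgeted = list(backoff_delays(max_retries=10, rng=lambda: 1.0))
jittered = list(backoff_delays())        # real, randomized delays
```

Every attempt in the schedule reuses the same idempotency key; only the wait between attempts changes.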

Frequently asked questions

What is an idempotency key, simply?

An idempotency key is a unique ID the client picks for each logical request — typically a UUID — and sends in an HTTP header like 'Idempotency-Key: 8f14e45f-ce…'. The server stores the key with the first response. If the same key arrives again, the server returns the cached response instead of executing the request again. The point: the client can retry as many times as it wants after a timeout or network blip and still only charge the card / create the order / send the SMS exactly once. Stripe popularized the convention in 2017; it is now standard across payment APIs (Stripe, Square, Adyen, Braintree) and cloud APIs (AWS, GCP, Cloudflare).

Why not just use a UUID in the request body?

You can. Many APIs accept a 'client_token' or 'request_id' field instead of a header. The HTTP header convention is slightly cleaner: it works for any verb, the server's idempotency middleware can deduplicate before any business code runs, and it doesn't pollute the resource schema. AWS passes the token as a request parameter (for example, ClientToken on EC2 RunInstances); Stripe uses the header. Both achieve the same correctness.

What's the TTL on an idempotency key?

Stripe's default is 24 hours — long enough for retries from a stuck mobile client, short enough that key storage doesn't grow forever. After 24 hours the same key may be reused for a different request. Square uses 72 hours; AWS request tokens are valid for 24 hours; PayPal recommends a similar window. The TTL is a deliberate trade-off between storage cost (each idempotency record is ~1-10 KB including cached response) and client-side resilience (long enough to cover any plausible retry storm).

What happens if the same key arrives while the first request is still processing?

The server returns a '409 Conflict' or '425 Too Early' indicating the request is in-flight. The standard implementation: insert (key, status='in_progress') into a database with a uniqueness constraint at the start of processing. The first request wins the unique constraint and proceeds; concurrent retries with the same key hit a duplicate-key error and return 409. Once processing completes, the row is updated with the final status and cached response. This prevents two parallel charges from one duplicated retry.

What if the client sends different request bodies with the same key?

That's a programming error, and Stripe returns a 400 with a 'Keys for idempotent requests can only be used with the same parameters' message. The standard implementation: hash the request body on first execution and store the hash with the key. On retry, compare the new hash with the stored hash; a mismatch means the client is buggy or compromised. Without this check, anyone who learned or guessed a key could send an arbitrary request under it and receive the cached response of the original. The body-hash check binds each key to exactly one request, turning it into a per-request capability.
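A sketch of the hash comparison follows. It hashes a canonical JSON encoding rather than the raw bytes, so that a retry whose fields are serialized in a different order still matches; that canonicalization step is an assumption of this sketch, not something the providers above are documented to do, and `request_fingerprint` and `same_request` are hypothetical names.

```python
import hashlib
import json


def request_fingerprint(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order and whitespace do not matter."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def same_request(stored_fingerprint: str, body: dict) -> bool:
    """True if a retry's body matches the fingerprint stored on first use."""
    return stored_fingerprint == request_fingerprint(body)


stored = request_fingerprint({"amount": 5000, "currency": "usd"})
ok = same_request(stored, {"currency": "usd", "amount": 5000})        # reordered fields
mismatch = same_request(stored, {"amount": 9999, "currency": "usd"})  # different amount
```

Hashing raw bytes is simpler and stricter; hashing a canonical form tolerates clients that re-serialize the body on retry. Which is appropriate depends on how the API's clients build their requests.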

Where does the idempotency table live and how big does it get?

Production options: (1) Redis with TTL — fast, simple, but ephemeral. Loss on Redis failure means duplicate processing. (2) Postgres / DynamoDB with TTL — durable, slower (~5-20 ms write). The standard choice for payment APIs. Sizing: 10,000 requests/sec × 24h TTL × 1KB per record ≈ 864 GB. A medium Postgres or a single DynamoDB table handles this with a TTL-based cleanup policy. Stripe shards by API key prefix; AWS uses DynamoDB with per-account partitioning.