
Idempotency

An operation is idempotent if executing it multiple times produces the same result as executing it once. Idempotency is essential for building reliable distributed systems because network failures, retries, and message duplication are inevitable -- idempotent operations ensure that retries are safe and do not cause unintended side effects.

Overview

In a distributed system, every network call can fail ambiguously: the request might have succeeded but the response was lost, the request might have been delivered twice due to a retry, or the message broker might redeliver a message after a consumer crash. When the outcome of a call is uncertain, the safest strategy is to retry -- but retrying a non-idempotent operation (like 'charge $50') can cause double-charges, duplicate records, or corrupted state. Idempotency is the property that makes retries safe.

An operation f is idempotent if f(f(x)) = f(x) -- applying it twice produces the same result as applying it once. Some operations are naturally idempotent: setting a value ('SET balance = 100'), deleting a record ('DELETE WHERE id = 5'), and writing to a specific key ('PUT /users/123'). Others are not: incrementing a counter ('balance += 50'), appending to a list ('INSERT INTO orders'), and sending notifications. Non-idempotent operations must be made idempotent through design patterns.
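The definition can be checked directly. The following is a minimal sketch in Python; the function names set_balance and add_to_balance and the account dictionary are illustrative assumptions, not part of any real API.

```python
def set_balance(account: dict, value: int) -> dict:
    """SET-style write: the result does not depend on the prior state."""
    return {**account, "balance": value}


def add_to_balance(account: dict, amount: int) -> dict:
    """Increment-style write: each application changes the result."""
    return {**account, "balance": account["balance"] + amount}


start = {"balance": 0}

# Idempotent: f(f(x)) == f(x)
once = set_balance(start, 100)
twice = set_balance(once, 100)
print(once == twice)  # True

# Not idempotent: f(f(x)) != f(x)
once = add_to_balance(start, 50)
twice = add_to_balance(once, 50)
print(once == twice)  # False
```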

The most common pattern is the idempotency key: the client generates a unique identifier (typically a UUID) for each logical operation and includes it in every request, including every retry of that operation. The server stores the key and the result of the first execution. On subsequent requests with the same key, the server returns the stored result without re-executing the operation. Stripe, PayPal, and a number of AWS APIs use variants of this pattern. The idempotency key store requires atomic check-and-set semantics (read the key, execute the operation, and store the result in a single transaction) to prevent race conditions.
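One way to picture the single-transaction requirement is the sketch below, which uses Python's built-in sqlite3 module with an in-memory database. The table names, the charge function, and the 'alice' account are assumptions made for illustration; a production system would use its own schema and database.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, result TEXT NOT NULL);
    INSERT INTO accounts VALUES ('alice', 200);
""")


def charge(key: str, account_id: str, amount: int) -> str:
    """Charge at most once per idempotency key (hypothetical helper)."""
    # `with conn` commits on success and rolls back on an exception, so the
    # balance update and the stored key become visible together or not at all.
    with conn:
        row = conn.execute(
            "SELECT result FROM idempotency_keys WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # replay: return the stored result, change nothing
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        result = f"charged {amount}"
        conn.execute(
            "INSERT INTO idempotency_keys (key, result) VALUES (?, ?)",
            (key, result),
        )
        return result


key = str(uuid.uuid4())
print(charge(key, "alice", 50))  # charged 50
print(charge(key, "alice", 50))  # charged 50 (replayed, no second debit)
print(conn.execute("SELECT balance FROM accounts").fetchone()[0])  # 150
```

Because the key column is a primary key, a second insert of the same key would also fail inside the transaction and roll the debit back, which is the behavior the pattern relies on.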

Idempotency is closely related to delivery semantics in messaging systems. At-most-once delivery (fire and forget) avoids duplicates but can lose messages. At-least-once delivery (retry until acknowledged) guarantees delivery but can produce duplicates. Exactly-once semantics (the gold standard) ensures each message is processed exactly once -- but it is technically impossible in the general case without some form of idempotency at the consumer. Kafka's exactly-once semantics, for example, combine idempotent producers, transactional writes, and consumer offset management to provide the effect of exactly-once processing.
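The "at-least-once delivery plus idempotent consumer" combination can be sketched as follows. This is a simplified in-memory model, not Kafka's mechanism; the Message and Consumer classes and the message_id field are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    amount: int


@dataclass
class Consumer:
    total: int = 0
    processed_ids: set = field(default_factory=set)

    def handle(self, msg: Message) -> bool:
        """Apply a message once; return False if it was a duplicate."""
        if msg.message_id in self.processed_ids:
            return False
        self.total += msg.amount
        self.processed_ids.add(msg.message_id)
        return True


# At-least-once delivery: "m1" is redelivered after a lost acknowledgment.
deliveries = [Message("m1", 50), Message("m2", 20), Message("m1", 50)]
consumer = Consumer()
applied = [consumer.handle(m) for m in deliveries]
print(applied, consumer.total)  # [True, True, False] 70
```

In a real consumer the set of processed IDs would need to be stored durably and updated in the same transaction as the side effect; the in-memory set here only illustrates the check.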

Key Points
  • 1. Natural idempotency: SET x=5 is idempotent (applying it 10 times still gives x=5). INCREMENT x by 5 is NOT idempotent (applying it 10 times to x=0 gives x=50). Prefer SET-style operations when possible.
  • 2. Idempotency keys: the client includes a unique ID (UUID v4 or v7) with each request. The server deduplicates by checking if the key has been seen before. The key + result must be stored atomically with the operation.
  • 3. Conditional writes: 'UPDATE balance SET amount=100 WHERE version=5' fails if the version has changed (optimistic concurrency control). 'INSERT ... ON CONFLICT DO NOTHING' is naturally idempotent for unique keys.
  • 4. The idempotency key store must be durable and have a TTL. Keys held only in Redis can be lost on a crash unless persistence is configured; keys in the primary database are safer. A TTL prevents the key store from growing unboundedly (typical: 24-48 hours).
  • 5. Consumer-side idempotency is critical for at-least-once messaging. Even if the broker advertises exactly-once delivery, a consumer crash between processing and acknowledgment causes redelivery. The consumer must detect and skip already-processed messages.
  • 6. Idempotency does not mean operations have no effect on retry -- it means they have the SAME effect. A 'create user' operation that returns the existing user on retry (instead of failing with a conflict) is idempotent from the client's perspective.
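The two conditional-write forms from the list above can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the balance and orders tables and both helper functions are made up for the example, and ON CONFLICT ... DO NOTHING requires SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balance (account TEXT PRIMARY KEY, amount INTEGER, version INTEGER);
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, item TEXT);
    INSERT INTO balance VALUES ('alice', 50, 5);
""")


def set_amount(account: str, amount: int, expected_version: int) -> bool:
    """Apply the write only if the row still has the version we read."""
    with conn:
        cur = conn.execute(
            "UPDATE balance SET amount = ?, version = version + 1 "
            "WHERE account = ? AND version = ?",
            (amount, account, expected_version),
        )
    return cur.rowcount == 1


def create_order(order_id: str, item: str) -> None:
    """Insert once; a repeat with the same order_id changes nothing."""
    with conn:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?) ON CONFLICT(order_id) DO NOTHING",
            (order_id, item),
        )


print(set_amount("alice", 100, 5))  # True: version 5 matched
print(set_amount("alice", 100, 5))  # False: the version is now 6

create_order("o-1", "book")
create_order("o-1", "book")  # retry: ignored
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 1
```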
Simple Example

Double-Click on the Pay Button

A user clicks 'Pay $50' and the request is sent. The network is slow, so the user clicks again. Without idempotency, two $50 charges are created -- $100 total. With an idempotency key, the first click includes key='abc-123'. The server processes the charge and stores key='abc-123' -> result='$50 charged'. The second click also includes key='abc-123'. The server finds the key, returns the cached result ('$50 charged'), and does not charge again. The user is charged exactly $50, regardless of how many times they click.
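The client's side of this scenario is sketched below: the key is generated once per logical payment and reused on every retry. FlakyPaymentServer is a hypothetical stand-in that loses its first response after the charge has already been applied; it is not modeled on any real payment API.

```python
import uuid


class FlakyPaymentServer:
    """Hypothetical server whose first response is lost in transit."""

    def __init__(self) -> None:
        self.results = {}  # idempotency key -> stored result
        self.charged_total = 0
        self.drop_next_response = True

    def pay(self, key: str, amount: int) -> str:
        if key not in self.results:
            self.charged_total += amount
            self.results[key] = f"${amount} charged"
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("response lost")  # the charge already happened
        return self.results[key]


def pay_with_retries(server: FlakyPaymentServer, amount: int, attempts: int = 3) -> str:
    key = str(uuid.uuid4())  # one key per logical payment, reused on every retry
    for _ in range(attempts):
        try:
            return server.pay(key, amount)
        except TimeoutError:
            continue  # outcome unknown, so retry with the same key
    raise RuntimeError("payment outcome still unknown")


server = FlakyPaymentServer()
print(pay_with_retries(server, 50), server.charged_total)  # $50 charged 50
```

Generating a fresh key inside the retry loop would defeat the mechanism, since each retry would then look like a new payment.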

Real-World Examples

Stripe

Stripe's API accepts an Idempotency-Key header on all POST requests. The key is typically a client-generated UUID. Stripe stores the key, request parameters, and response for 24 hours. Retries with the same key return the stored response without re-executing the charge. If a request with the same key but different parameters is received, Stripe returns an error (preventing accidental misuse). This pattern handles network retries, client-side retry libraries, and double form submissions safely.
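The "same key, different parameters" check can be approximated by storing a fingerprint of the request next to the response. The sketch below is an assumption about how such a check might look, not Stripe's implementation; handle_post, fingerprint, and IdempotencyMismatch are hypothetical names, and the dictionary stands in for a durable store.

```python
import hashlib
import json


class IdempotencyMismatch(Exception):
    """The same key was reused with different request parameters."""


def fingerprint(params: dict) -> str:
    # Canonical JSON, so that key order does not change the hash.
    body = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


store = {}  # key -> (fingerprint, response); in-memory stand-in


def handle_post(key: str, params: dict) -> dict:
    fp = fingerprint(params)
    if key in store:
        saved_fp, saved_response = store[key]
        if saved_fp != fp:
            raise IdempotencyMismatch(key)
        return saved_response  # replay of the original request
    response = {"status": "succeeded", "amount": params["amount"]}
    store[key] = (fp, response)
    return response


first = handle_post("abc-123", {"amount": 50, "currency": "usd"})
replay = handle_post("abc-123", {"currency": "usd", "amount": 50})
print(replay is first)  # True
```

Calling handle_post("abc-123", {"amount": 100, "currency": "usd"}) after this would raise IdempotencyMismatch instead of silently returning the $50 response.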

Apache Kafka (Idempotent Producers)

Kafka's idempotent producer (enable.idempotence=true) assigns a producer ID and sequence number to each message. The broker deduplicates messages by tracking the highest sequence number per producer per partition. If a retry sends the same message (same producer ID + sequence), the broker silently drops the duplicate. Combined with transactions, this provides exactly-once semantics for the produce-consume pipeline.
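The sequence-number idea can be illustrated with a toy model of a single partition. This is a deliberately simplified sketch and not Kafka's broker code; the PartitionLog class and its append method are assumptions for the example.

```python
from collections import defaultdict


class PartitionLog:
    """Toy model of one partition that deduplicates by sequence number."""

    def __init__(self) -> None:
        self.records = []
        self.last_seq = defaultdict(lambda: -1)  # producer_id -> highest seq

    def append(self, producer_id: int, seq: int, value: str) -> bool:
        """Return True if appended, False if dropped as a duplicate."""
        last = self.last_seq[producer_id]
        if seq <= last:
            return False  # already written: a retry of an acknowledged send
        if seq != last + 1:
            raise ValueError(f"out-of-order sequence {seq}, expected {last + 1}")
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


log = PartitionLog()
log.append(producer_id=7, seq=0, value="a")
log.append(producer_id=7, seq=1, value="b")
log.append(producer_id=7, seq=1, value="b")  # retry after a lost ack: dropped
print(log.records)  # ['a', 'b']
```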

AWS (Request Tokens)

Many AWS APIs accept a client-supplied token (named ClientToken or ClientRequestToken, depending on the service) that acts as an idempotency key. SQS FIFO queues use a similar mechanism: SendMessage with a MessageDeduplicationId prevents duplicate messages within the deduplication interval. DynamoDB conditional writes (ConditionExpression) provide idempotency for state mutations. S3 PutObject is naturally idempotent in an unversioned bucket (uploading the same object to the same key twice results in one object). AWS documents which operations are idempotent and which require client-side tokens.

Trade-Offs
Aspect | Description
Safety vs Storage | Storing idempotency keys and results requires persistent storage with atomic check-and-set. For high-throughput systems (millions of requests/day), the key store can grow large. TTL-based expiration (24-48 hours) bounds growth but means very late retries (after TTL) are not deduplicated.
Client Complexity vs Server Safety | Idempotency keys require clients to generate and track unique IDs for each operation. This adds client-side complexity (UUID generation, key storage for retries). Without client cooperation, the server must use other patterns (conditional writes, natural idempotency) that may be less general.
Exactly-Once Illusion | True exactly-once delivery is impossible in the general case (Two Generals Problem). What systems call 'exactly-once' is actually 'at-least-once delivery + idempotent processing.' Understanding this distinction is important for correctly reasoning about failure modes.
Latency of Deduplication Check | Every request must check the idempotency key store before processing. If the store is on a remote database, this adds a network round trip. Caching recent keys in memory reduces latency but introduces the risk of cache misses (allowing duplicates if the cache does not contain the key).
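The storage trade-off in the first row can be made concrete with a small TTL-bounded key store. This is an in-memory sketch under assumptions: the TtlKeyStore class is hypothetical, and the current time is passed in explicitly so the expiry behavior is easy to follow.

```python
class TtlKeyStore:
    """In-memory key store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.entries = {}  # key -> (stored_at, result)

    def get(self, key: str, now: float):
        entry = self.entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if now - stored_at >= self.ttl_seconds:
            del self.entries[key]  # expired: treated as never seen
            return None
        return result

    def put(self, key: str, result: str, now: float) -> None:
        self.entries[key] = (now, result)


HOUR = 3600
store = TtlKeyStore(ttl_seconds=24 * HOUR)
store.put("abc-123", "$50 charged", now=0)
print(store.get("abc-123", now=1 * HOUR))   # $50 charged
print(store.get("abc-123", now=25 * HOUR))  # None
```

The second lookup shows the cost of bounding storage: a retry that arrives after the TTL is indistinguishable from a new request and would be executed again.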
Case Study

Stripe's Idempotency Key Infrastructure

Scenario

Stripe processes billions of dollars in payments. Network unreliability between client applications and Stripe's API means requests can time out, be retried, or be sent multiple times. A single duplicate charge can cause customer complaints, chargebacks, and regulatory issues. Stripe needed a universal mechanism to make every write API call safe to retry.

Solution

Stripe implemented the Idempotency-Key header across all POST endpoints. When a request with a key is received for the first time, Stripe atomically (1) locks the key, (2) processes the request, and (3) stores the key + response. Subsequent requests with the same key return the stored response. If a request with the same key but different parameters is received, Stripe rejects it with a 422 error. Keys expire after 24 hours. The implementation uses a distributed lock (Redis) for the initial processing window and a durable store (database) for the cached response.

Outcome

Stripe's idempotency infrastructure processes millions of idempotent requests per day and is estimated to prevent thousands of duplicate charges. The pattern has become an industry standard -- virtually all payment APIs now support idempotency keys. Stripe's engineering blog post on idempotency (2017) has been cited by AWS, Google Pay, and dozens of fintech companies as the reference implementation.

Common Mistakes
  • ⚠ Confusing idempotency with safety. An idempotent operation produces the same result when repeated, but it still has an effect the first time. A 'delete user' operation that succeeds the first time and returns 'not found' the second time is idempotent -- the state is the same regardless of how many times it is called.
  • ⚠ Not storing the idempotency key atomically with the operation. If you check the key (not found), process the request, and then store the key as separate steps, a crash between processing and storing causes the retry to re-process. Use a database transaction that checks the key, performs the operation, and stores the key in one atomic unit.
  • ⚠ Using the same idempotency key for different operations. If a client reuses key 'abc-123' for a $50 charge and later for a $100 charge, the server returns the cached $50 result for the $100 request. Keys must be unique per logical operation. UUIDv7 (time-ordered) is recommended over UUIDv4 for better index performance.
  • ⚠ Forgetting to handle the race condition of concurrent retries. If two retries with the same key arrive simultaneously, both might pass the 'key not found' check and both process the request. Use a distributed lock or database UNIQUE constraint on the key to prevent this.
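The concurrent-retry race from the last item can be reproduced and closed in a single process. The sketch below uses a thread lock as a stand-in for a distributed lock or a UNIQUE constraint; the KeyClaims class and the handle function are assumptions for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class KeyClaims:
    """Grants each idempotency key to exactly one caller."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed = set()

    def try_claim(self, key: str) -> bool:
        # Check and insert under one lock, so two callers cannot both
        # observe "key not found".
        with self._lock:
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True


claims = KeyClaims()
executions = []


def handle(key: str) -> None:
    if claims.try_claim(key):
        executions.append(key)  # stands in for the real side effect


# Eight simultaneous retries of the same logical request.
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(handle, ["abc-123"] * 8))

print(len(executions))  # 1
```

In a complete design the callers that lose the claim would wait for, or be told to retry and fetch, the stored result of the winner rather than simply returning.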
Test Your Understanding

1. Which operation is naturally idempotent?

2. What should happen if a request with an existing idempotency key but different parameters is received?
