An idempotency key is a unique identifier attached to a request or message that allows the receiver to detect and safely handle duplicates. If the same key is seen twice, the receiver returns the cached result of the first processing instead of executing the operation again. Idempotency keys are the practical mechanism that turns at-least-once delivery into effectively exactly-once processing.
In distributed systems, clients retry failed requests because they cannot distinguish between three scenarios: (1) the request never reached the server, (2) the server processed the request but the response was lost, or (3) the server is slow and will eventually respond. In scenarios (2) and (3), retrying causes duplicate execution. For non-idempotent operations like payment charges or inventory deductions, this is catastrophic.
An **idempotency key** solves this by giving each logical operation a unique identity. The client generates a key (typically a UUID or a deterministic hash of the operation parameters) and includes it in the request header or body. The server checks if it has already processed a request with this key:
- **First time**: Execute the operation, store the key and result, return the result.
- **Duplicate**: Return the stored result without re-executing.
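The two branches above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `IdempotentHandler` and `charge_50` are hypothetical names, and the dictionary stands in for the durable store a real service would need.

```python
from typing import Any, Callable, Dict


class IdempotentHandler:
    """Runs an operation at most once per idempotency key (in-memory sketch)."""

    def __init__(self) -> None:
        # Maps idempotency key -> stored result. Assumption: a real service
        # would keep this in a durable store, not process memory.
        self._results: Dict[str, Any] = {}

    def handle(self, key: str, operation: Callable[[], Any]) -> Any:
        if key in self._results:
            # Duplicate: return the stored result without re-executing.
            return self._results[key]
        # First time: execute, store the key and result, return the result.
        result = operation()
        self._results[key] = result
        return result


charges = []  # records every real execution of the side effect


def charge_50() -> dict:
    charges.append(50)
    return {"status": "success", "amount": 50}


handler = IdempotentHandler()
first = handler.handle("pay_abc123", charge_50)
retry = handler.handle("pay_abc123", charge_50)  # same key: no second charge
```

After both calls, `charges` holds a single entry and `retry` equals `first`. The sketch is single-threaded; two concurrent requests with the same key would need the locking discussed later in the article.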
The implementation requires a durable store (database table, Redis) mapping keys to results. The key-result pair is stored atomically with the business operation -- either in the same database transaction or in a write-ahead log. A TTL (time-to-live) on stored keys prevents unbounded storage growth; a TTL of 24-72 hours is typical and should be at least as long as the client's retry window.
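One way to get that atomicity is to write the key and the business change in a single database transaction. The sketch below uses Python's built-in `sqlite3` with an in-memory database; the table names, the `debit` and `purge_expired` helpers, and the 24-hour `TTL_SECONDS` are illustrative assumptions rather than a prescribed schema.

```python
import json
import sqlite3
import time

TTL_SECONDS = 24 * 60 * 60  # assumed 24-hour retention for stored keys

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE idempotency_keys ("
    " key TEXT PRIMARY KEY, result TEXT NOT NULL, created_at REAL NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES ('acct_1', 100)")
conn.commit()


def debit(conn, key, account_id, amount, now=None):
    """Debit an account once per idempotency key."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commits on success, rolls back on exception
        row = conn.execute(
            "SELECT result FROM idempotency_keys WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # duplicate: replay the stored result
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        result = {"status": "success", "debited": amount}
        # Same transaction as the UPDATE, so both commit or neither does.
        conn.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?)",
            (key, json.dumps(result), now),
        )
        return result


def purge_expired(conn, now=None):
    """TTL cleanup job: delete keys older than TTL_SECONDS; returns rows removed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM idempotency_keys WHERE created_at < ?",
            (now - TTL_SECONDS,),
        )
        return cur.rowcount
```

Because the key column is the primary key, a second writer that races past the `SELECT` fails on the `INSERT`, and the rollback undoes its debit as well.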
Idempotency keys work at every system boundary:
- **Client → API**: Client generates key, sends in `Idempotency-Key` header (Stripe convention).
- **API → Message Broker**: Producer includes key in message metadata; consumer deduplicates.
- **Service → Database**: Application checks for key existence before performing the write.
- **Cross-service calls**: Caller propagates key; callee deduplicates.
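As an illustration of the message-broker boundary, here is a rough sketch of a consumer that deduplicates on a key carried in message metadata. `Message`, `InventoryConsumer`, and the `idempotency_key` metadata field are hypothetical names, and the in-memory set stands in for a durable dedup store.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Message:
    metadata: Dict[str, str]  # producer puts the idempotency key here
    body: Dict[str, int]


@dataclass
class InventoryConsumer:
    stock: int
    seen_keys: Set[str] = field(default_factory=set)

    def on_message(self, msg: Message) -> bool:
        """Apply the deduction once per key; return True if it was applied."""
        key = msg.metadata["idempotency_key"]
        if key in self.seen_keys:
            return False  # redelivery from the broker: skip the side effect
        self.stock -= msg.body["quantity"]
        self.seen_keys.add(key)
        return True


consumer = InventoryConsumer(stock=10)
msg = Message(metadata={"idempotency_key": "order-42-deduct"}, body={"quantity": 3})

# The broker delivers the same message twice (at-least-once delivery).
applied = [consumer.on_message(msg), consumer.on_message(msg)]
```

Only the first delivery changes `stock`; the second is recognized by its key and ignored.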
The key insight is that idempotency is not a property of the transport layer -- it is an application-level concern that must be designed into every state-changing operation. The message broker can deliver a message twice; only the application logic can ensure the side effects happen once.
Safe Payment Retry
A mobile app charges a customer $50. The app generates an idempotency key (e.g., 'pay_abc123') and sends POST /charge with Idempotency-Key: pay_abc123. The payment service processes the charge, stores ('pay_abc123' → {status: 'success', charge_id: 'ch_789'}) in a database table, and returns the result. The response is lost due to a network glitch. The app retries with the same key. The server finds 'pay_abc123' in its idempotency store and returns the cached result without charging again. The customer is charged exactly $50.
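The scenario can be simulated end to end. In this sketch `PaymentServer`, `FlakyNetwork`, and `charge_with_retry` are made-up stand-ins for the payment service, the unreliable connection, and the app's retry loop; the important detail is that the key is generated once, outside the loop, so every attempt carries the same key.

```python
import uuid


class PaymentServer:
    """Stand-in for the payment service with an idempotency store."""

    def __init__(self):
        self.results = {}   # idempotency key -> stored response
        self.charges = []   # every real charge that was executed

    def post_charge(self, idempotency_key, amount):
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        self.charges.append(amount)
        result = {"status": "success", "charge_id": f"ch_{len(self.charges)}"}
        self.results[idempotency_key] = result
        return result


class FlakyNetwork:
    """Drops the first N responses *after* the server has processed them."""

    def __init__(self, server, drop_responses):
        self.server = server
        self.drop_responses = drop_responses

    def send(self, idempotency_key, amount):
        response = self.server.post_charge(idempotency_key, amount)
        if self.drop_responses > 0:
            self.drop_responses -= 1
            raise TimeoutError("response lost")
        return response


def charge_with_retry(network, amount, max_attempts=3):
    # One key per logical operation, reused on every retry.
    key = f"pay_{uuid.uuid4().hex}"
    for _ in range(max_attempts):
        try:
            return network.send(key, amount)
        except TimeoutError:
            continue  # outcome unknown: retry with the SAME key
    raise RuntimeError("charge outcome unknown after retries")


server = PaymentServer()
result = charge_with_retry(FlakyNetwork(server, drop_responses=1), 50)
```

The first attempt charges the customer but its response is dropped; the retry gets the stored result, so `server.charges` contains a single $50 charge. Generating a fresh key inside the loop would defeat the mechanism.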
Stripe
Stripe's API accepts an Idempotency-Key header on all POST requests. Keys are retained for at least 24 hours, after which they may be pruned and a reused key is treated as a new request. On duplicate requests, Stripe returns the exact same response (including the same HTTP status code and body). If the original request is still processing, Stripe returns a 409 Conflict to prevent concurrent duplicates. This pattern is documented as best practice and used by millions of merchants worldwide.
Amazon (AWS)
SQS FIFO queues use a MessageDeduplicationId to detect duplicate sends within a 5-minute window. If a producer sends the same message twice with the same deduplication ID, the second send is accepted but not delivered to consumers. DynamoDB supports conditional writes (PutItem with condition 'attribute_not_exists(pk)'), which act as a server-side idempotency check -- the second write fails instead of creating a duplicate.
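To make the idea of a deduplication window concrete, here is a small model of time-bounded deduplication. It is not AWS code and does not call any AWS API: `DedupWindow` is a hypothetical class, the 300-second default mirrors the 5-minute window mentioned above, and the window is assumed to be measured from the first accepted send rather than sliding with each duplicate.

```python
class DedupWindow:
    """Accepts an ID once per window; later sends inside the window are dropped."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._first_seen = {}  # dedup ID -> time of the first accepted send

    def accept(self, dedup_id: str, now: float) -> bool:
        first = self._first_seen.get(dedup_id)
        if first is not None and now - first < self.window_seconds:
            return False  # duplicate inside the window: drop it
        # New ID, or the previous window has expired: accept and restart it.
        self._first_seen[dedup_id] = now
        return True


window = DedupWindow()
outcomes = [
    window.accept("msg-1", now=0.0),    # first send
    window.accept("msg-1", now=120.0),  # retry after 2 minutes
    window.accept("msg-1", now=301.0),  # retry after the window has passed
]
```

The third send is accepted again, which shows the limit of a short window: a retry that arrives after expiry is treated as a new message, so the window must be longer than the producer's retry horizon.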
PayPal
PayPal uses a PayPal-Request-Id header for idempotency across their REST API. Each unique ID is associated with a payment for 45 days. If a network error occurs and the client retries, PayPal returns the original payment response. This is critical for payment systems where duplicate charges directly affect customer trust and regulatory compliance.
| Aspect | Description |
|---|---|
| Storage Cost vs Dedup Window | Longer TTLs catch more duplicates but consume more storage. A 24-hour TTL with 1M operations/day requires 1M rows in the dedup table. At 72 hours, that is 3M rows. Use a dedicated table with the idempotency key as primary key and a TTL-based cleanup job. |
| Atomicity of Key + Operation | The idempotency key and the business operation must be committed atomically. If the operation succeeds but the key is not stored (or vice versa), the system is inconsistent. Using the same database transaction for both is simplest. If they are in different stores, you need the outbox pattern or distributed transactions. |
| Key Generation Strategy | Random UUIDs are unique but opaque. Deterministic keys (hash of operation parameters) are debuggable and naturally idempotent for retries, but may collide if different operations have the same parameters. Best practice: use a UUID for the request and include the business entity ID in the dedup check. |
| In-Flight Duplicate Handling | If a duplicate request arrives while the first is still processing, you must handle the race condition. Options: (1) return 409 Conflict (Stripe's approach), (2) block until the first completes and return its result, or (3) use a database lock on the idempotency key to serialize processing. |
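
The key generation trade-off in the table can be shown directly. The helper names below (`random_key`, `deterministic_key`, `scoped_key`) are illustrative, and SHA-256 over canonical JSON is one reasonable choice of deterministic hash, not the only one.

```python
import hashlib
import json
import uuid


def random_key() -> str:
    """Opaque but unique: a fresh UUID for each logical operation."""
    return str(uuid.uuid4())


def deterministic_key(operation: str, params: dict) -> str:
    """Hash of the operation and its parameters; stable across retries."""
    canonical = json.dumps(
        {"op": operation, "params": params},
        sort_keys=True,             # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def scoped_key(entity_id: str, request_key: str) -> tuple:
    """Dedup on (business entity, request key) rather than the key alone."""
    return (entity_id, request_key)


# A retry of the same operation produces the same deterministic key...
k1 = deterministic_key("charge", {"customer": "cus_1", "amount": 500})
k2 = deterministic_key("charge", {"amount": 500, "customer": "cus_1"})

# ...but so does a second, intentional purchase with identical parameters,
# which is the collision risk the table warns about.
second_purchase = deterministic_key("charge", {"customer": "cus_1", "amount": 500})

# Random keys keep the two purchases distinct.
purchase_a = scoped_key("cus_1", random_key())
purchase_b = scoped_key("cus_1", random_key())
```

Here `k1 == k2 == second_purchase`, so a purely parameter-based key would suppress the legitimate second purchase, while `purchase_a` and `purchase_b` differ.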
Stripe's Idempotency Key Infrastructure
Scenario
Stripe processes millions of payment requests daily. Network issues (timeouts, dropped connections, load balancer retries) cause a significant fraction of requests to be sent multiple times. Without protection, duplicate requests would charge customers twice, create duplicate refunds, or double-transfer money between accounts. Even a 0.01% duplicate rate at Stripe's scale would affect thousands of transactions daily.
Solution
Stripe implemented a universal idempotency key system. Every mutating API request accepts an Idempotency-Key header. Internally, the idempotency layer sits between the API gateway and the business logic. It uses a Redis-backed store with a 24-hour TTL. On first request: acquire a lock on the key, execute the operation, store the result, release the lock. On duplicate: return the stored result. On concurrent duplicate (first still processing): return 409 Conflict. The system handles partial failures: if the operation fails midway, the stored result is the error response, and the client can retry with a new key.
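The lock, execute, store, release sequence described above can be approximated as follows. This is a sketch under stated assumptions and not Stripe's implementation: `IdempotencyLayer` is a hypothetical class, an in-process `threading.Lock` and dictionary stand in for the shared store, and operations are assumed to return a `(status, body)` tuple.

```python
import threading
from typing import Any, Callable, Dict, Tuple

IN_PROGRESS = object()  # marker: the first request for this key is still running


class IdempotencyLayer:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, Any] = {}  # key -> IN_PROGRESS or (status, body)

    def handle(self, key: str, operation: Callable[[], Tuple[int, Any]]) -> Tuple[int, Any]:
        with self._lock:
            entry = self._entries.get(key)
            if entry is IN_PROGRESS:
                # Concurrent duplicate: reject instead of running twice.
                return 409, {"error": "request with this key is still processing"}
            if entry is not None:
                return entry  # completed duplicate: replay the stored response
            self._entries[key] = IN_PROGRESS  # claim the key
        try:
            response = operation()  # runs outside the lock
        except Exception as exc:
            # A failure is stored too, so replays see the same error.
            response = (500, {"error": str(exc)})
        with self._lock:
            self._entries[key] = response
        return response


layer = IdempotencyLayer()
started, release = threading.Event(), threading.Event()
calls, results = [], []


def slow_charge():
    calls.append(1)
    started.set()
    release.wait(timeout=5)  # hold the request "in flight"
    return 200, {"charge_id": "ch_789"}


worker = threading.Thread(
    target=lambda: results.append(layer.handle("pay_abc123", slow_charge))
)
worker.start()
started.wait(timeout=5)
concurrent = layer.handle("pay_abc123", slow_charge)  # arrives mid-flight
release.set()
worker.join()
replay = layer.handle("pay_abc123", slow_charge)      # arrives after completion
```

The mid-flight duplicate receives a 409, the later duplicate receives the stored 200 response, and `slow_charge` runs only once. Because a failed operation's error is stored under its key, a client that wants to try again must use a new key, as the text notes.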
Outcome
Stripe's idempotency system handles billions of key lookups daily with sub-millisecond latency. It catches approximately 0.5% of requests as duplicates (mostly from automatic retries). The system is so successful that Stripe published it as a best practice and it has become the industry standard for payment APIs. The Idempotency-Key header pattern has been adopted by PayPal, Square, Adyen, and dozens of other payment processors.
1. Why must the idempotency key be generated by the client, not the server?
2. What should a server do if a duplicate idempotency key arrives while the original request is still processing?