Why is true exactly-once delivery theoretically impossible?
Delivery semantics define how many times a message is guaranteed to be delivered and processed: at-most-once (may lose messages), at-least-once (may duplicate messages), or exactly-once (neither lost nor duplicated). Understanding these guarantees is critical for designing reliable distributed systems, because they determine whether your application must deduplicate messages, make its processing idempotent, or can simply tolerate data loss.
In a distributed messaging system, a producer sends a message to a broker, which delivers it to a consumer. At each step, failures can occur: the producer's send may fail, the broker may crash after receiving but before replicating, the consumer may crash after processing but before acknowledging, or the acknowledgment may be lost in transit. Delivery semantics describe the guarantees the system provides in the face of these failures.
**At-most-once** means each message is delivered zero or one times. The producer sends the message and does not retry on failure. If the send fails or the broker crashes, the message is lost. This is the simplest and fastest approach, suitable for metrics and telemetry where occasional data loss is acceptable.
**At-least-once** means each message is delivered one or more times. The producer retries failed sends until the broker acknowledges. The broker delivers to the consumer and waits for acknowledgment before removing the message. If the consumer processes the message but crashes before acknowledging, the broker redelivers it -- causing a duplicate. This is the most common guarantee in production systems (SQS Standard, Kafka default, RabbitMQ with manual ack).
**Exactly-once** means each message is processed exactly one time -- never lost, never duplicated. This is the holy grail of messaging, but it cannot be guaranteed by the delivery layer alone. The Two Generals' Problem proves that two parties communicating over an unreliable channel cannot reach guaranteed agreement in any finite number of messages: a sender that receives no acknowledgment cannot tell whether the message or the acknowledgment was lost, so it must either give up (risking loss) or retry (risking a duplicate). In practice, 'exactly-once' is usually implemented as **at-least-once delivery + idempotent processing**: the broker may deliver a message multiple times, but the consumer detects duplicates (via an idempotency key) and ensures each message's side effects are applied exactly once.
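The "at-least-once delivery + idempotent processing" pattern can be sketched as follows. This is a minimal in-memory illustration, not a production design: the `Message` and `IdempotentConsumer` names and the `idempotency_key` field are assumptions made for the example, and a real consumer would persist the seen keys durably together with the side effect.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    idempotency_key: str  # hypothetical field; any stable unique ID works
    amount: int


@dataclass
class IdempotentConsumer:
    """Applies each message's side effect once per idempotency key."""
    seen: set = field(default_factory=set)
    balance: int = 0

    def handle(self, msg: Message) -> bool:
        """Return True if the side effect ran, False if msg was a duplicate."""
        if msg.idempotency_key in self.seen:
            return False  # already applied; safe to acknowledge again
        self.balance += msg.amount
        self.seen.add(msg.idempotency_key)
        return True


consumer = IdempotentConsumer()
# The broker redelivers "m-1", as at-least-once delivery allows.
deliveries = [Message("m-1", 50), Message("m-1", 50), Message("m-2", 20)]
results = [consumer.handle(m) for m in deliveries]
```

Here the second delivery of `m-1` is recognized and skipped, so the balance reflects each message once even though three deliveries occurred.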
Kafka offers exactly-once semantics (EOS) within its ecosystem via three mechanisms: (1) idempotent producers that deduplicate retried writes using a producer ID + sequence number, (2) transactional writes that atomically commit records to multiple partitions, and (3) consumer-side read_committed isolation that only reads records from committed transactions. However, exactly-once only holds within the Kafka boundary -- once you write the result to an external database, you need application-level idempotency to prevent duplicates.
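The idea behind the idempotent producer (mechanism 1) can be illustrated with a toy broker that remembers the highest sequence number it has appended for each producer ID. This is a simplified simulation, not Kafka's implementation or client API: the `Broker` class, `send_with_retry`, and the `acks_lost` parameter are hypothetical, and real Kafka tracks sequences per partition and also handles gaps and reordering.

```python
import itertools


class Broker:
    """Toy log that optionally drops retried writes by (producer_id, seq)."""

    def __init__(self, dedupe: bool):
        self.dedupe = dedupe
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id: str, seq: int, payload: str) -> None:
        if self.dedupe and seq <= self.last_seq.get(producer_id, -1):
            return  # retried write: acknowledge without appending again
        self.log.append(payload)
        self.last_seq[producer_id] = seq


def send_with_retry(broker, producer_id, seq, payload, acks_lost):
    """Resend until an ack arrives; the first `acks_lost` acks are dropped."""
    for attempt in itertools.count():
        broker.append(producer_id, seq, payload)  # the write itself succeeds
        if attempt >= acks_lost:
            return attempt + 1  # total number of sends


def run(dedupe: bool):
    broker = Broker(dedupe)
    send_with_retry(broker, "p1", 0, "order-created", acks_lost=2)
    send_with_retry(broker, "p1", 1, "order-paid", acks_lost=0)
    return broker.log
```

Without deduplication, the two lost acknowledgments leave three copies of the first record in the log; with it, each record appears once even though the producer sent the first one three times.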
Payment Processing: Why Delivery Semantics Matter
A payment service receives a 'charge $50' message. With at-most-once: if the message is lost, the customer is never charged (lost revenue). With at-least-once: if the consumer processes the charge but crashes before acknowledging, the message is redelivered and the customer is charged twice ($100). With exactly-once (idempotent processing): the consumer records the payment_id in the same database transaction as the charge, or passes it to the payment provider as an idempotency key, so the record and the charge cannot get out of sync. On redelivery, it finds the payment_id already recorded and skips the duplicate. The customer is charged exactly $50.
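A sketch of the idempotent payment handler, using SQLite from the Python standard library. The table names, `handle_charge`, and the ledger row standing in for the actual charge are all assumptions for illustration; a call to an external payment provider cannot join a local database transaction, so in that case the payment_id would typically be forwarded as the provider's idempotency key instead.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # PRIMARY KEY makes a second insert of the same payment_id fail.
    db.execute("CREATE TABLE processed (payment_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE ledger (payment_id TEXT, amount_cents INTEGER)")
    return db


def handle_charge(db: sqlite3.Connection, payment_id: str, amount_cents: int) -> bool:
    """Record the charge once; return False if payment_id was already handled."""
    try:
        with db:  # one transaction: commit on success, roll back on error
            db.execute(
                "INSERT INTO processed (payment_id) VALUES (?)", (payment_id,)
            )
            # Stand-in for the charge: a ledger row in the same transaction.
            db.execute(
                "INSERT INTO ledger (payment_id, amount_cents) VALUES (?, ?)",
                (payment_id, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery; nothing was written


db = make_db()
first = handle_charge(db, "pay_1", 5000)
second = handle_charge(db, "pay_1", 5000)  # redelivery of the same message
total = db.execute("SELECT SUM(amount_cents) FROM ledger").fetchone()[0]
```

Relying on the primary-key constraint rather than a separate existence check means two concurrent deliveries of the same message cannot both pass the check.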
Stripe
Stripe uses at-least-once delivery for webhook events and provides idempotency keys for API requests. When a payment event is sent to a merchant's webhook endpoint, Stripe retries on failure for up to 3 days. Merchants must make their webhook handlers idempotent (e.g., checking if the payment_intent_id was already processed) because the same event may be delivered multiple times.
Apache Kafka
Kafka 0.11+ introduced exactly-once semantics (EOS) via idempotent producers (a producer ID + sequence number deduplicate retried sends) and transactions, with consumers reading in read_committed isolation so they only see records from committed transactions. Confluent benchmarks show EOS adds only 3-5% overhead vs at-least-once. However, EOS only applies within the Kafka ecosystem -- external system writes require application-level idempotency.
Google Cloud Pub/Sub
Google Cloud Pub/Sub provides at-least-once delivery by default. Messages are redelivered if not acknowledged within the ack deadline. Google Dataflow (Apache Beam runner) builds exactly-once processing on top by checkpointing its processing state and output together, using Pub/Sub's message_id for deduplication.
| Aspect | Description |
|---|---|
| Reliability vs Performance | At-most-once is fastest (no retries, no ack waiting). At-least-once adds retry latency and acknowledgment overhead. Exactly-once adds transactional coordination. Each step up in reliability costs throughput and latency. For 1M msg/sec telemetry, at-most-once may be the right choice. For payment processing at 1K msg/sec, exactly-once is worth the overhead. |
| Simplicity vs Correctness | At-most-once requires no deduplication logic. At-least-once requires idempotent consumers (checking for duplicate IDs). Exactly-once may require distributed transactions or outbox patterns. The implementation complexity increases with each guarantee level. |
| End-to-End vs Component-Level Guarantees | A system is only as strong as its weakest link. Kafka EOS within Kafka means nothing if the consumer writes to PostgreSQL without an idempotency check. True end-to-end exactly-once requires idempotency at every system boundary, including external APIs, databases, and third-party services. |
| Consumer Acknowledgment Timing | Ack before processing: at-most-once (fast but may lose). Ack after processing: at-least-once (safe but may duplicate). Ack in the same transaction as processing: exactly-once (safest but slowest and most complex). The choice depends on whether losing or duplicating messages is more dangerous for your use case. |
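The effect of acknowledgment timing can be seen in a small simulation. This is a toy model under stated assumptions: `deliver` is a hypothetical function, the broker redelivers until it sees an ack, and the consumer crashes exactly once, between its two steps, on the first attempt.

```python
def deliver(ack_first: bool) -> int:
    """Return how many times the message's side effect ran."""
    acked = False
    processed = 0
    crashed_once = False
    while not acked:  # the broker redelivers until it sees an ack
        if ack_first:
            acked = True
            if not crashed_once:
                crashed_once = True
                continue  # consumer dies after acking, before processing
            processed += 1
        else:
            processed += 1
            if not crashed_once:
                crashed_once = True
                continue  # consumer dies after processing, before acking
            acked = True
    return processed


lost = deliver(ack_first=True)         # side effect never runs
duplicated = deliver(ack_first=False)  # side effect runs twice
```

With the ack sent first, the broker considers the message done and never redelivers, so the work is lost. With the ack sent last, the redelivery repeats the work, which is why at-least-once consumers need idempotent processing.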
Uber's Exactly-Once Event Processing Pipeline
Scenario
Uber processes billions of trip events daily for pricing, ETA calculations, and driver payments. Early systems used at-least-once delivery with Kafka, leading to duplicate events that caused incorrect payment calculations. A trip_completed event processed twice would double-charge the rider or double-pay the driver. Manual reconciliation required an entire team and still missed edge cases.
Solution
Uber built an exactly-once processing framework on top of Kafka. Each event carries a globally unique event_id. Consumers write the event_id to a deduplication table in the same database transaction as the business logic (e.g., updating the trip record and inserting a payment). If the event_id already exists, the transaction is a no-op. The deduplication table is periodically pruned of entries older than the Kafka retention period. For cross-service calls, idempotency keys are propagated through the call chain.
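The pruning step can be sketched like this. It is not Uber's code: the schema, function names, and the seven-day `RETENTION_SECONDS` value are assumptions for illustration, and timestamps are passed in explicitly to keep the example deterministic.

```python
import sqlite3

RETENTION_SECONDS = 7 * 24 * 3600  # assumed; should match the broker's retention


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE dedup (event_id TEXT PRIMARY KEY, seen_at REAL NOT NULL)"
    )
    return db


def record_event(db: sqlite3.Connection, event_id: str, now: float) -> bool:
    """Return True if event_id is new, False if it was already recorded."""
    with db:
        cur = db.execute(
            "INSERT OR IGNORE INTO dedup (event_id, seen_at) VALUES (?, ?)",
            (event_id, now),
        )
    return cur.rowcount == 1  # 0 rows changed means the insert was ignored


def prune(db: sqlite3.Connection, now: float) -> int:
    """Delete entries older than the retention window; return rows removed."""
    with db:
        cur = db.execute(
            "DELETE FROM dedup WHERE seen_at < ?", (now - RETENTION_SECONDS,)
        )
    return cur.rowcount


db = make_db()
new_e1 = record_event(db, "e1", now=0)
dup_e1 = record_event(db, "e1", now=10)
new_e2 = record_event(db, "e2", now=RETENTION_SECONDS)
removed = prune(db, now=RETENTION_SECONDS + 1)  # only e1 is old enough
```

Once an entry is pruned, a redelivery of that event would be treated as new, which is why the pruning window should be at least as long as the period during which the broker can still redeliver it.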
Outcome
Payment accuracy improved from 99.7% to 99.99%. The reconciliation team was disbanded. The deduplication framework was generalized into an internal library used by 200+ services. The performance overhead was minimal: a single SELECT before each INSERT, with the event_id indexed. The insight was that 'exactly-once' is not a messaging feature -- it is an application-level concern that requires cooperation between the messaging system and the business logic.