
Delivery Semantics

Delivery semantics define how many times a message is guaranteed to be delivered and processed: at-most-once (may lose messages), at-least-once (may duplicate messages), or exactly-once (neither lost nor duplicated). Understanding these guarantees is critical for designing reliable distributed systems, as they determine whether your application needs deduplication, idempotency, or can tolerate data loss.

Overview

In a distributed messaging system, a producer sends a message to a broker, which delivers it to a consumer. At each step, failures can occur: the producer's send may fail, the broker may crash after receiving but before replicating, the consumer may crash after processing but before acknowledging, or the acknowledgment may be lost in transit. Delivery semantics describe the guarantees the system provides in the face of these failures.

**At-most-once** means each message is delivered zero or one times. The producer sends the message and does not retry on failure. If the send fails or the broker crashes, the message is lost. This is the simplest and fastest approach, suitable for metrics and telemetry where occasional data loss is acceptable.

**At-least-once** means each message is delivered one or more times. The producer retries failed sends until the broker acknowledges. The broker delivers to the consumer and waits for acknowledgment before removing the message. If the consumer processes the message but crashes before acknowledging, the broker redelivers it -- causing a duplicate. This is the most common guarantee in production systems (SQS Standard, Kafka default, RabbitMQ with manual ack).
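The difference between the first two guarantees comes down to whether the producer retries. The following is a minimal sketch of that behavior, not a real client: the `Broker` class, the `send` function, and the scripted `outcomes` list are all hypothetical names invented for illustration, and the "network" is just a list of predetermined results.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy broker that appends every message it receives to a log."""
    log: list = field(default_factory=list)

    def receive(self, message: str) -> None:
        self.log.append(message)


def send(broker: Broker, message: str, outcomes: list, retry: bool) -> None:
    """Send one message over a scripted unreliable channel.

    `outcomes` lists what happens on each attempt:
      "lost"     - the request never reaches the broker
      "ack_lost" - the broker stores the message, but the ack is dropped
      "ok"       - the broker stores the message and the ack arrives
    """
    for outcome in outcomes:
        if outcome != "lost":
            broker.receive(message)
        if outcome == "ok" or not retry:
            return  # acknowledged, or fire-and-forget: stop either way


# At-most-once: no retry, so a lost request means a lost message.
at_most_once = Broker()
send(at_most_once, "m1", ["lost", "ok"], retry=False)

# At-least-once: retry until acknowledged, so a lost ack means a duplicate.
at_least_once = Broker()
send(at_least_once, "m1", ["ack_lost", "ok"], retry=True)

print(at_most_once.log)   # []
print(at_least_once.log)  # ['m1', 'm1']
```

The producer cannot distinguish a lost request from a lost acknowledgment, which is why retrying trades possible loss for possible duplication.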

**Exactly-once** means each message is processed exactly one time -- never lost, never duplicated. This is the holy grail of messaging, but it is notoriously difficult to achieve. The Two Generals' Problem shows that two parties communicating over an unreliable channel cannot reach guaranteed agreement in any finite number of messages. For messaging, this means a sender that receives no acknowledgment cannot tell whether the message or the acknowledgment was lost, so it must either give up (risking loss) or retry (risking a duplicate). In practice, 'exactly-once' is usually implemented as **at-least-once delivery + idempotent processing**: the broker may deliver a message multiple times, but the consumer detects duplicates (via an idempotency key) and ensures each message's side effects are applied exactly once.

Kafka offers exactly-once semantics (EOS) within its ecosystem via three mechanisms: (1) idempotent producers that deduplicate retried writes using a producer ID + sequence number, (2) transactional writes that atomically commit records to multiple partitions, and (3) consumer-side read_committed isolation that only reads records from committed transactions. However, exactly-once only holds within the Kafka boundary -- once you write the result to an external database, you need application-level idempotency to prevent duplicates.
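To make the first mechanism concrete, here is a rough sketch of deduplicating retried writes by producer ID and sequence number. It is a simplified model, not Kafka's implementation: the `DedupingPartition` class is hypothetical, it tracks a single partition, and it ignores details such as rejecting gaps in the sequence.

```python
class DedupingPartition:
    """Toy partition log that drops retried writes, loosely modeled on the
    producer ID + sequence number scheme described above."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, record):
        """Append the record unless this sequence number was already written.

        Returns True if the record was appended, False if it was a duplicate.
        """
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # a retry of a write that already succeeded
        self.log.append(record)
        self.last_seq[producer_id] = seq
        return True


partition = DedupingPartition()
results = [
    partition.append(producer_id=7, seq=0, record="order-created"),
    partition.append(producer_id=7, seq=0, record="order-created"),  # retry after a lost ack
    partition.append(producer_id=7, seq=1, record="order-paid"),
]
print(results)        # [True, False, True]
print(partition.log)  # ['order-created', 'order-paid']
```

The retry is safe because the sequence number identifies the write itself rather than its contents, so two genuinely distinct records with identical payloads would still both be appended.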

Key Points
  • At-most-once: fire-and-forget. Fastest, simplest, but messages may be lost. Use for non-critical telemetry, real-time dashboards, and metrics where occasional data loss is acceptable.
  • At-least-once: retry until acknowledged. Messages are never lost, but duplicates are possible. This is the default for SQS Standard, Kafka (without EOS), and RabbitMQ with manual ack. Requires idempotent consumers.
  • Exactly-once: each message processed exactly once. Implemented as at-least-once delivery + idempotent processing (deduplication). Kafka EOS provides this within the Kafka boundary via idempotent producers, transactional writes, and consumers reading with read_committed isolation.
  • The Two Generals' Problem proves that exactly-once delivery is theoretically impossible over unreliable networks. What systems call 'exactly-once' is technically 'effectively-once' -- at-least-once delivery with idempotent application of side effects.
  • Consumer acknowledgment strategy is critical: auto-ack before processing (at-most-once, may lose messages), ack after processing (at-least-once, may duplicate if ack fails), or transactional ack with idempotency (effectively exactly-once).
  • End-to-end exactly-once requires idempotency at every boundary: producer → broker (idempotent producer), broker → consumer (consumer-side dedup), and consumer → external system (application-level idempotency keys).
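The effect of acknowledgment ordering can be sketched with a toy queue and a simulated consumer crash. Everything here (`Queue`, `Crash`, `consume_once`, `run`) is a hypothetical illustration rather than any broker's API; the only assumption is that a message stays queued until it is acknowledged.

```python
class Queue:
    """Toy broker queue: a message stays queued until it is acknowledged."""

    def __init__(self, messages):
        self.pending = list(messages)

    def peek(self):
        return self.pending[0] if self.pending else None

    def ack(self, message):
        self.pending.remove(message)


class Crash(Exception):
    """Stands in for the consumer process dying."""


def consume_once(queue, processed, ack_first, crash):
    """Handle one message; `crash` simulates a failure between the two steps."""
    message = queue.peek()
    if message is None:
        return
    if ack_first:
        queue.ack(message)
        if crash:
            raise Crash  # acked but never processed: the message is lost
        processed.append(message)
    else:
        processed.append(message)
        if crash:
            raise Crash  # processed but never acked: it will be redelivered
        queue.ack(message)


def run(ack_first):
    """Crash on the first attempt, then restart and drain the queue."""
    queue, processed = Queue(["charge-50"]), []
    try:
        consume_once(queue, processed, ack_first, crash=True)
    except Crash:
        pass
    while queue.peek() is not None:
        consume_once(queue, processed, ack_first, crash=False)
    return processed


print(run(ack_first=True))   # []
print(run(ack_first=False))  # ['charge-50', 'charge-50']
```

Neither ordering alone gives exactly-once: the same crash produces a loss in one case and a duplicate in the other, which is why the third strategy pairs ack-after-processing with idempotency.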
Simple Example

Payment Processing: Why Delivery Semantics Matter

A payment service receives a 'charge $50' message. With at-most-once: if the message is lost, the customer is never charged (lost revenue). With at-least-once: if the consumer processes the charge but crashes before acknowledging, the message is redelivered and the customer is charged twice ($100). With exactly-once (idempotent processing): the consumer records the payment_id together with the charge, either in the same database transaction or by passing it to the payment provider as an idempotency key. On redelivery, the payment_id already exists, so the duplicate is skipped. The customer is charged exactly $50.
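The idempotent path in this example might look roughly like the sketch below, which uses Python's built-in `sqlite3` module. The `payments` table, its columns, and the `handle_charge` function are assumptions made for illustration, and the "charge" is modeled as the ledger row itself; a charge made through an external provider would also need the payment_id passed along as an idempotency key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (payment_id TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)


def handle_charge(conn, payment_id, amount_cents):
    """Apply a charge at most once per payment_id.

    Returns True if the charge was applied, False if it was a duplicate.
    """
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO payments (payment_id, amount_cents) VALUES (?, ?)",
                (payment_id, amount_cents),
            )
    except sqlite3.IntegrityError:
        return False  # payment_id already recorded: skip the redelivery
    return True


first = handle_charge(conn, "pay_123", 5000)
second = handle_charge(conn, "pay_123", 5000)  # redelivered message
total = conn.execute("SELECT SUM(amount_cents) FROM payments").fetchone()[0]
print(first, second, total)  # True False 5000
```

Relying on the primary-key constraint, rather than a separate existence check followed by an insert, avoids a race between two consumers handling the same redelivered message at the same time.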

Real-World Examples

Stripe

Stripe uses at-least-once delivery for webhook events and provides idempotency keys for API requests. When a payment event is sent to a merchant's webhook endpoint, Stripe retries on failure for up to 3 days. Merchants must make their webhook handlers idempotent (e.g., checking if the payment_intent_id was already processed) because the same event may be delivered multiple times.

Apache Kafka

Kafka 0.11 introduced exactly-once semantics (EOS) via idempotent producers (producer ID + sequence number deduplicate retried sends) and transactions, with consumers reading in read_committed isolation so they only see committed records. Confluent benchmarks reportedly show EOS adding only 3-5% overhead vs at-least-once, although the actual cost depends on message size and how often transactions commit. However, EOS only applies within the Kafka ecosystem -- external system writes require application-level idempotency.

Google Cloud Pub/Sub

Google Cloud Pub/Sub provides at-least-once delivery by default. Messages are redelivered if not acknowledged within the ack deadline. Google Dataflow (an Apache Beam runner) builds exactly-once processing on top by checkpointing processing state and output together, using Pub/Sub's message_id for deduplication.

Trade-Offs
| Aspect | Description |
| --- | --- |
| Reliability vs Performance | At-most-once is fastest (no retries, no ack waiting). At-least-once adds retry latency and acknowledgment overhead. Exactly-once adds transactional coordination. Each step up in reliability costs throughput and latency. For 1M msg/sec telemetry, at-most-once may be the right choice. For payment processing at 1K msg/sec, exactly-once is worth the overhead. |
| Simplicity vs Correctness | At-most-once requires no deduplication logic. At-least-once requires idempotent consumers (checking for duplicate IDs). Exactly-once may require distributed transactions or outbox patterns. The implementation complexity increases with each guarantee level. |
| End-to-End vs Component-Level Guarantees | A system is only as strong as its weakest link. Kafka EOS within Kafka means nothing if the consumer writes to PostgreSQL without an idempotency check. True end-to-end exactly-once requires idempotency at every system boundary, including external APIs, databases, and third-party services. |
| Consumer Acknowledgment Timing | Ack before processing: at-most-once (fast but may lose). Ack after processing: at-least-once (safe but may duplicate). Ack in the same transaction as processing: exactly-once (safest but slowest and most complex). The choice depends on whether losing or duplicating messages is more dangerous for your use case. |
Case Study

Uber's Exactly-Once Event Processing Pipeline

Scenario

Uber processes billions of trip events daily for pricing, ETA calculations, and driver payments. Early systems used at-least-once delivery with Kafka, leading to duplicate events that caused incorrect payment calculations. A trip_completed event processed twice would double-charge the rider or double-pay the driver. Manual reconciliation required an entire team and still missed edge cases.

Solution

Uber built an exactly-once processing framework on top of Kafka. Each event carries a globally unique event_id. Consumers write the event_id to a deduplication table in the same database transaction as the business logic (e.g., updating the trip record and inserting a payment). If the event_id already exists, the transaction is a no-op. The deduplication table is periodically pruned of entries older than the Kafka retention period. For cross-service calls, idempotency keys are propagated through the call chain.
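The pattern described here, writing the event_id and the business change in one transaction, can be sketched with `sqlite3` as follows. This is an illustration of the general technique only, not Uber's code: the `processed_events` and `driver_balance` tables, the function names, and the `fail` flag used to simulate a crash are all hypothetical.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_events (event_id TEXT PRIMARY KEY, seen_at REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE driver_balance (driver_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO driver_balance VALUES ('d1', 0)")
    conn.commit()
    return conn


def apply_trip_completed(conn, event_id, driver_id, fare_cents, now, fail=False):
    """Record the event_id and pay the driver in one transaction.

    Returns False when the event was already processed. `fail` simulates a
    crash in the business logic after the dedup row has been written.
    """
    try:
        with conn:  # one transaction: both writes commit, or neither does
            conn.execute("INSERT INTO processed_events VALUES (?, ?)", (event_id, now))
            if fail:
                raise RuntimeError("simulated crash mid-transaction")
            conn.execute(
                "UPDATE driver_balance SET cents = cents + ? WHERE driver_id = ?",
                (fare_cents, driver_id),
            )
    except sqlite3.IntegrityError:
        return False  # duplicate event_id: no-op
    return True


def prune(conn, now, retention_seconds):
    """Drop dedup rows older than the (assumed) broker retention period."""
    with conn:
        conn.execute(
            "DELETE FROM processed_events WHERE seen_at < ?",
            (now - retention_seconds,),
        )


conn = open_db()
try:
    apply_trip_completed(conn, "evt-1", "d1", 1800, now=0.0, fail=True)
except RuntimeError:
    pass  # rolled back: the dedup row is gone too, so the retry is not skipped
retry = apply_trip_completed(conn, "evt-1", "d1", 1800, now=1.0)
duplicate = apply_trip_completed(conn, "evt-1", "d1", 1800, now=2.0)
balance = conn.execute(
    "SELECT cents FROM driver_balance WHERE driver_id = 'd1'"
).fetchone()[0]
print(retry, duplicate, balance)  # True False 1800
```

Keeping both writes in one transaction matters for the failure case: if the dedup row were committed separately and the business update then failed, the redelivered event would be skipped and the payment lost.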

Outcome

Payment accuracy improved from 99.7% to 99.99%. The reconciliation team was disbanded. The deduplication framework was generalized into an internal library used by 200+ services. The performance overhead was minimal: a single SELECT before each INSERT, with the event_id indexed. The insight was that 'exactly-once' is not a messaging feature -- it is an application-level concern that requires cooperation between the messaging system and the business logic.

Common Mistakes
  • Assuming the message broker handles exactly-once end-to-end. Even Kafka's EOS only guarantees exactly-once within Kafka. The moment you write to an external database, call an HTTP API, or send an email, you need application-level idempotency.
  • Using auto-acknowledge (ack on receive) for critical messages. This gives at-most-once semantics: if the consumer crashes after ack but before processing, the message is lost. Always use manual acknowledgment after successful processing.
  • Not handling consumer rebalancing in Kafka. When partitions are reassigned, the new consumer may re-read messages that the previous consumer processed but did not commit offsets for. Without idempotent processing, this causes duplicates.
  • Implementing deduplication with an in-memory set. If the consumer restarts, the dedup set is lost and duplicates pass through. Store deduplication keys in a persistent store (database, Redis with persistence) with a TTL matching the message retention period.
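The rebalancing mistake above can be reproduced with a small offset-commit simulation. The `consume` and `replay` functions are hypothetical and model only the essentials: a consumer commits offsets periodically, loses its partition before committing its latest work, and a second consumer resumes from the last committed offset. The `seen` set is a stand-in for a persistent deduplication store and is kept in memory here only to keep the sketch short.

```python
def consume(log, start, process, commit_every, stop_after=None):
    """Read `log` from offset `start`, committing every `commit_every` records.

    `stop_after` simulates losing the partition after that many records,
    with no further commit. Returns the last committed offset.
    """
    committed = start
    handled = 0
    for offset in range(start, len(log)):
        process(log[offset])
        handled += 1
        if handled % commit_every == 0:
            committed = offset + 1  # next offset to read after a restart
        if handled == stop_after:
            break
    return committed


def replay(process):
    """Consumer A handles three records, then consumer B takes over."""
    log = ["e0", "e1", "e2", "e3", "e4"]
    committed = consume(log, 0, process, commit_every=2, stop_after=3)
    consume(log, committed, process, commit_every=2)


# Non-idempotent processing: e2 was handled but not committed, so it repeats.
naive = []
replay(naive.append)

# Idempotent processing: skip records whose ID has already been applied.
seen = set()  # stand-in for a persistent dedup table
applied = []


def apply_once(record):
    if record not in seen:
        seen.add(record)
        applied.append(record)


replay(apply_once)

print(naive)    # ['e0', 'e1', 'e2', 'e2', 'e3', 'e4']
print(applied)  # ['e0', 'e1', 'e2', 'e3', 'e4']
```

Committing more often narrows the window of re-read records but does not close it, so the idempotency check is still needed.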
Test Your Understanding

1. Why is true exactly-once delivery theoretically impossible?

2. What is the most common production approach to achieving effectively exactly-once processing?
