
SQS, SNS & EventBridge

Amazon SQS (Simple Queue Service), SNS (Simple Notification Service), and EventBridge are three AWS messaging services that serve different roles. SQS is a fully managed message queue for point-to-point work distribution. SNS is a pub/sub notification service for fan-out. EventBridge is a serverless event bus with content-based routing. Understanding when to use each -- and how to combine them -- is essential for AWS-based architectures.

Overview

AWS provides three messaging services that solve different problems and are often combined in production architectures.

**Amazon SQS** (launched 2006) is a fully managed message queue. Messages are stored durably for up to 14 days (4 days by default). Consumers poll the queue, process messages, and delete them. Standard queues offer at-least-once delivery with best-effort ordering and nearly unlimited throughput. FIFO queues guarantee exactly-once processing and strict ordering within a message group, but by default are limited to 300 API calls/sec per action, which is 3,000 messages/sec with batches of 10 (high-throughput mode raises this limit). SQS is ideal for decoupling producers from consumers when each message should be processed by exactly one worker.

**Amazon SNS** (launched 2010) is a fully managed pub/sub service. A publisher sends a message to an SNS topic, and SNS delivers it to all subscribers: SQS queues, Lambda functions, HTTP endpoints, email addresses, or mobile push notifications. SNS has no message retention -- if delivery fails and retries are exhausted, the message is lost (unless a dead-letter queue is configured). SNS is ideal for fan-out: one event triggers multiple downstream actions.

**Amazon EventBridge** (launched 2019, evolved from CloudWatch Events) is a serverless event bus with rich content-based routing. Events are JSON objects with a standard envelope (source, detail-type, detail). Rules match events based on field values in the event body -- not just the topic name -- and route matched events to targets (Lambda, SQS, SNS, Step Functions, API Gateway, and 20+ AWS services). EventBridge supports schema registries, cross-account event sharing, and archive and replay of past events with a configurable retention period. By default it retries delivery to a failing target for up to 24 hours. It is ideal for event-driven architectures where routing logic depends on event content.
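To make content-based routing concrete, here is a minimal sketch of an event pattern and a simplified local matcher. The pattern uses EventBridge's JSON pattern syntax (leaf values are lists, numeric comparisons use a `numeric` matcher), but the `matches` function is an illustration written for this article, not the service's implementation, and it covers only exact-value and numeric matching. The source name `com.example.orders` and the `region` and `amount` fields are assumptions.

```python
from typing import Any

# Event pattern in EventBridge's JSON syntax: every leaf is a list of
# allowed values or matcher objects, and all listed fields must match.
HIGH_VALUE_US_ORDER = {
    "source": ["com.example.orders"],  # hypothetical source name
    "detail-type": ["OrderCreated"],
    "detail": {
        "region": ["US"],                      # hypothetical field
        "amount": [{"numeric": [">", 1000]}],  # hypothetical field
    },
}

_OPS = {
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    "=": lambda a, b: a == b,
}


def _leaf_matches(value: Any, candidates: list) -> bool:
    """True if the value satisfies at least one candidate in the list."""
    for cand in candidates:
        if isinstance(cand, dict) and "numeric" in cand:
            spec = cand["numeric"]  # e.g. [">", 0, "<=", 5]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and all(
                _OPS[op](value, bound) for op, bound in zip(spec[::2], spec[1::2])
            ):
                return True
        elif value == cand:
            return True
    return False


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: nested dicts recurse, leaf lists are OR-ed."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif not _leaf_matches(event[key], expected):
            return False
    return True
```

A rule carrying this pattern would forward only US orders above 1,000 to its target; every other order event on the bus is ignored by that rule.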

The canonical patterns:
  • **Simple task queue**: SQS alone. Producer enqueues work, pool of workers dequeue and process.
  • **Fan-out**: SNS topic → multiple SQS queues. Each queue is consumed by a different service.
  • **Event-driven microservices**: EventBridge bus. Each service publishes events to the bus. Rules route events to the correct consumers based on event content.
  • **Hybrid**: EventBridge for routing, delivering to SQS queues for buffering and retry.

Key Points
  • SQS Standard queues offer at-least-once delivery, best-effort ordering, and nearly unlimited throughput. FIFO queues offer exactly-once processing and strict ordering within message groups, limited by default to 3,000 msg/sec with batching (300 API calls/sec without).
  • SNS delivers to all subscribers simultaneously (fan-out). Subscriber types include SQS, Lambda, HTTP/S, email, and SMS. No message retention -- undelivered messages are lost unless a DLQ is configured.
  • EventBridge routes events based on content (field matching in the JSON body), not just topic. This enables fine-grained routing: 'route order events over $1,000 from region=US to the fraud-check Lambda.'
  • SNS → SQS fan-out is the standard AWS pattern for event-driven microservices: SNS provides fan-out, each SQS queue provides per-service buffering, retry, and dead-letter handling.
  • EventBridge supports cross-account event sharing via event buses, enabling organization-wide event meshes. SNS and SQS also work across accounts, but through resource-based policies on each individual topic or queue.
  • Cost model differs: SQS charges per API call (SendMessage, ReceiveMessage, DeleteMessage). SNS charges per publish + per delivery. EventBridge charges per event ingested. For high-throughput point-to-point workloads, SQS is usually cheapest; for complex routing, EventBridge saves development time.
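The cost comparison can be worked through with simple arithmetic. The sketch below is illustrative only: the per-million prices are assumptions chosen for the example (check the current AWS price list for your region), batching and free tiers are ignored, and SNS-to-SQS deliveries are assumed to carry no delivery charge. All function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """USD per million units. Illustrative assumptions, not a price list."""
    sqs_request: float = 0.40
    sns_publish: float = 0.50
    eventbridge_event: float = 1.00


def sqs_only_cost(messages: int, p: Prices = Prices()) -> float:
    """Send + receive + delete = 3 API calls per message, no batching."""
    return 3 * messages / 1_000_000 * p.sqs_request


def sns_fanout_cost(messages: int, queues: int, p: Prices = Prices()) -> float:
    """One publish per message, then receive + delete on each subscribed queue.

    Assumes SNS-to-SQS deliveries themselves are not billed.
    """
    publish = messages / 1_000_000 * p.sns_publish
    consume = 2 * messages * queues / 1_000_000 * p.sqs_request
    return publish + consume


def eventbridge_to_sqs_cost(messages: int, queues: int, p: Prices = Prices()) -> float:
    """One ingested event per message, then receive + delete on each target queue."""
    ingest = messages / 1_000_000 * p.eventbridge_event
    consume = 2 * messages * queues / 1_000_000 * p.sqs_request
    return ingest + consume
```

Under these assumed prices, 10 million messages cost about $12 through a single queue, about $29 through an SNS fan-out to three queues, and about $34 when EventBridge routes to the same three queues. In the fan-out cases the per-queue consumption dominates, so the routing layer is a small part of the bill.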
Simple Example

Order Processing Pipeline

When a customer places an order, the order service publishes an event with detail-type 'OrderCreated' to an EventBridge bus. An EventBridge rule matches on that detail-type and routes the event to an SNS topic. The SNS topic fans out to three SQS queues: one for the payment service (captures payment), one for the inventory service (reserves stock), and one for the notification service (sends confirmation email). Each service independently polls its SQS queue, processes the message, and deletes it. If the payment service is temporarily down, messages accumulate in its SQS queue (up to 14 days) and are processed when it recovers.
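The fan-out and buffering behaviour of this pipeline can be sketched with in-memory stand-ins. This is a toy model for illustration, not AWS SDK code: `Topic`, `Queue`, and `build_pipeline` are hypothetical names, and real SNS and SQS add retries, visibility timeouts, and delivery policies that are left out here.

```python
import json
from collections import deque


class Queue:
    """Stand-in for an SQS queue: buffers messages until a consumer polls."""

    def __init__(self, name: str):
        self.name = name
        self._messages = deque()

    def send(self, body: str) -> None:
        self._messages.append(body)

    def receive(self):
        """Return the oldest buffered message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

    def __len__(self) -> int:
        return len(self._messages)


class Topic:
    """Stand-in for an SNS topic: copies each publish to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, queue: Queue) -> None:
        self._subscribers.append(queue)

    def publish(self, message: dict) -> None:
        body = json.dumps(message)
        for queue in self._subscribers:
            queue.send(body)


def build_pipeline():
    """Wire one topic to the three per-service queues from the example."""
    topic = Topic()
    queues = {name: Queue(name) for name in ("payment", "inventory", "notification")}
    for queue in queues.values():
        topic.subscribe(queue)
    return topic, queues
```

If the inventory and notification consumers drain their queues while the payment consumer is down, the payment queue simply keeps its copies until that consumer starts polling again; the other two services are unaffected.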

Real-World Examples

Airbnb

Airbnb uses SNS → SQS fan-out extensively. When a booking is confirmed, an SNS topic delivers the event to SQS queues for the pricing service, calendar sync, host notification, guest notification, cleaning scheduler, and analytics pipeline. Each service processes at its own pace with independent retry policies.

Capital One

Capital One uses EventBridge as their enterprise event bus across hundreds of microservices. Content-based routing rules direct events to the right consumers without centralized orchestration. Cross-account event buses share events between teams while maintaining security boundaries via IAM policies.

iRobot

iRobot uses SQS FIFO queues to process commands to Roomba robots. Each robot has a message group ID, ensuring commands are processed in order. FIFO's exactly-once semantics prevent duplicate command execution. The 3,000 msg/sec FIFO limit is sufficient because each robot receives only a few commands per cleaning session.

Trade-Offs
| Aspect | Description |
| --- | --- |
| Simplicity vs Routing Power | SQS is simplest: put messages in, take messages out. SNS adds fan-out with simple topic-based routing. EventBridge adds content-based routing with JSON event patterns. Each layer of routing power adds architectural complexity and learning curve. Use the simplest service that meets your requirements. |
| Throughput vs Ordering | SQS Standard queues offer nearly unlimited throughput with best-effort ordering. FIFO queues guarantee strict ordering and exactly-once processing but by default cap at 3,000 msg/sec with batching. Standard SNS topics have no ordering guarantee (SNS FIFO topics exist for ordered fan-out). EventBridge does not guarantee event ordering. |
| Retention and Replay | SQS retains messages up to 14 days. EventBridge can archive events with a configurable retention period and replay them onto a bus. SNS has zero retention -- delivery is immediate or lost (without DLQ). For long-term audit trails or arbitrary reprocessing, Kafka or an S3-backed event store is usually a better fit. None of these services replace a durable event log. |
| Vendor Lock-in vs Managed Simplicity | SQS, SNS, and EventBridge are proprietary AWS services with no drop-in open-source equivalents. Using them deeply couples your architecture to AWS. The trade-off is very low operational overhead: no brokers to manage, no ZooKeeper, automatic scaling, and no capacity to provision. For multi-cloud strategies, Kafka or NATS is more portable. |
Case Study

BBC Online: Migrating from Monolithic Queue to Event-Driven Architecture

Scenario

BBC Online processed content updates (articles, videos, schedules) through a single SQS queue consumed by a monolithic worker. As the number of content types grew, the worker became a bottleneck: a slow video transcoding task blocked article publishing. Adding new consumers required modifying the monolith. Deployment risk was high because all processing ran in one service.

Solution

BBC migrated to an EventBridge + SNS + SQS architecture. Content events are published to EventBridge with detail-type indicating the content type (article, video, schedule). EventBridge rules route each type to dedicated SNS topics. Each SNS topic fans out to service-specific SQS queues. Video transcoding, article rendering, schedule updates, and search indexing each run as independent services with their own SQS queues, scaling profiles, and retry policies.

Outcome

Article publishing latency dropped from minutes (blocked behind video jobs) to seconds. Each content processing service scales independently: video transcoding runs on GPU instances while article rendering runs on small Lambda functions. Deploying a new video encoder no longer risks breaking article publishing. Adding a new consumer (e.g., push notifications) requires only a new EventBridge rule and SQS queue -- no changes to existing services.

Common Mistakes
  • Using SNS alone without SQS for critical workflows. SNS has no message retention. If the subscriber is temporarily unavailable and retries fail, the message is permanently lost. Always pair SNS with SQS (SNS → SQS) for durability, or configure an SNS dead-letter queue.
  • Using SQS Standard when you need exactly-once processing. Standard queues deliver messages at-least-once, meaning your consumer may receive duplicates. Either use FIFO queues for exactly-once, or make your consumer idempotent.
  • Over-engineering with EventBridge when SNS suffices. If your routing is purely topic-based (all subscribers of 'orders' get all order events), SNS is simpler and cheaper. Use EventBridge only when you need content-based routing (filtering by fields inside the event body).
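  • Ignoring SQS visibility timeout tuning. If a consumer takes longer than the visibility timeout to process a message, SQS makes it visible again and another consumer processes it -- causing duplicate work. Set the visibility timeout comfortably above your worst-case processing time, or extend it during long jobs with ChangeMessageVisibility. When Lambda is the consumer, AWS recommends at least 6x the function timeout.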
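The duplicate-delivery and visibility-timeout mistakes are easier to see in a toy model. The sketch below simulates a queue with an explicit clock and pairs it with an idempotent worker. `InMemoryQueue` and `IdempotentWorker` are hypothetical names for this illustration; real SQS issues a fresh receipt handle on every receive, and a production worker would keep its set of seen IDs in a durable store rather than in memory.

```python
import itertools


class InMemoryQueue:
    """Toy model of SQS visibility timeout, driven by an explicit clock."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # handle -> (body, time at which it is visible again)
        self._ids = itertools.count()

    def send(self, body: dict) -> None:
        self._messages[next(self._ids)] = (body, 0.0)

    def receive(self, now: float):
        """Return (handle, body) for the first visible message and hide it."""
        for handle, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[handle] = (body, now + self.visibility_timeout)
                return handle, body
        return None

    def delete(self, handle: int) -> None:
        self._messages.pop(handle, None)


class IdempotentWorker:
    """Skips work for message IDs it has already handled."""

    def __init__(self):
        self.seen = set()
        self.side_effects = []

    def handle(self, body: dict) -> bool:
        """Process the message once; return False for a duplicate delivery."""
        if body["id"] in self.seen:
            return False
        self.seen.add(body["id"])
        self.side_effects.append(body["id"])
        return True
```

With a 30-second timeout, a message received at t=0 and not yet deleted becomes visible again at t=30, so a second `receive` at t=45 returns the same body. The idempotent worker turns that second delivery into a no-op instead of a second payment capture or email.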
Test Your Understanding

1. What is the key advantage of the SNS → SQS fan-out pattern over direct SNS → Lambda?

2. When should you choose EventBridge over SNS?
