Medium8 componentsInterview: Very High

Ad Click Aggregator — Stream Processing (Kafka + Windowed Aggregation)

Q: Why Kafka instead of a simpler queue like SQS or RabbitMQ?

Kafka provides three features critical for ad click aggregation: (1) durable, ordered storage with configurable retention (7 days for replay), (2) consumer groups for horizontal scaling of stream processors, and (3) partition-level ordering guarantees that enable local aggregation without distributed coordination. SQS lacks ordering guarantees and retention-based replay. RabbitMQ provides ordering but has lower throughput ceilings and no built-in partition-level consumer scaling.

Q: What happens during a Kafka consumer rebalance?

When a stream processor instance joins or leaves the consumer group, Kafka triggers a rebalance that reassigns partitions across the remaining instances. During rebalance (30-60 seconds), reassigned partitions have no active consumer — events accumulate in Kafka but are not lost. After rebalance, the new consumer resumes from the last committed offset. The impact is a brief increase in dashboard freshness lag, not data loss.

Q: How do 1-minute tumbling windows affect billing accuracy?

Tumbling windows guarantee that every event is counted exactly once within its window — no double-counting from overlapping windows. However, events near window boundaries may appear in the 'wrong' minute (e.g., a click at 12:00:59.999 might be counted in the 12:01 window if processing is delayed). For billing, this minute-level granularity is usually acceptable. For exact billing reconciliation, the Lambda Architecture variant adds a batch layer that recomputes totals from raw events.

Q: How does the stream processor handle late-arriving events?

Events that arrive after their window has closed (e.g., due to network delays) are either dropped or assigned to a special late bucket, depending on configuration. In this template, late events within a 30-second grace period are included in the next window's output with a correction flag. Events arriving more than 30 seconds late are logged but not aggregated, trading accuracy for bounded state size. The batch reconciliation layer in the Lambda variant catches these gaps.

Q: Can you use Kafka Streams or Flink instead of a custom consumer?

Yes, and in production you typically would. Kafka Streams provides built-in windowing, state stores, and exactly-once processing. Apache Flink adds more advanced windowing (session, event-time), watermarks for late-event handling, and checkpoint-based fault tolerance. The custom consumer in this template illustrates the core concepts without framework-specific abstractions, making the design decisions more visible for learning.

Real-time ad click aggregation using Kafka for async event ingestion and a stream processor for windowed aggregation. Sub-5ms click acknowledgment with horizontally scalable throughput.

KafkaStream ProcessingReal-timeAdTech

Try in Simulator

Problem Statement

Stream processing is the industry-standard approach to ad click aggregation, and understanding why requires grasping the fundamental mismatch between click ingestion and aggregation workloads. Click ingestion is a high-throughput, low-latency write workload — millions of events per day, each requiring sub-10ms acknowledgment so the user's browser does not hang. Aggregation is a compute-intensive read workload — summing, grouping, and windowing events into campaign-level metrics that dashboards can query instantly.

The naive approach tries to serve both workloads from a single synchronous database, which creates a fundamental resource contention problem. Stream processing solves this by introducing a message queue (Kafka) as a buffer between the write path and the compute path. Clicks are acknowledged the moment they are written to Kafka (3-5ms), regardless of how long downstream aggregation takes. The stream processor consumes events from Kafka and produces aggregated results — decoupling the two workloads entirely.

This is the architecture used by Google Ads (Millwheel/Dataflow), Meta Ads (streaming aggregation on Kafka), Amazon Advertising, and Twitter Ads. The stream processing pattern appears in virtually every system design interview for adtech roles because it demonstrates several key distributed systems concepts: event-driven architecture, windowed aggregation (tumbling, sliding, session windows), consumer group scaling, backpressure handling, and exactly-once processing semantics.

The specific design decisions in this template — 1-minute tumbling windows, Kafka partitioned by campaign_id, Redis as a fast-path cache for hot campaigns — represent the standard industry approach. Each decision has a clear rationale and trade-off that interview candidates are expected to articulate. Why tumbling windows instead of sliding? Because tumbling windows produce exactly one output per window period, while sliding windows produce overlapping outputs that increase downstream write volume. Why partition by campaign_id? Because it ensures all clicks for one campaign are processed by the same consumer, enabling local aggregation without distributed coordination.

The template also demonstrates the primary limitation of stream processing: eventual consistency. Dashboards reflect clicks that have been processed through the current window, which introduces a lag equal to the window duration (1 minute by default). For billing-critical applications where exact counts matter, this lag may be unacceptable — motivating the Lambda Architecture variant with its batch reconciliation layer.

Architecture Overview

The stream processing ad click aggregator splits the data flow into three phases: ingestion (fast), processing (windowed), and serving (cached). Eight components work together: Client, Load Balancer, Click Ingestion Service, Kafka, Stream Processor (Worker), Aggregation Database, Redis Cache, and Query Service.

The ingestion path is optimized for minimum latency. Clicks arrive at the Load Balancer and are routed to the Click Ingestion Service, which performs minimal validation (required fields, timestamp sanity) and writes the event to a Kafka topic partitioned by campaign_id. The Kafka produce call with acks=1 completes in 3-5ms, and the service returns HTTP 202 Accepted immediately. This decouples click acknowledgment from downstream processing — the user's browser gets a response in under 10ms regardless of aggregation load.

Kafka serves as the durable buffer and ordering layer. The clicks topic is partitioned by campaign_id, ensuring all clicks for a given campaign land on the same partition and are consumed by the same stream processor instance. This partition-level ordering guarantee enables the stream processor to maintain local aggregation state per campaign without distributed coordination (no distributed locks, no cross-partition joins). Kafka retention is configured for 7 days, allowing replay if the stream processor needs to reprocess events.

The Stream Processor (implemented as a Kafka consumer group) reads events from Kafka and performs windowed aggregation. It maintains an in-memory accumulator for each campaign in the current 1-minute tumbling window. When the window closes, the processor emits the aggregated result — total clicks, unique users, click-through rate — and writes it to the Aggregation Database (a time-series optimized store). Tumbling windows were chosen over sliding windows because they produce exactly one output per window period, reducing write amplification to the database.

The Query Service handles dashboard reads. It first checks Redis for cached aggregates (hot campaigns are cached for 60 seconds), falling back to the Aggregation Database for cold queries. Redis stores the latest 5-minute rollup for the top 1,000 campaigns by traffic volume, achieving a 70-80% cache hit rate for dashboard queries. The Query Service also supports time-range queries (last hour, last day) by reading directly from the Aggregation Database's time-series indexes.

Horizontal scaling is achieved by adding Kafka partitions and stream processor instances. With 16 partitions and 16 consumer instances, the system can process approximately 50K events per second. Adding partitions requires a brief rebalance period (30-60 seconds) during which some partitions may temporarily have no active consumer, but no data is lost — events accumulate in Kafka and are processed when the rebalance completes. The Load Balancer and Click Ingestion Service scale independently via pod autoscaling.

The primary trade-off is dashboard freshness. Aggregated metrics are only available after the current window closes and the result is written to the database + cache. With 1-minute windows, dashboards lag behind real time by 30-90 seconds on average. This is acceptable for campaign monitoring but insufficient for billing — where the Lambda Architecture variant adds a batch reconciliation layer for exact counts.

Architecture Preview

Loading architecture preview...

Open in Simulator

Request Flow — Async Ingestion + Windowed Aggregation

The async ingestion pattern fundamentally changes the system's throughput characteristics by decoupling the click acknowledgment from the aggregation computation. Instead of waiting for two database writes (~28ms in the naive approach), the ingestion service writes one Kafka message (~3ms) and returns immediately. The stream processor consumes events in batches, accumulates them in memory during a 1-minute tumbling window, and flushes a single aggregated result per campaign per window — reducing per-event database cost by 100x.

The diagram shows two completely independent data paths: the write path (click → Kafka → stream processor → DB) and the read path (dashboard → Redis/DB). These paths share no resources, enabling independent scaling.

Loading diagram...

Step-by-Step Walkthrough

1Browser sends POST /click — the Ingestion Service validates required fields (campaign_id, ad_id) and rejects malformed requests immediately
2Ingestion Service produces a Kafka message with key=campaign_id, ensuring all clicks for one campaign land on the same partition for ordered processing
3Kafka acknowledges the write in ~3ms (acks=1 mode) — the event is now durably stored with 7-day retention for replay capability
4Ingestion Service returns HTTP 202 Accepted (~5ms total) — 5-6x faster than the naive approach's 28ms synchronous dual-write
5Stream Processor polls Kafka in batches (e.g., 500 events per poll), processing them as a group rather than individually
6Events are accumulated in an in-memory window buffer for the current 1-minute tumbling window — this is where the 100x write reduction happens (thousands of events → one aggregate row)
7When the window closes (every 60 seconds), the processor flushes aggregated results to the Aggregation Database and updates Redis for hot campaigns
8Dashboard queries go through a completely separate path: Query Service checks Redis first (sub-2ms for cached campaigns), falling back to the Aggregation DB (~8ms) on cache miss

Pseudocode

// WRITE PATH — Click Ingestion Service
async function handleClick(campaign_id, ad_id, user_id):
    event = { campaign_id, ad_id, user_id, event_time: now(), click_id: uuid() }

    // Single Kafka produce — 5-6x faster than dual DB writes
    await kafka.produce(
        topic = "ad-clicks",
        key   = campaign_id,     // Partition by campaign for ordered processing
        value = serialize(event),
        acks  = 1                // Leader ack only (3ms vs 10ms for acks=all)
    )
    return 202  // Accepted — not "processed", just "received"

// PROCESSING — Stream Processor (Kafka consumer group)
window_buffer = {}  // { campaign_id → { clicks: 0, users: Set() } }

function processEvent(event):
    cid = event.campaign_id
    if cid not in window_buffer:
        window_buffer[cid] = { clicks: 0, users: new Set() }
    window_buffer[cid].clicks += 1
    window_buffer[cid].users.add(event.user_id)

function flushWindow(window_start, window_end):
    // Batch INSERT — one row per campaign instead of one per click
    batch = []
    for (cid, agg) in window_buffer:
        batch.append({
            campaign_id: cid,
            window_start, window_end,
            click_count: agg.clicks,
            unique_users: agg.users.size
        })
    await db.batchInsert("windowed_aggregates", batch)

    // Push hot campaigns to Redis (top N by traffic)
    for top_campaign in batch.sortBy(clicks).take(1000):
        await redis.setex(
            "campaign:" + top_campaign.cid + ":latest",
            60,  // TTL 60 seconds
            serialize(top_campaign)
        )
    window_buffer.clear()

// READ PATH — Query Service
async function getDashboard(campaign_id):
    cached = await redis.get("campaign:" + campaign_id + ":latest")
    if cached: return deserialize(cached)   // ~1ms

    return await db.query(                  // ~8ms
        "SELECT * FROM windowed_aggregates
         WHERE campaign_id = $1
         ORDER BY window_start DESC LIMIT 60",
        [campaign_id]
    )

Data Model

The data model reflects a three-tier storage strategy, each tier optimized for a different access pattern. Kafka provides durable, ordered event storage with 7-day retention for replay. The Aggregation Database stores pre-computed per-window summaries — dramatically smaller than raw events. Redis provides sub-millisecond reads for the highest-traffic campaigns.

The key insight is the data reduction ratio: at 10K campaigns with 1K clicks/sec, Kafka stores 60K messages/minute while the Aggregation DB stores only 10K rows/minute (one per campaign per window) — a 6x reduction. Redis stores just 1K keys (top campaigns) — a 60x reduction from the Aggregation DB.

Loading diagram...

Step-by-Step Walkthrough

1Raw click events enter the Kafka topic keyed by campaign_id — the partition key ensures ordering per campaign and enables the stream processor to maintain local state without distributed coordination
2Each Kafka message includes a click_id for idempotency — Kafka's idempotent producer (enable.idempotence=true) prevents duplicate writes from producer retries at the network layer
3The stream processor reads from Kafka and accumulates events in memory during each 1-minute tumbling window, then flushes one aggregated row per campaign to windowed_aggregates
4windowed_aggregates stores the final output: click_count, unique_users, and CTR for each campaign in each 1-minute window — this is what dashboard queries read
5The time-series index on (window_start DESC) enables fast 'last N minutes' queries without scanning historical data
6Redis caches the latest rollup for the top 1K campaigns by traffic — populated push-based by the stream processor, not pull-based by the query service, so the cache is always warm for active campaigns

Pseudocode

// Data flow: Raw events → Aggregated windows → Cached hot campaigns

// Tier 1: Kafka (raw events, 7-day retention)
// Write volume: 1 message per click (60K/min at 1K clicks/sec)
// Read pattern: sequential consumption by stream processor
kafka.produce("ad-clicks", key=campaign_id, value={
    ad_id, user_id, publisher_id, event_time, click_id
})

// Tier 2: Aggregation DB (per-window summaries)
// Write volume: 1 row per campaign per minute (10K rows/min at 10K campaigns)
// That's 100x fewer writes than the naive approach (1 row per click)
INSERT INTO windowed_aggregates
    (campaign_id, window_start, window_end, click_count, unique_users, ctr)
VALUES ('camp_123', '2025-01-15 12:00', '2025-01-15 12:01', 847, 612, 0.034)

// Tier 3: Redis (hot campaign cache)
// Write volume: 1K keys refreshed every 60 seconds
// Read latency: <2ms vs ~8ms from Aggregation DB
SETEX "campaign:camp_123:latest" 60 '{"clicks":847,"users":612}'

// Dashboard query pattern (Query Service)
cached = GET "campaign:camp_123:latest"        // ~1ms, 70-80% hit rate
if not cached:
    SELECT * FROM windowed_aggregates          // ~8ms fallback
    WHERE campaign_id = 'camp_123'
    AND window_start >= now() - interval '1 hour'
    ORDER BY window_start DESC

Key Design Decisions

Kafka as Event Buffer

Choice

Apache Kafka with campaign_id partitioning

Rationale

Kafka decouples click ingestion from aggregation processing. The produce call completes in 3-5ms versus 20-30ms for synchronous database writes. Partitioning by campaign_id ensures all events for one campaign are on the same partition, enabling local aggregation without distributed coordination. The trade-off versus SQS or RabbitMQ is operational complexity (Zookeeper/KRaft, partition management) in exchange for higher throughput and replay capability.

Tumbling Windows

Choice

1-minute tumbling windows for aggregation

Rationale

Tumbling windows produce exactly one output per window period, minimizing write amplification to the database. Sliding windows (e.g., 1-minute window sliding every 10 seconds) produce 6x more outputs for the same data. Session windows would be ideal for user-level analytics but require per-user state tracking, which is impractical at millions of concurrent users. The 1-minute duration balances freshness (dashboards update every minute) with processing efficiency (batch writes to the database).

Redis Cache for Hot Campaigns

Choice

Redis with 60-second TTL for top campaigns

Rationale

Ad dashboard traffic follows a power-law distribution — the top 100 campaigns generate 80% of dashboard queries. Caching the latest aggregates for these campaigns in Redis reduces database query load by 70-80% and provides sub-2ms dashboard reads. The 60-second TTL ensures the cache is refreshed at least once per aggregation window, keeping displayed values reasonably fresh.

Separate Query Service

Choice

Dedicated service for dashboard reads

Rationale

Separating the query path from the ingestion path prevents dashboard traffic from affecting click ingestion performance. The Query Service can be scaled independently, applies its own caching strategy, and can serve stale data from the cache if the database is under maintenance. This separation is a core CQRS principle applied pragmatically.

No Deduplication in Stream

Choice

Rely on Kafka idempotent producer for basic dedup

Rationale

Kafka's idempotent producer (enable.idempotence=true) prevents duplicate writes from producer retries, handling the most common source of duplicates (network timeouts). Application-level dedup (tracking click_ids) requires a distributed set or Bloom filter, adding latency and memory overhead. For the standard tier, idempotent producer is a pragmatic 80/20 solution.

Scale & Performance

Target RPS

50K+ (horizontally scalable)

Latency (p99)

<5ms click ack, 30-90s dashboard lag

Storage

~500 GB/year (time-series optimized)

Availability

99.9% (Kafka replication + multi-pod services)

Time & Space Complexity

Operation	Time	Space	Notes
Click ingestion (Kafka produce)	O(1) — single Kafka produce with acks=1	O(1) — one Kafka message per click	Constant 3-5ms per click regardless of total volume. No database interaction on the write path.
Windowed aggregation (stream processor)	O(1) amortized — hash map increment per event, O(C) per window flush where C = active campaigns	O(C × U) — where C is campaigns in current window and U is unique users per campaign (for HyperLogLog or Set)	Memory-bound: at 10K campaigns with 100 unique users each, window state is ~50 MB. Flush cost is O(C) batch INSERT.
Dashboard query (cache hit)	O(1) — Redis GET by campaign key	O(1) — single cached aggregate object	Sub-2ms for cached campaigns. 70-80% hit rate for top 1K campaigns. Cache miss falls back to DB at O(log N).
Dashboard query (cache miss — DB fallback)	O(log N) — B-tree index scan on (campaign_id, window_start)	O(W) — where W is the number of windows in the query range	~8ms for last-hour queries. Index on window_start DESC enables efficient recent-data scans without touching old windows.

Database Schema (HLD)

clicks (Kafka topic)

Durable event stream partitioned by campaign_id. Each click is a Kafka message with key=campaign_id ensuring all clicks for one campaign land on the same partition — enabling the stream processor to maintain local aggregation state without distributed coordination. Retention is 7 days, allowing full replay if the stream processor needs to reprocess events after a bug fix or schema change.

key: campaign_idad_id TEXTuser_id TEXTpublisher_id TEXTevent_time TIMESTAMPclick_id TEXT (idempotency)

Partition: campaign_id

Kafka idempotent producer (enable.idempotence=true) prevents duplicate writes from producer retries. 16 partitions support ~50K events/sec with 16 consumer instances.

windowed_aggregates

Time-series table storing the output of 1-minute tumbling window aggregation. Each row represents the total clicks for one campaign in one 1-minute window. The stream processor flushes window results as batch INSERTs, reducing per-event database cost by 100x compared to the naive approach. Indexed by (campaign_id, window_start) for fast time-range dashboard queries.

campaign_id TEXT PKwindow_start TIMESTAMPTZ PKwindow_end TIMESTAMPTZclick_count BIGINTunique_users BIGINTctr FLOATprocessed_at TIMESTAMPTZ

Indexes: PK on (campaign_id, window_start), idx_window_time ON (window_start DESC)

Write volume: ~1 row per campaign per minute (vs 1 row per click in naive). At 10K campaigns, that is 10K rows/min vs 600K rows/min.

redis_cache

In-memory cache storing the latest 5-minute rollup for the top 1,000 campaigns by traffic volume. Provides sub-2ms dashboard reads for the most frequently accessed campaigns, achieving a 70-80% cache hit rate. TTL of 60 seconds ensures the cache refreshes at least once per aggregation window. Populated push-based by the stream processor, not pull-based by the query service.

key: campaign:{id}:latestclick_count BIGINTunique_users BIGINTttl: 60 seconds

Power-law distribution: top 100 campaigns generate 80% of dashboard queries. Redis memory footprint: ~50 MB for 1K cached campaigns.

Event Contracts

ClickEventad-clicks

Raw ad click events produced by the Ingestion Service. Partitioned by campaign_id for ordered, local aggregation by the stream processor.

Key Schema

campaign_id (STRING)

Value Schema

{ click_id: STRING, campaign_id: STRING, ad_id: STRING, user_id: STRING, publisher_id: STRING, event_time: TIMESTAMP }

AggregatedWindowResultad-click-aggregates

Aggregated per-campaign results emitted when a 1-minute tumbling window closes. Consumed by the Query Service for cache warming and downstream analytics.

Key Schema

campaign_id (STRING)

Value Schema

{ campaign_id: STRING, window_start: TIMESTAMP, window_end: TIMESTAMP, click_count: BIGINT, unique_users: BIGINT, ctr: FLOAT }

What-If Scenarios

Kafka broker goes down during peak traffic (1 of 3 brokers fails)

Impact

Partitions hosted on the failed broker become temporarily unavailable. Produce calls for those partitions fail or timeout (~30 seconds). Approximately 33% of click events experience errors until the Kafka controller reassigns partition leadership to surviving brokers (30-120 seconds). No data loss for events already committed.

Mitigation

Configure replication.factor=3 and min.insync.replicas=2 so the remaining brokers can serve partition leadership immediately. The Ingestion Service retries failed produces with exponential backoff. For V2 Lambda, the Object Storage writer provides a secondary path that continues during Kafka outages.

Stream processor consumer group rebalance during window flush

Impact

Active windows in memory on the reassigned partitions are lost — the current window's accumulated counts are discarded. Dashboard freshness degrades for affected campaigns: instead of 30-90 second lag, they show data from 2-3 minutes ago until the new window completes. No permanent data loss since events are still in Kafka.

Mitigation

Enable Kafka Streams changelog topics or implement local state store checkpointing so that on rebalance, the new consumer can resume the in-progress window from the checkpoint rather than starting empty. Cooperative rebalancing (incremental protocol) reduces rebalance frequency.

Hot campaign receives 10x average traffic (viral ad goes live)

Impact

All events for one campaign land on the same Kafka partition (partitioned by campaign_id). That partition's consumer processes 10x more events, creating a processing lag that grows over time. The hot campaign's dashboard data becomes stale while other campaigns remain fresh.

Mitigation

Implement sub-partitioning for hot campaigns: detect traffic skew and dynamically split the hot campaign across multiple partitions. Alternatively, increase the window flush frequency for hot campaigns. V3 CQRS handles this better with independent projections that can be scaled per-aggregate.

Redis cache node fails, all dashboard queries hit the Aggregation DB

Impact

Dashboard query latency jumps from ~2ms (cache hit) to ~8ms (DB read). At 5K dashboard QPS, the Aggregation DB receives a sudden 4x increase in read traffic. DB connection pool utilization spikes, potentially affecting the stream processor's write path if they share the same database.

Mitigation

Deploy Redis in cluster mode with automatic failover (ElastiCache Multi-AZ). The Query Service implements a circuit breaker that serves stale data from a local in-memory cache during Redis failover. Separate read and write connection pools on the Aggregation DB to isolate dashboard reads from stream processor writes.

Failure Modes & Resilience

Component	Failure	Impact	Mitigation
Kafka cluster	Partition leader unavailable	Produce calls for affected partitions timeout. Click events targeting those campaigns are not ingested. The Ingestion Service returns 503 to the client for affected requests until leadership is reassigned (30-120 seconds).	3-broker cluster with replication.factor=3. Controller reassigns leadership automatically. Ingestion Service retries with exponential backoff. Configure unclean.leader.election.enable=false to prevent data loss.
Stream Processor (consumer group)	Instance crash mid-window	In-memory window state for assigned partitions is lost. The consumer group rebalances, and the replacement instance starts a fresh window. Dashboard data for affected campaigns shows a gap for one window period (1 minute).	Enable changelog topics (Kafka Streams) or external checkpointing (Redis/RocksDB) for window state. On rebalance, the new consumer restores state from the changelog rather than starting empty.
Aggregation Database	Primary instance failover	Stream processor write path fails — window flush fails and retries accumulate. Dashboard queries also fail if no read replica exists. Data is not lost (events remain in Kafka), but dashboards go stale for the duration of the failover (30-60 seconds for RDS Multi-AZ).	RDS Multi-AZ with automatic failover. Stream processor implements retry with exponential backoff for flush operations. Query Service falls back to Redis cache during DB failover, serving slightly stale but available data.
Redis Cache	OOM / eviction storm	All dashboard queries bypass cache and hit the Aggregation DB directly. Read latency increases 4x (from ~2ms to ~8ms). If the DB cannot handle the sudden read traffic increase, both reads and writes degrade.	ElastiCache r7g.large with 13 GB memory (working set ~50 MB for 1K campaigns). Set maxmemory-policy allkeys-lru. Monitor eviction rate — alert if >100 evictions/sec. Size at 2x working set to absorb traffic spikes.

Scaling Strategy

Kafka: add partitions to increase parallelism (16 → 32 → 64 partitions). Each partition supports ~3K events/sec, so 64 partitions = ~200K RPS theoretical ceiling. Stream Processor: one instance per Kafka partition (auto-scale pod count to match partition count). Ingestion Service: CPU-based auto-scaling (target 60% utilization), scale from 4 to 16 pods. Query Service: request-rate auto-scaling, scale from 3 to 12 pods. Redis: vertical scaling (cache.r7g.large → cache.r7g.xlarge) or cluster mode for sharding. Aggregation DB: read replicas for query scaling, vertical scaling for write throughput. Ceiling: ~200K RPS with 64 partitions; beyond this, V2 Lambda adds the batch layer.

Monitoring & Alerting

Key metrics: (1) Kafka consumer lag per partition — alert if lag >10K messages sustained for 5 minutes (indicates stream processor cannot keep up). (2) Click ingestion p99 latency — alert if >10ms (normal is 3-5ms). (3) Window flush duration — alert if >5 seconds (normal is 100-500ms for batch INSERT). (4) Redis cache hit rate — alert if <60% (target is 70-80%). (5) Aggregation DB connections — alert at 70% of max. (6) Kafka broker disk usage — alert at 75% (7-day retention fills fast at high throughput). (7) Consumer rebalance frequency — alert if >2 rebalances per hour (indicates instability). Dashboard: Grafana with panels for consumer lag heatmap per partition, ingestion RPS, window flush latency histogram, cache hit rate gauge, and DB connection pool usage. SLIs: click ingestion p99 <10ms, dashboard freshness <90 seconds, consumer lag <30 seconds.

Cost Analysis

At ~50K RPS target: Amazon MSK 3-broker kafka.m7g.large cluster ($540/month), Aggregation DB RDS db.r7g.xlarge ($370/month), ElastiCache Redis cache.r7g.large ($150/month), ECS Fargate Ingestion Service 4 pods ($260/month), ECS Fargate Stream Processor 16 pods ($520/month), Query Service 3 pods ($195/month), ALB ($25/month). Total: ~$2,060/month. Cost per million clicks: ~$0.016 — a 19x improvement over V0's $0.30. The Kafka cluster is the largest cost component. At lower volumes (<10K RPS), a smaller MSK cluster (kafka.m7g.medium, $360/month) reduces total cost to ~$1,500/month.

Security Considerations

Authentication: API key validation at the Ingestion Service with key rotation every 90 days. Kafka: SASL/SCRAM authentication + TLS encryption for inter-broker and client-broker traffic. Rate limiting: per-API-key rate limit at the Load Balancer (configurable per advertiser). Click fraud prevention: Kafka idempotent producer prevents duplicate writes from network retries. IP hashing: SHA-256 hash applied before Kafka produce — raw IPs never enter the event stream, ensuring GDPR pseudonymization compliance. Data retention: Kafka topic retention set to 7 days; Aggregation DB retention 90 days with automated partition dropping. Redis data encrypted at rest via ElastiCache encryption.

Deployment Strategy

Blue-green deployment for the Ingestion Service and Query Service — deploy new version alongside old, shift ALB target group weight from 0% to 100% over 10 minutes. Stream Processor uses rolling deployment within the Kafka consumer group — replace one instance at a time, allowing consumer rebalance to redistribute partitions. Kafka topic schema changes deployed via Schema Registry with backward compatibility mode. Rollback: shift ALB weight back to old target group (Ingestion/Query) or roll back consumer group deployment (Stream Processor) within 5 minutes.

Real-World Examples

•Google Ads uses Millwheel (now Dataflow) for real-time click aggregation with campaign-level partitioning — the same pattern as Kafka + tumbling windows
•Meta Ads Manager processes billions of ad events daily through Kafka Streams with campaign-partitioned topics for real-time dashboard updates
•Amazon Advertising uses Kinesis Data Streams (AWS Kafka equivalent) for click ingestion with Lambda-based windowed aggregation
•Twitter/X Ads platform uses a Kafka + Flink pipeline for real-time ad engagement metrics, partitioned by advertiser account ID

Solution Comparison

Variant	Tier	Latency	Throughput	Cost	Complexity	Reliability
V0: Naive (Single Service + SQL)	T1	~25ms click ingestion	~500 RPS	$385/month	Low (4 components)	99% (single DB)
V1: Stream Processing (Kafka + Windowed)	T2	<5ms click ack	~50K RPS	$2,200/month	Medium (8 components)	99.9% (multi-AZ)
V2: Lambda Architecture (Speed + Batch)	T3	<8ms click ack	~200K RPS	$5,800/month	High (12 components)	99.99% (dual-path)
V3: CQRS + Event Sourcing	T4	5-8ms write	~100K RPS	$4,500/month	Very High (10 components)	99.99% (independent paths)

This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.

Frequently Asked Questions

Why Kafka instead of a simpler queue like SQS or RabbitMQ?

Kafka provides three features critical for ad click aggregation: (1) durable, ordered storage with configurable retention (7 days for replay), (2) consumer groups for horizontal scaling of stream processors, and (3) partition-level ordering guarantees that enable local aggregation without distributed coordination. SQS lacks ordering guarantees and retention-based replay. RabbitMQ provides ordering but has lower throughput ceilings and no built-in partition-level consumer scaling.

What happens during a Kafka consumer rebalance?

When a stream processor instance joins or leaves the consumer group, Kafka triggers a rebalance that reassigns partitions across the remaining instances. During rebalance (30-60 seconds), reassigned partitions have no active consumer — events accumulate in Kafka but are not lost. After rebalance, the new consumer resumes from the last committed offset. The impact is a brief increase in dashboard freshness lag, not data loss.

How do 1-minute tumbling windows affect billing accuracy?

Tumbling windows guarantee that every event is counted exactly once within its window — no double-counting from overlapping windows. However, events near window boundaries may appear in the 'wrong' minute (e.g., a click at 12:00:59.999 might be counted in the 12:01 window if processing is delayed). For billing, this minute-level granularity is usually acceptable. For exact billing reconciliation, the Lambda Architecture variant adds a batch layer that recomputes totals from raw events.

How does the stream processor handle late-arriving events?

Events that arrive after their window has closed (e.g., due to network delays) are either dropped or assigned to a special late bucket, depending on configuration. In this template, late events within a 30-second grace period are included in the next window's output with a correction flag. Events arriving more than 30 seconds late are logged but not aggregated, trading accuracy for bounded state size. The batch reconciliation layer in the Lambda variant catches these gaps.

Can you use Kafka Streams or Flink instead of a custom consumer?

Yes, and in production you typically would. Kafka Streams provides built-in windowing, state stores, and exactly-once processing. Apache Flink adds more advanced windowing (session, event-time), watermarks for late-event handling, and checkpoint-based fault tolerance. The custom consumer in this template illustrates the core concepts without framework-specific abstractions, making the design decisions more visible for learning.

Related Templates

Ad Click Aggregator — Naive Ad Click Aggregator — Lambda Architecture Ad Click Aggregator — CQRS + Event Sourcing

Discussion

Ready to design your own Ad Click Aggregator?

Open the simulator, place components on the canvas, wire them up, and run a traffic simulation to see how your architecture performs under real load.

Open Simulator