What is important about Interview Walkthrough: Top-K / Trending Topics regarding "Count-Min Sketch provides approximate frequency counts with ..."?

Count-Min Sketch provides approximate frequency counts with bounded memory (a few MB) regardless of the number of unique items. It always overestimates (never underestimates) and the error bound is configurable via the sketch dimensions.

What is important about Interview Walkthrough: Top-K / Trending Topics regarding "The Space-Saving algorithm maintains exact counts for the K ..."?

The Space-Saving algorithm maintains exact counts for the K most frequent items and can detect when a new item displaces an existing top-K member. It is a strong alternative to Count-Min Sketch when exact top-K counts matter.

What is important about Interview Walkthrough: Top-K / Trending Topics regarding "The pipeline architecture is: events (Kafka) -> counter serv..."?

The pipeline architecture is: events (Kafka) -> counter service (Count-Min Sketch) -> aggregator (merge across partitions) -> Redis sorted set (serve top-K). Each stage is independently scalable.

What is important about Interview Walkthrough: Top-K / Trending Topics regarding "Time-windowed counting prevents historical counts from domin..."?

Time-windowed counting prevents historical counts from dominating current trends. Tumbling windows are simple but discontinuous; sliding windows produce smoother results but require more state; exponential decay is the most elegant approach but harder to implement exactly.

What is important about Interview Walkthrough: Top-K / Trending Topics regarding "Geo-partitioned trending (trends by country or city) require..."?

Geo-partitioned trending (trends by country or city) requires partitioning the event stream by geographic region and maintaining separate counter instances per region. The aggregator produces per-region top-K lists alongside the global list.

What is important about Interview Walkthrough: Top-K / Trending Topics regarding "Anti-gaming measures (bot detection, per-user deduplication,..."?

Anti-gaming measures (bot detection, per-user deduplication, velocity limits) are essential to prevent artificial inflation of trending topics. Deduplicate events by user_id before counting so that a single user cannot disproportionately boost a topic.

Vetora

📊Interview Toolkit

Interview Walkthrough: Top-K / Trending Topics

A detailed interview walkthrough for designing a trending topics system. Covers exact vs approximate counting, Count-Min Sketch, time-windowed aggregation, geo-partitioned trends, and anti-gaming strategies.

Overview

Designing a system to show trending topics -- like Twitter Trends, YouTube Trending, or Google Trends -- is a system design interview question that tests your knowledge of streaming algorithms, approximate data structures, and real-time data pipelines. The core challenge is counting events at massive scale with bounded memory and bounded latency, then surfacing the top K items.

The naive approach is straightforward: maintain a hash map of topic-to-count and a min-heap of size K to track the top items. When an event arrives (a tweet with a hashtag, a search query, a video view), increment the topic's count in the hash map and update the heap. This works perfectly for small datasets but fails at scale. If you are tracking millions of unique topics across billions of events per day, the hash map alone consumes gigabytes of memory, and a single machine cannot process the event stream fast enough. Moreover, trends must be time-windowed -- you want to know what is trending now, not what was popular last week.

Approximate counting solves the memory problem. A Count-Min Sketch is a probabilistic data structure that uses a 2D array of counters and multiple hash functions to count item frequencies with bounded error. It uses O(1) memory per item (shared across the array) and provides counts that are always greater than or equal to the true count, with overestimation bounded by a configurable error parameter. For K=100 trending topics and a 0.01% error rate, a Count-Min Sketch needs only a few megabytes of memory regardless of how many unique topics exist. The Space-Saving algorithm is another option that maintains exact counts for the most frequent items and approximate counts for less frequent items, using a fixed-size data structure.

The streaming architecture processes events through a multi-stage pipeline. Raw events (tweets, searches, clicks) are published to Kafka topics, partitioned by event type. Counter service instances consume from Kafka partitions, each maintaining a local Count-Min Sketch for its partition. Periodically (every 5-30 seconds), the counter services export their top-K candidates to an aggregation service that merges counts across all partitions and produces the global top-K. The final top-K list is written to a Redis sorted set that powers the serving layer. The API returns the current trending topics by reading from Redis, ensuring sub-millisecond read latency regardless of the volume of events being processed.

Time windowing determines the recency of trends. A tumbling window (e.g., 5-minute non-overlapping windows) is simple to implement but produces discontinuous trends -- a topic can appear trending at 2:59 and disappear at 3:01. A sliding window (e.g., the last 60 minutes, updated every minute) produces smoother trends but requires maintaining state for the overlap period. Exponential time decay applies a decay function to older events, giving recent events more weight. Many production systems use a combination: a sliding window for the base count with exponential decay weighting, updated every few seconds.

Key Points

1Count-Min Sketch provides approximate frequency counts with bounded memory (a few MB) regardless of the number of unique items. It always overestimates (never underestimates) and the error bound is configurable via the sketch dimensions.
2The Space-Saving algorithm maintains exact counts for the K most frequent items and can detect when a new item displaces an existing top-K member. It is a strong alternative to Count-Min Sketch when exact top-K counts matter.
3The pipeline architecture is: events (Kafka) -> counter service (Count-Min Sketch) -> aggregator (merge across partitions) -> Redis sorted set (serve top-K). Each stage is independently scalable.
4Time-windowed counting prevents historical counts from dominating current trends. Tumbling windows are simple but discontinuous; sliding windows produce smoother results but require more state; exponential decay is the most elegant approach but harder to implement exactly.
5Geo-partitioned trending (trends by country or city) requires partitioning the event stream by geographic region and maintaining separate counter instances per region. The aggregator produces per-region top-K lists alongside the global list.
6Anti-gaming measures (bot detection, per-user deduplication, velocity limits) are essential to prevent artificial inflation of trending topics. Deduplicate events by user_id before counting so that a single user cannot disproportionately boost a topic.

Simple Example

The Suggestion Box Analogy

Imagine a company with 10,000 employees submitting suggestions via a suggestion box. To find the top 10 most requested improvements, the naive approach is to read every suggestion and tally counts (exact counting). But with thousands of suggestions per day, this becomes overwhelming. An approximate approach is to keep a small board with 10 slots. Each time a suggestion arrives, check if it matches an existing slot and increment its count. If it does not match and the board is full, compare it with the lowest-count slot. If the new suggestion appears more frequently (based on a rough estimate), replace the slot. After processing many suggestions, the board reliably shows the most popular topics, even though the exact counts may be slightly off.

Real-World Examples

Twitter

Twitter Trends evolved from a simple MySQL GROUP BY query in the early days to a sophisticated streaming pipeline. The current system processes hundreds of millions of tweets per day through a real-time pipeline that counts hashtag and phrase frequencies, applies time decay, and filters out spam and bot activity. Trends are computed per-location (country, city) and globally, with personalized trend boosting based on user interests.

YouTube

YouTube Trending does not simply show the most-viewed videos. The ranking algorithm uses a weighted score that considers view count, view velocity (how quickly views accumulate), where views are coming from (YouTube vs external), the age of the video, and engagement signals (likes, comments, shares). This multi-signal approach prevents easy gaming by purchased views and surfaces genuinely interesting content.

Reddit's Hot algorithm uses a time-decayed scoring formula: score = log10(max(ups - downs, 1)) + (timestamp - reference_timestamp) / 45000. The logarithmic scaling of votes means the first 10 upvotes have the same weight as the next 100, and the time component ensures newer posts outrank older ones with similar scores. This formula powers the default front page ranking.

Trade-Offs

Aspect	Description
Exact vs Approximate Counting	Exact counting (hash map + heap) provides perfect accuracy but requires O(n) memory proportional to the number of unique items. Approximate counting (Count-Min Sketch) uses O(1) bounded memory but introduces a small overestimation error. For trending topics where the exact count of the 8th vs 9th most popular item rarely matters, approximate counting is the pragmatic choice.
Freshness vs Stability	Shorter aggregation windows (5 seconds) produce fresher trends that respond quickly to breaking events but are noisy and can fluctuate rapidly. Longer windows (5 minutes) produce more stable trends but are slower to surface emerging topics. The right balance depends on the product's tolerance for noise.
Global vs Per-Region Trends	Global trends are simpler to compute but may not be relevant to users in regions with different interests. Per-region trends are more relevant but require N times the computational resources (one counter pipeline per region) and can be easier to game in smaller regions with lower event volumes.
Real-Time vs Batch Aggregation	Real-time streaming aggregation (Kafka Streams, Flink) provides up-to-the-second trends but is more complex to operate and debug. Batch aggregation (periodic MapReduce or Spark jobs) is simpler but introduces a delay equal to the batch interval before new trends appear.

Case Study

Twitter's Evolution from Batch to Real-Time Trending

Scenario

Twitter's original trending algorithm ran a batch query every few minutes: a SQL GROUP BY on recent tweets to count hashtag frequencies, then select the top N. This worked at small scale but had two critical problems as Twitter grew: the batch query became too slow to run on the full tweet volume (taking minutes to complete), and the delay meant trends were always stale -- breaking events would not appear in trends for several minutes after they started.

Solution

Twitter built a real-time trending pipeline using Storm (later migrated to Heron, their in-house stream processing framework). The pipeline consumes the firehose of tweets in real-time, extracts hashtags and phrases, and counts frequencies using a distributed Count-Min Sketch. The sketch is partitioned across multiple worker nodes, each processing a subset of the tweet stream. An aggregator merges the partial sketches every 10 seconds and computes the global top-K. The aggregator also applies anti-gaming filters (removing bot-generated hashtags, deduplicating by user), time decay (recent tweets weighted more heavily), and personalization signals. The final trending list is pushed to a serving cache that powers the Trends UI.

Outcome

The real-time pipeline reduced the delay between a topic starting to trend and its appearance in the Trends list from 5-10 minutes to under 30 seconds. The Count-Min Sketch approach reduced memory usage by over 95% compared to the hash map approach at the same scale. The anti-gaming filters prevented coordinated bot campaigns from hijacking trends, which had been a persistent problem with the batch system. The pipeline handles over 500 million tweets per day with a sustained throughput of 5,000+ tweets per second per worker node.

Common Mistakes

⚠Proposing a hash map of all items without discussing memory constraints. At the scale of millions of unique topics, an unbounded hash map consumes gigabytes of RAM. Interviewers expect you to discuss approximate data structures (Count-Min Sketch, Space-Saving) for bounded-memory counting.
⚠Ignoring the time dimension of trending. Counting total occurrences across all time produces an all-time popularity list, not a trending list. Trends are about velocity of change -- what is gaining popularity right now. Use time-windowed counting or decay functions.
⚠Designing a single-machine solution for a planet-scale event stream. The event volume (billions per day) requires a distributed pipeline: Kafka for ingestion, multiple counter service instances for parallel counting, and an aggregation step to merge partial results.
⚠Not addressing anti-gaming. Without bot filtering and per-user deduplication, coordinated campaigns can easily game the trending algorithm by generating synthetic events. Mention deduplication, velocity limits, and anomaly detection as countermeasures.

Related Concepts

The Four-Step Framework Numbers to Memorize News Feed / Timeline URL Shortener RAG-Powered Search

See Interview Walkthrough: Top-K / Trending Topics in action

Explore system design templates that use interview walkthrough: top-k / trending topics and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Aggregate trending topics from a real-time event stream

Metrics to watch

aggregation_latency_msthroughput_events_per_secwindow_accuracy_pctmemory_usage_mb

Run Simulation

Test Your Understanding

1What advantage does a Count-Min Sketch have over a hash map for counting item frequencies at scale?

2Why is exponential time decay preferred over fixed time windows for trending topic detection?

3In a distributed trending pipeline, why is Kafka used between the event source and the counter services?

Deeper Reading