Hard · 8 components · Interview: Very High

Social Feed (Twitter/X)

A social feed system like Twitter serves 300K timeline reads/s using a hybrid fan-out strategy: fan-out on write for users under 10K followers and fan-out on read for celebrities, with Redis sorted sets caching 800 tweet IDs per user. This 8-component architecture applies ML ranking over ~500 candidate tweets in under 20ms, handles media processing asynchronously, and scales to 50TB/year of storage.

Fan-Out · ML Ranking · Caching
Problem Statement

The social feed problem — designing a system like Twitter's home timeline or Facebook's news feed — is arguably the most discussed system design interview question at FAANG companies. It encapsulates the fundamental challenge of distributed systems: how do you deliver personalized, ranked content to hundreds of millions of users in real time when the underlying data is generated by millions of content creators with wildly varying follower counts?

At Twitter's scale, the system serves over 500 million tweets per day to roughly 400 million monthly active users. The core challenge is the fan-out problem: when a user with 50 million followers posts a tweet, that single write operation must eventually appear in 50 million personalized timelines. Performing this fan-out synchronously is infeasible — it would take minutes and consume enormous resources. But performing it too lazily means followers see stale timelines.

The problem becomes richer when you add content ranking. Modern social feeds are not reverse-chronological — they use machine learning models to score and rank content based on engagement probability, recency, relationship strength, content type, and hundreds of other features. This ranking must happen in real time as the user scrolls, incorporating new content that arrives while they are browsing.

This template models the complete social feed architecture: tweet ingestion service, fan-out service with celebrity optimization, timeline cache (Redis), timeline mixer with ML ranking, search indexer, media processing pipeline, and notification service. The simulation reveals how the hybrid fan-out strategy handles the bimodal follower distribution and how cache memory requirements scale with user count.

Architecture Overview

## How the Hybrid Fan-Out Strategy Handles Bimodal Follower Counts

The social feed architecture implements a hybrid fan-out strategy that adapts based on the author's follower count. The Tweet Ingestion Service receives new tweets, persists them to the Tweet Store (sharded by tweet ID), and publishes an event to the Fan-Out Service. For regular users (under 10,000 followers), the system uses fan-out on write: the Fan-Out Service retrieves the author's follower list and appends the tweet ID to each follower's cached timeline in Redis. This pre-materializes each user's feed, making read requests extremely fast — a single Redis LRANGE command returns the latest tweet IDs, serving 300K timeline reads/s.

## Celebrity Tweet Index and Fan-Out on Read

For celebrities (over 10,000 followers), fan-out on write is prohibitively expensive — a single tweet from a user with 50 million followers would generate 50 million Redis writes. Instead, the system uses fan-out on read: celebrity tweets are stored in a separate Celebrity Tweet Index keyed by author ID and scored by timestamp. When a user fetches their timeline, the Timeline Mixer merges their pre-materialized timeline (from fan-out on write) with recent tweets from celebrities they follow (fan-out on read), then applies ML-based ranking. The timeline cache stores only tweet IDs (not full tweet objects) to minimize memory, with each user's cache capped at 800 entries.

## ML Ranking Model and Engagement Prediction

The ML Ranking Model runs in the Timeline Mixer at request time, scoring approximately 500 candidate tweets in under 20ms using a gradient-boosted decision tree. For each candidate tweet, the model computes an engagement score based on features including: author-reader relationship strength (30%), tweet recency with time decay (25%), content type preference (20%), historical engagement rate on similar content (15%), and trending topic relevance (10%). The top-ranked tweets are returned in the response. The model is retrained daily on engagement logs, adapting to shifting user preferences without requiring GPU inference.
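To make the feature weighting concrete, here is a minimal sketch that treats the percentages above as weights in a plain linear blend. This is a stand-in for illustration only: a gradient-boosted model learns non-linear splits rather than applying fixed weights. The `Candidate` fields, the assumption that every signal is normalized to [0, 1], and the 6-hour recency half-life are all hypothetical.

```python
from dataclasses import dataclass
import heapq

# Weights copied from the text. Blending them linearly is an assumption made
# for illustration; a real gradient-boosted model learns non-linear splits.
WEIGHTS = {
    "relationship": 0.30,
    "recency": 0.25,
    "content_type": 0.20,
    "historical": 0.15,
    "trending": 0.10,
}


@dataclass
class Candidate:
    tweet_id: int
    age_hours: float
    relationship: float  # the four signals below are assumed pre-normalized to [0, 1]
    content_type: float
    historical: float
    trending: float


def recency_score(age_hours: float, half_life_hours: float = 6.0) -> float:
    """Exponential time decay; the 6-hour half-life is a made-up default."""
    return 0.5 ** (age_hours / half_life_hours)


def engagement_score(c: Candidate) -> float:
    """Weighted sum of the five feature groups named in the text."""
    features = {
        "relationship": c.relationship,
        "recency": recency_score(c.age_hours),
        "content_type": c.content_type,
        "historical": c.historical,
        "trending": c.trending,
    }
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rank(candidates: list[Candidate], limit: int = 50) -> list[Candidate]:
    """Return the `limit` highest-scoring candidates, best first."""
    return heapq.nlargest(limit, candidates, key=engagement_score)
```

Using `heapq.nlargest` avoids fully sorting all ~500 candidates when only the top 50 are needed, although at this size either approach is cheap.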

## Media Processing Pipeline and Content Moderation

The Media Processing Pipeline handles image and video uploads asynchronously, decoupled from the tweet ingestion path. When a tweet includes media, the ingestion service stores the raw file in object storage (S3), publishes a processing event, and the pipeline generates thumbnails at multiple resolutions, transcodes video to multiple bitrates for adaptive streaming, and runs automated content moderation using ML classifiers. Media URLs in the tweet are updated once processing completes, typically within 30-60 seconds. This async design ensures that media-heavy tweets do not block the text-only fan-out path.

Request Flow — Tweet Post & Timeline Hydration

The social feed system has two fundamentally different request flows: tweet posting (write path with fan-out) and timeline reading (read path with hydration). The hybrid fan-out strategy is the central design decision: regular users' tweets are eagerly pushed to followers' timelines (fan-out on write), while celebrity tweets are lazily merged at read time (fan-out on read). This hybrid approach avoids the write amplification of pushing a single tweet into millions of follower timelines.

The write path triggers a cascade of asynchronous work: the tweet is persisted, media is processed, and the Fan-Out Service distributes the tweet ID to followers' cached timelines. For a user with 500 followers, this means 500 Redis LPUSH operations — fast individually (~0.1ms each) but adding up to ~50ms total. This is why celebrity tweets (>10K followers) skip write-time fan-out entirely.
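The arithmetic behind these figures is simple enough to check directly. The sketch below assumes the ~0.1 ms per-LPUSH cost quoted above and sums it serially; pipelining and parallel fan-out workers would shorten wall-clock time, but the total Redis work stays the same. The function name is illustrative.

```python
def fan_out_write_cost_ms(followers: int, per_push_ms: float = 0.1) -> float:
    """Total Redis time for one tweet if every follower costs one LPUSH."""
    return followers * per_push_ms


for followers in (500, 10_000, 50_000_000):
    cost_ms = fan_out_write_cost_ms(followers)
    print(f"{followers:>11,} followers: {cost_ms / 1000:,.2f} s of Redis time")
```

At 500 followers the cost is about 50 ms, at 10,000 followers about one second, and at 50 million followers about 5,000 seconds (over 80 minutes) of serial Redis time per tweet.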

The read path is where the Timeline Mixer and ML Ranking Model collaborate. The mixer merges the pre-materialized timeline (from fan-out on write) with recent celebrity tweets (from the Celebrity Tweet Index), producing ~500 candidate tweets. The ML model then scores each tweet by predicted engagement (relationship strength, recency, content type, trending topics) and returns the top ~50 for display. This ranking step adds ~20ms but dramatically improves engagement metrics.


Step-by-Step Walkthrough

  1. Author posts a tweet via POST /api/tweet. The Tweet Ingestion Service persists it to the Tweet Store (sharded by tweet_id for even distribution) in ~15ms. If media is attached, a separate Media Processing Pipeline runs async (thumbnails, transcoding, moderation).
  2. The Fan-Out Service receives a fan-out event and checks the author's follower count. For regular users (<10K followers), it resolves the complete follower list from the social graph.
  3. For each follower, the Fan-Out Service pushes the tweet ID (just 8 bytes) to that follower's Timeline Cache in Redis via LPUSH. With 500 followers, this is 500 LPUSH operations taking ~50ms total. Timelines are capped at 800 entries (LTRIM).
  4. For celebrity authors (>10K followers), the tweet ID is added to a sorted set in the Celebrity Tweet Index (keyed by author ID, scored by timestamp). No fan-out occurs at write time, which prevents a single tweet from generating millions of cache writes.
  5. When a reader opens their timeline, the Timeline Mixer first reads the pre-materialized timeline from Redis (LRANGE, ~2ms). This list contains recent tweets from the regular (non-celebrity) accounts the reader follows.
  6. The Mixer then queries the Celebrity Tweet Index for recent tweets from all celebrities the reader follows (ZRANGEBYSCORE from the reader's last-read timestamp, ~3ms). These are merged with the pre-materialized timeline.
  7. The combined ~500 tweet IDs are hydrated in batch from the Tweet Store (MGET, ~20ms). Tweet content, author metadata, engagement counts, and media URLs are fetched together.
  8. The ML Ranking Model scores all 500 candidates by predicted engagement: 30% relationship strength (interaction frequency), 25% recency (time decay), 20% content type preference, 15% historical engagement on similar content, 10% trending topic boost. The top 50 tweets are returned in ranked order (~20ms).

Pseudocode

// Fan-Out Service — hybrid strategy
async function fanOut(tweetId, authorId):
    followerCount = await socialGraph.getFollowerCount(authorId)

    if followerCount < 10_000:
        // Fan-out on WRITE — push to each follower's timeline
        followers = await socialGraph.getFollowers(authorId)
        pipeline = redis.pipeline()
        for followerId in followers:
            pipeline.lpush(`timeline:${followerId}`, tweetId)
            pipeline.ltrim(`timeline:${followerId}`, 0, 799)  // cap at 800
        await pipeline.exec()   // ~50ms for 500 followers (batched)
    else:
        // Fan-out on READ — store in celebrity index
        await redis.zadd(
            `celebrity:${authorId}`, Date.now(), tweetId
        )   // ~1ms (single write, read at timeline hydration)

// Timeline Mixer — merge + hydrate + rank
async function getTimeline(userId, cursor, limit = 50):
    // 1. Pre-materialized timeline (fan-out on write results)
    tweetIds = await redis.lrange(`timeline:${userId}`, 0, 499)

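    // 2. Celebrity tweets (fan-out on read), fetched in one pipelined round trip
    celebIds = await socialGraph.getFollowedCelebrities(userId)
    // LOOKBACK_MS is an assumed window (e.g. 24h) so a first read does not scan full history
    since = cursor ?? Date.now() - LOOKBACK_MS
    pipeline = redis.pipeline()
    for celebId in celebIds:
        pipeline.zrangebyscore(`celebrity:${celebId}`, since, "+inf")
    for celebTweets in await pipeline.exec():
        tweetIds.push(...celebTweets)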

    // 3. Hydrate tweet content (batch fetch)
    tweets = await tweetStore.mget(tweetIds)   // ~20ms

    // 4. ML ranking — score by predicted engagement
    scored = await mlRanker.rank(userId, tweets)   // ~20ms
    return scored.slice(0, limit)   // Top 50
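
The pseudocode above depends on Redis and a social graph service. As a rough way to experiment with the same logic locally, the following is a self-contained in-memory sketch. `HybridFeed` and its method names are hypothetical, a bounded `deque` stands in for LPUSH plus LTRIM, a plain list stands in for the celebrity sorted set, and reverse-chronological ordering stands in for the ML ranker.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 800  # entries kept per pre-materialized timeline


class HybridFeed:
    """In-memory model of hybrid fan-out (no Redis, no ranking model)."""

    def __init__(self, threshold: int = 10_000):
        self.threshold = threshold
        self.followers = defaultdict(set)   # author -> follower ids
        self.following = defaultdict(set)   # user -> followed author ids
        # A full bounded deque drops its oldest (rightmost) entry on
        # appendleft, which mimics LPUSH followed by LTRIM 0 799.
        self.timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))
        self.celebrity_index = defaultdict(list)  # author -> [(ts, tweet_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, tweet_id, ts):
        if len(self.followers[author]) < self.threshold:
            # Fan-out on write: one push per follower.
            for follower in self.followers[author]:
                self.timelines[follower].appendleft((ts, tweet_id))
        else:
            # Fan-out on read: a single write to the author's index.
            self.celebrity_index[author].append((ts, tweet_id))

    def timeline(self, user, limit=50):
        candidates = list(self.timelines[user])
        for author in self.following[user]:
            # Empty for regular authors, so this only adds celebrity tweets.
            candidates.extend(self.celebrity_index.get(author, []))
        candidates.sort(reverse=True)  # newest first; stand-in for ranking
        return [tweet_id for _, tweet_id in candidates[:limit]]
```

Passing a small `threshold` (for example `HybridFeed(threshold=3)`) makes it easy to see both paths with a handful of users.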
Key Design Decisions
Fan-Out Strategy

Choice

Hybrid: fan-out on write for regular users, fan-out on read for celebrities

Rationale

Pure fan-out on write is O(followers) per tweet — when a celebrity with 50M followers tweets, it generates 50M write operations. Pure fan-out on read requires merging tweets from all followed accounts at read time, which is O(following) per request. The hybrid approach optimizes both extremes: pre-materialized timelines for the 99% of users with modest follower counts, and on-demand merging for the 1% of celebrities.

Timeline Cache Design

Choice

Redis sorted sets storing tweet IDs with timestamp scores

Rationale

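Storing only tweet IDs (not full tweet objects) in the timeline cache reduces memory by 10-50x, since tweet content is fetched separately and can leverage a shared tweet cache. Redis sorted sets enable efficient insertion (ZADD), trimming to a fixed size (ZREMRANGEBYRANK), and retrieval (ZRANGE). Each user's timeline is capped at the most recent 800 tweet IDs to bound memory. The walkthrough and pseudocode use the list commands (LPUSH, LTRIM, LRANGE), a simpler variant that orders by arrival rather than by timestamp score and does not deduplicate.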
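
To illustrate the capped, timestamp-ordered behavior a sorted set provides, here is a small in-memory sketch. `CappedSortedTimeline` is a hypothetical stand-in, not a Redis client; the comments name the Redis commands each step roughly corresponds to, and the cap is enforced by rank (keep the newest N).

```python
import bisect


class CappedSortedTimeline:
    """In-memory stand-in for a Redis sorted set of tweet IDs scored by time."""

    def __init__(self, cap: int = 800):
        self.cap = cap
        self._entries = []     # (timestamp, tweet_id), ascending
        self._members = set()  # tweet IDs currently stored

    def add(self, tweet_id, timestamp):
        # Roughly ZADD with NX: an ID that is already present is left alone.
        if tweet_id in self._members:
            return
        bisect.insort(self._entries, (timestamp, tweet_id))
        self._members.add(tweet_id)
        # Roughly ZREMRANGEBYRANK key 0 -(cap + 1): drop the oldest entries
        # so that only the `cap` highest-scored ones remain.
        while len(self._entries) > self.cap:
            _, evicted = self._entries.pop(0)
            self._members.discard(evicted)

    def latest(self, n: int):
        # Roughly ZREVRANGE key 0 n-1: newest first.
        if n <= 0:
            return []
        return [tweet_id for _, tweet_id in reversed(self._entries[-n:])]

    def __len__(self):
        return len(self._entries)
```

Unlike a list, this structure keeps entries ordered even when fan-out writes arrive out of order, and it ignores duplicate IDs.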

Content Ranking

Choice

Lightweight ML model (gradient-boosted trees) applied at request time

Rationale

Deep learning models provide marginally better ranking quality but require GPU inference, adding 50-100ms of latency. Gradient-boosted trees run on CPU within the ~20ms ranking budget using a feature set of ~200 signals. The ranking quality difference is measurable but small relative to the latency and infrastructure cost savings. The model is retrained daily on engagement data.

Celebrity Threshold

Choice

10,000 followers as the fan-out strategy boundary

Rationale

The threshold is determined empirically by the point where fan-out on write cost exceeds fan-out on read cost for the average request. Below 10K followers, the write amplification is manageable and the read speedup is significant. Above 10K, the write cost per tweet grows proportionally while each individual follower's timeline read only saves one merge operation. The threshold is tunable per deployment.

Tweet Storage

Choice

Sharded MySQL with tweet ID as shard key

Rationale

Tweets are small, append-only objects with a simple schema (author, content, media refs, metadata). MySQL provides strong consistency for individual tweets, efficient point lookups by ID, and mature tooling for online schema changes and backups. Sharding by tweet ID distributes write load evenly across shards. On the timeline hot path the Tweet Store is read only by ID, so the simple shard key fits well; queries by author (such as a cold-start timeline rebuild) need a secondary index or a scatter-gather across shards.
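
As a sketch of how ID-based sharding interacts with batch hydration, the snippet below routes each tweet ID to a shard and groups a batch of IDs so that each shard receives one query. Modulo routing and the shard count are assumptions for illustration; deployments often use consistent hashing or a lookup table instead.

```python
from collections import defaultdict


def shard_for(tweet_id: int, num_shards: int) -> int:
    """Pick a shard by modulo (an assumed routing rule, not the only option)."""
    return tweet_id % num_shards


def group_by_shard(tweet_ids, num_shards: int) -> dict:
    """Group IDs per shard so hydration issues one batched query per shard."""
    groups = defaultdict(list)
    for tweet_id in tweet_ids:
        groups[shard_for(tweet_id, num_shards)].append(tweet_id)
    return dict(groups)
```

With 16 shards, `group_by_shard([1, 2, 17, 18, 32], 16)` yields three groups, so five lookups become three shard queries that can run in parallel.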

Scale & Performance

Target RPS

300K timeline reads/s

Latency (p99)

<500ms (timeline load)

Storage

~50 TB/year (tweets + media)

Availability

99.99%

This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.

Frequently Asked Questions
What is fan-out on write vs. fan-out on read in social feeds?

Fan-out on write pre-computes each user's timeline at the moment a tweet is posted — the tweet ID is appended to every follower's cached timeline. This makes reads fast (just fetch the pre-built list) but writes expensive (proportional to follower count). Fan-out on read does no work at write time; instead, when a user loads their feed, the system fetches recent tweets from all accounts they follow and merges them. Reads are expensive but writes are O(1). Most production systems use a hybrid approach.
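
The read-time merge can be sketched as a k-way merge over per-author lists. This minimal example assumes each followed account's recent tweets are already available as `(timestamp, tweet_id)` pairs sorted newest first; the function name is illustrative.

```python
import heapq
from itertools import islice


def merge_on_read(per_author_tweets, limit: int = 50):
    """Merge newest-first (timestamp, tweet_id) lists into one newest-first feed."""
    merged = heapq.merge(*per_author_tweets, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

Because `heapq.merge` is lazy, taking only the first `limit` items avoids materializing every followed account's full history, but the work still grows with the number of accounts followed.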

How does Twitter handle tweets from users with millions of followers?

Twitter uses a hybrid fan-out strategy. For regular users (under ~10K followers), tweets are fanned out on write to each follower's timeline cache. For celebrities with millions of followers, fan-out on write would generate millions of cache writes per tweet. Instead, celebrity tweets are stored in a separate index. When a user loads their timeline, recent celebrity tweets are merged in at read time. This bounds the maximum write amplification while maintaining fast reads.

How does the social feed ML ranking model work?

The ranking model scores each candidate tweet by predicting engagement probability (will the user like, retweet, reply, or spend time reading?). Features include: the user's historical interaction with the author, tweet recency, content type, number of engagements from other users, trending topic relevance, and the user's past engagement patterns by content type. The model is typically a gradient-boosted decision tree trained on engagement logs, retrained daily to adapt to shifting user preferences.

How much Redis memory is needed to cache timelines for 400M users?

Assuming each user's timeline cache stores 800 tweet IDs (8 bytes each) plus sorted set overhead (~50 bytes per entry), each timeline costs approximately 46KB. For 400M users, this totals ~18TB of Redis memory. In practice, only active users (users who logged in recently) need cached timelines — perhaps 100M daily active users — reducing the requirement to ~4.6TB, which is feasible with a Redis Cluster of 50-100 instances.
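
The estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the figures from the answer above (800 entries, 8-byte IDs, ~50 bytes of per-entry overhead) and decimal units (1 TB = 10^12 bytes); the function name is illustrative.

```python
def timeline_cache_bytes(users: int, entries_per_user: int = 800,
                         id_bytes: int = 8, overhead_bytes: int = 50) -> int:
    """Estimated Redis memory for cached timelines, in bytes."""
    return users * entries_per_user * (id_bytes + overhead_bytes)


per_user_kb = timeline_cache_bytes(1) / 1e3              # one timeline
all_users_tb = timeline_cache_bytes(400_000_000) / 1e12  # every registered user
active_tb = timeline_cache_bytes(100_000_000) / 1e12     # daily active users only
```

This gives 46.4 KB per timeline, about 18.6 TB for 400M users, and about 4.6 TB for 100M daily active users.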

How do you handle real-time tweet delivery for users currently browsing?

For users with an active app session, new tweets from followed accounts can be pushed via a WebSocket or Server-Sent Events (SSE) connection. The fan-out service publishes new tweet events to a pub/sub system (Redis Pub/Sub or Kafka), and a real-time delivery service subscribes to channels corresponding to online users. A 'new tweets available' notification appears at the top of the timeline rather than inserting tweets directly, avoiding disorienting scroll jumps.

How would you explain the 10K follower threshold for hybrid fan-out to an interviewer?

The threshold is the break-even point where fan-out on write cost equals fan-out on read cost. For a user with 500 followers, fan-out on write costs 500 Redis LPUSH operations (approximately 50ms) per tweet but makes reads free. For a celebrity with 10M followers, fan-out on write would cost 10M writes per tweet, taking minutes and consuming enormous Redis throughput. Fan-out on read for celebrities adds roughly 5-10ms per timeline read to merge celebrity tweets. At 10K followers, the write cost per tweet (approximately 1 second of Redis time) starts exceeding the amortized read savings. The exact threshold is tuned empirically per deployment based on tweet frequency and read patterns.

How would you estimate the write amplification of a social feed system in an interview?

Write amplification is the ratio of total writes generated to original writes. If the average user has 300 followers and posts 2 tweets/day, each tweet generates 300 fan-out writes, an amplification factor of 300. With 100M active users posting 200M tweets/day, total fan-out writes are 200M × 300 = 60 billion writes/day, or roughly 700K writes/s. Each write is a Redis LPUSH of an 8-byte tweet ID plus per-entry overhead, consuming approximately 50 bytes. Total fan-out bandwidth is therefore about 35 MB/s, well within a Redis cluster's capacity. Routing celebrity tweets around fan-out on write reduces this by approximately 30%, since celebrities account for a disproportionate share of fan-out writes.
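
The same back-of-envelope estimate, written out so the inputs can be varied. The figures are the ones quoted in the answer above; the function and key names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def fan_out_load(tweets_per_day: int, avg_followers: int,
                 bytes_per_write: int = 50) -> dict:
    """Estimate fan-out write volume from daily tweets and average followers."""
    writes_per_day = tweets_per_day * avg_followers
    writes_per_sec = writes_per_day / SECONDS_PER_DAY
    return {
        "amplification": avg_followers,  # fan-out writes per original tweet
        "writes_per_day": writes_per_day,
        "writes_per_sec": writes_per_sec,
        "mb_per_sec": writes_per_sec * bytes_per_write / 1e6,
    }


load = fan_out_load(tweets_per_day=200_000_000, avg_followers=300)
```

The exact result is about 694K writes/s and 34.7 MB/s, which the answer rounds to 700K writes/s and 35 MB/s.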

What are the cold-start and stale-timeline problems in a social feed and how do you solve them?

Cold start occurs when a new user or a user who has been inactive for weeks opens the app and their Redis timeline cache is empty or expired. The solution is a timeline reconstruction job that pulls recent tweets from all followed accounts, merges them, and populates the cache on demand, typically completing in 200-500ms. Stale timelines occur when fan-out workers fall behind during traffic spikes, causing some followers to miss recent tweets. The Timeline Mixer mitigates this by checking the timestamp of the most recent fan-out entry against the current time; if the gap exceeds a threshold (e.g., 5 minutes), it supplements with a direct query to the Tweet Store for tweets from followed accounts posted in the gap.
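
The staleness check described above can be sketched as a small pure function. It assumes timestamps are Unix seconds and uses the 5-minute example threshold; the function name and the `None` convention for an empty cache are hypothetical.

```python
STALE_THRESHOLD_SECONDS = 5 * 60  # example threshold from the text


def backfill_window(latest_fanout_ts, now,
                    threshold: float = STALE_THRESHOLD_SECONDS):
    """Return the (start, end) range to query from the Tweet Store, or None.

    None means no backfill: either the cache looks fresh, or it is empty,
    in which case the cold-start reconstruction path applies instead.
    """
    if latest_fanout_ts is None:
        return None
    if now - latest_fanout_ts > threshold:
        return (latest_fanout_ts, now)
    return None
```

A large gap does not always mean fan-out is lagging, since the followed accounts may simply not have posted. The check is a heuristic that trades an occasional unnecessary query for fewer missed tweets.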

