A social feed system like Twitter serves 300K timeline reads/s using a hybrid fan-out strategy: fan-out on write for users under 10K followers and fan-out on read for celebrities, with Redis sorted sets caching 800 tweet IDs per user. This 8-component architecture applies ML ranking over ~500 candidate tweets in under 20ms, handles media processing asynchronously, and scales to 50TB/year of storage.
The social feed problem — designing a system like Twitter's home timeline or Facebook's news feed — is arguably the most discussed system design interview question at FAANG companies. It encapsulates the fundamental challenge of distributed systems: how do you deliver personalized, ranked content to hundreds of millions of users in real time when the underlying data is generated by millions of content creators with wildly varying follower counts?
At Twitter's scale, the system serves over 500 million tweets per day to roughly 400 million monthly active users. The core challenge is the fan-out problem: when a user with 50 million followers posts a tweet, that single write operation must eventually appear in 50 million personalized timelines. Performing this fan-out synchronously is infeasible — it would take minutes and consume enormous resources. But performing it too lazily means followers see stale timelines.
The problem becomes richer when you add content ranking. Modern social feeds are not reverse-chronological — they use machine learning models to score and rank content based on engagement probability, recency, relationship strength, content type, and hundreds of other features. This ranking must happen in real time as the user scrolls, incorporating new content that arrives while they are browsing.
This template models the complete social feed architecture: tweet ingestion service, fan-out service with celebrity optimization, timeline cache (Redis), timeline mixer with ML ranking, search indexer, media processing pipeline, and notification service. The simulation reveals how the hybrid fan-out strategy handles the bimodal follower distribution and how cache memory requirements scale with user count.
## How the Hybrid Fan-Out Strategy Handles Bimodal Follower Counts
The social feed architecture implements a hybrid fan-out strategy that adapts based on the author's follower count. The Tweet Ingestion Service receives new tweets, persists them to the Tweet Store (sharded by tweet ID), and publishes an event to the Fan-Out Service. For regular users (under 10,000 followers), the system uses fan-out on write: the Fan-Out Service retrieves the author's follower list and appends the tweet ID to each follower's cached timeline in Redis. This pre-materializes each user's feed, making read requests extremely fast — a single Redis LRANGE command returns the latest tweet IDs, serving 300K timeline reads/s.
## Celebrity Tweet Index and Fan-Out on Read
For celebrities (over 10,000 followers), fan-out on write is prohibitively expensive — a single tweet from a user with 50 million followers would generate 50 million Redis writes. Instead, the system uses fan-out on read: celebrity tweets are stored in a separate Celebrity Tweet Index keyed by author ID and scored by timestamp. When a user fetches their timeline, the Timeline Mixer merges their pre-materialized timeline (from fan-out on write) with recent tweets from celebrities they follow (fan-out on read), then applies ML-based ranking. The timeline cache stores only tweet IDs (not full tweet objects) to minimize memory, with each user's cache capped at 800 entries.
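The read-time merge described above can be sketched with in-memory data. This is a minimal illustration under stated assumptions, not the production Timeline Mixer: the function name `mergeTimelines`, the `{ id, ts }` entry shape, and the sample values are all hypothetical.

```javascript
// Hypothetical in-memory stand-ins for the timeline cache and the
// Celebrity Tweet Index. Entries are { id, ts } with ts in milliseconds.
function mergeTimelines(materialized, celebrityLists, maxCandidates = 500) {
  const seen = new Set();
  const merged = [];
  for (const entry of [materialized, ...celebrityLists].flat()) {
    if (seen.has(entry.id)) continue; // drop duplicates across sources
    seen.add(entry.id);
    merged.push(entry);
  }
  // Newest first, then cap the candidate set handed to the ranker.
  merged.sort((a, b) => b.ts - a.ts);
  return merged.slice(0, maxCandidates);
}

const materialized = [{ id: "t3", ts: 300 }, { id: "t1", ts: 100 }];
const celebrityLists = [
  [{ id: "c2", ts: 200 }],
  [{ id: "c4", ts: 400 }, { id: "t3", ts: 300 }],
];
const candidates = mergeTimelines(materialized, celebrityLists);
console.log(candidates.map((e) => e.id)); // [ 'c4', 't3', 'c2', 't1' ]
```

Capping the merged list at roughly 500 entries keeps the ranking step bounded regardless of how many celebrities a user follows.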
## ML Ranking Model and Engagement Prediction
The ML Ranking Model runs in the Timeline Mixer at request time, scoring approximately 500 candidate tweets in under 20ms using a gradient-boosted decision tree. For each candidate tweet, the model computes an engagement score from features including the following, with approximate relative importance in parentheses: author-reader relationship strength (30%), tweet recency with time decay (25%), content type preference (20%), historical engagement rate on similar content (15%), and trending topic relevance (10%). The top-ranked tweets are returned in the response. The model is retrained daily on engagement logs to adapt to shifting user preferences, and because tree inference runs on CPU, serving does not require GPUs.
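To make the scoring step concrete, here is a deliberately simplified stand-in. It is not a gradient-boosted tree: it treats the percentages listed above as coefficients of a plain weighted sum, which is an assumption made only for illustration. The 6-hour half-life, the feature names, and the sample candidates are likewise hypothetical.

```javascript
// Weights mirror the feature list in the text. Using them as linear
// coefficients is a simplification of the tree-based model described there.
const WEIGHTS = {
  relationship: 0.30,
  recency: 0.25,
  contentType: 0.20,
  historicalEngagement: 0.15,
  trending: 0.10,
};
const HALF_LIFE_HOURS = 6; // hypothetical decay half-life

// Exponential time decay: 1.0 for a brand-new tweet, 0.5 after one half-life.
function recencyScore(ageHours) {
  return Math.pow(0.5, ageHours / HALF_LIFE_HOURS);
}

// All features other than ageHours are assumed to be normalized to [0, 1].
function engagementScore(f) {
  return (
    WEIGHTS.relationship * f.relationship +
    WEIGHTS.recency * recencyScore(f.ageHours) +
    WEIGHTS.contentType * f.contentType +
    WEIGHTS.historicalEngagement * f.historicalEngagement +
    WEIGHTS.trending * f.trending
  );
}

// Score every candidate and keep the highest-scoring ones.
function rank(candidates, limit = 50) {
  return candidates
    .map((c) => ({ id: c.id, score: engagementScore(c.features) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const ranked = rank([
  { id: "fresh-from-stranger", features: { relationship: 0.1, ageHours: 0, contentType: 0.5, historicalEngagement: 0.5, trending: 0 } },
  { id: "older-from-close-friend", features: { relationship: 0.9, ageHours: 12, contentType: 0.5, historicalEngagement: 0.5, trending: 0 } },
]);
console.log(ranked.map((r) => r.id));
```

With these sample inputs the 12-hour-old tweet from a close connection scores about 0.51 and outranks the brand-new tweet from a weak connection at about 0.46, which shows why such a feed is not strictly reverse-chronological.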
## Media Processing Pipeline and Content Moderation
The Media Processing Pipeline handles image and video uploads asynchronously, decoupled from the tweet ingestion path. When a tweet includes media, the ingestion service stores the raw file in object storage (S3), publishes a processing event, and the pipeline generates thumbnails at multiple resolutions, transcodes video to multiple bitrates for adaptive streaming, and runs automated content moderation using ML classifiers. Media URLs in the tweet are updated once processing completes, typically within 30-60 seconds. This async design ensures that media-heavy tweets do not block the text-only fan-out path.
The social feed system has two fundamentally different request flows: tweet posting (write path with fan-out) and timeline reading (read path with hydration). The hybrid fan-out strategy is the central design decision: regular users' tweets are eagerly pushed to followers' timelines (fan-out on write), while celebrity tweets are lazily merged at read time (fan-out on read). This hybrid approach avoids the extreme write amplification of pushing a single tweet to millions of followers.
The write path triggers a cascade of asynchronous work: the tweet is persisted, media is processed, and the Fan-Out Service distributes the tweet ID to followers' cached timelines. For a user with 500 followers, this means 500 Redis LPUSH operations — fast individually (~0.1ms each) but adding up to ~50ms total. This is why celebrity tweets (>10K followers) skip write-time fan-out entirely.
The read path is where the Timeline Mixer and ML Ranking Model collaborate. The mixer merges the pre-materialized timeline (from fan-out on write) with recent celebrity tweets (from the Celebrity Tweet Index), producing ~500 candidate tweets. The ML model then scores each tweet by predicted engagement (relationship strength, recency, content type, trending topics) and returns the top ~50 for display. This ranking step adds ~20ms but dramatically improves engagement metrics.
Pseudocode
```javascript
// Fan-Out Service: hybrid strategy.
// Assumes an ioredis-style client (`redis`) and the `socialGraph`,
// `tweetStore`, and `mlRanker` services are already in scope.
const CELEBRITY_THRESHOLD = 10_000;
const TIMELINE_CAP = 800;

async function fanOut(tweetId, authorId) {
  const followerCount = await socialGraph.getFollowerCount(authorId);
  if (followerCount < CELEBRITY_THRESHOLD) {
    // Fan-out on WRITE: push the tweet ID to each follower's timeline.
    const followers = await socialGraph.getFollowers(authorId);
    const pipeline = redis.pipeline();
    for (const followerId of followers) {
      pipeline.lpush(`timeline:${followerId}`, tweetId);
      pipeline.ltrim(`timeline:${followerId}`, 0, TIMELINE_CAP - 1); // cap at 800
    }
    await pipeline.exec(); // ~50ms for 500 followers (batched)
  } else {
    // Fan-out on READ: store in the celebrity index, scored by timestamp.
    await redis.zadd(`celebrity:${authorId}`, Date.now(), tweetId); // ~1ms (single write)
  }
}

// Timeline Mixer: merge + hydrate + rank.
async function getTimeline(userId, cursor, limit = 50) {
  // 1. Pre-materialized timeline (fan-out on write results)
  const tweetIds = await redis.lrange(`timeline:${userId}`, 0, 499);
  // 2. Celebrity tweets (fan-out on read), fetched concurrently rather than
  //    one round trip at a time. `cursor` is used as a minimum timestamp;
  //    with no cursor this reads each celebrity's whole index.
  const celebIds = await socialGraph.getFollowedCelebrities(userId);
  const celebTweets = await Promise.all(
    celebIds.map((celebId) =>
      redis.zrangebyscore(`celebrity:${celebId}`, cursor ?? 0, "+inf")
    )
  );
  tweetIds.push(...celebTweets.flat());
  // 3. Hydrate tweet content (batch fetch)
  const tweets = await tweetStore.mget(tweetIds); // ~20ms
  // 4. ML ranking: score by predicted engagement
  const scored = await mlRanker.rank(userId, tweets); // ~20ms
  return scored.slice(0, limit); // top 50 by default
}
```
Choice
Hybrid: fan-out on write for regular users, fan-out on read for celebrities
Rationale
Pure fan-out on write is O(followers) per tweet — when a celebrity with 50M followers tweets, it generates 50M write operations. Pure fan-out on read requires merging tweets from all followed accounts at read time, which is O(following) per request. The hybrid approach optimizes both extremes: pre-materialized timelines for the 99% of users with modest follower counts, and on-demand merging for the 1% of celebrities.
Choice
Redis sorted sets storing tweet IDs with timestamp scores
Rationale
Storing only tweet IDs (not full tweet objects) in the timeline cache reduces memory by 10-50x, since tweet content is fetched separately and can leverage a shared tweet cache. Redis sorted sets enable efficient insertion (ZADD), trimming to a fixed size (ZREMRANGEBYRANK), and retrieval (ZRANGE). Each user's timeline is capped at the most recent 800 tweet IDs to bound memory.
Choice
Lightweight ML model (gradient-boosted trees) applied at request time
Rationale
Deep learning models provide marginally better ranking quality but require GPU inference, adding 50-100ms of latency. Gradient-boosted trees run on CPU in <10ms per request with a feature set of ~200 signals. The ranking quality difference is measurable but small relative to the latency and infrastructure cost savings. The model is retrained daily on engagement data.
Choice
10,000 followers as the fan-out strategy boundary
Rationale
The threshold is determined empirically by the point where fan-out on write cost exceeds fan-out on read cost for the average request. Below 10K followers, the write amplification is manageable and the read speedup is significant. Above 10K, the write cost per tweet grows proportionally while each individual follower's timeline read only saves one merge operation. The threshold is tunable per deployment.
Choice
Sharded MySQL with tweet ID as shard key
Rationale
Tweets are small, append-only objects with a simple schema (author, content, media refs, metadata). MySQL provides strong consistency for individual tweets, efficient point lookups by ID, and mature tooling for online schema changes and backups. Sharding by tweet ID distributes write load evenly across shards. The Tweet Store is accessed exclusively by ID, so the simple shard key works perfectly.
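As a rough sketch of shard routing by tweet ID, the function below maps an ID to a shard with a plain modulo. The shard count of 64 and the name `shardFor` are assumptions for illustration; a real deployment might hash the ID first or use a lookup table to allow resharding.

```javascript
const NUM_SHARDS = 64; // hypothetical shard count

// Tweet IDs are 8-byte integers, which can exceed Number.MAX_SAFE_INTEGER,
// so they are handled as BigInt (accepts a BigInt or a decimal string).
function shardFor(tweetId, numShards = NUM_SHARDS) {
  return Number(BigInt(tweetId) % BigInt(numShards));
}

console.log(shardFor("1234567890123456789")); // 21
console.log(shardFor(128n)); // 0
```

Because every access to the Tweet Store is a point lookup by ID, the router needs no secondary index: the ID alone determines which shard to query.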
| Metric | Target |
| --- | --- |
| Target RPS | 300K timeline reads/s |
| Latency (p99) | <500ms (timeline load) |
| Storage | ~50 TB/year (tweets + media) |
| Availability | 99.99% |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
Fan-out on write pre-computes each user's timeline at the moment a tweet is posted — the tweet ID is appended to every follower's cached timeline. This makes reads fast (just fetch the pre-built list) but writes expensive (proportional to follower count). Fan-out on read does no work at write time; instead, when a user loads their feed, the system fetches recent tweets from all accounts they follow and merges them. Reads are expensive but writes are O(1). Most production systems use a hybrid approach.
Twitter uses a hybrid fan-out strategy. For regular users (under ~10K followers), tweets are fanned out on write to each follower's timeline cache. For celebrities with millions of followers, fan-out on write would generate millions of cache writes per tweet. Instead, celebrity tweets are stored in a separate index. When a user loads their timeline, recent celebrity tweets are merged in at read time. This bounds the maximum write amplification while maintaining fast reads.
The ranking model scores each candidate tweet by predicting engagement probability (will the user like, retweet, reply, or spend time reading?). Features include: the user's historical interaction with the author, tweet recency, content type, number of engagements from other users, trending topic relevance, and the user's past engagement patterns by content type. The model is typically a gradient-boosted decision tree trained on engagement logs, retrained daily to adapt to shifting user preferences.
Assuming each user's timeline cache stores 800 tweet IDs (8 bytes each) plus sorted set overhead (~50 bytes per entry), each timeline costs approximately 46KB. For 400M users, this totals ~18TB of Redis memory. In practice, only active users (users who logged in recently) need cached timelines — perhaps 100M daily active users — reducing the requirement to ~4.6TB, which is feasible with a Redis Cluster of 50-100 instances.
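The sizing arithmetic above can be reproduced in a few lines. The constants come from the paragraph; the function names are illustrative, and terabytes are taken as 10^12 bytes.

```javascript
const IDS_PER_TIMELINE = 800;
const BYTES_PER_ID = 8;
const OVERHEAD_PER_ENTRY = 50; // approximate per-entry overhead from the text

// Bytes needed to cache one user's timeline.
function timelineBytes() {
  return IDS_PER_TIMELINE * (BYTES_PER_ID + OVERHEAD_PER_ENTRY);
}

// Total cache size in terabytes (10^12 bytes) for a given number of users.
function cacheTerabytes(users) {
  return (users * timelineBytes()) / 1e12;
}

console.log(timelineBytes());        // 46400 bytes, about 46KB
console.log(cacheTerabytes(400e6));  // 18.56 (all monthly active users)
console.log(cacheTerabytes(100e6));  // 4.64 (daily active users only)
```

Spread over 50 to 100 instances, 4.64TB works out to roughly 46-93GB of timeline data per instance, before replication.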
For users with an active app session, new tweets from followed accounts can be pushed via a WebSocket or Server-Sent Events (SSE) connection. The fan-out service publishes new tweet events to a pub/sub system (Redis Pub/Sub or Kafka), and a real-time delivery service subscribes to channels corresponding to online users. A 'new tweets available' notification appears at the top of the timeline rather than inserting tweets directly, avoiding disorienting scroll jumps.
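The "new tweets available" behaviour can be sketched as a small per-session buffer. This is an in-memory illustration only: the class `PendingTweets` and its method names are hypothetical, and the pub/sub transport and WebSocket/SSE delivery are left out.

```javascript
// Hypothetical buffer for tweets that arrive while a user is browsing.
class PendingTweets {
  constructor() {
    this.byUser = new Map(); // userId -> tweet IDs not yet shown
  }

  // Called by the delivery service when a pub/sub event arrives for an online user.
  onNewTweet(userId, tweetId) {
    if (!this.byUser.has(userId)) this.byUser.set(userId, []);
    this.byUser.get(userId).push(tweetId);
  }

  // Text for the banner at the top of the timeline, or null when nothing is pending.
  bannerFor(userId) {
    const count = (this.byUser.get(userId) ?? []).length;
    if (count === 0) return null;
    return count === 1 ? "1 new tweet available" : `${count} new tweets available`;
  }

  // Called when the user taps the banner: hand over the buffered IDs and reset.
  flush(userId) {
    const ids = this.byUser.get(userId) ?? [];
    this.byUser.delete(userId);
    return ids;
  }
}

const pending = new PendingTweets();
pending.onNewTweet("u1", "t100");
pending.onNewTweet("u1", "t101");
console.log(pending.bannerFor("u1")); // 2 new tweets available
console.log(pending.flush("u1"));     // [ 't100', 't101' ]
console.log(pending.bannerFor("u1")); // null
```

Buffering instead of inserting means the visible list only changes when the user asks for it, which is what prevents the scroll position from jumping.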
The threshold is the break-even point where fan-out on write cost equals fan-out on read cost. For a user with 500 followers, fan-out on write costs 500 Redis LPUSH operations (approximately 50ms) per tweet but makes reads free. For a celebrity with 10M followers, fan-out on write would cost 10M writes per tweet, taking minutes and consuming enormous Redis throughput. Fan-out on read for celebrities adds roughly 5-10ms per timeline read to merge celebrity tweets. At 10K followers, the write cost per tweet (approximately 1 second of Redis time) starts exceeding the amortized read savings. The exact threshold is tuned empirically per deployment based on tweet frequency and read patterns.
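The write-cost side of this trade-off is simple enough to compute directly. The sketch below assumes roughly 0.1 ms of Redis time per write, as in the figures above; the function names are illustrative, and the read-side cost is not modelled.

```javascript
// Assumption from the text: one Redis write takes roughly 0.1 ms (100 microseconds).
const MICROS_PER_WRITE = 100;
const THRESHOLD = 10_000; // tunable per deployment

// Total Redis time spent on write-time fan-out for one tweet, in milliseconds.
function fanOutWriteMs(followers) {
  return (followers * MICROS_PER_WRITE) / 1000;
}

// Accounts at or above the threshold are served by fan-out on read.
function strategyFor(followers, threshold = THRESHOLD) {
  return followers < threshold ? "fan-out on write" : "fan-out on read";
}

for (const followers of [500, 10_000, 10_000_000]) {
  const seconds = fanOutWriteMs(followers) / 1000;
  console.log(`${followers} followers: ${seconds}s of writes -> ${strategyFor(followers)}`);
}
```

This gives 0.05 s for 500 followers, 1 s at the 10K threshold, and 1,000 s (about 17 minutes) for 10M followers, matching the orders of magnitude quoted above.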
Write amplification is the ratio of total writes generated to original writes. If the average user has 300 followers, each tweet generates 300 fan-out writes, an amplification of 300x. With 100M active users posting 2 tweets/day each (200M tweets/day), total fan-out writes are 200M × 300 = 60 billion writes/day, or roughly 700K writes/s. Each write is a Redis LPUSH of an 8-byte tweet ID plus per-entry overhead, consuming approximately 50 bytes, so total fan-out bandwidth is about 35 MB/s, well within a Redis cluster's capacity. Routing celebrity tweets to fan-out on read reduces this by approximately 30%, since celebrity accounts would otherwise account for a disproportionate share of fan-out writes.
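The throughput estimate can be checked with a short calculation. All inputs are the figures quoted in the paragraph; the variable names are illustrative.

```javascript
const TWEETS_PER_DAY = 200e6;   // 100M active users x 2 tweets/day
const AVG_FOLLOWERS = 300;      // fan-out writes per tweet
const BYTES_PER_WRITE = 50;     // tweet ID plus per-entry overhead
const SECONDS_PER_DAY = 86_400;

const writesPerDay = TWEETS_PER_DAY * AVG_FOLLOWERS;
const writesPerSecond = writesPerDay / SECONDS_PER_DAY;
const megabytesPerSecond = (writesPerSecond * BYTES_PER_WRITE) / 1e6;

console.log(writesPerDay);                   // 60000000000
console.log(Math.round(writesPerSecond));    // 694444
console.log(megabytesPerSecond.toFixed(1));  // 34.7
```

These are daily averages. Peak traffic would be higher, so capacity planning would need headroom above the roughly 700K writes/s figure.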
Cold start occurs when a new user or a user who has been inactive for weeks opens the app and their Redis timeline cache is empty or expired. The solution is a timeline reconstruction job that pulls recent tweets from all followed accounts, merges them, and populates the cache on demand, typically completing in 200-500ms. Stale timelines occur when fan-out workers fall behind during traffic spikes, causing some followers to miss recent tweets. The Timeline Mixer mitigates this by checking the timestamp of the most recent fan-out entry against the current time; if the gap exceeds a threshold (e.g., 5 minutes), it supplements with a direct query to the Tweet Store for tweets from followed accounts posted in the gap.
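The gap check described above might look like the following. It is a sketch under assumptions: the 5-minute threshold is the example value from the text, while the 7-day cold-start lookback, the function name `backfillWindow`, and the shape of the returned object are hypothetical.

```javascript
const STALE_THRESHOLD_MS = 5 * 60 * 1000;             // example threshold from the text
const COLD_START_LOOKBACK_MS = 7 * 24 * 60 * 60 * 1000; // hypothetical rebuild window

// Decide whether the Timeline Mixer should query the Tweet Store directly.
// lastFanOutTs is the timestamp (ms) of the newest cached entry, or null
// when the user's timeline cache is empty or expired.
function backfillWindow(lastFanOutTs, now) {
  if (lastFanOutTs == null) {
    // Cold start: rebuild from followed accounts' recent tweets.
    return { reason: "cold-start", from: now - COLD_START_LOOKBACK_MS, to: now };
  }
  if (now - lastFanOutTs > STALE_THRESHOLD_MS) {
    // Fan-out workers are behind: fetch only the tweets posted in the gap.
    return { reason: "stale", from: lastFanOutTs, to: now };
  }
  return null; // cache is fresh enough, no extra query needed
}

const now = Date.now();
console.log(backfillWindow(now - 60_000, now));          // null (1 minute old)
console.log(backfillWindow(now - 10 * 60_000, now).reason); // stale
console.log(backfillWindow(null, now).reason);           // cold-start
```

One limitation of this heuristic: a user who follows only quiet accounts will also show an old last entry, so the supplemental query may often return nothing. That costs a wasted lookup but does not affect correctness.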