Industry-standard multiplayer architecture: skill-based matchmaking via Redis sorted sets, dedicated server-authoritative game processes per match at 60 Hz tick rate, async progression updates via Kafka. Anti-cheat by design — no client is trusted.
The dedicated authoritative server architecture is the industry-standard approach for competitive multiplayer games in production, including Valorant, Fortnite, Rocket League, Overwatch, and Counter-Strike. It solves the three fundamental problems that make client-hosted P2P unworkable for competitive play: host advantage, cheating, and host disconnection.
The key architectural shift from P2P to authoritative is moving the game simulation from a player's machine to a neutral dedicated server. In the authoritative model, no client runs the game simulation. Instead, a dedicated GameWorker process (one per match) runs the authoritative game loop at 60 ticks per second. Clients send only their inputs (movement vector, shoot command, ability activation) to the server. The server validates every input against the game rules, which is anti-cheat by design rather than by detection. It then advances the game state and broadcasts state snapshots to all connected clients. Clients are pure renderers: they display whatever the server tells them.
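A minimal sketch of one authoritative tick may make the input-validation idea concrete. The names `PlayerState`, `MoveInput`, `validate`, and `step`, as well as the `MAX_SPEED` value and the rule that a movement vector may not exceed unit length, are assumptions for illustration and do not come from the design above.

```python
import math
from dataclasses import dataclass

TICK_RATE_HZ = 60            # tick rate stated in the text
DT = 1.0 / TICK_RATE_HZ      # seconds simulated per tick
MAX_SPEED = 5.0              # assumed world units per second, illustrative only


@dataclass
class PlayerState:
    x: float = 0.0
    y: float = 0.0


@dataclass
class MoveInput:
    player_id: str
    dx: float  # desired direction, supplied by the client and untrusted
    dy: float


def validate(inp: MoveInput) -> bool:
    """Reject non-finite vectors and vectors asking for more than full speed."""
    if not (math.isfinite(inp.dx) and math.isfinite(inp.dy)):
        return False
    return math.hypot(inp.dx, inp.dy) <= 1.0 + 1e-6


def step(world: dict, inputs: list) -> dict:
    """Advance the world by one tick and return the snapshot to broadcast."""
    for inp in inputs:
        player = world.get(inp.player_id)
        if player is None or not validate(inp):
            continue  # invalid input is dropped; the client cannot force state
        player.x += inp.dx * MAX_SPEED * DT
        player.y += inp.dy * MAX_SPEED * DT
    return {pid: (p.x, p.y) for pid, p in world.items()}
```

In this sketch a client that sends a movement vector ten times longer than allowed simply does not move that tick, because the server never applies the input.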
This eliminates the host advantage because every player connects to the same neutral server and none of them runs the simulation locally. A player in the same datacenter might have 5ms latency while a remote player has 80ms, but no player has the unfair 0ms advantage that the P2P host enjoys. The remaining latency difference is a function of geography, not architecture.
The matchmaking system uses Redis sorted sets keyed by skill rating (MMR) to group players of similar ability. ZRANGEBYSCORE finds players within an MMR range in O(log N + M) time — approximately 2ms for 100K queued players. When enough players at similar skill levels accumulate, MatchService forms a match and publishes a match-created event to Kafka. AllocatorWorker consumes the event and provisions a dedicated GameWorker container (4 vCPU, 8 GB) for the match. The Kafka decoupling is essential because server provisioning takes 2-5 seconds — if MatchService waited synchronously, the matchmaking endpoint would time out.
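The queue-and-range-query flow can be sketched without a Redis server. `SortedQueue` below is an in-memory stand-in that mirrors only the three sorted-set operations named in this design (ZADD, ZRANGEBYSCORE, ZREM); it is not a Redis client. The `try_form_match` helper, the +/- 200 window, the match size of 10, and the policy of taking the lowest-rated candidates first are all assumptions.

```python
import bisect


class SortedQueue:
    """In-memory stand-in for a Redis sorted set keyed by MMR."""

    def __init__(self):
        self._entries = []  # sorted list of (mmr, player_id)
        self._scores = {}   # player_id -> mmr

    def zadd(self, player_id, mmr):
        if player_id in self._scores:
            self.zrem(player_id)  # re-adding updates the score
        bisect.insort(self._entries, (mmr, player_id))
        self._scores[player_id] = mmr

    def zrangebyscore(self, lo, hi):
        start = bisect.bisect_left(self._entries, (lo, ""))
        end = bisect.bisect_right(self._entries, (hi, chr(0x10FFFF)))
        return [pid for _, pid in self._entries[start:end]]

    def zrem(self, *player_ids):
        """Remove members and return how many were actually present."""
        removed = 0
        for pid in player_ids:
            mmr = self._scores.pop(pid, None)
            if mmr is not None:
                self._entries.remove((mmr, pid))
                removed += 1
        return removed


def try_form_match(queue, player_id, mmr, window=200, size=10):
    """Queue a player, then claim `size` players within the MMR window."""
    queue.zadd(player_id, mmr)
    candidates = queue.zrangebyscore(mmr - window, mmr + window)
    if len(candidates) < size:
        return None  # not enough players yet; the caller keeps polling
    picked = candidates[:size]
    # Against real Redis with several matchers, a removed count below `size`
    # would mean another matcher claimed a player first. This sketch is
    # single-threaded, so the count always equals `size`.
    queue.zrem(*picked)
    return picked
```

A production version would likely run the range query and the removal as one atomic step (for example inside a Lua script) so that two matchers cannot claim the same player.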
After the match completes, GameWorker publishes results (per-player stats, XP earned, win/loss) to a separate ResultStream (Kafka). ProgressionWorker consumes results and updates player XP, rankings, and MMR in PlayerDB (PostgreSQL). This async pipeline means the player sees the match-complete screen immediately while progression updates settle in the background within 5 seconds.
The primary limitation of this architecture is the absence of client-side prediction. Pure server-authoritative without prediction means players feel full network round-trip delay (50-100ms) between pressing a button and seeing the result on screen. For casual games this is acceptable, but for competitive FPS games, 80ms of input lag makes the game feel sluggish. The Global variant (v2) solves this with rollback netcode — client-side prediction with server reconciliation — which makes the game feel responsive at up to 150ms RTT.
The second limitation is single-region deployment. Cross-continent players experience 100-200ms ping, making competitive play unfair based on geography. The Global variant deploys game servers in 5+ regions and routes players to their nearest region. This is the architecture interviewers expect candidates to build toward after establishing the single-region authoritative baseline.
The authoritative multiplayer game system uses 10 components organized into three logical layers: matchmaking and lobby (Client, API Gateway, Load Balancer, MatchService, SessionCache), game simulation (AllocatorWorker, GameWorker), and progression (ResultStream, ProgressionWorker, PlayerDB). Two Kafka streams decouple match formation from server allocation and match completion from progression updates.
All lobby and profile traffic enters through the API Gateway, which validates session tokens (~3ms), enforces rate limits (80K RPS), and routes /api/v1/* traffic to MainLB. The API Gateway handles authentication as a cross-cutting concern — GameWorker-bound UDP traffic bypasses it entirely. MainLB distributes requests across 10 MatchService pods using round-robin (matchmaking is stateless, with queue state in Redis).
MatchService is the core orchestrator. It handles four endpoints: (1) POST /api/v1/matchmaking/queue — add player to the matchmaking queue in SessionCache (Redis sorted set by MMR), wait for enough players at similar skill levels, form a match, publish match-created event to MatchStream; (2) GET /api/v1/matchmaking/status — poll for match assignment by reading session state from Redis (QUEUED/MATCHED/IN_GAME with server address); (3) GET /api/v1/players/{id}/profile — read player stats from PlayerDB with Redis cache; (4) POST /api/v1/matches/{id}/result — receive match results from GameWorker, publish to ResultStream.
SessionCache (Redis, 6-node cluster) stores the matchmaking queue as a sorted set, active session state (player to match mapping), and match server assignments. The sorted set enables skill-based grouping with O(log N) inserts and O(log N + M) range queries. A TTL of 300 seconds auto-evicts abandoned queue entries.
The game server provisioning pipeline is fully async. MatchStream (Kafka, 32 partitions) carries match-created events from MatchService to AllocatorWorker. AllocatorWorker (20 instances) provisions a dedicated GameWorker container per match, assigns an IP/port, and writes the server address back to SessionCache. Players discover their server via polling GET /api/v1/matchmaking/status.
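The client side of that polling step might look like the sketch below. The response shape (a dict with `state` and an optional `server` field), the function name `wait_for_server`, and the default interval and timeout are assumptions; only the QUEUED/MATCHED/IN_GAME states come from the design. The HTTP call is injected as `fetch_status` so the loop itself stays independent of any particular client library.

```python
import time
from typing import Callable, Optional


def wait_for_server(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll the status endpoint until a game server address is assigned."""
    waited = 0.0
    while waited <= timeout_s:
        status = fetch_status()
        # A match can be formed before AllocatorWorker has written the
        # server address, so both the state and the address are checked.
        if status["state"] in ("MATCHED", "IN_GAME") and status.get("server"):
            return status["server"]
        sleep(interval_s)
        waited += interval_s
    return None  # gave up; the caller may re-queue the player
```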
GameWorker is the most resource-intensive component: one dedicated instance per active match running at 60 ticks per second. It receives player inputs via UDP, validates server-side (the core anti-cheat mechanism), simulates the game state, and broadcasts state snapshots. At 100K concurrent matches with 4 vCPU per match, the fleet requires 400K vCPUs — the dominant infrastructure cost. On match completion, GameWorker publishes results to ResultStream.
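At 60 ticks per second each tick has a budget of 1/60 s, about 16.67ms. A fixed-rate loop that tracks budget overruns could be sketched as follows; `run_ticks` and its injectable `clock` and `sleep` parameters are hypothetical names chosen so the loop can be exercised without real waiting.

```python
import time

TICK_RATE_HZ = 60
TICK_BUDGET_S = 1.0 / TICK_RATE_HZ  # about 16.67 ms per tick


def run_ticks(simulate_tick, num_ticks, clock=time.perf_counter, sleep=time.sleep):
    """Run `num_ticks` fixed-rate ticks; return how many missed their deadline."""
    overruns = 0
    next_deadline = clock() + TICK_BUDGET_S
    for tick in range(num_ticks):
        simulate_tick(tick)  # validate inputs, simulate, broadcast
        now = clock()
        if now > next_deadline:
            overruns += 1  # the tick finished after its deadline
        else:
            sleep(next_deadline - now)  # idle until the next tick boundary
        next_deadline += TICK_BUDGET_S
    return overruns
```

Deadlines are advanced by a fixed step rather than recomputed from the current time, so a single slow tick does not permanently shift the schedule.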
The progression pipeline processes match results asynchronously. ResultStream (Kafka, 32 partitions) carries match-ended events to ProgressionWorker (20 instances), which calculates XP, updates player levels and rankings in PlayerDB (PostgreSQL, 32 partitions, 3 replicas), and recalculates MMR using an Elo-style algorithm. Strong consistency on PlayerDB lets a retried update see that a result was already applied, which prevents double-awarding XP.
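One way to make a retried update safe is to record each applied (match, player) pair in the same transaction as the XP change. The sketch below uses SQLite from the Python standard library as a stand-in for PostgreSQL; the table names `players` and `applied_results` and the function `apply_result` are assumptions, not part of the design above.

```python
import sqlite3


def open_db():
    """Create an in-memory database with the two illustrative tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE players (
            player_id TEXT PRIMARY KEY,
            xp INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE applied_results (
            match_id TEXT NOT NULL,
            player_id TEXT NOT NULL,
            PRIMARY KEY (match_id, player_id)
        );
        """
    )
    return db


def apply_result(db, match_id, player_id, xp_earned):
    """Award XP at most once per (match_id, player_id); True if applied."""
    with db:  # one transaction: the marker and the XP change commit together
        cur = db.execute(
            "INSERT OR IGNORE INTO applied_results (match_id, player_id) "
            "VALUES (?, ?)",
            (match_id, player_id),
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery; XP was already awarded
        db.execute(
            "INSERT OR IGNORE INTO players (player_id) VALUES (?)",
            (player_id,),
        )
        db.execute(
            "UPDATE players SET xp = xp + ? WHERE player_id = ?",
            (xp_earned, player_id),
        )
    return True
```

If ProgressionWorker crashes after committing but before acknowledging the Kafka message, the redelivered event hits the primary-key conflict and is skipped.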
Choice
Dedicated GameWorker process runs the simulation at 60 Hz — clients only send inputs
Rationale
In competitive games, the server must be the single source of truth. If clients compute their own game state, cheaters modify their client to teleport, aimbot, or become invulnerable. Server-authoritative means the server simulates everything — clients send raw inputs (movement vector, shoot command) and render the server's authoritative state. This is anti-cheat by architecture, not by detection heuristics.
Choice
One GameWorker container (4 vCPU, 8 GB) per active match for CPU isolation
Rationale
A dedicated process per match prevents noisy-neighbor effects. At 60 ticks/sec with 10 players, each match needs approximately 2 CPU cores for deterministic simulation. Shared servers risk one match's complex game state (explosions, particle effects) causing frame drops in adjacent matches on the same host. The cost is high — 400K vCPUs for 100K concurrent matches — but isolation is critical for competitive integrity.
Choice
ZRANGEBYSCORE on sorted sets keyed by skill rating to find players within MMR range
Rationale
Matchmaking groups players by skill rating. Redis ZRANGEBYSCORE finds players within an MMR range in O(log N + M) — approximately 2ms for 100K queued players. The sorted set naturally orders by skill, making range queries trivial. TTL expiry removes players who abandon queue. This is simpler and faster than SQL-based matchmaking queries.
Choice
MatchService publishes match-created events to Kafka; AllocatorWorker consumes and provisions servers
Rationale
Server provisioning takes 2-5 seconds (container startup, health check). If MatchService waited synchronously, the matchmaking endpoint would time out. Kafka decouples match formation from server allocation. MatchService can form matches at 10K/sec while AllocatorWorker provisions at its own pace. Kafka also provides per-partition FIFO ordering and replay on worker failure.
Choice
Match results flow through Kafka to ProgressionWorker for background XP/ranking updates
Rationale
DB writes for XP, ranking updates, and MMR recalculation take 50-100ms with strong consistency. If the match-end flow waited for DB writes, the match-complete screen would lag. Async via Kafka means players see results immediately; progression updates settle within 5 seconds. The trade-off is eventual consistency — a player's profile may show stale stats for a few seconds after a match.
Choice
MatchStream for server allocation events, ResultStream for progression events
Rationale
Match-created events flow to AllocatorWorker for server provisioning. Match-ended events flow to ProgressionWorker for XP updates. Different consumers with different processing requirements and latency SLOs. Separate topics prevent slow progression processing from blocking time-sensitive server allocation.
Target RPS
60K RPS (lobby + profile) + 100K concurrent matches
Latency (p99)
50-100ms game latency (full RTT, no prediction)
Storage
~500 GB (player profiles, match history)
Availability
99.9% (replicated components, Kafka durability)
| Operation | Time | Space | Notes |
|---|---|---|---|
| Matchmaking queue (POST /api/v1/matchmaking/queue) | O(log N) ZADD to sorted set + O(log N + M) ZRANGEBYSCORE for match check | O(1) per queued player (~500 bytes in Redis) | N = queued players (~100K). M = players in MMR range (~10-50). Players are claimed with ZREM, whose return count shows whether all of them were still in the queue. |
| Game tick (server-side, 60 Hz) | O(P) per tick where P = players in match (process inputs, simulate, broadcast) | O(S) game state per match (~1-10MB) | 16.67ms budget per tick at 60 Hz. Input validation + physics + state broadcast must complete within budget. |
| Server allocation (AllocatorWorker) | O(1) container provisioning (2-5 seconds wall clock) | O(1) per match (4 vCPU, 8 GB per container) | Provisioning time dominates. Pre-warmed server pools reduce this to <500ms. |
| Progression update (ProgressionWorker) | O(P) per match result where P = players (UPDATE per player) | O(1) per update (~500 bytes per player row) | Elo-style MMR recalculation is O(P^2) but P is bounded at ~10-100 players per match. |
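
The O(P^2) cost of the MMR recalculation follows from comparing every player against every other player. A minimal pairwise sketch using the standard Elo expected-score formula is shown below; the K-factor of 32, the averaging over P - 1 opponents, and the function names are assumptions, since the design only says "Elo-style".

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_mmr(ratings, placements, k=32.0):
    """Return new ratings from pairwise comparisons of final placements.

    ratings:    player_id -> current MMR
    placements: player_id -> final placement (1 is best)
    """
    players = list(ratings)
    if len(players) < 2:
        return dict(ratings)  # nothing to compare against
    updated = {}
    for a in players:
        delta = 0.0
        for b in players:  # P * (P - 1) comparisons in total
            if a == b:
                continue
            if placements[a] < placements[b]:
                actual = 1.0   # a finished ahead of b
            elif placements[a] > placements[b]:
                actual = 0.0
            else:
                actual = 0.5   # tie
            delta += actual - expected_score(ratings[a], ratings[b])
        # Average over opponents so the step size does not grow with P.
        updated[a] = ratings[a] + k * delta / (len(players) - 1)
    return updated
```

Because each pair's actual and expected scores both sum to 1, the total MMR across the match is conserved in this sketch.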
PlayerDB holds persistent player profiles with progression data: XP, level, skill rating (MMR), match count, and win/loss record. It is written by ProgressionWorker after each match and read by MatchService for profile views and leaderboards. Strong consistency prevents double-awarding XP on retry. Partitioned by player_id across 32 shards.
Indexes: idx_players_skill ON (skill_rating) — for leaderboard queries, idx_players_username ON (username) — for profile lookup by name
32 partitions x 3 replicas. ~1M rows at 500 bytes each = ~500MB. Read-heavy (60% of traffic is profile/leaderboard queries).
SessionCache holds active player session state in Redis: matchmaking queue position, assigned match ID, and game server address. It is written by MatchService (on queue) and AllocatorWorker (on server allocation), and read by MatchService for status polling. A 300s TTL auto-evicts abandoned queue entries.
1M active sessions x 500 bytes = ~500MB working set. Redis sorted set for MMR-based range queries.
MatchStream carries the match-created events emitted by MatchService when enough players at similar skill levels have queued. AllocatorWorker consumes them to provision a dedicated GameWorker. Events are keyed by match_id across 32 partitions, so allocation order is FIFO within each partition. Expected 3K msg/sec at peak.
Key Schema
match_id: string (partition key for FIFO ordering)
Value Schema
{ match_id: string, players: string[] (player IDs), mode: string, avg_mmr: number }
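As a rough illustration of this schema, the sketch below builds a match-created event, serializes it to a key/value pair, and maps the key to one of the 32 partitions. JSON encoding and the names `MatchCreated`, `encode`, and `partition_for` are assumptions. The CRC32 hash is only a stand-in: Kafka's default partitioner hashes keys with murmur2, so real partition numbers would differ.

```python
import json
import zlib
from dataclasses import asdict, dataclass

NUM_PARTITIONS = 32  # MatchStream partition count from the text


@dataclass(frozen=True)
class MatchCreated:
    match_id: str
    players: list
    mode: str
    avg_mmr: float


def encode(event):
    """Return (key, value) bytes, with match_id as the partition key."""
    key = event.match_id.encode("utf-8")
    value = json.dumps(asdict(event)).encode("utf-8")
    return key, value


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition with a stand-in hash (CRC32, not murmur2)."""
    return zlib.crc32(key) % num_partitions
```

The property that matters is that the same key always maps to the same partition, which is what keeps events for one match_id in order.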
ResultStream carries the match-ended events emitted by GameWorker when a match completes. Each event contains per-player stats (kills, deaths, XP earned, placement) and is consumed by ProgressionWorker for XP and MMR updates. Partitioned by match_id. Expected 3K msg/sec at peak.
Key Schema
match_id: string (partition key)
Value Schema
{ match_id: string, duration_sec: number, results: Array<{ player_id: string, kills: number, deaths: number, xp_earned: number, placement: number }> }
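On the consumer side, ProgressionWorker would need to decode this value before applying updates. The sketch below parses a JSON-encoded event into typed records and applies a few sanity checks. JSON encoding, the names `PlayerResult` and `parse_match_ended`, and the specific checks are assumptions added for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayerResult:
    player_id: str
    kills: int
    deaths: int
    xp_earned: int
    placement: int


def parse_match_ended(raw):
    """Decode a match-ended value into (match_id, duration_sec, results)."""
    doc = json.loads(raw)
    results = [PlayerResult(**entry) for entry in doc["results"]]
    ids = [r.player_id for r in results]
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate player_id in results")
    if any(r.xp_earned < 0 or r.placement < 1 for r in results):
        raise ValueError("xp_earned must be >= 0 and placement >= 1")
    return doc["match_id"], doc["duration_sec"], results
```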
| Variant | Tier | Latency | Throughput | Cost | Complexity | Reliability |
|---|---|---|---|---|---|---|
| Naive (Client-Hosted P2P) | T1 | 0ms host / 50-150ms guests | ~8K RPS (lobby ops only) | $100/month (1 DB, 3 pods) | Low — 3 components, no game servers | ~99% backend; match dies if host disconnects |
| Authoritative (Dedicated Servers) | T2 | 50-100ms all players (equal) | 60K RPS lobby + 100K matches | $15,000/month (dedicated game servers) | Medium — 10 components, Kafka, Redis | 99.9% (replicated, no host dependency) |
| Global Platform (Multi-Region + Rollback) | T3 | <30ms regional + client-side prediction | 100K RPS + 150K matches globally | $50,000/month (multi-region fleet) | High — 10+ components, rollback netcode, 5 regions | 99.9% (multi-region, replay, anti-cheat) |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
In competitive games, money, rankings, and reputation are at stake. If any client computes game state locally, cheaters will modify their client to gain unfair advantages — wallhacks, aimbots, speed hacks, invulnerability. Server-authoritative architecture makes the server the single source of truth. Clients send only raw inputs (move left, shoot, jump) and the server validates every input against game rules before applying it. A client claiming to move at 10x normal speed is rejected server-side. This is anti-cheat by design — it prevents entire categories of cheats that client-side anti-cheat software can only detect after the fact.
Players are added to a Redis sorted set with their MMR (skill rating) as the score. When MatchService checks for matches, it uses ZRANGEBYSCORE to find all players within a configurable MMR range (e.g., +/- 200 MMR). If enough players are found (e.g., 10 for a 5v5), they are removed from the set with ZREM and grouped into a match. ZREM returns the number of members actually removed, so a matcher can detect when a concurrent matcher claimed one of the same players first. The sorted set provides O(log N) insertion and O(log N + M) range queries, fast enough for 100K queued players at 10K matchmaking checks/sec. Queue entries are evicted by the 300-second TTL if no match is found.
WebSocket push would eliminate the 2-3 second polling delay between match assignment and player notification. However, managing 1M persistent WebSocket connections requires significant infrastructure: connection state tracking, heartbeat management, reconnection handling, and horizontal scaling of stateful connections across pods. Polling is stateless: any MatchService pod can answer any poll request by reading from Redis. The polling delay is acceptable because matchmaking already takes 10-30 seconds. The engineering effort for sub-second notification is not justified when the total wait is several times longer.
GameWorker instances dominate infrastructure cost. Each active match requires a dedicated container with 4 vCPU and 8 GB RAM for deterministic 60 Hz simulation. At 100K concurrent matches, that is 400K vCPUs — approximately $300,000/month on-demand, or $100,000/month with reserved instances and spot capacity. Production games optimize this aggressively: right-sizing containers per game mode (1v1 needs less than 100-player battle royale), using spot instances for non-ranked matches, and pre-warming idle server pools to reduce provisioning latency.
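The fleet arithmetic can be checked with a short calculation. The match count, per-match resources, and monthly totals come from the text; the helper names are hypothetical, and the per-vCPU rate is deliberately left as an input rather than assumed.

```python
CONCURRENT_MATCHES = 100_000  # from the text
VCPU_PER_MATCH = 4            # from the text
RAM_GB_PER_MATCH = 8          # from the text


def fleet_size(matches=CONCURRENT_MATCHES, vcpu=VCPU_PER_MATCH, ram_gb=RAM_GB_PER_MATCH):
    """Return (total vCPUs, total RAM in GB) for the GameWorker fleet."""
    return matches * vcpu, matches * ram_gb


def monthly_cost(total_vcpus, usd_per_vcpu_month):
    """Monthly fleet cost for a given per-vCPU monthly rate."""
    return total_vcpus * usd_per_vcpu_month


def implied_unit_price(monthly_usd, total_vcpus):
    """Per-vCPU monthly rate implied by a quoted monthly total."""
    return monthly_usd / total_vcpus
```

With these figures the fleet is 400,000 vCPUs and 800,000 GB of RAM. The quoted totals of $300,000 and $100,000 per month imply $0.75 and $0.25 per vCPU-month respectively; substituting an actual provider rate into `monthly_cost` is a quick way to sanity-check the estimate for a given cloud.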
Two critical limitations: (1) No client-side prediction — players feel full RTT delay (50-100ms) between input and visual feedback. In competitive FPS games, 80ms of input lag makes the game feel sluggish. The Global variant adds rollback netcode where clients apply inputs locally for instant feedback and correct on server reconciliation. (2) Single-region deployment — cross-continent players get 100-200ms ping. The Global variant deploys game servers in 5+ regions and matches players to their nearest region, ensuring sub-30ms ping for most players.