Hard · 11 components · Interview: Medium

Multiplayer Game — Authoritative Server Architecture

Design a competitive multiplayer game backend supporting 1M concurrent players and 100K active matches with skill-based matchmaking and server-authoritative simulation at 60 Hz.

Real-Time, Matchmaking, Game Server, UDP

Problem Statement

Multiplayer game server architecture is a fascinating system design interview problem because it combines real-time networking constraints, stateful server management, and complex matchmaking algorithms into a challenge unlike typical web service designs. The core requirement is supporting millions of concurrent players who need to be grouped by skill level, assigned to dedicated game servers, and provided with a fair, lag-free competitive experience where no client can cheat.

Popular titles such as Fortnite, Valorant, and Apex Legends operate at the scale of a million or more concurrent players. This template targets 1 million concurrent players across 100,000 simultaneous matches. Each match runs on a dedicated server process executing an authoritative game simulation at 60 ticks per second (the rate assumed throughout this template; real titles vary), meaning the server processes all player inputs, validates them for anti-cheat, and broadcasts the canonical game state 60 times every second. The server is the single source of truth; clients only render what the server tells them. This blocks state-manipulation cheats such as teleportation and invulnerability, which plague client-authoritative architectures. Cheats that generate valid-looking inputs, such as aim-bots, are not stopped by server authority alone and need separate detection.

The matchmaking system must group players of similar skill (within an MMR range) in under 30 seconds while considering geographic proximity to minimize network latency. Server allocation must provision a dedicated container for each match within seconds of formation. After each match concludes, the results must flow through an asynchronous pipeline to update player progression (XP, rankings, unlocks) durably in a database without blocking the match-complete screen.

This template models the complete multiplayer backend: API gateway for lobby traffic, skill-based matchmaking with Redis sorted sets, Kafka-driven server allocation, dedicated authoritative game workers with UDP transport, and an asynchronous progression pipeline for post-match stat persistence.

Architecture Overview

The multiplayer game architecture separates lobby/matchmaking traffic (HTTP via API Gateway) from in-match traffic (UDP direct to game servers) to optimize for their fundamentally different latency requirements. The lobby path begins when a player queues for matchmaking via the API Gateway, which validates session tokens and routes through an ALB to the MatchService (10 pods, 100 threads each). MatchService writes the player to a Redis sorted set keyed by skill rating (MMR), enabling O(log N + M) range queries via ZRANGEBYSCORE to find players within a configurable MMR window.

When enough players at similar skill levels are queued, MatchService forms a match and publishes a match-created event to MatchStream (Kafka with 32 partitions). AllocatorWorker (20 instances) consumes these events and provisions a dedicated GameWorker container for each match. The worker assigns an IP and port, then updates SessionCache (Redis) with the server address so players can discover their assigned server via polling on GET /api/v1/matchmaking/status.

Players connect directly to the GameWorker via UDP, bypassing the API Gateway entirely. Each GameWorker instance runs the authoritative game simulation at 60 Hz: it receives player inputs (movement vectors, shoot commands), validates them server-side for anti-cheat, updates the canonical game state, and broadcasts state snapshots to all connected players. With 100K concurrent matches, each requiring 4 vCPU and 8 GB RAM, the GameWorker fleet is the most resource-intensive component at 400K total vCPUs.
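The fleet total above is a straightforward multiplication. The short check below uses only the per-match figures quoted in this section; the function and variable names are illustrative, and the aggregate RAM figure is derived here rather than stated in the template.

```python
# Back-of-the-envelope fleet sizing from the figures quoted above.
CONCURRENT_MATCHES = 100_000
VCPU_PER_MATCH = 4
RAM_GB_PER_MATCH = 8


def fleet_totals(matches: int, vcpu_per_match: int, ram_gb_per_match: int) -> dict:
    """Return the aggregate vCPU and RAM needed to host `matches` at once."""
    return {
        "vcpus": matches * vcpu_per_match,
        "ram_tb": matches * ram_gb_per_match / 1000,  # decimal terabytes
    }


totals = fleet_totals(CONCURRENT_MATCHES, VCPU_PER_MATCH, RAM_GB_PER_MATCH)
print(totals)  # {'vcpus': 400000, 'ram_tb': 800.0}
```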

When a match ends, the GameWorker publishes results (per-player stats, kills, deaths, XP earned) to ResultStream (Kafka). ProgressionWorker (20 instances) consumes these events, recalculates each player's MMR with an Elo-style algorithm, and writes the results to PlayerDB (PostgreSQL with 32 partitions, 3 replicas). This asynchronous flow means players see the match-complete screen immediately while progression updates settle in the background within approximately 5 seconds.
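The template does not specify the rating formula, so the following is only a sketch of the classic two-sided Elo update. The K-factor of 32 and the idea of rating a player against the opposing team's average MMR are assumptions made for illustration, not details from this design.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Probability-like expected result for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating: float, opponent_rating: float, score: float,
                   k: float = 32.0) -> float:
    """Classic Elo update. `score` is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.

    `k` (assumed 32 here) controls how far a single match can move the rating.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


# Evenly matched sides: the winner gains exactly what the loser gives up.
print(updated_rating(1500, 1500, 1.0))  # 1516.0
print(updated_rating(1500, 1500, 0.0))  # 1484.0
```

For a 5v5 match, one simple option is to pass the opposing team's mean MMR as `opponent_rating`; production rating systems usually add further adjustments.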

Key Design Decisions

Game Server Model

Choice

Dedicated authoritative process per match on ECS Fargate

Rationale

A dedicated process per match provides CPU isolation, so one match's expensive physics simulation (for example, 100 projectiles in flight) cannot cause missed ticks in adjacent matches. At 60 ticks per second with 10 players, each match requires approximately 2 CPU cores for deterministic simulation; the 4 vCPU per-match allocation cited in the architecture overview sits above that figure. Shared server models risk noisy-neighbor effects that are unacceptable in competitive gaming, where a single late tick can determine the outcome of a firefight.

Trust Model

Choice

Server-authoritative simulation with client input validation

Rationale

In competitive games, the server must be the single source of truth for game state. If clients compute their own state, cheaters can modify their client to teleport or become invulnerable. Server-authoritative means the server simulates everything based on raw player inputs (movement vector, shoot command), and clients only render the server's state. Cheats that feed the server plausible inputs, such as aim-bots, are not prevented by this model alone and need separate detection. This is the industry standard for competitive FPS and battle-royale titles like Valorant and Fortnite.
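As a minimal sketch of what validating a raw movement input can look like: the server accepts a direction, never a position, and caps it before applying it. `MAX_SPEED`, the 2D `Position` type, and `apply_move` are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass

MAX_SPEED = 7.0        # hypothetical movement cap, in world units per second
TICK_SECONDS = 1 / 60  # one tick at the template's 60 Hz


@dataclass
class Position:
    x: float
    y: float


def apply_move(pos: Position, move_x: float, move_y: float) -> Position:
    """Advance a player by one tick from a client-supplied movement vector."""
    # Reject NaN/inf outright by treating the input as "no movement".
    if not (math.isfinite(move_x) and math.isfinite(move_y)):
        move_x = move_y = 0.0
    # Oversized vectors are scaled to unit length instead of being trusted,
    # so a modified client cannot move faster than MAX_SPEED.
    length = math.hypot(move_x, move_y)
    if length > 1.0:
        move_x, move_y = move_x / length, move_y / length
    step = MAX_SPEED * TICK_SECONDS
    return Position(pos.x + move_x * step, pos.y + move_y * step)
```

A client claiming a movement vector of (1000, 0) ends up moving exactly as far as one sending (1, 0), which is the point of simulating from inputs rather than accepting client-reported positions.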

Matchmaking Data Structure

Choice

Redis sorted set with ZRANGEBYSCORE for skill-based grouping

Rationale

Matchmaking requires finding players within a narrow skill rating range quickly. Redis ZRANGEBYSCORE finds all players within an MMR window in O(log N + M) time, completing in approximately 2ms even with 100K queued players. The sorted set naturally orders by skill, making range queries trivial. Entries from players who abandon the queue without explicit cancellation still have to be cleaned up to prevent ghost entries from accumulating. Redis expires whole keys, not individual sorted-set members, so this takes an explicit step, such as recording each enqueue time and periodically pruning entries older than a cutoff.
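To make the range-query idea concrete without requiring a Redis server, here is an in-memory stand-in that keeps (MMR, player) pairs sorted and uses binary search for the window bounds. `SkillQueue` and its methods are hypothetical; they mirror the behaviour of a score-range lookup but are not the Redis API, and integer MMR values are assumed.

```python
import bisect


class SkillQueue:
    """In-memory stand-in for a sorted set keyed by MMR (illustration only)."""

    def __init__(self) -> None:
        # Kept sorted by (mmr, player_id) at all times.
        self._entries: list[tuple[int, str]] = []

    def add(self, player_id: str, mmr: int) -> None:
        bisect.insort(self._entries, (mmr, player_id))

    def range_by_score(self, lo: int, hi: int) -> list[str]:
        """Return player ids with lo <= mmr <= hi, ordered by MMR."""
        # "" sorts before every player id, so these bounds land exactly on
        # the first entry with mmr >= lo and the first with mmr >= hi + 1.
        start = bisect.bisect_left(self._entries, (lo, ""))
        end = bisect.bisect_left(self._entries, (hi + 1, ""))
        return [player_id for _, player_id in self._entries[start:end]]


queue = SkillQueue()
for player_id, mmr in [("a", 1400), ("b", 1500), ("c", 1650), ("d", 1700), ("e", 1900)]:
    queue.add(player_id, mmr)

print(queue.range_by_score(1500, 1700))  # ['b', 'c', 'd']
```

The two binary searches give the O(log N) part of the cost and the slice gives the M part, which is the same shape as the complexity quoted above.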

Post-Match Progression

Choice

Asynchronous updates via Kafka and dedicated ProgressionWorker

Rationale

Database writes for XP, ranking recalculation, and stats persistence take 50-100ms with strong consistency. If the match-end flow waited synchronously for these writes, the match-complete screen would lag noticeably. Asynchronous processing via Kafka decouples the player experience from database write latency. Players see their results immediately while progression updates are eventually consistent within approximately 5 seconds. The Kafka topic also provides durability: if ProgressionWorker crashes, results are replayed on restart without data loss.
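Replay on restart implies that the same result event can be delivered more than once, so the consumer needs to be idempotent to avoid crediting XP twice. The sketch below shows one way to do that with a processed-key check; `ProgressionStore` is a hypothetical in-memory stand-in for PlayerDB, and in a real database the marker and the XP update would need to be committed in the same transaction.

```python
class ProgressionStore:
    """In-memory stand-in for PlayerDB, for illustration only."""

    def __init__(self) -> None:
        self.xp: dict[str, int] = {}
        # (match_id, player_id) pairs whose results were already applied.
        self.applied: set[tuple[str, str]] = set()

    def apply_result(self, match_id: str, player_id: str, xp_earned: int) -> bool:
        """Apply one match result; return False if it was a duplicate delivery."""
        key = (match_id, player_id)
        if key in self.applied:
            return False  # replayed event: skip so XP is not double-counted
        self.applied.add(key)
        self.xp[player_id] = self.xp.get(player_id, 0) + xp_earned
        return True


store = ProgressionStore()
store.apply_result("match-1", "alice", 120)
store.apply_result("match-1", "alice", 120)  # redelivered after a crash
print(store.xp["alice"])  # 120
```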

Scale & Performance

Target RPS: 60K lobby requests/s, 100K concurrent matches

Latency (p99): <50ms (in-game tick), <30s (matchmaking)

Storage: ~200 GB/year (player profiles + match history)

Availability: 99.95%

This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.

Frequently Asked Questions

Why use dedicated game servers instead of peer-to-peer networking?

Peer-to-peer networking is unsuitable for competitive multiplayer games for three reasons. First, there is no authoritative state, so each client can compute a different game state, leading to desync issues. Second, one peer must be elected as host, giving that player a significant latency advantage. Third, cheating is trivial because every client has full access to the game state including hidden information like enemy positions. Dedicated servers eliminate all three problems by centralizing simulation on a trusted process that all clients connect to equally.

How does skill-based matchmaking work at scale with 1M concurrent players?

Players enter a matchmaking queue with their MMR (skill rating). The MatchService uses Redis sorted sets to group players by skill. It periodically scans for clusters of players within a narrow MMR window (e.g., 200 points). If enough players are found for a match (e.g., 10 players for a 5v5), a match is formed. If a player's queue time exceeds a threshold, the MMR window gradually widens, trading some match quality for a shorter wait. Geographic affinity is also considered, preferring players in the same region to minimize network latency. This adaptive widening is intended to keep median queue times under 30 seconds.
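The widening rule can be sketched as a function of time spent in the queue. Only the 200-point starting window comes from the text; the 50-point step, the 5-second interval, the 800-point cap, and the reading of the window as a total span centred on the player's MMR are all assumptions made for this example.

```python
def mmr_window(seconds_waiting: float, base: int = 200, step: int = 50,
               step_every: float = 5.0, cap: int = 800) -> int:
    """Total MMR span to search, growing by `step` every `step_every` seconds."""
    widenings = int(seconds_waiting // step_every)
    return min(base + widenings * step, cap)


def mmr_bounds(mmr: int, seconds_waiting: float) -> tuple[int, int]:
    """Inclusive (low, high) MMR range acceptable for this player right now."""
    half = mmr_window(seconds_waiting) // 2
    return mmr - half, mmr + half


print(mmr_bounds(1500, 0))   # (1400, 1600)
print(mmr_bounds(1500, 12))  # (1350, 1650)
```

The cap matters: without it, a player in a sparsely populated skill bracket would eventually be matched against anyone, which defeats the purpose of skill-based matching.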

How do you handle server provisioning latency when allocating game servers?

Provisioning a new container takes 2-5 seconds, which adds to the total time between match formation and gameplay start. Production systems mitigate this with a pre-warmed server pool: a fleet of idle game server containers is kept running and ready to accept matches instantly. AllocatorWorker assigns a match to an idle server from the pool (sub-second), and a background process replenishes the pool asynchronously. The pool size is dynamically adjusted based on time-of-day traffic patterns, scaling up before evening peak hours when the match creation rate is highest.
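A pre-warmed pool reduces to a queue of ready servers plus a refill step. The class below is a simplified, single-threaded sketch: `WarmPool` and the `provision` callable are hypothetical, and a real allocator would also need locking, health checks, and the time-of-day resizing described above.

```python
from collections import deque
from typing import Callable


class WarmPool:
    """Keeps `target_size` idle servers ready so allocation skips provisioning."""

    def __init__(self, target_size: int, provision: Callable[[], str]) -> None:
        self._provision = provision  # slow call that starts a server, returns its address
        self._target = target_size
        self._idle = deque(provision() for _ in range(target_size))

    def allocate(self) -> str:
        """Fast path: hand out an idle server. Slow path: provision on demand."""
        if self._idle:
            return self._idle.popleft()
        return self._provision()

    def replenish(self) -> int:
        """Top the pool back up to its target; intended to run in the background."""
        added = 0
        while len(self._idle) < self._target:
            self._idle.append(self._provision())
            added += 1
        return added

    def idle_count(self) -> int:
        return len(self._idle)
```

Raising or lowering the target before a known traffic peak is then just a matter of changing `_target` and letting `replenish` catch up.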

What is the difference between tick rate and frame rate in multiplayer games?

Tick rate is the frequency at which the server updates the authoritative game state, typically 60 Hz (60 updates per second) for competitive shooters. Frame rate is how often the client renders a visual frame, typically 60-240 frames per second depending on hardware. The server tick rate determines the simulation granularity: at 60 Hz, each tick represents approximately 16.7 milliseconds of game time. Higher tick rates improve hit detection accuracy and reduce the feeling of lag but increase CPU and bandwidth costs roughly in proportion. Most competitive FPS games use 60-128 Hz tick rates as a balance between fidelity and cost.
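The arithmetic behind these figures, plus the usual way of keeping simulation ticks independent of how much wall-clock time has passed, can be sketched as follows. The function names are illustrative, and the accumulator approach shown is a common fixed-timestep pattern rather than something this template prescribes.

```python
def tick_interval_ms(tick_rate_hz: int) -> float:
    """Milliseconds of game time represented by one server tick."""
    return 1000.0 / tick_rate_hz


def ticks_due(elapsed_ms: float, tick_rate_hz: int,
              carry_ms: float = 0.0) -> tuple[int, float]:
    """Fixed-timestep accounting.

    Given the wall-clock time elapsed since the last call (plus any leftover
    carried from before), return how many whole ticks to simulate and the
    leftover time to carry into the next call.
    """
    interval = tick_interval_ms(tick_rate_hz)
    total = carry_ms + elapsed_ms
    count = int(total // interval)
    return count, total - count * interval


print(round(tick_interval_ms(60), 1))   # 16.7
print(round(tick_interval_ms(128), 2))  # 7.81
```

Because the number of ticks depends only on elapsed time, the simulation advances at the same rate whether the loop is called often or rarely, which is what lets a 240 fps client and a 60 fps client share one 60 Hz server.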

How does the system handle player disconnections during an active match?

The GameWorker monitors each player's UDP heartbeat. If no packets are received for a configurable timeout (typically 10-15 seconds), the player is marked as disconnected. The game continues with the remaining players, and the disconnected player's character may be removed or controlled by AI depending on the game mode. If the player reconnects within a grace period, they rejoin the same match by reconnecting to the same GameWorker IP and port (stored in SessionCache). Match results are still recorded for the disconnected player, based on their performance up to the disconnection, so quitting intentionally does not let a player avoid a loss.
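The timeout and grace-period logic amounts to comparing the time since a player's last packet against two thresholds. In this sketch, `HeartbeatMonitor` is a hypothetical helper, the 10-second timeout is the low end of the range quoted above, and the 120-second grace period is an assumed value since the text does not give one. Time is passed in explicitly so the logic is easy to test.

```python
class HeartbeatMonitor:
    """Tracks the last packet time per player and classifies connection state."""

    def __init__(self, timeout_s: float = 10.0, grace_s: float = 120.0) -> None:
        self.timeout_s = timeout_s  # silence before a player counts as disconnected
        self.grace_s = grace_s      # assumed extra window in which rejoining is allowed
        self._last_seen: dict[str, float] = {}

    def packet_received(self, player_id: str, now: float) -> None:
        """Any packet from the player (not only heartbeats) resets the timer."""
        self._last_seen[player_id] = now

    def status(self, player_id: str, now: float) -> str:
        silent_for = now - self._last_seen[player_id]
        if silent_for < self.timeout_s:
            return "connected"
        if silent_for < self.timeout_s + self.grace_s:
            return "disconnected"  # may still rejoin the same match
        return "abandoned"         # grace period over


monitor = HeartbeatMonitor()
monitor.packet_received("alice", now=0.0)
print(monitor.status("alice", now=5.0))   # connected
print(monitor.status("alice", now=12.0))  # disconnected
```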
