Full production ride-hailing architecture with a formal ride state machine enforcing lifecycle invariants, outbox-pattern event publishing for at-least-once delivery with consumer-side deduplication (effectively-once processing), a payment saga with compensation, separate rider/driver API gateways, and an observability pipeline feeding surge pricing. Multi-region deployment with per-city sharding.
The global resilient approach represents the kind of architecture used by mature ride-hailing platforms operating at global scale, such as Uber, DiDi, and Grab. It addresses three fundamental limitations of the V1 geo-indexed variant: (1) no formal ride lifecycle management (ad-hoc status transitions allow invalid states), (2) no reliable event delivery (the dual-write problem between the database and Kafka can silently drop events), and (3) no payment reliability guarantees (failed payments require manual intervention).
The core architectural contribution is the ride state machine. Every ride is modeled as a finite state machine with 8 well-defined states on the happy path: REQUESTED -> MATCHING -> DRIVER_ASSIGNED -> EN_ROUTE -> IN_PROGRESS -> COMPLETED -> PAYMENT_PROCESSED -> RATED. A terminal CANCELLED state is also reachable from MATCHING and DRIVER_ASSIGNED. Each state transition has explicit preconditions: a ride cannot move from REQUESTED to COMPLETED without passing through DRIVER_ASSIGNED and IN_PROGRESS. The state machine validates every transition before persisting it, preventing data corruption from concurrent requests, network retries, or buggy client code. Invalid transitions return 409 Conflict with the current state and the list of allowed next states.
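The validation rule can be sketched as a plain lookup. This is a minimal in-memory illustration, not the service's actual code: the function name `validate_transition` and the response body shape are assumptions, and the two empty entries for terminal states are added so that every state has an entry in the table.

```python
from typing import Dict, List, Tuple

# Transition table for the lifecycle described above. The empty lists for
# RATED and CANCELLED are an assumption: they mark terminal states.
TRANSITIONS: Dict[str, List[str]] = {
    "REQUESTED": ["MATCHING"],
    "MATCHING": ["DRIVER_ASSIGNED", "CANCELLED"],
    "DRIVER_ASSIGNED": ["EN_ROUTE", "CANCELLED"],
    "EN_ROUTE": ["IN_PROGRESS"],
    "IN_PROGRESS": ["COMPLETED"],
    "COMPLETED": ["PAYMENT_PROCESSED"],
    "PAYMENT_PROCESSED": ["RATED"],
    "RATED": [],
    "CANCELLED": [],
}


def validate_transition(current: str, target: str) -> Tuple[int, dict]:
    """Return an HTTP-style (status, body) pair for a requested transition."""
    allowed = TRANSITIONS.get(current, [])
    if target in allowed:
        return 200, {"state": target}
    # 409 carries the current state and the allowed next states, as in the text.
    return 409, {"current_state": current, "allowed_next_states": allowed}


status, body = validate_transition("REQUESTED", "COMPLETED")
print(status, body)
```

Keeping the table as data rather than scattered `if` statements means the set of legal paths can be reviewed in one place.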
The outbox pattern solves the dual-write problem between PostgreSQL and Kafka. In the V1 variant, MatchService writes to PostgreSQL and then publishes to Kafka as separate operations. If the database write succeeds but the Kafka publish fails, the ride state is updated but no event is published — downstream consumers miss the transition. The outbox pattern writes both the state change and the event to the same database in a single ACID transaction. A separate relay process reads the outbox table and publishes events to Kafka. If the relay crashes, it restarts and processes from where it left off — no events are lost because they are persisted in the database alongside the state change.
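The mechanics can be tried locally with SQLite standing in for PostgreSQL and a Python list standing in for Kafka. This is a sketch under those assumptions: the helper names (`complete_ride`, `relay_once`), the `seq` ordering column, and the `crash_before_mark` switch are all hypothetical and exist only to show the single transaction and the restart behavior.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rides (ride_id TEXT PRIMARY KEY, current_state TEXT NOT NULL);
    CREATE TABLE outbox_events (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        event_id TEXT NOT NULL,
        aggregate_id TEXT NOT NULL,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")
conn.execute("INSERT INTO rides VALUES ('ride-1', 'IN_PROGRESS')")
conn.commit()


def complete_ride(ride_id: str, fare_cents: int) -> None:
    # `with conn` commits on success and rolls back on an exception, so the
    # state change and the outbox row are persisted together or not at all.
    with conn:
        conn.execute(
            "UPDATE rides SET current_state = 'COMPLETED' WHERE ride_id = ?",
            (ride_id,),
        )
        conn.execute(
            "INSERT INTO outbox_events (event_id, aggregate_id, event_type, payload) "
            "VALUES (?, ?, 'ride_completed', ?)",
            (str(uuid.uuid4()), ride_id, json.dumps({"fare_cents": fare_cents})),
        )


def relay_once(publish, crash_before_mark: bool = False) -> int:
    """Publish unpublished rows in insertion order, then mark each as published."""
    rows = conn.execute(
        "SELECT seq, event_id, event_type, payload FROM outbox_events "
        "WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_id, event_type, payload in rows:
        publish({"event_id": event_id, "event_type": event_type, "payload": payload})
        if crash_before_mark:
            return 0  # simulated crash: event sent, row still marked unpublished
        with conn:
            conn.execute("UPDATE outbox_events SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


kafka_stub = []  # stand-in for a Kafka producer
complete_ride("ride-1", 2450)
relay_once(kafka_stub.append, crash_before_mark=True)  # crash after publishing
relay_once(kafka_stub.append)                          # restart republishes the row
print(len(kafka_stub), "messages published for 1 state change")
```

The simulated crash shows why the pattern gives at-least-once rather than exactly-once delivery: nothing is lost, but the same `event_id` can reach Kafka twice, so consumers need to deduplicate.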
The payment saga orchestrates the charge-rider -> credit-driver -> update-ride flow with compensation on failure. When a ride completes, the Payment Saga Service consumes the ride_completed event and initiates a three-step saga: (1) charge the rider's payment method via the payment gateway, (2) credit the driver's earnings account, (3) transition the ride to PAYMENT_PROCESSED. If step 1 fails (card declined), the saga retries with exponential backoff (1s, 4s, 16s). After 3 failures, it escalates to a support queue with full context. If step 2 fails after step 1 succeeds (driver credit fails), the saga must compensate by refunding the rider — a critical correctness requirement.
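The control flow of the saga can be sketched with injected callables in place of a real payment gateway. Everything named here (`retry_with_backoff`, `run_payment_saga`, the returned status strings) is hypothetical. Two further assumptions: the first attempt happens immediately and the three retries follow after 1s, 4s, and 16s (the text does not say whether the initial attempt counts toward the three), and the driver receives 80% of the fare as in the pseudocode later in this page.

```python
BACKOFF_S = [1.0, 4.0, 16.0]


def retry_with_backoff(action, delays_s, sleep):
    """Try once immediately, then once after each delay; None if every try fails."""
    for wait in [0.0] + list(delays_s):
        if wait > 0:
            sleep(wait)
        try:
            return action()
        except RuntimeError:
            continue
    return None


def run_payment_saga(ride, charge, credit, refund, escalate, sleep):
    # Step 1: charge the rider, retrying with exponential backoff.
    charge_id = retry_with_backoff(
        lambda: charge(ride["rider_id"], ride["fare_cents"]), BACKOFF_S, sleep
    )
    if charge_id is None:
        escalate(ride, "charge_failed")
        return "CHARGE_FAILED"
    # Step 2: credit the driver; on failure, compensate by refunding the rider.
    try:
        credit(ride["driver_id"], ride["fare_cents"] * 80 // 100)
    except RuntimeError:
        refund(charge_id)
        escalate(ride, "credit_failed")
        return "REFUNDED"
    # Step 3: the caller would now move the ride to PAYMENT_PROCESSED.
    return "PAYMENT_PROCESSED"


def failing_credit(driver_id, amount_cents):
    raise RuntimeError("banking API outage")


calls = []
ride = {"ride_id": "r1", "rider_id": "u1", "driver_id": "d1", "fare_cents": 2450}
result = run_payment_saga(
    ride,
    charge=lambda rider_id, cents: "ch_1",
    credit=failing_credit,
    refund=lambda charge_id: calls.append(("refund", charge_id)),
    escalate=lambda r, reason: calls.append(("escalate", reason)),
    sleep=lambda seconds: None,  # no real waiting in the demo
)
print(result, calls)
```

Passing `sleep` in as a parameter keeps the backoff schedule testable without waiting 21 seconds.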
Separate rider and driver gateways provide tailored API surfaces with independent rate limiting. The Driver Gateway handles 250K RPS of GPS location updates with minimal authentication overhead (optimized for throughput). The Rider Gateway handles 10K RPS of ride requests and status queries with richer validation and response formatting (optimized for user experience). This separation prevents driver GPS traffic from starving rider requests, a real problem at Uber's scale, where location updates outnumber ride requests by roughly 100:1 (in this design, 250K location updates per second against 2K ride requests per second, or 125:1).
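The isolation argument can be illustrated with one token bucket per gateway. This is a sketch, not a description of how any particular API gateway implements its limits: the `TokenBucket` class is hypothetical, the limits are the 300K and 10K RPS figures quoted in this design, and the burst capacity of one second's worth of tokens is an assumption.

```python
class TokenBucket:
    """Token-bucket limiter; timestamps are passed in so the demo is deterministic."""

    def __init__(self, rate_per_s: float, capacity: float) -> None:
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)  # start full
        self.last_ts = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last_ts = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One independent bucket per gateway.
gateways = {
    "driver": TokenBucket(rate_per_s=300_000, capacity=300_000),
    "rider": TokenBucket(rate_per_s=10_000, capacity=10_000),
}

# A burst of 400K driver requests in the same instant drains only the driver bucket.
driver_accepted = sum(gateways["driver"].allow(now=0.0) for _ in range(400_000))
rider_ok = gateways["rider"].allow(now=0.0)
print(driver_accepted, rider_ok)
```

With a single shared bucket, the same driver burst would have consumed the tokens that rider requests depend on.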
The observability pipeline consumes all events from Kafka (ride lifecycle + location updates) and produces real-time metrics per geo cell: available drivers, pending rides, average match time, payment failure rate. The supply/demand ratio per cell feeds the surge pricing engine. This is the only architecture in the ride-hailing family that computes surge pricing from real-time data rather than static configuration.
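A per-cell aggregation along these lines might look like the sketch below. The event shape, the function name `surge_by_cell`, and especially the mapping from the supply/demand ratio to a multiplier (demand divided by supply, clamped to the range 1.0 to 3.0) are assumptions for illustration; the source does not specify the pricing formula.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def surge_by_cell(events: Iterable[dict], max_surge: float = 3.0) -> Dict[str, float]:
    """Aggregate events per geo cell and derive a surge multiplier for each cell."""
    drivers: Dict[str, Set[str]] = defaultdict(set)
    rides: Dict[str, Set[str]] = defaultdict(set)
    for event in events:
        # Sets deduplicate repeated location pings from the same driver.
        if event["type"] == "driver_available":
            drivers[event["cell"]].add(event["id"])
        elif event["type"] == "ride_pending":
            rides[event["cell"]].add(event["id"])

    multipliers: Dict[str, float] = {}
    for cell in set(drivers) | set(rides):
        supply, demand = len(drivers[cell]), len(rides[cell])
        if demand == 0:
            multipliers[cell] = 1.0          # no pending rides, no surge
        elif supply == 0:
            multipliers[cell] = max_surge    # demand with no drivers at all
        else:
            multipliers[cell] = min(max_surge, max(1.0, demand / supply))
    return multipliers


sample_events = [
    {"type": "driver_available", "cell": "A", "id": "d1"},
    {"type": "driver_available", "cell": "A", "id": "d1"},  # duplicate ping
    {"type": "driver_available", "cell": "A", "id": "d2"},
    {"type": "ride_pending", "cell": "A", "id": "r1"},
    {"type": "ride_pending", "cell": "A", "id": "r2"},
    {"type": "ride_pending", "cell": "A", "id": "r3"},
    {"type": "driver_available", "cell": "B", "id": "d3"},
    {"type": "ride_pending", "cell": "C", "id": "r4"},
]
surge = surge_by_cell(sample_events)
print(surge)
```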
Interviewers at Uber, Lyft, and DiDi expect senior candidates to explain the ride state machine pattern (the core interview insight), the outbox pattern for solving the dual-write problem, the saga pattern for reliable payment processing, and the trade-offs of separate gateway services. The ride state machine is frequently the make-or-break concept — candidates who model rides as simple CRUD records instead of state machines with invariants reveal a gap in distributed systems thinking.
The global resilient architecture uses 12 components organized into six layers: traffic sources (RiderApp, DriverApp), edge layer (RegionalLB, RiderGateway, DriverGateway), application services (RideStateMachine, LocationIngestion), data stores (RedisGeo, RideDB/PostgreSQL), event pipeline (RideEvents/Kafka), and async workers (PaymentSaga, ObservabilityPipeline).
The edge layer provides separate ingress paths for riders and drivers. RegionalLB (AWS NLB, 500K RPS capacity) routes rider traffic to RiderGateway and driver traffic to DriverGateway based on path prefix. RiderGateway (API Gateway, 10K RPS limit) authenticates rider JWT tokens and routes ride requests and status queries to the RideStateMachine service. DriverGateway (API Gateway, 300K RPS limit) authenticates driver tokens and routes location updates to LocationIngestion and ride transitions (accept, start, complete) to RideStateMachine. The separation allows independent rate limiting, API versioning, and scaling.
The RideStateMachine service is the core of the architecture. It implements a formal state machine with 8 states and validates every transition against the current state. State transitions are written to PostgreSQL in a single ACID transaction that includes both the rides table update and an outbox_events table insert. A background relay process polls the outbox table and publishes events to Kafka. This guarantees that every persisted state change produces at least one Kafka event, which solves the dual-write problem; because a relay crash can cause a republish, consumers deduplicate by event_id. The service runs on 15 pods with 100 threads each, handling 2K ride requests/sec + 14K state transitions/sec at peak.
LocationIngestion handles the high-throughput GPS stream (250K/sec) independently. It writes to RedisGeo via GEOADD and publishes location events to Kafka (fire-and-forget with local buffer). This is identical to the V1 variant's LocationService — the improvement is in how the events are used downstream.
RedisGeo is a 6-node cluster sharded by city, identical to V1. RideDB is a PostgreSQL cluster with 64 shards (up from V1's 32) to handle the additional write load from outbox events. The rides table, outbox_events table, and payments table all live in this cluster.
RideEvents (Kafka, 64 partitions) carries two topics: ride-events (state transitions from the outbox relay, partitioned by ride_id) and location-updates (from LocationIngestion, partitioned by city_id). PaymentSaga consumes ride_completed events and orchestrates the charge-credit-update saga. ObservabilityPipeline consumes all events and produces metrics for surge pricing and operational dashboards.
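Partitioning by ride_id gives per-ride ordering because every event with the same key lands on the same partition. The sketch below shows only that property. It uses CRC32 as a stand-in hash, which is an assumption for illustration: Kafka's default Java partitioner hashes the key bytes with murmur2, so the actual partition numbers would differ.

```python
import zlib

NUM_PARTITIONS = 64  # partition count quoted for RideEvents


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition; same key always yields the same partition."""
    # CRC32 is a stand-in hash, not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("ride-42", "ride_requested"),
    ("ride-7", "ride_requested"),
    ("ride-42", "ride_completed"),
]
for ride_id, event_type in events:
    print(ride_id, event_type, "-> partition", partition_for(ride_id))
```

Since ordering is guaranteed only within a partition, a consumer sees the events of ride-42 in publish order but may interleave them arbitrarily with events of other rides.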
Multi-region deployment is achieved through per-city infrastructure. Each city's RideStateMachine, LocationIngestion, RedisGeo, and RideDB are deployed in the nearest AWS region. Shared services (payment gateway, analytics) run in a primary region with cross-region replication. Failover between regions is handled at the DNS level (Route 53 health checks) for regional load balancer endpoints.
This sequence diagram traces the full ride lifecycle: request, matching, state transitions, and payment. The critical insight is the outbox pattern: state changes and events are written in a single ACID transaction, so every committed state change is eventually delivered to Kafka at least once. The payment saga handles the charge-credit flow with compensation on failure.
The state machine validation is shown at the transition step: the service checks that the requested transition is valid from the current state. If not, it returns 409 Conflict. This prevents invalid state transitions from concurrent requests (e.g., driver starting a ride while rider is cancelling it).
Pseudocode
// RIDE STATE MACHINE: validated transitions
const TRANSITIONS = {
    REQUESTED: ["MATCHING"],
    MATCHING: ["DRIVER_ASSIGNED", "CANCELLED"],
    DRIVER_ASSIGNED: ["EN_ROUTE", "CANCELLED"],
    EN_ROUTE: ["IN_PROGRESS"],
    IN_PROGRESS: ["COMPLETED"],
    COMPLETED: ["PAYMENT_PROCESSED"],
    PAYMENT_PROCESSED: ["RATED"],
    RATED: [],      // terminal state
    CANCELLED: [],  // terminal state
};
async function transitionRide(ride_id, target_state, actor_id):
    // Open the transaction first: the FOR UPDATE row lock is held only until the
    // enclosing transaction ends, so the lock, the check, and both writes must share it
    await db.execute("BEGIN")
    // Acquire exclusive lock on ride row
    ride = await db.execute(
        "SELECT * FROM rides WHERE ride_id = $1 FOR UPDATE", [ride_id]
    )
    if (!ride):
        await db.execute("ROLLBACK")
        return 404  // Not Found: unknown ride_id
    // Validate transition
    allowed = TRANSITIONS[ride.current_state]
    if (!allowed.includes(target_state)):
        await db.execute("ROLLBACK")  // release the row lock
        return 409  // Conflict: "Cannot transition from {current} to {target}"
    // Write state + outbox event in the same transaction
    await db.execute(
        "UPDATE rides SET current_state = $1 WHERE ride_id = $2",
        [target_state, ride_id]
    )
    await db.execute(
        "INSERT INTO outbox_events (event_id, aggregate_id, event_type, payload, published) " +
        "VALUES ($1, $2, $3, $4, false)",
        // The payload carries the NEW state, not the stale row read above
        [uuid(), ride_id, "ride_" + target_state.toLowerCase(),
         JSON.stringify({ ...ride, current_state: target_state, actor_id })]
    )
    await db.execute("COMMIT")  // Both writes succeed or both fail
    return 200
// PAYMENT SAGA: charge, credit, compensate
async function processPayment(ride_completed_event):
    ride = ride_completed_event
    payment_id = uuid()
    await db.execute(
        "INSERT INTO payments (payment_id, ride_id, ...) VALUES ($1, $2, ...)",
        [payment_id, ride.ride_id, ...]
    )
    // Step 1: Charge rider, retrying with backoff (1s, 4s, 16s)
    try:
        charge = await retryWithBackoff(3, [1000, 4000, 16000], () =>
            paymentGateway.charge(
                ride.rider_id, ride.fare_cents,
                idempotency_key: "charge-" + ride.ride_id
            )  // ~200ms
        )
    catch error:
        // Still failing after all retries
        escalateToSupport(ride); return
    await db.execute(
        "UPDATE payments SET status='CHARGED', charge_id=$1 WHERE payment_id=$2",
        [charge.id, payment_id]
    )
    // Step 2: Credit driver (80% of the fare, kept in integer cents)
    try:
        await paymentGateway.credit(
            ride.driver_id, Math.floor(ride.fare_cents * 80 / 100)
        )  // ~50ms
    catch error:
        // COMPENSATION: refund rider since driver credit failed
        await paymentGateway.refund(charge.id)
        await db.execute(
            "UPDATE payments SET status='REFUNDED' WHERE payment_id=$1", [payment_id]
        )
        escalateToSupport(ride); return
    // Step 3: Record the credit and advance the ride. This runs outside the try block
    // so that a failure here cannot trigger a refund after the driver has been credited
    await db.execute("UPDATE payments SET status='CREDITED' WHERE payment_id=$1", [payment_id])
    await transitionRide(ride.ride_id, "PAYMENT_PROCESSED", "system")
The V3 schema reflects three architectural patterns: the ride state machine (rides table with current_state), the outbox pattern (outbox_events table written in the same transaction), and the payment saga (payments table tracking saga lifecycle). The outbox_events table is the bridge between PostgreSQL (synchronous) and Kafka (async): every state change produces an outbox row that the relay publishes to Kafka.
The rides table is partitioned by city_id across 64 shards for geographic locality. The outbox_events table is partitioned by created_at for efficient cleanup of published events. The payments table tracks the full saga lifecycle including the charge_id needed for compensation (refunds).
Pseudocode
-- RIDES TABLE: State machine with shard key
CREATE TABLE rides (
    ride_id UUID NOT NULL DEFAULT gen_random_uuid(),
    rider_id UUID NOT NULL,
    driver_id UUID,
    current_state TEXT NOT NULL DEFAULT 'REQUESTED',
    pickup_lat FLOAT NOT NULL,
    pickup_lng FLOAT NOT NULL,
    dest_lat FLOAT NOT NULL,
    dest_lng FLOAT NOT NULL,
    fare_cents INTEGER,
    surge_multiplier FLOAT DEFAULT 1.0,
    city_id TEXT NOT NULL, -- shard key
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    -- PostgreSQL requires the primary key of a partitioned table
    -- to include every partition key column
    PRIMARY KEY (ride_id, city_id)
) PARTITION BY HASH (city_id);
-- OUTBOX TABLE: Bridge between DB and Kafka
CREATE TABLE outbox_events (
    event_id UUID NOT NULL DEFAULT gen_random_uuid(),
    aggregate_type TEXT NOT NULL DEFAULT 'ride',
    aggregate_id UUID NOT NULL, -- ride_id
    event_type TEXT NOT NULL,
    payload JSONB NOT NULL,
    published BOOLEAN NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (event_id, created_at) -- must include the partition key
) PARTITION BY RANGE (created_at);
CREATE INDEX idx_outbox_unpublished ON outbox_events (created_at)
    WHERE published = false;
-- PAYMENTS TABLE: Saga lifecycle tracking
CREATE TABLE payments (
    payment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    ride_id UUID NOT NULL,
    city_id TEXT NOT NULL, -- needed to reference the composite rides key
    rider_id UUID NOT NULL,
    driver_id UUID NOT NULL,
    amount_cents INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'PENDING',
    attempt_count INTEGER NOT NULL DEFAULT 0,
    charge_id TEXT, -- from payment gateway, needed for refunds
    created_at TIMESTAMPTZ DEFAULT now(),
    FOREIGN KEY (ride_id, city_id) REFERENCES rides (ride_id, city_id)
);
-- STATE MACHINE TRANSITION (single transaction)
BEGIN;
-- The current_state guard makes the update a no-op if the ride has already moved on
UPDATE rides SET current_state = 'COMPLETED'
    WHERE ride_id = $1 AND current_state = 'IN_PROGRESS';
INSERT INTO outbox_events (aggregate_id, event_type, payload)
    VALUES ($1, 'ride_completed', '{"fare_cents": 2450}');
COMMIT; -- Both succeed or both fail, so there is no dual-write risk
Choice
8-state FSM with validated transitions instead of ad-hoc status updates
Rationale
A ride has well-defined states: REQUESTED -> MATCHING -> DRIVER_ASSIGNED -> EN_ROUTE -> IN_PROGRESS -> COMPLETED -> PAYMENT_PROCESSED -> RATED. Without a state machine, concurrent requests can create invalid states (double-completing a ride, starting a ride that was never matched, completing a ride that is already paid). The state machine validates every transition: an attempt to transition from REQUESTED to COMPLETED returns 409 Conflict with the current state and allowed next states. This prevents data corruption and enables confident debugging — every ride's state history is a valid path through the FSM.
Choice
State change + event written to same DB transaction, relay publishes to Kafka
Rationale
The dual-write problem: writing to PostgreSQL and publishing to Kafka as separate operations means either can fail independently. The outbox pattern writes both in a single ACID transaction. A background relay reads the outbox table and publishes to Kafka. If the relay crashes, it restarts and resumes from the rows still marked unpublished, so no events are lost. Delivery is at-least-once: the relay may publish the same event twice, which is safe because Kafka consumers deduplicate by event_id. The trade-off is 50-200ms additional latency (relay polling interval) between state change and event publication. CDC via Debezium can reduce this to ~10ms.
Choice
Three-step saga (charge rider -> credit driver -> update ride) with retry and compensation
Rationale
Payment involves three steps that can fail independently: (1) charge the rider's payment method, (2) credit the driver's earnings, (3) update the ride to PAYMENT_PROCESSED. The first two are external calls; the third is a state machine transition. If step 1 succeeds but step 2 fails, the saga must compensate by refunding the rider. Without a saga, partial payment states (rider charged but driver not credited) require manual intervention. The saga orchestrator tracks the current step and handles retries (exponential backoff: 1s, 4s, 16s) and compensation (refund on step 2 failure). After max retries, it escalates to a support queue with full context.
Choice
Independent API surfaces with different rate limits, auth, and versioning
Rationale
Driver traffic (250K RPS location updates) outnumbers rider traffic (10K RPS) by 25:1. A shared gateway requires sizing for driver throughput while serving rider requests — risking driver GPS floods starving rider ride requests. Separate gateways allow independent rate limiting (300K RPS for drivers, 10K RPS for riders), different auth token lifetimes (8 hours for drivers who are always logged in, 24 hours for riders), and independent API versioning (rider API changes more frequently for new features).
Choice
Separate Redis GEO sets per city instead of one global set
Rationale
GEORADIUS is inherently local — a rider in NYC does not need to search drivers in London. Sharding by city keeps per-shard cardinality under 100K drivers, ensuring sub-2ms GEORADIUS queries. It also enables per-city deployment in the nearest AWS region, reducing network latency for location updates and matching queries. The trade-off is routing complexity: the application must determine the rider's city to query the correct shard.
Choice
Dedicated Kafka consumer computing supply/demand metrics per geo cell
Rationale
Surge pricing requires real-time supply/demand ratio per geographic area. The observability pipeline consumes location events (available drivers per cell) and ride events (pending rides per cell), computes the ratio, and caches the surge multiplier in Redis with a 60-second TTL. This is applied at ride request time by the RideStateMachine. The V0 and V1 variants have no surge pricing mechanism — this is the first variant that computes it from real-time data.
Target RPS
~280K peak (250K location + 2K rides + 14K transitions + 8.4K status ≈ 274K, rounded up)
Latency (p99)
<3s match (including driver dispatch), 15ms location write, <5s payment saga
Storage
~2 TB/year (rides, outbox events, payments, location history)
Availability
99.99% (multi-region, per-city failover)
| Operation | Time | Space | Notes |
|---|---|---|---|
| State machine transition validation | O(1) — lookup in transition table (current_state -> allowed_next_states) | O(S) — S states in the FSM (8 for ride-hailing) | The state machine is a simple lookup: given current_state and target_state, check if the transition is in the allowed set. The FSM has 8 states and approximately 12 valid transitions — constant-time validation. |
| Outbox relay (poll + publish) | O(B) — B unpublished events per poll cycle | O(B) — batch of events loaded into memory | The relay polls every 50-200ms and publishes batches of 100-1000 events. At 10K state transitions/sec, this means ~1000 events per 100ms poll. The relay is the latency bottleneck: 50-200ms added between state change and Kafka event. |
| Payment saga (3-step orchestration) | O(1) per step — each step is an external API call | O(1) — saga state per ride | Happy path: ~300ms (charge ~200ms + credit ~50ms + DB update ~50ms). With retries: up to 21s per step (1s + 4s + 16s exponential backoff). Total worst case: ~63s before escalation. |
| Surge pricing computation (per geo cell) | O(D + R) — D drivers + R rides in the cell | O(C) — C geo cells being tracked | The observability pipeline maintains counters per H3 cell (resolution 7, ~5 sq km). Supply/demand ratio = available_drivers / pending_rides. Cached in Redis with 60-second TTL. |
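
The figures in the table's Notes column can be reproduced with a few lines of arithmetic. This is a worked check of the numbers quoted above, using only values from the table; the variable names are illustrative.

```python
# Payment saga timing, from the table notes.
BACKOFF_S = [1, 4, 16]
per_step_backoff_s = sum(BACKOFF_S)                  # waiting time across three retries
saga_steps = 3
worst_case_backoff_s = per_step_backoff_s * saga_steps

happy_path_ms = 200 + 50 + 50                        # charge + credit + DB update

# Outbox relay batch size at the quoted load and a 100 ms poll interval.
transitions_per_s = 10_000
poll_interval_ms = 100
events_per_poll = transitions_per_s * poll_interval_ms // 1000

print(per_step_backoff_s, worst_case_backoff_s, happy_path_ms, events_per_poll)
```

The worst-case figure counts backoff waits only; the time spent inside each failed call would add to it.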
Ride records with formal state machine status tracking. The current_state column is the single source of truth for ride lifecycle position. State transitions are validated by the application before UPDATE. Partitioned by city_id across 64 shards for geographic locality.
Indexes: PK on ride_id, idx_rides_state ON (current_state, city_id), idx_rides_rider ON (rider_id, created_at), idx_rides_driver ON (driver_id, current_state)
The current_state column only transitions via the state machine validation logic. Direct UPDATE to current_state outside the state machine is blocked by application-level guards. State history is preserved in the outbox_events table, enabling full audit trail.
Outbox table for reliable event publishing. Written in the same ACID transaction as ride state changes. A background relay process reads rows where published=false, publishes to Kafka, and marks published=true. The event_id enables consumer-side deduplication.
Indexes: PK on event_id (combined with created_at, the partition key, as PostgreSQL requires for a partitioned table), idx_outbox_unpublished ON (created_at) WHERE published = false
The relay process polls this table every 50-200ms for unpublished events. After publishing to Kafka, it updates published=true in a separate transaction. If the relay crashes between publish and update, the event may be published twice — Kafka consumers deduplicate by event_id. CDC (Debezium) can replace polling for lower latency (~10ms).
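Consumer-side deduplication can be sketched as a thin wrapper around the real handler. The class name `IdempotentConsumer` is hypothetical, and the in-memory set is a simplification: a real consumer would need a durable store for seen event IDs (for example, a unique constraint in its own database) so that deduplication survives restarts.

```python
class IdempotentConsumer:
    """Invoke the handler at most once per event_id."""

    def __init__(self, handler) -> None:
        self._handler = handler
        self._seen = set()  # assumption: in-memory only, for illustration

    def on_event(self, event: dict) -> bool:
        if event["event_id"] in self._seen:
            return False  # duplicate delivery, skipped
        self._handler(event)
        # Record the ID only after the handler succeeds, so a failed
        # handler call is retried on redelivery rather than dropped.
        self._seen.add(event["event_id"])
        return True


charged_rides = []
consumer = IdempotentConsumer(lambda e: charged_rides.append(e["ride_id"]))

event = {"event_id": "evt-1", "ride_id": "ride-1", "event_type": "ride_completed"}
first = consumer.on_event(event)
second = consumer.on_event(event)  # the relay republished after a crash
print(first, second, charged_rides)
```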
Payment records tracking the saga lifecycle: PENDING -> CHARGED -> CREDITED -> COMPLETED or PENDING -> CHARGED -> CREDIT_FAILED -> REFUND_INITIATED -> REFUNDED. Each row represents one payment attempt with its retry history.
Indexes: PK on payment_id, idx_payments_ride ON (ride_id), idx_payments_status ON (status) WHERE status IN ('PENDING', 'CHARGED')
The saga orchestrator updates this table at each step. The charge_id from the payment gateway is recorded after step 1 (charge rider) to enable compensation (refund) if step 2 (credit driver) fails.
Identical to V1: Redis GEO sorted sets sharded by city for O(log N) GEORADIUS queries. 6-node cluster, 30-second TTL. LocationIngestion writes via GEOADD; RideStateMachine reads via GEORADIUS during MATCHING phase.
Indexes: Geohash-encoded sorted set
Same as V1 — the geo-matching layer is unchanged. The improvement is in how matched rides are processed (state machine + outbox) and how fares are collected (payment saga).
Ride state transition events published by the outbox relay. Each event represents a single state machine transition. Partitioned by ride_id for per-ride ordering. Consumed by PaymentSaga (on ride_completed) and ObservabilityPipeline (all events).
Key Schema
ride_id (string)
Value Schema
{ event_id: string, ride_id: string, event_type: ride_requested|ride_matched|ride_started|ride_completed|payment_processed|ride_rated, rider_id: string, driver_id?: string, state: string, fare_cents?: number, surge_multiplier?: number, city_id: string, timestamp: number }
Driver GPS coordinates streamed by LocationIngestion to Kafka for the ObservabilityPipeline. Partitioned by city_id for geographic locality. Used for surge pricing computation (available drivers per geo cell) and analytics.
Key Schema
city_id (string)
Value Schema
{ driver_id: string, city_id: string, lat: number, lng: number, heading?: number, speed?: number, timestamp: number }
Internal saga commands for payment processing steps. Published by the PaymentSaga service to itself for retry and compensation tracking. Not consumed by other services.
Key Schema
ride_id (string)
Value Schema
{ payment_id: string, ride_id: string, command: charge_rider|credit_driver|refund_rider|escalate, amount_cents: number, attempt: number, timestamp: number }
Driver goes offline mid-ride (phone dies in tunnel, app crash)
Impact
The driver's GPS updates stop during an active ride (IN_PROGRESS state). Without heartbeat detection, the ride remains in IN_PROGRESS indefinitely. The rider sees a frozen map and must manually cancel.
Mitigation
Implement heartbeat timeout: if no location update is received from the driver for 30 seconds during IN_PROGRESS, auto-transition to DRIVER_UNRESPONSIVE (an additional state that, with its transitions, must be added to the transition table alongside the 8 happy-path states). Notify the rider with options (wait, cancel, or reassign). If the driver reconnects within 2 minutes, auto-transition back to IN_PROGRESS. After 2 minutes, begin reassignment via a new matching cycle.
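The timeout rule can be expressed as a small pure function. This is a sketch with assumptions stated up front: the function name `next_state` is hypothetical, the two-minute window is measured from the moment the ride entered DRIVER_UNRESPONSIVE, and reassignment is represented by returning MATCHING.

```python
UNRESPONSIVE_AFTER_S = 30   # GPS silence before the ride is flagged
REASSIGN_AFTER_S = 120      # time in DRIVER_UNRESPONSIVE before reassignment


def next_state(state: str, silence_s: float, unresponsive_for_s: float = 0.0) -> str:
    """Decide the ride state from the time since the driver's last GPS update."""
    if state == "IN_PROGRESS":
        if silence_s >= UNRESPONSIVE_AFTER_S:
            return "DRIVER_UNRESPONSIVE"
        return "IN_PROGRESS"
    if state == "DRIVER_UNRESPONSIVE":
        if silence_s < UNRESPONSIVE_AFTER_S:
            return "IN_PROGRESS"        # driver reconnected in time
        if unresponsive_for_s >= REASSIGN_AFTER_S:
            return "MATCHING"           # start a new matching cycle
        return "DRIVER_UNRESPONSIVE"
    return state                        # the rule applies to active rides only


print(next_state("IN_PROGRESS", silence_s=45))
```

A periodic job would call this for every active ride and submit any change through the state machine, so the usual transition validation still applies.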
Payment saga: rider charged but driver credit fails (banking API outage)
Impact
The rider is charged $24.50 but the driver does not receive earnings. Without saga compensation, this creates a financial discrepancy: the platform holds the rider's money without paying the driver. At 2K completed rides/sec, even a 1-minute banking API outage creates 120,000 rides with unpaid drivers.
Mitigation
The payment saga detects the credit failure and initiates compensation: refund the rider's charge using the charge_id from step 1. Flag the ride for manual resolution (support queue). When the banking API recovers, a reconciliation job retries the driver credit. The driver receives a notification explaining the delay with an estimated resolution time.
Outbox relay falls behind (relay process slow or crashed)
Impact
State transitions are persisted in PostgreSQL but events are not published to Kafka. PaymentSaga does not receive ride_completed events — no payments are processed. ObservabilityPipeline does not receive events — surge pricing becomes stale. The outbox table grows unboundedly until the relay catches up.
Mitigation
Monitor outbox table row count and relay lag (time between event creation and publication). Alert at 5-second lag, critical at 30 seconds. Run multiple relay instances with leader election for HA. Set an outbox retention window to prevent unbounded growth: archive published events, then delete them once they are older than the window (24 hours here; any audit-trail use of the outbox then depends on the archive). Consider CDC (Debezium) as a more reliable alternative to polling-based relay.
Surge pricing fairness concern (3x multiplier during morning commute)
Impact
Regular commuters face 3x fares during rush hour. Over time, riders switch to public transit or competing services during peak hours, reducing demand but also reducing driver supply (drivers earn less during off-peak). Long-term revenue impact from lost rider loyalty.
Mitigation
Cap surge multiplier at 2x for regular commuters (identified by ride history patterns). Implement surge pricing smoothing: instead of instant jumps from 1x to 3x, ramp up gradually (1x -> 1.5x -> 2x over 10-minute windows). Display surge forecast to riders (next 30 minutes estimated surge level) so they can delay their ride.
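The cap and the gradual ramp can be combined in one function. This is an illustrative sketch: the function name `smoothed_surge`, the 0.5 step per window, and the choice to let the multiplier fall immediately when demand eases are assumptions, while the 2x commuter cap and the 1x -> 1.5x -> 2x ramp come from the text.

```python
def smoothed_surge(current: float, target: float, regular_commuter: bool,
                   max_step: float = 0.5, commuter_cap: float = 2.0) -> float:
    """Return the multiplier to apply in the next 10-minute window."""
    if regular_commuter:
        target = min(target, commuter_cap)
    if target > current:
        # Ramp up by at most max_step per window instead of jumping.
        return min(target, current + max_step)
    return target  # assumption: decreases take effect immediately


# Raw surge jumps to 3x; a non-commuter sees it ramp over four windows.
multiplier = 1.0
ramp = []
for _ in range(5):
    multiplier = smoothed_surge(multiplier, target=3.0, regular_commuter=False)
    ramp.append(multiplier)
print(ramp)
```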
Multi-region failover (us-east-1 outage affecting NYC rides)
Impact
All NYC rides in flight lose real-time tracking. New ride requests in NYC fail. Drivers in NYC stop receiving ride assignments. Other cities (London in eu-west-1, Singapore in ap-southeast-1) are unaffected due to per-city deployment.
Mitigation
DNS failover (Route 53) detects us-east-1 health check failure and routes NYC traffic to us-west-2 (backup region). PostgreSQL cross-region read replica is promoted to primary in us-west-2. Redis GEO for NYC is rebuilt from incoming driver GPS updates within 30 seconds (TTL-based self-healing). Estimated failover time: 60-120 seconds.
| Component | Failure | Impact | Mitigation |
|---|---|---|---|
| RideStateMachine Service | Invalid state transition from concurrent requests | Two requests attempt to transition the same ride simultaneously (e.g., driver marks 'started' while rider marks 'cancelled'). Without proper locking, both could succeed, leaving the ride in an impossible dual state. | SELECT ... FOR UPDATE on the ride row before validating the transition. This serializes concurrent transitions for the same ride. The first transition acquires the lock and succeeds; the second waits for the lock, sees the new state, and either succeeds (if valid) or returns 409 Conflict. |
| Outbox Relay | Relay crashes between Kafka publish and outbox UPDATE | The event is published to Kafka but the outbox row is not marked as published. When the relay restarts, it republishes the same event — creating a duplicate. Downstream consumers (PaymentSaga) may process the event twice, potentially charging the rider twice. | Kafka consumers deduplicate by event_id. The PaymentSaga checks if a payment already exists for the ride_id before initiating a charge. Idempotency keys on payment gateway calls prevent duplicate charges even if the saga is triggered twice. |
| PaymentSaga | Saga stuck in CHARGED state (driver credit API hangs indefinitely) | Rider is charged but driver is not credited. The saga waits for the credit API response, blocking a worker thread. If many sagas hang, all worker threads are consumed and no new payments are processed. | Set timeout on driver credit API call (5 seconds). On timeout, treat as failure and enter retry loop. After max retries, compensate (refund rider) and escalate. Use async HTTP client with configurable timeouts — never block a worker thread indefinitely. |
| PostgreSQL (RideDB) | Outbox table bloat causing slow relay queries | If the relay falls behind, the outbox table grows with millions of unpublished rows. The relay's SELECT ... WHERE published=false query becomes slower as the table grows, creating a death spiral: the relay falls further behind because its queries take longer. | Partition the outbox table by created_at (daily partitions). Archive and drop old partitions (events older than 7 days). Add a partial index on (created_at) WHERE published = false for efficient relay queries. Monitor outbox row count: alert at 100K unpublished rows, critical at 1M. |
| ObservabilityPipeline | Pipeline crash causes surge pricing to use stale data | Surge multipliers are cached in Redis with 60-second TTL. If the pipeline crashes, the cache expires and surge pricing falls back to 1.0x (no surge). During high demand, this means rides are underpriced — drivers are not incentivized to work peak hours, exacerbating the supply shortage. | Set surge pricing fallback to last-known-good value (cached in Redis with 10-minute TTL). If the pipeline is down for more than 10 minutes, fall back to time-of-day-based static surge multipliers (a coarse but better-than-nothing approximation). Alert on pipeline consumer lag > 60 seconds. |
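
The fallback chain for surge pricing in the last row can be sketched as a lookup with three tiers. The function name `effective_surge`, the tuple representation of cached values, and the example static table are hypothetical; the 60-second and 10-minute windows come from the table.

```python
FRESH_TTL_S = 60          # normal surge cache TTL
LAST_KNOWN_GOOD_TTL_S = 600  # 10-minute fallback TTL


def effective_surge(now_s, fresh, last_known_good, hour_of_day, static_by_hour):
    """Pick a multiplier: fresh value, else last-known-good, else static by hour.

    `fresh` and `last_known_good` are (multiplier, written_at_s) tuples or None.
    """
    if fresh is not None and now_s - fresh[1] <= FRESH_TTL_S:
        return fresh[0]
    if last_known_good is not None and now_s - last_known_good[1] <= LAST_KNOWN_GOOD_TTL_S:
        return last_known_good[0]
    # Coarse time-of-day approximation; 1.0 when no static value is configured.
    return static_by_hour.get(hour_of_day, 1.0)


static_table = {8: 1.5, 18: 1.4}  # hypothetical rush-hour defaults

# Pipeline down for 5 minutes: the fresh value has expired, last-known-good applies.
print(effective_surge(1300, (2.2, 1000), (2.2, 1000), 8, static_table))
```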
RideStateMachine scales based on ride request volume: 15 pods for 2K/sec, add 5 pods per additional 700/sec. LocationIngestion scales based on GPS throughput: 20 pods for 250K/sec, add 10 pods per additional 125K/sec. PaymentSaga scales based on ride completion rate: 15 workers for 2K completions/sec, add 5 workers per additional 700/sec. PostgreSQL scales via shard count (64 -> 128 -> 256 shards for 2x -> 4x driver count). Redis GEO scales by adding shards (city splitting or sub-city regions). Kafka scales by partition count (64 -> 128) and broker node count. Auto-scaling triggers: CPU > 70% for 3 minutes (services), outbox lag > 5 seconds (RideStateMachine pod count), payment queue depth > 10K (PaymentSaga workers). The architecture scales to approximately 10M active drivers with shard count increases and minimal code changes.
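The step-wise scaling rules translate directly into a small formula. This is a sketch of the arithmetic only: the function name `pods_needed` is hypothetical, and rounding a partial step up to a full increment is an assumption.

```python
import math


def pods_needed(load_per_s: int, base_load: int, base_pods: int,
                step_load: int, step_pods: int) -> int:
    """Base pod count plus one increment for each (possibly partial) load step."""
    extra_load = max(0, load_per_s - base_load)
    return base_pods + math.ceil(extra_load / step_load) * step_pods


# RideStateMachine: 15 pods for 2K/sec, add 5 pods per additional 700/sec.
print(pods_needed(3_400, base_load=2_000, base_pods=15, step_load=700, step_pods=5))

# LocationIngestion: 20 pods for 250K/sec, add 10 pods per additional 125K/sec.
print(pods_needed(375_000, base_load=250_000, base_pods=20, step_load=125_000, step_pods=10))
```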
Key metrics to monitor: (1) State machine transition rate: should match expected ride volume x transitions/ride (~7 transitions per ride on the happy path, matching 14K transitions/sec at 2K rides/sec). Alert on anomalous transition patterns (e.g., 50% of rides going to CANCELLED). (2) Outbox relay lag: time between event creation and Kafka publication. Alert at >5s, critical at >30s. (3) Payment saga success rate: should be >95%. Alert if failure rate exceeds 5% sustained. (4) Payment saga latency (p99): happy path <1s, with retries <30s. Alert if p99 exceeds 30s. (5) Surge pricing freshness: time since last surge multiplier update per geo cell. Alert if any cell is stale >120 seconds. (6) Cross-gateway traffic ratio: driver:rider should be approximately 25:1. Significant deviation indicates gateway routing issues. (7) State transition conflict rate: percentage of transitions that return 409 Conflict. Normal is <1%; higher indicates concurrent request storms.
Dashboard: Grafana with panels for ride state distribution (pie chart), outbox relay lag (time series), payment saga step durations (histogram), surge pricing heatmap (geographic), and per-gateway throughput comparison.
SLIs: match latency p99 < 3s, location update p99 < 30ms, payment completion < 5s (happy path), surge pricing freshness < 60s.
At 1M active drivers (global, multi-city): PostgreSQL db.r7g.2xlarge 64 shards (~$2,000/month), Redis Cluster 6 nodes (~$900/month), MSK Kafka kafka.m7g.xlarge (~$800/month), ECS Fargate: RideStateMachine 15 pods + LocationIngestion 20 pods + PaymentSaga 15 workers + ObservabilityPipeline 10 workers (~$1,500/month), API Gateway (2x) (~$300/month), NLB (~$100/month), Multi-region replication (~$900/month). Total: ~$6,500/month. This is 8x the cost of the naive variant and 2.6x the V1 variant, but it provides 99.99% availability (vs 99% and 99.9%), exactly-once payment processing, and regulatory-grade audit trails via the outbox event log. Per-driver cost: $0.0065/driver — lower than V1 at 1M scale due to more efficient sharding.
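The cost totals and ratios above can be checked with a short calculation. The line items and the $780 and $2,500 comparison figures are taken from this page; the dictionary keys are illustrative labels.

```python
monthly_cost_usd = {
    "postgresql_64_shards": 2000,
    "redis_cluster_6_nodes": 900,
    "msk_kafka": 800,
    "ecs_fargate_services": 1500,
    "api_gateways_x2": 300,
    "nlb": 100,
    "multi_region_replication": 900,
}

total_usd = sum(monthly_cost_usd.values())
ratio_vs_naive = total_usd / 780       # V0 monthly cost
ratio_vs_v1 = total_usd / 2500         # V1 monthly cost
per_driver_usd = total_usd / 1_000_000  # at 1M active drivers, per month

print(total_usd, round(ratio_vs_naive, 1), round(ratio_vs_v1, 1), per_driver_usd)
```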
Rider/driver safety: formal ride state machine prevents invalid transitions that could leave riders stranded (e.g., ride auto-completing before actual dropoff). Location privacy: driver GPS stored in Redis with 30-second TTL (auto-expire) and in Kafka with 7-day retention. Access to location data restricted to the matched rider/driver pair during active rides. Payment security: payment saga uses idempotency keys to prevent double-charges. Payment method tokens stored via PCI-compliant gateway — no raw card data in the system. The payments table stores charge_ids for refund capability but not card numbers. Separate gateways: driver and rider API surfaces are isolated — a compromised driver token cannot access rider endpoints and vice versa. Audit trail: the outbox_events table provides a complete, immutable record of every ride state transition for regulatory and dispute resolution purposes.
Blue-green deployment for stateless services (RideStateMachine, LocationIngestion, PaymentSaga). Rolling deployment for gateways and WebSocket services. Database schema migrations using zero-downtime DDL (CREATE INDEX CONCURRENTLY, ALTER TABLE ... ADD COLUMN with DEFAULT). Outbox table partitioning changes performed during maintenance windows. Kafka topic modifications (partition count increase) performed live with client rebalancing. Multi-region deployment uses infrastructure-as-code (Terraform) with per-city modules. New city launch: deploy a new instance of the per-city stack in the nearest region, configure DNS routing, and seed RedisGeo from initial driver onboarding GPS data.
| Variant | Tier | Latency | Throughput | Cost | Complexity | Reliability |
|---|---|---|---|---|---|---|
| V0: Naive (Monolith + SQL Distance Sort) | T1 | 80-200ms match, 50-100ms location update | ~10K RPS total | $780/month | Low | 99% (single DB) |
| V1: Geo-Indexed Match (Redis GEO + Kafka) | T2 | 2ms match, 12ms location update | 265K RPS peak | $2,500/month | Medium | 99.9% (multi-AZ) |
| V3: Global Resilient (State Machine + Payment Saga) | T4 | <3s match, 15ms location update | 280K RPS peak | $6,500/month | Very High | 99.99% (multi-region) |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
The ride state machine prevents data corruption that arises from concurrent, distributed operations on the same ride. Consider: a driver marks a ride as 'started' at the same moment the rider cancels it. Without a state machine, both operations succeed, leaving the ride in an impossible state (simultaneously IN_PROGRESS and CANCELLED). The state machine serializes transitions: the first operation to acquire the row lock transitions the state. The second operation attempts to transition from the new state and either succeeds (if the transition is valid) or fails with 409 Conflict. This is the core interview insight — every ride is a finite state machine, not a CRUD record.
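The serialized transition check described above can be sketched in a few lines. This is a minimal in-memory illustration, not the template's implementation: a `threading.Lock` stands in for the database row lock, the happy-path order follows the eight states listed earlier, and the CANCELLED state together with the states it is reachable from are assumptions made for the example.

```python
import threading

# Happy-path order from the article. CANCELLED and the states that may
# transition to it are assumptions for this sketch.
ALLOWED = {
    "REQUESTED": {"MATCHING", "CANCELLED"},
    "MATCHING": {"DRIVER_ASSIGNED", "CANCELLED"},
    "DRIVER_ASSIGNED": {"EN_ROUTE", "CANCELLED"},
    "EN_ROUTE": {"IN_PROGRESS", "CANCELLED"},
    "IN_PROGRESS": {"COMPLETED"},
    "COMPLETED": {"PAYMENT_PROCESSED"},
    "PAYMENT_PROCESSED": {"RATED"},
    "RATED": set(),
    "CANCELLED": set(),
}


class Ride:
    def __init__(self, state="REQUESTED"):
        self.state = state
        self._lock = threading.Lock()  # stands in for a database row lock

    def transition(self, target):
        """Validate and apply a transition; return (http_status, body)."""
        with self._lock:  # only one transition is evaluated at a time
            allowed = ALLOWED[self.state]
            if target not in allowed:
                # Invalid from the current state: report 409 with the
                # current state and the allowed next states.
                return 409, {"current_state": self.state,
                             "allowed": sorted(allowed)}
            self.state = target
            return 200, {"current_state": self.state}


if __name__ == "__main__":
    ride = Ride("EN_ROUTE")
    print(ride.transition("IN_PROGRESS"))  # driver starts the ride first
    print(ride.transition("CANCELLED"))    # rider's cancel arrives second
```

In the race from the paragraph above, whichever request takes the lock first wins. Under the assumed transition table, the losing request receives 409 either way: a cancel cannot follow IN_PROGRESS, and a start cannot follow CANCELLED.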
The dual-write problem occurs when a service writes to two systems (database + Kafka) that do not share a transaction boundary. If the database write succeeds but the Kafka publish fails, the state is updated but the event is lost. If Kafka succeeds but the database fails, the event is published for a state change that never happened. The outbox pattern eliminates this by writing both the state change and the event to the same database in one ACID transaction. A relay process reads committed but not-yet-published events from the outbox table and publishes them to Kafka. Delivery from the relay is at-least-once: after a crash it may publish the same event again, which is safe because consumers deduplicate by event_id.
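A minimal sketch of the outbox flow follows, using an in-memory SQLite database in place of PostgreSQL and a plain callback in place of a Kafka producer. The `outbox_events` table and the `event_id` field come from this article; the column layout, the `published` flag, and every function name are assumptions made for illustration.

```python
import json
import sqlite3
import uuid


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE rides (ride_id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox_events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event_id TEXT NOT NULL UNIQUE,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def change_state(db, ride_id, new_state):
    """Write the state change and its event in one transaction."""
    event_id = str(uuid.uuid4())
    payload = json.dumps({"ride_id": ride_id, "state": new_state})
    with db:  # both writes commit together, or both roll back
        db.execute(
            "INSERT OR REPLACE INTO rides (ride_id, state) VALUES (?, ?)",
            (ride_id, new_state),
        )
        db.execute(
            "INSERT INTO outbox_events (event_id, payload) VALUES (?, ?)",
            (event_id, payload),
        )
    return event_id


def relay_once(db, publish):
    """Publish committed, not-yet-published events in insertion order."""
    rows = db.execute(
        "SELECT seq, event_id, payload FROM outbox_events "
        "WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_id, payload in rows:
        publish(event_id, payload)  # if this raises, the row stays unpublished
        with db:
            db.execute(
                "UPDATE outbox_events SET published = 1 WHERE seq = ?", (seq,)
            )
    return len(rows)


class DedupingConsumer:
    """Consumer that ignores an event_id it has already handled."""

    def __init__(self):
        self.seen = set()
        self.handled = []

    def handle(self, event_id, payload):
        if event_id in self.seen:
            return  # duplicate delivery from an at-least-once relay
        self.seen.add(event_id)
        self.handled.append(json.loads(payload))
```

If the relay crashes after `publish` but before marking the row, the next pass publishes the same event again. That duplicate is why the consumer keeps a set of seen `event_id` values.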
This is the compensation scenario. The saga tracks its progress: after step 1 (charge rider) succeeds, it records CHARGE_SUCCESS. If step 2 (credit driver) fails, the saga initiates compensation: it calls the payment gateway to refund the rider (using the charge_id from step 1), records REFUND_INITIATED, and flags the ride for manual resolution. The driver credit failure is typically due to a banking API issue, not a logic error — so the compensation ensures the rider is not charged for a ride where the driver was not paid. After manual resolution (support team fixes the driver credit), the saga can be resumed.
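The compensation path can be sketched as below. `FakeGateway` is a hypothetical in-memory stand-in, not a real payment SDK, and its method names are invented for the example. The step names CHARGE_SUCCESS and REFUND_INITIATED come from the text above; DRIVER_CREDITED and FLAGGED_FOR_MANUAL_RESOLUTION are assumed labels.

```python
class GatewayError(Exception):
    """Raised by the fake gateway when a call fails."""


class FakeGateway:
    """Hypothetical in-memory stand-in for a payment gateway."""

    def __init__(self, fail_driver_credit=False):
        self.fail_driver_credit = fail_driver_credit
        self.charges = {}     # idempotency_key -> charge_id
        self.refunded = set()

    def charge_rider(self, idempotency_key, amount_cents):
        # The same key always maps to the same charge, so a retried
        # request cannot create a second charge.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = f"ch_{len(self.charges) + 1}"
        return self.charges[idempotency_key]

    def credit_driver(self, ride_id, amount_cents):
        if self.fail_driver_credit:
            raise GatewayError("driver credit failed")

    def refund(self, charge_id):
        self.refunded.add(charge_id)


def run_payment_saga(gateway, ride_id, amount_cents):
    """Run charge then credit; compensate with a refund if the credit fails."""
    log = []
    # Step 1: charge the rider, keyed by ride so retries are idempotent.
    charge_id = gateway.charge_rider(f"charge:{ride_id}", amount_cents)
    log.append("CHARGE_SUCCESS")
    # Step 2: credit the driver; on failure, undo step 1.
    try:
        gateway.credit_driver(ride_id, amount_cents)
    except GatewayError:
        gateway.refund(charge_id)  # uses the charge_id recorded in step 1
        log.append("REFUND_INITIATED")
        log.append("FLAGGED_FOR_MANUAL_RESOLUTION")
        return log
    log.append("DRIVER_CREDITED")
    return log


if __name__ == "__main__":
    print(run_payment_saga(FakeGateway(), "ride-1", 1850))
    print(run_payment_saga(FakeGateway(fail_driver_credit=True), "ride-2", 1850))
```

A production saga would persist each step durably before moving on so that it can resume after a crash. The in-memory list here only shows the order of steps.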
A single gateway handling both 250K/sec driver GPS updates and 10K/sec rider requests creates a resource contention risk. Under load, driver traffic could consume all gateway capacity, starving rider ride requests. Separate gateways ensure rider requests always have dedicated capacity. Additionally, driver and rider APIs have different performance profiles: driver location updates are fire-and-forget (no response body needed), while rider ride requests need rich response bodies with driver info, ETA, and fare estimate. Separate gateways can optimize their serialization, compression, and connection handling independently.
Each city's infrastructure (RideStateMachine, LocationIngestion, RedisGeo, city-specific RideDB shard) is deployed in the nearest AWS region. A rider in NYC hits us-east-1; a rider in London hits eu-west-1. Shared services (payment gateway integration, global analytics, driver onboarding) run in a primary region (us-east-1) with cross-region read replicas. DNS-based routing (Route 53) directs traffic to the nearest regional deployment. If a region fails, Route 53 health checks detect the failure and reroute to a backup region (with some latency increase). Cross-city rides (pickup in one city, destination requiring a different city's drivers) are handled by the city of the pickup location.
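The routing rule above (serve a ride from the pickup city's region, fall back when that region is unhealthy) can be expressed as a small lookup. The two city-to-region assignments come from the text; the backup regions and the `resolve_region` helper are assumptions for illustration, and in the described design the health check itself is performed by Route 53 rather than by application code.

```python
# City-to-region assignments from the article.
PRIMARY_REGION = {"nyc": "us-east-1", "london": "eu-west-1"}
# Backup regions are assumptions for this sketch.
BACKUP_REGION = {"us-east-1": "us-west-2", "eu-west-1": "eu-central-1"}


def resolve_region(pickup_city, healthy_regions):
    """Pick the region that serves a ride, keyed by the pickup city."""
    primary = PRIMARY_REGION[pickup_city]
    if primary in healthy_regions:
        return primary
    backup = BACKUP_REGION[primary]
    if backup in healthy_regions:
        return backup  # served with some added latency
    raise RuntimeError(f"no healthy region for {pickup_city}")


if __name__ == "__main__":
    print(resolve_region("nyc", {"us-east-1", "eu-west-1"}))
    print(resolve_region("nyc", {"us-west-2", "eu-west-1"}))
```

Keying the lookup on the pickup city matches the rule that cross-city rides are handled by the city of the pickup location.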