Hard4 componentsInterview: Very High

Distributed Counter — Naive (Single PostgreSQL Row)

Q: Why does a single PostgreSQL row have a throughput ceiling?

Every UPDATE to the same row acquires a row-level exclusive lock. Concurrent UPDATEs serialize behind the lock holder. At 50ms per UPDATE (including WAL flush), the theoretical serial throughput is just 20 updates/sec. PostgreSQL optimizes with group commit and HOT updates, pushing practical throughput to ~2K-5K updates/sec per row. Beyond that, lock wait times cascade: each waiter holds a connection pool slot, the pool saturates (150 connections), and new requests queue at the application layer.

Q: Why is the connection pool the breaking point?

With 150 max connections and 50ms lock hold time, the pool can sustain ~3K concurrent transactions. But under contention, lock waiters hold connections while waiting — they are not released back to the pool until the UPDATE completes. At 5K inc/sec on a single key, 250 transactions are waiting at any moment, exceeding the 150-connection limit. Reads cannot get connections even though they would be fast (~5ms), so the entire system degrades.

Q: What is the first optimization an interviewer expects?

Shard the counter into N sub-counter rows. Instead of one row per key, create 16 rows: counter:{key}:0 through counter:{key}:15. Each increment randomly selects a shard and UPDATEs that specific row. Since each shard has its own row lock, 16 shards provide ~16x the per-key throughput. Reads sum all 16 shards. This is the core insight of the Sharded variant and the most common first optimization in counter design interviews.

Q: How does this compare to Redis INCR?

Redis INCR on a single key can sustain ~1M operations/sec (single-threaded, in-memory). PostgreSQL UPDATE sustains ~2K-5K/sec per row (disk-backed, WAL, row locks). Redis is ~200-500x faster for single-key increments. However, Redis INCR is not durable by default — a crash loses unflushed data. The Sharded variant uses Redis for the hot write path and PostgreSQL for durable persistence, combining the speed of Redis with the durability of PostgreSQL.

Q: Why is strong consistency both the strength and weakness?

Every SELECT returns the exact current count — no staleness, no eventual consistency. This is the naive approach's one genuine advantage over sharded or CRDT variants, which trade consistency for throughput. But strong consistency requires the write to complete (lock acquired, tuple written, WAL flushed) before the value is visible. Under contention, the lock wait chain means 'current' is whatever the last completed transaction wrote — potentially hundreds of milliseconds stale relative to the request arrival time.

Q: When is the naive approach actually the right choice?

For counters with low write rates per key (under 1K inc/sec) and where strong consistency matters. Examples: internal analytics dashboards, A/B test impression counters, feature flag rollout counters. If no single key receives more than a few hundred increments per second, the row lock never becomes a bottleneck. The simplicity of a single PostgreSQL table with no cache, no workers, and no event stream is a genuine operational advantage.

The simplest possible distributed counter: one PostgreSQL row per counter key, incremented via UPDATE ... SET count = count + 1. Demonstrates why row-level lock contention makes single-row counters unworkable at scale.

StorageBeginnerBottleneck AnalysisRow Locking

Try in Simulator

Problem Statement

Designing a distributed counter is one of the most deceptively simple system design interview questions. Candidates often start with a single database row and an atomic UPDATE statement, assuming the problem is trivially solved. The naive approach forces them to confront the reality that a single row in a relational database has a hard throughput ceiling, and that the row-level locking mechanism that provides correctness is exactly what prevents scalability.

The core challenge is incrementing a counter at high throughput while maintaining correctness. In the naive approach, each counter key maps to a single row in a PostgreSQL counters table. An increment is performed via UPDATE counters SET count = count + 1 WHERE key = $1. This statement is atomic — PostgreSQL acquires a row-level exclusive lock, reads the current value, increments it, writes the new value, and releases the lock. No application-level read-modify-write race is possible.

The fatal flaw is the row-level exclusive lock. Every concurrent increment to the same counter key must wait for the lock holder to finish. At 50ms per UPDATE (network + disk + WAL flush), a single row can sustain at most ~20 updates per second serially. With PostgreSQL's optimistic locking and batched WAL writes, practical throughput reaches ~2K-5K updates per second per key before lock wait cascades begin.

The cascade is devastating. At 5K inc/sec with 50ms lock hold time, the average lock wait queue has 250 transactions. Each waiter holds a connection pool slot. With 150 connections, the pool saturates at just 150 concurrent waiters, causing all subsequent requests — including reads — to queue at the application layer. Latency spikes from 50ms to multiple seconds, timeouts cascade, and the system enters a death spiral.

Reads are cheap in isolation — SELECT count FROM counters WHERE key = $1 uses MVCC and does not acquire an exclusive lock. But under write contention, reads compete for connection pool slots with the backed-up UPDATE transactions. Even though reads themselves are fast (~5ms), they cannot get a connection when 150 connections are occupied by waiting UPDATEs.

The naive approach has zero deduplication. The same user can increment the same counter multiple times — critical for likes (one per user) but acceptable for views. A UNIQUE constraint on (user_id, counter_key) would add dedup but further increases write contention by adding an index lookup to the UPDATE transaction.

This template makes the row lock contention visible and quantifiable. Run the simulation at increasing write rates and watch the lock wait queue grow, connection pool saturate, and latency cascade. The comparison with the Sharded variant provides the concrete numbers: 16 sub-counters distribute writes across 16 rows, increasing per-key throughput by 16x. The CRDT variant goes further, eliminating cross-region coordination entirely.

Distributed counter design appears in interviews at YouTube, Instagram, TikTok, Twitter, and any company with view/like/click counting at scale. Interviewers expect candidates to start with the single-row approach, identify the row lock bottleneck, and propose sharding as the first optimization.

Architecture Overview

The naive distributed counter is a four-component linear architecture: Client, Load Balancer, Counter Service, and PostgreSQL database. There is no cache, no event stream, no sharding, no deduplication, and no separation between the read and write paths.

All traffic enters through the Load Balancer, which distributes requests across Counter Service pods using round-robin. The Load Balancer adds approximately 1.5ms of routing latency and supports up to 15,000 RPS — well above the system's actual ceiling, which is determined by the database row lock. Both increment writes and count reads flow through the same LB and service — there is no CQRS split.

The Counter Service is a stateless REST API running on 3 pods with 100 threads each (300 concurrent capacity). It handles two operations: (1) increment — execute UPDATE counters SET count = count + 1 WHERE key = $1 against PostgreSQL; (2) read — execute SELECT count FROM counters WHERE key = $1. The service processing time is approximately 5ms, but total response time is dominated by the database operation and lock wait time.

PostgreSQL stores a single counters table with one row per counter key. The primary key is the counter key (item ID), and the count column stores the current value as a BIGINT. The table also tracks updated_at and created_at timestamps. There is no partitioning, no read replicas, and no sharding — a single primary handles all reads and writes.

The write path is the bottleneck. Each UPDATE acquires a row-level exclusive lock for the duration of the transaction. Under MVCC, the UPDATE creates a new tuple version and marks the old one as dead (requiring periodic VACUUM). The WAL (Write-Ahead Log) entry for each UPDATE must be flushed to disk for durability, adding 5-15ms per transaction with synchronous_commit=on.

The system has no redundancy at the data layer. If the single PostgreSQL primary fails, both reads and writes stop entirely. There is no failover, no replication, and no cache to serve stale reads during an outage.

The concrete scaling ceiling is approximately 2K-5K increments per second per hot key. At this point, the row lock wait queue saturates the 150-connection pool, read latency degrades due to connection starvation, and the system enters a cascading failure mode. Total system throughput across all keys is higher (PostgreSQL can handle ~20K-50K transactions per second for different rows), but any single hot key hits the row lock wall.

Architecture Preview

Loading architecture preview...

Open in Simulator

Key Design Decisions

Single Row Per Counter Key

Choice

One row in the counters table per logical counter, incremented via atomic UPDATE

Rationale

The simplest possible data model: UPDATE counters SET count = count + 1 WHERE key = $1 is atomic without application-side coordination. No read-modify-write race. But it serializes all writes to the same key via row-level locking. The Sharded variant splits each counter into 16 sub-counter rows to distribute lock contention.

Single PostgreSQL Instance

Choice

One database primary for all reads and writes, no replicas

Rationale

A single PostgreSQL instance eliminates replication lag and failover complexity. ACID transactions ensure the count is always accurate. The cost is that write-heavy hot keys saturate the connection pool, starving reads. Adding read replicas would help reads but not writes — the exclusive lock contention is on the primary.

No Cache Layer

Choice

Every read hits PostgreSQL directly via SELECT

Rationale

A Redis cache with short TTL (1-5 seconds) would absorb 95%+ of reads, reducing DB connection pressure. The naive approach skips caching to demonstrate the full impact of connection pool saturation under write contention. Adding cache-aside is often the first optimization interviewers expect.

No Sharding or Sub-Counters

Choice

Single row per key, no distribution of writes across multiple rows

Rationale

The sharded counter pattern (N sub-counter rows per logical counter) distributes writes across N rows, each with its own lock. The naive approach uses a single row to make the lock bottleneck visible. The Sharded variant demonstrates the 16x throughput improvement from 16 sub-counters.

No Deduplication

Choice

Same user can increment the same counter multiple times

Rationale

For likes (one per user per item), dedup is critical. A UNIQUE constraint on (user_id, counter_key) or a Bloom filter would prevent double-counting. The naive approach skips dedup for simplicity. The CRDT variant uses HyperLogLog for approximate dedup at ~2% error rate.

Synchronous Writes on Critical Path

Choice

Every increment waits for PostgreSQL UPDATE to complete before responding

Rationale

Synchronous writes give strong consistency — every read returns the exact count. But they tie client latency to database performance. Under row lock contention, clients wait hundreds of milliseconds for their turn. An async approach (publish to Kafka, return 202) would decouple ingestion from storage at the cost of eventual consistency.

Scale & Performance

Target RPS

~2K-5K per hot key (DB row lock ceiling)

Latency (p99)

50-500ms increments (varies with contention)

Storage

~1 GB/month at modest scale

Availability

~99% (single DB, no redundancy)

Time & Space Complexity

Operation	Time	Space	Notes
Increment (POST /api/v1/counters/{key}/increment)	O(1) UPDATE + O(1) WAL flush — but serialized by row lock	O(1) per counter key (single row, ~100 bytes)	Serial throughput per key: ~2K-5K/sec. Lock wait time grows linearly with concurrent writers.
Read count (GET /api/v1/counters/{key})	O(1) SELECT by primary key (MVCC, no exclusive lock)	O(1) single row fetch	Fast in isolation (~5ms) but starved under write contention due to connection pool exhaustion.
Upsert (first increment for a new key)	O(1) INSERT ... ON CONFLICT DO UPDATE	O(1) new row + index entry	Upsert handles counter creation transparently. Index update adds ~1ms.

Database Schema (HLD)

counters

Stores one row per counter key. The hottest table: UPDATE ... SET count = count + 1 acquires a row-level exclusive lock on every increment. Under contention on a single key, the lock wait queue grows linearly with write rate. Reads are simple SELECT by primary key (MVCC, no exclusive lock). No partitioning, no sharding — all keys share the same table and connection pool.

key VARCHAR PK (counter key / item ID)count BIGINT NOT NULL DEFAULT 0 (current counter value)updated_at TIMESTAMPTZ (last increment timestamp)created_at TIMESTAMPTZ (counter creation timestamp)

Indexes: PRIMARY KEY (key)

At 5K inc/sec per hot key with 50ms lock hold time, 250 transactions queue behind the lock. Connection pool (150) saturates, cascading to read starvation.

Solution Comparison

Variant	Tier	Latency	Throughput	Cost	Complexity	Reliability
Naive (Single PostgreSQL Row)	T1	50-500ms increments (lock contention)	~2K-5K inc/sec per key	$300/month (single RDS instance)	Low — no cache, no workers, no sharding	99% (single DB, no failover)
Sharded (Redis + Async Aggregation)	T2	<10ms increments (Redis INCR)	1M+ inc/sec per key (16 shards)	$2,500/month (Redis cluster + Kafka + DB)	Medium — shards, aggregation workers, Kafka	99.9% (replicated Redis + DB)
CRDT (Multi-Region Grow-Only Counters)	T3	<5ms local writes, <10s global convergence	2M+ inc/sec global (region-local writes)	$8,000/month (multi-region Redis + Kafka + DB)	High — CRDT vectors, HLL dedup, cross-region merge	99.95% (multi-region, independent replicas)

This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.

Frequently Asked Questions

Why does a single PostgreSQL row have a throughput ceiling?

Every UPDATE to the same row acquires a row-level exclusive lock. Concurrent UPDATEs serialize behind the lock holder. At 50ms per UPDATE (including WAL flush), the theoretical serial throughput is just 20 updates/sec. PostgreSQL optimizes with group commit and HOT updates, pushing practical throughput to ~2K-5K updates/sec per row. Beyond that, lock wait times cascade: each waiter holds a connection pool slot, the pool saturates (150 connections), and new requests queue at the application layer.

Why is the connection pool the breaking point?

With 150 max connections and 50ms lock hold time, the pool can sustain ~3K concurrent transactions. But under contention, lock waiters hold connections while waiting — they are not released back to the pool until the UPDATE completes. At 5K inc/sec on a single key, 250 transactions are waiting at any moment, exceeding the 150-connection limit. Reads cannot get connections even though they would be fast (~5ms), so the entire system degrades.

What is the first optimization an interviewer expects?

Shard the counter into N sub-counter rows. Instead of one row per key, create 16 rows: counter:{key}:0 through counter:{key}:15. Each increment randomly selects a shard and UPDATEs that specific row. Since each shard has its own row lock, 16 shards provide ~16x the per-key throughput. Reads sum all 16 shards. This is the core insight of the Sharded variant and the most common first optimization in counter design interviews.

How does this compare to Redis INCR?

Redis INCR on a single key can sustain ~1M operations/sec (single-threaded, in-memory). PostgreSQL UPDATE sustains ~2K-5K/sec per row (disk-backed, WAL, row locks). Redis is ~200-500x faster for single-key increments. However, Redis INCR is not durable by default — a crash loses unflushed data. The Sharded variant uses Redis for the hot write path and PostgreSQL for durable persistence, combining the speed of Redis with the durability of PostgreSQL.

Why is strong consistency both the strength and weakness?

Every SELECT returns the exact current count — no staleness, no eventual consistency. This is the naive approach's one genuine advantage over sharded or CRDT variants, which trade consistency for throughput. But strong consistency requires the write to complete (lock acquired, tuple written, WAL flushed) before the value is visible. Under contention, the lock wait chain means 'current' is whatever the last completed transaction wrote — potentially hundreds of milliseconds stale relative to the request arrival time.

When is the naive approach actually the right choice?

For counters with low write rates per key (under 1K inc/sec) and where strong consistency matters. Examples: internal analytics dashboards, A/B test impression counters, feature flag rollout counters. If no single key receives more than a few hundred increments per second, the row lock never becomes a bottleneck. The simplicity of a single PostgreSQL table with no cache, no workers, and no event stream is a genuine operational advantage.

Related Templates

Distributed Counter — Sharded (Redis + Async Aggregation)Distributed Counter — CRDT (Multi-Region Grow-Only Counters)

Discussion

Ready to design your own Distributed Counter?

Open the simulator, place components on the canvas, wire them up, and run a traffic simulation to see how your architecture performs under real load.

Open Simulator