System Design Interview Questions
The 10 most frequently asked system design interview questions at FAANG and top startups — with structured model answers.
-
Q1
Design a URL Shortener (like bit.ly)
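Core requirements: Given a long URL, return a 7-character short code. Redirect a short code to its original URL in under 10 ms. Handle 100M stored URLs and 10B reads/month (roughly 3,900 reads per second on average).
Architecture: App servers sit behind a load balancer. The short code is the Base62 encoding of a counter from a distributed ID generator (for example, a Snowflake-style service). Seven Base62 characters cover 62^7 ≈ 3.5 trillion values, while a full 64-bit Snowflake ID can encode to as many as 11 characters, so the counter must stay below 62^7 to keep codes at 7 characters. Store the shortCode→longUrl mapping in Cassandra (high write throughput, tunable consistency), keyed by short code because that is the lookup the redirect path performs. Cache hot short codes in Redis with LRU eviction, and serve cacheable redirect responses from a CDN. For analytics, emit redirect events to Kafka and aggregate asynchronously.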
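How to answer: Structure your answer as Requirements → Scale estimates → API → DB schema → Read/write paths → Trade-offs. Mention that a 301 (permanent) redirect is cached by browsers, so repeat clicks may never reach your servers and click analytics undercount; a 302 (temporary) redirect sends every click through the service, which preserves analytics at the cost of more load. This detail is a strong differentiator.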
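The Base62 step above can be sketched as follows. This is a minimal illustration, not a production encoder: the alphabet ordering, the function names `encode_base62` and `decode_base62`, and the choice to left-pad codes to a fixed width are all assumptions made for the example.

```python
import string

# Alphabet order is an assumption; any fixed 62-character ordering works.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int, width: int = 7) -> str:
    """Encode a non-negative counter value as a fixed-width Base62 string."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n >= BASE ** width:
        raise ValueError("counter does not fit in the requested width")
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Left-pad with the zero digit so every code has the same length.
    return "".join(reversed(chars)).rjust(width, ALPHABET[0])


def decode_base62(code: str) -> int:
    """Invert encode_base62 by accumulating digits from left to right."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With a width of 7 the encoder accepts counter values up to 62^7 - 1, which is far more than the 100M URLs in the requirements. Sequential counters produce guessable codes, so a real design might shuffle or obfuscate the counter before encoding.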
-
Q2
Design Twitter's News Feed
There are two basic approaches, plus a hybrid.
Fan-out on write (push model): when a user tweets, pre-compute the result and insert the tweet into each follower's feed table in Cassandra. This costs O(followers) writes per tweet but makes reads cheap.
Fan-out on read (pull model): when a user loads the feed, fetch recent tweets from everyone they follow, then merge and rank them. This costs O(following) reads per load, which becomes slow and expensive at scale.
Hybrid (Twitter's approach): fan-out on write for regular users, fan-out on read for celebrities, whose follower counts would make a push prohibitively expensive. Feeds are stored in Redis sorted sets. Ranking uses a mix of recency and engagement signals.
How to answer: Interviewers expect you to identify the fan-out problem and propose the hybrid solution. Naming the Redis sorted set as the feed data structure is a strong technical detail. Talk about eventual consistency: a slight delay in feed updates is acceptable.
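The hybrid model can be sketched in memory as below. Everything here is illustrative: the `FeedService` class, the tiny `CELEBRITY_THRESHOLD`, and the plain Python lists standing in for Redis sorted sets are assumptions for the example, and ranking is by timestamp only.

```python
import heapq
from collections import defaultdict

# Hypothetical cutoff; a real system would use a far larger value.
CELEBRITY_THRESHOLD = 3


class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)          # author -> followers
        self.following = defaultdict(set)          # user -> authors followed
        self.tweets_by_author = defaultdict(list)  # author -> [(ts, tweet_id)]
        # Precomputed feeds; stands in for one Redis sorted set per user.
        self.feeds = defaultdict(list)             # user -> [(ts, tweet_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, tweet_id, ts):
        self.tweets_by_author[author].append((ts, tweet_id))
        # Push path: only regular users fan out on write.
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.feeds[follower].append((ts, tweet_id))

    def load_feed(self, user, limit=10):
        # Start from the precomputed feed, then pull celebrity tweets at read time.
        # A set removes duplicates if an author crossed the threshold between posts.
        items = set(self.feeds[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                items.update(self.tweets_by_author[author])
        # Newest first; a real ranker would also weigh engagement signals.
        return [tweet_id for _, tweet_id in heapq.nlargest(limit, items)]
```

The write cost of a celebrity tweet is constant, while each reader pays one extra lookup per celebrity they follow, which is the trade-off the hybrid model accepts.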
-
Q3
Design a Rate Limiter
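Algorithms: Fixed Window (simple, but allows a burst of up to twice the limit across a window edge). Sliding Window Log (accurate, but memory heavy). Sliding Window Counter (a balance of accuracy and memory). Token Bucket (allows controlled bursts). Leaky Bucket (smooths the output rate).
Implementation: Store counters in Redis with a TTL. In a distributed deployment, many app servers update the same counter concurrently, so use atomic operations. INCR is atomic on its own, but INCR followed by a separate EXPIRE is not atomic as a pair; wrap both in a Lua script (or a MULTI/EXEC transaction) to avoid race conditions such as a counter left without a TTL. Return HTTP 429 with a Retry-After header. Place the limiter at the API Gateway or as middleware so requests are intercepted before business logic runs.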
How to answer: Name all five algorithms, explain why Token Bucket suits burst-friendly APIs, and explain the distributed race condition along with the atomic Redis solution (INCR, plus a Lua script when the TTL must be set in the same step). Most candidates know the algorithms but miss the distributed implementation detail.
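A Token Bucket limiter can be sketched as below. This is a single-process illustration under stated assumptions: the `TokenBucket` class, its parameter names, and the injectable `clock` are made up for the example. A distributed version would keep the token count and timestamp in Redis and perform the refill-and-take step atomically, for instance inside a Lua script.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock            # injectable so tests can control time
        self.tokens = capacity        # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        # Caller would respond with HTTP 429 and a Retry-After header.
        return False
```

A caller would typically keep one bucket per client key (user ID, API key, or IP address) and reject the request when `allow()` returns `False`.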
-
Q4
Design a Distributed Cache
Requirements: Low-latency reads and writes (< 1 ms), high availability, horizontal scaling.
Architecture: Use consistent hashing to distribute keys across cache nodes, so that adding or removing a node remaps only the keys on the neighboring segments of the ring (roughly 1/N of all keys) instead of reshuffling everything. Each node stores data in memory with LRU eviction. Replication: primary-replica for fault tolerance.
Invalidation strategies: TTL-based (simple, eventual consistency) or event-driven (a DB trigger or CDC stream writes to Kafka, and a cache consumer invalidates the affected keys).
Write policies: Cache-aside (the app manages the cache), Write-through (cache and DB updated synchronously), Write-back (DB updated asynchronously, with a risk of data loss).
How to answer: Cover consistent hashing (node distribution), eviction policy (LRU), replication (fault tolerance), and invalidation (the hardest part). Interviewers love candidates who say 'cache invalidation is one of the two hard problems in CS' and then actually explain a solution.
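Consistent hashing can be sketched as a sorted ring of hash positions. The `HashRing` class, the use of SHA-256 truncated to 64 bits, and the default of 100 virtual nodes per physical node are assumptions chosen for illustration; virtual nodes are a common addition that evens out the key distribution.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node appears at `vnodes` positions on the ring.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a node joins, the only keys that change owner are the ones the new node takes over; every other key stays on the node that already held it, which keeps the cache warm during scaling.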
-
Q5
Design WhatsApp / a Chat System
Requirements: 1-1 messaging, delivery receipts, online status, message history.
Connection: WebSocket for real-time bidirectional messaging. Each user connects to a Chat Server; users on the same server can be served directly by that server. For users on different servers, messages are routed between servers via a message router plus Pub/Sub (Redis or Kafka).
Storage: Cassandra for message history (write-heavy, time-series); HBase or DynamoDB are alternatives. Use a composite key (conversationId + timestamp) so that a conversation's messages are stored together in time order.
Delivery: The Message Service stores the message in the DB, then pushes it to the recipient's server. Use APNs/FCM push notifications for offline recipients.
How to answer: The WebSocket vs HTTP polling decision is the most important one, so explain why WebSocket is necessary (low latency, bidirectional). Cover connection management, message routing, storage, and offline delivery. Group chat is a natural extension: fan out each message to all group members.
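The delivery path described above (persist, then route locally, across servers, or to push) can be sketched in memory. This is only a model of the decision logic: the `ChatCluster` class, the route labels, and the dictionaries standing in for the session registry, Pub/Sub channels, message store, and APNs/FCM queue are all hypothetical.

```python
from collections import defaultdict


class ChatCluster:
    def __init__(self):
        self.user_server = {}             # session registry: user -> chat server id
        self.inboxes = defaultdict(list)  # server id -> [(recipient, text)]; Pub/Sub stand-in
        self.history = defaultdict(list)  # conversation id -> [(ts, sender, text)]
        self.push_queue = []              # stand-in for APNs/FCM notifications

    def connect(self, user, server):
        self.user_server[user] = server

    def disconnect(self, user):
        self.user_server.pop(user, None)

    @staticmethod
    def conversation_id(a, b):
        # Order-independent id so both directions share one partition.
        return ":".join(sorted((a, b)))

    def send(self, sender, recipient, text, ts):
        # Persist first so the message survives even if delivery fails.
        conv = self.conversation_id(sender, recipient)
        self.history[conv].append((ts, sender, text))

        server = self.user_server.get(recipient)
        if server is None:
            # Recipient has no live WebSocket: fall back to a push notification.
            self.push_queue.append((recipient, text))
            return "queued_for_push"

        # Same server: hand over directly. Different server: go through Pub/Sub.
        route = "local" if server == self.user_server.get(sender) else "pubsub"
        self.inboxes[server].append((recipient, text))
        return route
```

Keying history by conversation id and appending in timestamp order mirrors the composite key (conversationId + timestamp) from the storage design.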