Kafka Partitions and Ordering Explained
Kafka guarantees ordering within a partition, not across the whole topic. The key idea is choosing the right message key so related events go to the same partition.
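As a rough illustration of why the key matters, the sketch below maps each key to a partition by hashing it. The function `partition_for` and the sample events are hypothetical, and MD5 stands in for the hash only to keep the example deterministic. Kafka's Java client uses a murmur2 hash for keyed records by default, so real partition numbers will differ.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition: same key, same partition (illustrative hash only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Hypothetical events, keyed by order ID so each order's events stay together.
events = [
    ("order-42", "created"),
    ("order-7", "created"),
    ("order-42", "paid"),
    ("order-42", "shipped"),
]

# Each partition is an append-only list, so per-key order is preserved.
partitions = {}
for key, event in events:
    partitions.setdefault(partition_for(key, 3), []).append((key, event))

order_42 = [e for k, e in partitions[partition_for("order-42", 3)] if k == "order-42"]
print(order_42)
```

Events for `order-42` always land in one partition in the order they were produced. Events for different keys may land in different partitions, where no relative order is guaranteed.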
Interview Prep
Distributed systems, messaging, scalability, caching, queues, consistency, and backend architecture interview prep.
Idempotency keys prevent duplicate side effects when clients retry requests. They are critical for payments, order creation, purchases, booking systems, and any API where running the same operation twice would be dangerous.
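A minimal in-memory sketch of the idea, assuming a hypothetical `PaymentService` whose clients send an `idempotency_key` with each request. A real system would keep the key-to-result mapping in a durable store and make the check-and-store step atomic.

```python
class PaymentService:
    """Hypothetical service that remembers the result of each idempotency key."""

    def __init__(self):
        self._results = {}      # idempotency_key -> stored response
        self.charges_made = 0   # counts real side effects

    def charge(self, idempotency_key, amount_cents):
        # A retry with a known key returns the stored result; no second charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result

service = PaymentService()
first = service.charge("key-abc", 500)
retry = service.charge("key-abc", 500)  # client retried after a timeout
print(first == retry, service.charges_made)
```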
At-least-once means messages are not lost but may be processed more than once. Exactly-once means the final processing effect happens once, which usually requires transactions, idempotency, or careful system design.
Rate limiting controls how many requests a client can make within a time period. Common strategies include fixed window, sliding window, token bucket, and leaky bucket.
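Of the strategies listed, token bucket is the one most often sketched in interviews. The version below is a single-process illustration. The class name, parameters, and injectable `clock` are assumptions made for the example, and a shared limiter would need locking or an external store.

```python
import time

class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock                # injectable so tests can control time
        self.tokens = float(capacity)     # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.last_refill
        # Add tokens for the time that has passed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```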
Consistent hashing distributes keys across servers in a way that minimizes remapping when servers are added or removed. It is useful for caches, sharding, load balancing, and routing users to stable backend nodes.
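The remapping property can be seen in a small hash ring. This is a sketch under assumptions: `HashRing`, the virtual-node count, and the use of MD5 are illustrative choices, not a reference to any particular library.

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

class HashRing:
    """Places each node at several points (virtual nodes) on a ring of hash values."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # Walk clockwise to the first point after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

keys = [f"user-{i}" for i in range(1000)]
before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b", "c", "d"])
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding node d")
```

Every key that moves is reassigned to the new node `d`; keys never shuffle between the existing nodes. A plain `hash(key) % n` scheme would instead remap most keys whenever `n` changes.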
A safe money transfer system must handle concurrency, consistency, deadlocks, retries, duplicate requests, and distributed system failures while ensuring money is never lost or duplicated.
Understand how an LRU cache works, how Java LinkedHashMap can implement it, and what interviewers expect beyond the code.
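For a sense of the mechanics, here is a Python analogue rather than the Java version: `collections.OrderedDict` plays a role similar to an access-ordered `LinkedHashMap`. The `LRUCache` class and its method names are assumptions for this sketch.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))
```

Both `get` and `put` run in O(1) time. This sketch is not thread-safe, which is a common follow-up question.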
Understand when Kafka and similar messaging systems become useful: moving from direct service calls to durable event-driven communication.
Caching improves latency and reduces load by storing frequently used data closer to the application or user. Common strategies include cache-aside, read-through, write-through, write-behind, write-around, TTL-based caching, and CDN caching.
Latency in system design is the time between a client sending a request and receiving the result back from the server. Tail latency describes the slowest requests in a system. p99 latency means 99% of requests are faster than this value, while the slowest 1% are at or above it.
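One common way to compute a percentile from recorded samples is the nearest-rank method, sketched below. The `percentile` helper and the sample data are hypothetical, and monitoring systems often use interpolation or histogram approximations instead.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies: 1 ms, 2 ms, ..., 100 ms.
latencies_ms = list(range(1, 101))
print(percentile(latencies_ms, 50), percentile(latencies_ms, 99))
```

With these samples the median is 50 ms while p99 is 99 ms, which shows how an average or median can hide the slow requests that tail latency exposes.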
A dead letter queue stores messages that could not be processed successfully after retries or validation failures. It helps the main pipeline keep moving while preserving failed messages for debugging, alerting, and replay.
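The flow can be sketched with plain lists standing in for the queues. The function `process_with_dlq`, its parameters, and the record fields are assumptions for illustration; a real consumer would publish failures to a separate topic or queue.

```python
def process_with_dlq(messages, handler, max_attempts=3):
    """Try each message up to max_attempts times; park persistent failures in a DLQ."""
    processed, dead_letters = [], []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                processed.append(handler(message))
                break
            except Exception as error:
                if attempt == max_attempts:
                    # Keep the message and failure context for debugging and replay.
                    dead_letters.append(
                        {"message": message, "error": repr(error), "attempts": attempt}
                    )
    return processed, dead_letters

def double_positive(value):
    if value < 0:
        raise ValueError("negative values are not supported")
    return value * 2

done, dlq = process_with_dlq([1, -2, 3], double_positive)
print(done, [entry["message"] for entry in dlq])
```

The failing message does not block the ones behind it, and the stored error and attempt count make later replay or alerting easier.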
Deadlock happens when threads wait forever on each other's locks. The most practical prevention strategy is consistent lock ordering, plus timeouts, smaller lock scope, and better system design.
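Consistent lock ordering can be illustrated with two accounts that are transferred between in both directions at once. The `Account` class and `transfer` function are hypothetical; the point is only that both threads acquire locks in the same global order (here, ascending `account_id`).

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(source, target, amount):
    if source is target:
        raise ValueError("source and target must differ")
    # Always lock the account with the smaller ID first, whatever the direction.
    first, second = sorted((source, target), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        if source.balance < amount:
            return False
        source.balance -= amount
        target.balance += amount
        return True

a, b = Account(1, 1000), Account(2, 1000)
threads = [
    threading.Thread(target=lambda: [transfer(a, b, 1) for _ in range(500)]),
    threading.Thread(target=lambda: [transfer(b, a, 1) for _ in range(500)]),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(a.balance + b.balance)
```

If each thread instead locked its own `source` first, one could hold `a.lock` while the other holds `b.lock`, and both would wait forever.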
Availability means the system can successfully serve users when they need it. This page explains how availability is measured, improved, and discussed in system design interviews.
Timeouts prevent services from waiting forever, while retries help recover from temporary failures. Used badly, retries can overload dependencies and cause cascading failures.
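A bounded retry loop with exponential backoff and jitter is one way to avoid the overload problem. In this sketch, `call_with_retries` and its parameters are hypothetical, `operation` is assumed to enforce its own timeout, and only `ConnectionError` is treated as retryable.

```python
import random
import time

def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry a transient failure a limited number of times, backing off between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            # Exponential backoff with full jitter so clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(random.uniform(0, delay))
```

Capping the attempts and spreading retries out with jitter limits the extra load a struggling dependency receives. Retrying is only safe for operations that are idempotent.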
A circuit breaker protects your service from repeatedly calling a failing dependency. It fails fast, gives the dependency time to recover, and helps prevent cascading failures.
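The states behind this behavior (closed, open, half-open) fit in a small class. This is a simplified single-threaded sketch; `CircuitBreaker`, `CircuitOpenError`, and the default thresholds are assumptions, and production libraries add features such as rolling failure windows.

```python
import time

class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the dependency."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("dependency marked unavailable")  # fail fast
            # Timeout elapsed: half-open, let this one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```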
Graceful degradation means keeping the most important user flows working even when optional features or dependencies fail.
Reliability means a system performs its intended function correctly and consistently over time. It includes availability, correctness, durability, recovery, observability, and predictable behavior during failures.
Health checks help infrastructure decide whether an instance is alive, ready for traffic, or should be restarted. They are essential for load balancing, deployments, and auto-recovery.