Apache Kafka Explained

What is Kafka and Why Do We Need It?
Real-World Problem: Uber Example
Let's understand with a simple example. Suppose there is a company called Uber. In Uber, there are:
- Drivers
- Customers
The Problem:
- Customer books a ride
- Customer wants to track the driver's live location
- Driver sends his location coordinates every second to the Uber server
- This location data needs to be:
- Saved in database for analysis
- Sent to customer for live tracking
- etc
Let's do the math:
- Suppose a driver sends location every second
- If it takes 30 minutes to reach the customer's location
- 30 minutes = 30 × 60 = 1,800 seconds
- This means 1,800 records will be saved in the database for just one ride!
Now imagine:
- 100,000 (1 lakh) active rides at the same time
- Each sending location every second
- The database will receive 100,000 × 1,800 = 180,000,000 (18 crore) insertions in 30 minutes
- Database operations per second will be very high
- The system might crash under the load
Database Throughput Problem
What is Throughput?
- Throughput is the number of operations (reads/writes) a database can handle in a given time
- If incoming operations exceed that capacity, requests pile up
- Sustained overload slows everything down and can crash the system
Traditional Database Issues:
- Low throughput capacity
- Cannot handle millions of real-time operations
- Gets slow with heavy traffic
Kafka - The Solution
What is Kafka?
- Primary Identity: Distributed Event Streaming Platform (A system that captures and shares data at the very moment it happens)
- Secondary Use: Can work as a Message Broker
- Kafka has very high throughput (millions of events per second)
- It is NOT a replacement for a database
- Kafka provides persistent event storage with configurable retention
- Database has permanent storage but limited real-time throughput
Key Difference from Traditional Message Brokers:
- Message Broker: Messages deleted after consumption
- Event Streaming: Events stored persistently, can be replayed
How Kafka Helps:
- Handles millions of events per second
- Acts as a buffer between data producers and consumers
- Prevents database overload
- Events can be replayed for reprocessing
- Multiple consumers can read same events
- Enables real-time stream processing
Kafka Architecture
Basic Components
[Producer] -----> [Kafka Server] -----> [Consumer]
(Driver App)      (Message Broker)      (Location Service)
                        |
                        |
                  +-----------+
                  |  Topic 1  |  (driver-locations)
                  |  Topic 2  |  (ride-requests)
                  |  Topic 3  |  (payments)
                  +-----------+
Components:
- Producer: Sends data to Kafka (like Uber driver sending location)
- Kafka Server: Stores data for a configurable retention period and serves it to consumers
- Consumer: Reads data from Kafka (like location service sending data to customer)
- Topics: Categories where messages are stored (like 'driver-locations', 'ride-requests')
Topics and Partitions
What are Topics?
- Topics are like folders where similar messages are stored
- Example topics: 'driver-locations', 'payment-data', 'ride-bookings'
- Definition: A topic is like a category or channel where messages are stored in Kafka.
What are Partitions?
- Each topic is divided into smaller parts called partitions
- This helps in parallel processing
- More partitions = more parallelism (more consumers can process the topic at once)
Topic: driver-locations
+---------------------------------------------------------------+
|                                                               |
|   Partition 0         Partition 1         Partition 2         |
|   (North India)       (South India)       (East India)        |
|   +----------+        +----------+        +----------+        |
|   | Message1 |        | Message3 |        | Message5 |        |
|   | Message2 |        | Message4 |        | Message6 |        |
|   +----------+        +----------+        +----------+        |
|        |                   |                   |              |
|   Consumer 1          Consumer 2          Consumer 3          |
+---------------------------------------------------------------+
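If you want to try this layout yourself, here is a minimal sketch of creating a partitioned topic with the KafkaJS admin client (the same Node.js library used in the implementation section later in this post); the topic name, partition count, and broker address are illustrative assumptions:

// create-topic.js - sketch: create a topic with 3 partitions (assumes a local broker)
const { Kafka } = require('kafkajs')

async function createTopic() {
  const kafka = new Kafka({ clientId: 'admin-client', brokers: ['localhost:9092'] })
  const admin = kafka.admin()
  await admin.connect()

  // One topic, three partitions, mirroring the driver-locations diagram above
  await admin.createTopics({
    topics: [{ topic: 'driver-locations', numPartitions: 3, replicationFactor: 1 }]
  })

  await admin.disconnect()
}

createTopic().catch(console.error)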
Consumer Groups and Load Balancing
Why Do We Need Consumer Groups?
Problem Without Consumer Groups:
Topic with 1 Million Messages
+-----------------------------+
| Message 1, Message 2, ...   |
| Message 999,999             |
| Message 1,000,000           |
+-----------------------------+
              |
              v
      +---------------+
      |    Single     |  <- takes 10 hours to process all messages!
      |   Consumer    |
      +---------------+
Solution With Consumer Groups:
Topic with 1 Million Messages (4 Partitions)
+---------+  +---------+  +---------+  +---------+
|250K msgs|  |250K msgs|  |250K msgs|  |250K msgs|
|   P0    |  |   P1    |  |   P2    |  |   P3    |
+---------+  +---------+  +---------+  +---------+
     |            |            |            |
     v            v            v            v
+---------+  +---------+  +---------+  +---------+
|Consumer |  |Consumer |  |Consumer |  |Consumer |
|    1    |  |    2    |  |    3    |  |    4    |
+---------+  +---------+  +---------+  +---------+

Takes only 2.5 hours! (4x faster)
Benefits of Consumer Groups:
- Parallel Processing: Multiple consumers work simultaneously
- Load Distribution: Work is automatically divided
- Fault Tolerance: If one consumer fails, others continue
- Scalability: Add more consumers to handle more load
- Multiple Use Cases: Different groups can process the same data differently
Real-World Example (E-commerce):
Order Topic -> Consumer Group 1 (Inventory Service) -> Updates stock
            -> Consumer Group 2 (Payment Service)   -> Processes payment
            -> Consumer Group 3 (Email Service)     -> Sends confirmation
            -> Consumer Group 4 (Analytics Service) -> Records metrics
Each service gets the same order data but processes it for different purposes!
Auto-Balancing Rules
Important Rule:
- One partition can be consumed by only ONE consumer at a time within the same group
- One consumer can handle multiple partitions
- This ensures no message is processed twice
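If you want to watch these assignment rules play out, KafkaJS (the Node.js client used later in this post) emits a GROUP_JOIN instrumentation event after every rebalance. A small sketch, assuming a consumer created as in the implementation section below:

// Sketch: log which partitions this consumer owns after each rebalance
// (consumer is a KafkaJS consumer, e.g. kafka.consumer({ groupId: 'demo-group' }))
consumer.on(consumer.events.GROUP_JOIN, (event) => {
  // memberAssignment maps topic name -> array of assigned partition numbers
  console.log('Assigned partitions:', event.payload.memberAssignment)
})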
Different Scenarios
Scenario 1: 1 Consumer, 4 Partitions
Topic with 4 Partitions:
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |
   +--------+----+---+--------+
                |
         +------------+
         | Consumer 1 |  (handles ALL partitions)
         +------------+
Result: Consumer 1 handles all 4 partitions (P0, P1, P2, P3)
Scenario 2: 2 Consumers, 4 Partitions
Topic with 4 Partitions:
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |
   +---+----+        +----+---+
       |                  |
+------------+     +------------+
| Consumer 1 |     | Consumer 2 |
|  (P0, P1)  |     |  (P2, P3)  |
+------------+     +------------+
Result: Each consumer handles 2 partitions (Auto-balanced)
Scenario 3: 3 Consumers, 4 Partitions
Topic with 4 Partitions:
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |
   +---+----+        |        |
       |             |        |
+------------+  +----------+  +----------+
| Consumer 1 |  | Consumer |  | Consumer |
|  (P0, P1)  |  |  2 (P2)  |  |  3 (P3)  |
+------------+  +----------+  +----------+
Result: Consumer 1 gets 2 partitions, Consumer 2 and 3 get 1 partition each
Scenario 4: 4 Consumers, 4 Partitions (Perfect Balance!)
Topic with 4 Partitions:
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |
   |        |        |        |
+-----+  +-----+  +-----+  +-----+
|Con 1|  |Con 2|  |Con 3|  |Con 4|
|(P0) |  |(P1) |  |(P2) |  |(P3) |
+-----+  +-----+  +-----+  +-----+
Result: Perfect balance - each consumer gets exactly 1 partition
Scenario 5: 5 Consumers, 4 Partitions (1 Consumer IDLE!)
Topic with 4 Partitions:
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |
   |        |        |        |
+-----+  +-----+  +-----+  +-----+  +----------+
|Con 1|  |Con 2|  |Con 3|  |Con 4|  | Consumer |
|(P0) |  |(P1) |  |(P2) |  |(P3) |  | 5 (IDLE) |
+-----+  +-----+  +-----+  +-----+  +----------+
Result: Consumer 5 remains IDLE (no work to do)
Multiple Consumer Groups
Important Concept:
- Different consumer groups can read the same data
- Each group processes data independently
Topic: driver-locations
+-----+  +-----+  +-----+  +-----+
| P0  |  | P1  |  | P2  |  | P3  |
+-----+  +-----+  +-----+  +-----+
   |        |        |        |    (both groups read the same data)
+------------------------------------------+
|            Consumer Group 1              |
|           (Location Service)             |
|   +-----+  +-----+  +-----+  +-----+     |
|   |Con 1|  |Con 2|  |Con 3|  |Con 4|     |
|   |(P0) |  |(P1) |  |(P2) |  |(P3) |     |
|   +-----+  +-----+  +-----+  +-----+     |
+------------------------------------------+
+------------------------------------------+
|            Consumer Group 2              |
|           (Analytics Service)            |
|          +-----------------+             |
|          |   Consumer 1    |             |
|          |   (All P0-P3)   |             |
|          +-----------------+             |
+------------------------------------------+
Use Case:
- Group 1: Sends live location to customers (4 consumers for speed)
- Group 2: Stores data for analytics and reporting (1 consumer is enough)
Queue vs Pub/Sub in Kafka
Traditional Systems
- SQS/RabbitMQ: designed primarily around queue semantics (messages are removed once consumed)
- Kafka: Supports both Queue and Pub/Sub
Queue Behavior in Kafka (Within Same Group)
Producer --+
           |
Producer --+--> Topic (4 Partitions) --> Consumer Group (4 Consumers)
           |    (P0, P1, P2, P3)        |
Producer --+                            |- Consumer 1 <- P0
                                        |- Consumer 2 <- P1
                                        |- Consumer 3 <- P2
                                        |- Consumer 4 <- P3
- Each partition goes to ONLY ONE consumer in the group
- Each consumer handles exactly one partition (perfect balance!)
- Work is divided among consumers (no duplication)
- Perfect for task distribution
Ideal Scenario: Topic Partitions = Consumer Count (4 = 4)
Detailed Queue Behavior Cases:
Case 1: Consumers = Partitions (Perfect Balance)
- Each consumer gets exactly one partition.
- Messages in a partition are read by only one consumer.
- This behaves just like a queue:
- Work is evenly distributed.
- No duplication inside the group; queue-like behavior achieved.
Case 2: Consumers < Partitions (Some consumers handle more)
- Some consumers handle multiple partitions.
- Still queue-like, but less parallelism (some consumers get more load).
- Example: 2 consumers, 4 partitions = each consumer handles 2 partitions
Case 3: Consumers > Partitions (Some consumers idle)
- Some consumers will be idle.
- Kafka won't assign more than one consumer to the same partition (inside a group).
- This means adding consumers beyond the partition count won't increase throughput.
- Example: 6 consumers, 4 partitions = 2 consumers will be idle
Pub/Sub Twist:
Even in the queue case above (one consumer per partition), if you add another consumer group, that group gets its own copy of the data. That's the pub/sub behavior.
Pub/Sub Behavior in Kafka (Different Groups)
Producer --+
           |
Producer --+--> Topic --+--> Consumer Group 1 (Location Service)
           |            |
Producer --+            +--> Consumer Group 2 (Analytics Service)
                        |
                        +--> Consumer Group 3 (Billing Service)
- Each GROUP gets ALL messages
- The same message goes to all groups
- Perfect for broadcasting information
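You can reproduce both behaviors with the demo scripts from the next section: run several copies of the consumer with the same groupId for queue behavior, or with different groupIds for pub/sub behavior. A rough sketch (the script names refer to the KafkaJS demo below):

# Queue behavior: same group, the partitions are split between the two processes
node consumer.js    # terminal 1 (groupId: demo-group)
node consumer.js    # terminal 2 (groupId: demo-group)

# Pub/Sub behavior: different groups, each process receives every message
node consumer.js    # terminal 1 (groupId: demo-group)
node consumer2.js   # terminal 2 (groupId: demo-group-2)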
KafkaJS Implementation Example
Before we dive into Kafka's architecture evolution, let's see how to actually use Kafka in a Node.js application with the KafkaJS library.
Installation
npm install kafkajs
Producer (Send Messages)
// producer.js
const { Kafka } = require('kafkajs')

async function runProducer() {
  const kafka = new Kafka({
    clientId: 'my-producer',
    brokers: ['localhost:9092'] // match your Kafka advertised listener
  })

  const producer = kafka.producer()
  await producer.connect()

  for (let i = 0; i < 5; i++) {
    await producer.send({
      topic: 'demo-topic',
      messages: [{ key: `key-${i}`, value: `Hello KafkaJS ${i}` }]
    })
    console.log(`Sent message ${i}`)
  }

  await producer.disconnect()
}

runProducer().catch(console.error)
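One detail worth knowing here: messages that share a key always land in the same partition (KafkaJS's default partitioner hashes the key), which preserves per-key ordering. In the Uber example, keying by driver ID keeps each driver's updates in order. A hypothetical snippet (topic and field names are made up), to be placed inside an async function like runProducer above:

// All updates for driver-42 land in the same partition, so they stay ordered
await producer.send({
  topic: 'driver-locations',
  messages: [
    { key: 'driver-42', value: JSON.stringify({ lat: 28.6139, lng: 77.209 }) }
  ]
})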
What if the Topic Doesn't Exist?
Case 1: auto.create.topics.enable=true (Bitnami default)
- Kafka automatically creates the topic on the first message
- Uses default settings: 1 partition, replication.factor=1
- Your producer works without manual topic creation
Case 2: auto.create.topics.enable=false (production setup)
- Kafka rejects the message with the error UNKNOWN_TOPIC_OR_PARTITION
- You must manually create the topic first, using the Kafka UI or CLI
Best Practice: Always create topics manually in production for better control over partitions and replication.
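For reference, here is one way to create the topic manually from the Docker setup shown later in this post; the container name matches that setup, but treat this as a sketch:

# Create demo-topic with 3 partitions from inside the kafka container
docker exec -it kafka \
  kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic demo-topic --partitions 3 --replication-factor 1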
Consumer (Consume with a Group)
// consumer.js
const { Kafka } = require('kafkajs')

async function runConsumer() {
  const kafka = new Kafka({
    clientId: 'my-consumer',
    brokers: ['localhost:9092']
  })

  const consumer = kafka.consumer({ groupId: 'demo-group' })
  await consumer.connect()
  await consumer.subscribe({ topic: 'demo-topic', fromBeginning: true })

  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log(
        `Received: topic=${topic} partition=${partition} key=${message.key?.toString()} value=${message.value.toString()}`
      )
    }
  })
}

runConsumer().catch(console.error)
Run the Demo
Step 1: Make sure your Kafka is running (see the Docker setup section below)
Step 2: Start consumer first:
node consumer.js
Step 3: In another terminal, start producer:
node producer.js
Expected Output:
Producer Terminal:
Sent message 0
Sent message 1
Sent message 2
Sent message 3
Sent message 4
Consumer Terminal:
Received: topic=demo-topic partition=0 key=key-0 value=Hello KafkaJS 0
Received: topic=demo-topic partition=0 key=key-1 value=Hello KafkaJS 1
Received: topic=demo-topic partition=0 key=key-2 value=Hello KafkaJS 2
Received: topic=demo-topic partition=0 key=key-3 value=Hello KafkaJS 3
Received: topic=demo-topic partition=0 key=key-4 value=Hello KafkaJS 4
Key Points to Notice
- Client ID: Identifies your application to Kafka
- Brokers: List of Kafka server addresses
- Group ID: Consumer group for load balancing
- Topic: Where messages are stored
- fromBeginning: true: Read all messages from the start of the topic
- Key-Value: Messages can have both key and value
- Partition Info: Consumer shows which partition message came from
Testing Different Consumer Groups
Create another consumer with different group ID:
// consumer2.js
const consumer = kafka.consumer({ groupId: 'demo-group-2' }) // Different group!
Run both consumers and one producer: each consumer group receives ALL the messages (pub/sub behavior)!
Producer -> Topic -+-> Consumer Group 1 (gets all messages)
                   +-> Consumer Group 2 (gets all messages)
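For completeness, a full consumer2.js is just the consumer above with a different clientId/groupId; a minimal sketch:

// consumer2.js - identical to consumer.js except for the group ID
const { Kafka } = require('kafkajs')

async function runConsumer() {
  const kafka = new Kafka({ clientId: 'my-consumer-2', brokers: ['localhost:9092'] })
  const consumer = kafka.consumer({ groupId: 'demo-group-2' }) // different group!

  await consumer.connect()
  await consumer.subscribe({ topic: 'demo-topic', fromBeginning: true })
  await consumer.run({
    eachMessage: async ({ partition, message }) => {
      console.log(`[group-2] Received: partition=${partition} value=${message.value.toString()}`)
    }
  })
}

runConsumer().catch(console.error)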
ZooKeeper vs KRaft
Old Architecture (With ZooKeeper) - Before 2021
+-----------------+
|    ZooKeeper    |  <- external dependency
|     Cluster     |     (extra complexity)
+--------+--------+
         |
    +----+-----------+-----------+
    |                |           |
+----------+  +----------+  +----------+
|  Kafka   |  |  Kafka   |  |  Kafka   |
| Broker 1 |  | Broker 2 |  | Broker 3 |
+----------+  +----------+  +----------+
ZooKeeper was needed for:
- Managing Kafka brokers
- Storing configuration data
- Leader election for partitions
New Architecture (KRaft Mode) - After 2021
+---------------------------------------------+
|         Kafka Cluster (KRaft Mode)          |
|                                             |
|  +----------+  +----------+  +----------+   |
|  |  Kafka   |  |  Kafka   |  |  Kafka   |   |
|  | Broker 1 |  | Broker 2 |  | Broker 3 |   |
|  |Controller|  |          |  |          |   |
|  +----------+  +----------+  +----------+   |
+---------------------------------------------+
KRaft Benefits:
- No external ZooKeeper needed
- Simpler deployment
- Better performance
- Self-managing
Docker Setup for Kafka
Complete Docker Compose Configuration
# docker-compose.yml
version: '3.8'

services:
  kafka:
    image: bitnami/kafka:3.6.1
    container_name: kafka
    ports:
      - '9092:29092' # external port for your applications
    environment:
      # KRaft mode (no ZooKeeper needed!)
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      # Network configuration
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:29092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT
      # Controller settings
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka:9093
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT
      # Development settings (not for production!)
      - ALLOW_PLAINTEXT_LISTENER=yes
    networks:
      - kafka-network

  # Web UI to manage Kafka easily
  kafka-ui:
    image: provectuslabs/kafka-ui
    container_name: kafka-ui
    ports:
      - '8080:8080' # access at http://localhost:8080
    environment:
      - KAFKA_CLUSTERS_0_NAME=my-kafka-cluster
      - KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092
    depends_on:
      - kafka
    networks:
      - kafka-network

networks:
  kafka-network:
    driver: bridge
How to Run
# 1. Save above content in docker-compose.yml file
# 2. Start Kafka cluster
docker-compose up -d
# 3. Check if containers are running
docker ps
# You should see:
# - kafka container running on port 9092
# - kafka-ui container running on port 8080
# 4. Access Kafka UI in browser
# Open: http://localhost:8080
# 5. Stop the cluster when done
docker-compose down
Real-World Example: Uber Location System
Let's see how a system like Uber could use Kafka:
Step 1: Driver sends location
+-------------+
| Driver App  | ---- GPS location ----+
+-------------+                       |
                                      v
Step 2: Data goes to Kafka       +---------+
+--------------+                 |  Kafka  |
|   Producer   | -- location --> |  Topic  |
| (Driver API) |    message      +----+----+
+--------------+                      |
Step 3: Multiple services consume     |
+--------------+                      |
|  Customer    | <--------------------+
|  App API     |   live location      |
+--------------+   updates            |
+--------------+                      |
|  Analytics   | <--------------------+
|  Service     |   store location     |
+--------------+   for reporting      |
+--------------+                      |
|  Billing     | <--------------------+
|  Service     |   calculate fare
+--------------+   based on route
Benefits:
- High Speed: Can handle millions of location updates per second
- No Data Loss: Even if customer app is down, location data is saved
- Scalable: Can add more consumers as business grows
- Multiple Uses: Same location data used for tracking, analytics, billing
Key Concepts Summary
Throughput Comparison
Traditional Database: Slow
+---------------------------------+
| ###  (1,000 operations/sec)     |
+---------------------------------+
Apache Kafka: Super Fast
+---------------------------------+
| ############################### |
+---------------------------------+
  (millions of operations/sec)
Partitions Enable Parallel Processing
- More partitions = more parallelism (better performance, up to a point)
- One partition = One consumer per group
- Perfect balance when partitions = consumers
Consumer Groups Enable Flexibility
- Same Group: Queue behavior (work sharing)
- Different Groups: Pub/Sub behavior (data copying)
- Automatic load balancing
- Fault tolerance
KRaft vs ZooKeeper
- Old Way: Kafka + ZooKeeper (complex)
- New Way: Kafka only (simple)
- Better performance and easier management
Common Use Cases
1. Real-time Location Tracking
Driver/Delivery Apps -> Kafka -> Customer Apps + Analytics
2. E-commerce Order Processing
Order -> Kafka -> Inventory + Payment + Shipping + Email
3. Social Media Feed
User Posts -> Kafka -> Timeline + Recommendations + Analytics
4. Banking Transactions
ATM/App -> Kafka -> Account Update + Fraud Check + SMS + Email
5. Log Management
App Logs -> Kafka -> Monitoring + Storage + Alerts
Best Practices (Important for Interviews!)
1. Topic Design
- Use clear topic names: user-events, order-created
- Plan partitions based on expected message volume
- More partitions = better parallelism (but not too many!)
2. Consumer Groups
- Use meaningful group IDs: payment-service, notification-service
- Monitor consumer lag (how far behind consumers are); see the sketch after this list
- Handle rebalancing properly
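Consumer lag can be checked programmatically with the KafkaJS admin client by comparing the group's committed offsets against the latest topic offsets. A rough sketch; note that fetchOffsets' signature has changed across KafkaJS versions (the shape below assumes a recent 2.x release), so verify against the version you use:

// lag-check.js - sketch: per-partition lag for demo-group on demo-topic
const { Kafka } = require('kafkajs')

async function checkLag() {
  const kafka = new Kafka({ clientId: 'lag-checker', brokers: ['localhost:9092'] })
  const admin = kafka.admin()
  await admin.connect()

  // Latest (high-water) offset for every partition of the topic
  const topicOffsets = await admin.fetchTopicOffsets('demo-topic')
  // Offsets the consumer group has committed so far
  const [groupOffsets] = await admin.fetchOffsets({ groupId: 'demo-group', topics: ['demo-topic'] })

  for (const { partition, offset } of topicOffsets) {
    const committed = groupOffsets.partitions.find((p) => p.partition === partition)
    // A committed offset of '-1' means the group has not committed anything yet
    const committedOffset = committed && committed.offset !== '-1' ? Number(committed.offset) : 0
    console.log(`partition=${partition} lag=${Number(offset) - committedOffset}`)
  }

  await admin.disconnect()
}

checkLag().catch(console.error)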
3. Performance Tips
- Right number of partitions (usually 2-3 times number of consumers)
- Monitor disk space regularly
- Use appropriate batch sizes for better throughput
4. Reliability
- Set the replication factor to at least 3 for production
- Use proper acknowledgment settings for critical data (see the sketch below)
- Implement retry logic for failed messages
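In KafkaJS, the acknowledgment level is passed per send call and the retry policy is configured on the client; a minimal sketch of both knobs (values are illustrative):

// reliable-producer.js - sketch of acks and retry settings in KafkaJS
const { Kafka } = require('kafkajs')

const kafka = new Kafka({
  clientId: 'reliable-producer',
  brokers: ['localhost:9092'],
  retry: { retries: 8 } // client-level retries for transient failures
})

const producer = kafka.producer()

async function sendCritical() {
  await producer.connect()
  await producer.send({
    topic: 'payments',
    acks: -1, // wait for all in-sync replicas to acknowledge (strongest guarantee)
    messages: [{ key: 'order-123', value: 'payment-captured' }]
  })
  await producer.disconnect()
}

sendCritical().catch(console.error)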
Quick Interview Questions & Answers
Q1: What is Kafka and why use it?
Answer: Kafka is a high-throughput distributed event streaming platform that handles millions of events per second. We use it when databases can't handle real-time data load, like in Uber's live tracking system.
Q2: Can one partition be consumed by multiple consumers?
Answer: No! Within the same consumer group, one partition can be consumed by only ONE consumer. But different consumer groups can read from the same partition.
Q3: What happens if we have more consumers than partitions?
Answer: Extra consumers become IDLE. If we have 4 partitions and 6 consumers, 2 consumers will have no work.
Q4: Queue vs Pub/Sub in Kafka?
Answer:
- Queue: Same consumer group, messages divided among consumers
- Pub/Sub: Different consumer groups, all groups get all messages
Q5: ZooKeeper vs KRaft?
Answer: Old Kafka needed external ZooKeeper for coordination. New Kafka (KRaft mode) manages everything internally, making it simpler and faster.
Q6: Does Kafka act as a message broker or distributed event streaming platform?
Answer: Kafka is primarily a distributed event streaming platform, not just a message broker. Unlike traditional message brokers that delete messages after consumption, Kafka stores events persistently and allows multiple consumers to replay the same events. This enables real-time analytics, stream processing, and event-driven architectures. While it can work as a message broker, that's just one of its capabilities.