Friday, May 6, 2022

Kafka - Core Concepts

 Let's discuss the common terms used in Kafka and their roles in its distributed architecture.

Producer

  • An application that sends messages to Kafka
Message
  • A small to medium-sized piece of data
Consumer
  • An application that reads data from Kafka

Cluster
  • A group of computers sharing workload for a common purpose
Topic
  • A topic is a unique name for a Kafka stream

Partition
  • Kafka topics are divided into several partitions. While the topic is a logical concept in Kafka, a partition is the smallest storage unit that holds a subset of the records owned by a topic. Each partition is a single log file to which records are written in an append-only fashion.



Offset
  • A sequence id given to messages as they arrive in a partition

Globally unique identifier of a message?
  • Topic Name -> Partition Number -> Offset
Consumer Group
  • A group of consumers acting as a single logical unit
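
As a quick illustration of these terms, here is a minimal Java producer sketch: it sends a single message and prints the topic name, partition number and offset reported back by the broker, which together identify the message. The broker address localhost:9092 and the topic name "orders" are assumptions made for this example only.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class ProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker address; replace with your cluster's bootstrap servers.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // "orders" is a hypothetical topic name used only for illustration.
            ProducerRecord<String, String> record =
                new ProducerRecord<>("orders", "order-42", "created");
            RecordMetadata meta = producer.send(record).get();
            // Topic name + partition number + offset uniquely identify this message.
            System.out.printf("topic=%s partition=%d offset=%d%n",
                meta.topic(), meta.partition(), meta.offset());
        }
    }
}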



Can multiple Kafka consumers read the same message from a partition?

  • It depends on the group ID. Suppose you have a topic with 12 partitions. If you have 2 Kafka consumers with the same group ID, they will each read 6 partitions, meaning they will read different sets of partitions and therefore different sets of messages. If you have 4 Kafka consumers with the same group ID, each of them will read 3 different partitions, and so on.
  • But when you set different group IDs, the situation changes. If you have two Kafka consumers with different group IDs, they will each read all 12 partitions without any interference with each other, meaning both consumers will read the exact same set of messages independently. If you have four Kafka consumers with different group IDs, they will all read all partitions, and so on.

Within same group: NO

  • Two consumers (Consumer 1 and Consumer 2) within the same group (Group 1) CANNOT consume the same message from a partition (Partition 0).

Across different groups: YES

  • Two consumers in two different groups (Consumer 1 from Group 1 and Consumer 1 from Group 2) CAN consume the same message from a partition (Partition 0).
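
Here is a minimal Java consumer sketch showing where the group ID fits in. The group.id value "group-1", the broker address and the topic name "orders" are all placeholders: run two copies with the same group.id and they will split the partitions between them, while copies with different group IDs will each read every partition independently.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        // Consumers sharing this group.id split the topic's partitions among
        // themselves; consumers with a different group.id each read all partitions.
        props.put("group.id", "group-1");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));  // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> rec : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                        rec.partition(), rec.offset(), rec.value());
                }
            }
        }
    }
}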

Wednesday, May 4, 2022

Load Balancing with Consistent Hashing Approach

Load balancing is a key concept in system design. One simple way is to hash each request and then send it to the server it maps to.

The standard way to hash objects is to map them onto a key space, for example hash(request) mod N for N servers, and send the load to the mapped machine. A system using this policy suffers whenever nodes are added or removed, because changing N remaps most keys to different servers.
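
A small Java sketch of the problem, assuming a made-up set of keys and a simple hash(key) mod N placement: when the server count changes from 4 to 5, most keys end up on a different server.

import java.util.Objects;

public class ModuloHashingDemo {
    // With plain modulo hashing, a key's server is hash(key) mod N.
    static int serverFor(String key, int serverCount) {
        return Math.floorMod(Objects.hashCode(key), serverCount);
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-3", "user-4", "user-5", "user-6"};
        int moved = 0;
        for (String key : keys) {
            int before = serverFor(key, 4);   // 4 servers
            int after  = serverFor(key, 5);   // one server added
            if (before != after) moved++;
            System.out.printf("%s: server %d -> server %d%n", key, before, after);
        }
        // Most keys typically change servers after scaling, which is the weakness
        // of plain modulo hashing that consistent hashing addresses.
        System.out.println("keys remapped: " + moved + " of " + keys.length);
    }
}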


One of the popular ways to balance load in a system is to use the concept of consistent hashing. Consistent Hashing allows requests to be mapped into hash buckets while allowing the system to add and remove nodes flexibly so as to maintain a good load factor on each machine.





Consistent Hashing maps servers onto the same key space and assigns each request (mapped to the relevant bucket, called load) to the next server clockwise on the ring. Servers then store the relevant request data while the system retains flexibility and scalability.
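
Below is a minimal Java sketch of such a ring, using a sorted map as the key space and one point per server (real implementations usually add virtual nodes for a smoother spread). The server and request names are invented for the example.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashRing {
    // Ring positions (hash of server name) -> server name.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Hash a string onto the ring using the first 8 bytes of its MD5 digest.
    private static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public void addServer(String server)    { ring.put(hash(server), server); }
    public void removeServer(String server) { ring.remove(hash(server)); }

    // A request goes to the next server clockwise from its hash position.
    public String serverFor(String requestKey) {
        long h = hash(requestKey);
        SortedMap<Long, String> tail = ring.tailMap(h);
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addServer("server-A");
        ring.addServer("server-B");
        ring.addServer("server-C");
        System.out.println("req-1 -> " + ring.serverFor("req-1"));
        // Removing a server only reassigns the keys that were mapped to it.
        ring.removeServer("server-B");
        System.out.println("req-1 -> " + ring.serverFor("req-1"));
    }
}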


Some terms you will hear in system design discussions are fault tolerance, meaning the system keeps working when a machine crashes, and scalability, meaning machines can be added to process more requests. Consistent Hashing enables both, and hence it is an important building block in a system design architect's toolbox.


Another term used often is request allocation, which means assigning a request to a server. Consistent hashing assigns requests to servers in a way that the load stays balanced and remains close to equal.


Server architecture is a subjective topic, and there are outliers for many cases. Don't think of Consistent Hashing as a silver bullet for fault tolerance and scalability, but as a useful concept for request allocation.
