by Sharad Regoti & Zbynek Roubalik, Founder & CTO, Kedify
December 18, 2023
Apache Kafka has become the de facto standard for handling large data streams, known for its high throughput and scalability. It now serves as the backbone for a range of applications, particularly in microservice architectures, where it is used to decouple communication between services.
However, operating Kafka is not without its complexities. The challenges span Kafka brokers, producers, and, most notably, consumers.
Consumer challenges range from handling large volumes of incoming data during peak traffic periods to ensuring consistent throughput during periods of low activity. This dynamic nature of incoming data influences consumer lag (the delay between message production and consumption), which in turn affects resource utilization and operational costs.
In modern cloud-native environments, Kafka consumers are increasingly deployed within Kubernetes. This setup offers benefits in scalability and deployment ease but also introduces the need for sophisticated scaling strategies that can adapt to the volatile nature of Kafka’s data streams.
This is where Kubernetes Event-Driven Autoscaling (KEDA) comes into play. KEDA extends Kubernetes capabilities by enabling event-driven autoscaling. When integrated with Kafka, KEDA enables consumer applications to scale out or in automatically based on the actual workload.
This blog post demonstrates how KEDA enhances Kafka consumer performance & overall resource utilization.
Consumer lag in Kafka measures the delay between the offset of the most recently produced messages and the offset your consumers have processed. When the lag is growing, it indicates that the consumers are slower than the producers. If your applications are supposed to work in near real time, consumer lag should stay small.
NOTE: There is no ideal value for consumer lag; it depends on your use case and what delay is permissible, but regardless it should remain small.
Consumer lag is a direct indication of how effectively your consumers process incoming data. In our experiment, we’ll examine two distinct scenarios:
Consumer Lag Without KEDA (Constant Replicas): Here, the Kafka consumer application operates with a fixed number of replicas. We’ll observe the consumer lag when the Kafka producer application generates messages at a constant rate (10 messages per second). This setup will provide us with a baseline for consumer lag in a static environment.
Consumer Lag With KEDA (Dynamic Scaling): In this scenario, we’ll employ KEDA to dynamically scale the consumer replicas based on real-time consumer lag. By setting a threshold for consumer lag, KEDA will adjust the number of consumer replicas to efficiently process the incoming data. We’ll again have the Kafka producer generate messages at the same rate as in the first scenario.
We will measure the average consumer lag in both scenarios. This comparison will illustrate the impact of using KEDA for dynamic scaling versus a static number of consumer replicas.
To begin with, let’s proceed to set up our Kafka cluster and consumer application.
Let’s start with an initial configuration that sets up the Kafka cluster, producer & consumer application as shown in the below image.
The below command creates a Kafka namespace & installs Strimzi Kafka operator in it.
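A minimal sketch of this step, using the official Strimzi quickstart manifests (adapt the install method and version to your environment):

```bash
# Create the namespace and install the Strimzi Cluster Operator into it
kubectl create namespace kafka
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka

# Wait until the operator deployment is available
kubectl wait deployment/strimzi-cluster-operator --for=condition=Available --timeout=300s -n kafka
```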
Once the operator is running, execute the below command to create a Kafka cluster. The configuration utilizes the Kafka CRD of the Strimzi operator and creates a cluster called my-cluster.
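A sketch of such a Kafka resource; the listener, replica, and storage settings are illustrative and should be adapted to your environment:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
  # Kafka Exporter exposes consumer group lag metrics in Prometheus format
  # (used later in the monitoring section)
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
```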
The cluster takes 2-3 minutes to reach a running state; you can check its status using the below command. Wait until the READY column in the output is set to true and all pods are up and running.
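For example:

```bash
# Watch the Kafka custom resource until the READY column shows True
kubectl get kafka my-cluster -n kafka -w

# Or block until the Ready condition is met
kubectl wait kafka/my-cluster --for=condition=Ready --timeout=300s -n kafka
```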
The below command creates a Kafka topic called my-topic by utilizing the KafkaTopic CRD provided by the Strimzi operator.
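A sketch of the topic definition; 5 partitions is assumed here to match the maximum of 5 consumer replicas used later:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  namespace: kafka
  labels:
    # Tells the Strimzi Topic Operator which cluster this topic belongs to
    strimzi.io/cluster: my-cluster
spec:
  partitions: 5
  replicas: 1
```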
With Kafka up and running, let's deploy the Kafka consumer and producer applications. Use the below command to deploy the consumer application.
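A sketch of such a deployment; the container image is a placeholder and the environment variables are assumptions, but the deployment name matches the kafka-amqstreams-consumer target referenced by the KEDA ScaledObject later:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-amqstreams-consumer
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-amqstreams-consumer
  template:
    metadata:
      labels:
        app: kafka-amqstreams-consumer
    spec:
      containers:
        - name: consumer
          # Placeholder image; replace with your own consumer application
          image: ghcr.io/example/kafka-consumer:latest
          env:
            - name: BOOTSTRAP_SERVERS
              value: my-cluster-kafka-bootstrap.kafka.svc:9092
            - name: TOPIC
              value: my-topic
            - name: GROUP_ID
              value: my-consumer-group   # assumed consumer group name
```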
Once the consumer is up, send messages to the Kafka server using the producer application to test the consumer.
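For a quick smoke test, you can also send a few messages with the console producer shipped in the Strimzi Kafka image (pick an image tag that matches your operator version):

```bash
kubectl run kafka-test-producer -ti --rm \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --restart=Never -n kafka -- \
  bin/kafka-console-producer.sh \
    --bootstrap-server my-cluster-kafka-bootstrap:9092 \
    --topic my-topic
```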
Check the logs of the consumer using the below command to test if the setup is working fine.
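Assuming the consumer deployment sketched earlier:

```bash
kubectl logs -f deployment/kafka-amqstreams-consumer -n kafka
```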
With the consumer and producer applications up and running, let's set up monitoring to observe the impact on consumer lag.
Once the Kafka cluster is up & running, install Prometheus & Grafana in the cluster using the below helm chart.
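A sketch using the community kube-prometheus-stack chart (the release name and namespace are assumptions):

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Installs the Prometheus Operator, Prometheus, and Grafana
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```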
Wait for all the pods to be in the Running state.
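Assuming the monitoring namespace used above:

```bash
kubectl get pods -n monitoring -w
```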
To get additional metrics apart from the standard JVM metrics of Kafka, we configured the Strimzi Kafka resource to also deploy Kafka Exporter (the kafkaExporter stanza shown earlier), which exposes additional Kafka metrics such as consumer group lag in Prometheus format.
The below command creates a PodMonitor resource (a CRD of the Prometheus operator), which adds Kafka Exporter as a Prometheus target.
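A PodMonitor along these lines scrapes the metrics ports on the Strimzi pods; the release label and namespace are assumptions that depend on how your Prometheus installation selects PodMonitors:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics
  namespace: monitoring
  labels:
    # kube-prometheus-stack only picks up monitors matching its release
    # label by default; adjust to your installation
    release: prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: Kafka
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
    - path: /metrics
      port: tcp-prometheus
```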
Once the target is registered in Prometheus, access the Grafana UI in your browser at http://localhost:3000 using the below command. (The default username and password are admin / prom-operator.)
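The service name below follows the default naming of the kube-prometheus-stack release installed above:

```bash
kubectl port-forward svc/prometheus-grafana 3000:80 -n monitoring
```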
Import the Kafka Exporter dashboard to view a graphical visualization of the additional metrics we enabled while setting up Kafka.
With monitoring set up, let's compare the effect on consumer lag.
In this scenario we will keep the consumer application at one replica, while the Kafka producer application produces 10 msg/sec (5000 messages total, with a 100 ms delay between messages).
With everything already set up, execute the below command to start producing messages and observe the output.
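If you don't have a dedicated producer application handy, an equivalent load can be generated with the perf-test tool shipped in the Kafka image (5000 records at roughly 10 messages per second):

```bash
kubectl run kafka-load-producer -ti --rm \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --restart=Never -n kafka -- \
  bin/kafka-producer-perf-test.sh \
    --topic my-topic \
    --num-records 5000 \
    --throughput 10 \
    --record-size 512 \
    --producer-props bootstrap.servers=my-cluster-kafka-bootstrap:9092
```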
The below image shows the graphical representation of consumer lag over a 10-minute duration. The average consumer lag comes out to be 5.27.
In this scenario we will use KEDA to dynamically scale the consumer application based on Kafka consumer lag.
1. Install KEDA
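A typical Helm-based installation (the release name and namespace follow common convention):

```bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```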
The below ScaledObject (a KEDA CRD) tells KEDA to monitor a Kafka topic (my-topic) and scale the deployment (kafka-amqstreams-consumer) based on the consumer lag. If the lag goes above one, KEDA will start to scale up the deployment, until it reaches a maximum of 5 replicas. It scales down when the lag is less than one, with a cooldown period of 5 seconds to avoid rapid scaling down.
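A sketch of that ScaledObject; the bootstrap server follows Strimzi's default service naming, and the consumer group name is an assumption that must match your consumer application:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaledobject
  namespace: kafka
spec:
  scaleTargetRef:
    name: kafka-amqstreams-consumer        # deployment to scale
  pollingInterval: 5                       # how often KEDA checks the lag (seconds)
  cooldownPeriod: 5                        # wait 5s before scaling back to zero
  minReplicaCount: 0
  maxReplicaCount: 5                       # never exceed the topic's 5 partitions
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: my-cluster-kafka-bootstrap.kafka.svc:9092
        consumerGroup: my-consumer-group   # assumed; must match the consumer app
        topic: my-topic
        lagThreshold: "1"                  # target lag per replica
```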
Explanation of selected fields in the configuration:

- lagThreshold: the average consumer lag each replica should handle; KEDA scales out once the lag per replica exceeds this value.
- minReplicaCount / maxReplicaCount: the scaling bounds; the consumer can scale down to zero and up to five replicas.
- cooldownPeriod: how long KEDA waits after the last active trigger before scaling the deployment back to zero.
- bootstrapServers, consumerGroup, topic: identify the Kafka cluster, consumer group, and topic whose lag is monitored.
Now apply the above scaled object using the below command.
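Assuming the manifest above is saved as kafka-scaledobject.yaml:

```bash
kubectl apply -f kafka-scaledobject.yaml -n kafka

# KEDA creates and manages an HPA for the target deployment behind the scenes
kubectl get scaledobject,hpa -n kafka
```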
After applying the above config, if there are no pending messages in Kafka (that is, consumer lag is below 1), KEDA will scale the consumer application down to zero.
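You can verify this with the deployment sketched earlier:

```bash
# With no lag, the consumer deployment reports 0/0 replicas
kubectl get deployment kafka-amqstreams-consumer -n kafka
```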
Initially we have zero consumer replicas (because of KEDA's scale-to-zero action). As the producer writes messages to Kafka, the consumer lag starts to grow, and KEDA triggers a scale-out once the lag crosses the threshold, up to a maximum of five replicas.
In Kafka, the number of consumers in a consumer group should not be greater than the number of partitions. That is why we have set a maximum of 5 replicas: our topic has only 5 partitions. Refer to the Kafka documentation to learn more.
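To observe this, re-run the producer load from the first scenario and watch the consumer pods (the label selector assumes the deployment sketched earlier):

```bash
# Watch KEDA scale the consumer between 0 and 5 replicas as lag builds up
kubectl get pods -n kafka -l app=kafka-amqstreams-consumer -w
```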
The below image shows the graphical representation of consumer lag over a 10-minute duration. The average consumer lag comes out to be 1.99.
To calculate the percentage change between these two averages, we use the formula below, which comes out to be 62.15%:

Percentage change = ((Num2 − Num1) / Num2) × 100

where Num1 is the average consumer lag with KEDA and Num2 is the average consumer lag without KEDA.
From this, we conclude that there is a 62.15% decrease in consumer lag when messages are consumed using KEDA.
In summary, integrating KEDA with Kafka in Kubernetes significantly improves consumer performance and resource efficiency, as highlighted by the 62.15% reduction in consumer lag. We encourage you to tailor these methods to your setup and share your findings. Your feedback and experiences are invaluable to our community's learning.
For more insights into autoscaling and KEDA, get started with Kedify.