Kafka: Best Practices & Lessons Learned | By Inder

  1. Kafka hardware requirements
  2. Broker configurations
  3. Producer configurations
  4. Kafka security (TLS); encryption reduces performance by approximately 30%
  5. Capacity of cluster
  6. Monitoring and alerting (especially consumer lag)
  7. DR/HA Setup

Know Kafka hardware requirements


  • If SSL/TLS is enabled, CPU requirements can be significantly higher (the exact overhead depends on the CPU type and JVM implementation).
  • A higher replication factor consumes more disk and CPU to handle the additional requests.
  • If compression is used, producers and consumers must spend CPU cycles compressing and decompressing data. More cores also allow more parallelism.


Broker Configuration

Kafka topic configuration

Use parallel processing

  1. Let P be the throughput from a producer to a single partition.
  2. Let C be the throughput from a single partition to a consumer.
  3. Let T be the target throughput.
  4. Required partitions = max(T/P, T/C)

Partitions = Desired Throughput / Partition Speed
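
The sizing rule above can be sketched in a few lines of Python. The function name and the sample throughput figures are illustrative assumptions, not measurements; P and C should come from benchmarking your own cluster.

```python
import math


def required_partitions(target_mb_s: float, producer_mb_s: float, consumer_mb_s: float) -> int:
    """Return max(T/P, T/C), rounded up to a whole number of partitions."""
    if min(target_mb_s, producer_mb_s, consumer_mb_s) <= 0:
        raise ValueError("all throughput values must be positive")
    return math.ceil(max(target_mb_s / producer_mb_s, target_mb_s / consumer_mb_s))


# Hypothetical figures: T = 100 MB/s, P = 10 MB/s, C = 20 MB/s.
print(required_partitions(100, 10, 20))  # prints 10
```

Here the producer side is the bottleneck (T/P = 10 versus T/C = 5), so the producer figure drives the partition count.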

Number of replicas

Partitions density

Producer Configurations

Producer required acks: configure your producer to wait for acknowledgments
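
As a rough illustration of the acknowledgment setting, the sketch below builds two producer configurations as plain Python dictionaries. `acks` and `enable.idempotence` are standard Kafka producer settings; the broker address, the helper name, and the choice of a Python client are assumptions made for the example.

```python
def producer_config(durable: bool) -> dict:
    """Build a producer config dict; the values are illustrative only."""
    config = {"bootstrap.servers": "localhost:9092"}  # placeholder address
    if durable:
        # Wait for all in-sync replicas to acknowledge each write.
        config["acks"] = "all"
        config["enable.idempotence"] = True
    else:
        # Only the partition leader acknowledges: faster, but less safe.
        config["acks"] = "1"
    return config


print(producer_config(durable=True))
```

A dictionary like this could then be handed to a client library, for example confluent-kafka's `Producer(config)`.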

For high-throughput producers, tune batch size
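
To get a feel for what a batch size means in practice, here is a minimal sketch that estimates how many messages fit into one batch. `batch.size`, `linger.ms`, and `compression.type` are standard producer settings; the specific values and the helper name are assumptions, and the estimate ignores per-record overhead and compression.

```python
def messages_per_batch(batch_size_bytes: int, avg_message_bytes: int) -> int:
    """Estimate how many average-sized messages fit in one batch."""
    if batch_size_bytes <= 0 or avg_message_bytes <= 0:
        raise ValueError("sizes must be positive")
    # A message larger than the batch size still goes out, in a batch of its own.
    return max(1, batch_size_bytes // avg_message_bytes)


# Illustrative tuning values, not recommendations.
batching = {"batch.size": 65536, "linger.ms": 10, "compression.type": "lz4"}

print(messages_per_batch(batching["batch.size"], 512))  # prints 128
```

Larger batches and a small `linger.ms` generally trade a little latency for better throughput, so measure both before settling on values.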


Configure Kafka with security in mind

  • Encryption of data in flight using SSL/TLS: This encrypts your data between your producers and Kafka, and between Kafka and your consumers. It is the same pattern used all over the web: the “S” in HTTPS (that beautiful green lock you see everywhere).
  • Authentication using SSL or SASL: This lets your producers and consumers authenticate to your Kafka cluster, which verifies their identity. It is also a secure way for your clients to assert an identity. Why would you want that? Well, for authorization!
  • Authorization using ACLs: Once your clients are authenticated, your Kafka brokers can check them against access control lists (ACLs) to determine whether a particular client is authorized to read from or write to a given topic.
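
On the client side, encryption and authentication mostly come down to configuration. The sketch below assumes a librdkafka-based Python client (such as confluent-kafka) and SASL/SCRAM over TLS; the broker address, helper name, and credentials are hypothetical placeholders.

```python
def secure_client_config(username: str, password: str, ca_path: str) -> dict:
    """Build a client config that encrypts traffic and authenticates the client."""
    return {
        "bootstrap.servers": "broker1.example.com:9093",  # placeholder address
        "security.protocol": "SASL_SSL",       # TLS encryption plus SASL authentication
        "sasl.mechanisms": "SCRAM-SHA-512",    # assumed mechanism; use what your cluster offers
        "sasl.username": username,
        "sasl.password": password,
        "ssl.ca.location": ca_path,            # CA certificate used to verify the brokers
    }


config = secure_client_config("orders-service", "change-me", "/etc/kafka/ca.pem")
print(config["security.protocol"])
```

Authorization is not configured here: the brokers apply ACLs to whatever identity the client authenticated as.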

What should the capacity of my Kafka cluster be?

  • the topic’s retention period
  • the average size of your Kafka messages
  • the number of messages you expect to push through the system
  • replication factor
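
These four factors multiply together into a first-order disk estimate. The sketch below is a back-of-the-envelope calculation with made-up numbers; it ignores compression, index files, and free-space headroom, all of which you would want to account for in a real plan.

```python
def required_storage_gb(messages_per_day: int, avg_message_bytes: int,
                        retention_days: int, replication_factor: int) -> float:
    """Rough cluster-wide disk need in GB (1 GB = 10**9 bytes)."""
    total_bytes = messages_per_day * avg_message_bytes * retention_days * replication_factor
    return total_bytes / 1e9


# Hypothetical workload: 100M messages/day, 1 KB each, 7-day retention, RF = 3.
print(required_storage_gb(100_000_000, 1_000, 7, 3))  # prints 2100.0
```

That is roughly 2.1 TB across the cluster before any safety margin.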

Monitoring & Alerting

  • Retention: How much data can be stored on disk for each topic partition?
  • Replication: How many copies of the data are made?
  • Consumer lag: How far behind the producers are our consumer applications?
  • System metrics: Monitoring network throughput, open file handles, memory, load, and disk usage is essential, as is keeping an eye on JVM stats, including GC pauses and heap usage.
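
Consumer lag is simply the gap between the newest offset in a partition and the offset the consumer group has committed. A minimal sketch, assuming you have already fetched both sets of offsets from your monitoring tooling (the sample offsets and the alert threshold are invented, and treating a missing commit as offset 0 is a simplifying assumption):

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag = log end offset - committed offset (never negative)."""
    return {
        partition: max(0, end - committed_offsets.get(partition, 0))
        for partition, end in end_offsets.items()
    }


end_offsets = {0: 1500, 1: 900}      # latest offset per partition
committed = {0: 1450, 1: 900}        # consumer group's committed offsets

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
print(lag, total_lag)  # prints {0: 50, 1: 0} 50

LAG_ALERT_THRESHOLD = 10_000         # hypothetical threshold
if total_lag > LAG_ALERT_THRESHOLD:
    print("ALERT: consumer group is falling behind")
```

Alerting on lag that keeps growing over time is usually more useful than alerting on a single absolute value.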

Disaster Recovery Plan

Key Takeaways

  • Kafka’s low overhead and horizontal-scaling-friendly design make it possible to run it successfully on inexpensive commodity hardware.
  • For sustained, high-throughput brokers, provision sufficient memory to avoid reading from the disk subsystem.
  • More partitions mean greater parallelism and throughput, but they also mean more replication latency, more rebalances, and more open files on the server.
  • Raise the replication factor from Kafka’s default to three, which is appropriate in most production environments.
  • Monitor consumer lag.




Enterprise Modernization, Platforms & Cloud | CKA | CKS | 3*AWS | GCP | Vault | Istio | EFK | CICD | https://www.linkedin.com/in/inder-pal-singh-6a203b66/
