Fading Coder

Setting Up Standalone Kafka with Docker Compose: Including UI, JMX Exporter, and Monitoring

Tech May 18 2

Kafka Terminology Overview

Before diving into deployment, it's essential to understand core Kafka concepts such as brokers, topics, partitions, producers, consumers, and replication. These are well-documented in the official Apache Kafka documentation and Confluent’s developer resources.

Selecting a Kafka Docker Image

Official Kafka Images (Post-3.7.0)

As of Kafka 3.7.0, the Apache project has introduced official Docker images. For new deployments, these are recommended due to better alignment with upstream releases.

Using Bitnami's Community Image

This guide uses bitnami/kafka, one of the most widely adopted community-maintained images on Docker Hub. A key feature is its environment variable mapping:

  • Any variable prefixed with KAFKA_CFG_ maps directly to Kafka server properties. For example, KAFKA_CFG_LOG_DIRS sets log.dirs.
  • To enable debugging output, set BITNAMI_DEBUG=true.
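The naming convention above can be sketched in a few lines of shell. This is only an illustration of the rule (strip the prefix, lowercase, replace underscores with dots); cfg_to_property is a hypothetical helper, not a script shipped with the image.

```shell
# Illustrative only: mimics the KAFKA_CFG_ naming convention described above.
# cfg_to_property is a hypothetical helper, not part of the Bitnami image.
cfg_to_property() {
  # Strip the KAFKA_CFG_ prefix, lowercase the rest, turn underscores into dots.
  printf '%s\n' "${1#KAFKA_CFG_}" | tr '[:upper:]' '[:lower:]' | tr '_' '.'
}

cfg_to_property KAFKA_CFG_LOG_DIRS              # prints log.dirs
cfg_to_property KAFKA_CFG_ADVERTISED_LISTENERS  # prints advertised.listeners
```

Reading the variables in the Compose file this way makes it easier to look up each setting under its property name in the Kafka broker configuration reference.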

Detailed configuration options can be found in Bitnami’s GitHub repository, which provides more comprehensive documentation than the Docker Hub page allows.

Choosing a Kafka Web Interface

Kafka UI offers a lightweight, open-source dashboard for managing and monitoring Kafka clusters. It supports multiple cluster connections, topic browsing, message inspection, and basic metrics visualization.

The project’s GitHub includes several docker-compose examples covering various setups including KRaft-based clusters—ideal for learning and development environments.

Docker Compose Configuration

version: '3'
services:
  kafka:
    image: bitnami/kafka:latest
    container_name: kafka
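    ports:
      - "9092:9092"
      - "9093:9093"
      - "9998:9998"
      # The JMX exporter listens on 9095 inside the container. It is not published
      # to the host because host port 9095 is used by kafka-ui below; scrape it at
      # kafka:9095 from the Compose network.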
    volumes:
      - kafka_data:/bitnami/kafka
      - ./jmx_prometheus_javaagent-1.0.1.jar:/opt/bitnami/kafka/config/jmx_prometheus_javaagent-1.0.1.jar:ro
      - ./kafka-kraft-3_0_0.yml:/opt/bitnami/kafka/config/kafka-kraft-3_0_0.yml:ro
    environment:
      BITNAMI_DEBUG: "true"
      KAFKA_HEAP_OPTS: "-Xmx2048m -Xms2048m"
      # KRaft Mode Configuration
      KAFKA_CFG_NODE_ID: "1"
      KAFKA_CFG_PROCESS_ROLES: "broker,controller"
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_BROKER_ID: "1"
      # Listener Setup
      KAFKA_CFG_LISTENERS: "CONTROLLER://:9094,BROKER://:9092,EXTERNAL://:9093"
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: "CONTROLLER:PLAINTEXT,BROKER:PLAINTEXT,EXTERNAL:PLAINTEXT"
      KAFKA_CFG_ADVERTISED_LISTENERS: "BROKER://kafka:9092,EXTERNAL://192.168.0.101:9093"
      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: "BROKER"
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: "1@kafka:9094"
      ALLOW_PLAINTEXT_LISTENER: "yes"
      # JMX & Prometheus Integration
      JMX_PORT: "9998"
      KAFKA_JMX_OPTS: >
        -Dcom.sun.management.jmxremote
        -Dcom.sun.management.jmxremote.authenticate=false
        -Dcom.sun.management.jmxremote.ssl=false
        -Djava.rmi.server.hostname=kafka
        -Dcom.sun.management.jmxremote.rmi.port=9998
      KAFKA_OPTS: >-
        -javaagent:/opt/bitnami/kafka/config/jmx_prometheus_javaagent-1.0.1.jar=9095:/opt/bitnami/kafka/config/kafka-kraft-3_0_0.yml
    deploy:
      resources:
        limits:
          memory: 4G
    memswap_limit: -1

  kafka-ui:
    container_name: kafka-ui
    image: provectuslabs/kafka-ui:latest
    ports:
      - "9095:8080"
    depends_on:
      - kafka
    environment:
      KAFKA_CLUSTERS_0_NAME: local-kraft
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
      KAFKA_CLUSTERS_0_METRICS_PORT: 9998
      SERVER_SERVLET_CONTEXT_PATH: /kafkaui
      AUTH_TYPE: LOGIN_FORM
      SPRING_SECURITY_USER_NAME: admin
      SPRING_SECURITY_USER_PASSWORD: securepass123
      DYNAMIC_CONFIG_ENABLED: 'true'

volumes:
  kafka_data:
    driver: local

Configuration Highlights

KRaft vs ZooKeeper

Kafka has transitioned from relying on ZooKeeper for metadata management to using its internal consensus protocol called KRaft (Kafka Raft Metadata mode). This change simplifies deployment by removing an external dependency and improves scalability and recovery times.

Key configurations enabling KRaft:

  • node.id: Unique identifier for this instance.
  • process.roles: Defines whether the node acts as a broker, controller, or both.
  • controller.quorum.voters: Specifies the list of controllers participating in voting.
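Each controller.quorum.voters entry has the form id@host:port. The sketch below splits a value into its parts so they can be checked against node.id and the CONTROLLER listener port. parse_voters is a hypothetical helper written for this illustration.

```shell
# parse_voters is a hypothetical helper for illustration.
# Input: comma-separated id@host:port entries, as in KAFKA_CFG_CONTROLLER_QUORUM_VOTERS.
parse_voters() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
    id=${entry%%@*}        # text before '@'
    hostport=${entry#*@}   # text after '@'
    host=${hostport%:*}    # text before the last ':'
    port=${hostport##*:}   # text after the last ':'
    printf 'node.id=%s host=%s port=%s\n' "$id" "$host" "$port"
  done
}

parse_voters "1@kafka:9094"   # prints node.id=1 host=kafka port=9094
```

For the single-node setup in this guide, the one entry should match KAFKA_CFG_NODE_ID (1) and the port of the CONTROLLER listener (9094).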

Understanding Listeners

Kafka uses different listeners for distinct communication paths:

  1. Controller Listener: Used internally among controller nodes for metadata coordination.
  2. Internal Broker Listener: For inter-broker data replication and communication.
  3. External Listener: Exposes the broker to client applications outside Docker.

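advertised.listeners holds the addresses the broker hands back to clients after their initial connection, so each entry must be reachable from the clients that use that listener: the container hostname (kafka) for clients on the Docker network, and the host's IP address (192.168.0.101 in this example) for clients outside it. This matters most in bridged networking, where the two differ.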
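To make the listener-to-address relationship concrete, here is a small sketch that looks up the address advertised for a given listener name. advertised_address is a hypothetical helper; the value passed to it is the one from the Compose file in this guide.

```shell
# advertised_address is a hypothetical helper: it looks up the address
# advertised for one listener name in an advertised.listeners value.
# $1 = listener name, $2 = advertised.listeners value
advertised_address() {
  printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r entry; do
    if [ "${entry%%://*}" = "$1" ]; then
      printf '%s\n' "${entry#*://}"
    fi
  done
}

ADVERTISED="BROKER://kafka:9092,EXTERNAL://192.168.0.101:9093"
advertised_address BROKER "$ADVERTISED"     # kafka:9092, for clients on the Compose network
advertised_address EXTERNAL "$ADVERTISED"   # 192.168.0.101:9093, for clients on the host or LAN
```

A client that bootstraps through the EXTERNAL listener is told to reconnect to 192.168.0.101:9093, so that IP has to be replaced with an address that is valid in your own environment.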

Memory and JVM Settings

The service allocates 4GB of total memory via Docker, while limiting the JVM heap to 2GB using KAFKA_HEAP_OPTS. The remaining memory is available for OS-level page caching, improving disk I/O performance.

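A mismatch here, such as setting the JVM heap too close to the container limit, leaves little room for non-heap JVM memory and the page cache, and can lead to excessive swapping and degraded performance. The container limits (not the JVM heap, which is fixed at startup) can be adjusted without a restart: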

docker update --memory 4g --memory-swap -1 kafka
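The arithmetic behind this sizing is simple enough to script as a quick sanity check. page_cache_headroom_mb is a hypothetical helper, and the result is an upper bound, since the JVM also uses non-heap memory (metaspace, thread stacks, direct buffers) out of the same remainder.

```shell
# page_cache_headroom_mb is a hypothetical helper for illustration.
# $1 = container memory limit in MiB, $2 = JVM max heap (-Xmx) in MiB
page_cache_headroom_mb() {
  echo $(( $1 - $2 ))
}

page_cache_headroom_mb 4096 2048   # prints 2048
```

With the values used in this guide, roughly half of the container's memory stays outside the heap.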

Monitoring with Prometheus JMX Exporter

Kafka exposes JMX metrics that can be scraped by Prometheus using the JMX Exporter. The configuration file kafka-kraft-3_0_0.yml comes from the exporter’s example configs and captures key broker-level metrics.

To verify:

  • Start the stack and navigate to http://localhost:9095/kafkaui to access Kafka UI.
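  • Check raw metrics at the JMX exporter endpoint, http://kafka:9095/metrics, from a container on the same Docker network. Host port 9095 serves Kafka UI, so the exporter is reachable from the host only if it is published on a different host port.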
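One way to confirm that the exporter is producing Kafka metrics is to count sample lines in its text output. The sketch below reads that output on stdin. count_samples is a hypothetical helper, and the sample input is made up for illustration; real metric names depend on the rules in the exporter config file.

```shell
# count_samples is a hypothetical helper: it reads Prometheus text-format
# output on stdin and counts sample lines whose name starts with a prefix.
# $1 = metric name prefix; lines starting with '#' are HELP/TYPE comments.
count_samples() {
  grep -v '^#' | grep -c "^$1"
}

# Made-up sample input, for illustration only.
printf '%s\n' \
  '# HELP kafka_example_total Example counter' \
  '# TYPE kafka_example_total counter' \
  'kafka_example_total 3.0' \
  'jvm_threads_current 42.0' | count_samples kafka_

# Against a running stack, pipe the real endpoint in instead, for example
# from a container on the same Compose network that has curl installed:
#   curl -s http://kafka:9095/metrics | count_samples kafka_
```

A count of zero from the real endpoint would suggest that the rules file did not match any MBeans, or that the agent was not loaded.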

Kafka UI Environment Notes

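Note that KAFKA_CLUSTERS_0_METRICS_PORT takes only a port: Kafka UI combines it with the broker hostnames it learns from the cluster, so no separate metrics host is configured. This simplifies configuration, but it relies on those hostnames (here, kafka) being resolvable from the Kafka UI container, which requires both services to share a Docker network.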

Testing the Deployment

Launch the services:

docker-compose up -d
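The broker may need a few seconds before it accepts connections, so scripted setups often poll until it responds. Below is a minimal retry sketch; retry is a hypothetical helper, and the attempt count and delay in the commented example are arbitrary choices.

```shell
# retry is a hypothetical helper: run a command until it succeeds or attempts run out.
# $1 = max attempts, $2 = delay in seconds between attempts, rest = command to run
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (requires the running stack). JMX_PORT and KAFKA_OPTS are cleared so the
# CLI's JVM does not try to bind ports already held by the broker:
#   retry 30 2 docker exec -e JMX_PORT= -e KAFKA_OPTS= kafka \
#     /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
```

Once the list command succeeds, the broker is ready for clients and for Kafka UI.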

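Access the web interface at http://localhost:9095/kafkaui and log in with the credentials set in SPRING_SECURITY_USER_NAME and SPRING_SECURITY_USER_PASSWORD.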

Resolving JMX Port Conflicts When Running CLI Tools

When JMX is enabled, running Kafka CLI tools inside the container may fail due to port conflicts:

Address already in use

Solutions:

  1. Unset the environment variables before running commands:

```
docker exec -it kafka bash
unset JMX_PORT KAFKA_OPTS
/opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic test-topic --partitions 1 --replication-factor 1
```

  2. Run commands with a clean environment:

```
docker exec -e JMX_PORT= -e KAFKA_OPTS= kafka \
  /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe
```
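When working in a shell inside the container, the same effect can be scoped to a single command rather than the whole session. The sketch below assumes an env that supports the -u option (GNU coreutils and BSD do); without_jmx is a hypothetical wrapper.

```shell
# without_jmx is a hypothetical wrapper: it runs a command with JMX_PORT and
# KAFKA_OPTS removed from that command's environment only.
without_jmx() {
  env -u JMX_PORT -u KAFKA_OPTS "$@"
}

# Demonstration without Kafka: the child process no longer sees JMX_PORT.
export JMX_PORT=9998
without_jmx sh -c 'echo "JMX_PORT=${JMX_PORT:-<unset>}"'   # prints JMX_PORT=<unset>

# Inside the container it would wrap a CLI call, for example:
#   without_jmx /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
```

The variables stay set in the surrounding shell, so nothing needs to be restored afterwards.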
    
    

Scaling to Multi-Node Clusters

While this setup focuses on a single-node standalone environment, production systems require redundancy. A minimal high-availability KRaft cluster should include:

  • Three controller nodes for fault tolerance (supporting up to one failure).
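  • At least three brokers with replication factor 3 and min.insync.replicas=2, so that producers using acks=all can keep writing while one broker is down. With only two brokers and min.insync.replicas=2, data stays durable but losing either broker blocks those writes.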

With three combined controller/broker nodes, you achieve resilience against single-node outages while maintaining quorum.
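The fault-tolerance numbers follow from two small formulas, sketched here with hypothetical helper names. A Raft quorum needs a majority of voters, and acks=all writes need at least min.insync.replicas replicas in sync.

```shell
# Hypothetical helpers for illustration.
# A quorum of N voters needs a majority, so it tolerates (N - 1) / 2 failures.
controller_failures_tolerated() {
  echo $(( ($1 - 1) / 2 ))
}

# With acks=all, writes continue while at least min.insync.replicas replicas are in sync.
# $1 = replication factor, $2 = min.insync.replicas
broker_failures_tolerated_for_writes() {
  echo $(( $1 - $2 ))
}

controller_failures_tolerated 3            # prints 1
controller_failures_tolerated 1            # prints 0 (the single-node setup in this guide)
broker_failures_tolerated_for_writes 3 2   # prints 1
broker_failures_tolerated_for_writes 2 2   # prints 0
```

Integer division is what makes an even voter count unattractive: four controllers tolerate no more failures than three.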

Example Architectures

  • Confluent Official Examples: Use dedicated controllers and separate brokers for large-scale deployments.
  • Bitnami Cluster Example: All-in-one nodes acting as both controller and broker.
  • Kafka-In-A-Box: Development-focused multi-container setup with full observability.

Performance Comparison: Kafka vs RabbitMQ vs Pulsar

For throughput, latency, and scalability benchmarks, refer to performance studies published by Confluent and independent testing groups. Generally:

  • Kafka: Optimized for high-throughput, durable logging and event streaming.
  • RabbitMQ: Better suited for complex routing, low-latency messaging, and traditional queuing patterns.
  • Pulsar: Offers geo-replication, tiered storage, and unified queues/streaming, at increased operational complexity.
