Kafka Terminology Overview Before diving into deployment, it's essential to understand core Kafka concepts such as brokers, topics, partitions, producers, consumers, and replication. These are well-documented in the official Apache Kafka documentation and Confluent’s developer resources. Selecting a...
Implementing a Kafka Producer Below is a Java-based Kafka producer implementation that sends data to a Kafka topic on a scheduled basis: @Configuration @Slf4j public class ScheduledDataProducer extends Thread { public static final String BROKER_URL = "your_broker_ip:9092"; public static fi...
Flume Remove conflicting JAR file: rm /opt/module/flume/lib/guava-11.0.2.jar Launch Flume monitoring: bin/flume-ng agent -n a1 -c conf/ -f job/flume-file-hdfs.conf Stop Flume monitoring: # Find the Flume process ID, then terminate it ps aux | grep flume kill <process_id> Hadoop (Cluster) Configurati...
Understanding Kafka's High Throughput Kafka achieves exceptional throughput through several architectural decisions: Append-only writes: Kafka messages are written sequentially to log files, eliminating the need for random disk I/O operations which are significantly slower. Zero-copy technology: Uti...
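The two ideas above, sequential appends and zero-copy transfer, can be illustrated with plain Java NIO. This is a minimal sketch, not Kafka's actual log code: the class name `AppendAndTransferSketch`, its methods, and the temp-file names are hypothetical, and the transfer target is a file rather than a network socket so the example stays self-contained. `FileChannel.transferTo` is the JDK call that lets the operating system move bytes between channels without copying them through application buffers where the platform supports it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Kafka.
class AppendAndTransferSketch {

    // Appends each record to the end of the log file; existing bytes are never rewritten.
    static void appendRecords(Path log, String... records) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (String record : records) {
                ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
        }
    }

    // Copies the whole log to the target with FileChannel.transferTo, which hands the
    // copy to the OS instead of reading the bytes into a user-space buffer first.
    static long transferAll(Path log, Path target) throws IOException {
        try (FileChannel src = FileChannel.open(log, StandardOpenOption.READ);
             FileChannel dst = FileChannel.open(target,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = src.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += src.transferTo(position, size - position, dst);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("segment", ".log");
        Path out = Files.createTempFile("consumer", ".out");
        appendRecords(log, "event-1", "event-2");
        appendRecords(log, "event-3");
        System.out.println("transferred " + transferAll(log, out) + " bytes");
        Files.delete(log);
        Files.delete(out);
    }
}
```

Whether the transfer is a true zero-copy operation depends on the operating system and on the channel types involved; the sketch only shows the API shape that makes it possible.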
Kafka and Storm Overview If you are familiar with Kafka and Storm, you can skip this section. If not, you can refer to my previous blog posts. Environment Setup for Kafka and Storm Link: http://www.panchengming.com/2018/01/26/pancm70/ Usage of Kafka Link: http://www.panchengming.com/2018/01/28/pancm...
Scenario Suppose a feature must keep working independently of the database: if the database goes down, no data is lost, and once the database is restored, the data is written correctly, all without any disruption to the client. With this clear objective, let's implement i...
Redis Performance and Persistence Redis achieves high read/write speeds through: In-memory storage: Data resides entirely in RAM, enabling faster access compared to disk-based systems. Single-threaded model: Eliminates context-switching and lock contention overhead, simplifying concurrency control....
Core Security Concepts SASL (Simple Authentication and Security Layer): Handles identity verification during client-to-server connections, ensuring credential data is handled securely. SSL/TLS: Encrypts the data transmitted over the network. Relying on SASL alone leaves the payload unencrypted after a...
Enabling SASL/PLAIN Authentication in Kafka 2.4.0 To protect Kafka clusters exposed to untrusted networks, SASL/PLAIN authentication, paired with TLS encryption, is implemented for secure client and inter-broker communication. Broker-Side JAAS Configuration Create a JAAS configuration file (e.g., kaf...
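On the client side, SASL/PLAIN over TLS usually comes down to a handful of connection properties. The sketch below builds them with `java.util.Properties` only, so it runs without the Kafka client on the classpath. The property keys (`security.protocol`, `sasl.mechanism`, `sasl.jaas.config`) and the `PlainLoginModule` class name are standard Kafka client settings. The class name `SaslPlainClientConfigSketch`, the broker address, and the credentials are placeholders assumed for illustration, and a real deployment would typically also need truststore settings that are not shown here.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys and login module name come from Kafka.
class SaslPlainClientConfigSketch {

    // Builds client properties for SASL/PLAIN authentication over a TLS-encrypted connection.
    // Note: the username and password are inserted as-is, without escaping quotes.
    static Properties saslPlainOverTls(String bootstrapServers, String username, String password) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // SASL_SSL = SASL authentication plus TLS encryption of the traffic.
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "PLAIN");
        // Inline JAAS entry, so the client does not need a separate JAAS file.
        props.setProperty("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"" + username + "\" "
                        + "password=\"" + password + "\";");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and credentials, assumed for this example only.
        Properties props = saslPlainOverTls("your_broker_ip:9093", "client-user", "client-secret");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting `Properties` object could then be passed to a producer or consumer constructor together with the usual serializer settings.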
Apache ZooKeeper serves as a centralized coordination service for distributed systems, enabling reliable configuration management, naming, synchronization, and group services. It operates as a hierarchical key-value store with strong consistency guarantees and event-driven notifications. ZooKeeper c...
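To make the data model concrete, here is a toy in-memory model of a hierarchical key-value store with one-shot change notifications. It is not the ZooKeeper client API: the class `ZnodeTreeSketch` and its methods are hypothetical names loosely mirroring ZooKeeper operations, and it ignores replication, sessions, versions, and ephemeral nodes. It only illustrates two behaviors: a node can be created only under an existing parent, and a watch fires once and must then be registered again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Toy model for illustration only; not the ZooKeeper client library.
class ZnodeTreeSketch {
    private final Map<String, String> nodes = new TreeMap<>();
    private final Map<String, List<Consumer<String>>> watches = new TreeMap<>();

    ZnodeTreeSketch() {
        nodes.put("/", ""); // the root node always exists
    }

    // Creates a node; the parent path must already exist.
    void create(String path, String data) {
        String parent = parentOf(path);
        if (!nodes.containsKey(parent)) {
            throw new IllegalStateException("missing parent: " + parent);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        nodes.put(path, data);
    }

    // Reads a node and optionally registers a one-shot watch on it.
    String getData(String path, Consumer<String> watcher) {
        requireExists(path);
        if (watcher != null) {
            watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
        }
        return nodes.get(path);
    }

    // Updates a node and fires its watches; each watch is removed after firing.
    void setData(String path, String data) {
        requireExists(path);
        nodes.put(path, data);
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(w -> w.accept(path));
        }
    }

    // Lists the names of the direct children of a node.
    List<String> getChildren(String path) {
        requireExists(path);
        String prefix = path.equals("/") ? "/" : path + "/";
        List<String> children = new ArrayList<>();
        for (String p : nodes.keySet()) {
            boolean direct = p.startsWith(prefix) && p.indexOf('/', prefix.length()) < 0;
            if (!p.equals(path) && direct) {
                children.add(p.substring(prefix.length()));
            }
        }
        return children;
    }

    private void requireExists(String path) {
        if (!nodes.containsKey(path)) {
            throw new IllegalStateException("no such node: " + path);
        }
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    public static void main(String[] args) {
        ZnodeTreeSketch tree = new ZnodeTreeSketch();
        tree.create("/config", "");
        tree.create("/config/db", "v1");
        tree.getData("/config/db", p -> System.out.println("changed: " + p));
        tree.setData("/config/db", "v2"); // watch fires
        tree.setData("/config/db", "v3"); // no watch registered any more
        System.out.println(tree.getChildren("/config"));
    }
}
```

A client that wants continuous notifications in this model has to call `getData` with a watcher again after each event, which mirrors how one-time watches are commonly used.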