In this tutorial, I will guide you on how to install Apache Kafka on Ubuntu 14.04/16.04/18.04 and Debian. The steps below are common to Ubuntu 14.04/16.04/18.04 and Debian. So let us get started.
Apache Kafka is a distributed messaging system based on the publish/subscribe model, similar to a message queue such as ActiveMQ or other enterprise messaging systems. Apache Kafka is mainly used in real-time data processing and analytics. It handles huge volumes of data in real time in a fault-tolerant way with high throughput, and it is considered reliable and efficient compared to other messaging systems. Apache Kafka is more than a message broker for the pub/sub model: it is a full distributed streaming platform. With the Kafka Streams API, it is now possible to react to data by applying transformations to incoming messages as they arrive. In a nutshell, Apache Kafka is a very fast, reliable, highly scalable, high-throughput, fault-tolerant messaging system and stream processor. Let us now see how to install Apache Kafka on Ubuntu 14.04/16.04/18.04 and Debian.
Apache Kafka is developed in Java and runs on the JVM, so you need to have Java installed on the machine. If you already have Java, you can skip this step. If not, install it by executing the commands below. It is always advisable to install Java 8, as everyone right now is moving towards Java 8 🙂
1. Check whether Java is already installed by running "java -version".
2. If the command is not found, Java is not available. Run "sudo apt install openjdk-8-jdk" and confirm with "y".
3. Java 8 should now be installed successfully. To verify, run "java -version" again.
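The check-then-install steps above can be sketched as a single snippet. This is a minimal sketch assuming an apt-based system (Ubuntu/Debian); it only reports what to do rather than installing anything itself:

```shell
# Report whether java is already on the PATH; install it only if missing
if command -v java >/dev/null 2>&1; then
  echo "java found"
else
  echo "java missing - run: sudo apt install openjdk-8-jdk"
fi
```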
Download the Apache Kafka package from the official site.
wget https://www-us.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz -O kafka.tgz
Here we use the wget command to download the package from Apache Kafka's official site and save it as kafka.tgz.
Run the below command to extract the tar file:
tar xzf kafka.tgz
Now move the extracted folder (kafka_2.12-2.2.0) to the /opt/ directory and rename it to kafka. (You can keep the extracted Apache Kafka folder at any convenient location; just remember the path.)
mv kafka_2.12-2.2.0/ /opt/kafka
Apache Kafka depends on Apache Zookeeper. The main role of Apache Zookeeper in a Kafka cluster is to coordinate between the Apache Kafka brokers and store metadata about Apache Kafka topics. Without Zookeeper, we cannot start the Apache Kafka server.
Apache Zookeeper is mandatory.
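Before starting it, you can peek at the bundled Zookeeper configuration in config/zookeeper.properties. The values below are roughly the stock defaults shipped with the Kafka package (for anything beyond a quick test, move dataDir away from /tmp):

```
# config/zookeeper.properties (stock defaults)
dataDir=/tmp/zookeeper   # where Zookeeper stores its snapshot data
clientPort=2181          # the port Kafka brokers connect to
maxClientCnxns=0         # 0 = no per-IP connection limit
```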
Start Zookeeper by executing the below command.
bin/zookeeper-server-start.sh config/zookeeper.properties
[2019-10-24 17:48:19,087] INFO Server environment:user.dir=/opt/kafka (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-24 17:48:19,107] INFO tickTime set to 3000 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-24 17:48:19,107] INFO minSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-24 17:48:19,107] INFO maxSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-24 17:48:19,119] INFO Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory (org.apache.zookeeper.server.ServerCnxnFactory)
[2019-10-24 17:48:19,123] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
Now it is time to start our Apache Kafka server, also called the Apache Kafka broker. Start Kafka by executing the below command.
bin/kafka-server-start.sh config/server.properties
[2019-10-24 17:51:46,001] INFO [ThrottledChannelReaper-Fetch]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
[2019-10-24 17:51:46,002] INFO [ThrottledChannelReaper-Produce]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
[2019-10-24 17:51:46,003] INFO [ThrottledChannelReaper-Request]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
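For reference, the broker reads its settings from config/server.properties. The entries most worth knowing are sketched below with roughly the stock defaults shipped with Kafka 2.2 (the listeners line is commented out by default, in which case the broker listens on port 9092):

```
broker.id=0                        # unique id of this broker in the cluster
#listeners=PLAINTEXT://:9092       # host:port the broker listens on
log.dirs=/tmp/kafka-logs           # where topic data is stored on disk
num.partitions=1                   # default partition count for new topics
zookeeper.connect=localhost:2181   # Zookeeper the broker registers with
```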
A Kafka topic is where incoming records are stored. Each topic is partitioned across the Kafka cluster and replicated to achieve fault tolerance. A topic can have 0 to N producers publishing records to it, and 0 to N consumers consuming the messages. Execute the below command to create a topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sampletopic
Here we have to give the Zookeeper host and port details. The replication factor is 1 because we have started only one Kafka broker, so a replication factor of 1 is the most we can have. We have created a topic with a single partition, named "sampletopic".
Now that you have created the topic, it is time to check whether it was created successfully in your Kafka cluster. Run the below command to list the available topics.
bin/kafka-topics.sh --list --zookeeper localhost:2181
sampletopic
testTopic
We are all set with the Kafka setup, and we are at the final stage: checking and verifying that messages are being published and consumed successfully. A Kafka producer publishes messages into the Kafka cluster, and Kafka provides a shell script to produce messages. In real applications, though, we produce messages from our code; Kafka provides producer and consumer APIs for posting and consuming messages programmatically. In this example, we will use the Kafka producer script bundled with the Kafka package. Let us now run the producer and send a few messages.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic sampletopic
>Hello
>I am producing message
>Hope you received
>Thank You
If you notice, I have produced four messages to the topic "sampletopic", separated by line breaks. Each line is treated as a separate message. Let us now start our consumer.
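Because each line of input becomes one message, you can also feed the console producer non-interactively by piping lines into it. A minimal sketch: the printf below emits the same four lines, one per message; the actual producer invocation is shown as a comment since it needs the broker from the earlier step running.

```shell
# Each line of stdin becomes one Kafka message for the console producer
printf '%s\n' "Hello" "I am producing message" "Hope you received" "Thank You"
# Once the broker is up, pipe the same lines into the producer:
#   printf '%s\n' "Hello" "Thank You" | bin/kafka-console-producer.sh \
#     --broker-list localhost:9092 --topic sampletopic
```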
Kafka provides both a script and a consumer API, so we can start consumers from the command line or from our code. For this example, I am going to use the Kafka consumer script provided in the bundled package. Let us now start our consumer and see if the messages are consumed.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic sampletopic --from-beginning
Hello
I am producing message
Hope you received
Thank You
If you notice, I have mentioned the bootstrap-server details, which is nothing but your Kafka server host and port. I have also mentioned the topic from which to consume the messages. --from-beginning reads all messages in the topic from the beginning.
In order to stop the Kafka setup, shut down the components in this order: first the consumer, then the producer, then the Kafka server, and finally Zookeeper. If not, you will face errors.
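The shutdown sequence can be sketched as below. kafka-server-stop.sh and zookeeper-server-stop.sh are the stop scripts bundled in Kafka's bin/ directory, and /opt/kafka is the install path used earlier in this tutorial:

```shell
KAFKA_HOME=/opt/kafka   # install path from the earlier step
# 1. Stop any console consumers and producers first (Ctrl+C in their terminals).
# 2. Then stop the Kafka broker, and Zookeeper last:
if [ -d "$KAFKA_HOME" ]; then
  "$KAFKA_HOME/bin/kafka-server-stop.sh"
  "$KAFKA_HOME/bin/zookeeper-server-stop.sh"
else
  echo "Kafka not found at $KAFKA_HOME"
fi
```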
That’s it. Apache Kafka has been successfully installed on your machine.
I hope this tutorial was useful and that you were able to understand and learn how to install Apache Kafka on Ubuntu 14.04/16.04/18.04 and Debian. Please let us know your feedback and leave us a comment 🙂