1. Introduction

This document shows some Kafka labs that I built in my own environment (using macOS or Linux in my home network) by following (and sometimes adapting) tutorials that I read on the Internet. I kept the references available inside the labs to give you a chance to explore further.

When I develop my labs, I usually add some functions written in Bash to make them easier to run from a command line (inside a tmux session) in my environment(s). This lets me repeat my steps in a more practical, quick, and effective way.

1.1. My environment

1.1.1. macOS

In these labs, I am using the macOS Operating System (OS) with the following software (and versions):

$ sw_vers
ProductName:	macOS
ProductVersion:	11.2.1
BuildVersion:	20D74

Bash installed:

$ echo $BASH_VERSION
5.1.4(1)-release

Git installed:

$ git --version
git version 2.30.1

GNU versions of sed and grep (installed using Homebrew):

$ gsed --version | head -1
gsed (GNU sed) 4.8
In my environment, sed is just a Bash function that calls gsed.
$ ggrep --version | head -1
ggrep (GNU grep) 3.6
Likewise, grep in my environment is just a Bash function that calls ggrep.
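
The two wrapper functions mentioned above can be sketched like this (a minimal sketch, assuming gsed and ggrep were installed via Homebrew; the actual definitions in my environment may differ):

```shell
# Sketch: make `sed` and `grep` resolve to the GNU implementations when the
# Homebrew-installed g-prefixed binaries are available (as on macOS).
if command -v gsed >/dev/null 2>&1; then
  sed() { gsed "$@"; }
fi
if command -v ggrep >/dev/null 2>&1; then
  grep() { ggrep "$@"; }
fi
```

On Linux this is a no-op, since sed and grep there are already the GNU versions.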

1.1.2. Linux

TODO

1.1.3. Notes

  1. Read this if you want to know more details about my home network.

  2. I don’t use Microsoft Windows operating systems, so I can’t help you if you are using them to reproduce my labs.

  3. Some of these labs have specific prerequisites.

2. Labs download and installation

$ git clone https://github.com/paulojeronimo/kafka-labs
$ cd `basename $_` && source functions.sh

You may have to install some specific software when running the last command (source functions.sh), especially if you are running it on a macOS environment, so please pay attention to its output.
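
To give an idea of what that output might involve, here is a hypothetical sketch (not the repository's actual functions.sh) of the kind of dependency check such a script could perform on macOS:

```shell
# Hypothetical sketch: report missing GNU tools that the labs rely on.
# (The real functions.sh in the repository may do something different.)
check_gnu_tools() {
  local missing=0 tool
  for tool in gsed ggrep; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Please install $tool (e.g. via Homebrew)" >&2
      missing=1
    fi
  done
  return "$missing"
}
```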

3. Lab1: Quickstart (using a JVM)

3.1. Main goals

Understand in practice …​

  1. The main components of a Kafka infrastructure.

  2. How to start/stop a simple Kafka infrastructure.

  3. How to create a topic.

  4. How to do a cleanup (useful, e.g., in development environments).

  5. How to quickly start an environment to test Kafka (using tmux).

3.2. Prerequisites

  1. A Java Virtual Machine (JVM) 1.8 installed:

    $ java -version
    openjdk version "1.8.0_282"
    OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_282-b08)
    OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.282-b08, mixed mode)

3.3. Downloading and extracting Kafka

Download, extract and change to the Kafka directory:

$ kafka-labs && kafka-download && kafka-extract && lab1-dir
These are functions that I created in functions.sh.
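
For illustration, kafka-download and kafka-extract might look roughly like this (a hypothetical sketch; the version, Scala build, and mirror URL are my assumptions, not the repository's actual values):

```shell
# Hypothetical sketch of download/extract helpers (assumed version and URL;
# the real implementations are in functions.sh).
KAFKA_VERSION=${KAFKA_VERSION:-2.7.0}
KAFKA_TGZ="kafka_2.13-${KAFKA_VERSION}.tgz"

kafka_download_sketch() {
  # Download the tarball once from the Apache archive.
  [ -f "$KAFKA_TGZ" ] ||
    curl -fsSLO "https://archive.apache.org/dist/kafka/${KAFKA_VERSION}/${KAFKA_TGZ}"
}

kafka_extract_sketch() {
  # Extract and change into the resulting directory.
  tar -xzf "$KAFKA_TGZ" && cd "${KAFKA_TGZ%.tgz}"
}
```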

3.4. Starting tmux

Start tmux:

$ tmux new -s lab1
Terminals subsequently created through tmux will have their current directory set to the directory where the session was started (in this case, the directory where the Kafka package was extracted).

3.5. Starting the ZooKeeper server

$ # terminal 1
$ bin/zookeeper-server-start.sh config/zookeeper.properties

3.6. Starting the Kafka server

Open another terminal (terminal 2 - Ctrl+b+" on tmux) and start a broker instance:

$ # terminal 2
$ bin/kafka-server-start.sh config/server.properties

3.7. Creating a topic

Open another terminal (terminal 3 - Ctrl+b+c on tmux) and create a topic:

$ # terminal 3
$ bin/kafka-topics.sh \
--create --topic quickstart-events --bootstrap-server localhost:9092

See the topic details:

$ # terminal 3
$ bin/kafka-topics.sh \
--describe --topic quickstart-events --bootstrap-server localhost:9092

3.8. Starting a producer

$ # terminal 3
$ bin/kafka-console-producer.sh \
--topic quickstart-events --bootstrap-server localhost:9092

Send some events:

>This is my first event
>This is my second event
>
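
As an aside, events don't have to be typed interactively; a small helper (hypothetical, not part of functions.sh) could pipe them into the console producer:

```shell
# Hypothetical helper: send each argument as one event to a topic by piping
# into the console producer. It assumes the Kafka directory layout and the
# running brokers from this lab, so it is defined here but not invoked.
produce_events() {
  local topic=$1; shift
  printf '%s\n' "$@" |
    bin/kafka-console-producer.sh \
      --topic "$topic" --bootstrap-server localhost:9092
}
# Usage (with the lab running):
#   produce_events quickstart-events 'This is my first event' 'This is my second event'
```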

3.9. Starting a consumer

Open another terminal (terminal 4 - Ctrl+b+" on tmux) and start a consumer:

$ # terminal 4
$ bin/kafka-console-consumer.sh \
--topic quickstart-events --bootstrap-server localhost:9092 \
--from-beginning

This is the expected output:

This is my first event
This is my second event

3.10. Doing the cleanup

Type Ctrl+C then Ctrl+D on terminals 4, 3, 2 (in this order). Type Ctrl+C on terminal 1 to terminate the ZooKeeper instance.

Delete all the data:

$ # terminal 1
$ rm -rf /tmp/kafka-logs /tmp/zookeeper

Finally, finish the tmux session (lab1) by typing Ctrl+D.

3.11. Reviewing

Start the lab using a single command:

$ lab1-start

Stop all the opened windows manually (using Ctrl+C and Ctrl+D).

Delete all the data:

$ lab1-cleanup
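
For reference, lab1-start and lab1-cleanup could be approximated like this (hypothetical sketches; the real implementations live in functions.sh):

```shell
# Hypothetical sketches of the one-line helpers: start ZooKeeper and one
# broker in a tmux session, and wipe the default data directories afterwards.
lab1_start_sketch() {
  tmux new-session -d -s lab1 \
    'bin/zookeeper-server-start.sh config/zookeeper.properties'
  tmux split-window -t lab1 \
    'sleep 2; bin/kafka-server-start.sh config/server.properties'
  tmux attach -t lab1
}

lab1_cleanup_sketch() {
  rm -rf /tmp/kafka-logs /tmp/zookeeper
}
```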

3.12. References

4. Lab2: Quickstart (using Docker)

4.1. Main goals

Understand in practice …​

  1. How to start/stop Kafka using Docker Compose.

  2. How to see its components through a browser.

  3. How to produce and consume records with key-value pairs.

4.2. Prerequisites

  1. Docker installed:

    $ docker -v
    Docker version 20.10.2, build 2291f61
  2. Docker Compose installed:

    $ docker-compose -v
    docker-compose version 1.27.4, build 40524192

4.3. Launching Confluent Platform

$ lab2-dir && tmux new -s lab2
$ cat docker-compose.yml
$ docker-compose up -d
$ docker-compose ps
$ docker-compose logs -f
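
Before creating topics it can help to wait until the broker is actually ready; here is a hypothetical helper (the service name kafka is taken from the exec commands used in this lab):

```shell
# Hypothetical helper: poll the broker inside the "kafka" Compose service
# until it answers metadata requests, or give up after ~60 seconds.
# Defined only; it needs the Compose stack from this lab to be running.
wait_for_kafka() {
  local tries=0
  until docker-compose exec -T kafka kafka-topics \
        --list --bootstrap-server kafka:9092 >/dev/null 2>&1; do
    tries=$((tries + 1))
    [ "$tries" -lt 30 ] || return 1
    sleep 2
  done
}
```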

4.4. Creating a topic

Start a new terminal (terminal 2 - Ctrl+b+c on tmux) and type:

$ # terminal 2
$ docker-compose exec kafka kafka-topics \
--create --topic example-topic --bootstrap-server kafka:9092 \
--replication-factor 1 --partitions 1

Describe the topic:

$ # terminal 2
$ docker-compose exec kafka kafka-topics \
--describe --topic example-topic --bootstrap-server kafka:9092

Expected output for the command above:

Topic: example-topic    PartitionCount: 1       ReplicationFactor: 1 Configs:
        Topic: example-topic    Partition: 0    Leader: 1 Replicas: 1     Isr: 1

4.5. Starting a console consumer

$ # terminal 2
$ docker-compose exec kafka bash
$ kafka-console-consumer \
--topic example-topic --bootstrap-server kafka:9092

4.6. Producing your first records

Start a new terminal (terminal 3 - Ctrl+b+" on tmux) and type:

$ # terminal 3
$ docker-compose exec kafka bash
$ kafka-console-producer \
--topic example-topic --broker-list kafka:9092

Send some events:

>a
>b
>c
>

4.7. Seeing the Kafka environment through your web browser

Open Confluent Control Center at http://localhost:9021.

Open the tab Messages on topic example-topic. Type some messages on terminal 3 and note that they will appear on the Control Center.

4.8. Starting a new consumer to read all records

Start a new terminal (terminal 4 - Ctrl+b+" on tmux) and type:

$ # terminal 4
$ docker-compose exec kafka bash
$ kafka-console-consumer \
--topic example-topic --bootstrap-server kafka:9092 --from-beginning

4.9. Producing and consuming records with full key-value pairs

On terminal 4, finish the current consumer:

$ # terminal 4
$ # Type Ctrl+C

Start a new producer:

$ # terminal 4
$ kafka-console-producer \
--topic example-topic --broker-list kafka:9092 \
--property parse.key=true --property key.separator=":"

On terminal 3, finish the current producer:

$ # terminal 3
$ # Type Ctrl+C

Start a new consumer:

$ # terminal 3
$ kafka-console-consumer \
--topic example-topic --bootstrap-server kafka:9092 \
--property print.key=true --property key.separator="-"

Enter these records either one at a time or copy and paste all of them into terminal 4 and hit enter:

key1:what a lovely
key1:bunch of coconuts
foo:bar
fun:not quarantine

Observe the output shown on terminal 2, terminal 3 and Control Center.
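
What the two separators do can be illustrated with plain shell (no Kafka needed): parse.key=true splits each producer line at the first ":", and print.key=true re-joins key and value with "-" on the consumer side:

```shell
# Illustration only: split each line at the first ":" (producer side) and
# re-join key and value with "-" (what the print.key consumer renders).
while IFS=':' read -r key value; do
  printf '%s-%s\n' "$key" "$value"
done <<'EOF'
key1:what a lovely
foo:bar
EOF
# Prints:
#   key1-what a lovely
#   foo-bar
```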

4.10. Doing the cleanup

$ # terminal 1
$ # Type Ctrl+C to stop "docker-compose logs -f"
$ docker-compose down -v

Also, terminate the tmux session (lab2) by typing Ctrl+D.

5. Lab3: Starting Kafka with 3 Brokers and 3 Partitions (using a JVM)

5.1. Main goals

Understand in practice …​

  1. How to configure, start and stop a Kafka cluster.

  2. How to see its components through a browser.

  3. What partitions are and how they relate to topics.

5.2. Doing a fresh Kafka installation

$ kafka-labs && kafka-download && kafka-extract && lab3-dir

5.3. Configuring the file server.properties for the Kafka servers

$ (cd config; for n in 2 3
do
	patch server.properties \
		../../patches/lab3/server.$n.properties.diff \
		-o server.$n.properties
	vim -d server.properties server.$n.properties
done)
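
The per-broker files produced by the patches typically differ from server.properties in just a few properties; a hedged sketch for one of them (values assumed here, the actual diffs are in the repo's patches/lab3 directory):

```properties
# Assumed per-broker overrides (the real values come from the patch files;
# broker ids and ports should line up with the describe output and the
# bootstrap.servers list used later in this lab):
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-2
```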

5.4. Starting the Kafka cluster (1 ZooKeeper + 3 Brokers)

$ tmux new-session -s lab3 \
	'bin/zookeeper-server-start.sh config/zookeeper.properties' \; \
	split-pane 'sleep 2; bin/kafka-server-start.sh config/server.properties' \; \
	split-pane 'sleep 2; bin/kafka-server-start.sh config/server.2.properties' \; \
	split-pane 'sleep 2; bin/kafka-server-start.sh config/server.3.properties' \; \
	select-layout even-vertical
The 2-second sleep for each Kafka server gives the ZooKeeper server time to start.

5.5. Creating a topic

Create a new tmux window (Ctrl+b+c on tmux) and create a topic:

$ bin/kafka-topics.sh --create --topic my-topic \
--zookeeper localhost:2181 \
--replication-factor 3 --partitions 3

See the topic details:

$ bin/kafka-topics.sh --describe --topic my-topic \
--zookeeper localhost:2181

Expected output for the command above:

Topic: my-topic PartitionCount: 3       ReplicationFactor: 3    Configs:
        Topic: my-topic Partition: 0    Leader: 1       Replicas: 1,2,0 Isr: 1,0,2
        Topic: my-topic Partition: 1    Leader: 0       Replicas: 2,0,1 Isr: 0,1,2
        Topic: my-topic Partition: 2    Leader: 0       Replicas: 0,1,2 Isr: 0,1,2

Kill the third Kafka broker.

To do this …​
  1. Type Ctrl+b+n and Ctrl+C.

  2. Type Ctrl+b+n to go back to the previous window.

Repeat the describe command (using the up arrow key or by typing !! at the command prompt). You should now see the following output:

Topic: my-topic PartitionCount: 3       ReplicationFactor: 3    Configs:
        Topic: my-topic Partition: 0    Leader: 1       Replicas: 1,2,0 Isr: 1,0
        Topic: my-topic Partition: 1    Leader: 0       Replicas: 2,0,1 Isr: 0,1
        Topic: my-topic Partition: 2    Leader: 0       Replicas: 0,1,2 Isr: 0,1

5.6. Running a built-in Kafka producer to generate random test messages

$ bin/kafka-producer-perf-test.sh --topic my-topic \
--num-records 50 --record-size 1 --throughput 10 \
--producer-props \
	bootstrap.servers=localhost:9092,localhost:9093,localhost:9094 \
	key.serializer=org.apache.kafka.common.serialization.StringSerializer \
	value.serializer=org.apache.kafka.common.serialization.StringSerializer

Sample output:

50 records sent, 10.040161 records/sec (0.00 MB/sec), 15.00 ms avg latency, 398.00 ms max latency, 3 ms 50th, 48 ms 95th, 398 ms 99th, 398 ms 99.9th.

5.7. Doing the cleanup

Kill the Kafka server brokers (by typing Ctrl+C on their panes) and lastly kill the ZooKeeper server.

Also, finish the tmux session (lab3) by typing Ctrl+D.

After that, type:

$ rm -rf /tmp/kafka-logs* /tmp/zookeeper

6. Lab4: Creating Kafka producer/consumer applications in Java

6.1. Main goals

Understand in practice …​

  1. How to develop Java applications that use the Kafka infrastructure.

6.2. Execution

6.2.1. Preparing the environment

Make sure you did the cleanup for lab3.

Start the ZooKeeper server and one Kafka broker with the following command:

$ start-servers lab4

This last command starts a new tmux session running each server in its own pane. Attach to this tmux session:

$ tmux attach

6.2.2. Creating a topic

Create a new tmux window (Ctrl+b+c on tmux) and create a topic:

$ bin/kafka-topics.sh --create --topic user-tracking \
--replication-factor 1 --partitions 1 \
--zookeeper localhost:2181
Expected output:
Created topic user-tracking.

6.2.3. Running the consumer application

$ lab4-dir && cd user-tracking-consumer
$ mvn compile exec:java -Dexec.mainClass="com.example.kafka.consumer.Main"

6.2.4. Running the producer application

Start a new terminal (with Ctrl+b+" on tmux) and type:

$ lab4-dir && cd user-tracking-producer
$ mvn compile exec:java -Dexec.mainClass="com.example.kafka.producer.Main"

Repeat the last command several times while following the output produced by the consumer application.

6.3. Tutorial

TODO

6.4. Doing the cleanup

TODO

7. Lab5: Kafka SSL Encryption (using LXD)

8. Lab6: Kafka SSL Authentication (using LXD)

9. Lab7: Kafka SASL Authentication (Kerberos GSSAPI) (using LXD)

10. Lab8: Running Kafka on Kubernetes