
Lab: Creating a Strimzi Cluster

Introduction

A basic Kafka cluster consists of two resources in Strimzi:

  • A single kind: Kafka, which is the definition and configuration of a Kafka cluster
  • A minimum of one kind: KafkaNodePool, each of which defines a (virtual) pool of pods used to execute Kafka workloads

Both resources are part of a named cluster and together cause the Strimzi Operator to create the actual workloads with the specified configuration.

Resource kind: KafkaNodePool

Because Kafka pods are not defined directly, a KafkaNodePool CR is the smallest unit of definition for Kafka workloads and represents a virtual pool of worker pods used to run Kafka controllers and/or brokers. Each pod started by the Strimzi Operator is part of a KafkaNodePool and is thus bound to its configuration and limits. On its own, a node pool does not cause any resource allocation; it serves as a template for the resources created for a specific cluster.

Specifically, a KafkaNodePool allows you to:

  • Define the number of replicas
  • Set roles (controller, broker or both) which are passed down to each pod
  • Define the storage setup used for each pod
  • Modify Kubernetes resource configurations via a podTemplate

As each Kafka cluster requires brokers as well as controllers, we must provide at least one Kafka node pool for each role. Since a KafkaNodePool can be assigned a single role or both (dual role), we can either create separate node pools for brokers and controllers or use the same pods for both roles, as sketched below.
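
A split-role setup could look like this (a minimal sketch; names and volume sizes are illustrative):

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controller
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi
        deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false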

Example: a basic Kafka node pool with dual-role configuration:

ref.: Strimzi documentation

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: kraft-dual-role
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
  resources:
    requests:
      memory: 2Gi
      cpu: "200m"
    limits:
      memory: 4Gi
      cpu: "500m"

In this example, a node pool is defined with:

  • three replicas, which will lead to three pods being scheduled
  • resource requests and limits (applied to each pod):
    • memory: 2Gi request / 4Gi limit
    • CPU: 200m request / 500m limit
  • storage of type JBOD with a single persistent volume of 100 GiB (see note below)

While other storage types are possible, Strimzi strongly recommends using a JBOD configuration. As in classic storage clusters, JBOD ("just a bunch of disks") here means an extensible array of Kubernetes volumes (not physical disks) which are mounted into each pod and used to store Kafka log data. As with the resource limits, the volumes defined within the JBOD configuration are applied to each pod.

For example, a configuration specifying two 100 GiB volumes within a Kafka node pool of three replicas will lead to three pods being scheduled, each with two unique 100 GiB volumes mounted at their respective data directories.
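
The storage section for this scenario would look like this (a sketch of just the storage block):

storage:
  type: jbod
  volumes:
    - id: 0
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
    - id: 1
      type: persistent-claim
      size: 100Gi
      deleteClaim: false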

Resource kind: Kafka

A resource of kind Kafka represents the definition of an abstract Kafka cluster. While Kafka node pools are the main entity for configuring a cluster's workloads in terms of Kubernetes resources (memory, storage, CPU), the cluster resource is the single definition of connectivity, authentication and authorization, logging, and other features which are part of an abstract Kafka cluster.

Common definitions include:

  • Listeners, defining endpoints to allow clients to connect with options for authentication and encryption
  • Authorization mechanisms to restrict permissions of connected clients
  • Logging
  • Metrics
  • Dedicated Strimzi Operators for a cluster
  • Additional features of a Strimzi cluster which are not part of a vanilla Kafka deployment (e.g. Cruise Control, jmxExporter)

Because this definition does not include any configuration of workloads, a cluster needs an appropriate number of KafkaNodePool resources to schedule workload execution.

Example: a basic Kafka Cluster definition

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: strimzi
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 4.0.0
    metadataVersion: 4.0-IV3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
  entityOperator:
    topicOperator: {}
    userOperator: {}

In the above example, we declare a simple Kafka cluster which requires dedicated Kafka node pools and runs in KRaft mode instead of using ZooKeeper. We also declare a single listener on TCP port 9092 of type internal, which will cause a Service of type ClusterIP to be created. This Service will expose the given TCP port and can be used by Kafka clients within the Kubernetes cluster as the Kafka bootstrap server.
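
For the cluster above, the bootstrap service will be named my-cluster-kafka-bootstrap (Strimzi derives the name as <cluster-name>-kafka-bootstrap). A client pod inside the Kubernetes cluster could then connect like this (a sketch, assuming the standard Kafka CLI tools are available in the client pod):

kafka-topics --bootstrap-server my-cluster-kafka-bootstrap.strimzi.svc:9092 --list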

More information regarding listener configurations can be found in the Strimzi documentation.

Lab Exercise: Creating a Basic Cluster

We would like to declare and deploy a basic Kafka cluster in a dedicated Kubernetes namespace.

Exercise 1: Define a KafkaNodePool

Declare a new Kafka node pool for a simple Kafka cluster with three replicas, using the same nodes as both brokers and controllers. The pods should have an initial resource request of 512 MiB memory and 100m CPU, limited to a maximum of 1 GiB memory and 500m CPU. Every node should receive a single persistent volume with a capacity of 20 GiB. The node pool should be part of a cluster named cluster-1 (which does not exist yet).

Start with this resource stub:

apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
  name: cluster-1-dual-role
spec:
Hint 1: Replicas and Node Roles
apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
  name: cluster-1-dual-role
  labels:
    strimzi.io/cluster: cluster-1
spec:
  replicas: 3
  roles:
    - controller
    - broker
Hint 2: Resource Limits
apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
  name: cluster-1-dual-role
  labels:
    strimzi.io/cluster: cluster-1
spec:
  replicas: 3
  roles:
    - controller
    - broker
  resources:
    requests:
      memory: 512Mi
      cpu: "100m"
    limits:
      memory: 1Gi
      cpu: "500m"
Final Solution

file ./cluster-1.nodepool.yaml

apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
  name: cluster-1-dual-role
  labels:
    strimzi.io/cluster: cluster-1
spec:
  replicas: 3
  roles:
    - controller
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi
        deleteClaim: false
  resources:
    requests:
      memory: 512Mi
      cpu: "100m"
    limits:
      memory: 1Gi
      cpu: "500m"

Now deploy the node pool in your personal namespace.

Solution
kubectl apply -n NAMESPACE -f ./cluster-1.nodepool.yaml
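
You can confirm that the resource was created (note that the node pool alone will not start any pods until the corresponding cluster exists):

kubectl get kafkanodepool -n NAMESPACE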

Exercise 2: Define a cluster (kind: Kafka)

Declare a new Kafka cluster named cluster-1 of version 4.0.0 within your personal namespace, with KRaft enabled. The cluster should use the Kafka node pool created in the previous exercise and utilize the Strimzi entity operators for managing topics and users. For connectivity, the cluster should have a single internal, non-encrypted listener accepting connections on TCP port 9093 without requiring authentication.

Start with this stub resource:

apiVersion: kafka.strimzi.io/v1
kind: Kafka
metadata:
spec:
Hint: Base Cluster Properties

file ./cluster-1.yaml

apiVersion: kafka.strimzi.io/v1
kind: Kafka
metadata:
  name: cluster-1
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 4.0.0
    metadataVersion: 4.0-IV3
  entityOperator:
    topicOperator: {}
    userOperator: {}
Final Solution

file ./cluster-1.yaml

apiVersion: kafka.strimzi.io/v1
kind: Kafka
metadata:
  name: cluster-1
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 4.0.0
    metadataVersion: 4.0-IV3
    listeners:
      - name: plain
        port: 9093
        type: internal
        tls: false
  entityOperator:
    topicOperator: {}
    userOperator: {}

Now deploy the cluster in your personal namespace.

Solution
kubectl apply -n NAMESPACE -f ./cluster-1.yaml
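
Strimzi will now create the controller/broker pods defined by the node pool. You can wait for the cluster to become ready and inspect the pods (this may take a few minutes):

kubectl wait kafka/cluster-1 --for=condition=Ready --timeout=300s -n NAMESPACE
kubectl get pods -n NAMESPACE -l strimzi.io/cluster=cluster-1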

Exercise 3: Test Access to Your Cluster

Test access to your new cluster using the Debug CLI by executing a simple Kafka admin command.

Please be patient

Please be aware that the official Kafka CLI tools are incredibly slow to start. Therefore, commands can take several seconds to complete.

Replace SERVICE_NAME with the name of your cluster’s bootstrap service and NAMESPACE_NAME with your namespace:

kafka-broker-api-versions --bootstrap-server "SERVICE_NAME.NAMESPACE_NAME.svc:9093" | grep '^cluster.*'

You should see three brokers listed in the output (the grep filters for broker addresses, which start with your cluster's name).

Lab Exercise: Creating Properties Files for Your Clients

With increasing complexity in the configuration of your cluster, supplying the correct parameters on the command line becomes cumbersome. To make (future) configuration easier, we will create a Kubernetes ConfigMap (kind: ConfigMap) for storing Java .properties files, which we can easily extend and mount into our debug-cli pod.

Exercise 1: Create a ConfigMap

Create a ConfigMap in your namespace using the name cluster-1-client-cfg and a single key client-plaintext-noauth.properties, which should have the following value (replacing MY_NAMESPACE with the name of your namespace):

bootstrap.servers=cluster-1-kafka-bootstrap.MY_NAMESPACE.svc:9093
sasl.mechanism=PLAIN
security.protocol=PLAINTEXT
parse.key=true
key.separator=:
Solution
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-1-client-cfg
  namespace: MY_NAMESPACE
data:
  client-plaintext-noauth.properties: |
    bootstrap.servers=cluster-1-kafka-bootstrap.MY_NAMESPACE.svc:9093
    sasl.mechanism=PLAIN
    security.protocol=PLAINTEXT
    parse.key=true
    key.separator=:
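
Apply the manifest to your namespace (the file name is illustrative):

kubectl apply -f ./cluster-1.client-cfg.yaml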

Exercise 2: Mounting the ConfigMap

Using your existing deployment of the debug-cli, mount the previously created ConfigMap into the pod at the path /config/cluster-1, reusing the previous deployment specification.

Hint 1: Referencing the ConfigMap
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-debug-cli
  #...
spec:
  #...
  template:
    #...
    spec:
      containers:
      #...
      volumes:
        - name: cluster-1-config
          configMap:
            name: cluster-1-client-cfg
Hint 2: Mounting the ConfigMap
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-debug-cli
  #...
spec:
  #...
  template:
    #...
    spec:
      containers:
        - name: debug-cli
          #...
          volumeMounts:
            - name: cluster-1-config
              mountPath: /config/cluster-1
      volumes:
        - name: cluster-1-config
          configMap:
            name: cluster-1-client-cfg
Final Solution
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-1-client-cfg
data:
  client-plaintext-noauth.properties: |
    bootstrap.servers=cluster-1-kafka-bootstrap.MY_NAMESPACE.svc:9093
    sasl.mechanism=PLAIN
    security.protocol=PLAINTEXT
    parse.key=true
    key.separator=:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: debug-cli-user-conf
spec:
  storageClassName: default
  resources:
    requests:
      storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-debug-cli
  labels:
    app: debug-cli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: debug-cli
  template:
    metadata:
      labels:
        app: debug-cli
    spec:
      initContainers:
        - name: debug-cli-chown
          image: krassestecontainerreistry.azurecr.io/kafka-oauth-client:latest
          securityContext:
            privileged: true
            runAsUser: 0
            runAsGroup: 0
          command: [ "chown", "-R", "user:user", "/opt/user_conf" ]
          volumeMounts:
            - name: user-conf
              mountPath: /opt/user_conf
              readOnly: false
      containers:
        - name: debug-cli
          image: krassestecontainerreistry.azurecr.io/kafka-oauth-client:latest #TODO: replace ACR name
          resources:
            limits:
              cpu: 200m
              memory: 200Mi
            requests:
              cpu: 200m
              memory: 200Mi
          volumeMounts:
            - name: user-conf
              mountPath: /opt/user_conf
              readOnly: false
            - name: cluster-1-config
              mountPath: /config/cluster-1
      volumes:
        - name: user-conf
          persistentVolumeClaim:
            claimName: debug-cli-user-conf
        - name: cluster-1-config
          configMap:
            name: cluster-1-client-cfg
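
After applying the updated deployment, you can verify the mount and point the Kafka CLI tools at the properties file (a sketch; deployment, service, and file names follow the examples above):

kubectl exec -n NAMESPACE deploy/kafka-debug-cli -- cat /config/cluster-1/client-plaintext-noauth.properties
kubectl exec -n NAMESPACE deploy/kafka-debug-cli -- kafka-broker-api-versions \
  --bootstrap-server cluster-1-kafka-bootstrap.NAMESPACE.svc:9093 \
  --command-config /config/cluster-1/client-plaintext-noauth.properties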