xgerman's technology blog

Cassandra Kind Stateful Set

Introduction

I decided to play a bit with Cassandra and Kubernetes-in-Docker (kind). The tutorial (https://kubernetes.io/docs/tutorials/stateful-application/cassandra/ ) assumes you run either minikube or a real cluster, and kind has some limitations when it comes to storage classes.

Start your kind cluster

If you are new to kind, review the Kind Quickstart. The example asks for three Cassandra nodes to form the ring, so I set up three workers in my kind-config.yaml:

# four node (three workers) cluster config
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker

You can start the cluster with kind create cluster --config kind-config.yaml. Don't forget to review your Docker settings: both Cassandra and a four-node Kubernetes cluster need significant CPU and memory.
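If you prefer to script the setup, the same config can be written out with a heredoc and sanity-checked before creating the cluster (a small sketch; the file name matches the command above):

```shell
# Write the kind config from the section above and verify the worker count
# before spending time on cluster creation.
cat > kind-config.yaml <<'EOF'
# four node (three workers) cluster config
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
grep -c 'role: worker' kind-config.yaml   # should print 3
```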

Create the Cassandra service

This is taken 1:1 from the tutorial:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra

Run kubectl apply -f cassandra-service.yaml. Note that for simplicity I am running everything in the default namespace.
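Because of clusterIP: None this is a headless service, so each StatefulSet pod gets a stable DNS name of the form &lt;pod&gt;.&lt;service&gt;.&lt;namespace&gt;.svc.cluster.local; the CASSANDRA_SEEDS value later in this post relies on exactly that. A quick sketch of the names three replicas will get:

```shell
# Stable per-pod DNS names behind the headless "cassandra" service
# in the "default" namespace (pattern: <pod>.<service>.<ns>.svc.cluster.local).
for i in 0 1 2; do
  echo "cassandra-$i.cassandra.default.svc.cluster.local"
done
```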

Create the persistent volumes

I wanted a simple volume strategy and didn't want to dabble with NFS or similar, so I just create some persistent volumes by hand. Note that the names match the claims the StatefulSet will generate later (&lt;volumeClaimTemplate name&gt;-&lt;pod name&gt;):

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cassandra-data-cassandra-0
  labels:
    type: local
    app: cassandra
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/data/cassandra-data-1
  persistentVolumeReclaimPolicy: Recycle
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cassandra-data-cassandra-1
  labels:
    type: local
    app: cassandra
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/data/cassandra-data-2
  persistentVolumeReclaimPolicy: Recycle
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cassandra-data-cassandra-2
  labels:
    type: local
    app: cassandra
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/data/cassandra-data-3
  persistentVolumeReclaimPolicy: Recycle

Again, apply with kubectl apply -f local-volumes.yaml
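Since the three manifests differ only in their index, you can also generate local-volumes.yaml with a small loop instead of copying the block three times (a sketch producing the same content as above):

```shell
# Generate the three PersistentVolume manifests. The PV name uses the pod
# index (0..2) while the hostPath directory is numbered 1..3, matching the
# hand-written manifests above.
for i in 0 1 2; do
cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cassandra-data-cassandra-$i
  labels:
    type: local
    app: cassandra
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/data/cassandra-data-$((i + 1))
  persistentVolumeReclaimPolicy: Recycle
EOF
done > local-volumes.yaml
```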

Create the stateful set

Here is the modified cassandra-statefulset.yaml. In addition to the local volumes, I also made the timeouts on the readiness probe friendlier to my MacBook Air (aka slower). If you have better hardware, you can be more aggressive there:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 60
          timeoutSeconds: 30
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
        annotations:
          # Set volume.beta.kubernetes.io/storage-class here if you want to
          # use a StorageClass instead of the pre-created local volumes.
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ""
        resources:
          requests:
            storage: 1Gi

Apply with kubectl apply -f cassandra-statefulset.yaml and we are in business.
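One thing worth double-checking when shrinking resources for a laptop: MAX_HEAP_SIZE (plus the HEAP_NEWSIZE carved out of it) must stay well below the container memory limit, or the pod gets OOM-killed. A quick arithmetic check with the values from the StatefulSet above:

```shell
# Values from the StatefulSet above, converted to MiB.
limit_mib=1024     # resources.limits.memory: 1Gi
max_heap_mib=512   # MAX_HEAP_SIZE: 512M
# The remainder is headroom for off-heap structures and the JVM itself.
echo "off-heap headroom: $((limit_mib - max_heap_mib))MiB"
```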

Next steps

This should bring up the cluster; check with kubectl get pods until all three Cassandra pods are Running. kubectl exec -it cassandra-0 -- nodetool status will show the ring status with nodetool, and we can also drop into that container to run nodetool interactively.