32 changes: 32 additions & 0 deletions docs/architecture.md
@@ -95,3 +95,35 @@ Use the `update-version.sh` script to manage operator versions:
Supported components: `strimzi`, `apicurio-registry`, `streamshub-console`, `prometheus-operator`

The script updates the remote resource URLs in the relevant `kustomization.yaml` files to point to the new version's release artifacts.

## Scaling the Kafka Cluster

The default deployment uses the upstream Strimzi [`kafka-single-node.yaml`](https://github.com/strimzi/strimzi-kafka-operator/blob/0.51.0/examples/kafka/kafka-single-node.yaml) example with a single broker.
To scale to 3 replicas, edit `components/core/stack/kafka/kustomization.yaml` and change the resource URL to use Strimzi's [`kafka-with-dual-role-nodes.yaml`](https://github.com/strimzi/strimzi-kafka-operator/blob/0.51.0/examples/kafka/kafka-with-dual-role-nodes.yaml) example instead:

```yaml
resources:
- https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/refs/tags/0.51.0/examples/kafka/kafka-with-dual-role-nodes.yaml
- namespace.yaml
```

This example is structurally identical to the single-node version (same KafkaNodePool name, listeners, and storage) but configures 3 replicas with the following replication settings:

| Property | Value | Notes |
|----------|-------|-------|
| `offsets.topic.replication.factor` | 3 | |
| `transaction.state.log.replication.factor` | 3 | |
| `transaction.state.log.min.isr` | 2 | replicas − 1 |
| `default.replication.factor` | 3 | |
| `min.insync.replicas` | 2 | replicas − 1 |

All existing patches (cluster rename, resource limits, entity operator config) apply without changes.

For replica counts other than 1 or 3, start from either example and add patches for `spec.replicas` on the KafkaNodePool and the replication config values on the Kafka CR.
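
As a rough illustration of such patches, the sketch below prints kustomize patch entries for a chosen replica count. It assumes inline JSON6902-style patches selected only by `kind`, and it applies the `replicas − 1` rule from the table with a floor of 1 for the single-node case. The function name `print_scaling_patches` is hypothetical and not part of the repository; the output would need to be merged by hand into the `patches:` list of `components/core/stack/kafka/kustomization.yaml`.

```shell
#!/bin/sh
# Sketch only: print kustomize patch entries for a chosen replica count.
# Assumption: patches are selected by kind, so no resource names are needed.
print_scaling_patches() {
  replicas="$1"
  # "replicas - 1" rule from the table, never below 1 (single-node case).
  if [ "$replicas" -gt 1 ]; then
    min_isr=$((replicas - 1))
  else
    min_isr=1
  fi
  cat <<EOF
- target:
    kind: KafkaNodePool
  patch: |-
    - op: replace
      path: /spec/replicas
      value: ${replicas}
- target:
    kind: Kafka
  patch: |-
    - op: replace
      path: /spec/kafka/config/offsets.topic.replication.factor
      value: ${replicas}
    - op: replace
      path: /spec/kafka/config/transaction.state.log.replication.factor
      value: ${replicas}
    - op: replace
      path: /spec/kafka/config/transaction.state.log.min.isr
      value: ${min_isr}
    - op: replace
      path: /spec/kafka/config/default.replication.factor
      value: ${replicas}
    - op: replace
      path: /spec/kafka/config/min.insync.replicas
      value: ${min_isr}
EOF
}

# Example: entries for a 5-node cluster.
print_scaling_patches 5
```

Because `op: replace` only works on paths that already exist, this relies on both upstream examples defining these keys under `spec.kafka.config`; check the rendered output with `kubectl kustomize` before applying.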

**Considerations:**

- **ISR values** should be `replicas − 1`, not equal to `replicas`. With `min.insync.replicas` equal to the replica count, a single broker failure blocks all writes from producers that use `acks=all`
- **KRaft quorum**: the cluster uses KRaft (no ZooKeeper) with dual-role nodes (controller + broker). Use an odd number of replicas (3 or 5) so the controller quorum keeps a majority when a node fails: 3 nodes tolerate 1 failure and 5 tolerate 2, while an even count adds no extra tolerance over the odd count below it
- **Resource usage** scales roughly linearly: 3 replicas require about 3× the CPU and memory of a single node. You may need to increase cluster resources (e.g. `minikube start --cpus=8 --memory=12g`)
- **Local changes** require `LOCAL_DIR=.` when using the install script, which otherwise fetches manifests from GitHub. See [Install from a Local Checkout](installation.md#install-from-a-local-checkout)
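
The odd-number recommendation follows from majority voting. The sketch below is illustrative arithmetic only and does not query a cluster; the function names are made up for this example. It prints the majority size and the number of controller failures each replica count can survive.

```shell
#!/bin/sh
# Majority needed for a quorum of n voters: floor(n / 2) + 1.
quorum_majority() {
  echo $(( $1 / 2 + 1 ))
}

# Failures the quorum can survive while keeping a majority: n - majority.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  echo "replicas=$n majority=$(quorum_majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 nodes raises the required majority from 2 to 3 without raising the number of tolerated failures, which is why even counts are usually avoided.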
2 changes: 1 addition & 1 deletion docs/installation.md
@@ -57,7 +57,7 @@ If you prefer step-by-step control, the stack is installed in two phases.
### Phase 1 — Operators and CRDs

```shell
kubectl apply -k 'https://github.com/streamshub/developer-quickstart//overlays/core/base?ref=main'
kubectl apply --server-side --force-conflicts -k 'https://github.com/streamshub/developer-quickstart//overlays/core/base?ref=main'
```

Optionally, you can wait for the operators to become ready using the commands below:
2 changes: 1 addition & 1 deletion docs/overlays/core.md
@@ -20,7 +20,7 @@ No `OVERLAY` variable is needed — the core overlay is used by default.

```shell
# Phase 1 — Operators and CRDs
kubectl apply -k 'https://github.com/streamshub/developer-quickstart//overlays/core/base?ref=main'
kubectl apply --server-side --force-conflicts -k 'https://github.com/streamshub/developer-quickstart//overlays/core/base?ref=main'

# Optionally, wait for the operators to be ready
kubectl wait --for=condition=Available deployment/strimzi-cluster-operator -n strimzi --timeout=120s
19 changes: 18 additions & 1 deletion docs/overlays/metrics.md
@@ -19,7 +19,7 @@ If you prefer step-by-step control, the metrics overlay uses `overlays/metrics`

```shell
# Phase 1 — Operators and CRDs (includes Prometheus Operator)
kubectl create -k 'https://github.com/streamshub/developer-quickstart//overlays/metrics/base?ref=main'
kubectl apply --server-side --force-conflicts -k 'https://github.com/streamshub/developer-quickstart//overlays/metrics/base?ref=main'

# Optionally, wait for the operators to be ready
kubectl wait --for=condition=Available deployment/prometheus-operator -n monitoring --timeout=120s
@@ -107,6 +107,7 @@ curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"up"' | wc -l

Open the StreamsHub Console UI — Kafka cluster CPU and memory usage should show up straight away.
However, other metrics such as those for topics will only show once topics have been created and messages are flowing through them.
On minikube, disk usage metrics are not available — see [Disk Usage Metrics Empty on Minikube](#disk-usage-metrics-empty-on-minikube) below.

## Troubleshooting

@@ -135,3 +136,19 @@ kubectl get kafka/dev-cluster -n kafka -o jsonpath='{.spec.kafka.metricsConfig}'
- PodMonitor label mismatch — Prometheus selects PodMonitors with `app: strimzi`; verify the label is present
- Kafka metrics not enabled — the metrics overlay patches the Kafka CR to add `metricsConfig`; check that it was applied
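
One way to check the label cause is sketched below. The wrapper function names are hypothetical, and the commands assume the Prometheus Operator CRDs are installed so that `podmonitors` is a known resource type. Nothing runs until you call a function against your cluster.

```shell
#!/bin/sh
# Sketch only: helpers for checking PodMonitor labels (require cluster access).

# List PodMonitors that carry the label Prometheus selects on.
list_strimzi_podmonitors() {
  kubectl get podmonitors --all-namespaces -l app=strimzi
}

# Show labels on every PodMonitor to spot one missing `app: strimzi`.
show_podmonitor_labels() {
  kubectl get podmonitors --all-namespaces --show-labels
}
```

If `list_strimzi_podmonitors` returns nothing while `show_podmonitor_labels` lists PodMonitors, the label is the likely culprit.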

### Disk Usage Metrics Empty on Minikube

The Console UI shows CPU and memory graphs but the disk usage panel is empty:

```shell
# Check if volume stats are available in Prometheus
kubectl exec -n monitoring prometheus-prometheus-0 -c prometheus -- \
wget -qO- 'http://localhost:9090/api/v1/query?query=kubelet_volume_stats_used_bytes' \
| grep -c '"result":\[\]'
# Output of 1 means no volume stats are present
```

**Cause:**

- This is a minikube platform limitation, not a configuration issue. The `kubelet_volume_stats_*` and `container_fs_usage_bytes` metrics are not exposed by minikube's kubelet and cAdvisor, particularly with the Docker driver. On production clusters (OpenShift, EKS, GKE) these metrics are available and disk usage displays correctly
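
To confirm that both metric families are missing, the check above can be generalised roughly as follows. This is a sketch: the pod, container, and namespace names are copied from the command above, the function names are invented for illustration, and it assumes `wget` is available in the Prometheus container as that command implies.

```shell
#!/bin/sh
# Sketch only: report which disk-related metrics return no series.

# Succeeds when a Prometheus instant-query response on stdin has no series.
is_empty_result() {
  grep -qF '"result":[]'
}

# Runs one instant query inside the Prometheus pod (requires cluster access).
query_prometheus() {
  kubectl exec -n monitoring prometheus-prometheus-0 -c prometheus -- \
    wget -qO- "http://localhost:9090/api/v1/query?query=$1"
}

# Checks both metrics named in the cause; call it manually against a cluster.
check_disk_metrics() {
  for metric in kubelet_volume_stats_used_bytes container_fs_usage_bytes; do
    response=$(query_prometheus "$metric") || {
      echo "$metric: query failed"
      continue
    }
    if printf '%s' "$response" | is_empty_result; then
      echo "$metric: no data"
    else
      echo "$metric: data present"
    fi
  done
}
```

Running `check_disk_metrics` on minikube would be expected to report `no data` for both metrics if the limitation described here applies.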
