A Helm chart that self-hosts the CodeRAG HTTP/REST API (and, optionally, the web UI) on any Kubernetes cluster, with a persistent index, a git-sourced workspace, scheduled re-indexing, and sensible security defaults.
Don't use Helm? Every example below works with plain `kubectl` too: just pipe `helm template …` into `kubectl apply -f -` (see Without Helm).
⚠️ SECURITY — the API is UNAUTHENTICATED by default. With no API key set, anyone who can reach the Service (in-cluster, via port-forward, or through an Ingress) can read your indexed source and trigger reindexing. Before exposing CodeRAG beyond a trusted namespace you must:
- Set an API key so every request is authenticated: `--set secrets.apiKey=…` (demo) or, preferred, supply `CODERAG_API_KEY` via `secrets.existingSecret` (see Require authentication).
- Terminate TLS and add auth at any Ingress; never publish the plain HTTP API.
- Turn on the NetworkPolicy (`--set networkPolicy.enabled=true`) so only known clients can reach the server pod (see Lock down network access).

- Chart: `helm/coderag` · Values reference: `helm/coderag/values.yaml`
- Images: `ghcr.io/neverdecel/coderag:beta` (API) and `:beta-ui` (UI), built by `docker-beta.yml`.

CodeRAG keeps its index in a single embedded LanceDB store, and the engine is a single writer — the store is written non-atomically, so two processes writing one index would corrupt it. The chart is built around that fact:
- One replica, `Recreate` strategy, `ReadWriteOnce` PVC. Never scale the writer horizontally; it is not safe and the chart intentionally pins `replicas: 1`.
- Indexing is driven over HTTP, not by a second pod mounting the volume. An initial Job (and an optional CronJob) call `POST /index` on the running server, so exactly one process ever touches the index files.
- The embedding model is downloaded once (≈130 MB) and cached on the data volume (`CODERAG_CACHE_DIR=/data/.model-cache`), so restarts don't re-download it. A generous startup probe covers that first download before liveness kicks in.
- Standalone by default. `helm install` with no arguments boots a healthy server on your cluster's default storage (empty index); point it at your code with one setting.
- The codebase is mounted read-only into the app and refreshed by a git init container (and optional git-sync sidecar), never written by the engine.
- Hardened by default: non-root (uid 10001), read-only root filesystem, dropped capabilities, `RuntimeDefault` seccomp, and the service-account token is not mounted.

The server is the primary, recommended surface. The UI is optional and runs in one of two topologies:
- Shared (recommended for a read-only/demo UI): `ui.useServerIndex=true` mounts the server's index volume read-only, so the UI serves exactly what the index/reindex Jobs built. There is no separate volume, it is always in sync, and it can never corrupt the writer's store. Reindexing stays a server-side Job.
- Independent (default): the UI gets its own data volume and bundles the engine. Nothing populates that volume automatically (the index Jobs drive the server's volume), so you build it with the in-app Reindex button. That button is disabled in demo mode, so a demo UI left on the default shows an empty index (0 files / 0 chunks).

- A Kubernetes cluster (v1.25+) and `kubectl` configured for it.
- A default `StorageClass` (or set `persistence.storageClass`) that can provision `ReadWriteOnce` volumes.
- Helm 3 (only for the Helm workflow).

Standalone (zero config). Installs and runs anywhere with a default StorageClass — no required flags:
```bash
helm install coderag ./deploy/helm/coderag --namespace coderag --create-namespace
```

The server comes up healthy on a freshly provisioned 10Gi volume with an empty index. Now point it at your code by cloning a git repo into the pod:

```bash
helm upgrade coderag ./deploy/helm/coderag -n coderag --reuse-values \
  --set workspace.source=git \
  --set workspace.git.repository=https://github.com/Neverdecel/CodeRAG.git
```

That provisions the index volume, clones the repo into the pod, and runs a one-shot Job that builds the index once the server is ready. (You can pass both `--set` flags on the first install too, to do it in one step.) Watch it come up:

```bash
kubectl -n coderag get pods -w
kubectl -n coderag logs -f job/coderag-index-1   # initial indexing progress
```

Query it:

```bash
# port-forward stays in the foreground; run the curl commands from a second terminal
kubectl -n coderag port-forward svc/coderag-server 8000:8000
curl "http://127.0.0.1:8000/status"
curl "http://127.0.0.1:8000/search?q=where%20is%20retry%20handled&k=5"
```

The chart needs no Tiller/cluster-side component, so you can render it locally and apply the plain manifests:

```bash
helm template coderag ./deploy/helm/coderag \
  --namespace coderag \
  --set workspace.git.repository=https://github.com/Neverdecel/CodeRAG.git \
  > coderag.yaml
kubectl create namespace coderag
kubectl -n coderag apply -f coderag.yaml
```

Re-render and re-apply to upgrade. (You lose Helm's release tracking and the automatic revision-suffixed index Job, but the manifests are otherwise identical.)

Full list with comments: values.yaml. The most-used knobs:
| Value | Default | Purpose |
|---|---|---|
| `image.tag` | `beta` | Image tag. Pin to `sha-<commit>` for reproducibility. |
| `workspace.source` | `emptyDir` | `emptyDir` (standalone) · `git` · `existingClaim`. |
| `workspace.git.repository` | – | Required for `source=git`. Repo to index. |
| `workspace.git.ref` | `""` | Branch/tag (empty = default branch). |
| `workspace.git.sync.enabled` | `false` | Sidecar that runs `git pull` on an interval. |
| `persistence.enabled` | `true` | Persist the index to a PVC (`false` = ephemeral). |
| `persistence.size` | `10Gi` | Index volume size. |
| `persistence.storageClass` | `""` | `""` default class · `<name>` · `"-"` static. |
| `persistence.volumeName` / `persistence.selector` | – | Bind a pre-provisioned PV (static). |
| `config.provider` | `fastembed` | `fastembed` (local, no key) · `openai` · `fake`. |
| `config.openaiBaseUrl` | `""` | Self-hosted OpenAI-compatible endpoint. |
| `secrets.existingSecret` | `""` | Preferred: pre-created Secret with `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` / `CODERAG_API_KEY`. |
| `secrets.openaiApiKey` / `secrets.anthropicApiKey` | `""` | Inline keys (demo only; stored in the release; prefer `existingSecret`). |
| `secrets.apiKey` | `""` | Inline `CODERAG_API_KEY`; turns API auth ON (demo only; prefer `existingSecret`). |
| `networkPolicy.enabled` | `false` | Recommended: default-deny ingress to the server, allow only UI/jobs/(ingress). |
| `server.service.type` | `ClusterIP` | `ClusterIP` · `NodePort` · `LoadBalancer`. |
| `index.initJob.enabled` | `true` | Build the index automatically on install/upgrade. |
| `index.cronjob.enabled` | `false` | Recurring reindex (`index.cronjob.schedule`). |
| `ui.enabled` | `false` | Also deploy the web UI. |
| `ui.useServerIndex` | `false` | UI serves the server's index (read-only) instead of its own empty volume. |
| `ui.coLocateWithServer` | `false` | Pin the UI onto the server's node; required with `useServerIndex` on RWO storage. |
| `ingress.enabled` | `false` | Expose via an Ingress (add TLS + auth; the API has none). |
| `resources` (`server.*`, `ui.*`) | see values | CPU/memory requests & limits. |

The index needs one ReadWriteOnce volume per writer. The chart works with whatever your
cluster already provides — you rarely need to configure anything.
Use the cluster default StorageClass (recommended). Leave persistence.storageClass: ""
and the PVC binds to your default class. That covers virtually every managed and
self-managed cluster out of the box:
| Environment | Typical default class |
|---|---|
| Amazon EKS | gp3 / gp2 (EBS CSI) |
| Google GKE | standard-rwo (PD CSI) |
| Azure AKS | managed-csi / default (Disk CSI) |
| k3s / Rancher | local-path |
| Minikube / kind | standard |
| DigitalOcean / Civo / … | provider block-storage class |
Pick a specific class when you run your own provisioner:

```bash
--set persistence.storageClass=longhorn          # Longhorn
--set persistence.storageClass=nfs-client        # NFS subdir provisioner
--set persistence.storageClass=openebs-hostpath  # OpenEBS LocalPV
```

Bind a pre-provisioned PersistentVolume (static). Common on-prem when there's no dynamic provisioner, e.g. a hand-made NFS, hostPath, or local PV. Disable provisioning with `storageClass: "-"` and point at the PV by name (or label):

```yaml
persistence:
  storageClass: "-"            # storageClassName: "" (no dynamic provisioning)
  volumeName: coderag-data-pv  # bind this specific PV
  # or match by labels instead of by name:
  # selector:
  #   matchLabels: { app: coderag }
```

```yaml
# Example PV backed by an NFS export (apply once, cluster-wide):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: coderag-data-pv
spec:
  capacity: { storage: 10Gi }
  accessModes: [ReadWriteOnce]
  storageClassName: ""
  nfs: { server: nfs.internal, path: /export/coderag }
```

Bring your own PVC. If you already manage the claim, reference it directly and the chart won't create one: `--set persistence.existingClaim=my-index-pvc`.

The index is single-writer, so `ReadWriteOnce` is the right access mode. `ReadWriteMany` (NFS, CephFS) also works if that's all you have, but it buys you nothing here.

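The three `persistence.storageClass` conventions described above (empty string, a class name, or `"-"`) can be read as a small mapping onto the PVC's `storageClassName` field. The sketch below illustrates that mapping in Python; it is not the chart's actual template, and the function name `pvc_spec` and its parameters are hypothetical.

```python
from typing import Optional


def pvc_spec(storage_class: str, size: str = "10Gi",
             volume_name: Optional[str] = None) -> dict:
    """Build a PVC spec dict from a persistence.storageClass-style value.

    Hypothetical helper: it mirrors the documented convention, not the
    chart's real rendering logic.
    """
    spec = {
        "accessModes": ["ReadWriteOnce"],  # the index is single-writer
        "resources": {"requests": {"storage": size}},
    }
    if storage_class == "-":
        # Static binding: an explicit empty class disables dynamic provisioning.
        spec["storageClassName"] = ""
    elif storage_class:
        # A named class, e.g. "longhorn".
        spec["storageClassName"] = storage_class
    # storage_class == "": omit the field so the cluster default class applies.
    if volume_name:
        spec["volumeName"] = volume_name  # bind one specific pre-provisioned PV
    return spec
```

For instance, `pvc_spec("-", volume_name="coderag-data-pv")` corresponds to the static-binding values shown earlier, while `pvc_spec("")` leaves `storageClassName` out entirely.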
```bash
helm install coderag ./deploy/helm/coderag -n coderag --create-namespace \
  --set workspace.git.repository=https://github.com/org/repo.git \
  --set secrets.openaiApiKey=sk-... \
  --set config.provider=openai   # optional: OpenAI embeddings too
```

Prefer a pre-created Secret (so keys never sit in your values/CI):

```bash
kubectl -n coderag create secret generic coderag-keys \
  --from-literal=OPENAI_API_KEY=sk-... \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-...
helm install coderag ./deploy/helm/coderag -n coderag \
  --set workspace.git.repository=https://github.com/org/repo.git \
  --set secrets.existingSecret=coderag-keys
```

The CodeRAG HTTP API is unauthenticated unless `CODERAG_API_KEY` is set. When set, the server, the UI, and the in-cluster index/reindex Jobs all use it, and every request must present it as a Bearer token (`Authorization: Bearer <key>`). Always enable this before exposing CodeRAG outside a trusted namespace.

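For scripted clients, the Bearer-token requirement described above can be sketched with the Python standard library. The helper names `build_request` and `fetch_text` are hypothetical, and the sketch assumes the API is reachable on `http://127.0.0.1:8000` through a port-forward; the response is returned as raw text because its exact shape is not specified here.

```python
import urllib.parse
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, params: Optional[dict] = None,
                  api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, adding the Bearer token when a key is supplied."""
    query = ""
    if params:
        # quote_via=quote encodes spaces as %20, matching the curl examples.
        query = "?" + urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    request = urllib.request.Request(base_url.rstrip("/") + path + query)
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


def fetch_text(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()
```

A search call would then look like `fetch_text(build_request("http://127.0.0.1:8000", "/search", {"q": "where is retry handled", "k": 5}, api_key=key))`, with `key` read from your Secret rather than hard-coded.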
Preferred — supply the key via a pre-created Secret (no credential in your values/CI):
```bash
kubectl -n coderag create secret generic coderag-keys \
  --from-literal=CODERAG_API_KEY="$(openssl rand -hex 32)"
helm install coderag ./deploy/helm/coderag -n coderag \
  --set secrets.existingSecret=coderag-keys
# (the same Secret can also hold OPENAI_API_KEY / ANTHROPIC_API_KEY)

# Then call the API with the key:
curl -H "Authorization: Bearer $(kubectl -n coderag get secret coderag-keys \
  -o jsonpath='{.data.CODERAG_API_KEY}' | base64 -d)" \
  http://127.0.0.1:8000/status
```

Demo only: inline key (lands in the stored Helm release in plaintext; fine for a throwaway cluster, not for production):

```bash
--set secrets.apiKey="$(openssl rand -hex 32)"
```

By default any pod in the cluster can reach the server's Service. Enable the bundled NetworkPolicy to default-deny ingress to the server pod and allow only known clients (the UI pods, the index/reindex Jobs, and optionally your ingress controller) on the API port (8000). Egress is restricted to DNS and HTTPS (for git/model/provider access).

```bash
--set networkPolicy.enabled=true
```

Your CNI must enforce NetworkPolicy (Calico, Cilium, Antrea, Weave, AKS Azure-CNI, GKE Dataplane V2, …). To also let an ingress controller through, point the policy at its pods:

```yaml
networkPolicy:
  enabled: true
  ingressController:
    namespaceSelector:
      matchLabels: { kubernetes.io/metadata.name: ingress-nginx }
    podSelector:
      matchLabels: { app.kubernetes.io/name: ingress-nginx }
  # If your provider/model egress needs a non-443 port, either add it here…
  # extraEgress:
  #   - ports: [{ port: 11434, protocol: TCP }]   # e.g. an in-cluster Ollama
  # …or disable egress restrictions entirely:
  # egress: { enabled: false }
```

The policy already permits DNS (53) and HTTPS (443) egress. A self-hosted OpenAI-compatible endpoint on a custom port (e.g. Ollama on 11434) needs an `extraEgress` rule or `egress.enabled=false`.

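A quick way to tell whether a provider endpoint will be blocked by the default egress rules is to look at the TCP port its URL resolves to. The sketch below is a planning aid only, assuming the allowed set is exactly what the text states (HTTPS on 443, plus DNS); the names `egress_port` and `needs_extra_egress` are hypothetical and nothing here inspects the cluster.

```python
from urllib.parse import urlsplit

# Default ports implied by the URL scheme when none is written explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

# TCP ports the bundled policy is described as allowing for outbound traffic
# (DNS on 53 is also allowed, but no provider endpoint is served there).
ALLOWED_TCP_PORTS = {443}


def egress_port(url: str) -> int:
    """Return the TCP port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme in {url!r}")
    return DEFAULT_PORTS[parts.scheme]


def needs_extra_egress(url: str) -> bool:
    """True if the endpoint's port is outside the default egress allowance."""
    return egress_port(url) not in ALLOWED_TCP_PORTS
```

Under these assumptions an in-cluster Ollama URL such as `http://ollama.ai-system.svc:11434/v1` needs an `extraEgress` entry for port 11434, while an `https://` endpoint on the default port does not.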
```bash
--set config.openaiBaseUrl=http://ollama.ai-system.svc:11434/v1 \
--set config.llmProvider=openai \
--set config.chatModel=llama3.1
```

Pair a git-sync sidecar (pulls new commits) with a reindex CronJob (re-embeds changes):

```bash
--set workspace.git.sync.enabled=true \
--set workspace.git.sync.periodSeconds=300 \
--set index.cronjob.enabled=true \
--set index.cronjob.schedule="*/30 * * * *"
```

Option A: pre-populated volume (no in-cluster git auth). Put your code on a PVC (e.g. via a CI job or `kubectl cp`) and mount it:

```bash
--set workspace.source=existingClaim \
--set workspace.existingClaim=my-code-pvc
```

Option B: your own clone init container + a Secret (credential helper, no token in the URL). Skip the built-in clone (`source=emptyDir`) and supply credentials from a Secret. Do not put the token in the clone URL: it leaks into process listings, shell history, `git remote -v`, and the repo's `.git/config` on the workspace volume. Instead hand it to git out-of-band via `GIT_ASKPASS`: write a tiny helper script to `/tmp` that only its owner can read and execute (mode 0700, since git has to run it), and clone a clean `https://github.com/...` URL.

```yaml
# private-repo.yaml
workspace:
  source: emptyDir                 # disables the built-in (public) git clone
extraInitContainers:
  - name: git-clone
    image: alpine/git:2.45.2@sha256:16ad8e788e1d3b0c30f18da8dde5c0ace3b187445a62d8af893b003ca1e70592
    securityContext: { allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities: { drop: [ALL] } }
    env:
      - name: HOME
        value: /tmp
      - name: GIT_TERMINAL_PROMPT   # fail fast instead of hanging on a credential prompt
        value: "0"
      - name: GIT_ALLOW_PROTOCOL    # restrict to safe transports
        value: "https:git"
      - name: GIT_USERNAME
        value: x-access-token       # GitHub PAT/installation token username
      - name: GIT_TOKEN
        valueFrom: { secretKeyRef: { name: git-creds, key: token } }
    command: ["/bin/sh","-c"]
    args:
      - |
        set -eu
        # GIT_ASKPASS helper: git calls it for "Username"/"Password" prompts.
        # The token never appears in the URL, argv, or .git/config.
        ASKPASS="$(mktemp /tmp/askpass.XXXXXX)"
        chmod 0600 "$ASKPASS"
        cat > "$ASKPASS" <<'EOF'
        #!/bin/sh
        case "$1" in
          Username*) printf '%s' "$GIT_USERNAME" ;;
          Password*) printf '%s' "$GIT_TOKEN" ;;
        esac
        EOF
        chmod 0700 "$ASKPASS"
        export GIT_ASKPASS="$ASKPASS"
        git clone --depth=1 -- "https://github.com/org/private-repo.git" /workspace
        rm -f "$ASKPASS"
    volumeMounts:
      - { name: workspace, mountPath: /workspace }
      - { name: tmp, mountPath: /tmp }
```

```bash
kubectl -n coderag create secret generic git-creds --from-literal=token=ghp_...
helm install coderag ./deploy/helm/coderag -n coderag -f private-repo.yaml
```

The `tmp` volume is an `emptyDir`, so the helper script lives only in memory/ephemeral storage for the life of the init container and is removed after the clone. The token is supplied solely through the `git-creds` Secret env var.

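The same `GIT_ASKPASS` technique can be tried outside the cluster before wiring it into an init container. The sketch below is a Python rendering of the shell steps above, assuming `git` and `/bin/sh` are available locally; the function names `write_askpass` and `clone_with_token` are hypothetical.

```python
import os
import subprocess
import tempfile

# Same helper as the init container: git passes the prompt text as $1.
ASKPASS_SCRIPT = """#!/bin/sh
case "$1" in
  Username*) printf '%s' "$GIT_USERNAME" ;;
  Password*) printf '%s' "$GIT_TOKEN" ;;
esac
"""


def write_askpass(directory: str) -> str:
    """Write the helper into `directory` and make it owner-executable only."""
    fd, path = tempfile.mkstemp(prefix="askpass.", dir=directory)  # created 0600
    with os.fdopen(fd, "w") as handle:
        handle.write(ASKPASS_SCRIPT)
    os.chmod(path, 0o700)  # git must be able to execute it
    return path


def clone_with_token(repo_url: str, dest: str, username: str, token: str) -> None:
    """Shallow-clone repo_url; the token travels only via the environment."""
    with tempfile.TemporaryDirectory() as tmp:  # helper is removed on exit
        env = dict(
            os.environ,
            GIT_ASKPASS=write_askpass(tmp),
            GIT_TERMINAL_PROMPT="0",  # fail instead of prompting interactively
            GIT_USERNAME=username,
            GIT_TOKEN=token,
        )
        subprocess.run(
            ["git", "clone", "--depth=1", "--", repo_url, dest],
            env=env,
            check=True,
        )
```

Because the helper reads the credentials from environment variables at prompt time, the script file itself contains no secret, which is the property the init container relies on as well.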
⚠️ The API is unauthenticated by default and serves plain HTTP. Any Ingress you create must terminate TLS and add an authentication layer in front of it. At minimum set `secrets.apiKey` / `secrets.existingSecret` (see Require authentication) so the app itself rejects unauthenticated requests, and add an auth annotation/middleware at the controller (e.g. ingress-nginx `nginx.ingress.kubernetes.io/auth-*`, or an OAuth2 proxy). Always include a `tls:` block.

```yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    # Example: require Basic-auth at the edge (in addition to the app's API key).
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: coderag-basic-auth
  hosts:
    - host: coderag.example.com
      paths:
        - { path: /, pathType: Prefix, service: server }
  tls:
    - { secretName: coderag-tls, hosts: [coderag.example.com] }   # TLS is mandatory
```

Recommended: the UI serves the server's index (read-only):

```bash
--set ui.enabled=true \
--set ui.useServerIndex=true
# On ReadWriteOnce storage (the default), also pin the UI onto the server's node:
--set ui.coLocateWithServer=true
# Omit coLocateWithServer if persistence uses a ReadWriteMany storageClass.
```

The UI mounts the server's index volume read-only, so it shows whatever the init/reindex Jobs built: there is nothing to reindex from the UI, and it stays in sync with the server. This is the right choice for a public/demo UI, where the in-app Reindex button is disabled. Open it via port-forward (`svc/coderag-ui:8501`) or an Ingress path with `service: ui`.

Independent UI (default, `ui.useServerIndex=false`): the UI gets its own data volume and clones the same repo, but nothing populates that volume; you must click Reindex in the sidebar to build it (impossible in demo mode). If your UI shows 0 files / 0 chunks, this is almost always why: the index Jobs filled the server's volume, not the UI's. Switch to `ui.useServerIndex=true`.

No versioned tags are published yet; the default is the rolling :beta. Pin to a commit:
```bash
--set image.tag=sha-<commit>   # API → ghcr.io/.../coderag:sha-<commit>
                               # UI  → :sha-<commit>-ui  (image.uiSuffix)
```

For private registries, set `image.pullSecrets: [{ name: my-regcred }]`.

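The tag-plus-suffix naming above is easy to get wrong when pinning by hand, so it can help to see it spelled out. The sketch below illustrates how one tag yields both image references, assuming the UI suffix is `-ui` as in the `:beta` / `:beta-ui` pair; the helper name `image_refs` is hypothetical and `sha-abc1234` is a placeholder, not a published tag.

```python
def image_refs(tag: str,
               repository: str = "ghcr.io/neverdecel/coderag",
               ui_suffix: str = "-ui") -> dict:
    """Return the API and UI image references derived from one tag."""
    if not tag:
        raise ValueError("tag must not be empty")
    return {
        "api": f"{repository}:{tag}",
        # The UI image shares the repository; only the tag gets the suffix.
        "ui": f"{repository}:{tag}{ui_suffix}",
    }


# Placeholder commit tag for illustration only.
pinned = image_refs("sha-abc1234")
```

Here `pinned["ui"]` is the API tag with the suffix appended, which is why setting `image.tag` alone pins both images.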
```bash
# Trigger a reindex by hand (incremental):
kubectl -n coderag exec deploy/coderag-server -c server -- \
  python -c "import urllib.request as u; print(u.urlopen(u.Request('http://127.0.0.1:8000/index', data=b'{\"full\":false}', headers={'content-type':'application/json'})).read().decode())"
# Or from your laptop after a port-forward:
curl -X POST localhost:8000/index -H 'content-type: application/json' -d '{"full": true}'
# Status, logs:
curl localhost:8000/status
kubectl -n coderag logs deploy/coderag-server -c server
kubectl -n coderag logs deploy/coderag-server -c git-sync   # if sync enabled
```

```bash
helm upgrade coderag ./deploy/helm/coderag -n coderag --reuse-values
```

Each upgrade runs a fresh `…-index-<revision>` Job to refresh the index. A ConfigMap checksum annotation rolls the pod automatically when configuration changes.

The index PVCs are annotated `helm.sh/resource-policy: keep`, so your index survives an uninstall. Remove the volumes explicitly when you're done:

```bash
helm uninstall coderag -n coderag
kubectl -n coderag delete pvc -l app.kubernetes.io/instance=coderag
```

The same checks run in CI (`helm.yml`):

```bash
helm lint deploy/helm/coderag -f deploy/helm/coderag/ci/default-values.yaml
helm template coderag deploy/helm/coderag -f deploy/helm/coderag/ci/full-values.yaml \
  | kubeconform -strict -summary -kubernetes-version 1.29.0
```

- Pod stuck `ContainerCreating`/`Pending`: usually the PVC can't be provisioned. Check `kubectl -n coderag describe pvc` and set `persistence.storageClass` to a class that supports `ReadWriteOnce`.
- First start is slow / startup probe restarts: the embedding model (~130 MB) is downloading. It's cached on the data volume afterwards. Raise `server.startupProbe.failureThreshold` on very slow networks.
- A read-only-filesystem write error (rare; some model backend writing outside the mounted caches): the pod runs with `readOnlyRootFilesystem: true` and writable `/tmp`, `/data`, and `/home/coderag`. If a backend insists on another path, mount it via `extraVolumes`/`extraVolumeMounts`, or relax the hardening: `--set securityContext.readOnlyRootFilesystem=false`.
- UI shows 0 files / 0 chunks / 0 vectors: the UI is on its own (empty) data volume while the index Jobs populated the server's volume. They are different PVCs (`…-ui-data` vs `…-server-data`). Set `ui.useServerIndex=true` so the UI serves the server's index read-only (add `ui.coLocateWithServer=true` on `ReadWriteOnce` storage). The independent UI only fills its own volume via the in-app Reindex button, which is disabled in demo mode.

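For the slow-first-start case, a rough estimate helps when choosing a value for `server.startupProbe.failureThreshold`. Kubernetes allows a container roughly `failureThreshold × periodSeconds` to pass its startup probe, so the threshold has to cover the model download. The sketch below is back-of-the-envelope arithmetic: the probe period, the bandwidth, and the safety factor are assumptions you supply, and the function name `min_failure_threshold` is hypothetical.

```python
import math


def min_failure_threshold(download_mb: float, bandwidth_mbit_s: float,
                          period_seconds: int,
                          safety_factor: float = 1.5) -> int:
    """Smallest failureThreshold whose probe budget covers the download.

    Budget granted by the kubelet is about failureThreshold * periodSeconds.
    Model load time after the download is only covered by the safety factor.
    """
    if bandwidth_mbit_s <= 0 or period_seconds <= 0:
        raise ValueError("bandwidth and period must be positive")
    download_seconds = download_mb * 8 / bandwidth_mbit_s  # MB -> megabits
    return math.ceil(download_seconds * safety_factor / period_seconds)
```

With the ~130 MB model, an assumed 10 Mbit/s link and an assumed 10-second probe period, the download takes about 104 seconds and the estimate comes out at a threshold of 16.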
- Single writer by design: do not raise `replicas`. For higher search throughput, put a cache/load balancer in front of the read endpoints; the index itself stays single-writer.
- `ReadWriteOnce` ties the index to one node at a time; that's expected for the embedded store.
- The UI, when enabled, defaults to a separate index from the server. For a single shared index, set `ui.useServerIndex=true` (the UI reads the server's volume read-only), or run the server alone and point browsers/tools at its REST API.