The Helm-based operator is the operator pattern for teams that have a working Helm chart and do not want to write Go. The Operator SDK ships a generic reconciler that watches a CR, converts its .spec to Helm values, and runs the matching helm install, upgrade, or uninstall for each CR event. The chart does the work; you write zero lines of Go.
This is a two-part tutorial. Part 1 (this article) gets you from zero to a deployed operator with a four-template demo-app chart, a DemoApp CRD, and a fully understood watches.yaml. Part 2 picks up everything you do with the operator afterwards: lifecycle (upgrade/uninstall), drift, hooks, scope, and the hard ceiling.
Prerequisites: install Operator-SDK on Linux, Helm v3 or v4 CLI, Docker, kubectl, and a kind cluster. Familiarity with helm install, helm upgrade, and helm uninstall is assumed - the CLI commands used in this tutorial are identical on both Helm 3 and Helm 4.
A note on Helm 3 vs Helm 4. Helm 4 (current stable v4.2.0 as of mid-2026) is the version you should install for new work. The Operator SDK Helm plugin (helm.sdk.operatorframework.io/v1) used in this tutorial currently still bundles the Helm v3 Go SDK internally (helm-operator-plugins v0.9.x depends on helm.sh/helm/v3); Helm 4 SDK migration is on the roadmap. This does not affect anything you do in this article - the chart format, the CLI you use to test the chart standalone, and the resulting Helm release format (sh.helm.release.v1.<name>.v<rev> Secrets) are all unchanged between Helm 3 and Helm 4. The Helm hybrid operator is the path to take if you want to ship a Go operator that uses the Helm v4 SDK directly today.
What a Helm-based operator actually is (and what you don't write)
A Helm-based operator has four moving pieces:
- A Helm chart (your existing one or a new one).
- A CRD that defines the shape of the CR.
- A watches.yaml that maps CR Group/Version/Kind to a chart directory.
- The pre-built helm-operator reconciler binary shipped by Operator SDK.
The reconciler ships as a gcr.io/kubebuilder/... style image that Operator SDK builds for you. When a CR is created or updated, the reconciler reads its .spec, calls helm install or helm upgrade with that spec passed as values, and writes status conditions. When the CR is deleted, it calls helm uninstall. You write zero Go code - the reconciler is generic.
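The event-to-action mapping just described can be sketched as a small shell function. This is a conceptual sketch only: the real reconciler is compiled code inside the operator image, and the function name helm_action_for and its true/false arguments are invented here for illustration.

```shell
# Conceptual sketch, not the operator's actual code.
# helm_action_for <cr_deleted> <release_exists>   (both "true" or "false")
# Prints which Helm action a reconcile would end up performing.
helm_action_for() {
  cr_deleted="$1"
  release_exists="$2"
  if [ "$cr_deleted" = "true" ]; then
    # The CR is going away: remove the release if there is one.
    if [ "$release_exists" = "true" ]; then echo "uninstall"; else echo "none"; fi
  elif [ "$release_exists" = "true" ]; then
    echo "upgrade"
  else
    echo "install"
  fi
}

helm_action_for false false   # new CR, no release yet -> install
helm_action_for false true    # CR changed, release exists -> upgrade
helm_action_for true  true    # CR deleted -> uninstall
```

The sketch deliberately ignores details such as status conditions and retries; it only shows why a single generic reconciler can serve any chart.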
Here is what is generated vs what you provide, compared to a Go operator:
| Piece | Helm-based operator | Go operator (for comparison) |
|---|---|---|
| Resource templates | Helm chart you write (or already have) | Go code in controllers/ constructing each resource |
| CRD YAML | Generated by operator-sdk init from CLI flags (permissive) | Generated by make manifests from Go API types |
| Go API types | None - no types.go, no make generate | api/v1alpha1/*_types.go you author |
| Reconcile function | None - pre-built reconciler runs helm install/upgrade | Reconcile() you author |
| Configuration | watches.yaml (Kind to chart mapping) | SetupWithManager in code |
| Status conditions | Fixed: Initialized, Deployed, ReleaseFailed | Anything you want |
| Custom finalizer logic | Not possible without rebuilding the operator image | A few lines of Go |
| Reading external state | Not possible in reconcile | Standard Go HTTP/SDK calls |
The trade-off is clear: zero code for "install this chart on every CR," at the cost of zero control over anything else. The four rows marked None / Not possible above are the hard ceiling - Part 2 enumerates them with workarounds (Helm hooks) and Article 3 / Helm hybrid operator show how to break the ceiling by switching to a Go-driven hybrid.
What you'll build in this two-part series
A DemoApp CRD that drives a tiny four-template Helm chart called demo-app:
| Template | Purpose |
|---|---|
| templates/deployment.yaml | Nginx Deployment serving a single HTML file |
| templates/service.yaml | ClusterIP Service in front of the Deployment |
| templates/configmap.yaml | The HTML body (sourced from .Values.message) |
| templates/secret.yaml | A fake API key (sourced from .Values.apiKey) mounted as an env var |
The DemoApp CR maps to Helm values like this:
| CR field | Helm value | Resource it ends up in |
|---|---|---|
| spec.replicaCount | replicaCount | Deployment.spec.replicas |
| spec.image | image | Deployment.spec.template.spec.containers[0].image |
| spec.message | message | ConfigMap.data.index.html |
| spec.apiKey | apiKey | Secret.data.api-key (base64) |
| spec.service.type | service.type | Service.spec.type |
What ships in Part 1 vs Part 2:
| Part 1 (this article) | Part 2 (next) |
|---|---|
| Write the demo-app chart | Upgrade the CR, see helm upgrade |
| Scaffold the operator (operator-sdk init) | Delete the CR, see helm uninstall |
| Build, deploy, apply first CR (install only) | overrideValues and value precedence (full rules) |
| Tighten the generated CRD | Drift detection (edit / delete chart resources, watch reconcile) |
| Walk every field of watches.yaml | Helm hooks for pre/post install/upgrade/delete custom work |
| | Cluster-scoped vs namespace-scoped (WATCH_NAMESPACE + RBAC swap) |
| | Multi-tenancy with the selector field |
| | The hard ceiling (what you cannot do without Go) |
Prerequisites
This article assumes you have already completed the full lab setup in Install Operator-SDK on Linux - that one walks the installs (Go, kubectl, Docker, Helm 4 CLI, kind, operator-sdk binary) and brings up a plain kind cluster you can target.
If everything from that guide is in place, all of these should print without error:
operator-sdk version
# operator-sdk version: "v1.42.2", commit: "...", kubernetes version: "1.33.1", ...
helm version --short
# v4.2.0+g0646808
kubectl version --client
# Client Version: v1.36.1
# Kustomize Version: v5.8.1
kind version
# kind v0.31.0 go1.25.5 linux/amd64
docker version --format '{{.Server.Version}}'
# 29.2.1
# kind cluster (created by `kind create cluster --name demo` in the install article)
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# demo-control-plane Ready control-plane ... v1.35.0
# Helm plugin available in operator-sdk
operator-sdk init --plugins helm --help 2>&1 | head -n 1
# Initialize a new Helm-based operator project.
Version note: This article was verified end-to-end against operator-sdk v1.42.2, helm v4.2.0, kind v0.31.0 (which ships kindest/node:v1.35.0 by default), and Docker 29.2.1. Older operator-sdk releases used a different scaffold (a two-container manager pod with a kube-rbac-proxy sidecar); see the note in Step 4 below.
If operator-sdk init --plugins helm --help errors with no plugin could be resolved with key "helm", your operator-sdk build does not include the Helm plugin - re-run the install steps in the prereq article. If kubectl get nodes is empty, run kind create cluster --name demo from the prereq article's Step 1.
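The individual version checks above can be folded into one preflight pass. The following is a minimal sketch, assuming a POSIX shell; the helper name missing_tools is invented here, and it only checks that each CLI is on PATH, not which version it is.

```shell
# missing_tools <cmd>...  prints each command that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The tool list mirrors the prerequisites named in this article.
missing=$(missing_tools operator-sdk helm kubectl kind docker)
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
else
  echo "all prerequisite CLIs found"
fi
```

A passing preflight does not replace the kubectl get nodes check: the cluster still has to exist and be reachable.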
Image distribution: this article (and Part 2) builds a local operator image and needs to ship it into the kind cluster. We use ttl.sh, a free, public, ephemeral container registry that requires no signup, because it works from any cluster with zero setup. The pattern is export IMG=ttl.sh/demoapp-$(uuidgen):24h → make docker-build IMG="$IMG" → docker push "$IMG" → make deploy IMG="$IMG". The prereq article explains the choice in detail, including why we don't use kind load docker-image (brittle on Docker 24+) or a local registry:2 container (works but ~30 lines of setup). Anything pushed to ttl.sh is public; do not push proprietary code or secrets. Use a real registry (GHCR, ECR, GAR, ACR) for production.
Part A - Build the operator end to end
Step 1 - Write the demo-app chart
Scaffold a minimal chart:
mkdir -p ~/helm-operator && cd ~/helm-operator
helm create demo-app
helm create (Helm 4) produces about a dozen default templates including deployment.yaml, service.yaml, ingress.yaml, serviceaccount.yaml, hpa.yaml, httproute.yaml, plus _helpers.tpl, NOTES.txt, and a tests/ directory - all useless for our purpose. Wipe them and start clean:
rm demo-app/templates/*.yaml
rm demo-app/templates/tests/*.yaml
rmdir demo-app/templates/tests
rm demo-app/templates/_helpers.tpl
rm demo-app/templates/NOTES.txt
The first rm demo-app/templates/*.yaml catches httproute.yaml, hpa.yaml, and friends in one shot. If you're on Helm 3 you'll see one or two fewer files - the cleanup is the same.
Write demo-app/values.yaml:
replicaCount: 1
image: nginx:1.27-alpine
message: "Hello from demo-app"
apiKey: "changeme"
service:
  type: ClusterIP
  port: 80
The chart uses the upstream nginx:1.27-alpine from Docker Hub directly - it's a stable, widely-mirrored public image and one pull per CR over the course of this tutorial doesn't come close to Docker Hub's anonymous rate limit (100 pulls / 6 h). The operator image you'll build in Step 4 is a different story (it's built locally and has to reach the cluster) - that one goes through ttl.sh as flagged in Prerequisites.
Write demo-app/templates/_helpers.tpl:
{{- define "demo-app.name" -}}
{{ .Release.Name }}
{{- end }}
{{- define "demo-app.labels" -}}
app.kubernetes.io/name: {{ include "demo-app.name" . }}
app.kubernetes.io/managed-by: demo-app-operator
{{- end }}
Write demo-app/templates/configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "demo-app.name" . }}
  labels: {{- include "demo-app.labels" . | nindent 4 }}
data:
  index.html: |
    <html><body><h1>{{ .Values.message }}</h1></body></html>
Write demo-app/templates/secret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "demo-app.name" . }}
  labels: {{- include "demo-app.labels" . | nindent 4 }}
type: Opaque
stringData:
  api-key: {{ .Values.apiKey | quote }}
Write demo-app/templates/deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "demo-app.name" . }}
  labels: {{- include "demo-app.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "demo-app.name" . }}
  template:
    metadata:
      labels: {{- include "demo-app.labels" . | nindent 8 }}
    spec:
      containers:
        - name: web
          image: {{ .Values.image }}
          ports:
            - containerPort: 80
          env:
            - name: API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "demo-app.name" . }}
                  key: api-key
          volumeMounts:
            - name: web-content
              mountPath: /usr/share/nginx/html
      volumes:
        - name: web-content
          configMap:
            name: {{ include "demo-app.name" . }}
Write demo-app/templates/service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: {{ include "demo-app.name" . }}
  labels: {{- include "demo-app.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  selector:
    app.kubernetes.io/name: {{ include "demo-app.name" . }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: 80
Verify the chart renders and lints:
helm lint demo-app
# ==> Linting demo-app
# [INFO] Chart.yaml: icon is recommended
# 1 chart(s) linted, 0 chart(s) failed
Smoke-test the chart standalone (we are not yet using the operator; the kubelet pulls nginx:1.27-alpine directly from Docker Hub):
helm install hello demo-app --set message="standalone test"
kubectl get deploy,svc,cm,secret -l app.kubernetes.io/name=hello
# kubectl get -l is exact-match; the full label value is `hello`, not a substring
helm uninstall hello
If standalone install works, the chart is good. From here on, the operator drives Helm - you never run helm install directly again.
Step 2 - Scaffold with operator-sdk init (Helm plugin)
Create an empty operator project alongside the chart:
mkdir -p ~/helm-operator/demo-app-operator
cd ~/helm-operator/demo-app-operator
operator-sdk init --plugins=helm.sdk.operatorframework.io/v1 --domain example.com \
--group demo --version v1alpha1 --kind DemoApp --helm-chart=../demo-app
helm.sdk.operatorframework.io/v1 is the fully-qualified plugin key. The Operator SDK also accepts the short form --plugins=helm (a bare alias resolving to the same plugin). Versioned short forms like --plugins=helm/v1 are not recognized by the plugin resolver - the SDK returns no plugin could be resolved with key "helm/v1".
Three things just happened:
- A project skeleton (mostly YAML, no Go code) was scaffolded with a Dockerfile, Makefile, config/, and helm-charts/.
- Your demo-app chart was copied into helm-charts/demo-app/.
- A watches.yaml was generated mapping demo.example.com/v1alpha1/DemoApp to helm-charts/demo-app.
You will also see a level=warning message near the end of the output:
time="..." level=warning msg="The RBAC rules generated in config/rbac/role.yaml are based on the chart's default manifest. Some rules may be missing for resources that are only enabled with custom values, and some existing rules may be overly broad. Double check the rules generated in config/rbac/role.yaml to ensure they meet the operator's permission requirements."
This is an honest warning, not an error. The Operator SDK infers RBAC by rendering the chart with default values and granting create/update/patch/delete on every Kind that comes out. For our demo-app chart the default render is also the only render - we always emit a Deployment, Service, ConfigMap, and Secret regardless of CR values - so the generated role.yaml covers every chart resource.
The warning bites real charts that render conditionally:
# templates/ingress.yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
...
{{- end }}
With ingress.enabled: false in defaults, operator-sdk init will not generate ingresses.networking.k8s.io permissions, and a CR that sets ingress.enabled: true later will fail the reconcile with ingresses.networking.k8s.io is forbidden. The fix in that case is to manually edit config/rbac/role.yaml to add every Kind the chart can possibly render across all values combinations. Part 2 of this tutorial covers RBAC tightening in the namespace-scoped section; the gap-filling pattern is identical for cluster-scoped operators.
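One way to find such gaps before they surface as forbidden errors is to list the Kinds a chart renders under different values and compare that list against role.yaml. The helper below is a hedged sketch: kinds_in_manifest is a name invented here, it reads rendered YAML on stdin, and the heredoc is a stand-in for real helm template output so the snippet runs without Helm.

```shell
# kinds_in_manifest: read rendered YAML on stdin, print each top-level Kind once.
# Only lines starting in column 1 with "kind:" count, so nested "kind:" fields
# (for example inside a roleRef) are ignored.
kinds_in_manifest() {
  sed -n 's/^kind:[[:space:]]*//p' | sort -u
}

# Stand-in for rendered chart output.
kinds_in_manifest <<'EOF'
apiVersion: v1
kind: Service
---
apiVersion: networking.k8s.io/v1
kind: Ingress
spec:
  kind: NotATopLevelKind
EOF

# With Helm available, compare the default render against a render with the
# optional feature switched on (ingress.enabled is the hypothetical value from
# the snippet above; demo-app itself has no such template):
#   helm template demo helm-charts/demo-app | kinds_in_manifest
#   helm template demo helm-charts/demo-app --set ingress.enabled=true | kinds_in_manifest
```

Any Kind that appears only in the second render is a candidate for a missing rule in config/rbac/role.yaml.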
Add the patch verb to the existing events rule
The chart-rendering inference catches every Kind the chart writes. The scaffold also adds two framework-only rules the inference cannot see: secrets:* (helm release storage) and events:create (so the operator can emit Kubernetes Events for things like ReleaseFailed or OverrideValuesInUse). Read your generated config/rbac/role.yaml and you'll find this block already there:
# We need to create events on CRs about things happening during reconciliation
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
What's missing is the patch verb. The framework's EventRecorder aggregates repeated events (e.g., many OverrideValuesInUse warnings on the same CR) using a Patch call, and without patch the operator log fills with events ... is forbidden lines on every aggregation attempt (the reconcile itself still succeeds; only the aggregation fails). Add patch to that rule now, before the first make deploy:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch   # add this line
Treat the scaffolded role.yaml as a starting point, not the final answer.
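Once the operator is deployed in Step 4, the effect of this edit can be checked with kubectl auth can-i and impersonation. This is a sketch under assumptions: the namespace and ServiceAccount names are the scaffold defaults that appear in this article's deploy output, the helper names operator_identity and can_operator are invented here, and your kubeconfig user must be allowed to impersonate (the default kind admin is).

```shell
# Scaffold defaults as shown in this article's `make deploy` output.
OPERATOR_NS="demo-app-operator-system"
OPERATOR_SA="demo-app-operator-controller-manager"

# operator_identity: the user string Kubernetes uses for a ServiceAccount.
operator_identity() {
  printf 'system:serviceaccount:%s:%s\n' "$OPERATOR_NS" "$OPERATOR_SA"
}

# can_operator <verb> <resource>: ask the API server whether the operator's
# ServiceAccount may perform the verb in the default namespace.
can_operator() {
  kubectl auth can-i "$1" "$2" --as "$(operator_identity)" -n default
}

# Run after the operator is deployed:
#   can_operator create events
#   can_operator patch events
```

If the second call answers no, the patch verb did not make it into the deployed ClusterRole and make deploy needs to be re-run.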
Future-proof for Helm hook resources
Add this bundle to config/rbac/role.yaml now, in the same edit as the events rule above, so the single make deploy you run at the end of this article carries everything Part 2 will need:
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  - namespaces
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  - rolebindings
  - clusterroles
  - clusterrolebindings
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
These rules are not free - granting the operator clusterroles/clusterrolebindings write power is a privilege escalation surface (any chart it manages can mint cluster admin). For a production operator that never uses hooks you should leave them out and add only what each chart actually renders. For a tutorial operator where you will exercise hooks in the next article, adding them up-front avoids a second deploy cycle and keeps the focus on the helm-operator concepts rather than RBAC bookkeeping.
Step 3 - Project folder structure
demo-app-operator/
├── Dockerfile                # copies watches.yaml + chart onto the pre-built helm-operator base image
├── Makefile                  # docker-build / deploy / undeploy targets
├── PROJECT                   # operator-sdk metadata
├── watches.yaml              # CR Kind → chart mapping (the routing layer)
├── helm-charts/
│   └── demo-app/             # your chart, copied here at scaffold time
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/...
└── config/
    ├── crd/
    │   └── bases/
    │       └── demo.example.com_demoapps.yaml   # generated CRD (permissive)
    ├── samples/
    │   ├── demo_v1alpha1_demoapp.yaml           # generated sample CR
    │   └── kustomization.yaml
    ├── default/              # kustomize base for deploying the operator
    ├── manager/              # the Deployment that runs the operator pod
    └── rbac/                 # ClusterRole, ClusterRoleBinding, ServiceAccount
Read three files now - they are the entire "API" of your operator:
cat watches.yaml
cat config/crd/bases/demo.example.com_demoapps.yaml | head -40
cat config/samples/demo_v1alpha1_demoapp.yaml
The watches.yaml:
# Use the 'create api' subcommand to add watches to this file.
- group: demo.example.com
  version: v1alpha1
  kind: DemoApp
  chart: helm-charts/demo-app
# +kubebuilder:scaffold:watch
Four required fields, nothing more. Part C walks every optional field you can add to this.
The generated sample CR:
apiVersion: demo.example.com/v1alpha1
kind: DemoApp
metadata:
  name: demoapp-sample
spec:
  # Default values copied from <project_dir>/helm-charts/demo-app/values.yaml
  replicaCount: 1
  image: nginx:1.27-alpine
  message: "Hello from demo-app"
  apiKey: "changeme"
  service:
    type: ClusterIP
    port: 80
Notice: the sample CR's spec is exactly your chart's values.yaml. This is the implicit mapping - whatever you put in spec becomes Helm values verbatim.
Step 4 - Build, deploy, verify
Pick a unique image URL on ttl.sh. The UUID avoids collisions with anyone else using ttl.sh, and 24h is the time-to-live before the image auto-expires (plenty of room for working through Part 1 + Part 2):
export IMG=ttl.sh/demoapp-$(uuidgen):24h
echo "$IMG"
# ttl.sh/demoapp-3f8b9c12-4a5e-49b8-9d6a-87f2c1e0d3a4:24h
Build the operator image with that tag. The Dockerfile produces a single image containing the helm-operator reconciler binary plus your chart bundled at /opt/helm/helm-charts/demo-app:
make docker-build IMG="$IMG"
Push it to ttl.sh - the cluster will pull from the same URL:
docker push "$IMG"
Why ttl.sh instead of kind load docker-image or a local registry? Newer Docker (24+) with the containerd snapshotter breaks kind load with ctr: content digest <sha>: not found, and running a local registry:2 container requires ~30 lines of cluster setup (custom containerdConfigPatches + per-node hosts.toml). ttl.sh works with zero setup from any cluster; see the prereq article's ttl.sh section for the full reasoning. Do not push proprietary images to ttl.sh: it is public.
Deploy the operator (this also applies the CRD), then wait for the new operator pod to be ready before doing anything else:
make deploy IMG="$IMG"
kubectl -n demo-app-operator-system rollout status deploy/demo-app-operator-controller-manager
# deployment "demo-app-operator-controller-manager" successfully rolled out
make deploy runs kustomize build config/default | kubectl apply -f -, which creates the namespace, CRD, ServiceAccount, ClusterRole, ClusterRoleBinding, and the operator Deployment.
Always wait for rollout status after make deploy before applying CRs. Every rebuild of the operator image triggers a Deployment rollout. The new pod isn't Ready instantly, and if you apply (or re-apply) a CR while the OLD pod is still serving, it will reconcile against the OLD watches.yaml / chart / RBAC inside that pod, producing results that don't match this article. The same rule applies to every rebuild block in Part 2.
Sample output (your ttl.sh/demoapp-... URL will differ — the UUID is the one you generated):
cd config/manager && /root/helm-operator/demo-app-operator/bin/kustomize edit set image controller=ttl.sh/demoapp-3f8b9c12-4a5e-49b8-9d6a-87f2c1e0d3a4:24h
/root/helm-operator/demo-app-operator/bin/kustomize build config/default | kubectl apply -f -
namespace/demo-app-operator-system created
customresourcedefinition.apiextensions.k8s.io/demoapps.demo.example.com created
serviceaccount/demo-app-operator-controller-manager created
role.rbac.authorization.k8s.io/demo-app-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-demoapp-admin-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-demoapp-editor-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-demoapp-viewer-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-metrics-auth-role created
clusterrole.rbac.authorization.k8s.io/demo-app-operator-metrics-reader created
rolebinding.rbac.authorization.k8s.io/demo-app-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/demo-app-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/demo-app-operator-metrics-auth-rolebinding created
service/demo-app-operator-controller-manager-metrics-service created
deployment.apps/demo-app-operator-controller-manager created
Verify:
kubectl get crd | grep demoapps
# demoapps.demo.example.com 2026-06-02T04:00:00Z
kubectl -n demo-app-operator-system get pods
# NAME READY STATUS RESTARTS AGE
# demo-app-operator-controller-manager-7f8b... 1/1 Running 0 25s
READY 1/1 is the modern shape: a single container running the pre-built helm-operator binary. Older operator-sdk releases (before v1.36 / kubebuilder v4.4) scaffolded a two-container pod (READY 2/2) with a kube-rbac-proxy sidecar handling metrics-endpoint auth; that has been replaced with in-process authentication using the Kubernetes TokenReview API, so newer scaffolds drop the sidecar entirely. Both shapes are correct; this article assumes the modern one.
Tail the manager log to confirm it picked up watches.yaml:
kubectl -n demo-app-operator-system logs deploy/demo-app-operator-controller-manager -c manager | head -20
# {"level":"info","ts":"2026-06-03T08:37:54Z","logger":"cmd","msg":"Version","Go Version":"go1.25.8","GOOS":"linux","GOARCH":"amd64","helm-operator":"v1.42.2","commit":"6001c29067051e1a04e829ea033988b904d1845e"}
# {"level":"info","ts":"2026-06-03T08:37:54Z","logger":"cmd","msg":"Watching all namespaces"}
# {"level":"info","ts":"2026-06-03T08:37:54Z","logger":"helm.controller","msg":"Watching resource","apiVersion":"demo.example.com/v1alpha1","kind":"DemoApp","reconcilePeriod":"1m0s"}
# {"level":"info","ts":"2026-06-03T08:37:54Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
# {"level":"info","ts":"2026-06-03T08:37:54Z","msg":"starting server","name":"health probe","addr":"[::]:8081"}
# I0603 08:37:54.626521 1 leaderelection.go:257] attempting to acquire leader lease demo-app-operator-system/demo-app-operator...
# I0603 08:37:54.676569 1 leaderelection.go:271] successfully acquired lease demo-app-operator-system/demo-app-operator
# {"level":"info","ts":"2026-06-03T08:37:54Z","msg":"Starting EventSource","controller":"demoapp-controller","source":"kind source: *unstructured.Unstructured"}
# {"level":"info","ts":"2026-06-03T08:37:55Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8443","secure":true}
# {"level":"info","ts":"2026-06-03T08:37:55Z","msg":"Starting Controller","controller":"demoapp-controller"}
# {"level":"info","ts":"2026-06-03T08:37:55Z","msg":"Starting workers","controller":"demoapp-controller","worker count":2}
"msg":"Watching all namespaces" confirms cluster scope (no WATCH_NAMESPACE env var → all namespaces). The line "msg":"Watching resource","kind":"DemoApp","reconcilePeriod":"1m0s" confirms watches.yaml was loaded for the DemoApp Kind. Part 2 covers how to flip cluster-scope to namespace-scope.
Step 5 - First install: apply the CR
kubectl apply -f config/samples/demo_v1alpha1_demoapp.yaml
# demoapp.demo.example.com/demoapp-sample created
Watch the operator's reconcile fire:
kubectl -n demo-app-operator-system logs deploy/demo-app-operator-controller-manager -c manager -f
# ... (startup lines from the previous step) ...
# {"level":"info","ts":"2026-06-03T08:41:47Z","msg":"Starting EventSource","controller":"demoapp-controller","source":"kind source: *unstructured.Unstructured"}
# {"level":"info","ts":"2026-06-03T08:41:47Z","logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"demo.example.com/v1alpha1","ownerKind":"DemoApp","apiVersion":"v1","kind":"Secret"}
# {"level":"info","ts":"2026-06-03T08:41:47Z","logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"demo.example.com/v1alpha1","ownerKind":"DemoApp","apiVersion":"v1","kind":"ConfigMap"}
# {"level":"info","ts":"2026-06-03T08:41:47Z","logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"demo.example.com/v1alpha1","ownerKind":"DemoApp","apiVersion":"v1","kind":"Service"}
# {"level":"info","ts":"2026-06-03T08:41:47Z","logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"demo.example.com/v1alpha1","ownerKind":"DemoApp","apiVersion":"apps/v1","kind":"Deployment"}
# {"level":"info","ts":"2026-06-03T08:41:47Z","logger":"helm.controller","msg":"Installed release","namespace":"default","name":"demoapp-sample","apiVersion":"demo.example.com/v1alpha1","kind":"DemoApp","release":"demoapp-sample"}
# {"level":"info","ts":"2026-06-03T08:41:50Z","logger":"helm.controller","msg":"Reconciled release","namespace":"default","name":"demoapp-sample","apiVersion":"demo.example.com/v1alpha1","kind":"DemoApp","release":"demoapp-sample"}
The four "Watching dependent resource" lines are the operator subscribing to events on every Kind the chart renders (Secret, ConfigMap, Service, Deployment); that is the drift-detection mechanism covered in Part 2. "Installed release" is the actual helm install call returning success. Stop tailing with Ctrl-C.
The chart's resources should be present in default:
kubectl get all,cm,secret -l app.kubernetes.io/name=demoapp-sample
# NAME READY STATUS RESTARTS AGE
# pod/demoapp-sample-799cd75ff5-lvb4w 1/1 Running 0 72s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/demoapp-sample ClusterIP 10.96.196.76 <none> 80/TCP 75s
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/demoapp-sample 1/1 1 1 75s
# NAME DESIRED CURRENT READY AGE
# replicaset.apps/demoapp-sample-799cd75ff5 1 1 1 74s
# NAME DATA AGE
# configmap/demoapp-sample 1 75s
# NAME TYPE DATA AGE
# secret/demoapp-sample Opaque 1 76s
Port-forward and curl the service to confirm the chart actually renders your message:
kubectl port-forward svc/demoapp-sample 8080:80 &
curl localhost:8080
# <html><body><h1>Hello from demo-app</h1></body></html>
kill %1
The Helm release lives as a Secret (Helm's default storage backend):
kubectl get secret -l owner=helm
# NAME TYPE DATA AGE
# sh.helm.release.v1.demoapp-sample.v1 helm.sh/release.v1 1 3m22s
You now have a working pre-built Helm operator. Zero lines of Go. The full lifecycle demo (upgrade, uninstall, drift, hooks, etc.) is in Part 2. The rest of Part 1 makes the CRD safer and explains every knob in watches.yaml.
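To look inside the release Secret listed above, its payload has to be unwrapped. Helm stores the release record as base64-encoded gzipped JSON, and Kubernetes base64-encodes Secret data once more, so two decodes and one gunzip are needed. The helper name decode_release_payload is invented for this sketch; the round-trip at the bottom uses a tiny stand-in payload so it runs without a cluster.

```shell
# decode_release_payload: stdin is the raw .data.release field of a Helm release
# Secret; stdout is the release record as JSON.
decode_release_payload() {
  base64 -d | base64 -d | gzip -dc
}

# Local round-trip with a stand-in payload (not a real release record).
printf '{"name":"demoapp-sample"}' | gzip -c | base64 | base64 | decode_release_payload
echo

# Against the cluster from this article:
#   kubectl get secret sh.helm.release.v1.demoapp-sample.v1 \
#     -o jsonpath='{.data.release}' | decode_release_payload | head -c 300
```

The decoded record includes the rendered manifest and the values used, which makes it a handy debugging aid when a reconcile does not produce what you expect.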
Part B - The CRD and CR (the generated API)
What operator-sdk wrote for you (the permissive default)
Read the generated CRD:
cat config/crd/bases/demo.example.com_demoapps.yaml
You will see something like:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: demoapps.demo.example.com
spec:
  group: demo.example.com
  names:
    kind: DemoApp
    listKind: DemoAppList
    plural: demoapps
    singular: demoapp
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            x-kubernetes-preserve-unknown-fields: true
          status:
            type: object
            x-kubernetes-preserve-unknown-fields: true
    subresources:
      status: {}
Why x-kubernetes-preserve-unknown-fields: true is the default
operator-sdk init --plugins=helm.sdk.operatorframework.io/v1 cannot know the shape of your chart's values.yaml - charts can use arbitrary nested keys, conditionals, and templated values. Rather than guess, it accepts anything. That gives you a working operator on day one but no schema validation at the API server: a typo like replicaCounnt: 3 is accepted, no template reads the misspelled key, and the chart silently falls back to its default.
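While the schema is still permissive, a typo of this kind can be caught with a quick local comparison against the chart's values.yaml. The sketch below is one hedged way to do that in plain shell: the helper names top_level_keys and unknown_keys are invented here, only top-level keys are compared, and the candidate file is assumed to hold the body of spec written at column 1 (values-file style) rather than a full CR manifest.

```shell
# top_level_keys <file>: print keys that start in column 1, e.g. "replicaCount: 1".
top_level_keys() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_-]*\):.*/\1/p' "$1" | sort -u
}

# unknown_keys <values.yaml> <candidate>: keys in the candidate that the chart's
# values.yaml does not define. Nested keys are not checked.
unknown_keys() {
  known=$(top_level_keys "$1")
  for key in $(top_level_keys "$2"); do
    printf '%s\n' "$known" | grep -qxF -- "$key" || printf '%s\n' "$key"
  done
}

# Demo with throwaway files whose contents mirror this article's values.
tmp=$(mktemp -d)
printf 'replicaCount: 1\nmessage: "Hello from demo-app"\n' > "$tmp/values.yaml"
printf 'replicaCounnt: 3\nmessage: "hi"\n' > "$tmp/spec.yaml"
unknown_keys "$tmp/values.yaml" "$tmp/spec.yaml"   # prints: replicaCounnt
rm -rf "$tmp"
```

This is a stopgap for spotting misspelled keys by eye; it does nothing about types, ranges, or nested fields.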
For real use you should tighten the schema.
Tightening the CRD with OpenAPI v3 markers
You want to replace the permissive spec: block inside properties: with an explicit OpenAPI v3 schema. The surrounding fields - apiVersion, kind, metadata (which is just type: object with no further schema), and status (kept permissive on purpose, the operator owns it) - stay as-is.
Because indentation in this YAML is depth-sensitive and easy to get wrong with a surgical replace, the safest path is to overwrite the whole file. Open config/crd/bases/demo.example.com_demoapps.yaml and replace the entire contents with:
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: demoapps.demo.example.com
spec:
  group: demo.example.com
  names:
    kind: DemoApp
    listKind: DemoAppList
    plural: demoapps
    singular: demoapp
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          apiVersion:
            type: string
          kind:
            type: string
          metadata:
            type: object
          spec:
            type: object
            required:
            - replicaCount
            - message
            properties:
              replicaCount:
                type: integer
                minimum: 1
                maximum: 10
                default: 1
              image:
                type: string
                pattern: '^[a-z0-9./:-]+$'
                default: 'nginx:1.27-alpine'
              message:
                type: string
                minLength: 1
                maxLength: 200
              apiKey:
                type: string
                minLength: 8
              service:
                type: object
                properties:
                  type:
                    type: string
                    enum: ["ClusterIP", "NodePort", "LoadBalancer"]
                    default: "ClusterIP"
                  port:
                    type: integer
                    minimum: 1
                    maximum: 65535
                    default: 80
          status:
            type: object
            x-kubernetes-preserve-unknown-fields: true
    subresources:
      status: {}
Worked example - what tightening buys you
Re-apply the CRD:
kubectl apply -f config/crd/bases/demo.example.com_demoapps.yaml
Now bad CRs are rejected at admission and never reach the reconciler:
cat <<EOF | kubectl apply -f -
apiVersion: demo.example.com/v1alpha1
kind: DemoApp
metadata:
  name: bad
spec:
  replicaCount: 99   # exceeds maximum: 10
  message: ""        # violates minLength: 1
  service:
    type: "Invalid"  # not in enum
EOF
# The DemoApp "bad" is invalid:
# * spec.message: Invalid value: "": spec.message in body should be at least 1 chars long
# * spec.replicaCount: Invalid value: 99: spec.replicaCount in body should be less than or equal to 10
# * spec.service.type: Unsupported value: "Invalid": supported values: "ClusterIP", "NodePort", "LoadBalancer"
(The order is alphabetical by field path, not the order you wrote them in.)
This is the single highest-leverage edit in any Helm-based operator project. Without it, the only failure signal you get is a chart render error in the operator log - far too late and far too hidden.
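The same admission check can be run without creating anything by using a server-side dry run. A minimal sketch, assuming the tightened CRD is already applied; the wrapper name validate_cr is invented here, while --dry-run=server is a standard kubectl apply flag.

```shell
# validate_cr <file>: send the manifest through API-server validation (including
# the CRD's OpenAPI schema) without persisting it.
validate_cr() {
  kubectl apply --dry-run=server -f "$1"
}

# Example, once the tightened CRD is applied:
#   validate_cr config/samples/demo_v1alpha1_demoapp.yaml
```

A non-zero exit status means the API server rejected the manifest, which makes the wrapper usable as a CI gate for sample CRs.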
What happens when a user submits invalid spec
Two layers reject invalid input:
- kube-apiserver uses the OpenAPI schema to reject at admission - this is what the example above showed.
- The chart's own template logic can still fail at render time for fields the schema cannot express (e.g., "spec.image must point to an image tag that exists in your registry"). Those failures appear in the operator log and set status.conditions[type=ReleaseFailed].status=True.
Treat the CRD schema as the first line of defense and chart template guards ({{ required "..." }}) as the second.
Rebuild and redeploy after CRD edits
CRD changes do not require rebuilding the operator image - they live in config/crd/bases/ and are applied by make deploy. Re-apply the CRD alone whenever you change validation:
kubectl apply -f config/crd/bases/demo.example.com_demoapps.yaml
You only rebuild the image when the chart or watches.yaml changes (covered in Part C).
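The rule of thumb above (chart and watches.yaml are baked into the image, CRD and RBAC are not) can be written down as a small lookup. This is a sketch only: the function name action_for_path and the action labels are invented here, and paths are assumed to be relative to the operator project root.

```shell
# action_for_path <path>: which step a change to this file calls for,
# following the rules stated in this article.
action_for_path() {
  case "$1" in
    watches.yaml|helm-charts/*) echo "rebuild-image" ;;   # baked into the image
    config/crd/*)               echo "apply-crd" ;;       # kubectl apply is enough
    config/*)                   echo "make-deploy" ;;     # e.g. config/rbac/role.yaml
    *)                          echo "nothing" ;;
  esac
}

for path in helm-charts/demo-app/values.yaml \
            config/crd/bases/demo.example.com_demoapps.yaml \
            config/rbac/role.yaml; do
  printf '%s -> %s\n' "$path" "$(action_for_path "$path")"
done
```

Here rebuild-image stands for the full build, push, deploy, and rollout-status cycle from Step 4.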
Part C - The watches.yaml routing layer
The watches.yaml you saw in Step 3 had only the four required fields. The full schema includes four optional fields that let you tune cadence, control drift behaviour, inject values from the operator level, and filter which CRs each controller picks up.
Schema overview
| Field | Type | Required | Default | Purpose |
|---|---|---|---|---|
| group | string | yes | - | CR API group (e.g. demo.example.com) |
| version | string | yes | - | CR API version (e.g. v1alpha1) |
| kind | string | yes | - | CR Kind (e.g. DemoApp) |
| chart | string (path) | yes | - | Local chart directory inside the operator image |
| reconcilePeriod | duration | no | 1m | Cadence of periodic resync per CR |
| watchDependentResources | bool | no | true | Watch chart-rendered resources, reconcile on their changes |
| overrideValues | map[string]any | no | {} | Operator-level value injection (highest precedence) |
| selector | LabelSelector | no | {} | Only CRs matching these labels are handled by this controller |
A complete example with every field present:
- group: demo.example.com
  version: v1alpha1
  kind: DemoApp
  chart: helm-charts/demo-app
  reconcilePeriod: 30s
  watchDependentResources: true
  overrideValues:
    image: "nginx:1.27-alpine"
    registryMirror: "$REGISTRY_MIRROR"
  selector:
    matchLabels:
      tier: production
group, version, kind, chart - the required four
These bind a CR Kind to a chart. They are exactly what operator-sdk init --plugins=helm.sdk.operatorframework.io/v1 filled in for you.
- chart is a path inside the operator image (the Dockerfile copies your helm-charts/ into /opt/helm/helm-charts/). Remote chart URLs are not supported - rebuild the image when the chart changes.
- Changing any of group, version, kind is an API break; users have to migrate their CRs.
reconcilePeriod - cadence
Every CR is reconciled on:
- Every event on the CR itself (create/update/delete).
- Every event on a chart-rendered resource (if watchDependentResources: true).
- Periodically, every reconcilePeriod.
The periodic resync exists as a safety net - it re-renders the chart and re-applies anything missing, even if no event fired. The default 1m is reasonable for tens of CRs; with hundreds you should bump it to 5m or 10m to keep CPU sane. The trade-off is detection latency for "silent" drift (the kind events don't catch).
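The trade-off can be put into rough numbers. The sketch below estimates only the load from the periodic resync (each CR reconciled once per period); event-driven reconciles come on top of that. The function name and the figure of 300 CRs are illustrative assumptions.

```shell
# Rough steady-state load from the periodic resync alone:
# each CR is reconciled once per period, so rate = CRs * 60 / period_seconds.
# Usage: resyncs_per_minute <number-of-CRs> <reconcilePeriod-in-seconds>
resyncs_per_minute() {
  echo $(( $1 * 60 / $2 ))
}

resyncs_per_minute 300 60    # 300 CRs at the 1m default -> prints 300
resyncs_per_minute 300 300   # the same CRs at 5m        -> prints 60
```

Every one of those reconciles re-renders the chart, which is why stretching the period pays off quickly once the CR count climbs.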
The drift demo and full tuning guidance are in Part 2.
watchDependentResources - drift on/off
When true (default), the operator subscribes to events for the resource types its chart renders. If somebody kubectl edits a chart-rendered ConfigMap, the operator wakes up and re-renders, reverting the change.
When false, the operator only reacts to events on the CR itself plus the periodic resync. Drift is only corrected on the resync cadence - useful for debugging or when you intentionally want a stable window for manual intervention.
watchDependentResources: false  # only react to CR events + periodic resync
overrideValues - operator-level value injection
Values you set here have higher precedence than the CR's .spec and higher precedence than the chart's values.yaml defaults. Use them for:
- Pinning fields the user should not override (image registry, security contexts).
- Injecting env-var-sourced values (per-environment defaults from the operator pod's environment).
- Cluster-wide labels you want on every release.
Static example - force a specific image tag for all CRs:
overrideValues:
  image: "nginx:1.27-alpine"
Env-var substitution example - the operator pod sets REGISTRY_MIRROR via env, and the value flows into every CR's reconcile:
overrideValues:
  registryMirror: "$REGISTRY_MIRROR"
  imagePullPolicy: "$IMAGE_PULL_POLICY"
The supported substitution syntax is intentionally minimal:
| Form | Meaning |
|---|---|
| $VAR | Substitute env var VAR. If unset, resolves to an empty string (no error). |
| ${VAR} | Same as $VAR. |
| '{{ env "VAR" }}' | Go-template form via Sprig. Same empty-string-on-unset behaviour as $VAR. |
| '{{ default "x" (env "VAR") }}' | Go-template form with a fallback. The only supported way to get a default. |
Shell-style fallback is NOT supported.
${VAR:-default} (the bash idiom) is not recognised by the helm-operator. If you write it, the substitution silently fails and the value becomes an empty string - which then overrides the chart's own default. Always set the env var on the operator pod, or use the '{{ default ... (env "VAR") }}' Go-template form.
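Because the failure is silent, it may help to catch the bash idiom before the image is built. The following is a small lint sketch, not something the scaffold ships: check_no_shell_fallback is a hypothetical function, and the pattern only looks for the ${NAME:- form.

```shell
# Hypothetical pre-build lint: flag shell-style fallbacks such as
# ${VAR:-default} in a watches.yaml.
# Returns 0 when the file is clean, 1 when a fallback is found.
# Usage: check_no_shell_fallback <path-to-watches.yaml>
check_no_shell_fallback() {
  # [$][{] matches a literal "${"; the rest matches NAME followed by ":-".
  if grep -n '[$][{][A-Za-z_][A-Za-z0-9_]*:-' "$1"; then
    echo "shell-style fallback found in $1" >&2
    return 1
  fi
}
```

Running check_no_shell_fallback watches.yaml ahead of make docker-build prints the offending line numbers and returns non-zero, so it can gate the build in a script or CI step.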
To set the env vars, edit config/manager/manager.yaml:
env:
  - name: REGISTRY_MIRROR
    value: "registry.internal/proxy"
  - name: IMAGE_PULL_POLICY
    value: "Always"
Part 2 covers the full precedence rules (overrideValues > CR .spec > chart values.yaml) and patterns like per-environment defaults and secret-handling.
selector - label-based CR filtering
With selector, the operator only reconciles CRs whose labels match. The use case is multi-tenancy with a single operator binary: run two copies of the operator with different selectors, each handling a different tenant's CRs.
selector:
  matchLabels:
    tier: production
A DemoApp CR labeled tier: production would be picked up; one labeled tier: staging would not. Without the label or with a different label, the CR is silently ignored by this controller. Full multi-tenant patterns (three options including selector) are in Part 2.
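Before relying on a selector, it can be useful to preview which existing CRs carry the label. The sketch below only mirrors the matchLabels rule with kubectl's own label selectors; it does not ask the operator what it actually reconciles. preview_selector is a hypothetical helper, and the label key tier is taken from the example above.

```shell
# Hypothetical helper: list DemoApp CRs that a matchLabels selector of
# tier=<value> would select, then the ones it would skip.
# "tier!=<value>" also matches CRs that have no tier label at all.
# Usage: preview_selector <tier-value>
preview_selector() {
  echo "selected:"
  kubectl get demoapp -A -l "tier=$1" -o name
  echo "skipped:"
  kubectl get demoapp -A -l "tier!=$1" -o name
}
```

preview_selector production prints the two groups across all namespaces, which makes an unlabelled CR easy to spot before it is silently ignored.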
⚠️ Known regression - selector is ignored in cluster-scoped mode. Since helm-operator v1.34 there is a confirmed bug where the selector field is silently dropped when WATCH_NAMESPACE is empty (i.e. the default cluster-scoped configuration). Every CR of the watched Kind gets reconciled regardless of labels. Two workarounds: (1) set WATCH_NAMESPACE to a specific namespace on the manager pod (this turns the operator into a namespace-scoped one - Part 2 walks that flip), or (2) add the label helm.sdk.operatorframework.io/chart: <chart-name> to every CR you want reconciled (hacky but works cluster-wide). The Part C worked example below sets WATCH_NAMESPACE: "default" so that selector actually filters. If you need true cluster scope, remove that env var, but the selector will then not behave as documented above.
Multi-Kind operator - one operator, two charts
watches.yaml is a list. One operator can manage multiple Kinds, each mapped to its own chart:
- group: demo.example.com
  version: v1alpha1
  kind: DemoApp
  chart: helm-charts/demo-app
- group: demo.example.com
  version: v1alpha1
  kind: WorkerApp
  chart: helm-charts/worker-app
  reconcilePeriod: 5m
  watchDependentResources: true
You scaffold the second Kind with operator-sdk create api:
operator-sdk create api \
  --group demo \
  --version v1alpha1 \
  --kind WorkerApp \
  --helm-chart=../worker-app
This adds a second entry to watches.yaml, copies the chart into helm-charts/worker-app/, and adds a CRD for WorkerApp. The same operator pod runs both controllers in-process.
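Once the new CRD has been applied to the cluster, a quick way to confirm that both Kinds are served is to list the resources in the API group. A minimal sketch; kinds_in_group is a hypothetical helper name:

```shell
# Hypothetical helper: list the resource types the cluster serves for an API group.
# Usage: kinds_in_group <api-group>
kinds_in_group() {
  kubectl api-resources --api-group="$1" -o name
}
```

kinds_in_group demo.example.com should print one entry per Kind registered under that group; a missing entry suggests the second CRD was never applied.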
Worked example - exercise the optional fields end-to-end
The schema and per-field notes above describe what each optional field does. This section is the hands-on counterpart: edit watches.yaml, rebuild, redeploy, and observe reconcilePeriod, overrideValues, and selector in action with three concrete checks. (watchDependentResources gets its own dedicated drift demos in Part 2 — leaving it at the default true here.)
1. Edit watches.yaml
Replace the four-line minimal watches.yaml with this:
- group: demo.example.com
  version: v1alpha1
  kind: DemoApp
  chart: helm-charts/demo-app
  reconcilePeriod: 10s
  overrideValues:
    apiKey: "$SHARED_DEMO_KEY"
  selector:
    matchLabels:
      tier: demo
What this changes:
- reconcilePeriod: 10s - speeds up the periodic safety-net resync from the 1m0s default so it's easy to observe at startup.
- overrideValues.apiKey: "$SHARED_DEMO_KEY" - overrides whatever apiKey the CR sets (or doesn't set) with the value of the SHARED_DEMO_KEY env var on the operator pod. (Remember: no ${VAR:-default} shell syntax - set the env var or the override resolves to empty.)
- selector.matchLabels.tier: demo - only CRs labelled tier: demo will be reconciled by this controller.
2. Add the env vars to the operator pod
Open config/manager/manager.yaml, find the containers: block, and add this env: block under the manager container (SHARED_DEMO_KEY feeds the override; WATCH_NAMESPACE: "default" is required for the selector to actually filter — see the regression note in the selector section above):
env:
  - name: SHARED_DEMO_KEY
    value: "operator-supplied-key-12345"
  - name: WATCH_NAMESPACE
    value: "default"
3. Rebuild, push, redeploy
watches.yaml is baked into the operator image at make docker-build time, so changing it needs a new image; manager.yaml is a Kustomize manifest that make deploy re-applies to the cluster. Bump the tag (or generate a fresh ttl.sh URL), push, and redeploy:
export IMG=ttl.sh/demoapp-$(uuidgen):24h # new URL for the v0.1.1 iteration
make docker-build IMG="$IMG"
docker push "$IMG"
make deploy IMG="$IMG"
kubectl -n demo-app-operator-system rollout status \
  deploy/demo-app-operator-controller-manager
# deployment "demo-app-operator-controller-manager" successfully rolled out
A fresh UUID per build keeps the kubelet from caching a stale image under the same tag - saves an imagePullPolicy: Always patch. If you'd rather keep one URL across iterations, export IMG once and re-push to the same tag, but you'll want imagePullPolicy: Always on the operator Deployment.
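To double-check that the Deployment now references the image you just pushed, the image field can be read back and compared with $IMG. A minimal sketch, assuming the namespace, Deployment, and container names used in this tutorial; running_operator_image is a hypothetical helper:

```shell
# Hypothetical helper: print the image configured for the manager container
# in the operator Deployment (the spec, not the running pod).
running_operator_image() {
  kubectl -n demo-app-operator-system get deploy demo-app-operator-controller-manager \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].image}'
}
```

A comparison such as [ "$(running_operator_image)" = "$IMG" ] && echo "new image is configured" covers the spec side; the rollout status command above remains the check that the new pod is actually running.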
Check 1 - reconcilePeriod is what you set
Tail the operator startup log:
kubectl -n demo-app-operator-system logs deploy/demo-app-operator-controller-manager \
  -c manager | grep -i 'reconcilePeriod'
# {"level":"info","ts":"...","logger":"helm.controller","msg":"Watching resource",
#  "apiVersion":"demo.example.com/v1alpha1","kind":"DemoApp","reconcilePeriod":"10s"}
"reconcilePeriod":"10s" confirms the value flowed from watches.yaml into the runtime controller config. (Full drift / cadence demos are in Part 2 - this is just the "it took effect" check.)
Check 2 - selector filters CRs
Create two CRs in the same namespace - only picked carries the tier: demo label:
kubectl apply -f - <<'EOF'
apiVersion: demo.example.com/v1alpha1
kind: DemoApp
metadata:
  name: picked
  labels:
    tier: demo        # matches selector
spec:
  replicaCount: 1
  image: nginx:1.27-alpine
  message: "Hello from the picked CR"
---
apiVersion: demo.example.com/v1alpha1
kind: DemoApp
metadata:
  name: ignored
  # no labels -> does NOT match selector
spec:
  replicaCount: 1
  image: nginx:1.27-alpine
  message: "Should not be reconciled"
EOF
Both CRs exist, but only picked should have chart resources behind it:
kubectl get demoapp
# NAME AGE
# ignored 8s
# picked 8s
kubectl get deploy,svc,cm,secret -l app.kubernetes.io/name=picked
# deployment.apps/picked 1/1 1 1 20s
# service/picked ClusterIP ...
# configmap/picked 1 20s
# secret/picked Opaque 1 20s
kubectl get deploy,svc,cm,secret -l app.kubernetes.io/name=ignored
# No resources found in default namespace.
The ignored CR exists in etcd but the controller filtered it out before reconcile. The operator log confirms it:
kubectl -n demo-app-operator-system logs deploy/demo-app-operator-controller-manager \
  -c manager | grep -E 'picked|ignored' | head -3
# {"level":"info","ts":"...","msg":"Starting Reconcile","name":"picked","namespace":"default"}
# {"level":"info","ts":"...","msg":"Reconciled release","name":"picked","namespace":"default"}
# (no "ignored" lines - the predicate dropped the event before it reached Reconcile)
Check 3 - overrideValues wins over the CR
The picked CR did not set spec.apiKey (the chart's default changeme would normally apply). But overrideValues.apiKey in watches.yaml injected the env-var-sourced value at higher precedence. Confirm by base64-decoding the Secret:
kubectl get secret picked -o jsonpath='{.data.api-key}' | base64 -d
# operator-supplied-key-12345
Not changeme (the chart's values.yaml default), but operator-supplied-key-12345 - the value of the operator pod's SHARED_DEMO_KEY env var, substituted for $SHARED_DEMO_KEY in watches.yaml. This is the override + env-var-substitution path the section above described, end-to-end. You can also confirm what Helm received by running helm get values picked -n default - the USER-SUPPLIED VALUES block should show apiKey: operator-supplied-key-12345. Part 2 walks the full precedence rules (overrideValues > CR .spec > chart values.yaml) with more patterns.
Confirming the override visibly. The framework also emits one Warning Event of reason OverrideValuesInUse per overridden field, on the CR's Event stream - handy for explaining to CR authors why their value was ignored:
kubectl get events --field-selector involvedObject.name=picked --sort-by=.lastTimestamp
# LAST SEEN   TYPE      REASON                OBJECT           MESSAGE
# 12s         Warning   OverrideValuesInUse   demoapp/picked   Chart value "apiKey" overridden to "operator-supplied-key-12345" by operator's watches.yaml
Cleanup
kubectl delete demoapp picked ignored --ignore-not-found
Recommended before starting Part 2: revert watches.yaml to the minimal four required fields and remove both SHARED_DEMO_KEY and WATCH_NAMESPACE from manager.yaml's env: block, then rebuild + redeploy with a fresh IMG=ttl.sh/demoapp-$(uuidgen):24h and re-apply config/samples/demo_v1alpha1_demoapp.yaml. Part 2 assumes the operator is cluster-scoped with a vanilla watches.yaml and a single demoapp-sample CR running.
Pitfalls per field
| Pitfall | Why it hurts |
|---|---|
| Editing the chart but not running make docker-build + docker push | The new templates never reach the running operator |
| Setting reconcilePeriod: 5s with hundreds of CRs | Operator CPU pegs; the API server takes the punishment too |
| Putting a remote chart URL in chart: | The operator boots, fails to find the path on disk, and crashes |
| Using ${VAR:-default} in overrideValues | Shell-style fallback is not supported; the substitution silently becomes "" |
| Forgetting to declare env vars in manager.yaml for $VAR substitution | The variable resolves to an empty string, which then overrides the chart default |
| Leaving WATCH_NAMESPACE unset and using selector | Known regression - selector is ignored in cluster-scoped mode |
| Two watches.yaml entries with overlapping selector for the same Kind | Both controllers fight over the same CRs; status flaps |
| Switching watchDependentResources: false "to debug" and forgetting | Drift goes uncorrected until next periodic resync |
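The remote-URL pitfall can be caught before the image is built by checking that every chart: entry points at a directory in the project. The sketch below is a deliberately simple lint, not part of the scaffold: check_chart_paths is a hypothetical function, and it assumes chart: values are unquoted, contain no spaces, and carry no trailing comments.

```shell
# Hypothetical pre-build lint: verify that every "chart:" value in a
# watches.yaml is an existing directory relative to the current directory.
# Returns 0 when all paths exist, 1 otherwise.
# Usage (from the project root): check_chart_paths watches.yaml
check_chart_paths() {
  status=0
  # Strip leading spaces/dashes and the "chart:" key, keeping only the value.
  for chart in $(sed -n 's/^[[:space:]-]*chart:[[:space:]]*//p' "$1"); do
    if [ ! -d "$chart" ]; then
      echo "chart path not found: $chart" >&2
      status=1
    fi
  done
  return "$status"
}
```

A URL or a mistyped directory fails the -d test and is reported on stderr, so the check can run right before make docker-build.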
What's next - Part 2
You now have a working Helm-based operator with a tight CRD and a fully understood watches.yaml. Part 2 - Lifecycle, drift, hooks, scope, and the hard ceiling picks up here and covers:
- Lifecycle: upgrade the CR and see helm upgrade run, delete the CR and see helm uninstall cascade.
- Values mapping: full precedence rules (overrideValues > CR .spec > chart values.yaml), env-var substitution patterns, secret-handling patterns.
- Drift detection: edit a ConfigMap with kubectl edit, watch the operator revert it within seconds; same for deletion.
- Helm hooks: pre/post install/upgrade/delete Jobs - the pre-built operator's only escape hatch for "do something custom around the chart."
- Scope and multi-tenancy: flip from cluster-scoped to namespace-scoped (WATCH_NAMESPACE + RBAC swap), and three options for multi-tenant deployments including selector.
- The hard ceiling: the features the pre-built operator cannot provide - custom finalizer logic against external systems, custom status fields, cross-CR coordination, reading external state in reconcile. Each entry links to the Helm hybrid operator where the ceiling is broken.
If you only need "install the chart on every CR," you may never need Part 2. If you need anything beyond that - even drift recovery on its own - read it.
Further reading
- Helm operator vs Flux vs Argo CD - when each tool is the right pick.
- Helm hybrid operator (Go + Helm SDK) - same chart, your own Go reconciler, none of the ceiling.
- Custom Resource Definitions explained - the language-neutral CRD reference.
- Operator-SDK install on Linux - if you skipped the prerequisite.
- Helm docs - chart template guide - everything chart authors should know (applies to both Helm 3 and Helm 4 charts).
Summary
The Helm-based operator gives you a Kubernetes operator with zero lines of Go: a Helm chart, a CRD, and a watches.yaml mapping the two. The pre-built reconciler ships from Operator SDK and handles install/upgrade/uninstall on every CR event. Part 1 walked you from an empty directory to a deployed operator with a tightened CRD and a fully understood watches.yaml - the four required fields plus the four optional knobs (reconcilePeriod, watchDependentResources, overrideValues, selector). The single most valuable edit you can make is tightening the CRD's schema; the permissive x-kubernetes-preserve-unknown-fields: true default is fine for day one and dangerous for day ten. Part 2 picks up where Part 1 ends: lifecycle, drift, Helm hooks, scope, and the hard ceiling beyond which only a Helm hybrid operator will do.

