Kubernetes Operator with controller-runtime: Status, Finalizers, Webhooks, and Drift

This article continues the Go-based Kubernetes Operator tutorial. The foundation article ended with a working DemoApp operator that reconciles one Deployment.

That foundation is useful, but real operators rarely stop at one object. A practical operator usually manages several children, publishes status, handles deletion safely, reacts to drift, watches referenced resources, and rejects bad custom resources before they reach the reconciler.

This controller-runtime tutorial adds those core capabilities to the same DemoApp operator.

By the end, DemoApp will manage:

  • a ConfigMap containing app configuration
  • a Service exposing the app
  • a Deployment running the app
  • status conditions that tell users what happened
  • a finalizer that performs cleanup before deletion
  • watches on owned resources and referenced Secrets
  • mutating and validating webhooks
  • Kubernetes Events for important changes

This article intentionally links to the deeper concept articles where appropriate. The goal here is a tutorial path, not a second copy of every theory page.

Go operator series (3 parts): Part 1 — Operator SDK foundation · Part 2 — controller-runtime (this page) · Part 3 — envtest, fake client, kind · Operator tutorial hub


Why this article matters

A basic Operator SDK controller is easy to scaffold. A useful Kubernetes operator is harder because it has to behave well when the cluster changes underneath it.

Operators rarely face only the question of whether a controller can create a Deployment. Typical questions include:

  • Why does a custom resource show no status?
  • How are external resources cleaned up before a CR is deleted?
  • How does an operator recreate a ConfigMap after someone edits it manually?
  • How can a controller watch a Secret it references but does not own?
  • Should validation live in the CRD schema, in a webhook, or in the reconciler?
  • Why does the reconcile loop run repeatedly?
  • Which RBAC permissions does the controller actually need?

Those are the questions this article answers in code. The examples use the official controller-runtime building blocks documented in the controller-runtime project, the admission webhook model documented by Kubernetes in dynamic admission control, and the finalizer behavior documented in the Kubernetes finalizers concept page.

This article is the bridge between a scaffolded Go operator and an operator that a platform team could use as a starting point.


Starting point

You should already have the project from the Go Operator SDK foundation tutorial:

text
~/operators/demoapp-operator
├── api/v1alpha1/demoapp_types.go
├── internal/controller/demoapp_controller.go
├── config/samples/demo_v1alpha1_demoapp.yaml
└── config/

Check that the foundation project still works:

bash
make generate
make manifests
make install
make run

In another terminal:

bash
kubectl apply -f config/samples/demo_v1alpha1_demoapp.yaml
kubectl get demoapp hello
kubectl get deployment hello

Stop make run before editing.


Step 1 - Expand the DemoApp API

Update DemoAppSpec in api/v1alpha1/demoapp_types.go:

go
type DemoAppSpec struct {
	// Image is the container image used by the DemoApp Deployment.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Image string `json:"image"`

	// Replicas is the desired number of application Pods.
	//
	// +kubebuilder:default=1
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=10
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`

	// Port is the application port exposed by the container and Service.
	//
	// +kubebuilder:default=8080
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=65535
	// +optional
	Port int32 `json:"port,omitempty"`

	// Message is stored in a ConfigMap and injected as DEMO_MESSAGE.
	//
	// +kubebuilder:default="hello from DemoApp"
	// +optional
	Message string `json:"message,omitempty"`

	// ConfigVersion is copied into the ConfigMap. Changing it gives users
	// a simple knob to trigger config rollout behavior.
	//
	// +kubebuilder:default="v1"
	// +kubebuilder:validation:MinLength=1
	// +optional
	ConfigVersion string `json:"configVersion,omitempty"`

	// ApiKeySecretName is an optional Secret name. When set, the operator
	// watches that Secret and injects API_KEY from key "api-key".
	//
	// +optional
	ApiKeySecretName string `json:"apiKeySecretName,omitempty"`
}

Update DemoAppStatus:

go
type DemoAppStatus struct {
	// Conditions summarize the current state for humans and automation.
	// +optional
	Conditions []metav1.Condition `json:"conditions,omitempty"`

	// ObservedGeneration is the latest metadata.generation processed by the controller.
	// +optional
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`

	// ReadyReplicas is copied from the Deployment status.
	// +optional
	ReadyReplicas int32 `json:"readyReplicas,omitempty"`

	// ServiceName is the Service created for this app.
	// +optional
	ServiceName string `json:"serviceName,omitempty"`

	// ConfigMapName is the ConfigMap created for this app.
	// +optional
	ConfigMapName string `json:"configMapName,omitempty"`
}

The CR now describes a small application, not just a container image. This is the shape most users expect from a real operator: one high-level API, several lower-level Kubernetes resources.

Run:

bash
make generate
make manifests

Step 2 - Add condition helpers

Create internal/controller/status.go:

go
package controller

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	demov1alpha1 "github.com/example/demoapp-operator/api/v1alpha1"
)

const (
	ConditionAvailable   = "Available"
	ConditionProgressing = "Progressing"
	ConditionDegraded    = "Degraded"
)

func markProgressing(app *demov1alpha1.DemoApp, reason, message string) {
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:               ConditionProgressing,
		Status:             metav1.ConditionTrue,
		Reason:             reason,
		Message:            message,
		ObservedGeneration: app.Generation,
	})
}

func markAvailable(app *demov1alpha1.DemoApp, reason, message string) {
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:               ConditionAvailable,
		Status:             metav1.ConditionTrue,
		Reason:             reason,
		Message:            message,
		ObservedGeneration: app.Generation,
	})
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:               ConditionProgressing,
		Status:             metav1.ConditionFalse,
		Reason:             reason,
		Message:            message,
		ObservedGeneration: app.Generation,
	})
	meta.RemoveStatusCondition(&app.Status.Conditions, ConditionDegraded)
}

func markDegraded(app *demov1alpha1.DemoApp, reason, message string) {
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:               ConditionDegraded,
		Status:             metav1.ConditionTrue,
		Reason:             reason,
		Message:            message,
		ObservedGeneration: app.Generation,
	})
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:               ConditionAvailable,
		Status:             metav1.ConditionFalse,
		Reason:             reason,
		Message:            message,
		ObservedGeneration: app.Generation,
	})
}

Use the Kubernetes-standard condition shape. Avoid custom status fields like phase: Running as the only signal; conditions are easier for users and automation to inspect.

For the deeper model, see Kubernetes Status Subresource and Conditions.


Step 3 - Add desired-state builders for ConfigMap and Service

Create internal/controller/resources.go:

go
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/ptr"

	demov1alpha1 "github.com/example/demoapp-operator/api/v1alpha1"
)

func labelsFor(app *demov1alpha1.DemoApp) map[string]string {
	return map[string]string{
		"app.kubernetes.io/name":       "demoapp",
		"app.kubernetes.io/instance":   app.Name,
		"app.kubernetes.io/managed-by": "demoapp-operator",
	}
}

func desiredReplicas(app *demov1alpha1.DemoApp) int32 {
	if app.Spec.Replicas == nil {
		return 1
	}
	return *app.Spec.Replicas
}

func desiredPort(app *demov1alpha1.DemoApp) int32 {
	if app.Spec.Port == 0 {
		return 8080
	}
	return app.Spec.Port
}

func desiredMessage(app *demov1alpha1.DemoApp) string {
	if app.Spec.Message == "" {
		return "hello from DemoApp"
	}
	return app.Spec.Message
}

func desiredConfigVersion(app *demov1alpha1.DemoApp) string {
	if app.Spec.ConfigVersion == "" {
		return "v1"
	}
	return app.Spec.ConfigVersion
}

func buildConfigMap(app *demov1alpha1.DemoApp) *corev1.ConfigMap {
	return &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name + "-config",
			Namespace: app.Namespace,
			Labels:    labelsFor(app),
		},
		Data: map[string]string{
			"message":       desiredMessage(app),
			"configVersion": desiredConfigVersion(app),
		},
	}
}

func buildService(app *demov1alpha1.DemoApp) *corev1.Service {
	labels := labelsFor(app)
	return &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name,
			Namespace: app.Namespace,
			Labels:    labels,
		},
		Spec: corev1.ServiceSpec{
			Type: corev1.ServiceTypeClusterIP,
			Selector: map[string]string{
				"app.kubernetes.io/instance": app.Name,
			},
			Ports: []corev1.ServicePort{
				{
					Name: "http",
					Port: desiredPort(app),
				},
			},
		},
	}
}

func buildDeployment(app *demov1alpha1.DemoApp) *appsv1.Deployment {
	labels := labelsFor(app)
	port := desiredPort(app)

	env := []corev1.EnvVar{
		{
			Name: "DEMO_MESSAGE",
			ValueFrom: &corev1.EnvVarSource{
				ConfigMapKeyRef: &corev1.ConfigMapKeySelector{
					LocalObjectReference: corev1.LocalObjectReference{
						Name: app.Name + "-config",
					},
					Key: "message",
				},
			},
		},
	}

	if app.Spec.ApiKeySecretName != "" {
		env = append(env, corev1.EnvVar{
			Name: "API_KEY",
			ValueFrom: &corev1.EnvVarSource{
				SecretKeyRef: &corev1.SecretKeySelector{
					LocalObjectReference: corev1.LocalObjectReference{
						Name: app.Spec.ApiKeySecretName,
					},
					Key:      "api-key",
					Optional: ptr.To(true),
				},
			},
		})
	}

	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name,
			Namespace: app.Namespace,
			Labels:    labels,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: ptr.To(desiredReplicas(app)),
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{
					"app.kubernetes.io/instance": app.Name,
				},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: labels,
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{
							Name:  "app",
							Image: app.Spec.Image,
							Ports: []corev1.ContainerPort{
								{
									Name:          "http",
									ContainerPort: port,
								},
							},
							Env: env,
						},
					},
				},
			},
		},
	}
}

This file keeps resource construction separate from reconciliation. That matters because the testing tutorial can test these builders without starting a Kubernetes API server.
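One gap worth knowing about: the builders copy configVersion into the ConfigMap, but environment variables sourced from a ConfigMap are resolved when a container starts, so changing the ConfigMap alone does not restart Pods. A common pattern, sketched here as an assumption rather than something the tutorial code does, is to compute a hash of the ConfigMap data and stamp it on the Pod template as an annotation so that a data change alters the template and triggers a rollout. The configHash helper and the annotation key demo.example.com/config-hash are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// configHash returns a stable digest of ConfigMap-style data. Keys are
// sorted first because Go map iteration order is not deterministic.
func configHash(data map[string]string) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// Length prefixes keep ("ab","c") and ("a","bc") from colliding.
		fmt.Fprintf(h, "%d:%s=%d:%s;", len(k), k, len(data[k]), data[k])
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	v1 := map[string]string{"message": "hello from DemoApp", "configVersion": "v1"}
	v2 := map[string]string{"message": "hello from DemoApp", "configVersion": "v2"}

	// Hypothetical use: set this value under the annotation key
	// "demo.example.com/config-hash" on the Pod template in buildDeployment.
	fmt.Println(configHash(v1) == configHash(v1)) // true: same input, same hash
	fmt.Println(configHash(v1) == configHash(v2)) // false: bumping configVersion changes it
}
```

Because the function is pure, it fits the same no-API-server testing approach as the other builders.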


Step 4 - Reconcile multiple child resources

Update internal/controller/demoapp_controller.go:

go
package controller

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/log"

	demov1alpha1 "github.com/example/demoapp-operator/api/v1alpha1"
)

const demoAppFinalizer = "demo.example.com/finalizer"

type DemoAppReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

// +kubebuilder:rbac:groups=demo.example.com,resources=demoapps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=demo.example.com,resources=demoapps/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=demo.example.com,resources=demoapps/finalizers,verbs=update
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=services,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=configmaps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=events,verbs=create;patch

func (r *DemoAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	var app demov1alpha1.DemoApp
	if err := r.Get(ctx, req.NamespacedName, &app); err != nil {
		if apierrors.IsNotFound(err) {
			return ctrl.Result{}, nil
		}
		return ctrl.Result{}, err
	}
	original := app.DeepCopy()

	if !app.ObjectMeta.DeletionTimestamp.IsZero() {
		return r.reconcileDelete(ctx, &app)
	}

	if controllerutil.AddFinalizer(&app, demoAppFinalizer) {
		if err := r.Update(ctx, &app); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{}, nil
	}

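	// Conditions are set once, after the children are reconciled. Flipping
	// Progressing to True here and back to False in markAvailable would reset
	// lastTransitionTime on every pass, so the status would never compare
	// equal to the original and every reconcile would write status again.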

	if err := r.reconcileConfigMap(ctx, &app); err != nil {
		return r.failStatus(ctx, &app, "ConfigMapFailed", err)
	}
	if err := r.reconcileService(ctx, &app); err != nil {
		return r.failStatus(ctx, &app, "ServiceFailed", err)
	}
	deployment, err := r.reconcileDeployment(ctx, &app)
	if err != nil {
		return r.failStatus(ctx, &app, "DeploymentFailed", err)
	}

	app.Status.ObservedGeneration = app.Generation
	app.Status.ConfigMapName = app.Name + "-config"
	app.Status.ServiceName = app.Name
	if deployment != nil {
		app.Status.ReadyReplicas = deployment.Status.ReadyReplicas
	}

	if deployment != nil && deployment.Status.ReadyReplicas >= desiredReplicas(&app) {
		markAvailable(&app, "DeploymentReady", "All requested replicas are ready")
	} else {
		markProgressing(&app, "WaitingForReplicas", "Waiting for Deployment replicas to become ready")
	}

	if !equality.Semantic.DeepEqual(original.Status, app.Status) {
		if err := r.Status().Update(ctx, &app); err != nil {
			return ctrl.Result{}, err
		}
	}

	logger.Info("reconciled DemoApp", "name", app.Name, "namespace", app.Namespace)
	return ctrl.Result{}, nil
}

func (r *DemoAppReconciler) reconcileConfigMap(ctx context.Context, app *demov1alpha1.DemoApp) error {
	cm := buildConfigMap(app)
	return r.createOrUpdateOwned(ctx, app, cm, func() {
		desired := buildConfigMap(app)
		cm.Labels = desired.Labels
		cm.Data = desired.Data
	})
}

func (r *DemoAppReconciler) reconcileService(ctx context.Context, app *demov1alpha1.DemoApp) error {
	svc := buildService(app)
	return r.createOrUpdateOwned(ctx, app, svc, func() {
		desired := buildService(app)
		svc.Labels = desired.Labels
		svc.Spec.Type = desired.Spec.Type
		svc.Spec.Selector = desired.Spec.Selector
		svc.Spec.Ports = desired.Spec.Ports
	})
}

func (r *DemoAppReconciler) reconcileDeployment(ctx context.Context, app *demov1alpha1.DemoApp) (*appsv1.Deployment, error) {
	deploy := buildDeployment(app)
	err := r.createOrUpdateOwned(ctx, app, deploy, func() {
		desired := buildDeployment(app)
		deploy.Labels = desired.Labels
		deploy.Spec = desired.Spec
	})
	return deploy, err
}

func (r *DemoAppReconciler) createOrUpdateOwned(
	ctx context.Context,
	app *demov1alpha1.DemoApp,
	obj client.Object,
	mutate func(),
) error {
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, obj, func() error {
		mutate()
		return controllerutil.SetControllerReference(app, obj, r.Scheme)
	})
	return err
}

func (r *DemoAppReconciler) failStatus(
	ctx context.Context,
	app *demov1alpha1.DemoApp,
	reason string,
	err error,
) (ctrl.Result, error) {
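	markDegraded(app, reason, err.Error())
	// Return the original error to the caller; a failed status write is
	// logged instead of silently discarded.
	if statusErr := r.Status().Update(ctx, app); statusErr != nil {
		log.FromContext(ctx).Error(statusErr, "unable to update DemoApp status", "reason", reason)
	}
	return ctrl.Result{}, err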
}

The reconciler now does three separate writes:

  1. ConfigMap
  2. Service
  3. Deployment

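That order is deliberate. The Deployment references the ConfigMap, and the Service selects Pods created by the Deployment. Kubernetes tolerates any order here, because Pods that reference a missing ConfigMap wait until it exists and a Service with no matching Pods simply has no endpoints. Creating dependencies first still avoids transient startup errors and is easier for human readers to follow.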

This pattern is covered more deeply in Multi-Resource Reconciliation.
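Each of the three writes follows the same read-mutate-compare-write flow. The sketch below models that flow with an in-memory map standing in for the API server; it is a simplified illustration of the idea behind controllerutil.CreateOrUpdate, not its implementation, and the object, store, and createOrUpdate names are invented for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// object is a stand-in for a child resource such as a ConfigMap.
type object struct {
	Name string
	Data map[string]string
}

// store is a stand-in for the API server.
type store map[string]object

func copyMap(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// createOrUpdate fetches the live object if it exists, applies the desired
// fields through mutate, and writes only when the result differs.
func createOrUpdate(s store, name string, mutate func(*object)) string {
	live, found := s[name]
	if !found {
		obj := object{Name: name}
		mutate(&obj)
		s[name] = obj
		return "created"
	}
	updated := object{Name: live.Name, Data: copyMap(live.Data)}
	mutate(&updated)
	if reflect.DeepEqual(live, updated) {
		return "unchanged" // idempotent: no write on a steady-state pass
	}
	s[name] = updated
	return "updated"
}

func main() {
	s := store{}
	desired := func(o *object) { o.Data = map[string]string{"message": "hello"} }

	fmt.Println(createOrUpdate(s, "hello-config", desired)) // created
	fmt.Println(createOrUpdate(s, "hello-config", desired)) // unchanged

	// Someone edits the object by hand; the next pass restores it.
	s["hello-config"] = object{Name: "hello-config", Data: map[string]string{"message": "manual drift"}}
	fmt.Println(createOrUpdate(s, "hello-config", desired)) // updated
}
```

The mutate callback is the important design point: it sets only the fields the operator manages on top of the live object, which is why the reconcile methods above rebuild the desired object inside the callback.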

This shape is close to what production operators do. The custom resource is not a one-to-one replacement for a Deployment. It is an application API. The operator owns the translation from that application API to the Kubernetes objects needed to run it.

That translation layer gives you control:

  • the user does not need to know the labels required by the Service selector
  • the user does not need to know how the ConfigMap is mounted or referenced
  • the operator can keep immutable Deployment selector labels stable
  • the operator can add labels consistently for inventory and cleanup
  • status can summarize many children into one readable custom resource

This is also why the desired-state builders matter. Each child type has a small function that answers one question: "What should this object look like for the current DemoApp?" The reconciler then applies those desired objects in a predictable order.

Avoid building one huge Reconcile function that constructs every resource inline. It becomes hard to test, hard to review, and easy to break when the API grows.

The original := app.DeepCopy() line is also intentional. The reconciler updates .status only when the status actually changed. Writing status on every reconcile can create a noisy update loop: the status write changes the object's resource version, the watch sees the update, and the controller reconciles again. Conditions should be useful state, not a heartbeat.


Step 5 - Add finalizer cleanup

Add this delete path to the same controller file:

go
func (r *DemoAppReconciler) reconcileDelete(ctx context.Context, app *demov1alpha1.DemoApp) (ctrl.Result, error) {
	if !controllerutil.ContainsFinalizer(app, demoAppFinalizer) {
		return ctrl.Result{}, nil
	}

	audit := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name + "-delete-audit",
			Namespace: app.Namespace,
		},
	}

	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, audit, func() error {
		audit.Labels = map[string]string{
			"app.kubernetes.io/name":       "demoapp-delete-audit",
			"app.kubernetes.io/instance":   app.Name,
			"app.kubernetes.io/managed-by": "demoapp-operator",
		}
		audit.Data = map[string]string{
			"demoApp":   app.Name,
			"namespace": app.Namespace,
			"deletedAt": app.DeletionTimestamp.Time.UTC().Format(time.RFC3339),
		}
		return nil
	})
	if err != nil {
		return ctrl.Result{}, err
	}

	controllerutil.RemoveFinalizer(app, demoAppFinalizer)
	if err := r.Update(ctx, app); err != nil {
		return ctrl.Result{}, err
	}

	return ctrl.Result{}, nil
}

Add these imports if they are not already present:

go
import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

This finalizer writes an audit ConfigMap before deletion completes. In real systems, this is where you would clean up external state that owner references cannot handle:

  • delete a cloud database
  • remove a DNS record
  • revoke a license
  • detach a backup schedule
  • notify an external inventory system

If all cleanup is inside Kubernetes and all children are owned by the CR, owner references may be enough. Finalizers are for cleanup that must happen before the parent object disappears.

See Kubernetes Finalizers Explained for the full deletion model and Owner references and garbage collection for how SetControllerReference ties child lifecycle to the CR.

The important finalizer rule is: add the finalizer before you create anything that requires cleanup.

If the controller creates an external database and only adds the finalizer afterwards, a crash between those two operations can leave an orphaned external database. In this tutorial, the first reconcile adds the finalizer and returns. The next reconcile creates or updates child resources. That extra loop is normal and intentional.

Finalizers should also be narrow. Do not use a finalizer as a generic "delete everything" mechanism when Kubernetes garbage collection can already delete owned children. Use owner references for Kubernetes-owned lifecycle. Use finalizers for work the API server cannot do for you.
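The deletion flow reduces to a small decision table driven by two facts: whether the object is being deleted and whether the finalizer is present. The following stdlib sketch models that decision only; the resource type and nextStep function are illustrative stand-ins, and the returned strings are descriptions rather than API values.

```go
package main

import "fmt"

const finalizer = "demo.example.com/finalizer"

// resource carries the two pieces of metadata the delete path looks at.
type resource struct {
	Deleting   bool // true once deletionTimestamp is set
	Finalizers []string
}

func hasFinalizer(r resource) bool {
	for _, f := range r.Finalizers {
		if f == finalizer {
			return true
		}
	}
	return false
}

// nextStep returns what a reconcile pass should do for this object.
func nextStep(r resource) string {
	switch {
	case r.Deleting && hasFinalizer(r):
		return "run cleanup, then remove finalizer"
	case r.Deleting:
		return "nothing to do: cleanup already finished"
	case !hasFinalizer(r):
		return "add finalizer and return"
	default:
		return "reconcile children"
	}
}

func main() {
	fmt.Println(nextStep(resource{}))
	fmt.Println(nextStep(resource{Finalizers: []string{finalizer}}))
	fmt.Println(nextStep(resource{Deleting: true, Finalizers: []string{finalizer}}))
	fmt.Println(nextStep(resource{Deleting: true}))
}
```

Note that "add finalizer and return" comes before any child is created, which is the ordering rule stated above.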


Step 6 - Watch owned resources and referenced Secrets

The controller already uses Owns for resources it creates. Register every child type and add a watch on Secrets:

go
func (r *DemoAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&demov1alpha1.DemoApp{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Service{}).
		Owns(&corev1.ConfigMap{}).
		Watches(
			&corev1.Secret{},
			handler.EnqueueRequestsFromMapFunc(r.demoAppsForSecret),
		).
		Complete(r)
}

Now add the mapper:

go
func (r *DemoAppReconciler) demoAppsForSecret(ctx context.Context, secret client.Object) []reconcile.Request {
	var list demov1alpha1.DemoAppList
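	if err := r.List(ctx, &list, client.InNamespace(secret.GetNamespace())); err != nil {
		// A mapper cannot return an error, so log it. The lookup runs again
		// on the next Secret event.
		log.FromContext(ctx).Error(err, "unable to list DemoApps for Secret", "secret", secret.GetName())
		return nil
	}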

	requests := make([]reconcile.Request, 0)
	for i := range list.Items {
		app := list.Items[i]
		if app.Spec.ApiKeySecretName == secret.GetName() {
			requests = append(requests, reconcile.Request{
				NamespacedName: types.NamespacedName{
					Namespace: app.Namespace,
					Name:      app.Name,
				},
			})
		}
	}
	return requests
}

Add these imports if they are not already present:

go
import (
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

This is a common production pattern:

  • Owns handles resources created by the operator.
  • Watches with a mapper handles external resources the operator references but does not own.

For example, the operator does not own the user's Secret. It should not delete it. But if the Secret changes, the operator may need to restart Pods, update status, or validate that the key exists.

For the full event model, see Watches, Events, and Predicates.
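The mapper above lists every DemoApp in the namespace on each Secret event. With many CRs, a reverse index from Secret to referencing apps avoids the scan. controller-runtime offers a cache field indexer for this purpose; the sketch below does not use it and instead shows the underlying lookup idea with plain maps. The key and app types and the buildIndex function are hypothetical.

```go
package main

import "fmt"

// key identifies a namespaced object.
type key struct{ Namespace, Name string }

// app is a stand-in for a DemoApp with only the fields the mapper reads.
type app struct {
	Key              key
	APIKeySecretName string
}

// buildIndex maps each referenced Secret to the apps that reference it.
// Secrets are looked up in the app's own namespace, matching the mapper.
func buildIndex(apps []app) map[key][]key {
	idx := make(map[key][]key)
	for _, a := range apps {
		if a.APIKeySecretName == "" {
			continue // app does not reference a Secret
		}
		secret := key{Namespace: a.Key.Namespace, Name: a.APIKeySecretName}
		idx[secret] = append(idx[secret], a.Key)
	}
	return idx
}

func main() {
	apps := []app{
		{Key: key{"team-a", "hello"}, APIKeySecretName: "api-creds"},
		{Key: key{"team-a", "world"}, APIKeySecretName: "api-creds"},
		{Key: key{"team-b", "hello"}, APIKeySecretName: "api-creds"},
		{Key: key{"team-a", "plain"}},
	}
	idx := buildIndex(apps)

	// A change to team-a/api-creds enqueues only the two team-a apps.
	fmt.Println(idx[key{"team-a", "api-creds"}]) // [{team-a hello} {team-a world}]
	fmt.Println(len(idx[key{"team-a", "other"}])) // 0
}
```

For a tutorial-sized cluster the list-and-filter mapper is fine; the index matters once the number of CRs per namespace grows.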

Import checklist for demoapp_controller.go

The snippets above split imports across steps. Before Step 7, your controller file should compile with imports along these lines (adjust paths if your module is not github.com/example/demoapp-operator):

go
import (
	"context"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/equality"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	demov1alpha1 "github.com/example/demoapp-operator/api/v1alpha1"
)

Step 8 adds k8s.io/client-go/tools/record for Recorder.Event (reuse the existing corev1 import for corev1.EventTypeNormal).


Step 7 - Add admission webhooks

Generate webhook scaffolding:

bash
operator-sdk create webhook --group demo --version v1alpha1 --kind DemoApp --defaulting --programmatic-validation

With current Operator SDK go/v4, this creates webhook code under internal/webhook/v1alpha1/ and updates cmd/main.go so the manager serves admission webhooks. Older examples may place webhook code under api/v1alpha1/; follow the path the scaffold generated.

In the generated webhook file, DemoApp refers to the API type: api/v1alpha1.DemoApp, or the package alias the scaffold imported. Adjust the type name and imports to match the operator-sdk create webhook output.

Implement defaulting:

go
func (d *DemoAppCustomDefaulter) Default(ctx context.Context, obj runtime.Object) error {
	app, ok := obj.(*DemoApp)
	if !ok {
		return fmt.Errorf("expected DemoApp but got %T", obj)
	}

	if app.Spec.Replicas == nil {
		app.Spec.Replicas = ptr.To[int32](1)
	}
	if app.Spec.Port == 0 {
		app.Spec.Port = 8080
	}
	if app.Spec.Message == "" {
		app.Spec.Message = "hello from DemoApp"
	}
	if app.Spec.ConfigVersion == "" {
		app.Spec.ConfigVersion = "v1"
	}
	return nil
}

Implement validation:

go
func (v *DemoAppCustomValidator) ValidateCreate(ctx context.Context, obj runtime.Object) (admission.Warnings, error) {
	app, ok := obj.(*DemoApp)
	if !ok {
		return nil, fmt.Errorf("expected DemoApp but got %T", obj)
	}
	return nil, validateDemoApp(app)
}

func (v *DemoAppCustomValidator) ValidateUpdate(ctx context.Context, oldObj, newObj runtime.Object) (admission.Warnings, error) {
	oldApp, ok := oldObj.(*DemoApp)
	if !ok {
		return nil, fmt.Errorf("expected old DemoApp but got %T", oldObj)
	}
	newApp, ok := newObj.(*DemoApp)
	if !ok {
		return nil, fmt.Errorf("expected new DemoApp but got %T", newObj)
	}

	if oldApp.Spec.ApiKeySecretName != "" && oldApp.Spec.ApiKeySecretName != newApp.Spec.ApiKeySecretName {
		return nil, field.Forbidden(
			field.NewPath("spec", "apiKeySecretName"),
			"apiKeySecretName is immutable after it is set",
		)
	}

	return nil, validateDemoApp(newApp)
}

func (v *DemoAppCustomValidator) ValidateDelete(ctx context.Context, obj runtime.Object) (admission.Warnings, error) {
	return nil, nil
}

func validateDemoApp(app *DemoApp) error {
	var allErrs field.ErrorList
	path := field.NewPath("spec")

	if app.Spec.Image == "" {
		allErrs = append(allErrs, field.Required(path.Child("image"), "image is required"))
	}
	if app.Spec.Replicas != nil && *app.Spec.Replicas > 5 && app.Spec.ApiKeySecretName == "" {
		allErrs = append(allErrs, field.Required(path.Child("apiKeySecretName"), "apiKeySecretName is required when replicas is greater than 5"))
	}
	if app.Spec.Port == 22 {
		allErrs = append(allErrs, field.Forbidden(path.Child("port"), "port 22 is reserved"))
	}

	return allErrs.ToAggregate()
}

Typical imports for the webhook file:

go
import (
	"context"
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/util/validation/field"
	"k8s.io/utils/ptr"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

Run:

bash
make generate
make manifests

Webhooks are where many beginner tutorials are too thin. Remember:

  • CRD schema validation handles simple structural rules.
  • Mutating webhooks set defaults that need code.
  • Validating webhooks reject invalid cross-field combinations.
  • Webhooks run in the API server admission path, before the object is stored.
  • Webhook certificates must work in a real cluster. The testing tutorial verifies this with kind.

For the deeper webhook setup, including cert-manager wiring, see Mutating and Validating Admission Webhooks in Operators.

A practical placement rule:

| Rule type | Best place |
| --- | --- |
| Field is required | CRD schema |
| Field has min/max/enum | CRD schema |
| Field should get a simple static default | CRD schema or mutating webhook |
| Default depends on another field | Mutating webhook |
| Field is immutable after creation | Validating webhook |
| Value depends on another object | Validating webhook or reconciler status |
| External system must approve it | Usually reconciler status, not admission |

Do not put every business rule into admission. Admission blocks the user's write request. If a rule depends on an external API that can be slow or unavailable, it may be better to accept the CR and mark it Degraded in status. Use admission for fast, deterministic rules that should reject bad desired state immediately.
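Fast, deterministic rules are also the easiest to check in isolation. The sketch below restates the three rules from validateDemoApp against a plain struct so they can be exercised table-style without the webhook machinery. The spec type and validate function are simplified stand-ins; the real webhook returns field errors rather than strings.

```go
package main

import "fmt"

// spec holds only the fields the cross-field rules read.
type spec struct {
	Image            string
	Replicas         int32
	Port             int32
	APIKeySecretName string
}

// validate returns one message per violated rule.
func validate(s spec) []string {
	var errs []string
	if s.Image == "" {
		errs = append(errs, "spec.image: image is required")
	}
	if s.Replicas > 5 && s.APIKeySecretName == "" {
		errs = append(errs, "spec.apiKeySecretName: required when replicas is greater than 5")
	}
	if s.Port == 22 {
		errs = append(errs, "spec.port: port 22 is reserved")
	}
	return errs
}

func main() {
	cases := []struct {
		name string
		in   spec
		want int // expected number of violations
	}{
		{"valid small app", spec{Image: "nginx:1.27", Replicas: 2, Port: 80}, 0},
		{"missing image", spec{Replicas: 1, Port: 80}, 1},
		{"large app without secret", spec{Image: "nginx:1.27", Replicas: 6, Port: 80}, 1},
		{"large app with secret", spec{Image: "nginx:1.27", Replicas: 6, Port: 80, APIKeySecretName: "api-creds"}, 0},
		{"reserved port and no image", spec{Replicas: 1, Port: 22}, 2},
	}
	for _, c := range cases {
		got := validate(c.in)
		fmt.Printf("%-28s violations=%d match=%t\n", c.name, len(got), len(got) == c.want)
	}
}
```

Every rule here depends only on the object itself, which is the property that makes it suitable for admission.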


Step 8 - Emit Kubernetes Events

Events help users understand what the operator did without reading controller logs.

Add a recorder to the reconciler:

go
import "k8s.io/client-go/tools/record"

type DemoAppReconciler struct {
	client.Client
	Scheme   *runtime.Scheme
	Recorder record.EventRecorder
}

Update cmd/main.go where the reconciler is created:

go
if err = (&controller.DemoAppReconciler{
	Client:   mgr.GetClient(),
	Scheme:   mgr.GetScheme(),
	Recorder: mgr.GetEventRecorderFor("demoapp-controller"),
}).SetupWithManager(mgr); err != nil {
	setupLog.Error(err, "unable to create controller", "controller", "DemoApp")
	os.Exit(1)
}

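Emit an event when user-visible status changes. Placing the call inside the if block that guards r.Status().Update, immediately after the update succeeds, keeps it from firing on every reconcile: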

go
if r.Recorder != nil {
	r.Recorder.Event(&app, corev1.EventTypeNormal, "Reconciled", "DemoApp child resources reconciled")
}

Then check:

bash
kubectl describe demoapp hello
kubectl get events --field-selector involvedObject.name=hello

Use Events for user-facing state transitions, not for every loop. If every reconcile emits an event, users get noise.
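One way to keep events tied to transitions is to compare availability before and after the reconcile pass and emit only when it flipped. This is a minimal sketch of that decision; the eventFor function and the "Available" and "Unavailable" reasons are hypothetical, not names used by the tutorial code.

```go
package main

import "fmt"

// eventFor returns the event reason to emit when availability changes,
// or an empty string when nothing user-visible happened.
func eventFor(wasAvailable, isAvailable bool) string {
	switch {
	case !wasAvailable && isAvailable:
		return "Available"
	case wasAvailable && !isAvailable:
		return "Unavailable"
	default:
		return ""
	}
}

func main() {
	fmt.Printf("%q\n", eventFor(false, true))  // "Available"
	fmt.Printf("%q\n", eventFor(true, true))   // ""
	fmt.Printf("%q\n", eventFor(true, false))  // "Unavailable"
	fmt.Printf("%q\n", eventFor(false, false)) // ""
}
```

In the reconciler, the "before" value could come from the deep copy taken at the top of Reconcile and the "after" value from the updated status.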


Step 9 - Run the upgraded operator locally

Install the updated CRD:

bash
make install

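Run locally. The webhook scaffold from Step 7 makes the manager start a webhook server that expects serving certificates on disk, which a local run usually does not have. If the generated cmd/main.go guards webhook setup with the ENABLE_WEBHOOKS environment variable, as Kubebuilder go/v4 scaffolds typically do, disable webhooks for this step:

bash
ENABLE_WEBHOOKS=false make run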

Apply this updated sample:

yaml
apiVersion: demo.example.com/v1alpha1
kind: DemoApp
metadata:
  name: hello
spec:
  image: nginx:1.27
  replicas: 2
  port: 80
  message: "hello from the controller-runtime tutorial"
  configVersion: "v2"

Verify:

bash
kubectl get configmap hello-config -o yaml
kubectl get service hello
kubectl get deployment hello
kubectl get demoapp hello -o yaml

Look for:

  • status.conditions
  • status.readyReplicas
  • status.serviceName
  • status.configMapName

Validated output (addresses and ages vary):

bash
kubectl get service hello
# NAME    TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
# hello   ClusterIP   10.96.x.x    <none>        80/TCP    1m

kubectl get demoapp hello -o jsonpath='{.status.readyReplicas}{"|"}{.status.configMapName}{"|"}{.status.serviceName}{"\n"}'
# 2|hello-config|hello

Step 10 - Prove drift correction

Manually edit the ConfigMap:

bash
kubectl patch configmap hello-config --type=merge -p '{"data":{"message":"manual drift"}}'
kubectl get configmap hello-config -o jsonpath='{.data.message}{"\n"}'

Because the ConfigMap is owned and watched, the operator should reconcile it back:

bash
kubectl get configmap hello-config -o jsonpath='{.data.message}{"\n"}'

You should see:

text
hello from the controller-runtime tutorial

Manually scale the Deployment:

bash
kubectl scale deployment hello --replicas=5
kubectl get deployment hello -o jsonpath='{.spec.replicas}{"\n"}'

The operator should restore it to the CR's desired replica count.

This is basic drift correction. If multiple actors need to own different fields of the same object, consider Server-Side Apply in Operators instead of whole-spec mutation. For patterns and anti-patterns, see Drift detection patterns in operators.
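The difference between whole-spec mutation and field-level ownership can be shown with a small comparison. The sketch below reports only the keys the operator manages whose live value differs, and ignores keys added by someone else. It is a plain-map illustration of the ownership idea, not Server-Side Apply itself, and driftedKeys is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// driftedKeys lists managed keys whose live value is missing or different.
// Keys present only in live are ignored, which models an operator that
// owns a subset of fields instead of replacing the whole map.
func driftedKeys(desired, live map[string]string) []string {
	var out []string
	for k, want := range desired {
		if got, ok := live[k]; !ok || got != want {
			out = append(out, k)
		}
	}
	sort.Strings(out) // stable order for logs and tests
	return out
}

func main() {
	desired := map[string]string{
		"message":       "hello from the controller-runtime tutorial",
		"configVersion": "v2",
	}
	live := map[string]string{
		"message":       "manual drift",
		"configVersion": "v2",
		"extra":         "added by another actor",
	}
	fmt.Println(driftedKeys(desired, live)) // [message]
}
```

By contrast, the tutorial's reconcileConfigMap assigns the whole Data map, so the "extra" key would be removed on the next reconcile.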


Step 11 - Prove the finalizer

Delete the CR:

bash
kubectl delete demoapp hello

Check the audit ConfigMap:

bash
kubectl get configmap hello-delete-audit -o yaml

The CR deletion should complete only after the finalizer path writes the audit record and removes the finalizer.

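If deletion hangs, inspect the CR and the controller logs. With make run, the logs are in the terminal running the manager; the kubectl logs command below applies once the operator is deployed in the cluster:

bash
kubectl get demoapp hello -o yaml
kubectl logs -n demoapp-operator-system deploy/demoapp-operator-controller-manager -c manager

A stuck finalizer means the controller is not running or cannot complete its cleanup logic. Until the finalizer is removed, the object keeps its deletionTimestamp and stays visible in the API.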


Before you call this production-ready

This tutorial stops at the “feature-complete reconciler” line. For multi-replica managers you still need leader election, observability from Prometheus metrics, and correct liveness/readiness probes on the Deployment that runs your manager. Part 3 proves behavior with tests; those production topics have dedicated chapters so this file stays readable.


Tutorial checkpoint

The operator now has the important core capabilities:

  • multiple child resources
  • idempotent reconciliation
  • owner references
  • owned-resource watches
  • external Secret watch mapping
  • status conditions
  • finalizer cleanup
  • drift correction
  • webhooks for defaulting and validation
  • Events for user-visible state changes
  • generated RBAC from markers

That is the real "Go operator" experience: the project is no longer a single Deployment driven by one CR.

Testing Kubernetes Operators with envtest and kind answers the next practical questions:

  • How should this operator be tested end to end?
  • Is a fake client enough on its own?
  • How are admission webhooks exercised under test?
  • How is the manager image built and deployed?
  • How are operator upgrades rolled out safely?
  • How are the usual failure modes diagnosed?

Continue to Testing Kubernetes Operators with envtest and kind.


Frequently Asked Questions

1. What is the difference between Part 1 and this controller-runtime tutorial?

Part 1 built the smallest useful operator: one CR drives one Deployment. This article layers what most production teams need next: ConfigMap + Service + Deployment, status Conditions and ObservedGeneration, a finalizer delete path, drift correction, Owns/Watches for secondary inputs, mutating/validating admission webhooks, Events, and tighter RBAC markers.

2. Should every operator use finalizers?

No. Use a finalizer only when deletion needs controller-managed cleanup that garbage collection cannot perform (external APIs, off-cluster data, breaking a dependency graph). If every child is a normal Kubernetes object owned by the CR, owner references are often enough. This tutorial uses a finalizer to write an audit ConfigMap so the two-phase delete pattern is explicit.

3. Should I use `CreateOrUpdate` or Server-Side Apply (SSA)?

Both are valid. CreateOrUpdate is readable when your operator fully owns the child spec. SSA is better when multiple controllers share fields or you need explicit field managers — see Server-Side Apply in operators.

4. Why add admission webhooks if the CRD already has OpenAPI validation?

CRD schema covers structural rules (required fields, enums, min/max). Webhooks cover imperative rules: immutability after create, defaults that depend on other fields, cross-object checks, and anything the OpenAPI subset cannot express.
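As an illustration of an imperative rule, the sketch below checks immutability by comparing the stored object with the incoming one. It is a plain-Go model, not the webhook interface that controller-runtime generates. `demoAppSpec`, its `Image` field, and `validateUpdate` are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// demoAppSpec is a hypothetical, trimmed-down spec. Image is an assumed
// field, treated as immutable here purely for illustration.
type demoAppSpec struct {
	Image    string
	Replicas int32
}

// errImageImmutable is returned when an update tries to change Image.
var errImageImmutable = errors.New("spec.image is immutable after creation")

// validateUpdate compares the stored spec with the incoming one. Structural
// schema checks such as required fields, enums, and min/max look only at the
// incoming object, so a rule like this one needs both versions.
func validateUpdate(oldSpec, newSpec demoAppSpec) error {
	if oldSpec.Image != newSpec.Image {
		return errImageImmutable
	}
	return nil
}

func main() {
	stored := demoAppSpec{Image: "nginx:1.27", Replicas: 2}

	// Changing replicas is allowed.
	fmt.Println(validateUpdate(stored, demoAppSpec{Image: "nginx:1.27", Replicas: 3}))

	// Changing the image is rejected.
	fmt.Println(validateUpdate(stored, demoAppSpec{Image: "nginx:1.28", Replicas: 2}))
}
```

In a real validating webhook the same comparison would run in the update handler, with the old and new objects supplied by the admission request.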

5. Does this article include automated tests for the operator?

No — it focuses on runtime behavior. Part 3 walks through unit tests, fake client, envtest (including webhook admission), and kind smoke tests for the same code paths.

6. Why does the operator put `message` in a ConfigMap instead of only env on the Deployment?

Real workloads usually separate configuration from the Pod template so rollouts and config bumps are independent. It also gives the tutorial a second owned object to reconcile and drift-correct.

7. Why is the API key Secret optional, and should the operator own it?

Optional Secrets are common for credentials users bring themselves. The operator should reference or read them but normally not set controller: true ownership on user Secrets — you rarely want the CR delete to wipe someone else's credential object.

8. Why does status track `ObservedGeneration` alongside Conditions?

Clients use metadata.generation vs status.observedGeneration to know whether Conditions reflect the latest spec. Without that signal, automation may read stale Ready=True after a spec change.
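A client-side check along these lines might look like the following sketch. The types are simplified stand-ins for the real status structs, and `conditionIsCurrentAndTrue` is a hypothetical helper name.

```go
package main

import "fmt"

// condition and demoAppStatus are simplified, hypothetical stand-ins for
// the real status types.
type condition struct {
	Type   string
	Status string
}

type demoAppStatus struct {
	ObservedGeneration int64
	Conditions         []condition
}

// conditionIsCurrentAndTrue reports whether the named condition is True and
// was computed for the current spec generation. A True condition recorded
// for an older generation is treated as unknown, not as success.
func conditionIsCurrentAndTrue(generation int64, st demoAppStatus, condType string) bool {
	if st.ObservedGeneration != generation {
		return false
	}
	for _, c := range st.Conditions {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	st := demoAppStatus{
		ObservedGeneration: 3,
		Conditions:         []condition{{Type: "Ready", Status: "True"}},
	}

	// The controller has seen generation 3, so the condition can be trusted.
	fmt.Println(conditionIsCurrentAndTrue(3, st, "Ready"))

	// The spec moved to generation 4, so the same condition is stale.
	fmt.Println(conditionIsCurrentAndTrue(4, st, "Ready"))
}
```

Automation that waits on a rollout should gate on both values; checking the condition alone can pass before the controller has processed the new spec.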

9. Why not add HPA, Ingress, NetworkPolicy, ServiceAccount, and RoleBinding in the same tutorial?

The pattern repeats: builder + idempotent apply + ownership where appropriate + status aggregation. Adding every workload type would obscure the mechanics already shared with multi-resource reconciliation.

10. Are webhooks required for every operator?

No. Start with CRD validation, add webhooks when you need code-level defaulting/validation, and read Mutating and validating admission webhooks for TLS, cert-manager, and failure modes.

11. My status flips constantly / reconcile never settles — what is the first thing to check?

Status hot loops usually mean the controller calls Status().Update even when nothing changed, or it watches its own status writes without a predicate. Diff the status before writing (see Kubernetes reconcile loop) and consider GenerationChangedPredicate on the primary For() if appropriate.
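The "diff before writing" guard can be sketched in plain Go as follows. `demoAppStatus`, `writeStatusIfChanged`, and the `write` callback are hypothetical; in the real controller the callback would be the status update call on the API client.

```go
package main

import (
	"fmt"
	"reflect"
)

// condition and demoAppStatus are simplified, hypothetical status types.
type condition struct{ Type, Status, Reason string }

type demoAppStatus struct {
	ReadyReplicas      int32
	ObservedGeneration int64
	Conditions         []condition
}

// writeStatusIfChanged calls write only when desired differs from current.
// It reports whether a write happened.
func writeStatusIfChanged(current, desired demoAppStatus, write func(demoAppStatus) error) (bool, error) {
	if reflect.DeepEqual(current, desired) {
		return false, nil
	}
	if err := write(desired); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	writes := 0
	write := func(demoAppStatus) error { writes++; return nil }

	current := demoAppStatus{
		ReadyReplicas:      2,
		ObservedGeneration: 1,
		Conditions:         []condition{{Type: "Available", Status: "True", Reason: "MinimumReplicasAvailable"}},
	}

	// Reconcile computes the same status again: no write, no new watch event.
	wrote, _ := writeStatusIfChanged(current, current, write)
	fmt.Println("wrote:", wrote, "total writes:", writes)

	// One replica went away: the status really changed, so it is written.
	desired := current
	desired.ReadyReplicas = 1
	wrote, _ = writeStatusIfChanged(current, desired, write)
	fmt.Println("wrote:", wrote, "total writes:", writes)
}
```

The comparison only helps if the computed status is deterministic. A timestamp that is refreshed on every reconcile, for example, makes every pass look like a change.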

What's next?

Ship and verify the same operator in Part 3: Testing with envtest, fake client, and kind. Need the conceptual map first? Reconcile loop, Status and Conditions, and Finalizers deepen the patterns used here.


Validated corrections and sample output

  • Use separate RBAC markers for services, configmaps, secrets, and events so that each resource gets only the verbs it needs; do not combine them as services;configmaps;secrets;events.
  • Current Operator SDK go/v4 creates webhook code under internal/webhook/v1alpha1/, importing API types from api/v1alpha1.
  • Emit Kubernetes Events only when user-visible status changes, not unconditionally on every reconcile.
  • For local make run while developing the reconciler, use ENABLE_WEBHOOKS=false make run. Test live admission webhooks in envtest or kind with a real serving certificate.

go
// Correct RBAC marker shape.
// +kubebuilder:rbac:groups="",resources=services,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=configmaps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=events,verbs=create;patch

bash
kubectl get deployment hello
# NAME    READY   UP-TO-DATE   AVAILABLE
# hello   2/2     2            2

kubectl get demoapp hello -o jsonpath='{.status.readyReplicas}{"|"}{.status.conditions[?(@.type=="Available")].status}{"\n"}'
# 2|True

kubectl patch configmap hello-config --type=merge -p '{"data":{"message":"manual drift"}}'
# configmap/hello-config patched

kubectl get configmap hello-config -o jsonpath='{.data.message}{"\n"}'
# hello from the controller-runtime tutorial

kubectl scale deployment hello --replicas=5
# deployment.apps/hello scaled

kubectl get deployment hello -o jsonpath='{.spec.replicas}{"\n"}'
# 2

kubectl get configmap hello-delete-audit -o jsonpath='{.data.demoApp}{"|"}{.data.namespace}{"|"}{.data.deletedAt}{"\n"}'
# hello|default|2026-06-05T12:07:36Z

Deepak Prasad