CRD validation gets you regex, range, and "required field" checks for free. Once a
Kubernetes Operator needs a default value that depends on another field, a cross-field invariant ("when ha: true, replicas must be ≥ 3"), or a check against current cluster
state ("the referenced namespace must exist"), the CRD schema cannot do
it. The operator pattern's answer is admission webhooks: HTTPS
endpoints the API server calls during every create/update/delete,
before the object hits etcd.
This article covers the webhook lifecycle, the difference between
mutating and validating webhooks, how kubebuilder scaffolds them, the
Webhook*Configuration fields that matter, when to prefer
ValidatingAdmissionPolicy (the built-in CEL alternative that became
GA in Kubernetes 1.30), and patterns that keep webhooks from blowing up
your control plane.
Two admission webhook types are covered here: mutating and validating. CRDs also support a third kind of webhook, the conversion webhook, which translates between CRD versions. Strictly speaking it is not part of admission, and it is covered separately in CRD version upgrades and conversion webhooks.
TL;DR — webhooks in one screen
Generate the scaffolding:
kubebuilder create webhook --group cache --version v1alpha1 --kind Memcached --defaulting --programmatic-validation

That produces api/v1alpha1/memcached_webhook.go with these methods:
func (r *Memcached) Default() {
	if r.Spec.Size == 0 {
		r.Spec.Size = 1
	}
}

func (r *Memcached) ValidateCreate() (admission.Warnings, error) {
	return nil, r.validate()
}

func (r *Memcached) ValidateUpdate(old runtime.Object) (admission.Warnings, error) {
	return nil, r.validate()
}

func (r *Memcached) ValidateDelete() (admission.Warnings, error) {
	return nil, nil
}

func (r *Memcached) validate() error {
	if r.Spec.Size < 0 || r.Spec.Size > 100 {
		return fmt.Errorf("size must be between 0 and 100")
	}
	return nil
}

…plus a SetupWebhookWithManager call in main.go that registers the webhook with the manager. After make manifests && make deploy, both webhooks are live. The rest of this article unpacks what happens behind that scaffolding.
Why webhooks if CRDs already validate?
Anyone who has built a CRD has run into the question: my OpenAPI schema already validates field types, ranges, regex, and required fields — so what is left for a webhook to do?
The honest answer is: most field-level validation should stay in the CRD schema. Webhooks exist for the cases the schema cannot reach.
| What you need | Where it belongs |
|---|---|
| Field type, regex, min/max, required | CRD OpenAPI schema |
| Single-field CEL rule (e.g. self >= 0) | CRD x-kubernetes-validations |
| Cross-field invariant ("if ha then size >= 3") | CEL or validating webhook |
| Defaulting fields based on other fields | Mutating webhook |
| Validation against current cluster state ("namespace must exist") | Validating webhook |
| Validation that needs Go code or external calls | Validating webhook |
| Immutable fields ("storageClass cannot change after creation") | CEL x-kubernetes-validations (transition rules) or validating webhook |
| Sidecar injection (Istio, Linkerd, observability agents) | Mutating webhook on Pod Create (see below) |
| Cluster-wide policy ("no :latest tags", required labels) | ValidatingAdmissionPolicy or OPA Gatekeeper / Kyverno |
| Per-namespace resource quotas | Built-in ResourceQuota admission controller (don't write a custom webhook) |
| Cluster-wide policy across many resource kinds | ValidatingAdmissionPolicy or OPA Gatekeeper |
Three concrete things only a webhook can do:
- Read other Kubernetes objects during admission ("the referenced ConfigMap must exist and have a default-profile key"). The CRD schema is purely structural; it knows nothing about the rest of the cluster.
- Set defaults that depend on context. "If the user did not specify a storageClass, use the cluster's default storage class." The schema can set static defaults; only a mutating webhook can set context-aware ones.
- Express business logic in Go when CEL would be unreadable. Three-level nested conditionals, error messages built from multiple fields, or behaviour that calls a library: all painful in CEL, all natural in Go.
For everything else, schema + CEL is faster, cheaper, and never has an outage. We come back to the CEL vs webhook trade-off in detail below.
Prerequisites
- A scaffolded operator built with the Operator SDK install guide and an understanding of the reconcile loop.
- The RBAC primer for operators — webhooks add a ServiceAccount permission for the TLS certs Secret.
- A TLS certificate provisioner. cert-manager is the standard choice; it watches the WebhookConfiguration, generates a certificate, and injects the caBundle automatically.
Where Webhooks Fit in the API Server Pipeline
A webhook is simply an HTTPS server that the Kubernetes API server calls during admission, before an object is persisted to etcd.
- Mutating webhooks can modify the incoming object (defaults, labels, annotations, sidecar injection, finalizers).
- Validating webhooks can only allow or reject the request based on business rules.
You can think of them as the admission equivalents of a formatter and a linter:
| CI Pipeline | Admission Pipeline |
|---|---|
| Code formatter | Mutating webhook |
| Linter | Validating webhook |
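The API server processes admission requests in a strict order:
1. Authentication
2. Authorization
3. Mutating admission webhooks
4. Schema validation (CRD OpenAPI schema, including CEL rules)
5. Validating admission webhooks
6. Persistence to etcd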
A few important points:
- Mutating webhooks run before schema validation, so any defaults they add must still satisfy the CRD schema.
- Validating webhooks run after schema validation, making them ideal for cross-field rules, immutability checks, and cluster-state validation.
- If multiple mutating webhooks match the request, they run sequentially and each webhook sees the changes made by the previous one.
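The ordering can be modelled in a few lines of plain Go. This is only a sketch of the sequence described above, not API server code: the Object, Mutator, Validator, and Admit names are hypothetical, and the real API server calls validating webhooks in parallel rather than in a loop.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for an incoming custom resource.
type Object struct {
	Size int
	HA   bool
}

// Mutator and Validator model the two webhook kinds.
type Mutator func(o *Object)
type Validator func(o Object) error

// Admit models the ordering only: mutators run one after another, then the
// schema check, then validators, which all see the final mutated object.
func Admit(o Object, mutators []Mutator, schema Validator, validators []Validator) (Object, error) {
	for _, m := range mutators {
		m(&o) // each mutator sees the changes made by the previous one
	}
	if err := schema(o); err != nil {
		return o, fmt.Errorf("schema validation: %w", err)
	}
	for _, v := range validators {
		if err := v(o); err != nil {
			return o, fmt.Errorf("validating webhook: %w", err)
		}
	}
	return o, nil // the real API server would persist to etcd here
}

// defaultSize plays the role of a defaulting (mutating) webhook.
func defaultSize(o *Object) {
	if o.Size == 0 {
		o.Size = 3
	}
}

// schemaCheck plays the role of the CRD schema's minimum/maximum.
func schemaCheck(o Object) error {
	if o.Size < 0 || o.Size > 100 {
		return fmt.Errorf("size %d is outside 0..100", o.Size)
	}
	return nil
}

// haRule plays the role of a cross-field validating webhook.
func haRule(o Object) error {
	if o.HA && o.Size < 3 {
		return fmt.Errorf("ha requires size >= 3")
	}
	return nil
}

func main() {
	out, err := Admit(Object{HA: true}, []Mutator{defaultSize}, schemaCheck, []Validator{haRule})
	fmt.Println(out, err)
}
```

Because the default is applied before the schema check and the cross-field rule, an object submitted with ha: true and no size passes all three stages in this model.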
Mutating webhooks: defaulting and injection
Use cases:
- Default values for unset fields. .spec.replicas: 1 if the user didn't specify.
- Cross-field defaults. .spec.storageClass: <cluster-default> if the user picks the basic profile.
- Field injection. Adding annotations the operator needs internally (e.g. a UUID for tracing).
The webhook returns a JSON Patch (RFC 6902) that describes the
mutations. controller-runtime hides this from you with the
Defaulter interface — you just modify the Go object:
func (r *Memcached) Default() {
	if r.Spec.Size == 0 {
		r.Spec.Size = 3 // default replica count
	}
	if r.Spec.StorageClass == "" {
		r.Spec.StorageClass = "default"
	}
	if r.Annotations == nil {
		r.Annotations = map[string]string{}
	}
	if _, ok := r.Annotations["cache.example.com/created-at"]; !ok {
		r.Annotations["cache.example.com/created-at"] = time.Now().UTC().Format(time.RFC3339)
	}
}

Behind the scenes, controller-runtime compares the original and mutated objects and synthesises a JSON Patch. The API server applies the patch and the next pipeline stage sees the mutated object.
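To make the "compare and synthesise" step concrete, here is a deliberately simplified sketch that diffs two JSON documents and emits RFC 6902 operations for top-level keys only. It is not the algorithm controller-runtime uses; the real implementation also handles nested paths, arrays, and JSON Pointer escaping. PatchOp and DiffTopLevel are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sort"
)

// PatchOp is one RFC 6902 operation.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// DiffTopLevel compares two JSON objects and emits add, replace, and remove
// operations for their top-level keys. Keys containing "/" or "~" would need
// JSON Pointer escaping, which this sketch skips.
func DiffTopLevel(original, mutated []byte) ([]PatchOp, error) {
	var before, after map[string]interface{}
	if err := json.Unmarshal(original, &before); err != nil {
		return nil, err
	}
	if err := json.Unmarshal(mutated, &after); err != nil {
		return nil, err
	}
	var ops []PatchOp
	for key, newVal := range after {
		oldVal, existed := before[key]
		switch {
		case !existed:
			ops = append(ops, PatchOp{Op: "add", Path: "/" + key, Value: newVal})
		case !reflect.DeepEqual(oldVal, newVal):
			ops = append(ops, PatchOp{Op: "replace", Path: "/" + key, Value: newVal})
		}
	}
	for key := range before {
		if _, kept := after[key]; !kept {
			ops = append(ops, PatchOp{Op: "remove", Path: "/" + key})
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Slice(ops, func(i, j int) bool { return ops[i].Path < ops[j].Path })
	return ops, nil
}

func main() {
	ops, err := DiffTopLevel([]byte(`{"size":0}`), []byte(`{"size":3,"storageClass":"default"}`))
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(ops)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

For the sample input, the diff contains a replace on /size and an add on /storageClass, which is the shape of patch a defaulting webhook returns.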
Sidecar injection — the canonical mutating webhook
The most famous admission webhook in the Kubernetes ecosystem is not
a validator at all — it is the mutating webhook that injects sidecar
containers into Pods on admission. Istio's envoy proxy, Linkerd's
linkerd-proxy, the Datadog agent, AWS App Mesh, and dozens of other
service-mesh and observability tools all do the same thing: register a
mutating webhook that watches Pod Create events and patches an extra
container into .spec.containers before the Pod is persisted to etcd.
A minimal sketch (raw admission.Handler form, since you're operating
on a built-in Pod, not your own CRD):
func (h *podInjector) Handle(ctx context.Context, req admission.Request) admission.Response {
	pod := &corev1.Pod{}
	if err := h.decoder.Decode(req, pod); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	// Only inject into pods that opted in via label.
	if pod.Labels["cache.example.com/inject-sidecar"] != "true" {
		return admission.Allowed("not opted in")
	}
	// Stay idempotent if the API server re-invokes the webhook.
	for _, c := range pod.Spec.Containers {
		if c.Name == "cache-exporter" {
			return admission.Allowed("sidecar already present")
		}
	}
	pod.Spec.Containers = append(pod.Spec.Containers, corev1.Container{
		Name:  "cache-exporter",
		Image: "cache.example.com/exporter:v1.2.3",
		Ports: []corev1.ContainerPort{{ContainerPort: 9100}},
	})
	marshalled, err := json.Marshal(pod)
	if err != nil {
		return admission.Errored(http.StatusInternalServerError, err)
	}
	// The response carries a JSON Patch computed from the raw and mutated pod.
	return admission.PatchResponseFromRaw(req.Object.Raw, marshalled)
}

Three things this pattern depends on:
- Opt-in via label or annotation. Pair with an objectSelector on the MutatingWebhookConfiguration (see the selectors subsection below) so the API server never even calls your webhook for pods that will not be injected.
- failurePolicy: Ignore is usually the right choice. A broken sidecar injector should not block every Pod create in the cluster.
- reinvocationPolicy: IfNeeded if a later mutating webhook might add fields to your injected sidecar (e.g. a downstream defaulter that fills in resource limits).
Most operator authors will not write a sidecar injector themselves —
service meshes and observability platforms already provide them. The
pattern is worth knowing because it shows up in every cluster, and
because it is the cleanest example of a mutating webhook operating on
a built-in Kubernetes kind (Pod) rather than on the operator's
own CRD.
Rules for mutating webhooks
- Idempotent. The same input must produce the same output. If two mutating webhooks both default the same field with different logic, you have a non-deterministic system.
- No side effects. Don't write to external systems from the webhook. The API server may call you multiple times for the same request.
- Order-independent. Your mutation should make sense regardless of which other webhooks already mutated the object.
- Schema-valid output. Whatever you produce must pass the CRD OpenAPI schema in stage 4.
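The idempotency rule is easy to check mechanically: apply the mutation once, apply it again to the result, and compare. The sketch below uses a hypothetical Spec type and an IsIdempotent helper written for illustration; neither is part of controller-runtime.

```go
package main

import (
	"fmt"
	"reflect"
)

// Spec is a hypothetical, simplified version of the Memcached spec.
type Spec struct {
	Size         int
	StorageClass string
}

// Default fills unset fields only, so a second call changes nothing.
func Default(s *Spec) {
	if s.Size == 0 {
		s.Size = 3
	}
	if s.StorageClass == "" {
		s.StorageClass = "default"
	}
}

// appendSuffix is deliberately broken: every call changes the value again.
func appendSuffix(s *Spec) {
	s.StorageClass += "-fast"
}

// IsIdempotent applies fn once, then applies it again to a copy of that
// result. A well-behaved mutator leaves the second copy unchanged.
func IsIdempotent(in Spec, fn func(*Spec)) bool {
	once := in
	fn(&once)
	twice := once
	fn(&twice)
	return reflect.DeepEqual(once, twice)
}

func main() {
	fmt.Println("Default idempotent:", IsIdempotent(Spec{}, Default))
	fmt.Println("appendSuffix idempotent:", IsIdempotent(Spec{}, appendSuffix))
}
```

A check like this fits naturally into a unit test for Default(), since the API server may invoke a mutating webhook more than once for the same request.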
Validating webhooks: invariants and cluster checks
Use cases:
- Cross-field invariants. if r.Spec.HA && r.Spec.Replicas < 3 { return error }.
- Immutable fields. if old.Spec.StorageClass != new.Spec.StorageClass { return error("immutable") }.
- Cluster-state checks. "The referenced ConfigMap must exist".
- Deletion guards. Reject delete if the CR has dependents.
Implementation:
func (r *Memcached) ValidateCreate() (admission.Warnings, error) {
	return nil, r.validate(nil)
}

func (r *Memcached) ValidateUpdate(old runtime.Object) (admission.Warnings, error) {
	// Checked assertion: an unexpected type becomes an error, not a panic.
	oldMem, ok := old.(*Memcached)
	if !ok {
		return nil, fmt.Errorf("expected a Memcached but got %T", old)
	}
	if oldMem.Spec.StorageClass != r.Spec.StorageClass {
		return nil, fmt.Errorf("spec.storageClass is immutable")
	}
	return nil, r.validate(oldMem)
}

func (r *Memcached) ValidateDelete() (admission.Warnings, error) {
	// allow delete; a finalizer is the right place to gate cleanup,
	// not a validating webhook. See /kubernetes-finalizers-explained/.
	return nil, nil
}

// validate holds the rules shared by create and update; old is nil on create.
func (r *Memcached) validate(old *Memcached) error {
	if r.Spec.HA && r.Spec.Size < 3 {
		return fmt.Errorf("when ha is true, size must be >= 3")
	}
	if r.Spec.Size > 100 {
		return fmt.Errorf("size cannot exceed 100")
	}
	return nil
}

Warnings: non-blocking guidance
The first return value is admission.Warnings — strings the API
server attaches to the response. Use them for non-blocking
deprecation notices:
func (r *Memcached) ValidateCreate() (admission.Warnings, error) {
	var warnings admission.Warnings
	if r.Spec.LegacyMode {
		warnings = append(warnings, "spec.legacyMode is deprecated; use spec.mode instead")
	}
	return warnings, r.validate(nil)
}

kubectl prints warnings without failing the command. Great for
guiding users away from soft-deprecated fields.
The generated WebhookConfiguration YAML
make manifests produces both ValidatingWebhookConfiguration and
MutatingWebhookConfiguration in config/webhook/manifests.yaml. A
realistic example:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: validating-webhook-configuration
webhooks:
  - name: vmemcached.kb.io
    clientConfig:
      service:
        name: memcached-operator-webhook-service
        namespace: memcached-operator-system
        path: /validate-cache-example-com-v1alpha1-memcached
      caBundle: Cg== # filled in by cert-manager
    rules:
      - operations: [CREATE, UPDATE]
        apiGroups: [cache.example.com]
        apiVersions: [v1alpha1]
        resources: [memcacheds]
        scope: Namespaced
    failurePolicy: Fail
    sideEffects: None
    admissionReviewVersions: [v1]
    timeoutSeconds: 10

Field-by-field:
| Field | What to set |
|---|---|
| clientConfig.service | The Service backing your webhook pod |
| clientConfig.caBundle | Base64-encoded CA cert; cert-manager fills this in |
| rules.operations | Usually [CREATE, UPDATE]; add DELETE only if you want deletion validation |
| rules.scope | Namespaced (matches CR scope); use Cluster for cluster-scoped CRs |
| failurePolicy | Fail for validation, Ignore for non-critical defaulting |
| sideEffects | None for almost every operator |
| admissionReviewVersions | [v1] (the v1beta1 admissionregistration API was removed in Kubernetes 1.22) |
| timeoutSeconds | 5–10; the API server abandons the call past this (default 10, maximum 30) |
Scoping with objectSelector, namespaceSelector, matchPolicy, reinvocationPolicy
The default WebhookConfiguration matches every object that satisfies the
rules block. In a real cluster you almost always want to narrow that
down — a stuck webhook should not be able to break system namespaces.
webhooks:
  - name: vmemcached.kb.io
    # ... clientConfig, rules, etc ...
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: [kube-system, kube-public, cert-manager]
    objectSelector:
      matchLabels:
        cache.example.com/managed: "true"
    matchPolicy: Equivalent
    reinvocationPolicy: IfNeeded # mutating webhooks only; omit on a ValidatingWebhookConfiguration

| Field | What it does | When you need it |
|---|---|---|
| namespaceSelector | Label-match against the namespace's labels. | Almost always: opt kube-system and other privileged namespaces out so a broken webhook cannot brick the cluster. |
| objectSelector | Label-match against the object's own labels. | When users should be able to opt-in (or opt-out) per object. |
| matchPolicy | Equivalent (default) calls the webhook for all API versions of the same resource; Exact only for versions in the rules block. | Leave as Equivalent unless you have CRD versions with materially different schemas. |
| reinvocationPolicy | Never (default) calls each mutating webhook once; IfNeeded re-invokes it if a later webhook modified the object. | Set to IfNeeded on a mutating webhook whose output depends on fields that other mutators might write. |

Validating webhooks have no reinvocationPolicy: they are always called exactly once, after the final mutated object is known.
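The NotIn selector above has one subtlety worth spelling out: a namespace that lacks the key entirely still matches NotIn, while In requires the key to be present. The following sketch models that evaluation for the In and NotIn operators only. Requirement, Matches, and skipSystem are hypothetical names, and the real selector logic lives in the Kubernetes apimachinery libraries.

```go
package main

import "fmt"

// Requirement mirrors one matchExpressions entry (In and NotIn only).
type Requirement struct {
	Key      string
	Operator string // "In" or "NotIn"
	Values   []string
}

// Matches reports whether labels satisfy every requirement; entries are ANDed.
func Matches(labels map[string]string, reqs []Requirement) bool {
	for _, r := range reqs {
		val, present := labels[r.Key]
		inSet := false
		if present {
			for _, v := range r.Values {
				if v == val {
					inSet = true
					break
				}
			}
		}
		switch r.Operator {
		case "In":
			if !inSet { // the key must exist and hold a listed value
				return false
			}
		case "NotIn":
			if inSet { // a missing key still satisfies NotIn
				return false
			}
		default:
			return false // operator not supported by this sketch
		}
	}
	return true
}

// skipSystem mirrors the namespaceSelector from the YAML above.
var skipSystem = []Requirement{{
	Key:      "kubernetes.io/metadata.name",
	Operator: "NotIn",
	Values:   []string{"kube-system", "kube-public", "cert-manager"},
}}

func main() {
	for _, ns := range []string{"kube-system", "team-a"} {
		labels := map[string]string{"kubernetes.io/metadata.name": ns}
		fmt.Printf("%s: webhook called = %v\n", ns, Matches(labels, skipSystem))
	}
}
```

In this model the webhook is skipped for kube-system and called for team-a, which is the behaviour the selector is meant to guarantee.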
failurePolicy: Fail vs Ignore
This is the most important configuration choice.
failurePolicy: Fail
The API server rejects the call if the webhook is unreachable, times out, or returns a 5xx. Use for:
- Validating webhooks (always — if you can't validate, you must reject; otherwise invalid state lands in etcd).
- Mutating webhooks that perform critical defaults (without which the resulting object would be wrong).
The risk: an operator pod outage stops every Create/Update on the managed kind. For one operator with HA replicas, this is fine. For an operator that's down completely, no one can apply new CRs until it's back.
failurePolicy: Ignore
The API server lets the call through as if the webhook didn't exist. Use for:
- Mutating webhooks that only set nice-to-have defaults. If the operator is down, the user just doesn't get the default — the operator will set it on reconcile.
Don't ever use Ignore for validating webhooks unless you have a genuinely independent enforcement mechanism (OPA Gatekeeper, K-Rail, admission checks in the API itself).
Best of both worlds
Two webhook configurations:
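- Validating (failurePolicy: Fail): invariants you absolutely need.
- Mutating (failurePolicy: Ignore): defaults that are recoverable.
That way a failure in the defaulting path never rejects a request by itself: users get un-defaulted objects, which the operator reconciles to correctness when it is back. The validating webhook still fails closed, so a complete operator outage continues to block Create/Update on the managed kind; that is the case HA replicas exist for.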
sideEffects and dry-run
The sideEffects field is required in v1. Values:
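| Value | Meaning |
|---|---|
| None | Webhook has no external side effects; safe for dry-run. |
| NoneOnDryRun | Has side effects but checks the dryRun: true flag and skips them. |
| Some | Always has side effects, so dry-run requests that match the webhook are rejected. Not accepted when creating admissionregistration.k8s.io/v1 configurations, which allow only None and NoneOnDryRun. |
| Unknown | (Deprecated) Don't use; also rejected by v1. |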
For 99% of operator webhooks the answer is None. Your
Default() just returns a patch; your ValidateCreate() just
returns allow/deny. Neither writes to etcd, calls external APIs, or
sends emails.
If you do have side effects (e.g. recording a deprecated-field
warning in an external system), use NoneOnDryRun and check the
request's DryRun flag:
func (h *handler) Handle(ctx context.Context, req admission.Request) admission.Response {
	// DryRun is a *bool and is nil when the client did not set it.
	dryRun := req.DryRun != nil && *req.DryRun
	if !dryRun {
		h.recordUsage(req) // skip on dry-run
	}
	return admission.Allowed("")
}

For full handler-level control, write an admission.Handler implementation directly instead of the Defaulter/Validator interfaces.
Webhooks vs CEL validation (and ValidatingAdmissionPolicy)
Kubernetes has spent the last several releases moving validation logic out of webhooks and into the API server itself, using Common Expression Language (CEL). There are now three places validation can live, and webhooks are no longer the default choice for many cases.
1. CRD x-kubernetes-validations (CEL inside the schema)
Beta and enabled by default since Kubernetes 1.25, GA in 1.29. The validation rule lives directly inside the CRD OpenAPI schema, runs in the API server, and needs no external code:
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        size: { type: integer }
        ha: { type: boolean }
      x-kubernetes-validations:
        - rule: "!self.ha || self.size >= 3"
          message: "size must be >= 3 when ha is true"
        - rule: "self.size <= 100"
          message: "size cannot exceed 100"

When to use it: any cross-field rule that can be expressed in CEL, and that is most of them. No webhook server, no TLS, no failurePolicy, no timeout. You ship the rule inside the CRD.
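The first rule reads oddly until you see it as an implication: "if ha then size >= 3" is written as "not ha, or size >= 3". A tiny Go equivalent makes the truth table explicit. The haRule function is a hypothetical helper for illustration, not something the CRD machinery generates.

```go
package main

import "fmt"

// haRule is the Go form of the CEL rule "!self.ha || self.size >= 3".
// It is true whenever ha is off, and otherwise requires size >= 3.
func haRule(ha bool, size int) bool {
	return !ha || size >= 3
}

func main() {
	cases := []struct {
		ha   bool
		size int
	}{
		{false, 1}, // ha off: size is unconstrained by this rule
		{true, 1},  // ha on with too few replicas
		{true, 3},  // ha on with enough replicas
	}
	for _, c := range cases {
		fmt.Printf("ha=%v size=%d valid=%v\n", c.ha, c.size, haRule(c.ha, c.size))
	}
}
```

Any "when X, require Y" invariant can be encoded the same way, which is why most cross-field rules fit in a single CEL expression.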
2. ValidatingAdmissionPolicy (cluster-wide CEL, GA in 1.30)
Decouples policy from any single CRD. A ValidatingAdmissionPolicy is
authored once and bound to many resource kinds via
ValidatingAdmissionPolicyBinding. Runs inside the API server, same as
CEL-in-schema:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: memcached-size-rules
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [cache.example.com]
        apiVersions: [v1alpha1]
        operations: [CREATE, UPDATE]
        resources: [memcacheds]
  validations:
    - expression: "!object.spec.ha || object.spec.size >= 3"
      message: "size must be >= 3 when ha is true"

ValidatingAdmissionPolicy was a beta feature in 1.28–1.29 and reached GA in Kubernetes 1.30. The policy takes effect only once a ValidatingAdmissionPolicyBinding references it. For new validation work, prefer it over a validating webhook unless you genuinely need Go.
3. Validating webhook (the original mechanism)
Keep using webhooks when your validation logic:
- needs to read other Kubernetes objects during admission (referenced ConfigMap, Secret, or CR must exist),
- calls external systems (license check, internal policy API),
- requires complex Go code that would be unreadable in CEL, or
- needs mutation, not just yes/no (use a mutating webhook — CEL has no equivalent yet).
Decision matrix
| Validation type | First choice | Why |
|---|---|---|
| Field type, regex, range, required | CRD OpenAPI schema | Free, no extra moving parts. |
| Cross-field rule (if ha then size>=3) | CRD x-kubernetes-validations | Runs in the API server, no webhook needed. |
| Same rule across many CRDs | ValidatingAdmissionPolicy | Author once, bind to many resources. |
| Lookup another object during admission | Validating webhook | Only webhooks have a Kubernetes client. |
| Default a field from other fields | Mutating webhook | No CEL equivalent for mutation yet. |
| Cluster-wide policy ("no :latest tags") | ValidatingAdmissionPolicy or OPA Gatekeeper | Both are CEL/Rego-based and don't require a per-operator webhook. |
The direction of travel: anything that can be expressed in CEL is
moving off webhooks into the API server. Operators in 2026 should keep
webhooks for mutation and for validation that needs Go — and ship
everything else as x-kubernetes-validations or a
ValidatingAdmissionPolicy. The result is fewer moving parts: no
webhook server to keep highly available, no cert to rotate, no
failurePolicy to agonise over.
Mutating admission policies? A built-in CEL-based mutating equivalent (MutatingAdmissionPolicy) was alpha in 1.32 and beta in 1.34. Treat it as up-and-coming; for now, mutating logic still ships as a mutating webhook in production operators.
Local development workflow
Webhooks need TLS, which makes make run (run-out-of-cluster) harder
than vanilla operators.
Use envtest
For integration tests, envtest can start a real apiserver with
your webhook registered:
testEnv := &envtest.Environment{
	CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
	WebhookInstallOptions: envtest.WebhookInstallOptions{
		Paths: []string{filepath.Join("..", "config", "webhook")},
	},
}
cfg, err := testEnv.Start()

The webhook is registered against the test apiserver; the cert is
generated automatically. envtest is part of sigs.k8s.io/controller-runtime/pkg/envtest
and is the same harness kubebuilder scaffolds for you under
internal/controller/suite_test.go.
Dry-run against a real cluster
The fastest way to manually test a deployed webhook:
kubectl apply -f bad-memcached.yaml --dry-run=server

The API server calls your webhook with dryRun: true. You see the
admission response without actually creating the resource.
Patterns and anti-patterns
Good: cross-field validation
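if r.Spec.HA && r.Spec.Size < 3 {
	return fmt.Errorf("HA mode requires size >= 3")
}

The structural OpenAPI schema alone cannot express this kind of cross-field rule. A CEL rule in x-kubernetes-validations can, and a webhook is a good fit when the check also needs Go logic or cluster state.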
Good: immutable fields
if old.Spec.StorageClass != new.Spec.StorageClass {
	return fmt.Errorf("storageClass is immutable")
}

Prevents disruptive changes that would require destroying and recreating workloads.
Bad: re-checking field-level rules
// BAD - duplicate of CRD schema
if r.Spec.Size < 0 {
	return fmt.Errorf("size must be >= 0")
}

If your CRD schema has minimum: 0, that check already failed at
stage 4 before your webhook ran. Don't duplicate.
Bad: external API calls in the validator
// BAD - blocks the API server during admission
resp, _ := http.Get("https://config-service/...")

Webhook calls are bounded by timeoutSeconds (10 s by default, 30 s at most). An external API call that takes 5 s is borderline; a slow one causes every matching admission to time out, and with failurePolicy: Fail that means a rejection. Use cached state instead: read from the informer cache if it is already populated (the webhook server has access to the controller-runtime client).
Bad: stateful mutation
// BAD - non-deterministic if called twice
r.Spec.UUID = uuid.NewString()

If the API server retries (e.g. on conflict), your webhook is called twice with different UUIDs. The behaviour is order-dependent. Fix: use a deterministic UUID derived from the object's metadata, or generate the UUID in the reconciler and not the webhook.
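One way to read "deterministic UUID derived from the object's metadata" is to hash the namespace and name. The sketch below does that with the standard library only; DeterministicID is a hypothetical helper, and the result is UUID-shaped but does not set RFC 4122 version bits. It also assumes metadata.name is already populated, which may not hold for objects created with generateName.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// DeterministicID derives a stable, UUID-shaped identifier from namespace and
// name. The same inputs always give the same ID, so repeated webhook calls
// for the same object agree with each other.
func DeterministicID(namespace, name string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + name))
	b := sum[:16] // the first 128 bits are enough for an identifier
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

func main() {
	first := DeterministicID("team-a", "cache-1")
	second := DeterministicID("team-a", "cache-1")
	fmt.Println(first)
	fmt.Println("stable across calls:", first == second)
}
```

The trade-off is that deleting and recreating an object with the same name yields the same ID; if that matters, generating the value in the reconciler is the safer of the two fixes.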
Monitoring webhooks
The metrics worth watching:
- controller_runtime_webhook_requests_total{code=...}: total webhook calls broken down by status. Errors here mean the API server is seeing your webhook fail.
- controller_runtime_webhook_latency_seconds_bucket: time spent in your webhook. p99 over 1 s should ring an alarm; over the configured timeoutSeconds is critical.
- controller_runtime_webhook_requests_in_flight: concurrent requests being handled. Spikes suggest a cluster operation that doesn't tolerate the webhook (a kubectl apply -f all.yaml).
For Prometheus wiring see operator metrics with Prometheus.
Common pitfalls
1. Forgetting to register the webhook in main.go
kubebuilder create webhook adds the SetupWebhookWithManager call
but only if main.go has the right block. If you have a custom
main.go, you must add it manually. Symptom: webhook never gets
called. Fix: check main.go has
(&cachev1alpha1.Memcached{}).SetupWebhookWithManager(mgr).
2. Webhook server cert not loaded
The webhook server fails to start because the cert files don't
exist or are unreadable. Symptom: readyz fails with
webhook-server check. Fix: ensure cert-manager
(or your cert provisioner) created the Secret and mounted it at
/tmp/k8s-webhook-server/serving-certs/, which is the path
controller-runtime's webhook server expects by default.
3. failurePolicy: Fail on a non-HA operator
Operator pod restarts (rolling deploy). For 5 seconds the webhook is
unavailable. The API server rejects every kubectl apply for the
managed kind. Users see "connection refused" errors. Fix: deploy
two replicas with leader election, so the webhook server is always
up.
4. Defaulter that depends on cluster state
Default() calls r.Get(ctx, namespaceKey, &ns) to look up the
namespace's default profile. The webhook server doesn't have a
client in scope; or worse, it does but the call blocks past
timeout. Fix: defaults should depend only on the object's own
fields. State-dependent decisions belong in the reconciler.
5. sideEffects: Unknown
Pre-v1 default. In v1 (since 1.16) it's invalid and the API server
rejects the configuration. Fix: declare sideEffects: None (almost
always correct).
Frequently Asked Questions
1. What is a Kubernetes admission webhook?
An HTTPS endpoint registered with the API server that intercepts API requests after authentication/authorization but before persistence. The API server calls the webhook with an AdmissionReview; the webhook responds with either "allowed" or "denied" (validating) or a JSON Patch (mutating). Webhooks let operators add defaults and validate invariants that CRD OpenAPI schemas cannot express.
2. What is the difference between mutating and validating webhooks?
Mutating webhooks can modify the object via JSON Patch - typical use is defaulting (filling in .spec.replicas: 1 if not set). They run first. Validating webhooks can only allow or deny - typical use is cross-field invariants (e.g. "replicas must be even if HA enabled"). They run after all mutators. Both run before persistence; both must complete within timeoutSeconds.
3. When should I use a webhook vs CRD OpenAPI validation?
Use CRD OpenAPI schema (or CEL validation in 1.25+) for field-level validation: types, regex, ranges. Use webhooks for: defaults that depend on other fields, cross-field invariants that are awkward to express in CEL, validation against cluster state ("this namespace must exist"), or generating fields the user did not provide.
4. How does kubebuilder scaffold a webhook?
kubebuilder create webhook --group cache --version v1alpha1 --kind Memcached --defaulting --programmatic-validation generates memcached_webhook.go with Default(), ValidateCreate(), ValidateUpdate(), ValidateDelete() methods. A SetupWebhookWithManager call in main.go registers the webhook with the manager. make manifests adds the Validating/MutatingWebhookConfiguration to config/webhook/.
5. What is failurePolicy and what should I set it to?
failurePolicy controls what happens when the webhook is unreachable or returns a 5xx. Fail (the default in admissionregistration.k8s.io/v1) - reject the API call. Ignore - let the call through as if no webhook existed. Validating webhooks: Fail is correct. Mutating webhooks doing critical mutation: Fail. Mutating webhooks that just default to nice values: Ignore is safer (the worst case is a non-defaulted object, not a rejected call).
6. What is sideEffects and why does it matter?
A required field declaring whether the webhook has side effects outside of the AdmissionReview response. None - safe for kubectl dry-run. NoneOnDryRun - has side effects but skips them on dry-run. Some - side effects always; only the removed v1beta1 API accepted it, and v1 configurations must use None or NoneOnDryRun. Most operator webhooks should be None (they only return a patch or allow/deny; they don't write external state).
7. How do I test a webhook locally?
Three approaches: (1) envtest with WebhookInstallOptions - controller-runtime spins up the webhook server during integration tests. (2) Port-forward the deployed webhook and curl AdmissionReview JSON to it directly. (3) Use kubectl apply --dry-run=server against a real cluster - the API server calls your webhook with dryRun: true.
8. How do I prevent webhooks from being unreachable during operator restart?
Run multiple operator replicas with leader election (webhook server runs on every replica, only reconcile is leader-only). Use failurePolicy: Ignore for non-critical webhooks. Set timeoutSeconds: 5 so the API server gives up quickly if the webhook is genuinely down. Test by killing one pod and confirming admissions still succeed via the other replica.
9. When should I use ValidatingAdmissionPolicy instead of a validating webhook?
ValidatingAdmissionPolicy (GA in Kubernetes 1.30) is a built-in CEL-based validator that runs inside the API server with no external webhook server. Prefer it whenever your validation logic can be expressed in CEL - cross-field checks, immutability, regex on string fields, list-length constraints. Keep a validating webhook only when you need Go code (cluster-state lookups, calls to an external API, complex conditional logic). VAP removes an entire failure mode: there is no webhook pod to be down, no cert to rotate, no timeout to tune.
10. What are objectSelector and namespaceSelector and when should I use them?
Both are label selectors on the WebhookConfiguration that scope which objects the webhook is called for. namespaceSelector filters by the namespace's labels (commonly used to opt kube-system and other privileged namespaces out, e.g. a matchExpressions entry with key kubernetes.io/metadata.name, operator NotIn, values [kube-system]). objectSelector filters by the target object's own labels (e.g. only call the webhook for objects with environment=prod). The default is an empty selector, which matches everything - dangerous if your webhook can be down. Both selectors are evaluated in addition to the rules block.
Summary
Admission webhooks are the operator pattern's way to enforce rules
that CRD schemas cannot express: cross-field validation, immutable
fields, intelligent defaults. kubebuilder scaffolds the boilerplate
with kubebuilder create webhook; you fill in Default(),
ValidateCreate(), and ValidateUpdate(). The configuration knobs
that matter are failurePolicy (Fail for validation, often Ignore
for defaulting), timeoutSeconds (5–10), and sideEffects (None
for almost every operator).
Done well, webhooks make your operator's API safer and easier to
use. Done poorly — long timeouts, side effects, failurePolicy: Fail
on a single-replica operator — they can take down API admission for
the entire managed kind. The next article covers the TLS plumbing
that webhooks require.
Further reading
- Custom Resource Definitions explained - the schema layer that runs between mutating and validating webhooks, including CEL x-kubernetes-validations for in-schema validation.
- CRD version upgrades and conversion webhooks - the conversion webhook, a third kind of webhook that converts between served and storage versions of a CRD.
- Kubernetes finalizers explained - what ValidateDelete() usually defers to, and why most operators leave the delete validator empty.
- Server-Side Apply in operators - how SSA's fieldManager model interacts with mutating webhooks that default fields.
- Operator RBAC: minimum permissions - the extra permissions a webhook adds to the operator's RBAC.
- Operator health and readiness probes - why the webhook-readiness check gates rolling deploys.
- Operator leader election explained - the HA story that keeps failurePolicy: Fail viable.
- External: admission webhook reference, ValidatingAdmissionPolicy reference, kubebuilder webhook guide, controller-runtime webhook package, cert-manager.

