Kubernetes Finalizers: Two-Phase Deletion, Cleanup Patterns, and Stuck Objects

A Kubernetes finalizer is a string on `metadata.finalizers` that tells the API server to keep an object alive (with `deletionTimestamp` set) until your controller has finished cleanup and removed the string. This complete guide covers the two-phase deletion lifecycle, the `controllerutil.AddFinalizer` / `RemoveFinalizer` pattern with copy-pastable Go code, naming conventions, multi-finalizer ordering, idempotent external-resource cleanup, observability (events and Conditions), envtest recipes, the `kubectl patch` force-delete escape hatch, and the stuck-object diagnoses that come up in every real Operator's production support queue.

Published May 31, 2026

Updated Jun 4, 2026

Author Deepak Prasad

Read time 18 min read

Reviewed Jun 4, 2026 by Deepak Prasad

If your Kubernetes Operator creates anything outside its own namespace — a cloud volume, a DNS record, an external SaaS API call, or even a Kubernetes resource in another cluster — it almost certainly needs a finalizer. Without one, a user runs kubectl delete mykind sample, Kubernetes immediately removes the object from etcd, your controller never gets a chance to clean up the side-effects, and you have just created an orphaned resource leak.

Finalizers are the canonical Kubernetes mechanism for safe cleanup, and they are conceptually simple — a string on metadata.finalizers that the API server refuses to ignore. The mechanics, on the other hand, are subtle: getting them wrong gives you stuck "Terminating" objects, hot reconcile loops on deletion, or worse, silent skipping of cleanup logic.

The two-phase deletion model. A kubectl delete is a request; the actual removal from etcd happens only after every finalizer has been cleared by its owning controller.

If you have not yet read the reconcile loop explained, that article is a prerequisite — finalizers reuse the same control loop you already know.

TL;DR — how finalizers work

A finalizer is a string in metadata.finalizers that protects an object from immediate deletion. The full lifecycle, in six steps:

User runs kubectl delete mykind sample.
API server sees a non-empty finalizers array, refuses to delete from etcd. Instead it stamps metadata.deletionTimestamp with the current time.
Every watch fires — your controller reconciles and notices that deletionTimestamp is set.
Controller runs cleanup logic (delete external DNS record, drop cloud disk, call vendor API, etc.).
Controller removes its finalizer string from metadata.finalizers and writes the object back.
API server sees finalizers is now empty and finally garbage-collects the object from etcd.

Between steps 2 and 6 the kubectl delete command itself blocks (it waits for finalizers by default), and anyone watching with kubectl get -w sees the object stuck in Terminating. That is correct behaviour: the object is being deleted, and the API server is just waiting for the controller to give it the go-ahead.
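The six steps can be modelled in a few lines of plain Go. The sketch below is a toy in-memory stand-in for the API server's deletion rule, not real Kubernetes code; the `store` and `object` types and their methods are hypothetical names invented for the illustration.

```go
package main

import (
    "fmt"
    "time"
)

// object is a toy stand-in for the metadata of a Kubernetes object.
type object struct {
    name              string
    finalizers        []string
    deletionTimestamp *time.Time
}

// store is a toy stand-in for etcd plus the API server's deletion rule.
type store struct {
    objects map[string]*object
}

// requestDelete models phase 1: an object with finalizers is only marked
// for deletion; an object without finalizers is removed at once.
func (s *store) requestDelete(name string) {
    obj, ok := s.objects[name]
    if !ok {
        return
    }
    if len(obj.finalizers) == 0 {
        delete(s.objects, name)
        return
    }
    if obj.deletionTimestamp == nil {
        now := time.Now()
        obj.deletionTimestamp = &now
    }
}

// removeFinalizer models phase 2: once the last finalizer is cleared on an
// object that is marked for deletion, the object is removed from the store.
func (s *store) removeFinalizer(name, finalizer string) {
    obj, ok := s.objects[name]
    if !ok {
        return
    }
    kept := obj.finalizers[:0]
    for _, f := range obj.finalizers {
        if f != finalizer {
            kept = append(kept, f)
        }
    }
    obj.finalizers = kept
    if obj.deletionTimestamp != nil && len(obj.finalizers) == 0 {
        delete(s.objects, name)
    }
}

// exists reports whether the object is still stored.
func (s *store) exists(name string) bool {
    _, ok := s.objects[name]
    return ok
}

func main() {
    s := &store{objects: map[string]*object{
        "sample": {name: "sample", finalizers: []string{"backups.acme.io/finalizer"}},
    }}

    s.requestDelete("sample")
    fmt.Println("after delete request, still present:", s.exists("sample"))

    s.removeFinalizer("sample", "backups.acme.io/finalizer")
    fmt.Println("after finalizer removal, still present:", s.exists("sample"))
}
```

The point of the model is that the delete request and the actual removal are separate events, and only the second one depends on the finalizer list being empty.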

This pattern is mandatory for any Operator that has external side-effects. For an Operator that only manages other Kubernetes resources, ordinary owner references and cascade deletion are usually enough.

Two-phase deletion in detail

The six steps above collapse into a simpler mental model: deletion is a two-phase process, not a single operation.

When a user runs:

bash

kubectl delete mykind sample

Kubernetes does not immediately remove the object from etcd. Instead, the API server first checks whether the object contains any entries in metadata.finalizers. If a finalizer exists, deletion is paused and the object enters a special Terminating state. This gives the owning controller an opportunity to perform any required cleanup before the object disappears permanently.

Concretely, before deletion the object looks like this:

yaml


metadata:
  finalizers:
  - backups.acme.io/finalizer

Immediately after kubectl delete backup sample, the object is still there — only with a deletionTimestamp stamped on it:

yaml


metadata:
  deletionTimestamp: "2026-06-01T12:34:56Z"
  finalizers:
  - backups.acme.io/finalizer

The API server has not removed anything from etcd; it has only marked the object for deletion and is now waiting for the controller to remove the finalizer.

The following diagram shows the complete lifecycle, from the object's creation through kubectl delete to the final etcd removal:

Kubernetes finalizer deletion flow: CR creation, finalizer registration, deletion request, deletionTimestamp, cleanup, finalizer removal, final etcd garbage collection

The eight-step lifecycle of a finalizer-protected Custom Resource, with the controller's cleanup phase highlighted.

The critical point is that kubectl delete is only a deletion request, not the actual deletion. Once deletionTimestamp is set:

The object remains visible through the API.
The object typically shows as Terminating.
Reconcile continues to run.
Cleanup logic can still update metadata and status.
The API server waits for all finalizers to be removed.

Only after every finalizer has been cleared does Kubernetes perform the final garbage collection step and remove the object from etcd.

Kubernetes itself uses finalizers

You already use finalizers indirectly every day.

Consider a PersistentVolume backed by a cloud disk.

When you delete the PersistentVolume object, Kubernetes does not immediately remove it from etcd.

Instead:

Kubernetes first deletes the underlying cloud disk.
It waits for confirmation that cleanup succeeded.
Only then does it remove the PersistentVolume object.

Without this protection, the PV could disappear while the expensive cloud disk continues running and generating costs.

A finalizer gives your Operator the same protection.

The API server delays deletion until your controller confirms that all external resources have been cleaned up.

Why owner references are not enough

A common question is:

If Kubernetes already has OwnerReferences and garbage collection, why do I need finalizers?

Owner references work only for Kubernetes resources that exist inside the cluster.

For example:

text


Memcached CR
    └── Deployment
    └── Service
    └── ConfigMap

Deleting the Memcached CR automatically deletes the child resources because Kubernetes garbage collection understands those objects.

However, Kubernetes cannot garbage-collect resources that exist outside the cluster:

text


Memcached CR
    └── AWS S3 Bucket
    └── Route53 DNS Record
    └── External Database

The API server has no way to delete those resources.

Finalizers allow the Operator to perform that cleanup before the CR disappears.

Rule of thumb:

Kubernetes resources → OwnerReferences.
External resources → Finalizers.
Many Operators use both together.

Deletion propagation policies and finalizers

kubectl delete accepts a --cascade flag that controls how children with ownerReferences are handled when the parent goes away. All three policies interact with finalizers, but each in a different way, and two of them (foreground and orphan) silently add a finalizer of their own.

`--cascade`	What happens to the parent	What happens to children	Interaction with finalizers
`background` (default)	Deleted immediately (subject to its finalizers)	Garbage-collected asynchronously	Your finalizer runs on the parent while children may still be alive
`foreground`	Stamped with `deletionTimestamp` and an implicit `foregroundDeletion` finalizer	Deleted first, blocking parent removal	The `foregroundDeletion` finalizer is cleared only after every child with a blocking `ownerReference` is gone; your own finalizers are still cleared by your controller, independently of it
`orphan`	Stamped with `deletionTimestamp` and an implicit `orphan` finalizer until the children are detached	Left alone, `ownerReferences` stripped	Parent finalizers still run; children become standalone objects

foregroundDeletion is a finalizer the kube-controller-manager owns. If you see it in metadata.finalizers and the object is stuck Terminating, the children are stuck, not your operator. Look at the child kinds before you blame your own cleanup code.

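Use foreground deletion when your cleanup logic must observe a fully-deleted child tree, for example an operator that mirrors child state to an external system and needs to confirm "no children remain" before tearing down the upstream representation. Foreground deletion alone does not hold back your own finalizer, so the cleanup code still has to verify that no children remain before it removes its string. For most operators the default background policy is correct.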
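When triaging a stuck object it helps to separate finalizers that belong to the garbage collector from those that belong to controllers. The sketch below assumes the two garbage-collector finalizer names are `foregroundDeletion` and `orphan` (the values of `metav1.FinalizerDeleteDependents` and `metav1.FinalizerOrphanDependents` in k8s.io/apimachinery); `splitFinalizers` is a hypothetical helper, not part of any library.

```go
package main

import "fmt"

// Finalizer names used by the garbage collector for foreground and orphan
// deletion. They are redeclared here so the sketch has no dependencies.
const (
    finalizerForeground = "foregroundDeletion"
    finalizerOrphan     = "orphan"
)

// splitFinalizers is a hypothetical triage helper: it separates finalizers
// owned by the garbage collector from those owned by other controllers.
func splitFinalizers(finalizers []string) (gcOwned, controllerOwned []string) {
    for _, f := range finalizers {
        switch f {
        case finalizerForeground, finalizerOrphan:
            gcOwned = append(gcOwned, f)
        default:
            controllerOwned = append(controllerOwned, f)
        }
    }
    return gcOwned, controllerOwned
}

func main() {
    gc, own := splitFinalizers([]string{"foregroundDeletion", "backups.acme.io/finalizer"})
    fmt.Println("waiting on garbage collector:", gc)
    fmt.Println("waiting on controllers:", own)
}
```

If the first list is non-empty, look at the child objects first; if only the second list is non-empty, look at the controllers named by those strings.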

The finalizer string - convention and naming

The string itself is opaque to Kubernetes - the API server only counts how many strings are in the array. The convention is to use a domain-style name your controller owns, so there is never a collision when multiple controllers finalize the same object:

text


acme.io/release-cleanup
backups.acme.io/finalizer
crossplane.io/composite-resource
finalizer.cert-manager.io

You may also see the historical short form mycontroller-finalizer in older operators - it works, but the domain-style form is the upstream-recommended convention and reads better in kubectl describe. Adopt it for new code.

Three properties to aim for when picking the string:

DNS-prefixed. Use the API group of your CRD as the prefix (cache.example.com/finalizer). This avoids collisions with other operators in the same cluster.
Specific. If your operator manages multiple distinct external resources (bucket + DNS + IAM), use one finalizer string per resource rather than a single catch-all. Your controller can then clear each string independently as the matching cleanup succeeds, and per-resource strings make kubectl describe self-documenting: s3.cache.example.com/bucket, dns.cache.example.com/record, iam.cache.example.com/role.
Stable. Renaming a finalizer is a migration headache - existing objects in production carry the old name and will not match your code's ContainsFinalizer(..., newName). The cleanup path silently skips them and the finalizer becomes permanent. Pick the right name on day 1.

A constant (or a small block of them) in your Go package keeps the spelling honest:

go


const (
    finalizerBucket = "s3.cache.example.com/bucket"
    finalizerDNS    = "dns.cache.example.com/record"
)
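A small unit-testable check can keep finalizer constants on the `domain/name` convention. The function below is a deliberately simplified, hypothetical check written for this illustration; it is not the API server's validation logic, and it is stricter than the convention requires (a name without a slash, such as finalizer.cert-manager.io, is flagged).

```go
package main

import (
    "fmt"
    "strings"
)

// looksDomainQualified reports whether a finalizer has the shape
// "<dns-like-prefix>/<name>". This is a simplified approximation of the
// naming convention, not the validation Kubernetes itself performs.
func looksDomainQualified(finalizer string) bool {
    prefix, name, found := strings.Cut(finalizer, "/")
    if !found || name == "" {
        return false
    }
    if strings.HasPrefix(prefix, ".") || strings.HasSuffix(prefix, ".") {
        return false
    }
    return strings.Contains(prefix, ".")
}

func main() {
    for _, f := range []string{
        "s3.cache.example.com/bucket",
        "dns.cache.example.com/record",
        "mycontroller-finalizer",
    } {
        fmt.Printf("%-32s domain-qualified: %v\n", f, looksDomainQualified(f))
    }
}
```

Running a check like this in a test over every finalizer constant in the package is one way to catch a misspelt or unprefixed name before it ships and becomes hard to rename.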

The full Reconcile() pattern

Important:

Adding or removing a finalizer modifies metadata.finalizers, which means you must use r.Update().

r.Status().Update() only updates the status subresource and will not persist finalizer changes.

Almost every controller with finalizers follows the same skeleton. With controllerutil helpers it fits on one screen:

go


import (
    "context"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    acmev1 "example.com/acme/api/v1" // assumed module path for the Backup API types
)

const myFinalizer = "backups.acme.io/finalizer"

func (r *BackupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var backup acmev1.Backup
    if err := r.Get(ctx, req.NamespacedName, &backup); err != nil {
        // The object is already gone from the API: nothing to reconcile.
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    if backup.GetDeletionTimestamp().IsZero() {
        // Live object: make sure the finalizer is persisted before any
        // external side-effect is created.
        if controllerutil.AddFinalizer(&backup, myFinalizer) {
            if err := r.Update(ctx, &backup); err != nil {
                return ctrl.Result{}, err
            }
        }
    } else {
        // Object is being deleted: clean up, then release the finalizer.
        if controllerutil.ContainsFinalizer(&backup, myFinalizer) {
            if err := r.cleanupExternalResources(ctx, &backup); err != nil {
                // Returning the error requeues with backoff. A RequeueAfter
                // set alongside a non-nil error would be ignored.
                return ctrl.Result{}, err
            }
            controllerutil.RemoveFinalizer(&backup, myFinalizer)
            if err := r.Update(ctx, &backup); err != nil {
                return ctrl.Result{}, err
            }
        }
        // Never fall through to the normal path for a deleting object.
        return ctrl.Result{}, nil
    }

    return r.reconcileNormal(ctx, &backup)
}

Five things to notice:

Add the finalizer on the first reconcile, before any external side-effect. If you create a DNS record first and the next Update fails, you have orphaned a DNS record before the finalizer is in place.
AddFinalizer and RemoveFinalizer return a bool - they tell you whether the slice was mutated. Use the return to avoid pointless Update writes that trigger fresh watch events.
Cleanup is on the same code path as RemoveFinalizer. If cleanup fails, return the error and re-reconcile; the finalizer stays in place and the object stays in Terminating.
Idempotency is mandatory. Cleanup will run again on the next reconcile if the RemoveFinalizer Update fails - your DNS-record deletion should tolerate "not found" gracefully.
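No new finalizers in the deletion branch. Once deletionTimestamp is set, the API server rejects any update that adds a finalizer (Forbidden: no new finalizers can be added if the object is being deleted). Other writes are still accepted, but changing .spec on an object that is about to disappear is pointless, so keep this branch limited to cleanup, status, and finalizer removal.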

For why the Update and not a separate Status().Update() - finalizers live in metadata, which the regular Update covers; the status subresource is for .status only. See status and conditions for the matching pattern on the status side.
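The "use the bool to skip pointless writes" point can be seen in isolation with a dependency-free model. The functions below mimic the mutate-and-report behaviour described above for `controllerutil.AddFinalizer` and `RemoveFinalizer`; they are local, hypothetical re-implementations on a plain slice, not the library code.

```go
package main

import "fmt"

// meta is a toy stand-in for object metadata.
type meta struct {
    finalizers []string
}

// containsFinalizer reports whether name is present.
func containsFinalizer(m *meta, name string) bool {
    for _, f := range m.finalizers {
        if f == name {
            return true
        }
    }
    return false
}

// addFinalizer appends name if it is missing and reports whether the
// slice was changed.
func addFinalizer(m *meta, name string) bool {
    if containsFinalizer(m, name) {
        return false
    }
    m.finalizers = append(m.finalizers, name)
    return true
}

// removeFinalizer drops name if it is present and reports whether the
// slice was changed.
func removeFinalizer(m *meta, name string) bool {
    kept := make([]string, 0, len(m.finalizers))
    for _, f := range m.finalizers {
        if f != name {
            kept = append(kept, f)
        }
    }
    changed := len(kept) != len(m.finalizers)
    m.finalizers = kept
    return changed
}

func main() {
    m := &meta{}
    writes := 0
    // Three reconciles of the same live object: only the first one
    // changes the slice, so only the first one needs a write.
    for i := 0; i < 3; i++ {
        if addFinalizer(m, "backups.acme.io/finalizer") {
            writes++ // stands in for r.Update(ctx, &backup)
        }
    }
    fmt.Println("API writes after three reconciles:", writes)
}
```

Without the guard, every reconcile would issue an Update, and every Update would generate a watch event that triggers another reconcile.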

An idempotent cleanup function

The cleanup function called from the deletion branch must tolerate being re-run any number of times. A workqueue retry, a controller restart, or a slow external API can all force the cleanup logic to execute again after it has already succeeded in part. Treat "already gone" upstream as a successful outcome:

go


func (r *MemcachedReconciler) cleanupExternalResources(
    ctx context.Context, mem *cachev1alpha1.Memcached,
) error {
    log := log.FromContext(ctx)

    if mem.Status.ExternalBucketName == "" {
        log.Info("no external bucket recorded - nothing to clean up")
        return nil
    }

    if err := r.s3.DeleteBucket(ctx, mem.Status.ExternalBucketName); err != nil {
        if isS3NotFound(err) {
            log.Info("bucket already deleted upstream",
                "bucket", mem.Status.ExternalBucketName)
            return nil
        }
        return fmt.Errorf("delete bucket %q: %w",
            mem.Status.ExternalBucketName, err)
    }

    log.Info("deleted external bucket", "bucket", mem.Status.ExternalBucketName)
    return nil
}

Two takeaways worth emphasising:

Record what you created on .status. The cleanup path cannot know which bucket to delete unless you persisted .status.ExternalBucketName (or equivalent) when you created the bucket. Operators that manage external resources always keep an inventory on the CR's own .status.
Map "404 not found" to success. Most upstream APIs raise an error when you try to delete a resource that no longer exists; that is the exact case where you want to continue and remove the finalizer.

Atomic ordering: cleanup → confirm → remove → update

A subtle but common bug is to call RemoveFinalizer before the cleanup result is known:

go


// BAD - finalizer removed first, cleanup result unverified
controllerutil.RemoveFinalizer(&mem, myFinalizer)
if err := r.cleanupExternalResources(ctx, &mem); err != nil {
    return ctrl.Result{}, err
}
return ctrl.Result{}, r.Update(ctx, &mem)

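If cleanupExternalResources fails, the function returns before Update, so the copy in etcd still holds the finalizer and the next reconcile retries. The in-memory mem, however, has already lost it, and any code that later persists that object (a deferred Update, for example) would drop the finalizer without cleanup having succeeded. The bug is easy to miss in unit tests. Always keep the ordering as: cleanup → confirm success → RemoveFinalizer → Update.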

Multi-finalizer ordering

Nothing stops several controllers from adding finalizers to the same object. cert-manager, ArgoCD, your in-house Operator, a Velero backup hook - each can append its own string. The API server's rule is simple:

An object is removed from etcd only when metadata.finalizers is the empty slice. The order of removal does not matter.

In practice you do not need to coordinate with other controllers. Each finalizer is independent - when your controller sees deletionTimestamp set, it runs its own cleanup and removes its own string. Other finalizers remain; the object stays Terminating until they too are cleared. The ordering you actually need to guarantee is within your controller:

go


// 1. Delete external DNS record. If this fails, return the error so the
//    workqueue retries with backoff; do not remove the finalizer.
if err := r.deleteExternalDNS(ctx, &backup); err != nil {
    return ctrl.Result{}, err
}

// 2. Delete cloud disk. If this fails, retry the same way; the finalizer stays.
if err := r.deleteCloudDisk(ctx, &backup); err != nil {
    return ctrl.Result{}, err
}

// 3. Only when *every* external side-effect is gone, remove the finalizer.
controllerutil.RemoveFinalizer(&backup, myFinalizer)
if err := r.Update(ctx, &backup); err != nil {
    return ctrl.Result{}, err
}

A common bug is to remove the finalizer before the cleanup completes - your operator looks neat, the object disappears quickly, and the cloud disk leaks into your bill.

If you genuinely need strict ordering across finalizers (DNS must be deleted before the IAM role), encode the ordering on a status field such as .status.cleanupPhase rather than relying on finalizer-array ordering - which the API server does not guarantee.

Making cleanup observable

A finalizer that fails silently is a debugging nightmare. Two practices turn a stuck Terminating into a self-explanatory ticket:

Surface cleanup progress on `.status` Conditions

Set the standard Ready Condition to False with a Cleaning reason while the deletion branch runs, so anyone running kubectl get sees the operator is actively trying:

go


meta.SetStatusCondition(&mem.Status.Conditions, metav1.Condition{
    Type:               "Ready",
    Status:             metav1.ConditionFalse,
    Reason:             "Cleaning",
    Message:            fmt.Sprintf("deleting bucket %q", mem.Status.ExternalBucketName),
    ObservedGeneration: mem.Generation,
})
if err := r.Status().Update(ctx, &mem); err != nil {
    return ctrl.Result{}, err
}

For the full Conditions convention — the four standard types, reason spelling, and observedGeneration rules — see status subresource and Conditions explained.

Emit Events on cleanup failures

Events are the on-call engineer's first hop after kubectl describe. Emit one whenever cleanup fails:

go


r.recorder.Eventf(&mem, corev1.EventTypeWarning,
    "CleanupFailed",
    "failed to delete external bucket %q: %v",
    mem.Status.ExternalBucketName, err,
)

That event surfaces in kubectl describe memcached memcached-sample and in any event-stream consumer (Slack notifier, Alertmanager). Without it, the only signal is operator logs - which nobody reads until something is already on fire.

Testing finalizer logic with envtest

The unit test most operators do write covers the happy reconcile. The path most operators forget is the finalizer cleanup, which is precisely the path that leaks money in production. With envtest (the controller-runtime test harness that spins up a real kube-apiserver and etcd in your test process) it is a 30-line Ginkgo spec:

go


It("deletes external resource and removes finalizer on CR delete", func() {
    cr := &cachev1alpha1.Memcached{ /* ... */ }
    Expect(k8sClient.Create(ctx, cr)).To(Succeed())

    // wait for the finalizer to be added
    Eventually(func() bool {
        _ = k8sClient.Get(ctx, key, cr)
        return controllerutil.ContainsFinalizer(cr, myFinalizer)
    }, "5s").Should(BeTrue())

    // simulate an external resource that was created earlier
    cr.Status.ExternalBucketName = "test-bucket"
    Expect(k8sClient.Status().Update(ctx, cr)).To(Succeed())

    // delete the CR
    Expect(k8sClient.Delete(ctx, cr)).To(Succeed())

    // assert the external resource was actually deleted by the operator
    Eventually(func() bool { return s3Mock.WasDeleted("test-bucket") }, "5s").
        Should(BeTrue())

    // and the CR is finally gone
    Eventually(func() bool {
        return apierrors.IsNotFound(
            k8sClient.Get(ctx, key, &cachev1alpha1.Memcached{}),
        )
    }, "5s").Should(BeTrue())
})

Deletion is asynchronous (the Delete call returns as soon as deletionTimestamp is set), so polling assertions with timeouts (Eventually(..., "5s")) is the pattern. This single spec catches the entire class of "finalizer leaks external resources" bugs that otherwise only surface in production support tickets.
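The idea behind a polling assertion is small enough to show with the standard library alone. The `eventually` function below is a hypothetical helper written for this illustration, not Gomega's implementation; it re-checks a condition until it holds or a deadline passes.

```go
package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

// Durations used by the demo; the values are arbitrary.
const (
    pollTimeout  = time.Second
    shortTimeout = 50 * time.Millisecond
    pollInterval = 5 * time.Millisecond
)

// eventually polls cond until it returns true or the timeout elapses.
// It reports whether the condition became true in time.
func eventually(cond func() bool, timeout, interval time.Duration) bool {
    deadline := time.Now().Add(timeout)
    for {
        if cond() {
            return true
        }
        if time.Now().After(deadline) {
            return false
        }
        time.Sleep(interval)
    }
}

func main() {
    // gone flips to true a little later, standing in for a controller
    // that finishes cleanup some time after the delete call returns.
    var gone atomic.Bool
    go func() {
        time.Sleep(20 * time.Millisecond)
        gone.Store(true)
    }()

    fmt.Println("gone immediately after the delete call:", gone.Load())
    fmt.Println("gone within the timeout:", eventually(gone.Load, pollTimeout, pollInterval))
}
```

A one-shot assertion made right after the delete call would race the controller; polling with a deadline is what makes the test deterministic.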

Force-deleting a stuck object

Common misconception: kubectl delete --force --grace-period=0 does not bypass finalizers. The --force --grace-period=0 combination only skips graceful pod termination: it tells the API server to immediately delete a pod without waiting for kubelet acknowledgement, and it is meaningless on any non-pod kind. Finalizers are unaffected: the object stays Terminating until they are cleared. The kubectl patch recipe below, which sets metadata.finalizers to an empty list, is the only built-in escape hatch.

When a controller is buggy, gone, or genuinely unable to clean up, you can patch the finalizers slice to empty. The API server will then garbage-collect the object immediately, side-effects be damned:

bash


kubectl patch <kind> <name> -n <ns> \
  -p '{"metadata":{"finalizers":[]}}' --type=merge

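Equivalent JSON Patch form. The command above is a JSON merge patch (--type=merge), which works whether or not the field is present; the replace operation below requires metadata.finalizers to already exist on the object: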

bash


kubectl patch <kind> <name> -n <ns> \
  --type json -p '[{"op":"replace","path":"/metadata/finalizers","value":[]}]'

Two warnings:

This skips your cleanup logic. Anything the controller would have deleted - cloud volumes, DNS, external secrets - is now orphaned. Clean it up manually before patching.
Confirm the controller is really not going to do this itself. On rare occasions the controller is alive and will shortly clean up on its own; in that case patching the finalizer races the controller and leaves cleanup half-finished.

For day-to-day operations the right move is to fix the controller and let it run; the patch is the last-resort escape hatch.
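If the same patch has to be produced from Go tooling, the payload can be built with `encoding/json`. The sketch below only constructs the JSON merge patch bodies; the function names are hypothetical, and how the bytes are sent (for example as a merge patch through whichever client you use) is left out on purpose.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// finalizerPatch mirrors the one field the merge patch touches.
type finalizerPatch struct {
    Metadata struct {
        Finalizers []string `json:"finalizers"`
    } `json:"metadata"`
}

// clearFinalizersPatch builds a merge patch that empties the list.
func clearFinalizersPatch() ([]byte, error) {
    var p finalizerPatch
    // Empty but non-nil, so it marshals as [] rather than null.
    p.Metadata.Finalizers = []string{}
    return json.Marshal(p)
}

// removeOneFinalizerPatch builds a merge patch that keeps every finalizer
// except name. A merge patch replaces the whole array, so the result can
// overwrite a finalizer that another controller adds concurrently.
func removeOneFinalizerPatch(current []string, name string) ([]byte, error) {
    var p finalizerPatch
    p.Metadata.Finalizers = []string{}
    for _, f := range current {
        if f != name {
            p.Metadata.Finalizers = append(p.Metadata.Finalizers, f)
        }
    }
    return json.Marshal(p)
}

func main() {
    all, err := clearFinalizersPatch()
    if err != nil {
        panic(err)
    }
    fmt.Println(string(all))

    one, err := removeOneFinalizerPatch(
        []string{"backups.acme.io/finalizer", "other.example.com/finalizer"},
        "backups.acme.io/finalizer",
    )
    if err != nil {
        panic(err)
    }
    fmt.Println(string(one))
}
```

Removing only the string that is known to be stuck is the gentler option: finalizers owned by healthy controllers keep protecting their own cleanup.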

Diagnosing a stuck Terminating object

The standard checklist when an object sits in Terminating for more than a few seconds:

Inspect the object metadata.
bash
```
kubectl get <kind> <name> -n <ns> -o jsonpath='{.metadata.finalizers}'
```
The string tells you exactly which controller is supposed to be doing cleanup.
Confirm the owning controller is running.
bash
```
kubectl -n <operator-ns> get pods
kubectl -n <operator-ns> logs <controller-pod> --tail=200
```
A crashlooping or off-cluster controller is the most common cause of stuck finalizers.
Check the controller's RBAC.

If the controller can get the object but not update it, finalizer removal fails on every reconcile and the object never leaves Terminating. Look for forbidden: lines in the logs. See operator RBAC: minimum permissions for the exact verbs every controller needs on its own CRs.
Look for events on the object.
bash
```
kubectl describe <kind> <name> -n <ns>
```
A well-written controller emits an Event each time cleanup fails (e.g. Warning CleanupFailed external DNS API returned 503).
Last resort: force-delete with kubectl patch (see above).

A useful one-liner for cluster-wide audits:

bash


kubectl get all -A -o json | jq -r \
  '.items[] | select(.metadata.deletionTimestamp != null) |
   "\(.kind)/\(.metadata.namespace)/\(.metadata.name) finalizers=\(.metadata.finalizers)"'

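This surfaces every object in the listed kinds that currently has deletionTimestamp set. Note that kubectl get all covers only a fixed set of built-in kinds (pods, services, deployments, and similar), so custom resources have to be queried by kind.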
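To narrow such an audit to objects that have been deleting for longer than some threshold, the JSON output can be post-processed. The sketch below parses the list shape produced by `kubectl get ... -o json` with the standard library; the `stuck` function, the embedded sample data, and the ten-minute threshold are assumptions for the illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
    "time"
)

// item holds the few fields of a listed object that the audit needs.
type item struct {
    Kind     string `json:"kind"`
    Metadata struct {
        Namespace         string     `json:"namespace"`
        Name              string     `json:"name"`
        DeletionTimestamp *time.Time `json:"deletionTimestamp"`
        Finalizers        []string   `json:"finalizers"`
    } `json:"metadata"`
}

type list struct {
    Items []item `json:"items"`
}

// Hypothetical sample input, shaped like "kubectl get ... -o json" output.
const sampleJSON = `{"items":[
  {"kind":"Backup","metadata":{"namespace":"prod","name":"old","deletionTimestamp":"2026-06-01T12:00:00Z","finalizers":["backups.acme.io/finalizer"]}},
  {"kind":"Backup","metadata":{"namespace":"prod","name":"recent","deletionTimestamp":"2026-06-01T12:29:00Z","finalizers":["backups.acme.io/finalizer"]}},
  {"kind":"Backup","metadata":{"namespace":"prod","name":"live"}}
]}`

var sampleNow = time.Date(2026, time.June, 1, 12, 30, 0, 0, time.UTC)

const threshold = 10 * time.Minute

// stuck returns one line per object whose deletionTimestamp is at least
// olderThan in the past, measured from now.
func stuck(raw []byte, now time.Time, olderThan time.Duration) ([]string, error) {
    var l list
    if err := json.Unmarshal(raw, &l); err != nil {
        return nil, fmt.Errorf("parse object list: %w", err)
    }
    var out []string
    for _, it := range l.Items {
        ts := it.Metadata.DeletionTimestamp
        if ts == nil || now.Sub(*ts) < olderThan {
            continue // live, or deleting only briefly
        }
        out = append(out, fmt.Sprintf("%s/%s/%s finalizers=%v",
            it.Kind, it.Metadata.Namespace, it.Metadata.Name, it.Metadata.Finalizers))
    }
    return out, nil
}

func main() {
    lines, err := stuck([]byte(sampleJSON), sampleNow, threshold)
    if err != nil {
        panic(err)
    }
    for _, line := range lines {
        fmt.Println(line)
    }
}
```

With the sample data only the object named old is reported: it has been deleting for 30 minutes, while recent has been deleting for one minute and live is not being deleted at all.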

The anti-patterns that ship to production

Adding the finalizer after creating the external side-effect. If the subsequent Update fails or the controller crashes, the side-effect is orphaned. Fix: add the finalizer first, on the very first reconcile, before any external write.
Removing the finalizer in a defer. A panic inside cleanup would still remove the finalizer, skipping the cleanup. Fix: remove the finalizer only on the explicit success path.
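Not handling cleanup idempotency. External APIs return 404 when a resource is already gone; treat 404 as success in cleanup code. Fix: check for not-found on every external delete, using your SDK's own not-found test (such as the isS3NotFound helper above) for external APIs and apierrors.IsNotFound(err) for Kubernetes objects, and return only the remaining errors.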
Calling r.Status().Update() in the deletion branch and expecting finalizers to come along. Finalizers live in metadata, not in status; the status subresource does not propagate metadata.finalizers changes. Fix: use r.Update() for finalizer-changing writes.
Forgetting to update tests. Unit tests that pass with a controller that ignores finalizers are doing zero work. Fix: write a test that sets deletionTimestamp on the fixture object and asserts that the controller calls cleanup and clears the finalizer.
Namespace-scoped CR holding a cluster-scoped external resource. Deleting the namespace tries to delete the CR, but the operator pod lives in a different namespace that may already be gone - the finalizer can never be removed, and the entire namespace gets stuck in Terminating. Fix: if your CR's cleanup target is cluster-scoped (an IAM role, a global DNS record, a cluster-wide CRD), make the CR itself cluster-scoped, or run the controller in a namespace that survives target-namespace teardown.
Renaming the finalizer string after release. controllerutil.ContainsFinalizer(..., "new/name") returns false on every CR in production. Cleanup is silently skipped and the old finalizer becomes permanent (the new code never removes it; the old code is gone). Fix: keep the original name, or ship a one-shot migration controller that rewrites finalizer names before the rename lands.
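For the rename case, the migration logic itself is a small slice rewrite. The sketch below is one possible shape, with hypothetical function names; it works on a plain slice of finalizer strings so it can be tested without a cluster. Since the API server rejects new finalizers on an object that is already being deleted, the rewrite only makes sense on live objects, and the cleanup path should recognise both names during the transition.

```go
package main

import "fmt"

// renameFinalizer returns a copy of finalizers with oldName replaced by
// newName, plus whether anything changed. If newName is already present,
// oldName is simply dropped so the result has no duplicates.
func renameFinalizer(finalizers []string, oldName, newName string) ([]string, bool) {
    hasNew := false
    for _, f := range finalizers {
        if f == newName {
            hasNew = true
        }
    }
    out := make([]string, 0, len(finalizers))
    changed := false
    for _, f := range finalizers {
        if f != oldName {
            out = append(out, f)
            continue
        }
        changed = true
        if !hasNew {
            out = append(out, newName)
            hasNew = true
        }
    }
    return out, changed
}

// ownsAny reports whether any of the given names is present. The deletion
// branch can use it to honour both the old and the new finalizer name.
func ownsAny(finalizers []string, names ...string) bool {
    for _, f := range finalizers {
        for _, n := range names {
            if f == n {
                return true
            }
        }
    }
    return false
}

func main() {
    current := []string{"mycontroller-finalizer", "other.example.com/finalizer"}
    migrated, changed := renameFinalizer(current, "mycontroller-finalizer", "backups.acme.io/finalizer")
    fmt.Println(migrated, changed)
    fmt.Println(ownsAny(current, "mycontroller-finalizer", "backups.acme.io/finalizer"))
}
```

The returned bool plays the same role as in the add and remove helpers: write the object back only when the slice actually changed.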

Frequently Asked Questions

1. What is a finalizer in Kubernetes?

A finalizer is a string that lives in an object's metadata.finalizers array. While at least one finalizer is present, the API server refuses to remove the object from etcd even after the user calls kubectl delete. Instead the server stamps metadata.deletionTimestamp, and the object stays alive in a "Terminating" state until every controller that owns a finalizer has run its cleanup and removed its string.

2. How do Kubernetes finalizers work?

Deletion is two-phase. Phase 1: a user runs kubectl delete, the API server stamps deletionTimestamp on the object and refuses to garbage-collect it. Phase 2: each owning controller reconciles, performs its cleanup, then removes its finalizer string from metadata.finalizers. When the array reaches length zero, the API server finally removes the object from etcd.

3. When should I add a finalizer to my operator?

Add a finalizer when the operator must clean up something that Kubernetes garbage collection cannot reach: an external cloud resource (S3 bucket, RDS instance), a DNS record, an IAM role, a TLS certificate at a CA, or a cluster-wide resource that is not owned by the CR. If the only "child" your operator creates is a Deployment in the same namespace, you do NOT need a finalizer - owner references and the GC are enough.

4. Where exactly do I add the finalizer in the reconcile loop?

After fetching the CR with r.Get(...) and BEFORE doing any work that creates external resources. Use controllerutil.AddFinalizer(&cr, finalizerName) and immediately persist with r.Update(ctx, &cr). If the persist fails, return the error and let the workqueue retry - the finalizer must be on the object in etcd before you create the external resource.

5. How do I detect that the CR is being deleted?

Check !cr.GetDeletionTimestamp().IsZero(). If it is non-zero, the user has issued kubectl delete, the API server has marked the object for deletion, and your reconciler should run the cleanup path (delete external resources, then call RemoveFinalizer). If it is zero, run the normal "ensure desired state" path.

6. What does the cleanup path look like?

Three steps: (1) try to delete the external resource idempotently - if it already does not exist, treat as success; (2) check that cleanup succeeded - if it did not, return the error so the workqueue retries; (3) call controllerutil.RemoveFinalizer(&cr, finalizerName) and r.Update(ctx, &cr). The API server then deletes the CR from etcd.

7. What is the naming convention for finalizers?

A finalizer string should be a fully-qualified domain-style name owned by your controller, e.g. backups.acme.io/finalizer or acme.io/release-cleanup. The convention prevents collisions when an object is acted on by multiple controllers - each one writes and removes only its own string. If your operator manages multiple distinct external resources (bucket + DNS + IAM), use distinct finalizer strings per resource, for example s3.cache.example.com/bucket and dns.cache.example.com/record.

8. How do I add a finalizer in controller-runtime?

Use the helpers in sigs.k8s.io/controller-runtime/pkg/controller/controllerutil: controllerutil.AddFinalizer(&obj, "acme.io/finalizer") mutates the in-memory object, then call r.Update(ctx, &obj) to persist. The helper is a no-op if the finalizer is already present, so it is safe to call on every reconcile.

9. How do I handle multiple finalizers safely?

Each controller manages its own finalizer string. The API server waits for ALL strings to be removed before deleting the object, but the order of removal is not guaranteed. Each cleanup must be independent - do not assume the others have run or not run. If you genuinely need ordering (DNS must go before IAM), encode it on .status (phase transitions) rather than relying on finalizer-array order.

10. How do I test finalizer logic?

Use envtest to spin up a real API server: create a CR and assert the finalizer is added; delete the CR and assert deletionTimestamp is set; let the controller run cleanup and assert the CR is gone. kubectl delete is asynchronous, so polling assertions with timeouts (Eventually) is the pattern. A single envtest spec catches the entire class of "finalizer leaks external resources" bugs.

11. Why is my Kubernetes resource stuck in Terminating?

The object has deletionTimestamp set but at least one entry remains in metadata.finalizers. The owning controller has not removed its string - either because the controller is down, the cleanup logic is failing in a loop, or RBAC prevents the controller from updating the object. Diagnose with kubectl describe and kubectl logs of the controller.

12. How do I force-delete a stuck Kubernetes object?

Patch the finalizers array to empty - this bypasses cleanup entirely: kubectl patch <kind> <name> -n <ns> -p '{"metadata":{"finalizers":[]}}' --type=merge. This is destructive - whatever the controller would have cleaned up (external DNS records, cloud volumes, child resources in other namespaces) will be leaked. Use it only when the cluster is being destroyed or when you have independently cleaned up the side-effects.

13. What is the difference between owner references and finalizers?

Owner references drive cascade deletion - when the owner is deleted, Kubernetes garbage-collects every object that names it in metadata.ownerReferences. Finalizers drive cleanup deletion - they keep the owner itself alive while a controller runs custom cleanup logic (e.g. removing external DNS records, deleting cloud disks, calling third-party APIs). The two features are independent and usually used together.

14. What happens if a finalizer is never removed?

If a controller never removes its finalizer, the object remains in the Terminating state indefinitely. Kubernetes will not delete the object from etcd until every finalizer has been removed. The usual causes are controller crashes, RBAC issues, cleanup failures, or bugs in the finalizer logic.

15. What is the difference between --cascade=foreground and --cascade=background?

Background (the default) deletes the parent immediately and garbage-collects children asynchronously - your finalizer runs on the parent while children may still be alive. Foreground stamps the parent with deletionTimestamp and adds an implicit foregroundDeletion finalizer, which the garbage collector clears only after every child with a blocking ownerReference has been deleted; your own finalizers are still cleared by your controller, independently of it. Use foreground when your cleanup logic must observe a fully-deleted child tree, e.g. when mirroring child state to an external system, and have the cleanup verify that no children remain. The third policy, orphan, strips ownerReferences from the children before the parent is removed, leaving them as standalone objects.

16. Does kubectl delete --force --grace-period=0 bypass finalizers?

No. The --force --grace-period=0 combination only skips graceful pod termination - it does not touch finalizers. An object with at least one finalizer in metadata.finalizers stays in the Terminating state until those finalizers are removed by their owning controllers (or until you patch the array to empty with kubectl patch ... --type=merge -p '{"metadata":{"finalizers":[]}}'). The --force flag is a pod-only escape hatch and is irrelevant to the finalizer mechanism.

What's next?

You now have safe deletion handled. Natural next reads:

Status subresource and Conditions explained — the symmetric mechanism for writing observed state back to the API server, with the KEP-1623 Ready / Progressing / Available / Degraded convention used in the observability section above.
The Kubernetes reconcile loop explained — the level-triggered control loop that calls your Reconcile() in the first place, and the three Result return paths your finalizer logic relies on.
Server-Side Apply in operators — why finalizer updates use r.Update() rather than r.Patch(), and how SSA changes the conflict story for the rest of the reconcile loop.
Watches, events, and predicates — why DeleteEvent is not a reliable cleanup trigger on its own and why you need finalizers even with a custom watch predicate.
Owner references and garbage collection — when ownerReferences alone are enough and finalizers are unnecessary.
Operator RBAC: minimum permissions — the most common cause of a stuck finalizer is a missing update verb on the CR; this is the article that prevents it.
Custom Resource Definitions explained — the schema the finalizers are guarding.

Summary

A finalizer is Kubernetes' safety mechanism for preventing resource leaks during deletion.

Instead of deleting an object immediately, the API server pauses deletion by setting deletionTimestamp and waits for every controller listed in metadata.finalizers to complete its cleanup work.

For Kubernetes-native resources, owner references and garbage collection are usually sufficient.

For anything outside the cluster—cloud volumes, DNS records, databases, SaaS APIs, certificates, or infrastructure services—a finalizer is the difference between clean deletion and an orphaned resource.

The production pattern is simple:

Add the finalizer before creating external resources.
Detect deletionTimestamp during reconciliation.
Perform idempotent cleanup.
Remove the finalizer only after cleanup succeeds.
Let Kubernetes complete the deletion.

Get this pattern right once, and every deletion your Operator performs becomes safe, predictable, and recoverable.
