Kubernetes Operator vs Controller vs CRD: The Difference Explained (with Code)

"Operator", "Controller", and "CRD" are three of the most-confused terms in the entire Kubernetes ecosystem. Tutorials use them interchangeably, job interviews ask trick questions about them, and even seasoned engineers get the distinctions wrong. The truth is simple: they are three different things that often appear together, and once you see the pattern you can never un-see it.

If you have not yet read What is a Kubernetes Operator?, start there. This article assumes you already know what an Operator does and now want to pin down where the lines are drawn.

A quick kitchen analogy

Before we touch any YAML, picture a restaurant kitchen:

A menu lists what dishes can be ordered. It is just paper — it does not cook anything.
A chef reads orders and cooks them. Without a menu, the chef has nothing to make. Without a chef, the menu is just decoration.
An Italian specialist chef is a chef who only handles Italian dishes and knows the exact subtleties (al dente pasta timing, how to balance an arrabbiata sauce, when fresh basil goes in versus dried).

Map that to Kubernetes:

Restaurant	Kubernetes
Menu (the list of dishes)	CRD — declares what custom objects exist
Chef (any chef who reads orders and cooks)	Controller — watches the API server and acts
Italian specialist chef (knows one cuisine deeply)	Operator — a controller that knows one application deeply

[Figure: the restaurant kitchen analogy. The menu is the CRD (data only, no behaviour), the generic chef is the Controller (any cook who reads orders and acts), and the Italian specialist chef is the Operator (knows one cuisine deeply). CRD + Controller + domain knowledge = Operator.]

That's the whole framing. The rest of this article makes it precise.

The 60-second answer

Here are the three terms in one sentence each:

A CRD (Custom Resource Definition) is a YAML schema that teaches the Kubernetes API server about a new object type. It is data, not code.
A Controller is a program that runs a control loop — watch, compare, act — to drive the actual state of one or more resources toward their desired state. The official definition lives in the Kubernetes docs.
An Operator is a controller that uses one or more CRDs to manage a specific application (Postgres, Kafka, Prometheus), and that encodes the operational know-how of running that application as code.

The relationship in one line:

Controller + CRD + Application-domain knowledge = Operator.

Or, in the most-quoted Stack Overflow phrasing:

"All Operators use the controller pattern, but not all controllers are Operators. It's only an Operator if it has: controller pattern + API extension + single-application focus."

CRD — the API contract (data, no behaviour)

A Custom Resource Definition adds a new resource type to the Kubernetes API. Once installed, the API server accepts, validates, and stores instances of that type just like a built-in Deployment or Service.

A trimmed CRD looks like this:

yaml


apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.acme.io
spec:
  group: acme.io
  scope: Namespaced
  names:
    kind: PostgresCluster
    plural: postgresclusters
    singular: postgrescluster
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas: { type: integer }
                version:  { type: string }

After kubectl apply -f you can run kubectl get postgresclusters and the API responds — the type exists. But nothing in the cluster reacts to a new PostgresCluster. No Pods are created, no Service is exposed, no backup is scheduled. A CRD on its own is just a typed slot in etcd.
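To make "data, not code" concrete, here is a minimal Go sketch of what accepting a custom resource amounts to: validate the manifest against a typed schema, then store it. The `store` type and `admit` method are hypothetical stand-ins for the API server and etcd, not real Kubernetes APIs, and the schema check is reduced to JSON type checking.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PostgresClusterSpec mirrors the two fields declared in the CRD schema above.
type PostgresClusterSpec struct {
	Replicas int    `json:"replicas"`
	Version  string `json:"version"`
}

// store is a toy stand-in for etcd: object name -> validated spec.
type store map[string]PostgresClusterSpec

// admit validates the raw manifest against the typed schema and stores it.
// It creates no Pods, Services, or backups; that would be a controller's job.
func (s store) admit(name string, raw []byte) error {
	var spec PostgresClusterSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return fmt.Errorf("rejected %s: %w", name, err)
	}
	s[name] = spec
	return nil
}

func main() {
	s := store{}
	// A well-typed object is accepted and stored.
	fmt.Println(s.admit("db1", []byte(`{"replicas": 3, "version": "16"}`)))
	// A string where the schema says integer is rejected.
	fmt.Println(s.admit("db2", []byte(`{"replicas": "three", "version": "16"}`)))
	fmt.Println("stored objects:", len(s))
}
```

Nothing in this sketch ever reacts to the stored object, which is the whole point: validation and storage are all a CRD gives you.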

Try it yourself: apply the CRD above on a Minikube cluster, create a PostgresCluster resource, then list it. The object will be there, happy as a clam, doing nothing. That is exactly what a CRD-only design feels like.

That fact alone unlocks a real architectural choice: many teams ship CRDs purely as a typed configuration interface consumed by other tooling such as Argo CD, Flux, or Crossplane. That is a CRD-only design: perfectly legitimate, but not an Operator.

A deeper walk-through of writing CRDs (printer columns, structural schemas, conversion webhooks) lives in Custom Resource Definitions explained.

Controller — the active reconciler

A Kubernetes Controller is the active part. It runs as a Pod inside the cluster (or, briefly, on a developer laptop during development) and executes a never-ending control loop:

Watch the API server for changes to specific resource types.
Compare the desired state (.spec) against the actual state.
Act to close the gap by creating, updating, or deleting other resources.
Update status (.status) so users and other controllers can observe.
Repeat — the loop never exits.
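The steps above can be sketched in plain Go without any Kubernetes libraries. This is a toy model under stated assumptions: `Cluster` and `reconcile` are hypothetical names, the "API server" is just a struct in memory, and a real controller would be driven by a watch rather than a fixed list of changes.

```go
package main

import "fmt"

// Cluster is a toy stand-in for the API server: desired state (.spec)
// and actual state live side by side.
type Cluster struct {
	Desired int      // replicas requested in .spec
	Pods    []string // Pods that currently exist
	Status  string   // what the controller last observed (.status)
	nextID  int
}

// reconcile runs one pass of compare -> act -> update status.
// It reads the full current state each time instead of relying on
// knowing which event happened.
func reconcile(c *Cluster) {
	for len(c.Pods) < c.Desired { // too few: create
		c.nextID++
		c.Pods = append(c.Pods, fmt.Sprintf("pod-%d", c.nextID))
	}
	if len(c.Pods) > c.Desired { // too many: delete the newest
		c.Pods = c.Pods[:c.Desired]
	}
	c.Status = fmt.Sprintf("%d/%d ready", len(c.Pods), c.Desired)
}

func main() {
	c := &Cluster{Desired: 3}
	changes := []func(){
		func() {},                      // initial sync
		func() { c.Pods = c.Pods[:1] }, // two Pods crash
		func() { c.Desired = 2 },       // user scales down
	}
	// A real controller loops forever on a watch; three rounds are enough here.
	for _, change := range changes {
		change()
		reconcile(c)
		fmt.Println(c.Pods, c.Status)
	}
}
```

Note that `reconcile` never asks what changed. It only compares desired and actual state, which is why the same function handles a crash and a scale-down.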

If that loop sounds familiar, that's because most of the moving pieces in Kubernetes are controllers. The cluster you are running right now contains dozens of built-in ones, most of them bundled into the kube-controller-manager process:

Built-in controller	What it reconciles
ReplicaSet controller	Manages Pods to match the replica count
Job / CronJob controller	Runs Pods to completion / on a schedule
DaemonSet controller	Ensures one Pod on every eligible node
Node controller	Tracks node health, evicts Pods from unhealthy nodes
Namespace controller	Garbage-collects deleted namespaces

None of these is an Operator: they reconcile generic primitives, not a particular application. They are controllers, full stop. When you create a Deployment that asks for 3 replicas, the Deployment controller creates a ReplicaSet and the ReplicaSet controller creates the Pods. Neither knows or cares whether you are running NGINX, a microservice, or a hello-world Pod; together they just make sure 3 Pods exist.

A custom controller is any controller you write, deployed outside kube-controller-manager. It might:

Watch built-in resources (a controller that triggers a rolling restart of a Deployment whenever a ConfigMap it references changes).
Watch custom resources (in which case it becomes an Operator once it also encodes application-specific knowledge).
Watch a mix (a policy controller that ensures every Namespace has a ResourceQuota — see resource quotas).
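As a rough illustration of the last case, the core of a "every Namespace needs a ResourceQuota" policy controller is a small comparison with no CRD involved. The sketch below is stdlib-only Go; `missingQuotas` is a hypothetical helper, and the in-memory slice and map stand in for what a real controller would list from the API server.

```go
package main

import (
	"fmt"
	"sort"
)

// missingQuotas returns the namespaces that have no ResourceQuota yet,
// sorted so the result is deterministic.
func missingQuotas(namespaces []string, hasQuota map[string]bool) []string {
	var missing []string
	for _, ns := range namespaces {
		if !hasQuota[ns] {
			missing = append(missing, ns)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	namespaces := []string{"team-b", "team-a", "kube-system"}
	hasQuota := map[string]bool{"kube-system": true}

	for _, ns := range missingQuotas(namespaces, hasQuota) {
		// A real controller would create a ResourceQuota through the API here.
		hasQuota[ns] = true
		fmt.Println("created default quota in", ns)
	}
}
```

The controller acts only on built-in resources and knows nothing about any particular application, so it is a custom controller but not an Operator.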

Operator — controller + CRD + domain knowledge

An Operator is the specific case where all three ingredients show up:

You define a CRD that represents an application or one of its building blocks (PostgresCluster, KafkaTopic, Certificate).
You ship a custom controller that watches that CRD.
The controller code contains operational expertise — what the on-call DBA used to do at 3 a.m.: failover, replica re-attach, backup schedule, schema upgrade, version-skew handling.

That third ingredient is the one most "what is an operator?" articles gloss over, and it is the most important one. Two operators can have the same CRD shape and the same reconcile-loop skeleton, but only the one that encodes how the application actually behaves can be trusted with failover and upgrades.
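To give a feel for what "operational expertise as code" can look like, here is a hedged Go sketch of one failover decision: choosing which standby to promote. The `Replica` type, its fields, and the "least lag among healthy replicas" rule are illustrative assumptions, not the logic of any particular Postgres Operator.

```go
package main

import "fmt"

// Replica is a hypothetical view of one Postgres standby.
type Replica struct {
	Name     string
	Healthy  bool
	LagBytes int64 // how far the standby is behind the failed primary
}

// pickPromotionCandidate returns the healthy replica with the least lag.
// ok is false when no replica is safe to promote.
func pickPromotionCandidate(replicas []Replica) (best Replica, ok bool) {
	for _, r := range replicas {
		if !r.Healthy {
			continue // never promote an unhealthy standby
		}
		if !ok || r.LagBytes < best.LagBytes {
			best, ok = r, true
		}
	}
	return best, ok
}

func main() {
	replicas := []Replica{
		{Name: "pg-1", Healthy: true, LagBytes: 4096},
		{Name: "pg-2", Healthy: false, LagBytes: 0},
		{Name: "pg-3", Healthy: true, LagBytes: 512},
	}
	if r, ok := pickPromotionCandidate(replicas); ok {
		fmt.Println("promote", r.Name) // promote pg-3
	}
}
```

A generic controller has no reason to contain a function like this. An Operator is largely a collection of them.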

Real-world Operators you have probably already used:

Prometheus Operator — ServiceMonitor, PodMonitor, PrometheusRule CRDs. Knows how Prometheus expects to be configured and reloaded.
cert-manager — Issuer, Certificate, Order CRDs. Knows the ACME protocol, DNS-01 vs HTTP-01 challenges, and when to renew a cert (30 days before expiry, not on expiry day).
CloudNativePG — Cluster CRD. Knows Postgres streaming replication, base-backup timing, point-in-time recovery, and which replicas are safe to promote.
Strimzi — Kafka, KafkaTopic, KafkaUser CRDs. Knows the Kafka broker config surface, ZooKeeper-to-KRaft migration, and rolling-update ordering.

The first Operator was the etcd-operator from CoreOS in 2016, which is also how the pattern got its name (see What is a Kubernetes Operator? for the full history).

All three pieces, in one Go skeleton

If you don't write Go yet, skim this section; the shape of the code is what matters. We will come back to Go in detail in the scaffold your first operator project guide, and if Go is not your language you can also write operators in Python (Kopf) or wrap a Helm chart with Operator SDK.

Here is the smallest possible end-to-end picture using controller-runtime — the same library Operator SDK and Kubebuilder generate against.

1. The CRD type (Go struct → CRD YAML via controller-gen):

go


// api/v1/postgrescluster_types.go
package v1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

type PostgresClusterSpec struct {
    Replicas int32  `json:"replicas"`
    Version  string `json:"version"`
}

type PostgresClusterStatus struct {
    ReadyReplicas int32  `json:"readyReplicas"`
    Phase         string `json:"phase"` // Pending / Ready / Degraded
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
type PostgresCluster struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              PostgresClusterSpec   `json:"spec,omitempty"`
    Status            PostgresClusterStatus `json:"status,omitempty"`
}

2. The Controller (the reconcile loop):

go


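// controllers/postgrescluster_controller.go
package controllers

import (
    "context"
    "time"

    appsv1 "k8s.io/api/apps/v1"
    "k8s.io/apimachinery/pkg/runtime"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    acmev1 "acme.io/postgres-operator/api/v1" // hypothetical module path for the types above
)
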
type PostgresClusterReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

func (r *PostgresClusterReconciler) Reconcile(
    ctx context.Context, req ctrl.Request,
) (ctrl.Result, error) {

    var cluster acmev1.PostgresCluster
    if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // ----- DOMAIN KNOWLEDGE LIVES HERE -----
    // Build the desired StatefulSet from .spec.
    // Compare to the existing one. CreateOrUpdate.
    // Check primary health; promote a replica if needed.
    // Schedule a base backup if the schedule says so.
    // Update .status with observed state.
    // -----------------------------------------

    return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}

func (r *PostgresClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&acmev1.PostgresCluster{}).      // primary resource: reconcile when the CR changes
        Owns(&appsv1.StatefulSet{}).         // also reconcile when a StatefulSet owned by the CR changes
        Complete(r)
}
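The "Update .status with observed state" step in the reconcile body above is easy to sketch without a cluster. The mapping below from replica counts to the Pending / Ready / Degraded phases is an assumption made for illustration (the types only name the phases, not the rule), and `phaseFor` is a hypothetical helper.

```go
package main

import "fmt"

// phaseFor derives a coarse phase from observed and desired replica counts.
// Assumed rule: all replicas ready -> Ready, none ready -> Pending,
// anything in between -> Degraded.
func phaseFor(ready, desired int32) string {
	switch {
	case desired > 0 && ready >= desired:
		return "Ready"
	case ready == 0:
		return "Pending"
	default:
		return "Degraded"
	}
}

func main() {
	for _, ready := range []int32{0, 2, 3} {
		fmt.Println(ready, phaseFor(ready, 3))
	}
}
```

In a real reconciler the result would be written to `cluster.Status.Phase` and persisted through the status subresource. Keeping the rule in a pure function like this makes it straightforward to unit test.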

3. Wire it together (main.go):

go


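mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme})
if err != nil {
    log.Fatal(err)
}
r := &controllers.PostgresClusterReconciler{Client: mgr.GetClient(), Scheme: mgr.GetScheme()}
if err = r.SetupWithManager(mgr); err == nil {
    err = mgr.Start(ctrl.SetupSignalHandler()) // blocks until SIGINT/SIGTERM
}
if err != nil {
    log.Fatal(err)
}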

That entire bundle — the PostgresCluster CRD + the PostgresClusterReconciler controller + the application-specific logic inside Reconcile() — is the Operator. Pull out any one piece and the name no longer fits:

Remove the CRD → you have a controller (likely one that mutates built-in resources).
Remove the controller → you have a CRD (a typed config surface).
Remove the Postgres-specific backup/failover/replica logic → you have a generic custom controller, not a Postgres Operator.

Common confusions, cleared up

Five mistakes that come up in interviews and PR reviews:

"A StatefulSet is an Operator for stateful workloads." No — a StatefulSet is a built-in primitive (ordered Pods, stable identities, ordered rollout). It does not understand application-level concepts like primary, replica, or backup. Most database Operators use a StatefulSet internally and add the application logic on top.
"kubectl plugins are controllers." No — kubectl plugins run on your laptop, client-side. Controllers run continuously inside the cluster, watching the API server.
"Admission webhooks are controllers." No — admission webhooks intercept API requests at write time, before the object is even stored. They do not run a continuous reconcile loop. Operators can expose webhooks (see mutating & validating webhooks for operators, covered later in this course), but a webhook on its own is not a controller.
"A Helm chart is an Operator." No — Helm renders YAML and pushes it once. An Operator runs forever and corrects drift. The two are complementary; operator-sdk init --plugins=helm even wraps a chart as an Operator (see Helm-based operator tutorial).
"CRD without a controller is broken." No — it is a valid design. Many teams ship CRDs purely as a typed configuration surface consumed by Argo CD, Flux, Crossplane, or kubectl. A CRD without a controller is just data, and that can be exactly what you want.

Which one do you actually need?

If you need to ...	Use
Expose a typed config object to your platform users	CRD only
Apply cluster-wide policy to built-in resources (mutate Pods, enforce labels)	Custom controller (no CRD)
Automate the full lifecycle (install, upgrade, backup, failover) of a specific application	Operator (CRD + Controller + Domain logic)
Install a stateless app once and forget it	Not even a CRD — just Helm or kustomize

The decision tree is short: if you have application-specific day-2 operations to encode, you are writing an Operator. If you have cluster-wide policy to apply to existing resources, you are writing a custom controller. If you just need a typed slot in the API, you are writing a CRD.
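The decision tree is small enough to write down as code. This Go sketch simply restates the table and paragraph above; the `Needs` struct and `recommend` function are hypothetical names, and the order of the cases encodes the assumption that day-2 operations take precedence over the other needs.

```go
package main

import "fmt"

// Needs captures the three questions from the decision tree.
type Needs struct {
	Day2Operations bool // application-specific upgrade/backup/failover logic
	ClusterPolicy  bool // policy applied to existing built-in resources
	TypedConfig    bool // a typed object exposed through the API
}

// recommend returns the smallest building block that covers the needs.
func recommend(n Needs) string {
	switch {
	case n.Day2Operations:
		return "Operator (CRD + controller + domain logic)"
	case n.ClusterPolicy:
		return "Custom controller (no CRD)"
	case n.TypedConfig:
		return "CRD only"
	default:
		return "Helm or kustomize"
	}
}

func main() {
	fmt.Println(recommend(Needs{Day2Operations: true, TypedConfig: true}))
	fmt.Println(recommend(Needs{ClusterPolicy: true}))
	fmt.Println(recommend(Needs{TypedConfig: true}))
	fmt.Println(recommend(Needs{}))
}
```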

The 80/20 rule: before writing any of the above, check OperatorHub.io. For most common workloads (Postgres, Kafka, Redis, Prometheus, Vault, Elastic) someone has already shipped a production-grade Operator that you can install instead of writing your own.

Frequently Asked Questions

1. What is the difference between a Kubernetes Operator and a Controller?

A Kubernetes Controller is any control loop that watches the API server and reconciles the actual state of a resource toward its desired state. A Kubernetes Operator is a specific kind of controller - one that uses a Custom Resource Definition (CRD) to represent an application and encodes domain-specific knowledge about how to install, upgrade, back up, and fail over that application. Every Operator is a Controller, but not every Controller is an Operator.

2. Can you have a CRD without a Controller?

Yes, but the CRD is then just data. Kubernetes will accept, store, and serve instances of that resource through kubectl get, but nothing will happen in the cluster. Teams sometimes use CRDs this way as a typed configuration store consumed by other tools (Argo CD, Flux, Crossplane). For an Operator you need both a CRD AND a controller watching it.

3. Is a Custom Controller the same as an Operator?

Not necessarily. A custom controller may watch built-in Kubernetes resources (Pods, Deployments, Namespaces) and apply cluster-wide policy without ever introducing a CRD - a controller that ensures every Namespace has a ResourceQuota is one example. An Operator is specifically a custom controller that owns a CRD representing a particular application and encodes the knowledge needed to run it.

4. Is every Kubernetes Controller an Operator?

No. The Deployment, ReplicaSet, Job, StatefulSet, and DaemonSet controllers that ship with Kubernetes are controllers, but they are not Operators. They manage generic Kubernetes resources, not a specific application like Postgres or Kafka, and they do not encode application-level operational expertise.

5. Is a Kubernetes Operator just a Controller plus a CRD?

Mechanically yes, but the third ingredient is what makes it an Operator - the domain knowledge embedded in the controller code. The Operator pattern is "controller pattern plus API extension plus single-application focus". Without the application-specific operational logic, you have a custom controller, not an Operator.

6. Are kubectl plugins or admission webhooks considered controllers?

No. kubectl plugins run client-side and never reconcile cluster state. Admission webhooks intercept API requests at create/update time but do not run a continuous reconcile loop. Controllers and Operators are specifically long-running, level-triggered reconcilers that observe the cluster and drive it toward a desired state.

7. What is the difference between an Operator and a StatefulSet?

A StatefulSet is a built-in Kubernetes object that gives a set of Pods stable network identities, stable storage, and ordered rollout. It cannot run application-aware failover, automated backup, or schema upgrades. A Kubernetes Operator often uses a StatefulSet under the hood (the Prometheus Operator does, for example) and adds the operational logic on top; some Operators, such as CloudNativePG, skip StatefulSets and manage Pods and PersistentVolumeClaims directly.

What's next?

You now have the clean mental model. The natural follow-ups:

Go deeper on the loop itself — the reconcile loop explained with timing diagrams and edge cases.
Understand level-triggered design — desired state vs actual state explains why operators are not event-driven the way most people assume.
Master the API contract — Custom Resource Definitions (CRDs) explained covers structural schemas, printer columns, and conversion webhooks.
Build your first Operator — Install Operator-SDK on Linux and you will be running the code skeleton above against a real cluster in under an hour.
Going back to basics? Start at What is a Kubernetes Operator? — the 60-second definition and the history of the pattern.