Building a Kubernetes Operator is only the first step. A controller that works with make run on one developer laptop still needs a repeatable CI/CD pipeline that proves every change is safe to merge, build, and release.
Most people searching for Kubernetes Operator CI/CD with GitHub Actions are not looking for a generic CI/CD definition. They usually want a working .github/workflows/operator-ci.yml file for a Kubebuilder or Operator SDK Go project, plus clear answers for envtest, kind, image publishing, OLM bundle validation, caching, and secrets.
This guide starts with a complete GitHub Actions workflow you can adapt, then explains each part so you know what belongs on pull requests, what belongs on main, and what should wait for release gates.
This article assumes a Go-based Kubebuilder or Operator SDK (go/v4) project with a Makefile exposing targets such as generate, manifests, test, docker-build, and optionally bundle. For local testing strategy and the difference between fake clients, envtest, and kind, read Testing Kubernetes Operators with envtest, fake client, and kind first.
Kubernetes Operator CI/CD in 60 seconds
- Run fast checks on every pull request: gofmt, go vet, unit tests, generated manifests, and pinned envtest.
- Use envtest to test controllers against a real Kubernetes API server and etcd without starting a full cluster.
- Build operator images on pull requests, but push images only from trusted branches, tags, or release workflows.
- Tag images with commit SHAs and promote immutable digests instead of relying on latest.
- Run slower kind, scorecard, and full e2e suites on main, release branches, scheduled workflows, or manual gates.
- Cache Go modules, Go build output, Docker layers, and envtest assets with explicit version keys.
- Treat operator-sdk bundle validate as packaging validation, not runtime proof.
Complete GitHub Actions workflow for a Kubernetes Operator
Create .github/workflows/operator-ci.yml in your operator repository:

    name: operator-ci

    on:
      pull_request:
      push:
        branches:
          - main
        tags:
          - 'v*'
      workflow_dispatch:

    concurrency:
      group: operator-ci-${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true

    env:
      GO_VERSION_FILE: go.mod
      ENVTEST_K8S_VERSION: '1.31.x'
      IMAGE_NAME: ghcr.io/${{ github.repository }}/manager

    jobs:
      test:
        name: lint, unit, envtest
        runs-on: ubuntu-latest
        timeout-minutes: 15
        permissions:
          contents: read
        steps:
          - name: Checkout
            uses: actions/checkout@v4
          - name: Set up Go
            uses: actions/setup-go@v5
            with:
              go-version-file: ${{ env.GO_VERSION_FILE }}
              cache: true
          - name: Cache envtest assets
            uses: actions/cache@v4
            with:
              path: ~/.local/share/kubebuilder-envtest
              key: envtest-${{ runner.os }}-${{ env.ENVTEST_K8S_VERSION }}-${{ hashFiles('**/go.sum') }}
              restore-keys: |
                envtest-${{ runner.os }}-${{ env.ENVTEST_K8S_VERSION }}-
          - name: Download dependencies
            run: go mod download
          - name: Check formatting
            run: test -z "$(gofmt -l .)"
          - name: Vet
            run: go vet ./...
          - name: Check generated code and manifests
            run: |
              make generate
              make manifests
              git diff --exit-code
          - name: Run unit tests
            run: go test ./... -short -count=1
          - name: Run envtest integration tests
            env:
              ENVTEST_K8S_VERSION: ${{ env.ENVTEST_K8S_VERSION }}
            run: make test

      image:
        name: build operator image
        runs-on: ubuntu-latest
        timeout-minutes: 20
        needs: test
        permissions:
          contents: read
          packages: write
          attestations: write
          id-token: write
        steps:
          - name: Checkout
            uses: actions/checkout@v4
          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v3
          - name: Login to GHCR
            if: github.event_name != 'pull_request'
            uses: docker/login-action@v3
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}
          - name: Build and optionally push image
            id: build
            uses: docker/build-push-action@v6
            with:
              context: .
              push: ${{ github.event_name != 'pull_request' }}
              tags: |
                ${{ env.IMAGE_NAME }}:sha-${{ github.sha }}
              cache-from: type=gha
              cache-to: type=gha,mode=max
          - name: Attest image provenance
            if: startsWith(github.ref, 'refs/tags/v')
            uses: actions/attest-build-provenance@v2
            with:
              subject-name: ${{ env.IMAGE_NAME }}
              subject-digest: ${{ steps.build.outputs.digest }}
              push-to-registry: true
          - name: Print image digest
            if: github.event_name != 'pull_request'
            run: |
              echo "Image digest: ${{ steps.build.outputs.digest }}"

This workflow gives you the baseline most operator repositories need:
- Pull requests prove the code compiles, generated files are current, and controller tests pass with envtest.
- Pull requests build the image but do not push it.
- Pushes to main or tags publish an image to GitHub Container Registry.
- The image is tagged by commit SHA, while the digest remains available for promotion.
- concurrency cancels stale CI runs when a contributor pushes a newer commit.
- Job-level permissions keep the test job read-only and grant registry/attestation permissions only to the image job.
I parsed this workflow locally with PyYAML before adding it. The first draft used an inline run: echo "Image digest: ..." line, and the parser rejected it because the colon made the scalar ambiguous. The version above uses a block scalar for that command.
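The trigger behavior summarized above comes down to two expressions in the image job. The following is a minimal Python sketch of that logic, not something GitHub Actions runs; the function name plan and the sample refs are illustrative assumptions.

```python
def plan(event_name: str, ref: str) -> dict:
    """Mirror the image job's conditions for a given event and Git ref."""
    return {
        "build": True,  # the image is built on every event
        # push: ${{ github.event_name != 'pull_request' }}
        "push": event_name != "pull_request",
        # if: startsWith(github.ref, 'refs/tags/v')
        "attest": ref.startswith("refs/tags/v"),
    }

# Sample events; the refs are made-up examples.
for event_name, ref in [
    ("pull_request", "refs/pull/12/merge"),
    ("push", "refs/heads/main"),
    ("push", "refs/tags/v1.4.2"),
    ("workflow_dispatch", "refs/heads/feature-x"),
]:
    print(event_name, ref, plan(event_name, ref))
```

Note the last case: with the condition as written, a manual workflow_dispatch run from any branch also pushes an image. If that is not what you want, tighten the push condition to the refs you trust.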
If your repository lives in a monorepo, add a default working directory:

    defaults:
      run:
        working-directory: operators/my-operator

Place it at the workflow root, then update context: in docker/build-push-action to the same path.
Validate the workflow before committing
GitHub reports workflow syntax errors only after you push. Run a local parser first if your editor does not validate GitHub Actions YAML. I used Python and PyYAML for the workflow above:

    python3 - <<'PY'
    from pathlib import Path
    import yaml

    workflow = Path('.github/workflows/operator-ci.yml')
    data = yaml.safe_load(workflow.read_text())
    print('name:', data['name'])
    print('jobs:', ','.join(data['jobs'].keys()))
    print('test permissions:', data['jobs']['test']['permissions'])
    print('image permissions:', data['jobs']['image']['permissions'])
    PY

Sample output from the validation run:

    name: operator-ci
    jobs: test,image
    test permissions: {'contents': 'read'}
    image permissions: {'contents': 'read', 'packages': 'write', 'attestations': 'write', 'id-token': 'write'}

This does not replace actionlint, but it catches indentation and YAML scalar mistakes before CI sees the file. If your team can install actionlint, use it as the stronger local and CI check.
What this pipeline is trying to prove
From pull request to merge-ready artifact
A practical Kubernetes Operator CI pipeline answers a few important questions before code reaches main:
- Does the code still compile and pass tests?
- Are generated artifacts such as CRDs, deepcopy code, RBAC, and manifests still in sync with the Go API types?
- Does the controller still reconcile correctly against a real Kubernetes API server through envtest?
- Can CI build the same manager image that will later be promoted?
- Does the repository avoid leaking release secrets into untrusted pull request workflows?
- If the project ships OLM metadata, does the bundle still validate?
A green pipeline does not prove the operator works in every cluster scenario, but it removes many common regressions before human review and release.
What you deliberately skip in early CI
Not every check belongs on every pull request. kind clusters, long-running end-to-end suites, scorecard tests, fuzzing, and soak tests often belong in scheduled workflows, main branch gates, release branches, or manual workflow_dispatch jobs.
Fast pull request CI encourages frequent commits and gives contributors feedback in minutes rather than hours.
Makefile targets CI expects
The workflow above assumes your project behaves like a normal Kubebuilder or Operator SDK repository.
A typical Makefile should provide:
| Target | Purpose in CI |
|---|---|
| make generate | Regenerates deepcopy code and generated Go artifacts |
| make manifests | Regenerates CRDs, RBAC, webhook, and manager manifests |
| make test | Runs controller tests, usually with envtest setup |
| make docker-build IMG=... | Builds the manager image when using Make instead of Buildx |
| make bundle | Generates an OLM bundle when the project publishes one |
The git diff --exit-code step after make generate and make manifests is important. It fails CI when a developer changes API types but forgets to commit updated CRDs or generated files.
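One caveat worth knowing: git diff --exit-code compares tracked files only, so a brand-new generated file that was never added to Git does not fail the step. A stricter variant inspects git status --porcelain instead. The sketch below parses that output in Python; the helper name drifted_paths and the sample file paths are hypothetical, and it handles only the simple two-character status format.

```python
def drifted_paths(porcelain: str) -> list[str]:
    """Return the paths reported by `git status --porcelain` (v1 format)."""
    paths = []
    for line in porcelain.splitlines():
        if not line.strip():
            continue
        # Each entry is two status characters, one space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths

# Hypothetical output after running make generate and make manifests:
# one modified CRD and one untracked generated file.
sample = (
    " M config/crd/bases/apps.example.com_widgets.yaml\n"
    "?? api/v1/zz_generated.deepcopy.go\n"
)
drift = drifted_paths(sample)
if drift:
    print("generated files are out of date:")
    for path in drift:
        print("  " + path)
```

In CI, the equivalent check is to fail the step when git status --porcelain prints anything after regeneration.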
If your repository separates fast unit tests and envtest suites using build tags or package boundaries, split them into separate CI steps. Otherwise, a single make test target is usually simpler and less surprising for Kubebuilder-style repositories.
Workflow shape: pull request, main, and release
Pull request workflow
On pull requests, run checks that are safe for forks and fast enough for code review:
- gofmt or gofumpt.
- go vet.
- go test ./... -short.
- Generated artifact drift checks.
- make test or a targeted envtest test command.
- Docker image build without pushing.
Do not assume repository secrets are available on forked pull requests. Even when they are technically available in private repositories, it is better to design PR checks so they do not need production credentials.
For concurrent controllers, consider running go test -race ./... in a scheduled workflow or release gate. The race detector can catch shared-state bugs in controller-runtime code, but it increases runtime enough that many teams keep it out of the fastest PR path.
Push to main
On push to main, run the same checks and then publish a commit-SHA image:

    ghcr.io/OWNER/REPO/manager:sha-${{ github.sha }}

You may also publish a convenience tag such as main, but use the immutable digest from the build output for promotion to staging or production.
Release tags
On release tags such as v1.4.2, publish semver image tags and release artifacts:
- ghcr.io/OWNER/REPO/manager:v1.4.2
- OLM bundle image or bundle directory.
- SBOM, provenance, and signatures if your organization requires supply-chain metadata.
The workflow above includes actions/attest-build-provenance@v2 for tag builds. That step needs attestations: write and id-token: write, so keep those permissions on the publishing job only. Add signing with cosign if your registry, marketplace, or internal policy requires signatures in addition to GitHub artifact attestations.
Keep release publishing in a protected workflow or GitHub Environment when production credentials are involved.
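To make the tagging scheme concrete, here is a small Python sketch that derives the references described in this section: a commit-SHA tag for every build, a semver tag only for release refs, and a digest reference for promotion. The function name image_refs, the repository name, and the digest value are placeholders for illustration. The sketch also lowercases the repository, on the assumption that your OWNER/REPO may contain capitals, which image references do not allow.

```python
import re

def image_refs(repository: str, git_ref: str, sha: str, digest: str) -> dict:
    """Build tag and digest references for the manager image."""
    # Image repository names must be lowercase, so normalize OWNER/REPO.
    image = f"ghcr.io/{repository.lower()}/manager"
    tags = [f"{image}:sha-{sha}"]
    # Add a semver tag only for release refs such as refs/tags/v1.4.2.
    release = re.fullmatch(r"refs/tags/(v\d+\.\d+\.\d+)", git_ref)
    if release:
        tags.append(f"{image}:{release.group(1)}")
    # Promotion should reference the digest, not a tag.
    return {"tags": tags, "promote": f"{image}@{digest}"}

refs = image_refs(
    repository="Example-Org/Widget-Operator",  # hypothetical repository
    git_ref="refs/tags/v1.4.2",
    sha="0123abc",                             # shortened for readability
    digest="sha256:" + "ab" * 32,              # placeholder digest
)
for tag in refs["tags"]:
    print("tag:    ", tag)
print("promote:", refs["promote"])
```

Tags answer "which commit is this?", while the digest reference is what a staging or production manifest should pin.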
envtest in GitHub Actions
Why envtest belongs in operator CI
envtest starts a real Kubernetes API server and etcd, but it does not run a kubelet, scheduler, CNI, DNS, or a full Kubernetes control plane and node environment. That makes it a strong fit for controller reconciliation tests:
- CRD schema and API validation.
- Create, update, list, watch, status, and finalizer behavior.
- Reconcile loops that depend on Kubernetes API state.
- Webhook logic when configured in the envtest environment.
It is usually much faster than a kind job and more realistic than a fake client.
Pinning envtest assets
Pin the Kubernetes asset version so CI changes only when you choose to upgrade:

    env:
      ENVTEST_K8S_VERSION: '1.31.x'

If your generated Makefile already uses setup-envtest, have the make test target consume that version. For example, many Kubebuilder projects have a pattern similar to:

    ENVTEST_K8S_VERSION = 1.31.x
    ENVTEST = $(LOCALBIN)/setup-envtest

    .PHONY: test
    test: manifests generate fmt vet envtest
    	KUBEBUILDER_ASSETS="$$( $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path )" go test ./... -coverprofile cover.out

The exact Makefile may differ by Kubebuilder or Operator SDK version, but the CI principle is the same: pin the asset version, cache the download path, and run the same target developers run locally.
Matrix testing across Kubernetes versions
Use a Kubernetes version matrix only when your support policy needs it. A practical matrix usually tests the oldest and newest supported Kubernetes versions:

    strategy:
      fail-fast: false
      matrix:
        envtest-k8s-version:
          - '1.28.x'
          - '1.30.x'

Then set:

    ENVTEST_K8S_VERSION: ${{ matrix.envtest-k8s-version }}

Avoid a broad matrix on every pull request unless the project is small enough that CI remains fast.
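If your support policy lists several Kubernetes minors, the oldest-and-newest rule can be derived rather than maintained by hand. The following Python sketch shows one way to do that; the function name matrix_versions and the input list are assumptions for illustration, not something Kubebuilder or setup-envtest provides.

```python
def matrix_versions(supported: list[str]) -> list[str]:
    """Pick the oldest and newest supported minors as envtest version patterns."""
    # Compare numerically so that 1.10 sorts after 1.9.
    minors = sorted({tuple(int(part) for part in v.split(".")[:2]) for v in supported})
    if not minors:
        return []
    picks = [minors[0]] if len(minors) == 1 else [minors[0], minors[-1]]
    # Use the "MAJOR.MINOR.x" form shown in the matrix above.
    return [f"{major}.{minor}.x" for major, minor in picks]

# Hypothetical support policy covering three minors.
print(matrix_versions(["1.29", "1.28", "1.30"]))
```

The resulting list can be pasted into the envtest-k8s-version matrix entry whenever the support window moves.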
Build and publish the manager image
For most operator repositories, use docker/build-push-action with BuildKit caching. It builds the same Dockerfile used for releases and can push only on trusted events.

    - name: Build and optionally push image
      uses: docker/build-push-action@v6
      with:
        context: .
        push: ${{ github.event_name != 'pull_request' }}
        tags: |
          ghcr.io/${{ github.repository }}/manager:sha-${{ github.sha }}
        cache-from: type=gha
        cache-to: type=gha,mode=max

Use commit SHA tags for traceability and semver tags during releases. Mutable tags such as main or latest are convenient for humans, but they should not be the source of truth for environment promotion.
For GHCR, this workflow needs:

    permissions:
      contents: read
      packages: write

For cloud registries, prefer OIDC federation over static secrets:
- AWS: GitHub OIDC to an IAM role, then login to ECR.
- GCP: GitHub OIDC to Workload Identity Federation, then login to Artifact Registry.
- Azure: GitHub OIDC to a federated credential, then login to ACR.
Caching Go modules, Docker layers, and envtest
Caching is not just a speed optimization. It keeps pull request feedback fast enough that developers trust the pipeline.
Use actions/setup-go built-in caching for simple Go repositories:

    - uses: actions/setup-go@v5
      with:
        go-version-file: go.mod
        cache: true

Use explicit cache keys when you need custom paths:

    - uses: actions/cache@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: go-${{ runner.os }}-${{ hashFiles('**/go.sum') }}

Cache envtest assets separately because they change with the Kubernetes version, not only with go.sum:

    - uses: actions/cache@v4
      with:
        path: ~/.local/share/kubebuilder-envtest
        key: envtest-${{ runner.os }}-${{ env.ENVTEST_K8S_VERSION }}-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          envtest-${{ runner.os }}-${{ env.ENVTEST_K8S_VERSION }}-

If your setup-envtest binary stores assets somewhere else, cache that actual path instead. The path must match your Makefile or the cache will look successful while envtest downloads on every run.
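To see why the envtest key carries both the Kubernetes version and a go.sum hash, it can help to model the lookup. The Python sketch below is a simplified stand-in, not the actions/cache implementation: it hashes file contents with SHA-256 in place of hashFiles, assumes saved keys are ordered oldest to newest, and uses hypothetical helper names.

```python
import hashlib

def envtest_cache_key(os_name: str, k8s_version: str, go_sum: bytes) -> str:
    """Compose a key shaped like envtest-<os>-<k8s version>-<hash of go.sum>."""
    digest = hashlib.sha256(go_sum).hexdigest()  # simplified stand-in for hashFiles
    return f"envtest-{os_name}-{k8s_version}-{digest}"

def restore(saved_keys: list[str], key: str, restore_prefixes: list[str]):
    """Return the cache entry that would be restored, or None on a miss."""
    if key in saved_keys:
        return key  # exact hit
    for prefix in restore_prefixes:
        # Fall back to the most recent entry that starts with the prefix.
        for saved in reversed(saved_keys):
            if saved.startswith(prefix):
                return saved
    return None

old = envtest_cache_key("Linux", "1.31.x", b"example.com/mod v1.0.0\n")
new = envtest_cache_key("Linux", "1.31.x", b"example.com/mod v1.1.0\n")
other = envtest_cache_key("Linux", "1.30.x", b"example.com/mod v1.1.0\n")

# A go.sum change misses the exact key but falls back via the version prefix.
print(restore([old], new, ["envtest-Linux-1.31.x-"]) == old)
# A different Kubernetes version never reuses the 1.31.x assets.
print(restore([old], other, ["envtest-Linux-1.30.x-"]))
```

The version segment keeps assets for different Kubernetes releases apart, while the restore prefix lets a dependency bump reuse the previously downloaded binaries.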
For Docker builds, prefer BuildKit cache:

    cache-from: type=gha
    cache-to: type=gha,mode=max

Private Go modules need extra authentication, such as .netrc, SSH deploy keys, GitHub App tokens, or a private Go proxy. Keep that setup isolated from normal build secrets and document it for contributors.
Optional: OLM bundle validation in CI
When bundle validation matters
Run operator-sdk bundle validate if your project publishes an OLM bundle, OperatorHub metadata, or a bundle image.

    - name: Generate bundle
      run: make bundle
    - name: Install operator-sdk
      run: |
        curl -sSLo /usr/local/bin/operator-sdk \
          "https://github.com/operator-framework/operator-sdk/releases/download/v1.34.2/operator-sdk_linux_amd64"
        chmod +x /usr/local/bin/operator-sdk
    - name: Validate bundle
      run: operator-sdk bundle validate ./bundle

Choose where validation runs based on how your repository manages bundles:
- If bundle/ is committed to Git, validate it on every pull request.
- If the bundle is generated only for releases, validate it in the release workflow.
- If the bundle image reference is injected during release, validate after the final image reference is known.
Pin the operator-sdk version in CI rather than downloading latest. Validation rules and defaults can change across tool versions. I validated related bundle commands locally with operator-sdk v1.42.2; if your project uses an older Kubebuilder or Operator SDK plugin layout, pin the version that matches the project.
For OperatorHub-style validation, prefer the current optional validators:
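
    operator-sdk bundle validate ./bundle \
      --select-optional name=operatorhub/v2 \
      --select-optional name=standardcapabilities \
      --select-optional name=standardcategories

Run operator-sdk bundle validate ./bundle --list-optional to confirm the validator names your pinned version accepts.
Bundle validation does not replace tests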
Bundle validation checks metadata and packaging. It does not prove that your reconciler behaves correctly, that webhooks are reachable, or that the operator installs successfully in a real cluster.
Keep runtime testing separate:
- envtest for controller behavior.
- kind for installation and cluster integration.
- Staging clusters for release confidence.
Optional: kind-based jobs
What kind adds beyond envtest
Use kind when the test needs something envtest does not provide:
- Kubelet behavior.
- Service networking and DNS.
- Admission webhooks with real TLS and webhook registration.
- Multi-resource interactions that depend on a running cluster.
- Helm, OLM, or Kustomize installation flows.
- Smoke tests against the built operator image.
Because kind jobs are slower and consume more CI minutes, they often run on main, release branches, scheduled workflows, or manual workflow_dispatch triggers rather than every pull request.
Minimal kind smoke-test job

    kind:
      name: kind smoke test
      runs-on: ubuntu-latest
      needs:
        - test
        - image
      if: github.event_name != 'pull_request'
      timeout-minutes: 30
      steps:
        - uses: actions/checkout@v4
        - uses: helm/kind-action@v1
          with:
            cluster_name: operator-ci
        - name: Load and install operator
          run: |
            make docker-build IMG=controller:ci
            kind load docker-image controller:ci --name operator-ci
            make deploy IMG=controller:ci
            kubectl rollout status deployment/controller-manager -n system --timeout=120s

Adjust the namespace, deployment name, and Makefile targets to match your project. Many Kubebuilder projects deploy into a namespace ending in -system, not literally system.
Upload debugging artifacts
When a kind job fails, upload enough cluster evidence to debug the run without reproducing it locally:

    - name: Collect cluster diagnostics
      if: failure()
      run: |
        mkdir -p artifacts
        kubectl get all -A > artifacts/resources.txt
        kubectl get events -A --sort-by=.metadata.creationTimestamp > artifacts/events.txt
        kubectl describe pods -A > artifacts/pods.txt
        kubectl logs -n system deploy/controller-manager > artifacts/operator.log || true
    - name: Upload diagnostics
      if: failure()
      uses: actions/upload-artifact@v4
      with:
        name: kind-diagnostics
        path: artifacts/

For deeper troubleshooting patterns, see Debugging Kubernetes Operators.
Secrets and least privilege in GitHub Actions
Keep permissions as narrow as possible:

    permissions:
      contents: read
      packages: write

Add id-token: write only for workflows that use OIDC to authenticate to a cloud provider:

    permissions:
      contents: read
      id-token: write

Separate pull request checks from release publishing:
- Pull request jobs should avoid production secrets.
- Image publishing should run from trusted branches or tags.
- Production deployment should use protected GitHub Environments and required reviewers.
- Long-lived credentials should be replaced with OIDC whenever your registry or cloud provider supports it.
This separation is especially important for public repositories, where forked pull requests have a different trust model.
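A quick way to review least privilege is to list which jobs hold write scopes. The Python sketch below walks a parsed workflow dictionary, such as the one yaml.safe_load returns in the validation step earlier; here the dictionary is written out by hand so the example has no dependencies. The function name write_scopes is hypothetical, and the sketch only reads job-level permissions, so a job without its own permissions block (which inherits workflow-level or default settings) is not reported.

```python
def write_scopes(workflow: dict) -> dict:
    """Map each job name to the permission scopes it holds with write access."""
    findings = {}
    for job_name, job in workflow.get("jobs", {}).items():
        perms = job.get("permissions", {})
        if isinstance(perms, str):
            # Shorthand forms: "write-all" grants everything, "read-all" nothing writable.
            scopes = ["*"] if perms == "write-all" else []
        else:
            scopes = sorted(scope for scope, level in perms.items() if level == "write")
        if scopes:
            findings[job_name] = scopes
    return findings

# Permissions copied from the sample validation output in this article.
workflow = {
    "jobs": {
        "test": {"permissions": {"contents": "read"}},
        "image": {
            "permissions": {
                "contents": "read",
                "packages": "write",
                "attestations": "write",
                "id-token": "write",
            }
        },
    }
}
print(write_scopes(workflow))
```

For the workflow in this guide, only the image job should appear in the result. Any new job that shows up here deserves a second look during review.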
Common Kubernetes Operator CI failures
| Symptom | Likely cause | Fix |
|---|---|---|
| git diff --exit-code fails after make manifests | CRDs, RBAC, or generated code were not committed | Run make generate manifests locally and commit the output |
| setup-envtest downloads assets on every run | Envtest path is not cached or the cache key changes too often | Cache the envtest directory with a key based on OS and Kubernetes version |
| Tests pass locally but fail in CI | Local cluster state hides missing test setup | Make tests create all required CRDs, namespaces, schemes, and fixtures |
| Pull request cannot push to GHCR | Forked PRs do not get write credentials | Build without pushing on PRs, push only from trusted branches or tags |
| Bundle validation fails after an image change | CSV or bundle image reference drifted | Regenerate the bundle after final image tag or digest selection |
| kind job times out | The test is doing too much for PR feedback | Move it to main, release, nightly, or workflow_dispatch and upload diagnostics |
| Webhook tests pass in envtest but fail in kind | TLS, webhook service, or admission registration differs in a real cluster | Add a kind smoke test for webhook installation |
Checklist: production-ready operator CI/CD
| Area | Recommended setup |
|---|---|
| Pull request checks | gofmt, go vet, unit tests, generated artifact drift check, envtest |
| Envtest | Pin Kubernetes version, cache assets, keep PR runtime short |
| Image build | Build on PRs, push only from trusted branches or release tags |
| Image tags | Use sha-${{ github.sha }} and semver tags; promote digests |
| OLM bundle | Validate with a pinned operator-sdk version if the project ships bundles |
| kind/e2e | Run on main, releases, schedules, or manual workflows when too slow for PRs |
| Secrets | Prefer OIDC; avoid production secrets in pull request jobs |
| Debugging | Upload test logs, coverage, events, pod descriptions, and operator logs |
Frequently Asked Questions
1. What should a Kubernetes Operator GitHub Actions workflow run?
A practical workflow should run formatting checks, go vet, unit tests, generated manifest checks, envtest integration tests, and image builds. Add OLM bundle validation only if the repository publishes bundles, and move slower kind or end-to-end tests to main, release, nightly, or manual workflows.
2. Should envtest run on every pull request?
Yes for most small and medium operator repositories if the job finishes in a few minutes. Pin the Kubernetes envtest asset version, cache the downloaded binaries, and split expensive kind or full cluster tests to main branch, scheduled, or release workflows when PR feedback becomes too slow.
3. Do I need kind if I already run envtest?
Not always. envtest is enough for many controller reconciliation tests because it runs a real API server and etcd. Use kind when the test needs kubelet behavior, DNS, Services, webhook registration, admission TLS, or installation flows closer to a real cluster.
4. Where should I store registry credentials for operator image publishing?
Prefer OIDC federation to AWS, GCP, or Azure over long-lived cloud credentials. For GitHub Container Registry, use GITHUB_TOKEN with packages permissions when possible, or use a narrowly scoped fine-grained PAT stored as an encrypted secret.
5. Does operator-sdk bundle validate replace cluster testing?
No. operator-sdk bundle validate checks bundle metadata, CSV structure, and packaging rules. It does not prove the operator reconciles resources correctly at runtime. Keep envtest, kind, or staging-cluster tests for runtime behavior.
6. Should Kubernetes Operator CI publish images from pull requests?
Public repositories usually avoid pushing permanent images from forked pull requests because secrets are not available and the trust boundary is different. Build without pushing on pull requests, then push immutable commit-SHA images from main, release branches, or protected release workflows.
See also
These tutorials in the Kubernetes Operators series fit next in your reading order:
- Install Operator-SDK on Linux
- Testing Kubernetes Operators with envtest, fake client, and kind
- Go Kubernetes Operator SDK tutorial
- Debugging Kubernetes Operators
- Configuration: flags, env, live reload
- OLM bundles & OperatorHub
Upstream references
- GitHub Actions workflow syntax
- GitHub Actions dependency caching
- GitHub artifact attestations
- actions/setup-go
- docker/build-push-action
- Kubebuilder envtest reference
- Operator SDK CI best practices
Bottom line: a Kubernetes Operator CI/CD workflow should give developers a fast answer on every pull request, run envtest before merge, build the same image that will be released, push only from trusted events, validate OLM bundles when you publish them, and reserve expensive full-cluster proof for main, releases, scheduled runs, or manual gates.

