So you wanna write Kubernetes controllers?
January 22, 2025 · liampulles · 106 comments
I used to be fascinated by the automation power of Kubernetes custom components. The declarative approach and reconciliation loop offers so many possibilities for creating higher level descriptions of domain specific infrastructure.
On reflection though, I think this stuff can lead to a lot of complexity layers which don't benefit the product relative to the time investment. You are probably not Google.
osigurdson
Not Google, but I leverage a lot of compute at work and use Kubernetes for that. However, I use it on small side projects as well, because I am on the other side of the learning curve. The control plane is free with some cloud providers, or it can run locally. It brings a lot of consistency between on-premise and the various cloud providers, and it's easy to use once you get the hang of it.
fragmede
The funny thing about that is that Google doesn't use Kubernetes internally, because it doesn't scale to their level. Borg is more advanced than Kubernetes in the ways that Google needs, so really Kubernetes is dumbed down for everyone else, and everyone else isn't Google scale (except for those that are, e.g. Meta has Twine). So yeah, you're probably not Google, but people out there are Tinder or Reddit or Pinterest, and they all shouldn't have to reinvent the wheel.
sofixa
> The declarative approach and reconciliation loop offers so many possibilities for creating higher level descriptions of domain specific infrastructure.
Terraform running on a schedule gets you 3/4 of the way there for 5% of the complexity though.
kmac_
Terraform's not an orchestrator, it's for something totally different.
osigurdson
Helm gets you 99% of the way there with less complexity than running Terraform in a loop. Terraform is great for bootstrapping Kubernetes, of course.
granra
Icelandic? :D
IMO, using string templating to create structured data in a whitespace-sensitive configuration language always ends in pain and cursing once you get beyond the most basic application manifests. I'm not saying Terraform or HCL is necessarily the solution either, but it certainly wouldn't be Helm in my book.
It's a shame languages like CUE or Nickel didn't take off for this.
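To make the contrast concrete, here is a minimal sketch of the alternative: build the manifest as plain data and serialize it, rather than splicing strings into indentation-sensitive YAML (assumes PyYAML; the Service fields are just an illustration):

    # Build manifests as data structures, then serialize; indentation can't break.
    import yaml  # PyYAML

    def service(name: str, port: int) -> dict:
        return {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": name},
            "spec": {
                "selector": {"app": name},
                "ports": [{"port": port, "targetPort": port}],
            },
        }

    print(yaml.safe_dump(service("web", 8080)))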
doctorpangloss
Helm is an ultra low budget operator for anything. Maybe the best thing in the ecosystem, despite what everyone says.
sofixa
I'm not sure I'd describe Helm as "less complexity" than Terraform. HCL is much easier to read, write, and template than YAML. Helm also cannot tell you exactly what changed/will change (the way tf plan does), and it cannot use dynamic data (e.g. check how many availability zones there are in the region and use that for the number of Pods).
clx75
At work we are using Metacontroller to implement our "operators". Quoted because these are not real operators but rather Metacontroller plugins, written in Python. All the watch and update logic - plus the resource caching - is outsourced to Metacontroller (which is written in Go). We define - via its CompositeController or DecoratorController CRDs - what kind of resources it should watch and which web service it should call into when it detects a change. The web service speaks plain HTTP (or HTTPS if you want).
In the case of a CompositeController, the web service gets the created/updated/deleted parent resource and any already existing child resources (initially none). The web service then analyzes the parent and the existing children, and responds with the list of child resources whose existence and state Metacontroller should ensure in the cluster. If something is left out of the response compared to a previous response, it is deleted.
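For a sense of scale, the whole plugin can be little more than a web handler. A rough sketch of a sync hook in Python (standard library only; the Namespace-per-parent expansion here is purely illustrative, not our real controller):

    # Minimal Metacontroller CompositeController sync hook (sketch).
    # Metacontroller POSTs {"parent": ..., "children": ...} and expects back
    # {"status": ..., "children": [...]} describing the desired child objects.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class SyncHook(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers["Content-Length"])
            request = json.loads(self.rfile.read(length))
            parent_name = request["parent"]["metadata"]["name"]

            # Desired state: one namespace per parent CR (illustrative only).
            children = [{
                "apiVersion": "v1",
                "kind": "Namespace",
                "metadata": {"name": f"project-{parent_name}"},
            }]

            body = json.dumps({"status": {"children": len(children)},
                               "children": children}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), SyncHook).serve_forever()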
Things we implemented using this pattern:
- Project: declarative description of a company project, child resources include a namespace, service account, IAM role, SMB/S3/FSX PVs and PVCs generated for project volumes (defined under spec.volumes in the Project CR), ingresses for a set of standard apps
- Job: high-level description of a DAG of containers, the web service works as a compiler which translates this high-level description into an Argo Workflow (this will be the child)
- Container: defines a dev container, expands into a pod running an sshd and a Contour HTTPProxy (TCP proxy) which forwards TLS-wrapped SSH traffic to the sshd service
- KeycloakClient: here the web service is not pure - it talks to the Keycloak Admin REST API and creates/updates a client in Keycloak whose parameters are given by the CRD spec
So far this works pretty well and makes writing controllers a breeze - at least compared to the standard kubebuilder approach.
JeffMcCune
As other sibling comments suggest, these use cases are better solved with a generator.
The rendered manifest pattern is a simpler alternative. Holos [1] is an implementation of the pattern using well typed CUE to wrap Helm and Kustomize in one unified solution.
It too supports Projects; they're completely defined by the end user and result in the underlying resource configurations being fully rendered and version controlled. This allows for nice diffs, for example, something difficult to achieve with plain ArgoCD and Helm.
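Stripped of the CUE typing, the core of the rendered manifest pattern is small enough to sketch: render everything to plain YAML ahead of time and commit the output, so reviews diff the final resources rather than templates (a sketch, not Holos itself; the chart names are illustrative):

    # Rendered-manifest pattern, reduced to its essence: render charts to
    # static YAML up front and version-control the output.
    import pathlib
    import subprocess

    CHARTS = {"ingress-nginx": "ingress-nginx/ingress-nginx"}  # illustrative

    out_dir = pathlib.Path("rendered")
    out_dir.mkdir(exist_ok=True)
    for release, chart in CHARTS.items():
        rendered = subprocess.run(
            ["helm", "template", release, chart],
            check=True, capture_output=True, text=True,
        ).stdout
        # Commit these files; pull requests then show real resource diffs.
        (out_dir / f"{release}.yaml").write_text(rendered)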
remram
The choice is always between a controller and a generator.
The advantage of a controller is that it can react to external conditions, for example nodes/pods failing, etc. This is great for e.g. a database where you need to fail over and update EndpointSlices. The advantage of a generator is that it is easier to test, it can be dry-run, and it is much simpler.
All of your examples seem to me like use cases that would be better implemented with a generator (e.g. Helm, or any custom script outputting YAML) than a controller. Any reason you wrote these as controllers anyway?
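For what it's worth, the "custom script outputting YAML" option can be very small. A hypothetical sketch (kubectl apply also accepts JSON, which keeps the script dependency-free; the Deployment values are illustrative):

    # A "generator" in the sense above: a plain script that emits a manifest
    # for `kubectl apply -f -`. Easy to unit test and dry-run; no controller.
    import json
    import sys

    def deployment(name: str, image: str, replicas: int = 2) -> dict:
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": name},
            "spec": {
                "replicas": replicas,
                "selector": {"matchLabels": {"app": name}},
                "template": {
                    "metadata": {"labels": {"app": name}},
                    "spec": {"containers": [{"name": name, "image": image}]},
                },
            },
        }

    json.dump(deployment("web", "nginx:1.27"), sys.stdout, indent=2)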
Kinrany
Even if a controller is necessary, wouldn't you still want to have a generator for the easy stuff?
Kinda like "functional core, imperative shell"?
ec109685
Curious why you're using a controller for these aspects versus generating the K8s objects as part of your deployment pipeline and just applying them? The latter gives you versioned artifacts you can roll forward and back, and independent deployment of these supporting pieces with each app.
Is there runtime dynamism that you need the control loop to handle beyond what the built-in primitives can handle?
clx75
Some of the resources are short-lived, including jobs and dev containers. The corresponding CRs are created/updated/deleted directly in the cluster by the project users through a REST API. For these, expansion of the CR into child resources must happen dynamically.
Other CRs are realized through imperative commands executed against a REST API. Prime examples are KeycloakRealm and KeycloakClient, which translate into API calls to Keycloak, or FSXFileSystem, which needs Boto3 to talk to AWS (at least for now, until FSXFileSystem is also implemented in ACK).
For long-lived resources, up-front (compile-time?) expansion would be possible; we just don't know where to put the expansion code. Currently, long-lived resource CRs are stored in Git and deployment is handled with Flux. When projects want an extra resource, we just commit it to Git under their project-resources folder. I guess we could somehow add an extra step here - running a script? - which would do the expansion and store the children in Git before merging the desired state into the nonprod/prod branches, I'm just not clear on how to do this in a way that feels nice.
Currently the entire stack can be run on a developer's laptop, thanks to the magic of Tilt. In local dev it comes in really handy that you can just change a CR and the children are synced immediately.
Drawbacks we identified so far:
If we change the expansion logic, child resources of existing parents are (eventually) regenerated using the new logic. This can be a bad thing - for example, jobs (which expand into Argo Workflows) should not change while they are running. Currently the only idea we have to mitigate this problem is storing the initial expansion in a ConfigMap and returning the original expansion from this "expansion cache" if it exists at later syncs (roughly sketched below).
Sometimes the Metacontroller plugin cannot be a pure function, and executing the side effects introduces latency into the sync. This hasn't caused any problems so far, but it might, as it goes against the Metacontroller design expressed in the docs.
Python is a memory hog; our biggest controllers can take ~200 MB.
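The expansion-cache idea mentioned above might look roughly like this, assuming the hook sees observed children grouped the way Metacontroller sends them (all names illustrative):

    # Pin the first expansion in a ConfigMap child and replay it on later
    # syncs, so in-flight work isn't re-expanded when the logic changes.
    import json

    def expand(parent: dict) -> list[dict]:
        # Real expansion elided (e.g. compile a Job CR into an Argo Workflow).
        return []

    def sync(parent: dict, observed_children: dict) -> list[dict]:
        name = parent["metadata"]["name"]
        cache = observed_children.get("ConfigMap.v1", {}).get(f"{name}-expansion")
        if cache is not None:
            children = json.loads(cache["data"]["children"])  # replay as-is
        else:
            children = expand(parent)
        pin = {
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "metadata": {"name": f"{name}-expansion"},
            "data": {"children": json.dumps(children)},
        }
        return children + [pin]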
ec109685
We've used an artifact store like Artifactory to store the generated/expanded K8s YAML files, ending up with three pieces: 1) A versioned and packaged config-generation system that your devops team owns; you'd have test and production versions of this that all applications use in their CI pipeline. 2) A templated input configuration that describes the unique bits per service (this configuration file is owned by each application team). 3) The output of #1 applied to #2, versioned in an artifact store and generated by the CI pipeline.
And finally, a Kustomize step can be added at the end to support configuration that isn't supported by #1 and #2, without requiring teams to generate all the K8s config pieces by hand.
fsniper
At work we are using nolar/kopf for writing controllers that provision/manage our Kubernetes clusters. This also includes managing any infrastructure-related apps that we deploy on them.
We were using whitebox-controller at the start, which, like Metacontroller, runs your scripts on Kubernetes events. That was easy to write. However, not having full control over the lifecycle of the controller code gets in the way from time to time.
Considering you are also writing Python, did you review Kopf before deciding on Metacontroller?
clx75
Yes, we started with Kopf.
As we understood it, Kopf lets you build an entire operator in Python, with the watch/update/cache/expansion logic all implemented in Python. But the first operator we wrote in it just didn't feel right. We had to talk to the K8S API from Python to do all the expansions. It was too complex. We also had aesthetic issues with the Kopf API.
Metacontroller gave us a small Go binary which takes care of all the complex parts (watch/update/cache). Having to write only the expansion part in Python felt like a great simplification - especially now that we have Pydantic.
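For readers comparing the two approaches, a Kopf handler is roughly this shape (a sketch started with kopf run; the group/plural are made up, and creating children against the API server is left as a comment):

    # Minimal Kopf handler (sketch). Kopf runs the watch loop in-process and
    # calls this function whenever a matching CR is created.
    import kopf

    @kopf.on.create("example.com", "v1", "projects")
    def create_fn(spec, name, namespace, logger, **kwargs):
        logger.info(f"Project {name} created with spec {dict(spec)}")
        # From here you talk to the API server yourself to create children,
        # e.g. with the official `kubernetes` Python client.
        return {"phase": "Provisioned"}  # Kopf stores this under status.create_fn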
branislav
Controllers are a complex topic, but as the linked talk describes, it all comes down to some basic control theory concepts. I wrote about them in my Desired state systems post https://branislavjenco.github.io/desired-state-systems/ if somebody wants a high-level overview of how to think about them.
Basically, declarative state implies value semantics which makes it easier to reason about. Underlying complexity is high though, and you need to judge how necessary it is.
Kinrany
I always thought that React and Kubernetes indeed have a lot in common. Thank you for the post!
never_inline
I'd ask people to please not write operators unless absolutely necessary.
I used a certain tool which had its own config format, and its "cloud native" operator implemented CRDs, of which multiple could exist, and they would update the config file in some mounted volume. Such a thing is hell to debug. Why can't we just store the config file in a ConfigMap/Secret and listen for changes?
(If we had a better templating solution than helm, I think quite a few operators wouldn't need to exist.)
Havoc
"Low barrier to entry" was not a phrase I was expecting in that article.
Either way I’m going to try my hardest to avoid this. K8s is hard enough to get right as is
Vampiero
Why do devops keep piling abstractions on top of abstractions?
There's the machine. Then the VM. Then the container. Then the orchestrator. Then the controller. And it's all so complex that you need even more tools to generate the configuration files for the former tools.
I don't want to write a Kubernetes controller. I don't even know why it should exist.
stouset
Right now I’m typing on a glass screen that pretends to have a keyboard on it that is running a web browser developed with a UI toolkit in a programming language that compiles down to an intermediate bytecode that’s compiled to machine code that’s actually interpreted as microcode on the processor, half of it is farmed out to accelerators and coprocessors of various kinds, all assembled out of a gajillion transistors that neatly hide the fact that we’ve somehow made it possible to make sand think.
The number of layers of abstraction you’re already relying on just to post this comment is nigh uncountable. Abstraction is literally the only way we’ve continued to make progress in any technological endeavor.
zug_zug
I think the point is that there are abstractions that require you to know almost nothing (e.g. the fact that my laptop has an SSD with blocks that are constantly dying is abstracted into a filesystem that looks like a basic tree structure).
Then there are abstractions that may actually increase cognitive load: "What if instead of thinking about chairs, we philosophically think about ALL standing furniture types - stools, tables, etc.? They may have 4 legs, or 3, or 6? What about car seats too?"
AFAICT, writing a Kubernetes controller is probably an overkill, challenge-yourself-level exercise (like a quine in BF), because odds are that for any resource you've ever needed to manage, somebody else has already built an automated way to do it.
Would love to hear other perspectives though if anybody has great examples of when you really couldn't succeed without writing your own kubernetes controller.
root_axis
Yes, k8s is an abstraction, and it's a useful one, even though not everyone needs it. At this new level of abstraction, your hardware becomes homogeneous, making it trivial to scale and recover from hardware failures since k8s automatically distributes your application instances across the hardware in a unified manner. It also has many other useful capabilities downstream of that (e.g. zero downtime deployment/rollback/restart). There's not really any other (well supported) alternative if you want that. Of course, most organizations don't need it, but it's very nice to have in a service oriented system.
stouset
Those only require you to understand them because you’re working directly on top of them. If you were writing a filesystem driver you would absolutely need to know those details. If you’re writing a database backend, you probably need to know a lot about the filesystem. If you’re writing an ORM, you need to know a lot about databases.
Some of these abstractions are leakier than others. Web development coordinates a lot of different technologies so often times you need to know about a wide variety of topics, and sometimes a layer below those. Part of it is that there’s a lot less specialization in our profession than in others, so we need lots of generalists.
zenethian
Seemingly endlessly layered abstraction is also why phones and computers get faster and faster yet nothing seems to actually run better. Nobody wants to write native software anymore because there are too many variations of hardware and operating systems but everyone wants their apps to run on everything. Thus, we are stuck in abstraction hell.
I'd argue the exact opposite has happened. We have made very little progress because everything is continually abstracted out to the least common denominator, leaving accessibility high but features low. Very few actual groundbreaking leaps have been accomplished with all of this abstraction; we've just made it easier to put dumb software on more devices.
stouset
I encourage you to actually work on a twenty year old piece of technology. It’s easy to forget that modern computers are doing a lot more. Sure, there’s waste. But the expectations from software these days are exponentially greater than what we used to ship.
p_l
Another, huge in fact, reason is that we ask them to do a lot more.
Just the framebuffer for one of my displays uses more memory than a computer that was very usable for all sorts of tasks back in 1998. Rendering UI to it also takes a lot more resources because of that.
skydhash
> Nobody wants to write native software anymore because there are too many variations of hardware and operating systems but everyone wants their apps to run on everything.
So far we have: Android and iOS/iPadOS (mobile); macOS, Windows, and *nix? (desktop); and the web. That's not a lot of platforms. My theory is that no one wants to properly architect their software anymore. It's just too easy to build a ball of mud on top of Electron and have a 5GB node_modules folder full of dependencies of unknown provenance.
root_axis
This is just totally wrong. Full stop. Today's devices are unimaginably orders of magnitude faster than the computers of old. To suggest otherwise is absolutely absurd, either pure ignorance or a denial of reality. I'm quite blown away that people so confidently state something that's so easily demonstrated as incorrect.
petercooper
Then all of that data is turned into HTTP requests which turn into TCP packets distributed over IP over wifi over Ethernet over PPPoE over DSL and probably turned into light sent over fiber optics at various stages... :-)
ok123456
The problem isn't abstractions. The problem is leaky abstractions that make it harder to reason about a system and add lots of hidden states and configurations of that state.
What could have been a static binary running a system service has become a Frankenstein mess of opaque nested environments operated by action at a distance.
danielklnstn
CRDs and their controllers are perhaps the reason Kubernetes is as ubiquitous as it is today - the ability to extend clusters effortlessly is amazing and opens up the door for so many powerful capabilities.
> I don't want to write a Kubernetes controller. I don't even know why it should exist.
You can take a look at Crossplane for a good example of the capabilities that controllers allow for. They're usually encapsulated in Kubernetes add-ons and plugins, so much as you might never have to write an operating system driver yourself, you might never have to write a Kubernetes controller yourself.
raffraffraff
One of the first really pleasant surprises I got while learning was that the kubectl command itself is extended (along with tab completion) by CRDs. So install the External Secrets Operator and you get tab completion on those resources and actions.
dijit
> Why do devops keep piling abstractions on top of abstractions?
Mostly, because developers keep trying to replace sysadmins with higher levels of abstraction. Then when they realise that they require (some new word for) sysadmins still, they pile on more abstractions again and claim they don't need them.
The abstraction du jour is not Kubernetes at the moment, it's FaaS. At some point managing those FaaS deployments will require operators again, and another abstraction on top of FaaS will exist, some kind of FaaS orchestrator, and the cycle will continue.
robertlagrant
I think it's clear that Kubernetes et al aren't trying to replace sysadmins. They're trying to massively increase the machine-to-sysadmin ratio.
dijit
Fair point. Kubernetes seems to have been designed as a system to abstract across large physical machines, but instead we're using it in "right-sized" VM environments, which is solving the exact same set of problems in a different way.
Similar to how we developed a language that could use many cores very well, and compiles to a single binary, but we use that language almost exclusively in environments that scale by running multiple instances of the same executable on the same machine, and package/distribute that executable in a complicated tarball/zipping process.
I wonder if there's a name for this, solving the same problem twice but combining the solutions in a way that renders the benefits moot.
nejsjsjsbsb
There are no sysadmins though in the new model. There are teams of engineers who code Go, do kubernetes stuff and go on call. They may occasionally Google some sysadmin knowledge. They replace sysadmins like drivers replace the person in front of the Model T waving a flag. Or pilots replace navigators.
GiorgioG
I don’t want Kubernetes period. Best decision we’ve made at work is to migrate away from k8s and onto AWS ECS. I just want to deploy containers! DevOps went from something you did when standing up or deploying an application, to an industry-wide jobs program. It’s the TSA of the software world.
nijave
ECS is very very similar to Kubernetes and duplicates pretty much all of the functionality except AWS names and manages each piece as a separate service/offering.
ECS+Route53+ALB/ELB+EFS+Parameter Store+Secrets Manager+CloudWatch (Metrics, Logs, Events)+VPC+IAM/STS and you're pretty close in functionality.
frazbin
If I may ask, just to educate myself:
Where do you keep the ECS service/task specs, and how do you mutate them across your stacks?
How long does it take to stand up/decomm a new instance of your software stack?
How do you handle application lifecycle concerns like database backup/restore, migrations/upgrades?
How have you supported developer stories like "I want to test a commit against our infrastructure without interfering with other development"?
I recognize these can all be solved for ECS but I'm curious about the details and how it's going.
I have found Kubernetes most useful when maintaining lots of isolated tenants within limited (cheap) infrastructure, esp when velocity of software and deployments is high and has many stakeholders (customer needs their demo!)
blazing234
Why don't you just deploy to Cloud Run on GCP and call it a day?
k8sToGo
It is always this holier-than-thou attitude of software engineers towards DevOps that is annoying. Especially when it comes from ignorance.
These days DevOps is often done by former software engineers rather than "old fashioned" sysadmins.
Just because you are ignorant of how to use AKS efficiently doesn't mean your alternative is better.
sgarland
> These days DevOps is often done by former software engineers rather than "old fashioned" sysadmins.
Yes, and the world is a poorer place for it. Google’s SRE model works in part because they have _both_ Ops and SWE backgrounds.
The thing about traditional Ops is, while it may not scale to Google levels, it does scale quite well to the level most companies need, _and_ along the way, it forces people to learn how computers and systems work to a modicum of depth. If you’re having to ssh into a box to see why a process is dying, you’re going to learn something about that process, systemd, etc. If you drag the dev along with you to fix it, now two people have learned cross-areas.
If everything is in a container, and there’s an orchestrator silently replacing dying pods, that no longer needs to exist.
To be clear, I _love_ K8s. I run it at home, and have used it professionally at multiple jobs. What I don’t like is how it (and every other abstraction) have made it such that “infra” people haven’t the slightest clue how infra actually operates, and if you sat them down in front of an empty, physical server, they’d have no idea how to bootstrap Linux on it.
mugsie
Yeah, DevOps was a culture, not a job title, and then we let the software engineers in who just want to throw something into prod and go home on Friday night. So they decided it was a task, and the lowest-importance thing possible, but simultaneously the devops/SRE/prod eng teams needed to be perfect, because it's prod.
It is a weird dichotomy I have seen, and it is getting worse. We let teams have access to Argo manifests and Helm charts, and even let them do custom in-repo charts.
Not one team in the last year has actually gone and looked at the k8s docs to figure out how to do basic shit; they just dump questions into channels and soak up time from people explaining the basics of the system their software runs on.
codr7
Nah, I'm delighted if someone wants to do it.
Not as delighted by the fact that many companies seem to want developers to do devops as well, like when the code is compiling or something.
It's not being taken seriously.
mugsie
That's great if that works for you, and it does for a lot of people and teams. You have just shifted the complexity of networking, storage, firewalling, IP management, and L7 proxying to AWS - but hey, you do have ClickOps there.
> DevOps went from something you did when standing up or deploying an application, to an industry-wide jobs program. It’s the TSA of the software world.
DevOps was never a job title or a process; it was a way of working that went beyond yeeting to prod and ignoring it.
From that one line, you never did devops - you did dev, with some deployment tools (that someone else wrote?)
ninjha
You can have Click-Ops on Kubernetes too! Everything has a schema so it's possible to build a nice UI on top of it (with some effort).
My current project is basically this, except it edits your git-ops config repository, so you can click-ops while you git-ops.
Spivak
I'm so confused about the jobs program thing. I'm an infra engineer who has had the title devops for parts of my career. I feel like I've always been desperately needed by teams of software devs that don't want to concern themselves with the gritty reality of actually running software in production. The job kinda sucks but for some reason jives with my brain. I take a huge amount of work and responsibility off the plates of my devs and my work scales well to multiple teams and multiple products.
I've never seen an infra/devops/platform team not swamped with work and just spinning their tires on random unnecessary projects. We're more expensive on average than devs, harder to hire, and two degrees separated from revenue. We're not a typically overstaffed role.
bshacklett
K8s really isn't about piling up abstractions. The orchestrator sits beside containers (which can be run on bare metal, btw) and handles tasks which already need to be done. Orchestration of any system is always necessary. You can do it with K8s (or a related platform), or you can cobble together custom shell scripts, or even perform the tasks manually.
One of these gives you a way to democratize the knowledge and enable self-service across your workforce. The others result in tribal knowledge being split into silos all across an organization. If you're just running a couple of web servers and rarely have to make changes, maybe the manual way is OK for you. For organizations with many different systems that have complex interactions with each other, the time it takes to get a change through a system and the number of potential errors that manual tasks add are just infeasible.
Controllers are just one way to bring some level of sanity to all of the different tasks which might be required to maintain any given system. Maybe you don't need your own custom controllers, as there are a huge number which have already been created to solve the most common requirements. Knowing how to write them allows one to codify business rules, reduce human error, and get more certainty over the behavior of complex systems.
solatic
Current example from work: an extreme single-tenant architecture, deployed for a large number N of tenants, which need both logical and physical isolation; the cost of the cloud provider's managed databases is considered Too Expensive to create one per tenant, so an open-source Kubernetes controller for the database is used instead.
Not all systems are small-N modern multi-tenant architectures deployed at small scale.
bg24
This is the point. Right tool for the job. Kubernetes was incubated at Google and designed for deployments at scale. Lots of teams are happily using it. But it is definitely not for startups or solo devs, unless you are an expert user already.
ryandv
You have some computing resource that needs to be provisioned according to the specifications laid out in a Kubernetes manifest (YAML). Something needs to go out and actually "physically" create or retrieve that resource, with all the side-effects that involves, bring its state into accordance with whatever the manifest specifies, and continuously make adjustments when the resource's state diverges from the manifest throughout the lifetime of the resource.
One example is a controller responsible for fulfilling ACME challenges to obtain x509 certificates. Something needs to actually publish the challenge responses somewhere on the internet, retrieve the x509 certificate, and then persist it onto the cluster so that it may be used by other applications. Something needs to handle certificate renewal on an ongoing basis. That something is the controller.
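Stripped of all the Kubernetes machinery, that "continuously make adjustments" loop is small enough to sketch (a self-contained toy, not cert-manager; real controllers use watches and work queues rather than polling):

    # The control loop at the heart of any controller: compare desired state
    # with observed state and act until they converge, then keep watching.
    import time

    desired = {"cert-example-com": {"dnsName": "example.com"}}  # the manifests
    actual: dict[str, dict] = {}                                # "the world"

    def reconcile() -> None:
        for name, spec in desired.items():
            if actual.get(name) != spec:
                print(f"creating/updating {name}")
                actual[name] = dict(spec)      # side effect against the world
        for name in set(actual) - set(desired):
            print(f"deleting {name}")          # garbage-collect orphans
            del actual[name]

    if __name__ == "__main__":
        for _ in range(3):                     # stands in for `while True`
            reconcile()
            time.sleep(1)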
MathMonkeyMan
> I don't want to write a Kubernetes controller. I don't even know why it should exist.
I don't want to write one either. Given the choice, I won't even touch one.
I think I know why they exist, though. Kubernetes is a system of actors (resources) and events (state transitions). If you want to derive new state from existing state, and to maintain that new state, then you need something that observes "lower" state transitions and takes action on the system to achieve its desired "higher" state.
Whether we invent terminology for these things or not, controllers exist in all such systems.