unlisted · im tosti

I recently got a new job, and as a part of this new role, I deal a lot with kubernetes. I don't bury the lede (at least not in this post). This post is about that, but also about a larger tendency in our industry, one that permeates it from top to bottom, and therefore is worth looking into.

Nothing summarizes said tendency like this classic XKCD.

Why do we automate?

The reason we do this tends to be twofold. First, sometimes writing the automation is simply more fun. Humans tend to get tired of boring repetitive tasks, so we find a way to do something new to replace them.

Of course, updating software generally shouldn't be difficult, while deploying it from scratch shouldn't be a daily occurrence. The shoulds, sadly, are not RFC 2119-compliant, much like the world.

Secondly, sometimes there is truly no choice but to automate. There are a lot of reasons why this might happen, but the case that caused kubernetes to be made was Google's. Google owns a lot of datacenters, and a lot of servers. Managing all of those without automation would not only require an absolutely ungodly number of critically synchronized engineers, it would also require them to somehow stay that synchronized all around the world. After all, if something happens to a server, and it's a physical issue, you now need to go there. This sheer scale, and the need to abstract over software deployment and hardware availability, is why Google made it to begin with.

There is of course a third reason: someone convinced you that you should. I'm going to count that as "fun", even if you probably aren't having fun doing it. This doesn't happen very often though, right?

The Cost

Automation is not free. The platonic ideal case for automation goes about as follows: you no longer need [person that knows how to do X], you can outsource [X] over to [random guy on github that wrote X automation]. Now you only need someone that can operate this automation!

Problem: the random guy on github (even when this random guy is Microsoft) is not perfect. Fun fact, this is actually the same problem that NixOS has!

So what happens when there's a problem with the automation? Well, you have a double problem now: you need someone that understands both how the automation works and the thing you're trying to do to begin with.

Of course, you can go yell at the random github person, but your odds of that working out are just that: odds. As such, for any automation that is business critical, your business MUST be able to handle a break in that supply chain. If it can't, though, don't worry: this is something that standards like ISO 27001 and SOC2 notoriously do not consider.

Also literally no worries. Most business owners do not understand the nature of automation, and therefore literally do not know to fear what is being wrought. Instead, they just know the standard "well google uses it so it's good".

Ultimately, the question of whether "kubernetes is good" is irrelevant! It may well be good at what it does, but is this relevant for you? Imagine there was an agreed-upon best way to… breed ants. Do you now start breeding ants as part of your tech pipeline?

At my new workplace, there actually is a point to using kubernetes. I would be ecstatic over this, were it not for the fact that while this one component is using it, so is every single other thing. Hell, kubernetes is acting as the message queue and db transaction system (don't ask how). Still, this post isn't about my position per se and most workplaces don't even have this much going for them.

In short, the cost of increasing your automation is: if you care about the thing you're automating, you now need a specialist on hand for both the thing, and the automation. If you don't have those things, you should be prepared for the thing to break, or even to lose it entirely. Oh but don't worry, if it happens, you can just hire a specialist like me, pay a bunch of money, and the problem will go away (until something else breaks). Just eat the downtime costs in the meanwhile. Or make sure you have a specialist on hand ad perpetuum, I won't complain.

Duplicated Work

One of the reasons people might be tempted to automate is more predictable costs and performance. One search about "surprise bills" (mostly coming from EKS and similar users) should convince you otherwise. Let's talk about where all of this is coming from, though.

I'll start with an example.