start working on kubernetes alternative post

---
title: The Kubernetes Alternative I Wish Existed
date: 2023-10-01
draft: true
---
<script>
import Sidenote from '$lib/Sidenote.svelte';
</script>
I use Kubernetes on my personal server, largely because I wanted to get some experience working with it. It's certainly been helpful in that regard, but after a year and a half or so I think I can pretty confidently say that it's not the ideal tool for my use-case. Duh, I guess? But I think it's worth talking about _why_ that's the case, and what exactly _would_ be the ideal tool.
## The Kubernetes Way™
Kubernetes is a very intrusive orchestration system. It would very much like the apps you're running to be doing things _its_ way, and although that's not a _hard_ requirement it tends to make everything subtly more difficult when that isn't the case. In particular, Kubernetes is targeting the situation where you:
* Have a broad variety of applications that you want to support,
* Have written all or most of those applications yourself,<Sidenote>The "you" here is organizational, not personal.</Sidenote>
* Need those applications to operate at massive scale, e.g. concurrent users in the millions.
That's great if you're Google, and surprise! Kubernetes is a largely Google-originated project,<Sidenote>I'm told that it's a derivative of Borg, Google's in-house orchestration platform.</Sidenote> but it's absolute _garbage_ if you (like me) are just self-hosting apps for your own personal use and enjoyment. It's garbage because, while you still want to support a broad variety of applications, you typically _didn't_ write them yourself and you _most definitely don't_ need to scale to millions of concurrent users. More particularly, this means that the Kubernetes approach of expecting everything to be aware that it's running in Kubernetes and make use of the platform (via cluster roles, CRDs, etc.) is very much _not_ going to fly. Instead, you want your orchestration platform to be as absolutely transparent as possible: ideally, a running application should need to behave no differently in this hypothetical self-hosting-focused orchestration system than it would if it were running by itself on a Raspberry Pi in your garage. _Most especially_, all the distributed-systems crap that Kubernetes forces on you is pretty much unnecessary, because you don't need to support millions<Sidenote>In fact, typically your number of concurrent users is going to be either 1 or 0.</Sidenote> of concurrent users, and you don't care if you incur a little downtime when the application needs to be upgraded or whatever.
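To make "expecting everything to be aware that it's running in Kubernetes" concrete, here's roughly what that looks like from inside a pod, using the official Python client. This is a generic illustration rather than anything from a particular app - the point is that none of it works unless the cluster has also been set up with a service account and the right (cluster) roles for the app.
```python
# A minimal illustration (not from any real app) of code that "knows" it's
# running on Kubernetes: it loads the service-account credentials mounted
# into its pod and starts querying the API server. Requires the `kubernetes`
# client package plus an RBAC role that allows listing pods.
from kubernetes import client, config

config.load_incluster_config()  # reads the token under /var/run/secrets/...
v1 = client.CoreV1Api()
for pod in v1.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.status.phase)
```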
## But Wait
So then why do you need an orchestration platform at all? Why not just use something like [Harbormaster](https://gitlab.com/stavros/harbormaster) and call it a day? That's a valid question, and maybe you don't! In fact, it's quite likely that you don't - orchestration platforms really only make sense when you want to distribute your workload across multiple physical servers, so if you only have the one then why bother? However, I can still think of a couple of reasons why you'd want a cluster even for your personal stuff:
* You don't want everything you host to become completely unavailable if you bork up your server somehow. Yes, I did say above that you can tolerate some downtime, and that's still true - but especially if you like tinkering around with low-level stuff like filesystems and networking, it's quite possible that you'll break things badly enough<Sidenote>And be sufficiently busy with other things, given that we're assuming this is just a hobby for you.</Sidenote> that it will be days or weeks before you can find the time to fix them. If you have multiple servers to which the workloads can migrate while one is down, that problem goes away.
* You don't want to shell out up front for something hefty enough to run All Your Apps, especially as you add more down the road. Maybe you're starting out with a Raspberry Pi, and when that becomes insufficient you'd like to just add more Pis rather than putting together a beefy machine with enough RAM to feed your [Paperless](https://github.com/paperless-ngx/paperless-ngx) installation, your [UniFi controller](https://help.ui.com/hc/en-us/articles/360012282453-Self-Hosting-a-UniFi-Network-Server), your Minecraft server(s), and your [Matrix](https://matrix.org) server.
Okay, sure, maybe this is still a bit niche. But you know what? This is my blog, so I get to be unrealistic if I want to.
## The Goods
### Firecracker
You didn't write all these apps yourself, and you don't trust them any further than you can throw them. Containers are great and all, but you'd like a little more isolation. Enter [Firecracker](https://firecracker-microvm.github.io/), which runs each workload in its own lightweight microVM. This does add some complexity where resource management is concerned, especially memory, since by default Firecracker wants you to allocate everything up front. But maybe that's ok, or maybe we can build in some [ballooning](https://github.com/firecracker-microvm/firecracker/blob/main/docs/ballooning.md) to keep things under control.
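As a rough sketch of what that could look like in practice: a Firecracker VMM exposes an HTTP API over a Unix socket, and the balloon device gets configured and resized through it. The socket path and sizes below are made up for the example; the endpoints are the ones described in the ballooning doc linked above.
```python
import http.client
import json
import socket

API_SOCKET = "/tmp/firecracker.sock"  # wherever --api-sock points (made up here)

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that speaks HTTP over a Unix domain socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock

def api(method, endpoint, body):
    conn = UnixHTTPConnection(API_SOCKET)
    conn.request(method, endpoint, body=json.dumps(body),
                 headers={"Content-Type": "application/json"})
    resp = conn.getresponse()
    data = resp.read().decode()
    conn.close()
    if resp.status >= 300:
        raise RuntimeError(f"{method} {endpoint}: {resp.status} {data}")

# Before boot: attach a balloon device. deflate_on_oom lets the guest pull
# pages back out of the balloon under memory pressure instead of OOM-killing.
api("PUT", "/balloon", {
    "amount_mib": 0,
    "deflate_on_oom": True,
    "stats_polling_interval_s": 1,
})

# Once the VM is running: inflate the balloon to hand ~256 MiB back to the
# host, or PATCH it back down later to return that memory to the guest.
api("PATCH", "/balloon", {"amount_mib": 256})
```
In this hypothetical system it would be the orchestrator driving these calls, not the app - which is the whole point: the workload itself never has to know.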
### Storage
Kubernetes tends to work best with stateless applications. It's not entirely devoid of [tools](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) for dealing with state, but state requires persistent storage and persistent storage is hard in clusters. I get the sense that for a long time you were almost completely on your own here, although more recent options like [Longhorn](https://longhorn.io) are improving the situation.
Regardless, we're self-hosting here, which means virtually _everything_ has state. But fear not! Distributed state is hard, yes, but most of our apps aren't going to be truly distributed. That is, typically there's only going to be one instance running at a time, and it's acceptable to shut down the existing instance before spinning up a new one. So this problem becomes a lot more tractable - see the sketch after the list below.
* Asynchronous replication
* Single-writer, multi-reader
* Does this exist?
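To give those bullets a bit more shape, here's a deliberately low-tech sketch of the single-writer, asynchronously-replicated pattern: the one active node just pushes its data directory to read-only standbys on a timer, using plain rsync. Paths, hostnames, and the interval are all invented for illustration.
```python
"""Very low-tech asynchronous replication: the single writer periodically
mirrors its data directory to read-only standbys with rsync."""
import subprocess
import time

DATA_DIR = "/var/lib/myapp/"  # hypothetical app state on the active node
STANDBYS = ["standby1:/var/lib/myapp/", "standby2:/var/lib/myapp/"]
INTERVAL = 30  # seconds between replication passes

def replicate_once():
    for target in STANDBYS:
        # --delete keeps each standby an exact mirror; a failed push is just
        # logged and retried on the next pass rather than blocking the writer.
        result = subprocess.run(
            ["rsync", "-a", "--delete", DATA_DIR, target],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            print(f"replication to {target} failed: {result.stderr.strip()}")

if __name__ == "__main__":
    while True:
        replicate_once()
        time.sleep(INTERVAL)
```
A real implementation would presumably want something smarter (block-level replication, log shipping, consistent snapshots), but the shape is the same: one writer, any number of slightly stale readers, and no distributed consensus anywhere.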