upgraded a kernel, kubernetes stopped running (cgroup_enable= wanted), i tried to upgrade it, pods/containers stopped having access to the internet, i tried switching to cgroup v2/unified, didn't fix stuff.

i've been deleting & building new single-node clusters for ~4 hours now. first time seeing a pod be able to ping out in a long long time. not super pleased with how much i had to delete to get here but at least i can end the night with something running again. also learned a bunch, and it's more neutral peer, "saw some shit". a lot of debugging attempts. a lot of trying to fiddle with parts of the system i hadn't touched.

i do keep thinking perhaps i should go back to kubernetes the hard way. k3s is great. but i keep dreaming of running multiple kubelets per node, for weird reasons, or multiple api servers but one etcd instance. and none of it feels that hard. none of it feels that far away.

that said though, like, my life is defined by feeling like i am woefully radically not going fast enough, ever. some of it is because i choose the right path, the hard path. but also, i'm just slow. my motivation/nose-to-the-grindstone ability is fickle. i drop in well, but it's hard to make myself drop in. in part because there's a >50% chance whatever i try is, like today, just going to blow up.

181 days of uptime today on this new sandbox/prototype vps. that's what i woke up to. i've rebooted it like 9 times since.

99.9% certain that the --data-dir=/var/lib/k3s-sandbox parameter, which had been working fine, is funky/busted in the new k3s upgrade and caused all this madness.

i need to make myself write a ticket, which i somewhat rather dread.

oh it's worse. i think i spent ~8 hours flailing because i thought i could restart k3s. but the iptables rules hang out when you bring the daemon down. and something doesn't work when you bring it back up. "something". there's so many rules.
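
a rough sketch of how one could at least see what's left behind after stopping the daemon: count the rules in `iptables-save` output that sit in chains with kubernetes-looking names. the prefix list (`KUBE-`, `CNI-`, `FLANNEL-`) and the function name are my assumptions for illustration, not something k3s documents, and this only counts rules; it doesn't say which one is the broken "something".

```python
import re
from collections import Counter

# Assumed prefixes: chain names commonly created by kube-proxy, CNI plugins
# and flannel. Adjust for whatever your setup actually creates.
LEFTOVER_PREFIXES = ("KUBE-", "CNI-", "FLANNEL-")


def count_leftover_rules(iptables_save_output: str) -> Counter:
    """Count rules (-A lines) per chain-name prefix in `iptables-save` text.

    Only rules appended to chains whose *name* starts with one of the
    prefixes are counted; jumps from built-in chains are ignored.
    """
    counts = Counter()
    for line in iptables_save_output.splitlines():
        match = re.match(r"-A\s+(\S+)", line)
        if not match:
            continue  # table headers, chain declarations, COMMIT, comments
        chain = match.group(1)
        for prefix in LEFTOVER_PREFIXES:
            if chain.startswith(prefix):
                counts[prefix] += 1
                break
    return counts
```

feed it the text of `iptables-save` (run as root) once before stopping k3s and once after; if the counts are still nonzero with the daemon down, those are the rules hanging out.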
