Experiment running Kubernetes in LXD

Try 1: Kubernetes storage support

Kubernetes filesystem support

The hardest issue with deploying Kubernetes on LXD/LXC containers is storage and filesystem support:

BTRFS

BTRFS does not work well with Kubernetes, because cAdvisor does not play well with BTRFS.

ZFS

ZFS does not work well with LXC and Kubernetes either, since its support for nested containers is poor.

One workaround is to create dedicated volumes for the container runtime and format them as ext4:
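
For example, a minimal sketch of that workaround, assuming a ZFS pool named tank and an LXD instance named k8s-node-0 (both names are placeholders):

# Create a block volume (zvol) on the host and format it as ext4
sudo zfs create -V 50G tank/k8s-containerd
sudo mkfs.ext4 /dev/zvol/tank/k8s-containerd
# Attach it to the instance so the container runtime state lives on ext4
lxc config device add k8s-node-0 containerd-disk disk \
    source=/dev/zvol/tank/k8s-containerd path=/var/lib/containerd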

Another workaround is to run a ZFS-enabled containerd on the host and make it accessible inside LXC.

There are other solutions, like using the Docker loopback plugin ...

Containerd and overlay inside LXC

When running containerd inside LXC, systemd is unable to execute modprobe overlay inside the container (the module is already loaded in the host kernel).

Containerd is already patched so that these modprobe errors are ignored.
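
A quick way to check and work around this on the host; the instance name k8s-node-0 and the module list are assumptions:

# Verify the modules kubelet/containerd need are already loaded on the host
lsmod | grep -E 'overlay|br_netfilter'
# Ask LXD to load them before starting the instance
lxc config set k8s-node-0 linux.kernel_modules overlay,br_netfilter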

Cgroups v2 support

Containerd (and runc) already support cgroups v2.

I enabled it with the following containerd configuration:

[plugins]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
      SystemdCgroup = true

  [plugins.cri.containerd.default_runtime]
    runtime_type = "io.containerd.runc.v2"
    runtime_engine = ""
    runtime_root = ""
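
A rough sketch of applying this, assuming the default /etc/containerd/config.toml path:

# Generate a default config, then set SystemdCgroup = true under
# [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd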

Try 2: Weird problem

I have a weird problem now: when setting up a cluster with kubeadm, the containers keep restarting until everything crashes. The same thing happens with MicroK8s.

Weirder still, k0s works fine!

Hypothesis

There is something related to container technologies that's preventing the containers from running properly.

  • In the case of kubeadm, the Kubernetes components run in containers (containerd in my case).
  • In the case of MicroK8s, the components run on top of snapd (to be verified).

--> In this case there should be something preventing them (containers, snaps) from running properly.

To verify this I will run the following experiments:

  • Run k3s in LXD; since it uses containerd to run the k8s components, it should fail (see the sketch after this list).
  • Install Kubernetes the hard way; this way I'll install the components as processes, not containers. In this case everything should work fine.
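
A minimal sketch of the k3s check, using the official install script:

# Install k3s inside the LXD instance and see whether the node comes up
curl -sfL https://get.k3s.io | sh -
sudo k3s kubectl get nodes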

Edit: MicroK8s works fine. The problem was related to the DNS plugin, which was disabled for some reason; that is also why MicroK8s reported a not-running status. After microk8s enable dns everything is working fine.
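
A sketch of the MicroK8s side of that fix:

# Wait for MicroK8s, enable the DNS addon, then check the kube-system pods
microk8s status --wait-ready
microk8s enable dns
microk8s kubectl get pods -n kube-system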

Kubeadm downgrade

Downgraded kubeadm from 1.22.0 to 1.20.4 and everything seems to work fine!
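
A sketch of such a downgrade using the apt packages; the 1.20.4-00 package revision is an assumption:

sudo apt-get install -y --allow-downgrades \
    kubeadm=1.20.4-00 kubelet=1.20.4-00 kubectl=1.20.4-00
# Pin the versions so they are not bumped back by regular upgrades
sudo apt-mark hold kubeadm kubelet kubectl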

It could be a version problem! Digging deeper and maybe getting some help from Server Fault.

A new problem arose: kube-proxy won't start and fails with open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

The solution was to set nf_conntrack_max on the host:

sudo sysctl net/netfilter/nf_conntrack_max=131072
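
To make the setting survive host reboots, something like this should work (the file name is arbitrary):

echo 'net.netfilter.nf_conntrack_max = 131072' | sudo tee /etc/sysctl.d/99-k8s-lxd.conf
sudo sysctl --system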

Managed to upgrade from 1.20.4 to 1.21.4 to 1.22.1 and the cluster was running almost fine, until it wasn't.

On 1.21.4 everything was fine, while on 1.22.1 nothing works.

It started with some CrashLoopBackOffs and now everything is down.

When I restart the kubelet, the containers start to show up a minute later, and then enter the crash loop again.

My hypothesis is that this is a version issue: there is something wrong with v1.22, with my LXD setup, or both. To test that I am doing the following:

  • Testing v1.22 using k3s or some other distribution.
  • Testing v1.22 with k8s the hard way.

Also, v1.22 adds swap support, so maybe the problem has something to do with swap. I'll check that too:
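
A quick sketch of the swap check inside an instance:

# See whether swap is enabled in the container
swapon --show
free -h
# Disable it for the test (kubelet traditionally expects swap to be off)
sudo swapoff -a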

Asked on k8s.slack.com and the responses suggested that the etcd server is the reason everything fails, and that between Kubernetes 1.21 and 1.22 etcd moved to 3.5.0.
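
To confirm which etcd version the control plane actually runs, a query like this should work (assuming kubeadm's static etcd pod, labeled component=etcd):

kubectl -n kube-system get pods -l component=etcd \
    -o jsonpath='{.items[*].spec.containers[*].image}'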

The best and least time-consuming option is Kubernetes the hard way, since it will help me with other things as well, and since the k8s distros haven't moved to 1.22 yet.

  • https://github.com/inercia/terraform-provider-kubeadm
  • Using Ansible + Terraform might be better

Try 3: New cluster

Going back to this project. This works on 1.31+ with a little bit of tweaking; it may work with previous versions, but I have not tested them. Initialized a cluster on 3 LXD container instances.

  • Created 3 LXD container instances using the LXD Terraform provider and cloud-init

It is weird that the LXD cloud images for ubuntu/jammy do not come with sshd installed, so I had to install it manually (see the sketch below).
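
A shell sketch of the equivalent steps; the image alias, instance names, and profile are assumptions, and the real setup goes through Terraform and cloud-init:

# Launch 3 instances from the Ubuntu jammy cloud image
for i in 0 1 2; do
    lxc launch ubuntu:jammy k8s-node-$i --profile k8s
done
# The image ships without sshd, so install it by hand
lxc exec k8s-node-0 -- apt-get update
lxc exec k8s-node-0 -- apt-get install -y openssh-server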

  • Getting the annoying error below. Apparently it does not affect the cluster health; see above for more information about the issue.
fs.go:595] Unable to get btrfs mountpoint IDs: stat failed on /dev/nvme0n1p3 with error: no such file or directory
  • After initializing the cluster, the kube-proxy pod enters a CrashLoopBackOff state. kubectl logs shows that the container was failing with:
conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

Apparently kube-proxy is trying to change the value of nf_conntrack_max even though it does not have permission to do so. This may be related to the way LXC handles kernel modules (need to dig more into this).

root@k8s-node-0:~# sysctl -p
sysctl: setting key "net.netfilter.nf_conntrack_max": No such file or directory
sysctl: cannot stat /proc/sys/net/nf_conntrack_max: No such file or directory

The solution was to prevent kube-proxy from changing the nf_conntrack_max value by setting maxPerCore to 0 in the kube-proxy ConfigMap.
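
A sketch of that change; the kube-proxy ConfigMap name and the conntrack.maxPerCore field match the kubeadm defaults, but double-check them on your cluster:

# Edit the kube-proxy configuration and set:
#   conntrack:
#     maxPerCore: 0
kubectl -n kube-system edit configmap kube-proxy
# Restart the daemonset so the new configuration is picked up
kubectl -n kube-system rollout restart daemonset kube-proxy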

Finally, installed the Weave Net CNI plugin:

kubectl apply -f https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml

References