Storage¶
Stateless applications/services are fun, but eventually we would like to store some state/data. The options we have are:
- Rook/Ceph: I have tested Rook, and its options and features are extensive and very good. It is quite stable, but we run into the limitations of our platform. The Raspberry Pi is an arm64 platform. Ceph does support arm64, but some other useful features (e.g. running NFS endpoints) did not yet have arm64 images. Furthermore, it needs a lot of CPU and memory. To deploy a fully functional Ceph cluster we need at least 10 Raspberry Pis (and they have to be Raspberry Pi 4 with at least 4GB of RAM). So technically it is possible, and I have done it, but the headaches are too many. Do not get me wrong: I was pushing the limits of what is feasible by deploying it to Raspberry Pis, and it still worked. So kudos to Ceph. But it is probably an overcomplicated solution for what I want to do.
- NFS: We can use an external NFS storage server. This one will probably work like a charm, but it is a single point of failure.
- Longhorn: This is another Kubernetes-native storage solution, which seems quite promising. The only disadvantage is that it does not support sharding, so we are limiting ourselves to the maximum size of the hard drives in the cluster. For my current use cases, that is more than fine. Furthermore, it is supported by k3s. So we will go with it and see what happens.
Preparing the nodes¶
Before we can install Longhorn, we have to prepare the nodes and mount the disks. To do that, we first have to find out where the devices are located, so we execute:
ansible disks -b -m ansible.builtin.shell -a 'blkid'
We will get something like:
/dev/sda: UUID="blah-blah-...-blah" TYPE="LVM2_member"
Then we complete these details in the disks section of our hosts file, e.g.:
disks:
  hosts:
    node_1:
      disk_path: /dev/sda
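As a minimal sketch, the device paths can be pulled out of the blkid output with a small filter instead of being copied by hand. The function name devices_of_type is hypothetical, and the filter assumes the blkid line format shown earlier.

```shell
# Hypothetical helper: read blkid output on stdin and print the device path
# of every line whose TYPE matches the first argument (default: LVM2_member).
devices_of_type() {
    wanted="${1:-LVM2_member}"
    # blkid lines look like: /dev/sda: UUID="..." TYPE="LVM2_member"
    # The leading space avoids matching other keys that end in TYPE.
    awk -v t=" TYPE=\"$wanted\"" '
        index($0, t) { sub(/:$/, "", $1); print $1 }
    '
}
```

For example, piping the output of the ansible blkid command into devices_of_type should list only the matching devices, since lines without a TYPE field are ignored.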
Then we can install the required collections and execute our playbook:
ansible-galaxy collection install ansible.posix
ansible-galaxy collection install community.general
ansible-playbook storage/setup_storage.yml
Warning
I have no idea why, but if I connect my SATA-to-USB3 cable to the USB3 ports, the drive does not reconnect after a reboot. To reconnect it I have to power-cycle the Raspberry Pi completely (unplug and replug the PoE cable). Needs investigation.
Note
Be careful to use the correct partition.
Note
I used to have Ceph installed on the disks, and to clean them I first had to execute:
ansible disks -b -m ansible.builtin.shell -a 'sgdisk --zap-all /dev/sdb'
ansible disks -b -m ansible.builtin.shell -a 'sudo dd if=/dev/zero of=/dev/sdb bs=1M count=100 oflag=direct,dsync'
ansible disks -b -m ansible.builtin.shell -a 'ls /dev/mapper/ceph-* | xargs -I% -- sudo dmsetup remove %'
ansible disks -b -m ansible.builtin.shell -a 'sudo rm -rf /dev/ceph-*'
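Since the device name is repeated in each command, a typo could wipe the wrong disk. Below is a minimal sketch of a hypothetical helper, ceph_wipe_commands, that only prints the first two cleanup commands for a given device so they can be reviewed before being run. The /dev/sd[a-z] pattern is an assumption about how the disks are named on these nodes, and sudo is left out because -b already escalates privileges.

```shell
# Hypothetical helper: print (not run) the wipe commands for one device.
ceph_wipe_commands() {
    dev="$1"
    # Refuse anything that does not look like a whole-disk path such as /dev/sdb.
    case "$dev" in
        /dev/sd[a-z]) ;;
        *) echo "refusing suspicious device: $dev" >&2; return 1 ;;
    esac
    for cmd in \
        "sgdisk --zap-all $dev" \
        "dd if=/dev/zero of=$dev bs=1M count=100 oflag=direct,dsync"
    do
        printf "ansible disks -b -m ansible.builtin.shell -a '%s'\n" "$cmd"
    done
}
```

Calling ceph_wipe_commands /dev/sdb prints the two ansible lines; nothing is executed until they are pasted into a shell.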
Installing longhorn¶
# We have to use the longhorn-system namespace. It is mentioned in the
# documentation of the helm chart
helm repo add longhorn https://charts.longhorn.io
helm repo update
kubectl create namespace longhorn-system
helm install longhorn longhorn/longhorn --namespace longhorn-system -f values.yaml --version 1.7.2
# The VPA causes instability in Longhorn, so do not activate it for the moment
#kubectl apply -f vpa.yml
kubectl apply -f dashboard.yml
# Create the volume encryption secret, then apply our storage classes
kubectl create secret generic longhorn-crypto --namespace longhorn-system \
  --from-literal=CRYPTO_KEY_VALUE="$(head -c 512 /dev/urandom | LC_CTYPE=C tr -cd 'a-zA-Z0-9' | head -c 64)" \
  --from-literal=CRYPTO_KEY_PROVIDER=secret
kubectl apply -f RepliccatedStorage.yaml
kubectl apply -f UnrepliccatedStorage.yaml
# Credentials for the CIFS backup target
kubectl create secret generic longhorn-backup --namespace longhorn-system \
  --from-literal=CIFS_USERNAME=longhorn \
  --from-literal=CIFS_PASSWORD=MY_SECRET_PASSWORD
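The key generation for the longhorn-crypto secret keeps only the alphanumeric characters from 512 random bytes. That usually leaves more than 64 characters, but it is not guaranteed. A minimal sketch of a hypothetical helper, random_alnum, that keeps reading until enough characters are available:

```shell
# Hypothetical helper: print exactly $1 random alphanumeric characters
# (default 64), reading more random bytes if the filter leaves too few.
random_alnum() {
    len="${1:-64}"
    key=""
    while [ "${#key}" -lt "$len" ]; do
        # Keep only a-z, A-Z and 0-9 from a chunk of random bytes.
        chunk=$(head -c 512 /dev/urandom | LC_ALL=C tr -dc 'a-zA-Z0-9')
        key="$key$chunk"
    done
    # Trim to the requested length.
    printf '%s' "$key" | cut -c "1-$len"
}
```

It could then be used as --from-literal=CRYPTO_KEY_VALUE="$(random_alnum 64)" when creating the secret.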
We have to add the disks from the Longhorn UI:
- https://rpi4cluster.com/storage-setting/
- https://longhorn.io/docs/1.2.3/volumes-and-nodes/multidisk/#add-a-disk
For how to use them, take a look at https://longhorn.io/docs/1.2.2/references/examples/#block-volume
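As a rough sketch of what using the storage classes looks like, the hypothetical helper below prints a PersistentVolumeClaim manifest. The storage class name is passed in as an argument because the names defined in the storage class files are not shown here; the claim name and size are placeholders as well.

```shell
# Hypothetical helper: print a PersistentVolumeClaim manifest.
# Arguments: claim name, storage class name, size (default 1Gi).
pvc_manifest() {
    name="$1"; class="$2"; size="${3:-1Gi}"
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: $class
  resources:
    requests:
      storage: $size
EOF
}
```

The output can be reviewed first and then piped into kubectl apply -f - once the storage class name matches one that exists in the cluster.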
Maybe of interest:
- https://longhorn.io/docs/1.2.2/monitoring/prometheus-and-grafana-setup/#install-longhorn-servicemonitor
- https://github.com/longhorn/longhorn/issues/1859#issuecomment-907057960 for when expanding a disk
Useful commands¶
kubectl -n longhorn-system logs -f -l "app=longhorn-manager" --max-log-requests 10
Resources¶
- https://bryanbende.com/development/2021/05/15/k3s-raspberry-pi-volumes-storage
- https://github.com/gdha/pi4-longhorn
- https://gdha.github.io/pi-stories/pi-stories9/
- https://www.jericdy.com/blog/installing-k3s-with-longhorn-and-usb-storage-on-raspberry-pi
- https://longhorn.io/docs/1.2.2/advanced-resources/volume-encryption/
Take a look at https://longhorn.io/kb/troubleshooting-volume-with-multipath/ I had to do it on homados and aretusa, and I still have to check whether it has to be done on all nodes ...
# https://github.com/longhorn/longhorn/issues/1826#issuecomment-1200005051
kubectl get snapshots.longhorn.io -n longhorn-system -l longhornvolume=pvc-1c16f507-ff8d-4b8b-aed4-7b108214618c | awk '/library-books-/{print $1}' | xargs kubectl -n longhorn-system delete snapshots.longhorn.io
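The awk filter in the snapshot cleanup command matches library-books- anywhere in the line. A slightly stricter sketch, assuming the snapshot name is the first column of the kubectl output, matches only on a name prefix and prints the names so they can be reviewed before deleting. The function name snapshots_with_prefix is hypothetical.

```shell
# Hypothetical helper: read "kubectl get" output on stdin and print the
# names (first column) that start with the given prefix.
snapshots_with_prefix() {
    prefix="$1"
    # NR > 1 skips the header row; index(...) == 1 means "starts with".
    awk -v p="$prefix" 'NR > 1 && index($1, p) == 1 { print $1 }'
}
```

Once the printed list looks right, it can be piped into xargs kubectl -n longhorn-system delete snapshots.longhorn.io as in the original command.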
qemu-img convert -f raw e5ec5a95 -O vmdk torrents-settings.img
sudo mount -o loop torrents-settings mount/
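# https://edoceo.com/sys/qemu
# Load the network block device kernel module
modprobe nbd
# Attach the image to /dev/nbd0
qemu-nbd --connect=/dev/nbd0 disk.qcow2
# Detach it again when done
qemu-nbd -d /dev/nbd0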
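The qemu-nbd commands can be wrapped into a pair of helpers, sketched below with a dry-run switch so the sequence can be inspected first. The function names and the DRY_RUN variable are hypothetical, and the sketch assumes the filesystem sits directly on the device (no partition table) and that the image is only needed read-only. The real commands need root.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: attach IMAGE to /dev/nbd0 and mount it read-only on DIR.
nbd_mount() {
    image="$1"; dir="$2"
    run modprobe nbd &&
    run qemu-nbd --read-only --connect=/dev/nbd0 "$image" &&
    run mount -o ro /dev/nbd0 "$dir"
}

# Hypothetical helper: unmount DIR and detach /dev/nbd0 again.
nbd_umount() {
    dir="$1"
    run umount "$dir" &&
    run qemu-nbd -d /dev/nbd0
}
```

With DRY_RUN=1 exported, nbd_mount disk.qcow2 mount/ only prints the three commands; without it, they are executed in order and the chain stops at the first failure.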