If you need to emulate different CPU architectures in a Kubernetes environment, you’ll probably end up running some version of binfmt as an init container, or perhaps running it manually once.
That is fine in most cases, but it breaks down if, for example, you run multiple pod replicas on the same node concurrently (say, when you set up a CI system that spawns a pod for every job in your pipeline) or when nodes are autoscaled.
There are probably ways to make sure only one pod sets up binfmt, but with pods running concurrently that is very difficult to orchestrate.
But there’s light at the end of the tunnel: the DaemonSet, which we can use to run binfmt once on every node, including any newly created one.
You just need to create a simple config, saved here as daemonset.yaml:

    # Run binfmt setup on any new node
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: binfmt
      labels:
        app: binfmt-setup
    spec:
      selector:
        matchLabels:
          name: binfmt
      # https://kubernetes.io/docs/concepts/workloads/pods/#pod-templates
      template:
        metadata:
          labels:
            name: binfmt
        spec:
          tolerations:
            # Have the daemonset runnable on master nodes
            # NOTE: Remove it if your masters can't run pods
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
          initContainers:
            - name: binfmt
              image: tonistiigi/binfmt
              # command: []
              args: ["--install", "all"]
              # Run the container with the privileged flag
              # https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
              # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#securitycontext-v1-core
              securityContext:
                privileged: true
          containers:
            - name: pause
              image: gcr.io/google_containers/pause
              resources:
                limits:
                  cpu: 50m
                  memory: 50Mi
                requests:
                  cpu: 50m
                  memory: 50Mi

And apply it:

    kubectl apply -f ./daemonset.yaml

That’s it. binfmt should now be set up on every node in your cluster.
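To confirm that the setup actually took effect, a quick check along these lines may help. This is only a sketch, not part of the original setup: the function names `wait_for_binfmt` and `list_qemu_handlers` are hypothetical, and it assumes the DaemonSet lives in the `default` namespace and that the image registers its handlers under `/proc/sys/fs/binfmt_misc` with a `qemu-` prefix.

```shell
#!/bin/sh
# Hypothetical helper: block until every binfmt pod has finished its init
# container and is running. Assumes the DaemonSet is in the "default" namespace.
wait_for_binfmt() {
  kubectl rollout status daemonset/binfmt --namespace=default --timeout=120s
}

# Hypothetical helper: print the names of the qemu handlers registered in a
# binfmt_misc directory. Meant to be run on a node; the default path is the
# usual binfmt_misc mount point (an assumption about your nodes).
list_qemu_handlers() {
  dir="${1:-/proc/sys/fs/binfmt_misc}"
  for entry in "$dir"/qemu-*; do
    # Skip the unexpanded glob pattern when nothing matches.
    if [ -e "$entry" ]; then
      basename "$entry"
    fi
  done
}
```

Running `wait_for_binfmt` from your workstation and then `list_qemu_handlers` on a node should print one line per registered emulator if the registration succeeded. An empty result suggests the init container did not register anything on that node.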
Note: If you later decide on a different approach, you can delete the DaemonSet, along with the pods it manages:

    kubectl delete daemonset binfmt --namespace=default

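Keep in mind that deleting the DaemonSet removes its pods but does not unregister the emulators: binfmt_misc entries live in each node’s kernel until they are removed or the node reboots. If that matters to you, something like the following could be run as root on each node. It is a sketch, assuming the handlers are named `qemu-*` and live under `/proc/sys/fs/binfmt_misc`; `unregister_qemu_handlers` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: unregister every qemu handler found in a binfmt_misc
# directory. Writing -1 to an entry's file is the kernel's documented way to
# remove that entry. Must run as root on the node itself.
unregister_qemu_handlers() {
  dir="${1:-/proc/sys/fs/binfmt_misc}"
  for entry in "$dir"/qemu-*; do
    # Skip the unexpanded glob pattern when nothing matches.
    if [ -e "$entry" ]; then
      echo -1 > "$entry"
    fi
  done
}
```

Calling `unregister_qemu_handlers` with no arguments targets the default mount point; passing another directory as the first argument is mainly useful for trying the function out safely.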