-0- -1- -2- -3-
Ubuntu systems have /etc/sysctl.conf

    -rw-r--r-- 1 root root 3751 Sep 27 13:07 /etc/sysctl.conf

Entries like the following:

    kernel.pty.max=32768

cause, at boot time...

    cscf-adm@xsbook7:~% grep '^' /proc/sys/kernel/pty/max
    32768
    cscf-adm@xsbook7:~%

A.B.C.D => /proc/sys/A/B/C/D

Lots of parameters affect obscure details of performance (limits).
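The A.B.C.D => /proc/sys/A/B/C/D mapping can be sketched as a small shell function. The name sysctl_path is hypothetical (it is not part of the demo scripts), and the sketch assumes no component of the name itself contains a dot.

```shell
#!/bin/bash
# Convert a dotted sysctl name into the matching path under /proc/sys.
# sysctl_path is a hypothetical helper, not one of the demo scripts.
sysctl_path() {
    # ${1//.//} replaces every "." in the first argument with "/".
    printf '/proc/sys/%s\n' "${1//.//}"
}

sysctl_path kernel.pty.max
sysctl_path fs.inotify.max_user_instances
```

Something like `cat "$(sysctl_path kernel.pty.max)"` then reads the live value.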
    ./qd-stop-demos

Two other windows, both root@xsbook7:/home/cscf-adm/demo-20191025
-0- -1- -2- -3-
"lxc" is the containerization suite CSCF uses. - create "fake machines", e.g. even on your own workstation/laptop I found I could not simultaneously run more than about 7 usefully running lxc containers on my 16G workstation. https://github.com/lxc/lxd/blob/master/doc/production-setup.md (seemingly part of source documentation for lxd) (lxd is would-be successor to lxc) points to "ls -ld /proc/sys/*/inotify/*" -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_queued_events -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_instances -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_watches My laptop now uses something like... cscf-adm@xsbook7:~% grep '^' /proc/sys/*/inotify/* /proc/sys/fs/inotify/max_queued_events:262144 /proc/sys/fs/inotify/max_user_instances:131072 /proc/sys/fs/inotify/max_user_watches:196608 cscf-adm@xsbook7:~%
grep '^' /proc/sys/*/inotify/*
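The show-inotify script used later in the demo is not reproduced in these notes. A minimal sketch of something equivalent follows; show_limits is a hypothetical name, and the optional directory argument is only there so the function can be tried against a scratch directory.

```shell
#!/bin/bash
# Print each inotify tunable as "name = value".
# show_limits is a hypothetical stand-in, not the demo's show-inotify script.
show_limits() {
    local dir="${1:-/proc/sys/fs/inotify}"
    local f
    for f in "$dir"/*; do
        # Skip anything unreadable (including an unmatched glob).
        if [ -r "$f" ]; then
            printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
        fi
    done
}

show_limits
```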
-0- -1- -2- -3-
To demonstrate, I use the following to go back to the default state...

    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-default-inotify
    #!/bin/bash
    echo 16384 > /proc/sys/fs/inotify/max_queued_events
    echo 128 > /proc/sys/fs/inotify/max_user_instances
    echo 8192 > /proc/sys/fs/inotify/max_user_watches
    root@xsbook7:~#

    ./make-default-inotify
    ./show-inotify
-0- -1- -2- -3-
Now, after setting the parameters to default, I use the following to start all my 41 trivial demo containers...

    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/start-all-demos
    #!/bin/bash
    # Takes 30 seconds to run, maybe...
    # must be superuser
    CONTAINERS=`lxc-ls -f | grep '^u....tunedemo' | awk '{print $1}' | grep -v '00$' `
    for c in $CONTAINERS ; do
        echo $c
        lxc-start -n $c
    done
    root@xsbook7:~#

    ./start-all-demos
    ./show-all-demos
    ./show-all-demos
    ./show-all-demos
    ./qd-show-hung-demos
-0- -1- -2- -3-
Very sad, we wait and wait, but a lot of containers fail to get an IP address. We detect that with the following command...

    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-show-hung-demos
    #!/bin/bash
    lxc-ls -f | grep RUNNING | grep -v ' 10[.]'
    root@xsbook7:~#

    ./show-all-demos
    ./qd-show-hung-demos
    ./qd-show-hung-demos
    ./qd-show-hung-demos
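To see what that filter does without live containers, it can be fed a listing on stdin. The sketch below uses the same grep pipeline; hung_names is a hypothetical name, and the sample text is made up in the general shape of "lxc-ls -f" output (it is not captured output, and the container names are invented).

```shell
#!/bin/bash
# List containers that are RUNNING but show no 10.x address.
# Same filter as qd-show-hung-demos, plus awk to keep only the name column.
hung_names() {
    grep RUNNING | grep -v ' 10[.]' | awk '{print $1}'
}

# Made-up sample listing (assumption: not real lxc-ls output).
sample='NAME      STATE   AUTOSTART GROUPS IPV4       IPV6
demo01    RUNNING 0         -      10.0.3.11  -
demo02    RUNNING 0         -      -          -
demo03    STOPPED 0         -      -          -'

printf '%s\n' "$sample" | hung_names
```

On a real host the equivalent would be `lxc-ls -f | hung_names | wc -l` to count the stalled containers.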
-0- -1- -2- -3-
Let's try restarting all the hung containers, but it won't totally help. Use the following script...

    root@xsbook7:~# cat ./qd-restart-demos
    #!/bin/bash
    STALLED=`lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}' `
    for c in $STALLED; do
        echo $c
        lxc-stop --timeout 2 -n $c
        lxc-start -n $c
    done
    root@xsbook7:~#

    ./qd-restart-demos
    ./show-all-demos
    ./show-all-demos
    ./qd-show-hung-demos
    ./qd-show-hung-demos | wc
Oh well. One or two containers might advance. If we're lucky I can show you the "too many open files" diagnostic from "tail". But that doesn't seem to happen in a live demonstration! (Red herring: "open files" is referring to inotify attempts.)
tail -f /var/log/syslog
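As far as I can tell from the inotify man pages, "too many open files" is the text of the EMFILE error, which creating an inotify instance returns once the per-user instance limit is reached. A rough way to see how close the system is to that limit is to count open inotify descriptors under /proc. This is a sketch only: count_inotify_instances is a hypothetical name, it relies on GNU find's -lname test, and descriptors shared between processes are counted more than once.

```shell
#!/bin/bash
# Count open inotify descriptors by looking for fd symlinks whose target
# is "anon_inode:inotify". Run as root to see every process; an ordinary
# user only sees their own.
count_inotify_instances() {
    find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l
}

echo "inotify descriptors in use: $(count_inotify_instances)"
echo "max_user_instances: $(cat /proc/sys/fs/inotify/max_user_instances 2>/dev/null)"
```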
-0- -1- -2- -3-
So, having found the hints at https://github.com/lxc/lxd/blob/master/doc/production-setup.md I use a conservative version of them... (Actually, while creating this demo I determined that max_user_instances is the only crucially important one.)

    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-good-inotify
    #!/bin/bash
    #https://github.com/lxc/lxd/blob/master/doc/production-setup.md
    # says use 1048576 = (1024*1024) for all three.
    # That seems sloppy.
    # In tests, 320 for max_user_instances seemed minimally adequate.
    # 262144 for all seemed (more than) adequate.
    # max_queued_events > max_user_watches > max_user_instances
    #echo 262144 > /proc/sys/fs/inotify/max_queued_events
    echo 360 > /proc/sys/fs/inotify/max_user_instances
    #echo 196608 > /proc/sys/fs/inotify/max_user_watches
    root@xsbook7:~#

This will not fix things immediately.

    ./make-good-inotify
    ./show-inotify
    ./show-all-demos
    ./qd-show-hung-demos
    ./qd-show-hung-demos | wc
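Writing to /proc/sys only lasts until the next reboot. To keep the setting, the same value can go through the sysctl configuration mechanism mentioned at the start. The sketch below just prints a configuration fragment; inotify_conf is a hypothetical name, and 360 is the value from make-good-inotify.

```shell
#!/bin/bash
# Emit a sysctl configuration fragment for the one setting the demo
# found crucial. inotify_conf is a hypothetical helper.
inotify_conf() {
    local instances="${1:-360}"
    printf '# raise the inotify instance limit for many lxc containers\n'
    printf 'fs.inotify.max_user_instances=%s\n' "$instances"
}

inotify_conf
```

As root, the output could be appended to /etc/sysctl.conf, or redirected into a file under /etc/sysctl.d/ (the file name is your choice) and loaded with "sysctl --system". A one-off "sysctl -w fs.inotify.max_user_instances=360" has the same effect as the echo in the script above.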
-0- -1- -2- -3-
So restart all hung containers.

    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-restart-demos
    #!/bin/bash
    STALLED=`lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}' `
    for c in $STALLED; do
        echo $c
        lxc-stop --timeout 2 -n $c
        lxc-start -n $c
    done
    root@xsbook7:~#

    ./qd-restart-demos
    ./show-all-demos
    ./qd-show-hung-demos
    ./qd-show-hung-demos | wc

Hurray!
-0- -1- -2- -3-
Take-aways...

 - "lxc" https://linuxcontainers.org/ is a powerful near-virtualization method CSCF uses
 - sysctl command and /etc/sysctl.conf are interfaces to the more hacky manipulation of /proc/sys (for kernel tuning)
 - Some such values can impact heavy container use, and need to be changed.
 - man 5 proc ; man 7 inotify (Actually require "manpages" package; sometimes not on containers).
 - "Too many open files" red herring.

This is your computer. :/
This is your computer on lxc. :)
This is your computer on lxc with some tuning. :) :) :) :)

Any questions?
    ./qd-stop-demos
    ./make-new-inotify
-- AdrianPepper - 2019-10-23