Or how something seemingly perplexing can devolve into something trivial. I almost cancelled this talk as being too trivial. However, along the way we can see how some of the LXC creators seem to have been similarly confused about some of the details here.
Linux systems have /proc/sys. Ubuntu systems have /etc/sysctl.conf:

-rw-r--r-- 1 root root 3751 Sep 27 13:07 /etc/sysctl.conf

Entries like the following:

kernel.pty.max=32768

cause, at boot time...

cscf-adm@xsbook7:~% grep '^' /proc/sys/kernel/pty/max
32768
cscf-adm@xsbook7:~%

That is, a sysctl name A.B.C.D maps to the file /proc/sys/A/B/C/D. Lots of parameters affect obscure details of performance (limits).
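To make the equivalence concrete, here is a quick illustration using the same kernel.pty.max parameter as above; sysctl(8) and the /proc/sys file are just two views of the same kernel value:

# two equivalent ways to read the parameter
sysctl -n kernel.pty.max          # prints 32768
cat /proc/sys/kernel/pty/max      # prints 32768
# two equivalent ways to set it (as root)
sysctl -w kernel.pty.max=32768
echo 32768 > /proc/sys/kernel/pty/max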
./qd-stop-demos

Two other windows, both root@xsbook7:/home/cscf-adm/demo-20191025
"lxc" is a containerization suite that CSCF uses. https://linuxcontainers.org/ - create "fake machines", e.g. even on your own workstation/laptop I found I could not simultaneously run more than about 7 usefully running lxc containers on my 16G workstation. https://github.com/lxc/lxd/blob/master/doc/production-setup.md (seemingly part of source documentation for lxd) (lxd is would-be successor to lxc) points to "ls -ld /proc/sys/*/inotify/*" -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_queued_events -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_instances -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_watches (However, in details the information there seems to be wrong). My laptop (12G may be relevant) now uses something like... cscf-adm@xsbook7:~% grep '^' /proc/sys/*/inotify/* /proc/sys/fs/inotify/max_queued_events:262144 /proc/sys/fs/inotify/max_user_instances:131072 /proc/sys/fs/inotify/max_user_watches:196608 cscf-adm@xsbook7:~%
grep '^' /proc/sys/*/inotify/*
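Incidentally, if you want to see how many inotify instances are actually in use (not something the demo scripts above do; a rough sketch, and it needs root to read every process's fd directory): each open instance appears in /proc as a file descriptor symlinked to anon_inode:inotify.

# count inotify instances currently open on the whole machine
find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l
# or list the pids of the processes holding them
find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null |
    cut -d/ -f3 | sort -u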
To demonstrate the problem containers run into, I use the following to go back to the default state...

root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-default-inotify
#!/bin/bash
echo 16384 > /proc/sys/fs/inotify/max_queued_events
echo 128 > /proc/sys/fs/inotify/max_user_instances
echo 8192 > /proc/sys/fs/inotify/max_user_watches
root@xsbook7:~#
./make-default-inotify ./show-inotify
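(show-inotify itself is not reproduced in these notes; presumably it is just a wrapper around the grep shown earlier, something like this sketch.)

#!/bin/bash
# show-inotify (a guess at its contents): print each inotify limit,
# prefixed with its filename since grep is given multiple files
grep '^' /proc/sys/*/inotify/*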
Now, after setting the parameters to default, I use the following to start all my 41 trivial demo containers...

root@xsbook7:~# cat /home/cscf-adm/demo-20191025/start-all-demos
#!/bin/bash
# Takes 30 seconds to run, maybe...
# must be superuser
CONTAINERS=`lxc-ls -f | grep '^u....tunedemo' | awk '{print $1}' | grep -v '00$' `
for c in $CONTAINERS ; do
    echo $c
    lxc-start -n $c
done
root@xsbook7:~#
./start-all-demos ./show-all-demos ./show-all-demos ./show-all-demos ./qd-show-hung-demos
Very sad: we wait and wait, but a lot of containers fail to get an IP address. We detect that with the following command (a container that did get an address shows one on the 10.x lxcbr0 network, so we look for RUNNING lines without one)...

root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-show-hung-demos
#!/bin/bash
lxc-ls -f | grep RUNNING | grep -v ' 10[.]'
root@xsbook7:~#
./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos ./qd-show-hung-demos
Let's try restarting all the hung containers--but it won't totally help. (Though in my case it did let me demonstrate a red herring.) Use the following script...

root@xsbook7:~# cat ./qd-restart-demos
#!/bin/bash
STALLED=`lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}' `
for c in $STALLED; do
    echo $c
    lxc-stop --timeout 2 -n $c
    lxc-start -n $c
done
root@xsbook7:~#
./qd-restart-demos ./show-all-demos ./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos | wc
Oh well. One or two containers might advance. If we are lucky I can show you the "too many open files" diagnostic from "tail". But that never seems to happen in a live demonstration!

(Red herring: "open files" here is actually referring to failed inotify attempts.) You can set /proc/sys/fs/file-max as high as you want and it won't help.

cscf-adm@xsbook7:~$ grep -H '^' /proc/sys/fs/file-max
/proc/sys/fs/file-max:2317350
cscf-adm@xsbook7:~$
tail -f /var/log/syslog
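If the live demo refuses to cooperate, the misleading diagnostic can be provoked deliberately (a sketch, not one of the demo scripts above: when inotify_init() fails with EMFILE, GNU tail falls back to polling with a warning along the lines of "inotify cannot be used, reverting to polling: Too many open files"):

# as root: make the instance limit absurdly small...
echo 1 > /proc/sys/fs/inotify/max_user_instances
# ...then any new inotify user should trip over it
tail -f /var/log/syslog    # should print the "Too many open files" warning
# restore the default afterwards
echo 128 > /proc/sys/fs/inotify/max_user_instances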
So, having found the hints at https://github.com/lxc/lxd/blob/master/doc/production-setup.md I use a conservative version of them... (Actually, while creating this demo I determined that max_user_instances is the only crucially important one.)

root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-good-inotify
#!/bin/bash
#https://github.com/lxc/lxd/blob/master/doc/production-setup.md
# says use 1048576 = (1024*1024) for all three.
# That seems sloppy.
# In tests, 320 for max_user_instances seemed minimally adequate.
# 262144 for all seemed (more than) adequate.
# max_queued_events > max_user_watches > max_user_instances
#echo 262144 > /proc/sys/fs/inotify/max_queued_events
echo 360 > /proc/sys/fs/inotify/max_user_instances
#echo 196608 > /proc/sys/fs/inotify/max_user_watches
root@xsbook7:~#
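A rough back-of-envelope check of that 320 figure, under the assumption (mine, not measured here) that each container's init, journald, udev and friends hold several inotify instances apiece: 41 containers x 8 instances = 328, which would be consistent with 320 being barely adequate and 360 leaving a small margin.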
The above action will not immediately fix the problem. That is, the containers will not spontaneously unlock.
./make-good-inotify ./show-inotify ./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos | wc
So restart all hung containers, using the same qd-restart-demos script as before.

root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-restart-demos
#!/bin/bash
STALLED=`lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}' `
for c in $STALLED; do
    echo $c
    lxc-stop --timeout 2 -n $c
    lxc-start -n $c
done
root@xsbook7:~#

./qd-restart-demos ./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos | wc

Hurray!
Take-aways...

"lxc" https://linuxcontainers.org/ is a powerful near-virtualization method CSCF uses.

The sysctl command and /etc/sysctl.conf are interfaces to the more hacky manipulation of /proc/sys (for kernel tuning).

Some such values can impact heavy container use, and need to be changed.

man 5 proc ; man 7 inotify (these actually require the "manpages" package, which is sometimes not installed on containers).

"Too many open files" was a red herring.

This is your computer. This is your computer on lxc.
This is your computer on lxc with some tuning.
Any questions?
./qd-stop-demos ./make-new-inotify
/etc/sysctl.d/[0-9][0-9]-*

The basic underlying problem in this small case is that root is constrained as much as any other user for inotify instances.

The sysctl(8) command is the correct interface to use for the tuning. But direct manipulation of /proc is easier and more fun. In actual practice you use /etc/sysctl.conf etc.

Recently we observed that ubuntu1804-200 had...

ubuntu1804-200% grep -C3 inotify /etc/sysctl.d/*
/etc/sysctl.d/10-lxd-inotify.conf:# Increase the user inotify instance limit to allow for about
/etc/sysctl.d/10-lxd-inotify.conf-# 100 containers to run before the limit is hit again
/etc/sysctl.d/10-lxd-inotify.conf:fs.inotify.max_user_instances = 1024
ubuntu1804-200%
ubuntu1804-200% dpkg-query -S /etc/sysctl.d/10-lxd-inotify.conf
lxd: /etc/sysctl.d/10-lxd-inotify.conf
ubuntu1804-200%

So they backed off from their own suggestion of (1024*1024) for all three.

Other things I have tweaked...

/proc/sys/kernel/pty/max
/proc/sys/fs/file-max
(both probably red herrings)
/proc/sys/kernel/keys/maxkeys 2000 from 200
/proc/sys/net/core/netdev_max_backlog 25000 from 1000

The first of those last two is theoretical in my case. The second may be relevant if it applies to lxcbr0 local networking.
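To make such a tuning survive reboots yourself, the usual pattern is a drop-in file like the lxd one above (a sketch; the filename 60-local-lxc.conf is made up, any NN-name.conf under /etc/sysctl.d works):

# /etc/sysctl.d/60-local-lxc.conf  (hypothetical local override)
fs.inotify.max_user_instances = 1024

Then apply it without rebooting, either way:

sysctl --system     # reload /etc/sysctl.conf and everything in /etc/sysctl.d
sysctl -w fs.inotify.max_user_instances=1024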
-- AdrianPepper - 2019-10-25