Ubuntu Inotify Tuning Demo 20191025


Or how something seemingly perplexing can devolve into something trivial

I almost cancelled this talk as being too trivial.  However, part of it
shows that even some of the LXC creators seem to have been confused
about some of the details here.

 

Start































































 

Screenshot_20191022-173054.png  


00-intro

   Linux systems have /proc/sys

   Ubuntu systems have /etc/sysctl.conf
   -rw-r--r-- 1 root root 3751 Sep 27 13:07 /etc/sysctl.conf

   Entries like the following:
   kernel.pty.max=32768

   are applied at boot time, giving...

   cscf-adm@xsbook7:~% grep '^' /proc/sys/kernel/pty/max
   32768
   cscf-adm@xsbook7:~% 

   A.B.C.D => /proc/sys/A/B/C/D

   Lots of parameters affect obscure details of performance (limits).
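
The A.B.C.D rule above can be sketched as a small shell function. This is only an illustration: the name sysctl_key_to_path is made up for this page, and the plain dot-to-slash mapping assumes that no component of the key itself contains a dot.

```shell
#!/bin/bash
# Sketch: map a dotted sysctl key to the matching file under /proc/sys.
# sysctl_key_to_path is a hypothetical helper name, not a system command.
sysctl_key_to_path() {
    # Replace every "." in the key with "/" and prepend /proc/sys/.
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_key_to_path kernel.pty.max
sysctl_key_to_path fs.inotify.max_user_instances
```

The two calls print /proc/sys/kernel/pty/max and /proc/sys/fs/inotify/max_user_instances, which are the files that grep or cat can then read.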














./qd-stop-demos Two other windows, both root@xsbook7:/home/cscf-adm/demo-20191025
































Screenshot_20191022-173054.png  

10-lxc+inotify

    "lxc" is a containerization suite that CSCF uses.
          https://linuxcontainers.org/
       - create "fake machines", e.g. even on your own workstation/laptop

    I found I could not run more than about 7 lxc containers usefully
       at the same time on my 16G workstation.
    https://github.com/lxc/lxd/blob/master/doc/production-setup.md
       (seemingly part of the source documentation for lxd;
        lxd is the would-be successor to lxc)
       points to the files shown by "ls -ld /proc/sys/*/inotify/*"
 -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_queued_events
 -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_instances
 -rw-r--r-- 1 root root 0 Oct 17 12:02 /proc/sys/fs/inotify/max_user_watches

    (However, the information there seems to be wrong in its details.)

    My laptop (12G of memory, which may be relevant) now uses something like...
       cscf-adm@xsbook7:~% grep '^' /proc/sys/*/inotify/*
       /proc/sys/fs/inotify/max_queued_events:262144
       /proc/sys/fs/inotify/max_user_instances:131072
       /proc/sys/fs/inotify/max_user_watches:196608
       cscf-adm@xsbook7:~% 












grep '^' /proc/sys/*/inotify/*


































Screenshot_20191022-173156.png  

20-demo01

     To demonstrate the problem containers run into, I use the following
     script to go back to the default state...

     root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-default-inotify
     #!/bin/bash
     
     echo 16384  >  /proc/sys/fs/inotify/max_queued_events
     echo 128  >  /proc/sys/fs/inotify/max_user_instances
     echo 8192  >  /proc/sys/fs/inotify/max_user_watches
     
     root@xsbook7:~# 
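
Before resetting to the defaults it may be worth recording the current values so they can be put back afterwards. A minimal sketch, assuming the helper name snapshot_inotify and its optional directory argument (both invented here; the argument exists only so the function can be tried against a scratch directory):

```shell
#!/bin/bash
# Sketch: print the current inotify limits in sysctl.conf syntax.
# snapshot_inotify is a hypothetical helper, not one of the demo scripts.
snapshot_inotify() {
    dir=${1:-/proc/sys/fs/inotify}
    for f in "$dir"/max_*; do
        # Skip anything unreadable (or an unmatched glob).
        [ -r "$f" ] || continue
        printf 'fs.inotify.%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

snapshot_inotify
```

Redirecting the output to a file of your choosing gives something that "sysctl -p FILE" should be able to load back later.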













./make-default-inotify ./show-inotify































Screenshot_20191022-173407.png  

30-demo05-startlxc


Now, after setting the parameters to default, I use the following to start
all my 41 trivial demo containers...

      root@xsbook7:~# cat /home/cscf-adm/demo-20191025/start-all-demos
      #!/bin/bash

      # Takes maybe 30 seconds to run.
      # Must be run as superuser.

      # Select the demo containers by name, skipping any whose name ends in 00.
      CONTAINERS=$(lxc-ls -f | grep '^u....tunedemo' | awk '{print $1}' | grep -v '00$')
      for c in $CONTAINERS ; do
              echo "$c"
              lxc-start -n "$c"
      done

      root@xsbook7:~# 














./start-all-demos ./show-all-demos ./show-all-demos ./show-all-demos ./qd-show-hung-demos






































Screenshot_20191022-173501.png   Screenshot_20191022-173541.png  

40-demo10-showlxc

    Very sad, we wait and wait, but a lot of containers fail to get
    an IP address.  We detect that with the following command...


    root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-show-hung-demos
    #!/bin/bash

    # List containers that are RUNNING but show no address starting with 10.
    lxc-ls -f | grep RUNNING | grep -v ' 10[.]'

    root@xsbook7:~# 
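
The filter can be tried without any containers by feeding it text on standard input. The sketch below wraps the same two greps in a function. The name hung_from_listing and the three sample lines are made up for illustration, and the column layout is only assumed to resemble "lxc-ls -f" output.

```shell
#!/bin/bash
# Sketch: the qd-show-hung-demos filter as a function reading stdin.
# hung_from_listing is a hypothetical name; the sample lines are invented.
hung_from_listing() {
    # Keep RUNNING lines that show no address beginning with 10.
    grep RUNNING | grep -v ' 10[.]'
}

sample='demo-a RUNNING 0 - 10.0.3.11 -
demo-b RUNNING 0 - - -
demo-c STOPPED 0 - - -'

printf '%s\n' "$sample" | hung_from_listing
```

Only the demo-b line survives: demo-a has an address and demo-c is not running. On a real host, "lxc-ls -f | hung_from_listing | wc -l" would give the number of hung containers.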










./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos ./qd-show-hung-demos


























Screenshot_20191022-173742.png   Screenshot_20191022-173845.png  

45-demo12-restart

     Let's try restarting all the hung containers, although it won't fully help.
     (In my case it did at least let me demonstrate a red herring.)
     Use the following script...


     root@xsbook7:~# cat ./qd-restart-demos
     #!/bin/bash

     # Containers that are RUNNING but show no address starting with 10.
     # are treated as hung.
     STALLED=$(lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}')
     for c in $STALLED; do
             echo "$c"
             lxc-stop --timeout 2 -n "$c"
             lxc-start -n "$c"
     done

     root@xsbook7:~# 



























Screenshot_20191022-173742.png   Screenshot_20191022-173845.png   Screenshot_20191022-173945.png  

47-demo14-restart

./qd-restart-demos
./show-all-demos
./show-all-demos
./qd-show-hung-demos
./qd-show-hung-demos | wc
 
     Oh well.  One or two containers might advance.

     If we are lucky I can show you the "too many open files" diagnostic
     from "tail".  But that doesn't seem to happen in a live demonstration!
     (Red herring: the "open files" here are really failed attempts to
     create inotify instances.)


     You can set /proc/sys/fs/file-max as high as you want and it won't help.

     cscf-adm@xsbook7:~$ grep -H '^' /proc/sys/fs/file-max
     /proc/sys/fs/file-max:2317350
     cscf-adm@xsbook7:~$ 
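
One way to convince yourself that the system-wide file table is not the bottleneck is to compare how many file handles are allocated against the maximum. A sketch, assuming the usual three-field layout of /proc/sys/fs/file-nr (allocated, unused, maximum) as described in man 5 proc. The function name and its optional file argument are invented here.

```shell
#!/bin/bash
# Sketch: report allocated file handles versus the system-wide maximum.
# file_table_usage is a hypothetical helper; the argument is for testing.
file_table_usage() {
    src=${1:-/proc/sys/fs/file-nr}
    # Fields: allocated handles, unused handles, maximum (see man 5 proc).
    awk '{ printf "allocated %s of %s\n", $1, $3 }' "$src"
}

file_table_usage
```

If the first number is far below the second while containers are still hanging, raising file-max is unlikely to be the fix.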













tail -f /var/log/syslog





















Screenshot_20191022-174107.png   Screenshot_20191022-174203.png  

50-demo15-fixit

     So, having found the hints at
        https://github.com/lxc/lxd/blob/master/doc/production-setup.md
     I use a conservative version of them...
     (Actually while creating this demo I determined that
      max_user_instances is the only crucially important one).

     root@xsbook7:~# cat /home/cscf-adm/demo-20191025/make-good-inotify
     #!/bin/bash
     #https://github.com/lxc/lxd/blob/master/doc/production-setup.md
     # says use 1048576 = (1024*1024) for all three.
     
     # That seems sloppy.
     
     # In tests, 320 for max_user_instances seemed minimally adequate.
     # 262144 for all seemed (more than) adequate.
     # max_queued_events > max_user_watches > max_user_instances
     #echo 262144 > /proc/sys/fs/inotify/max_queued_events
     echo  360  > /proc/sys/fs/inotify/max_user_instances
     #echo 196608 > /proc/sys/fs/inotify/max_user_watches

     root@xsbook7:~# 
 

The above action will not immediately fix the problem.
That is, the containers will not spontaneously unlock.
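
A slightly more defensive variant of the fix only ever raises a limit, so running it on a machine that already has a larger value does no harm. This is a sketch, not the script used in the demo: raise_limit is a made-up helper, and 360 is simply the value from make-good-inotify above.

```shell
#!/bin/bash
# Sketch: raise a /proc/sys value only if it is currently lower.
# raise_limit is a hypothetical helper, not part of the demo scripts.
raise_limit() {
    file=$1
    want=$2
    cur=$(cat "$file") || return 1
    if [ "$cur" -lt "$want" ]; then
        echo "$want" > "$file"
    fi
}

# Needs root when pointed at the real file, for example:
#   raise_limit /proc/sys/fs/inotify/max_user_instances 360
```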






./make-good-inotify ./show-inotify ./show-all-demos ./qd-show-hung-demos ./qd-show-hung-demos | wc
































Screenshot_20191022-174349.png   Screenshot_20191022-174440.png   Screenshot_20191022-174529.png  

60-demo20-restart

     So restart all the hung containers.
     root@xsbook7:~# cat /home/cscf-adm/demo-20191025/qd-restart-demos
     #!/bin/bash
     
     # Containers that are RUNNING but show no address starting with 10.
     # are treated as hung.
     STALLED=$(lxc-ls -f | grep RUNNING | grep -v ' 10[.]' | awk '{print $1}')
     for c in $STALLED; do
             echo "$c"
             lxc-stop --timeout 2 -n "$c"
             lxc-start -n "$c"
     done

     root@xsbook7:~# 
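
Rather than re-running the check by hand, a polling loop can wait until the hung list is empty or a time limit passes. A sketch under assumptions: wait_until_empty is a made-up name, and the 60 tries in the usage comment is an arbitrary choice.

```shell
#!/bin/bash
# Sketch: run a command once a second until it prints nothing.
# wait_until_empty is a hypothetical helper, not part of the demo scripts.
# Usage: wait_until_empty TRIES COMMAND [ARGS...]
wait_until_empty() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Empty output means nothing is left to wait for.
        [ -z "$("$@")" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    # Still producing output after all tries.
    return 1
}

# For example, after ./qd-restart-demos:
#   wait_until_empty 60 ./qd-show-hung-demos && echo "all containers have addresses"
```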


./qd-restart-demos
./show-all-demos
./qd-show-hung-demos
./qd-show-hung-demos | wc

       Hurray!

































90-questions

     Take-aways...
     "lxc" https://linuxcontainers.org/ is a powerful 
          near-virtualization method CSCF uses

     The sysctl command and /etc/sysctl.conf are the cleaner interfaces to
          what is otherwise hacky manipulation of /proc/sys (for kernel tuning)

     Some such values can impact heavy container use, and need to be changed.

     man 5 proc ; man 7 inotify
     (These actually require the "manpages" package, which is sometimes
      not installed on containers.)

     "Too many open files" red herring.


     This is your computer.                           
     This is your computer on lxc.                    
     This is your computer on lxc with some tuning.         


     Any questions?




















./qd-stop-demos ./make-new-inotify























95-further

     /etc/sysctl.d/[0-9][0-9]-*

     The basic underlying problem in this small case is that root is
     constrained as much as any other user in its number of inotify instances.
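
To see how close a machine is to the limit, one can count the inotify instances currently open. Each instance appears as a file descriptor whose symlink target is anon_inode:inotify. A sketch, assuming GNU find (for -lname) and enough privilege to read other users' /proc/PID/fd directories. The name count_inotify_instances and its optional root argument are invented for this page.

```shell
#!/bin/bash
# Sketch: count open inotify instances across all visible processes.
# count_inotify_instances is a hypothetical helper; the argument is for testing.
count_inotify_instances() {
    root=${1:-/proc}
    # Look only at PID/fd/N entries; ignore processes we cannot inspect.
    find "$root"/[0-9]*/fd -maxdepth 1 -lname 'anon_inode:inotify' 2>/dev/null \
        | wc -l | tr -d ' '
}

count_inotify_instances
```

Note that this counts all users together, while max_user_instances applies per user, so the total is only an upper bound on any one user's usage.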

     The sysctl(8) command is the correct interface to use for the
     tuning, but direct manipulation of /proc is easier and more fun.
     In actual practice you use /etc/sysctl.conf and the /etc/sysctl.d files.
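
For a setting that should survive a reboot, the usual route is a small file under /etc/sysctl.d. A sketch, assuming a file name of our own choosing (60-inotify-demo.conf is hypothetical) and the value 360 from the demo. The helper write_inotify_conf is likewise made up.

```shell
#!/bin/bash
# Sketch: write a sysctl drop-in that sets the inotify instance limit.
# write_inotify_conf is a hypothetical helper, not part of the demo scripts.
write_inotify_conf() {
    out=$1
    value=$2
    printf '# Raise the per-user inotify instance limit for lxc containers.\n' > "$out"
    printf 'fs.inotify.max_user_instances = %s\n' "$value" >> "$out"
}

# As root, for example:
#   write_inotify_conf /etc/sysctl.d/60-inotify-demo.conf 360
#   sysctl --system     # reload all sysctl configuration files
```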

     Recently we observed that ubuntu1804-200 had...
     ubuntu1804-200% grep -C3 inotify /etc/sysctl.d/*
     /etc/sysctl.d/10-lxd-inotify.conf:# Increase the user inotify instance limit to allow for about
     /etc/sysctl.d/10-lxd-inotify.conf-# 100 containers to run before the limit is hit again
     /etc/sysctl.d/10-lxd-inotify.conf:fs.inotify.max_user_instances = 1024
     ubuntu1804-200% 
     ubuntu1804-200% dpkg-query -S /etc/sysctl.d/10-lxd-inotify.conf
     lxd: /etc/sysctl.d/10-lxd-inotify.conf
     ubuntu1804-200% 

     So they backed off from their own suggestion of (1024*1024) for all three.

     Other things I have tweaked...
     /proc/sys/kernel/pty/max
     /proc/sys/fs/file-max
     (both probably red herrings)

     /proc/sys/kernel/keys/maxkeys            2000 from 200
     /proc/sys/net/core/netdev_max_backlog    25000 from 1000
     
     The first is theoretical in my case.
     The second may be relevant if it applies to lxcbr0 local networking.





































-- AdrianPepper - 2019-10-25
