Linux Working Group



AGENDA LOCKED - 2022-04-06

Invitees - Attendees

  • Invited: Anthony, Adrian, Guoxiang, Clayton, Lori, Fraser, Nathan, Dave, Omar, Devon, Todd, Nick, Lawrence
  • Attended: Anthony, Adrian, Guoxiang, Clayton, Lori, Fraser, Nathan, Omar, Devon, Todd, Lawrence, Dave
  • Absent: Nick

Review and accept previous meeting minutes.

Last meeting's tasks [10 minutes]

  • "Dirty Pipe" - https://rt.uwaterloo.ca/Ticket/Display.html?id=1213217
  • linux.student.cs load average is much higher and more variable than in the past.
    • Anthony - will create a ticket for monitoring processor data - will work with Devon
    • Fraser - would like to see a combined load average all on one screen
      • Devon - yes we can do that
      • Lori - would also like that for DFSc
      • Lori - would also like to see network usage (netstat -ai)
        • Devon - there are some stats - are they sufficient? needs to be graphed
  • Clayton - document process of adding hosts to AD and move to a generally accessible place
    • Fraser - can be a beta tester for the graphics lab
    • where to put such a script/files?
      • git?
      • Anthony uses /usr/local/* (comes from the Salt master or debian packages)
    • RT#1217894
  • Lawrence / RSG - update jerusalem and graceland to mount new NFS share - RT#1194157
    • no news
  • Dave - put up Beta version of new Virtual Host Index / Anthony to create a ticket - RT#1211603 -> working on it
    • not yet; Anthony has seen work in progress
    • Adrian - is there any plan to have a clean text dump report to look at without a web browser?
      • Anthony: should be doable, but the original output was meant to be parsed by other tools
      • Fraser is also interested in machine-readable output
  • Fraser to create a ticket for updating the Graphics Lab machines - RT#1217894
    • Anthony will reach out when patched kernel is available, machines will need to be rebooted
  • Anthony - will create a ticket for monitoring processor data - will work with Devon
    • not yet, combine with ticket Fraser was looking for (see below)
    • Fraser - create request for Devon for combined general use cpu load graph (and possibly other metrics)
  • Lori - create ticket for Devon to create combined graph for DFSc
    • not yet
  • Lori - create RT for graphing network statistics data
  • Lawrence - follow-up with SuperMicro re: RT#1079451 - ubuntu1804-006 CPU hardware errors
    • LF to create the RMA for CPU only
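
As a rough illustration of the combined load view Fraser and Lori asked for, the sketch below parses the contents of /proc/loadavg from several hosts and lists them together, with the 1-minute load also shown per core. The host names, core counts and readings are made-up placeholders, and how the readings would actually be collected and graphed is still up to the tickets above.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float
    five: float
    fifteen: float


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the contents of /proc/loadavg, e.g. '0.42 0.36 0.30 1/234 5678'."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return LoadAvg(one, five, fifteen)


def combined_report(readings: dict[str, tuple[str, int]]) -> list[str]:
    """Return one line per host: name, 1/5/15-minute load, 1-minute load per core.

    readings maps host name -> (raw /proc/loadavg text, core count).
    """
    lines = []
    for host, (raw, cores) in sorted(readings.items()):
        load = parse_loadavg(raw)
        per_core = load.one / cores
        lines.append(
            f"{host:<12} {load.one:6.2f} {load.five:6.2f} "
            f"{load.fifteen:6.2f} {per_core:6.2f}"
        )
    return lines


if __name__ == "__main__":
    # Hypothetical hosts and sample readings, for illustration only.
    sample = {
        "host-a": ("12.50 10.25 8.00 3/900 4242", 8),
        "host-b": ("1.00 2.00 3.00 1/300 1234", 4),
    }
    for line in combined_report(sample):
        print(line)
```

Dividing by core count is one possible way to make hosts of different sizes comparable on a single screen; raw values are kept alongside so nothing is hidden.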

New Items

  • What are the plans for IaaS systems not in use - Lawrence - 15 minutes
    • we've talked about needing more machines in the teaching environment, for example
    • according to https://cs.uwaterloo.ca/cscf/internal/infrastructure/inventory/virtual-host-index/ a number of systems have no running containers, specifically:
      • dc-3558-208.cloud.cs: Ubuntu 20.04.3 LTS (Kernel: 5.13.0-35-generic)
      • dc-3558-209.cloud.cs: Ubuntu 16.04.7 LTS (Kernel: 4.15.0-142-generic)
        • Newly vacant, destined for teaching environment (a2brenna)
      • dc-3558-210.cloud.cs: Ubuntu 18.04.6 LTS (Kernel: 5.4.0-91-generic)
      • dc-3558-211.cloud.cs: Ubuntu 20.04.4 LTS (Kernel: 5.13.0-35-generic)
      • m3-3101-207.cloud.cs: Ubuntu 20.04.3 LTS (Kernel: 5.13.0-35-generic)
        • Running KVM instance(s) (a2brenna)
      • m3-3101-211.cloud.cs: Ubuntu 20.04.3 LTS (Kernel: 5.13.0-35-generic)
        • Being used for network filesystem testing (a2brenna)
      • mc-3015-206.cloud.cs: Ubuntu 20.04.3 LTS (Kernel: 5.13.0-35-generic)
      • mc-3015-211.cloud.cs: Ubuntu 20.04.4 LTS (Kernel: 5.13.0-35-generic)
        • Possibly being used for ZFS related testing, fhgunn knows more (a2brenna)
    • see also RT#1151451
      • some have stated plans in inventory in the Purpose field
    • discussion about spare machines in machine rooms - hot spare or
  • at the end of term as part of update and reboot cycle will enable memory and CPU limit enforcement
    • Adrian: is there a way for a process to know what the CPU limit is within a container?
    • Anthony - possibly not; e.g. pbzip2 detects the number of CPUs and may get messed up
    • Fraser: VSCode wakes up and runs all sleeping tasks
    • Lawrence - is there a way of determining actual current CPU and memory usage?
    • Fraser - difficult to estimate reasonable limits, would like to see the data
    • Anthony: the data should be available
    • plan - look at containers that don't have limits defined now (base limit/minimum would be 2 cores / 4GB RAM)
      • Dave - default would be 4 cores / 16GB RAM
    • Dave - new Virtual Host Index is planned to include CPU and RAM configuration
    • Anthony - disk space limits are very difficult to enforce
    • Nathan - have we considered zvols or moving LXC into KVMs?
    • discussion about disk space limits ...
  • HWE Kernels - Anthony - 15 minutes - RT: 1217576
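
On Adrian's question about whether a process can learn its container's CPU limit: tools that simply count CPUs may see more CPUs than the container is allowed to use, while a CPU quota set through cgroups is exposed separately. The sketch below is one possible approach, assuming cgroup v2 is mounted at /sys/fs/cgroup inside the container (an assumption about the setup, not something confirmed in the meeting). It reads cpu.max and falls back to the visible CPU count when no quota can be read.

```python
import os
from pathlib import Path
from typing import Optional

# Assumed cgroup v2 location; adjust if the container is set up differently.
CPU_MAX = Path("/sys/fs/cgroup/cpu.max")


def parse_cpu_max(text: str) -> Optional[float]:
    """Convert cpu.max contents ('<quota> <period>' or 'max <period>') to CPUs.

    Returns None when no quota is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_cpus(path: Path = CPU_MAX) -> float:
    """Return the cgroup CPU quota if readable, else the visible CPU count."""
    visible = os.cpu_count() or 1
    try:
        limit = parse_cpu_max(path.read_text())
    except (OSError, ValueError):
        limit = None  # no cgroup v2 file here, or unexpected contents
    return visible if limit is None else min(limit, visible)


if __name__ == "__main__":
    print(f"visible CPUs: {os.cpu_count()}, effective CPUs: {effective_cpus()}")
```

A program would have to perform a check like this itself; software that only counts CPUs would not pick up the quota.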

Action Items for next meeting

  • Anthony - will create a ticket for monitoring processor data - will work with Devon
    • Fraser - create request for Devon for combined general use cpu load graph (and possibly other metrics)
  • Clayton - document process of adding hosts to AD and move to a generally accessible place
  • Lawrence / RSG - update jerusalem and graceland to mount new NFS share - RT#1194157
  • Dave - put up Beta version of new Virtual Host Index / Anthony to create a ticket - RT#1211603 -> working on it
  • Lori - create ticket for Devon to create combined graph for DFSc
  • Lawrence - follow-up with SuperMicro re: RT#1079451 - ubuntu1804-006 CPU hardware errors
  • Anthony - update the Purpose field of currently "unused" machines in the Virtual Host Index
  • All on ticket: Document usage of HWE Kernels RT#1217576
Topic revision: r7 - 2022-04-06 - LawrenceFolland
 