Linux Working Group



AGENDA LOCKED

Invitees - Attendees

  • Invited: Anthony, Adrian, Guoxiang, Clayton, Lori, Fraser, Nathan, Dave, Omar, Devon, Todd, Nick, Lawrence
  • Attended: Adrian, Anthony (group leader), Clayton, Lori, Fraser, Devon, Nathan, Todd, Omar

Review and accept previous meeting minutes.

Last meeting's tasks [10 minutes]

  • Lawrence - confirm with Daniel the maintenance of OpenDCIM
    • OpenDCIM is being maintained as a project. We need to catch up on some updates, but no security issues
    • will plan an update in the summer, I think
  • Lawrence - work with Graham to update the OpenDCIM data
    • will be working with Graham on this project
  • Clayton - document process of adding hosts to AD and move to a generally accessible place
    • keytab work
    • 4.13 auto updates of Samba causing issues. Currently pinned.
    • at a point where Clayton believes that people can use it
      • where will it be documented and where will it live?
    • Clayton - have updated most keytabs, but any updates let him know
  • Lawrence / RSG - update jerusalem and graceland to mount new NFS share - RT#1194157
    • not yet, will be working with Tom
  • Dave - put up Beta version of Virtual Host Index / Anthony to create a ticket - RT#1211603 -> working on it
    • Dave's not here; no new version up yet, but Anthony understands that it should be ready soon
  • "Dirty Pipe" - https://rt.uwaterloo.ca/Ticket/Display.html?id=1213217
    • https://ubuntu.com/security/CVE-2022-0847
    • Present from 5.8 onward, possibly on older kernels via backport
    • Fraser: prudent to rebuild all login servers from scratch. O(hours) to rebuild all student login machines
    • ctucker: reasonable to replace keytab files, invalidate existing keytabs
    • Rebuild webservers and other container servers
    • Appetite for risk on rebuilding servers? Question for management - RT#???
      • outstanding question - no response from management
      • however, technical consensus is that we are not prepared to do that now
      • could do on a round-robin basis (one at a time)
      • Lori: could "cloud init" be used to start with a standard image?
      • Anthony: probably not - would likely be more work to maintain than it would save
      • would eventually like to get to a stateless setup
    • Lori - setuid binaries in user filesystems? Mount a survey (per term?). Set ACL on snapshots? https://rt.uwaterloo.ca/Ticket/Display.html?id=1213685
      • in upcoming Ceph version there is a "root squash" feature - add to to-do list
    • Run vulnerability check on questionable systems (e.g. 5.4 kernels)
    • Fraser to create a ticket for updating the Graphics Lab machines
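
For the "run vulnerability check on questionable systems" item, a first-pass triage by kernel release string might look like the sketch below. It is an illustration, not an agreed procedure: the function names and status labels are made up for this example, and it only compares against the upstream stable releases that carry the CVE-2022-0847 fix (5.16.11, 5.15.25, 5.10.102). Distribution kernels patch without changing the upstream version number, and older kernels may have picked up the bug via backport, so any result still needs checking against the vendor advisory.

```python
import platform
import re

# The bug was introduced upstream in the 5.8 series.
INTRODUCED = (5, 8)
# Upstream stable series that received the fix, mapped to the first fixed patch level.
FIXED_IN = {(5, 10): 102, (5, 15): 25, (5, 16): 11}


def parse_release(release):
    """Extract (major, minor, patch) from a string such as '5.4.0-104-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def dirty_pipe_status(release):
    """Classify a kernel release by upstream version only (hypothetical labels)."""
    major, minor, patch = parse_release(release)
    series = (major, minor)
    if series < INTRODUCED:
        # Not affected upstream; a vendor backport could still have introduced it.
        return "pre-5.8"
    if series > max(FIXED_IN):
        return "fixed-upstream"
    fixed_patch = FIXED_IN.get(series)
    if fixed_patch is not None and patch >= fixed_patch:
        return "fixed-upstream"
    # Unfixed patch level, or a series that never received an upstream fix.
    return "likely-vulnerable"


def local_status():
    """Classify the kernel of the machine this runs on."""
    return dirty_pipe_status(platform.release())
```

On an Ubuntu host the release string (for example 5.13.0-30-generic) reflects the base upstream version, so "likely-vulnerable" here means "check the installed package version against the Ubuntu security notice", nothing stronger.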

New proposed agenda items (include name and desired time)

  • linux.student.cs load average is much higher and more variable than in the past. (Fraser, 10 minutes)
    • concern is load average is 10x what it used to be
    • afternoons and evenings much higher
    • ratio of load average to number of cores
    • processes in the "runnable" state, but may be waiting for disk
    • used to be that more processes were waiting for Ceph, but not so much anymore
    • right now on cs-general general use hosts, a single user slamming a machine
      • Peter van Beek; correctly niced +10; Fraser talked to Peter and moved the (CPU cores and RAM intensive) research work to ugster73*
    • teaching hosts - load from actual compute jobs / VS Code
    • Omar - how closely does load average correlate with user response?
      • Anthony: if load stays below number of cores, should not impact response
      • Fraser - would see it jump up to 200 then back to 60 - very jumpy
      • Anthony - does Icinga have process usage monitoring?
      • Devon - used to, with smtp daemon
      • Anthony - will create a ticket for monitoring processor data - will work with Devon
      • Fraser - would like to see a combined load average all on one screen
        • Devon - yes we can do that
        • Lori - would also like that for DFSc
        • Lori - would also like to see network usage (netstat -ai)
          • Devon - there are some stats - are they sufficient? needs to be graphed
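
The ratio of load average to number of cores discussed above can be sampled on a Linux host roughly as follows. This is a minimal sketch, not the planned Icinga check: all function names are hypothetical, and it assumes the standard /proc/loadavg and /proc/<pid>/stat layouts. Linux counts tasks in uninterruptible sleep (state D, typically waiting on disk or network filesystem I/O) in the load average along with runnable ones, so the per-state tally helps separate compute load from I/O wait.

```python
import os


def parse_loadavg(text):
    """Parse /proc/loadavg content into (1min, 5min, 15min, runnable, total)."""
    fields = text.split()
    one, five, fifteen = (float(value) for value in fields[:3])
    runnable, total = (int(value) for value in fields[3].split("/"))
    return one, five, fifteen, runnable, total


def load_per_core(load, cores):
    """Load average divided by core count; below 1.0 means spare CPU capacity."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores


def state_from_stat(data):
    """Return the state letter from /proc/<pid>/stat content.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' rather than on whitespace.
    """
    return data.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Tally processes by state letter (R = running/runnable, D = uninterruptible sleep)."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                state = state_from_stat(handle.read())
        except (OSError, IndexError):
            continue  # process exited while we were reading it
        counts[state] = counts.get(state, 0) + 1
    return counts


def sample():
    """Take one reading on the local host (Linux only)."""
    with open("/proc/loadavg") as handle:
        one, _five, _fifteen, _runnable, _total = parse_loadavg(handle.read())
    cores = os.cpu_count() or 1
    return {"load_per_core": load_per_core(one, cores), "states": count_states()}
```

Calling sample() periodically on each general-use host and graphing load_per_core side by side would give the combined view Fraser asked for; a large D count alongside a high load points at storage rather than CPU.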

  • Looking for some out-of-production hardware for OpenNebula work (Lawrence, 5 minutes)
    • 2 machines - 32 GB+, SSDs
      • Devon has an R815 we can use
      • Fraser offered the ugster200s - although they have only 16GB RAM and only one 7200rpm SATA drive each (0 or all 5)
      • as an aside - CS013499 (formerly ubuntu2004-006) is dead and needs RMA

Action items

  • Clayton - document process of adding hosts to AD and move to a generally accessible place
  • Lawrence / RSG - update jerusalem and graceland to mount new NFS share - RT#1194157
  • Dave - put up Beta version of Virtual Host Index / Anthony to create a ticket - RT#1211603 -> working on it
  • Fraser to create a ticket for updating the Graphics Lab machines
  • Anthony - will create a ticket for monitoring processor data - will work with Devon
  • Fraser - create request for Devon for combined general use cpu load graph (and possibly other metrics)
  • Lori - create ticket for Devon to create combined graph for DFSc
  • Lori - create RT for graphing network statistics data
  • Lawrence - follow-up with SuperMicro re: RT#1079451
Topic revision: r10 - 2022-03-26 - FraserGunn