Linux Working Group

Meeting Date

  • TEAMS: 2021-12-01

Invitees - Attendees

  • Adrian, Anthony (group leader), Clayton, Guoxiang, Lori, Fraser, Devon, Nathan, Nick, Todd, Dave, Lawrence, Omar

Review and accept previous meeting minutes.

Proposed Agenda Items

NetApp Retirement

  • Dave has committed to MFCF that we will be all done with the NetApp by end of January 2022

Migrate remaining data

  • In General
    • Any progress on moving Dan Berry's /opt/csw?
      • RSG discuss moving to jerusalem directly
      • Configure an NFS share via gateways: lfolland will create ticket for ctucker - Ticket created: RT#1194157
    • Any word from Isaac about use of the NetApp for storing web logs? Ticket (a2brenna) for ijmorlan was created late; no movement as of yet.
      • Ticket created: RT#
  • TEACHING - Progress reports on
    • Unmounting /oldhome immediately (gxshen)
    • Perform final (end-of-year) backup of /oldhome data on the NetApp and remove the data from the NetApp in January (gxshen)
      • Guoxiang: did a full backup in May 2020
      • /opt/CSCF/packages is also being unmounted (collections of really old packaged software - no longer used)
    • xhier regional trees for CS-GENERAL and TEACHING: Destinations exist on DFSC. Schedule migration?
      • currently in /opt/uwcs - just local file storage, no longer on NetApp - so, done!
  • Are we still on track for moving mail in December?
    • lfolland sent notification for move CS mail to IST on 2021-11-24 - Done
      • Do not move mail homes to DFSC until after move to IST
      • lfolland to discuss with sdinney
    • /var/mail (still on NetApp)
      • can choose a time to migrate
      • a few people still using it
      • need to schedule a time to unmount /var/mail
        • expected not to take too long
        • will need to update mounts for both teaching and general regions (can be done separately)
        • all of the data can be copied ahead of time, then only the diffs synced at the time of the switch
        • Dave and Guoxiang to plan the specific process
        • Date for the switch?
      • should be coordinated with unmount and reboot of teaching and general use servers at end of term
      • Fraser notes that he is still getting some new mail on CS (can be from CS to CS)
      • Adrian points out there are several reasons mail still comes directly
        • spammers, Let's Encrypt, other CS mail users, CS systems using mx.cs
        • servers to be changed to send mail directly to the destination - has that happened?
          • configuration change to postfix required
          • will need a recipe for other machines that may be set up to do that
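
The /var/mail plan above (copy all data ahead of time, then sync only the diffs at the switch) can be illustrated with a minimal sketch. This is not the process Dave and Guoxiang will use; the function name `sync_tree` and the directory layout are hypothetical, and a real migration would more likely rely on a tool such as rsync and would also need to handle ownership, deletions, and mail delivered during the copy.

```python
import filecmp
import shutil
from pathlib import Path


def sync_tree(src: Path, dst: Path) -> list:
    """Copy files that are missing or changed in dst; return their relative paths.

    Hypothetical helper for illustration only. It ignores deletions,
    ownership, and locking, all of which matter for a live mail spool.
    """
    copied = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        # Copy when the destination is missing or its contents differ.
        if not dst_file.exists() or not filecmp.cmp(src_file, dst_file, shallow=False):
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)  # copy2 also copies timestamps and mode bits
            copied.append(rel.as_posix())
    return copied
```

Running `sync_tree` once ahead of time performs the bulk copy; running it again at the switch transfers only the files that changed in between, which is why the final step is expected to be short.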

TEACHING environment

CS Teaching - slowness of systems - how to address? (All)

  • Ceph:
    • Gateway systems (NFS/Samba) upgrade status? Schedule blacklisting of old client.
    • 5.11 HWE kernel has been deployed but not all TEACHING and CS-GENERAL hosts have been rebooted. Will be handled by end of term update/reboot cycle. (a2brenna)
  • Unscheduled reboots
    • ubuntu2004-008 rebooted. Why?
      • Caused by general protection fault in kernel space, kernel panicked. Either a kernel bug, or possibly a memory issue. No other evidence of hardware memory issues on the machine at this time. (a2brenna)
        • have a kernel core dump if someone has time to investigate
    • ubuntu2004-004 rebooted. Why?
      • According to Dell support this was caused by a "Corrected Memory Error"? (gxshen)
      • Machine was left in a bad state (failed to POST); see RT#1196085
      • Initiate warranty claim (a2brenna)
      • concerns about possible power issues - power bounced, dirty power?
      • Devon notes that there are deviations of 5V every 6 hours

Other Issues


  • How much disk space does it actually need? (nfish)
    • 256GB for root, /var/lib/postgresql on an Optane
    • larger storage on dfs
    • need to be clearer on the expected paths/configuration
  • Expected arrival date of new hardware? (dlgawley) - RT#1159697
    • 2TB Samsung Pro (for 203 systems - newest HP ProLiant)
    • expected to be "204" systems, but haven't arrived yet
  • Possibly recover 960GB Optane cards from 422 systems? (ldpaniak)
    • need to follow-up with Lori

mysql systems

  • Dave: what increments do we need?
  • Fraser - don't have a quick answer, 1 should be sufficient, as long as backups on a different

Changes to Regions:

  • OpenEdx has requested modest increase in disk space for test environment. Not a problem. (Narrator: It was a problem) (a2brenna)
    • done, Todd is happy

Storage option catalogue:

  • NFS, Samba, S3, RBD, posix, web, Nextcloud: ticket for ldpaniak/gxshen to create list
  • ongoing
  • ticket still to be created?

OdysseyDB backup volume (s8weber) on DFSC - RT#1193614

  • nfish: only a temp requirement
  • need to determine how to connect - on the ring, or NFS mount

VScode using SSH and git - desire for a workflow / instructional material

  • Omar - using VScode and SSH with git
  • for Spring 2022
  • planning to assign leadership from the TOP group
  • Nathan - will need to work on inotify issue
  • Omar will make tickets

20.04 Prototype NetTop

  • Clayton - not ready for production
  • it is in MC ...?

Action items:

  • Clayton: RT#1194157 - NFS mount for /opt/csw
  • Dave/Guoxiang - decide on a date for the /var/mail switch
  • Anthony/Adrian - work on new postfix recipe to have servers send mail out directly
  • Anthony/Guoxiang - initiate warranty - RT#1196085 - under warranty until 2025-03-30
  • Devon - collect power data to show Plant Operations
  • Lori - Possibly recover 960GB Optane cards from 422 systems?
  • Guoxiang/Lori - create ticket (or report ticket #) for Storage option catalogue
  • Omar - create ticket(s) for VScode/git workflow
Topic revision: r4 - 2021-12-13 - AnthonyBrennan