Linux Working Group
Meeting Date
Invitees - Attendees
- Adrian, Anthony (group leader), Clayton, Guoxiang, Lori, Fraser, Devon, Nathan, Nick, Todd, Dave, Lawrence, Omar
Review and accept previous meeting minutes.
Proposed Agenda Items
NetApp Retirement
- Dave has committed to MFCF that we will be all done with the NetApp by end of January 2022
Migrate remaining data
- In General
- Any progress on moving Dan Berry's /opt/csw?
- RSG to discuss moving to jerusalem directly
- Configure an NFS share via gateways: lfolland will create ticket for ctucker - Ticket created: RT#1194157
- Any word from Isaac about use of the NetApp for storing web logs? Ticket (a2brenna) for ijmorlan was created late; no movement yet.
- TEACHING - Progress reports on
- Unmounting /oldhome immediately (gxshen)
- Perform final (end-of-year) backup of /oldhome data on the NetApp and remove the data from the NetApp in January (gxshen)
- Guoxiang: did a full backup in May 2020
- /opt/CSCF/packages is also being unmounted (collections of really old packaged software - no longer used)
- XHIER
- xhier regional trees for CS-GENERAL and TEACHING: Destinations exist on DFSC. Schedule migration?
- currently in /opt/uwcs - just local file storage, no longer on NetApp - so, done!
- Are we still on track for moving mail in December?
- lfolland sent notification for moving CS mail to IST on 2021-11-24 - Done
- Do not move mail homes to DFSC until after move to IST
- lfolland to discuss with sdinney
- /var/mail (still on NetApp)
- can choose a time to migrate
- a few people still using it
- need to schedule a time to unmount /var/mail
- expected not to take too long
- will need to update mounts for both teaching and general regions (can be done separately)
- all of the data can be copied ahead of time, then only the diffs copied at the time of the switch
- Dave and Guoxiang to plan the specific process
- Date for the switch?
- should be coordinated with unmount and reboot of teaching and general use servers at end of term
- Fraser notes that he is still getting some new mail on CS (can be from CS to CS)
- Adrian points out there are several reasons mail still comes directly
- spammers, Let's Encrypt, other CS mail users, CS systems using mx.cs
- servers to be changed to send mail directly to the destination - has that happened?
- configuration change to postfix required
- will need a recipe for other machines that may be set up to do that
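A possible sketch of the "copy ahead of time, then diffs at the time" approach noted for /var/mail above. It is not the agreed procedure (Dave and Guoxiang are still to plan that). It only builds the rsync command lines for a bulk pre-copy, a dry run of the final pass, and the final pass itself. The destination path `/mnt/new-mail` and the function names are hypothetical placeholders; the rsync options used (`--archive`, `--hard-links`, `--numeric-ids`, `--delete`, `--dry-run`) are standard.

```python
import shlex
from typing import List


def rsync_command(src: str, dest: str, final_pass: bool = False,
                  dry_run: bool = False) -> List[str]:
    """Build the rsync argument list for one pass of the migration."""
    cmd = ["rsync", "--archive", "--hard-links", "--numeric-ids"]
    if final_pass:
        # Final pass: also remove files deleted at the source since the bulk copy.
        cmd.append("--delete")
    if dry_run:
        # Report what would change without touching the destination.
        cmd.append("--dry-run")
    # A trailing slash on the source means "copy the directory contents".
    cmd += [src.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


def migration_plan(src: str, dest: str) -> List[List[str]]:
    """Return the three passes: bulk pre-copy, dry-run diff, final diff."""
    return [
        rsync_command(src, dest),                                # ahead of time
        rsync_command(src, dest, final_pass=True, dry_run=True),  # preview diffs
        rsync_command(src, dest, final_pass=True),                # at switch time
    ]


if __name__ == "__main__":
    # "/mnt/new-mail" is a hypothetical destination, not a path from the minutes.
    for step in migration_plan("/var/mail", "/mnt/new-mail"):
        print(shlex.join(step))
```

The final pass would be run only after mail delivery to /var/mail has been stopped, so that nothing changes between the last diff and the remount.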
TEACHING environment
CS Teaching - slowness of systems - how to address? (All)
- Ceph:
- Gateway systems (NFS/Samba) upgrade status? Schedule blacklisting of old client.
- 5.11 HWE kernel has been deployed but not all TEACHING and CS-GENERAL hosts have been rebooted. Will be handled by end of term update/reboot cycle. (a2brenna)
- Unscheduled reboots
- ubuntu2004-008 rebooted. Why?
- Caused by general protection fault in kernel space, kernel panicked. Either a kernel bug, or possibly a memory issue. No other evidence of hardware memory issues on the machine at this time. (a2brenna)
- have a kernel core dump if someone has time to investigate
- ubuntu2004-004 rebooted. Why?
- According to Dell support this was caused by a "Corrected Memory Error"? (gxshen)
- Machine was left in a bad state (failed to POST); see RT#1196085
- Initiate warranty claim (a2brenna)
- concerns about possible power issues - power bounced, dirty power?
- Devon notes that there are deviations of 5V every 6 hours
Other Issues
mc-3015-postgres-2004
- How much disk space does it actually need? (nfish)
- 256GB for root, /var/lib/postgresql on an Optane
- larger storage on dfs
- need to be clearer on the expected paths/configuration
- Expected arrival date of new hardware? (dlgawley) - RT#1159697
- 2TB Samsung Pro (for 203 systems - newest HP ProLiant)
- expected to be "204" systems, but haven't arrived yet
- Possibly recover 960GB Optane cards from 422 systems? (ldpaniak)
- need to follow-up with Lori
mysql systems
- Dave: what increments do we need?
- Fraser - don't have a quick answer, 1 should be sufficient, as long as backups on a different
Changes to Regions:
- OpenEdx has requested a modest increase in disk space for its test environment. Not a problem. (Narrator: It was a problem) (a2brenna)
Storage option catalogue:
- NFS, Samba, S3, RBD, POSIX, web, Nextcloud: ticket for ldpaniak/gxshen to create list
- ongoing
- ticket still to be created?
OdysseyDB backup volume (s8weber) on DFSC - RT#1193614
- nfish: only a temporary requirement
- need to determine how to connect - on the ring, or NFS mount
VScode using SSH and git - desire for a workflow / instructional material
- Omar - using VScode and SSH with git
- for Spring 2022
- planning to assign leadership from the TOP group
- Nathan - will need to work on inotify issue
- Omar will make tickets
20.04 Prototype NetTop
- Clayton - not ready for production
- it is in MC ...?
Action items:
- Clayton: RT#1194157 - NFS mount for /opt/csw
- Dave/Guoxiang - decide on a date for the /var/mail switch
- Anthony/Adrian - work on new postfix recipe to have servers send mail out directly
- Anthony/Guoxiang - initiate warranty - RT#1196085 - under warranty until 2025-03-30
- Devon - collect power data to show Plant Operations
- Lori - Possibly recover 960GB Optane cards from 422 systems?
- Guoxiang/Lori - create a ticket (or report the existing ticket number) for the storage option catalogue
- Omar - create ticket(s) for VScode/git workflow