Linux Working Group



Meeting Date

  • TEAMS: 2023-11-29

Invited

Anthony (group leader), Lori, Dave, O, Clayton, Guoxiang, Nathan, Nick, Todd, Ed, Devon

Attendees

Review and accept previous meeting minutes.

CsLWGMeeting20231115

Review last meeting's Action Items

Home directory quotas (a2brenna) - RTs: RT#1112506, RT#1288354, RT#1298614, et al.

There is doubt among "CSCF exec" about the value of managing per-user quotas versus flagging excessive usage and relying on peer pressure
  • Plan to implement and enforce quotas will move forward as per Director
  • Delayed due to preparations for chilled water outage on Dec 6th.

100GB quotas in a staged rollout; plan detailed in the tickets above

  • Email sent, see RT#1288354
    • Nick sent a second reminder in mid-October; no replies. A small group of students (5 or 6) has gone over quota during the grace period
    • How do we monitor this monthly?
  • Check with SAT development team about status of storing a storage quota in appropriate sponsorship tables.
    • Clayton has set all the "maxstorage" user entries in each Domain to an initial 100,000,000B base amount.
    • Note that in the long run (hopefully by Aug 2024) this ends up being a base amount plus additional SAT entry sponsorships, once that mechanism is established.
  • Build a data file that contains the user and summed quota information. Specifics to be worked out between Clayton and Nathan.
    • Nathan has been pulled off this
  • Update quota CephFS xattr on home directories (Nathan)
    • Nathan has been pulled off this
  • Tooling to implement quotas
    • Calculate current quotas from sponsorship information, see RT#1298614 (Clayton)
  • There appear to be 3 (teaching, non-course-account) users with sponsored quota in excess of 100GB
    • Maybe re-run that query as sponsorships can change
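
The quota calculation discussed above (a base amount plus any sponsored additions, summed per user and written to a data file) could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the record layout, the names `BASE_QUOTA_BYTES`, `summed_quotas`, `over_quota` and `write_quota_file`, the CSV output and the sample users are all hypothetical. The real sponsorship data lives in SAT, and the file format is still to be worked out between Clayton and Nathan. The base is taken as 100 GB (decimal) purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumption: 100 GB (decimal) base quota for every user, for illustration only.
BASE_QUOTA_BYTES = 100 * 10**9


def summed_quotas(users, sponsorships, base=BASE_QUOTA_BYTES):
    """Return {user: base + sum of that user's sponsored bytes}.

    `sponsorships` is an iterable of (user, extra_bytes) pairs; this layout is
    hypothetical and stands in for whatever SAT eventually exposes.
    """
    extra = defaultdict(int)
    for user, extra_bytes in sponsorships:
        extra[user] += extra_bytes
    return {user: base + extra[user] for user in users}


def over_quota(quotas, usage):
    """Return a sorted list of users whose usage exceeds their quota."""
    return sorted(u for u, used in usage.items() if used > quotas.get(u, 0))


def write_quota_file(quotas, out):
    """Write one 'user,quota_bytes' row per user to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user", "quota_bytes"])
    for user in sorted(quotas):
        writer.writerow([user, quotas[user]])


if __name__ == "__main__":
    # Hypothetical sample data, not real accounts.
    users = ["alice", "bob"]
    sponsorships = [("alice", 50 * 10**9), ("alice", 25 * 10**9)]
    quotas = summed_quotas(users, sponsorships)
    usage = {"alice": 120 * 10**9, "bob": 130 * 10**9}

    buf = io.StringIO()
    write_quota_file(quotas, buf)
    print(buf.getvalue(), end="")
    print("over quota:", over_quota(quotas, usage))
```

Running `over_quota` against fresh usage numbers once a month would be one way to approach the monitoring question. On CephFS, a computed value is normally applied by setting the `ceph.quota.max_bytes` extended attribute on the directory; that step is left out of the sketch so it stays safe to run anywhere.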

CS Mailservers are going away - by Jan 1

  • Provided Lori with a list of hosts still using them as a relay; will check next week for action
  • alias / vanity addresses will stop working
  • csadviso@cs.uwaterloo.ca special forwarding will cease working; consulting with IST and Brad Lushman (a2brenna)

Ongoing problems with Inventory and IPAM are hobbling Infrastructure operations - RT#1285291

Will schedule some time to talk with Inventory team about the following (a2brenna)
  • Inventory is unaware of the IP / domain limitations in IPAM, as well as DHCP and MAC address requirements
  • Some CSCF staff do not have access to create manual DNS entries (Devon, Lori, Guoxiang, Todd have access. Dave?)
  • Inventory bug: Changing room field on a record with IPAM DNS & DHCP causes DHCP to break
  • Anthony to reach out to IST to clarify whether this is a policy or a technological limitation.
    • Delayed due to preparations for chilled water outage on Dec 6th.
  • Invalid records were imported from Infoblox that work until they are edited
  • Delayed due to preparations for chilled water outage on Dec 6th.

What's still using old MySQL?

NextCloud (Vault) pending migration

  • Scheduling Nextcloud DB migration (Nathan, Fraser)

Web server (includes Inventory)

  • needs OS (whole LAMP stack) to be updated

Retire CS-GENERAL and associated domain controllers

  • Last user is Vault
    • Vault upgrade needs to be performed
    • Vault migration from CS-GENERAL to GENERAL may take place after the upgrade. Nathan to determine which takes priority
    • Why can't Vault switch domains to GENERAL? (a2brenna) - File space in Vault is mapped to the user's UUID. Clayton has provided a mapping from GENERAL to CS-GENERAL.

Ongoing problems with NFS-Ganesha server RT#1303795

  • needs further enhancements to monitoring service?
    • Devon and Anthony to prepare a doc for the help desk
    • More comprehensive monitoring of NFS performance is in the works (a2brenna, dmerner) ~ end November
    • Delayed due to preparations for chilled water outage on Dec 6th.

Web Service failure - RT#1304871 (https://rt.uwaterloo.ca/Ticket/Display.html?id=1304871)

  • HAProxy hit its open file limit (4096 open file descriptors); the change made by a2brenna will not survive a reboot
  • Migrate off Ubuntu 18.04 and onto 20.04
  • Multiple failures over multiple days
  • Need to meet with team running service to provide permanent fix
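
To confirm whether a change to the open file limit has actually taken effect (for example after a reboot), the soft and hard limits of a running process can be read from `/proc/<pid>/limits` on Linux and compared with the number of descriptors it holds open. The following is a minimal sketch, not the fix itself: the function names and the 80% warning threshold are hypothetical, and the demo inspects the current process rather than HAProxy.

```python
import os


def parse_nofile_limits(limits_text):
    """Extract (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are ints, or None where the kernel reports 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def fd_usage_ratio(open_fds, soft_limit):
    """Fraction of the soft limit in use; 0.0 when the limit is unlimited."""
    if soft_limit is None:
        return 0.0
    return open_fds / soft_limit


def check_pid(pid, warn_at=0.8):
    """Return (open_fds, soft_limit, warn) for a pid.

    Linux only; needs permission to read /proc/<pid>/limits and /proc/<pid>/fd.
    """
    with open(f"/proc/{pid}/limits") as fh:
        soft, _hard = parse_nofile_limits(fh.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft, fd_usage_ratio(open_fds, soft) >= warn_at


if __name__ == "__main__":
    # Inspect this process as a stand-in; pass the service's pid in practice.
    print(check_pid(os.getpid()))
```

For services started by systemd, a limit that survives reboots is commonly set with `LimitNOFILE=` in the unit or a drop-in; whether that applies here depends on how the service is started, which is for the team running it to confirm.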

Monitoring Services

  • Number of false alerts is a concern.
  • Lack of Service Maintenance outside of standard working hours has been more of a problem lately.
    • Management is aware and needs to review this.

More usage data needed for labs (Mac and Linux) - RT#1284635 (https://rt.uwaterloo.ca/Ticket/Display.html?id=1284635)

New business

linux.cscf.uwaterloo.ca

  • New linux.cscf.uwaterloo.ca running Ubuntu 22.04 is almost ready
    • Needs authentication set up with 2FA

Incremental backups of block devices

  • Possible solutions include rsync and borg but neither is ideal
  • gxshen to investigate Legato NetWorker backups of block devices

Comments

Topic revision: r2 - 2023-11-28 - AnthonyBrennan
 