Linux Working Group
Meeting Date
Invited
Anthony (group leader), Lori, Dave, O, Clayton, Guoxiang, Nathan, Nick, Todd, Ed
Attendees
Anthony (group leader), Dave, Devon, Clayton, Fraser, Guoxiang, Nick, Steve
Review and accept previous meeting minutes.
CsLWGMeeting20230920
Urgent Business
After patching, the system requires a reboot
RT#1304195
Review last meeting's Action Items
There is doubt among "CSCF exec" about the value of managing per-user quotas versus flagging excessive usage and relying on peer pressure.
100GB quotas in staged roll out, plan detailed in tickets above
- Email sent, see RT#1288354
- Nick to send second reminder in mid Oct (two weeks before Nov 1)
- Check with SAT development team about status of storing a storage quota in appropriate sponsorship tables.
- Clayton will work on setting all the "maxstorage" user entries in each Domain to an initial 100,000,000B base amount.
- Note that in the long run (by Aug 2024 hopefully) this ends up being a base amount plus additional SAT entry sponsorships once that mechanism is established.
- Build a data file that contains the user and summed quota information. Specifics to be worked out between Clayton and Nathan.
- Update the CephFS quota xattr on home directories (Nathan)
- Has this default quota been implemented for new accounts / accounts under 100GB?
- Tooling to implement quotas
- Calculate current quotas from sponsorship information, see RT#1298614 (Clayton)
- There appear to be 3 (teaching, non-course-account) users with sponsored quota in excess of 100GB
- Maybe re-run that query as sponsorships can change
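The quota calculation described above (a base amount plus any sponsored amounts, then written to each home directory as a CephFS xattr) could be sketched roughly as follows. This is only an illustration: the function names, the input shapes, and the reading of "100GB" as decimal gigabytes are assumptions, not the agreed tooling. The xattr name `ceph.quota.max_bytes` is the standard CephFS directory quota attribute.

```python
from collections import defaultdict

# Assumption: the "100GB" base quota means decimal gigabytes.
BASE_QUOTA_BYTES = 100 * 10**9


def summed_quotas(users, sponsorships, base=BASE_QUOTA_BYTES):
    """Return {user: base + sum of that user's sponsored bytes}.

    `sponsorships` is a hypothetical list of (user, bytes) pairs,
    standing in for whatever the SAT sponsorship query returns.
    """
    extra = defaultdict(int)
    for user, sponsored_bytes in sponsorships:
        extra[user] += sponsored_bytes
    return {user: base + extra[user] for user in users}


def quota_xattr(quota_bytes):
    """Return the (name, value) pair for a CephFS directory quota."""
    return "ceph.quota.max_bytes", str(quota_bytes).encode("ascii")
```

On a host with the CephFS home directories mounted, the pair could then be applied with `os.setxattr(home_dir, *quota_xattr(quota))`, which is equivalent to `setfattr -n ceph.quota.max_bytes -v <bytes> <home_dir>`.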
Regarding URAs
Does SAT sponsorship contain URAs?
Do we have a Domain "netgroup" that contains URAs? (Clayton?)
CS Mailservers are going away
No update
Ongoing problems with Inventory and IPAM are hobbling Infrastructure operations - RT#1285291
Will schedule some time to talk with Inventory team about the following (a2brenna)
- Inventory is unaware of the IP / domain limitations in IPAM, as well as the DHCP and MAC address requirements
- Some CSCF do not have access to create manual DNS entries (Devon, Lori, Guoxiang, Todd have access. Dave?)
- Inventory bug: Changing room field on a record with IPAM DNS & DHCP causes DHCP to break
- Anthony to reach out to IST to clarify whether this is a policy or a technological limitation.
- Invalid records were imported from Infoblox that work until they are edited
What's still using old MySQL?
NextCloud (Vault) pending migration
- Scheduling Nextcloud DB migration ~21st (Nathan, Fraser)
Web server (includes Inventory)
- needs OS (whole LAMP stack) to be updated
NFS ganesha server needed rebooting on Monday, RT#1303795
The NFS Ganesha server needed to be rebooted, similar to the CephFS lock-up on the web server
- needs further enhancements to monitoring service?
- add tests/checks for mounts
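A mount check of the kind suggested above might look something like this minimal sketch. It follows the usual Nagios/Icinga plugin convention of exit code 0 for OK and 2 for CRITICAL. The function names and the example mount points are hypothetical, and the check only confirms that a mount is listed, not that it is responsive.

```python
OK, CRITICAL = 0, 2  # conventional Nagios/Icinga plugin exit codes


def missing_mounts(required, mounts_text):
    """Return the required mount points absent from /proc/mounts-style text."""
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            mounted.add(fields[1])  # second field is the mount point
    return [m for m in required if m not in mounted]


def check(required, mounts_text):
    """Return (exit_code, message) for the given required mount points."""
    missing = missing_mounts(required, mounts_text)
    if missing:
        return CRITICAL, "CRITICAL - not mounted: " + ", ".join(missing)
    return OK, "OK - all required mounts present"
```

In practice the text would come from reading `/proc/mounts`, and the script would print the message and exit with the returned code. Catching a locked-up mount, as in the incidents above, would additionally need a timed access test against the mount point.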
New Items
Cephfs timeout evicted the web server?
Icinga did alert on this. (It checks both accuracy and response time.) See
Web page showing status of linux.student.cs hosts that students and course staff can check
RT#1279831
- Requested by CS136 staff due to how many linux.student.cs issues we had in W23 term
- CS 136 will have ~1000 users
- Devon will produce dedicated Grafana page (dashboard) for this.
Ticket hygiene
Many people are not "closing" tickets.
- Anthony will mention at next CSCF group meeting.
Monitoring Services
- Number of false alerts is a concern.
- Lack of Service Maintenance outside of standard working hours has been more of a problem lately.
- Management will need to review this.
- For NetTops monitoring, Devon and Anthony are putting something together.
- Web servers need to have their network storage access monitored
- Check the URLs for Web monitoring dashboards in the "Webserver failure" item above to see if this is still true.
- MFCF is using Loki to analyse and alert on system log entries.
- MFCF has noted that IST pen testing can cause false positives.