Term Goals Winter 2016 for Research Support and Special Projects

General environment

File storage

  • Distributed File Service - Lawrence/Lori w/Dave/Anthony (ST#98307)

Database services

  • Clustered MySQL server - ST#85594 (and ST#101098, ST#102303)
    • Daniel - assist with migration to the new cluster, including planning and documenting affected applications
    • will need to coordinate with Dave, Fraser, Omar, Lori to assist with configuration planning
    • March 7 update - well in progress, on target for end of March
    • April 12 - official completion report submitted

Networking

Machine Room Interconnect

  • Lori to work out 10Gb interconnect between research server rooms - check with Dave (Reference ST#98551?)
    • take a census of the servers capable of sustained 1GB r/w (eg: Asimov, hops, Daytona, chinook, m160, Ripple): how are they connected, and can they be connected at 10Gb?
    • Dave says that 10Gb already exists, at least between machine rooms

Firewall

    • Overall: 88134
      • Finish group consultations (send email to all groups)- Lawrence / RSG staff
      • have client networks behind firewall
      • (Optional) Have Daniel create a utility to tell a client the status of their IP address (eg: 129.97.168.x -> "you will be on a client network, with VPN access only") - ST#103237
      • reminder from MikeP to keep it moving along
      • most groups have now been contacted
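
The optional IP-status utility mentioned above (ST#103237) could look something like the following. This is only a rough sketch, not Daniel's design: the function name, the lookup table and the default message are hypothetical, and the only entry taken from these notes is the 129.97.168.x example, assumed here to mean the whole /24.

```python
import ipaddress

# Hypothetical lookup table. Only the 129.97.168.x example comes from the
# notes above; treating it as a /24 is an assumption. Real entries would
# come from the firewall group consultations.
NETWORK_STATUS = {
    ipaddress.ip_network("129.97.168.0/24"):
        "you will be on a client network, with VPN access only",
}

# Hypothetical fallback message for addresses with no recorded plan.
DEFAULT_STATUS = "no firewall change is recorded for this address"


def ip_status(address: str) -> str:
    """Return the planned firewall status for a client's IP address.

    Raises ValueError if the string is not a valid IP address.
    """
    ip = ipaddress.ip_address(address)
    for network, status in NETWORK_STATUS.items():
        if ip in network:
            return status
    return DEFAULT_STATUS


if __name__ == "__main__":
    print("129.97.168.42 ->", ip_status("129.97.168.42"))
```

Keeping the mapping in one table means new subnets can be added as each group consultation finishes, without touching the lookup code.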

CSCF Special Projects

Exam Management System - Isaac

  • modify process to embed fonts to improve workflow with New Media Services (ST#101963)
    • completed, printing working well
  • implement UI for multiple versions (ST#101625) - done
  • have a designated backup
    • software/development/database - Daniel
      • Daniel tracking such notes in ST#104023
      • Daniel now involved
    • operation/PoC - Nick
      • Nick and Isaac have met, Nick is getting training on being the operational PoC - ST#104199
      • Nick has read the documentation, and will start working on requests in the new term

Grad TA Evaluations - Isaac

  • create project plan for this software tool - ST#52768
    • begin implementation of evaluations - Phase 1 document
      • design is set
      • goal is to be ready-to-use, even though it can't be used yet
      • trial for Spring term

Grad Visit Day - Daniel

  • Grad Visit Day web application deployed February 2013 (and annually thereafter) for Heather Steinmetz and Jesse Hoey - ST#91837 - Daniel
    • status for February 2016 - ST#102563
      • Daniel will work with co-op to have updated version for W16
      • preparation work is done, event later in March
    • request to implement for Rest-of-Math - ST#103448
      • setup is complete for Rest-of-Math, Jim Johnston is responsible for that instance

Inventory - Daniel

  • Inventory web application deployed 2009 for CSCF - ST#76671 - Daniel
    • Version 3.5.1 - bugfixes - to be worked on in F2015/W2016 - ST#101745
      • bugfixes complete as of end-of-February
      • Desirable feature-add: plan to integrate ONA (likely won't happen this term)
    • Explore Machine Room mapping / inventory (RackTables / MachineRoomMap) - will need input from Dave, Dan and MFCF
      • need concept of "reserved" space for future planning of space (rack, room, etc.)
      • in progress by Evan/Devon - ST#95545
    • See also #Nagios below

OAT - Isaac

  • Overhaul data import to use new data source and provide admissions data - ST#92317
    • still waiting on IST Enterprise Architecture to provide all of the required data

OGSAS - Isaac

  • Overhaul to use new datasource and deal with requests from Associate Director Grad Studies (Urs) - ST#?
    • blocked on waiting for the OAT data (see above)

Research Subscription System - Daniel

  • Research Subscription web application for CSCF management (re-)deployed June 2011 - ST#78100 - Daniel
    • Daniel working on bug fixes for W16
    • fixed "Add a subscription" (which was the critical issue; some other minor issues remain)

ST

  • by end of W16: have decision about the future of ST / job-tracking in CSCF - ST#103960
    • met with MFCF, received RT requirements spreadsheet
    • updated spreadsheet to V2

CSCF internal services

Machine Rooms

DC 3556 (Research Machine Room)

  • Finish Insurance Claim - November 2015 - Lawrence
    • 95303
    • Mar 2016: insurance money received. Balance of $50,000 deductible still to come from UW Finance
    • need to determine how to handle faculty refund based on labour refund ($46k)

DC 3558 (CS Infrastructure Room)

Monitoring

  • Report when server hits predefined temperature limit - Gordon
    • 97155
    • Note: confirm with PLG that they're ok with paying for it (confirmed - see ST#97155, original email from Peter asking about these sensors)

Research computing

Graduate student workstations

  • Streamline the AD join process - Mike / Clayton
    • 101333
    • Milestone: revised script that works with current image for Winter 2016 post-install steps
      • Future: (Winter 2016) Update for Spring/Fall 2016 image
      • Mike says that he has done some work on improving the AD script. Need to document in above ST or note related ST
      • ensure that these scripts are in a system directory (not under ~ctucker or ~magore)

HPC

generic cluster ("paper")

  • Rack hardware we have in DC 3558 A1 - Lawrence/Lori/Mike/CSCF Coop
    • 99112
    • Rack existing hardware into A1, start building cluster - Lori
    • Report time spent on this project
      • Rack A1 has been emptied and we're now adding hardware to it
      • snowballs and squall can be taken - coordinate with Ronaldo

Ganglia Portal - Wishlist - ST#98196

  • VM to be created
  • build initial ganglia system - Lori

Research Storage capacity

  • catalogue existing storage systems, capacity and current use "swing space" - Lori/RSG
    • part of the 10Gb initiative (see above)

Visitor loaner equipment / Researcher-owned machines

  • Workstation imaging: Streamline via Clonezilla - Click'N'Go - Mike
  • Milestone: coordinate with Phil (and Dave and Lawrence) to ensure this process is available to CSI

Administration

Billing

  • Fall 2015 bills by end of November - ST#103192
    • Fall bills to be generated in January 2016
    • Winter bills to be generated in late February - ST#103591

CSCF Retreat

  • Planning lead - Lawrence - ST#103743
  • Milestone: hold the retreat, have a report of results - late February, early March
  • Update: 2016-03-11 (lfolland) Retreat was held yesterday, staff reaction seemed very positive. Sandra will send a summary of our notes. ST items to be created for each team

Research Groups

AI - Mike

  • Setup guest computers - ST#102441
    • Mike has worked with John (our co-op) to review existing machines and update/replace as needed
    • now complete

BIF - Mike

  • rebuild m160 File storage from 40TB and 60TB xfs to 100TB zfs - W16 - Mike / Lori - ST#104138
    • Mike has discussed with students and Ming - and they are excited to have this happen
    • need to transfer all of the data over to Asimov (or other storage server), once hops data removed
    • Given the amount of work done to rebuild m160 this term after the compromise, Mike and I have agreed to defer this project to the Spring 2016 Term Goals

Boutaba - Ronaldo

  • update Documentation of CN cluster - ST#93800
    • remove reference to old servers/setup and replace with the new
  • decommission the insurance-replaced servers - Lawrence/Ronaldo

Brecht - Lori

  • Migrate rocket to new server - Ronaldo / Lori
    • Lori - ST#101572
    • Milestone: remove the temp server from Ronaldo's office
      • done

Cabernet - Mike/Lori

  • Upgrade cluster OS to 14.04 (priority)
  • Upgrade GPUs? Lori/Mike
  • Goals for W16:
    • have Justin decide whether he will get updated hardware
    • upgrade the OS to 14.04
  • Status:
    • GPU and OS updated on Node 16
    • query sent to Justin
    • cabernet has a bad drive and a bad PSU - is it worth spending time/money? (Lori)

Daytona - Lori

  • Reorganize, add 3 nodes - December 2015 - Lori
  • 100216
  • Status:
    • done: Feb. 2016
  • Goal:
    • have all nodes fully functional
    • done: Feb 2016
  • Needs:
    • L5-20 PDU - ask Dan/Dave
    • done - Jesse Hoey bought one

DB/DSG - Gordon

  • DC 3312 - clear out equipment - Gordon / Mike - ST#103356

Games Institute - Lori

  • Fileserver / Backup server installation - November 2015 - Lori
    • done!
  • 97715
  • Goal:
    • get ACO involved with support/administration of systems
    • done!

HCI - Ronaldo

  • migrate hci-web to a newer machine (formerly snap-host) - ST#95233
  • purchase and install new equipment - ST#102594

HI - Gordon

Himrod - Lori

ISS4E (Keshav)- Ronaldo

  • Documentation - Ronaldo
    • update to reference new systems and remove old (including MachineNotes)
  • re-organize file space / backup of NAS, repurpose the backup server (tsunami&flood)

NPSG - Gordon

PLG - Gordon

  • retire plg1 - ST#102083

Ripple - Lori

SciCom - Mike

  • Upgrade elora.cs to 14.04 - Mike - ST#103776

SWAG - Mike

  • Install in DC 3558

Watform - Ronaldo

  • migrate repository to new server - ST#90710
  • Documentation - update to remove reference to old servers - ST#78634

ideas for RSG for Spring Term

  • rebuild m160 File storage from 40TB and 60TB xfs to 100TB zfs - W16 - Mike / Lori - ST#104138
  • Service Catalogue - Daniel / Lawrence - ST#104292
  • CSCF Client survey - Lawrence/Omar/Dave
  • From the retreat
    • Create an index of our current ST items based on ST categories and keywords
    • Develop / execute survey of faculty, staff, students about what services they would like
    • document inventory testing
    • analyse web logs for client searches/requests
    • link index to documentation
    • other possible service offerings
    • direction of development tools / DBS
    • Accounts management

Other ideas for Spring 2016 - not necessarily RSG

  • Course Master's lab - move/refresh - ST#104443
  • linux.cs - two servers @ 14.04
  • migrate CS web site
  • migrate to IST absence management - ST#104369
  • create a 3-year plan for all of CSCF services
  • Unified job description - managers
Topic revision: r26 - 2016-04-13 - LawrenceFolland
 