Sharing (Critical) Information

The approaches taken for critical situations and for ongoing work have much in common; both kinds of event can affect our clients and other support staff. The following describes the communication channels available, with guidelines about when each applies.

When dealing with an urgent problem, e.g. the failure of a service used by many people, it's important both to communicate with those affected and to maximize the time available to the staff working to resolve the problem. This is done by those staff handing off the communication to other staff, who become the "Point of Contact" for the problem.

The kind of communication used may depend upon the problem, as some problems limit what communication can be used (e.g. failure of e-mail). In general, we have at least these possibilities:

Math Help Centre

Anything that could prompt questions to the Help Centre should be reported to them. That primarily means problems with the teaching environment, or with grad student use of the general.cs environment. In principle it could be almost anything, so it's safest to always let them know of a serious problem. Calling Lori (Suess) directly works best. For planned outages, letting consultant@math.uwaterloo.ca know in advance works best (and Lori sees that too).

CS Staff

Depending upon the nature of the problem, it can be quite effective to tell the "Administrative Officer", who will spread the word to other staff, e.g. the "Department Secretary", "Secretary/Receptionist", and various "Administrative Coordinators". That is, tell Sher, Michelle, Helen, Debbie and Wendy. If the problem affects the teaching environment, let someone in ISG know and, depending upon the severity, the (Associate) Director of Undergraduate Studies.

E-Mail announcement

E-mail to scs-everybody is useful in situations where it's expected that most of CS will want to know. Other lists also exist, e.g.
  • scs-staff
  • scs-faculty
  • scs-grads

If you do not know the possible email hostnames associated with these lists, ask someone else to remind you. For obvious reasons we do not want to mention them on publicly crawlable web pages.

Web Bulletin

There is an Important Bulletins section on the CSCF home page. The goal is to record all CSCF service outages. Both the downtime and the uptime are to be recorded: the downtime as soon as we know of the outage, and the uptime when the problem is resolved.

It's updated as follows:

  • go to the bulletin directory, either
    • by logging in to any of the CS general systems (e.g. via `ssh linux.cs`) and running `cd /software/wwwdata_cs.uwaterloo.ca/data/vhosts/cs/cscf`
    or
    • on linux.cscf.uwaterloo.ca, by running `cd /var/www/cs/cscf`
  • `co -l BulletinsCurrent.ihtml` (it's an include file)
  • edit the file using your favourite text editor
  • `ci -u BulletinsCurrent.ihtml`

Once the problem is over, the related bulletins are moved from BulletinsCurrent.ihtml to BulletinsPrevious.shtml using a similar approach. It's important to keep this up-to-date, to avoid cluttering the page.
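The check-out, edit, check-in cycle for the bulletin can be wrapped in a small shell function. This is only a sketch: the function name `update_bulletin` and the `BULLETIN_DIR` override are hypothetical, and the default directory is the one quoted for linux.cscf.uwaterloo.ca. It assumes the RCS commands `co` and `ci` are on your path.

```shell
#!/bin/sh
# Sketch of the bulletin update cycle: check out with a lock, edit, check in.
# BULLETIN_DIR is a hypothetical override; the default is the path given
# for linux.cscf.uwaterloo.ca.
update_bulletin() (
    # The ( ) body runs in a subshell, so the caller's directory is unchanged.
    cd "${BULLETIN_DIR:-/var/www/cs/cscf}" || exit 1
    file=BulletinsCurrent.ihtml
    co -l "$file" || exit 1      # check out the latest revision and lock it
    ${EDITOR:-vi} "$file"        # edit with your preferred editor
    ci -u "$file"                # check in; -u leaves a read-only working copy
)
```

Running `update_bulletin` on a host where the directory exists opens the file in `$EDITOR` (or `vi`); `ci` then prompts for a log message as usual. On the general CS systems, set `BULLETIN_DIR` to the `/software/...` path instead.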

Telephone

The telephone extension 31100 is to be used for announcements of computing or network problems that prevent access to the bulletins in the CSCF home page. Doing this in advance for scheduled outages is a good idea.

Information needed to update the announcement is in the usual place for secure information, or check AnnouncementVoiceMailPassword. To update the message, use the temporary greeting feature (8 2 3) of Call Pilot and set an expiry date/time when you have finished recording the greeting. It's good to have an accurate message, so choose a generous expiry time if it's not obvious when service will be restored. In such cases the message must then be removed manually shortly after service is restored.

Sample Announcements

Sample templates for updating the telephone announcement message:

  • This is the Computer Science Computing Facility for the David R Cheriton School of Computer Science at TIME on DATE. There is currently a problem with SERVICE. BRIEF DESCRIPTION OF PROBLEM AS IT AFFECTS PEOPLE. CSCF staff are aware of the problem and expect to have service restored by TIME (or: "are currently working on resolving the issue"). If you have concerns about this outage, please contact your CSCF point of contact or leave a message for the CSCF Help Desk.

  • This is the Computer Science Computing Facility Help Desk for the David R Cheriton School of Computer Science. Known computing or network problems are recorded in our home page www.cs/cscf. If you are aware of a problem, please contact your CSCF point of contact or send email to cscfhelp@uwaterloo.ca. Thank you.
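To prepare the wording before recording, the outage template can be filled in with a small shell function. This is a sketch only; the function name `outage_announcement` and its argument order are hypothetical. It prints the script to read aloud and picks the "expect to have service restored by" or "currently working on" wording depending on whether a restoration time is given.

```shell
#!/bin/sh
# Fill in the outage announcement template (hypothetical helper).
# usage: outage_announcement TIME DATE SERVICE DESCRIPTION [RESTORE_TIME]
# e.g.:  outage_announcement "2:15 pm" "March 3" "e-mail" \
#            "Incoming mail is delayed." "4 pm"
outage_announcement() {
    if [ "$#" -lt 4 ]; then
        echo "usage: outage_announcement TIME DATE SERVICE DESCRIPTION [RESTORE_TIME]" >&2
        return 2
    fi
    # Choose the status wording: a restoration estimate if one was given,
    # otherwise the "currently working on" alternative from the template.
    if [ -n "${5:-}" ]; then
        status="expect to have service restored by $5"
    else
        status="are currently working on resolving the issue"
    fi
    printf '%s\n' "This is the Computer Science Computing Facility for the David R Cheriton School of Computer Science at $1 on $2. There is currently a problem with $3. $4 CSCF staff are aware of the problem and $status. If you have concerns about this outage, please contact your CSCF point of contact or leave a message for the CSCF Help Desk."
}
```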

Signs

In cases where electronic communication is affected, or simply when enough people show up asking about a problem, paper signs on the doors at the ends of the CSCF corridor have proven effective.

It was once suggested that such signs also be placed in the main CS office area.

CSCF Staff Poster

The CSCF Staff poster is put up in the display cases at each end of the CSCF staff hallway. The latest version can be found here:

CSCF Staff Door signs

  • CSCF Staff are expected to have some sort of indication of whether they are in or not and, if not, when they might be expected back
    • many staff have chosen to use whiteboards that were provided
    • some staff use a printed sheet with paperclips to indicate their whereabouts (see sample document)

MC Teaching Lab posters

DC Research Lab posters

Newsgroups

The uw.cscf.fyi newsgroup is handy for letting CSCF staff know about problems. Its use isn't restricted to serious problems. While it is not advertised as required reading for clients, some do read it.

uw.cscf.system is intended to be displayed upon login to a standard environment, to relate "changes to the local computing environment that will likely affect how people do their computing". It uses the `read_system_news` command, which in practice means that only XHier'd Solaris systems see it. So while still present, it's likely not very effective.

You can post to the newsgroups via mail to uw.cscf.fyi@news.cs.uwaterloo.ca and uw.cscf.system@news.cs.uwaterloo.ca
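Posting by mail can be done from the command line. The following is a minimal sketch, assuming a POSIX `mailx` is installed and that the news gateway accepts mail from the sending host; the function name `post_newsgroup` is hypothetical. The message body is read from standard input.

```shell
#!/bin/sh
# Post to one of the newsgroups through its mail gateway address.
# usage: post_newsgroup GROUP SUBJECT < body.txt
# e.g.:  post_newsgroup uw.cscf.fyi "Printer outage in DC" < note.txt
post_newsgroup() {
    # GROUP is the newsgroup name, e.g. uw.cscf.fyi or uw.cscf.system;
    # mailx reads the message body from standard input.
    mailx -s "$2" "$1@news.cs.uwaterloo.ca"
}
```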

What to do in case of:

Emergency

Planned Outage

By CSCF

By Others that affect DRCSCS

Door Sign for daily update of your whereabouts

Topic attachments
Attachment | Size | Date | Who | Comment
WhiteBoardStatusGeneric.doc | 32.5 K | 2012-01-19 - 14:37 | LawrenceFolland | White Board sheet to indicate whereabouts
Topic revision: r22 - 2016-07-14 - LawrenceFolland
 