Machine Room Fire Suppression
This page gathers the techniques we will use to protect machine rooms from fire
while minimizing the risk of floods such as the one on 2014-07-08,
when an A/C failure caused a sprinkler to trip.
Work from ST #95457 will be recorded here.
Options
A/C Repair
One working theory is that a flaky controller board failed to open the
chilled-water valve.
The valve has been replaced, and once cleaning is done and power is restored,
Plant Ops will check that the A/C is working to their satisfaction.
Sprinkler Head Update
The engineer from the insurance company said that we have sprinkler
heads that aren't rated for an ongoing temperature above 100°F (38°C).
The one sprinkler that tripped was directly above the "hot aisle",
which could have been running at or above that temperature.
We currently have the "red" sprinklers described in
http://www.vikinggroupinc.com/databook/sprinklers/052104.pdf
and hope to replace them, depending on building code requirements.
We have asked the rest of campus (CTSC and Plant Ops) what they use.
Many machine rooms have no fire suppression. Some have fire extinguishers.
One Plant Operations staff member recommends against sprinklers and suggests smoke/heat detectors instead.
Monitoring
There are multiple forms of monitoring that can be done.
Having more than one improves reliability.
Responding to the monitors automatically is preferred.
Our sensors were once monitored by Nagios and produced Nagios alerts, until our Nagios server failed.
The checks are being reimplemented in the replacement system
(look for "esensor" in Nagios).
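As an illustration of how a sensor reading becomes a Nagios alert, here is a minimal sketch of the logic inside a Nagios-style check. It relies only on the standard plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function name and the threshold values are assumptions made for this example, and it is not the "esensor" check itself.

```python
# Sketch of a Nagios-style temperature check.
# check_temperature and its default thresholds are hypothetical.

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_temperature(temp_c, warn_c=30.0, crit_c=38.0):
    """Return (exit_code, status_line) for one reading in degrees Celsius."""
    if temp_c is None:
        # No reading at all is reported as UNKNOWN rather than OK.
        return UNKNOWN, "TEMPERATURE UNKNOWN - no reading available"
    if temp_c >= crit_c:
        code = CRITICAL
    elif temp_c >= warn_c:
        code = WARNING
    else:
        code = OK
    # Text after '|' is performance data in the form label=value;warn;crit.
    line = (
        f"TEMPERATURE {LABELS[code]} - {temp_c:.1f} C"
        f" | temp={temp_c:.1f};{warn_c:.1f};{crit_c:.1f}"
    )
    return code, line
```

A real plugin would print the status line and exit with the returned code. The performance data after the `|` is what a graphing tool can use for long-term statistics.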
Long-term monitoring statistics can be seen via Cacti.
What to Monitor
A/C
Plant Ops will install monitoring for the A/C once paid to do so, and we will go ahead with that.
The sensors are monitored by Plant Ops staff.
To be determined: the response time after an alert such as "HIGH TEMPERATURE" is sent.
The newer A/C in DC3558 has recorded at least 5 high temperature alerts since 2013-12-01.
Plant Ops says it's being monitored, but we don't know what kind of response there was to the alerts.
Temperature/Moisture
Internal Machine Sensors
The BIOS of many (most, possibly all) modern machines can report CPU temperature.
It may also be possible to monitor motherboard and disk temperatures.
We should determine how. We have a co-op student working on the various forms of sensors.
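One possible starting point, assuming the machines run Linux and the kernel exposes thermal zones under /sys/class/thermal (neither is confirmed for our hosts), is to read the temperatures the kernel already publishes instead of querying the BIOS directly. The function names below are made up for this sketch. Disk temperatures usually need a separate tool and are not covered here.

```python
# Sketch: read temperatures from the Linux thermal sysfs interface.
# Assumes /sys/class/thermal/thermal_zone*/temp files, which hold
# millidegrees Celsius. Function names are hypothetical.
from pathlib import Path


def parse_millidegrees(text):
    """Convert a sysfs 'temp' value (millidegrees Celsius) to degrees Celsius."""
    return int(text.strip()) / 1000.0


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            readings[zone.name] = parse_millidegrees((zone / "temp").read_text())
        except (OSError, ValueError):
            # Zone is missing, unreadable, or reports a non-numeric value: skip it.
            continue
    return readings
```

On a machine with no thermal zones the function returns an empty dictionary, which a monitoring check should treat as "unknown", not as "cool".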
External Room Sensors
We have some external room sensors (in DC3558, DC3548h, M3-3101).
They are very old and have not recently been in production. They were monitored a few years ago, when PLG initially sponsored their installation and development, but that monitoring went away when our initial Nagios server died.
They should be replaced (why?), and a sensor added to DC3556.
We have power bars that can be controlled
by optional directly connected sensors.
We have a student investigating how to drive them.
Actions
When a sensor reports an inappropriate temperature or water level,
the response should escalate as the thresholds for temperature/moisture are reached:
- suspiciously high: issue a Nagios alert
- unreasonably high: shut down the operating system (i.e. a clean halt)
- extreme: cut power to the machine (via the BIOS or via external power control)
The action is applied to the machine if the sensor is machine specific,
and to the room if it's a room sensor.
The thresholds for the two kinds of sensors will likely be different.
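The escalation described above can be sketched as a small lookup. Every threshold value here is a placeholder invented for illustration, since the real limits are still to be decided, and the names `Action`, `THRESHOLDS`, `choose_action` and `target` are hypothetical.

```python
# Sketch of the three-level escalation. All thresholds are placeholders.
from enum import IntEnum


class Action(IntEnum):
    NONE = 0
    ALERT = 1       # suspiciously high: issue a Nagios alert
    HALT_OS = 2     # unreasonably high: clean operating system shutdown
    POWER_OFF = 3   # extreme: cut power


# Hypothetical limits in degrees Celsius, highest first.
# Machine sensors and room sensors get separate tables.
THRESHOLDS = {
    "machine": [(85.0, Action.POWER_OFF), (75.0, Action.HALT_OS), (65.0, Action.ALERT)],
    "room": [(45.0, Action.POWER_OFF), (38.0, Action.HALT_OS), (30.0, Action.ALERT)],
}


def choose_action(sensor_kind, temp_c):
    """Return the most severe action whose threshold the reading has reached."""
    for limit, action in THRESHOLDS[sensor_kind]:
        if temp_c >= limit:
            return action
    return Action.NONE


def target(sensor_kind, machine_name):
    """A machine sensor acts on its own machine; a room sensor acts on the room."""
    return machine_name if sensor_kind == "machine" else "entire room"
```

Keeping the limits in one table per sensor kind makes it easy to tune machine and room thresholds independently.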
Shutting down the power for an entire room could require
a non-trivial, controllable three-phase power switch.
For DC3556 it would reside in the penthouse above the room.
Alternate Fire Suppression Technology
IST estimates that installing dry sprinklers, gas (FM200),
a fire remote panel and detection (in DC3556) would cost from
$75K to $125K, depending upon the options.
There would be room sealing requirements as well.
Presumably room ventilation would also be a concern.
Dry Sprinklers
The sprinklers are not "charged" (with water pressure) until external
sensors determine the need.
The system can be set up to require multiple sensors to report a problem first.
When the sprinklers are charged, there can still be a conventional
sprinkler head temperature trigger.
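The idea of requiring several sensors to agree before charging the pipes can be sketched as a simple vote. This only illustrates the logic; a real dry sprinkler system implements it in its own control panel, and the function name and sensor names are invented for this example.

```python
# Sketch: charge the sprinklers only when enough independent sensors agree.
# should_charge and the sensor names are hypothetical.

def should_charge(sensor_alarms, required=2):
    """Return True when at least `required` sensors report a problem.

    sensor_alarms maps a sensor name to True (in alarm) or False (normal).
    """
    in_alarm = sum(1 for alarmed in sensor_alarms.values() if alarmed)
    return in_alarm >= required


# One faulty or overly sensitive sensor is not enough to charge the pipes.
single = {"smoke-north": True, "smoke-south": False, "heat-ceiling": False}
# Two sensors in agreement are.
double = {"smoke-north": True, "smoke-south": False, "heat-ceiling": True}
```

Here `should_charge(single)` is False and `should_charge(double)` is True. Even once the pipes are charged, water flows only where a sprinkler head's own temperature trigger has opened.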
VESDA (ASD) Smoke Detectors
VESDA (Very Early Smoke Detection Apparatus) is a type of ASD
(Aspirating Smoke Detector). It uses a network of pipes with small holes to
sample air throughout the room, which is then analyzed for smoke content.
Gas
Safely containing and venting gas is a non-trivial cost.
Recharging a gas system is also rumoured to be expensive.
Foam
We've no experience with foam fire suppression.
Plant Operations says that it can be almost as much of a problem
as sprinklers.
Self-Contained Racks
Racks with self-contained power, A/C, and fire suppression exist.
As of a few years ago, a rough price was $60K for a controller rack
and the first 3 equipment racks.
Self-Contained External Server Room
An extension to the self-contained rack is a self-contained
server room in a cargo container.
It would save the space that an internal server room would occupy.
The cost is suspected to be about $1M.