CurupiraConfiguration

Documentation

EDOCS

Console Access

  • cscf.cs# rsh curuhead.cs
  • cu -s38400 -l/dev/ttyd2

Power-up

  • There are 4 power buttons next to a green backlit LCD display; each shows the power status
  • Make sure each unit is on: curuhead is the top part and the bottom two are curupira

Manual Power up of curupira

  • Log on to cscf and become root
  • suw-2.03# rsh curuhead.cs
IRIX Release 6.5 IP22 curuhead
Copyright 1987-2002 Silicon Graphics, Inc. All Rights Reserved.
Last login: Wed Jul  3 11:33:32 EDT 2013 by root@cscf.cs.uwaterloo.ca
curuhead 1# cu -s38400 -l/dev/ttyd2
Connected

System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option? 5
Command Monitor.  Type "exit" to return to the menu.
>> auto 

Hardware Problems

Bottom two sections of curupira are shutting down frequently

  • Likely caused by the low-voltage condition reported in these console warnings:
curupira.cs console login: 001c04
001c04 ATTN: 1.5V low warning limit reached @  1.340V.
WARNING: 001c04 ATTN: 1.5V low warning limit reached @  1.340V. 
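To get a feel for how far the rail has sagged, the warning can be parsed and the shortfall computed. This is a minimal sketch: the regular expression is an assumption based only on the single warning format shown above, and `parse_voltage_warning` is a hypothetical helper, not part of any SGI tool.

```python
import re

# Matches the console warning format shown above. Assumption: other rails
# would be reported in the same shape.
WARNING_RE = re.compile(
    r"(?P<module>\w+) ATTN: (?P<rail>[\d.]+)V low warning limit reached"
    r" @\s+(?P<measured>[\d.]+)V"
)


def parse_voltage_warning(line):
    """Return (module, nominal V, measured V, percent below nominal), or None."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    nominal = float(match.group("rail"))
    measured = float(match.group("measured"))
    shortfall_pct = (nominal - measured) / nominal * 100
    return match.group("module"), nominal, measured, round(shortfall_pct, 1)


if __name__ == "__main__":
    sample = "WARNING: 001c04 ATTN: 1.5V low warning limit reached @  1.340V."
    print(parse_voltage_warning(sample))
```

For the warning above, the 1.5V rail on module 001c04 is reading about 10.7% below nominal.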

Disks dksc0d118vol and dksc0d125vol are dead

  • Note: This array will not be fixed; see RT#85393
  • See the D-Brick notes under "Hardware notes" below
  • /dev/xlv/xlv0 on /share/disk/curupira1 and /dev/xlv/xlv1 on /share/disk/curupira1.mirror are both broken
  • During startup you will see the messages below concerning the failed disks dksc0d118vol and dksc0d125vol
  • There are two RAID0 arrays: disks dksc0d114vol...dksc0d118vol form one, and disks dksc0d119vol...dksc0d122vol plus dksc0d125vol form the other
    • The two arrays mirrored each other for redundancy. However, one disk in each array has failed, so both arrays are broken.
    • When we backed up images of the good disks, we found that a few of the other disks had bad blocks.

Selecting Default Server
NOTICE: Starting failsoftd
dksc0d118vol: Device not ready, spinning up
dksc0d118vol: Device not ready: Not ready to perform command (asc=0x4, asq=0x0) (FRU=0x2)
dksc0d118vol: Device spin up failed, unable to use device -- corrective action necessary
ioconfig: ERROR:scsi_ctlr_walk_fn : Cannot open the file : /hw/module/001c04/Ibrick/xtalk/15/pci/1/scsi_ctlr/0/target/118/lun/0/disk/volume/char
        error is: I/O error
dksc0d125vol: Device not ready, spinning up
dksc0d125vol: Device not ready: Not ready to perform command (asc=0x4, asq=0x0) (FRU=0x2)
dksc0d125vol: Device spin up failed, unable to use device -- corrective action necessary
ioconfig: ERROR:scsi_ctlr_walk_fn : Cannot open the file : /hw/module/001c04/Ibrick/xtalk/15/pci/1/scsi_ctlr/0/target/125/lun/0/disk/volume/char
        error is: I/O error 
...
...

mount: /dev/xlv/xlv0 on /share/disk/curupira1: No such file or directory
mount: giving up on:
   /share/disk/curupira1
mount: /dev/xlv/xlv1 on /share/disk/curupira1.mirror: No such file or directory
mount: giving up on:
   /share/disk/curupira1.mirror 
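A captured boot log can be scanned for these two symptoms (disks that fail to spin up, and mounts that give up). The following is a rough sketch that relies only on the message formats shown in the excerpt above; `summarize_boot_log` is a hypothetical helper name, and real logs may contain variants this does not catch.

```python
import re

# Patterns taken from the boot messages above.
SPIN_UP_FAILED_RE = re.compile(r"^(?P<disk>\S+): Device spin up failed")
MOUNT_FAILED_RE = re.compile(
    r"^mount: (?P<dev>\S+) on (?P<mountpoint>\S+): No such file or directory"
)


def summarize_boot_log(text):
    """Return (failed_disks, failed_mounts) found in a boot log string."""
    failed_disks = []
    failed_mounts = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        disk_match = SPIN_UP_FAILED_RE.match(line)
        if disk_match:
            disk = disk_match.group("disk")
            if disk not in failed_disks:  # keep each disk once, in log order
                failed_disks.append(disk)
            continue
        mount_match = MOUNT_FAILED_RE.match(line)
        if mount_match:
            failed_mounts.append(
                (mount_match.group("dev"), mount_match.group("mountpoint"))
            )
    return failed_disks, failed_mounts


if __name__ == "__main__":
    sample = """\
dksc0d118vol: Device spin up failed, unable to use device -- corrective action necessary
dksc0d125vol: Device spin up failed, unable to use device -- corrective action necessary
mount: /dev/xlv/xlv0 on /share/disk/curupira1: No such file or directory
"""
    print(summarize_boot_log(sample))
```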

Hardware notes

Curupira is an SGI 3200 server. It has C-, G-, I-, and D-Bricks. This twiki concerns the D-Brick hard drive configuration.

Viewing the D-Brick

The D-Brick consists of 12 drive bays, of which 10 are populated. The bays are numbered differently depending on which SGI manual or section you refer to. This is a view of the drives looking from the front of the machine, labelled with col/row, drive number (#), and system name (dks0dXXXvh); X marks an empty bay:

1/1   #9      2/1  #10      3/1  #11      4/1  #12
dks0d122vh       X             X          dks0d125vh

1/2   #5      2/2   #6      3/2   #7      4/2   #8
dks0d118vh    dks0d119vh    dks0d120vh    dks0d121vh

1/3   #2      2/3   #3      3/3   #4      4/3   #1
dks0d115vh    dks0d116vh    dks0d117vh    dks0d114vh 
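When a disk is reported as failed, the layout above can be used to find the physical bay. The sketch below encodes the front view as a lookup table keyed by the number after `d` in the device name; the ioconfig paths in the boot messages pair `target/118` with dksc0d118vol, so treating that number as the SCSI target is an inference from this page rather than a documented rule. `locate_bay` is a hypothetical helper.

```python
import re

# (column, row) -> (drive number, SCSI target), copied from the front view
# above. The two empty bays (2/1 and 3/1) are omitted.
BAYS = {
    (1, 1): (9, 122), (4, 1): (12, 125),
    (1, 2): (5, 118), (2, 2): (6, 119), (3, 2): (7, 120), (4, 2): (8, 121),
    (1, 3): (2, 115), (2, 3): (3, 116), (3, 3): (4, 117), (4, 3): (1, 114),
}


def locate_bay(device_name):
    """Return (column, row, drive number) for names like dks0d118vh, else None."""
    match = re.search(r"d(\d+)(?:vol|vh)$", device_name)
    if match is None:
        return None
    target = int(match.group(1))
    for (col, row), (drive_no, bay_target) in BAYS.items():
        if bay_target == target:
            return col, row, drive_no
    return None


if __name__ == "__main__":
    for name in ("dksc0d118vol", "dksc0d125vol"):
        print(name, locate_bay(name))
```

With this table, the two failed disks map to bay 1/2 (drive #5) and bay 4/1 (drive #12).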

Drive grouping

The two drives at the bottom corners, dks0d115vh and dks0d114vh, are reserved for the system.
The drives dks0d116vh, dks0d117vh, dks0d118vh, and dks0d119vh are striped and mirrored by the drives dks0d120vh, dks0d121vh, dks0d122vh, and dks0d125vh.
The two drive groups are managed by the XFS volume manager referred to as XLV:

curupira 26# mount
/dev/root on / type xfs (rw,raw=/dev/rroot)
/hw on /hw type hwgfs (rw)
/proc on /proc type proc (rw)
/dev/fd on /dev/fd type fd (rw)
/dev/xlv/xlv0 on /share/disk/curupira1 type xfs (rw,grpid,raw=/dev/rxlv/xlv0)
/dev/dsk/dks0d1s4 on /home type xfs (rw,grpid,raw=/dev/rdsk/dks0d1s4)
/dev/xlv/xlv1 on /share/disk/curupira1.mirror type xfs (rw,grpid,raw=/dev/rxlv/xlv1)
/dev/dsk/dks0d1s3 on /.software type xfs (rw,grpid,raw=/dev/rdsk/dks0d1s3) 
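The consequence of the grouping described in this section can be sketched as a small check: a stripe is lost as soon as any member disk fails, and the mirrored data survives only while at least one of the two stripes is intact. The set names and functions below are hypothetical, the disk numbers follow the "Drive grouping" text, and which stripe backs xlv0 versus xlv1 is not stated on this page, so the sketch does not assume it.

```python
# Disk numbers per the "Drive grouping" paragraph (dks0dNNNvh).
STRIPE_A = frozenset({116, 117, 118, 119})
STRIPE_B = frozenset({120, 121, 122, 125})


def stripe_ok(members, failed):
    """A striped set is usable only if none of its members has failed."""
    return members.isdisjoint(failed)


def mirrored_data_available(failed):
    """The mirror survives as long as at least one stripe is fully intact."""
    return stripe_ok(STRIPE_A, failed) or stripe_ok(STRIPE_B, failed)


if __name__ == "__main__":
    print(mirrored_data_available({118}))        # one stripe lost, mirror still intact
    print(mirrored_data_available({118, 125}))   # one failure in each stripe
```

This is why losing just two disks, 118 and 125, is enough to take both volumes down: each failure lands in a different stripe.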

-- GordBoerke - 05 Nov 2012

Topic attachments
  • 20130703_120402.jpg (JPEG, 2020.9 K, 2013-07-03 12:20, MikeGore): Curupira - Front Panel Open
Topic revision: r5 - 2013-07-03 - MikeGore
 