
Failed virtual disks in direct attached storage

Passata Sotto
Occasional Visitor


I hope someone can help with the following.
My immediate problem, as I read the docs and
prepare to contact the vendor, is that I'm
used to getting answer X given a specific
case Y, but on this occasion the online
support docs provide less specific information
about the steps I need to take (i.e., beyond
step 13 below). Is someone here with more
experience with direct attached storage systems,
e.g. PowerVault 220S & PERC RAID controllers,
able to tell me that I have been doing the right
thing as described, so I do not lose sleep
this weekend before start of work on Monday?
(assuming I do not resign on grounds of gross
incompetence).


OVERVIEW

A direct attached storage system, a PV220S,
has RAID-5 virtual disks that are not being
recognised at bootup of a server with a PERC
RAID controller. The actions that follow
consist of one change within the BIOS and
several changes to the SCSI cabling connected
to the PERC. As far as I can tell, nothing
has been done that could cause data loss.


HARDWARE

PowerEdge 4400 with PERC3/Di and PERC3/DC.

PowerVault 220S with three RAID-5 containers
(each with three disks, no hot spares used).

The PowerVault 220S is plugged into the
PE4400's PERC3/DC card via SCSI cable.


PROBLEM

1) Powered down the PE4400 and PV220S to
move them to a different location.

2) Turned on PV220S, then PE4400.

3) Realised that the SCSI cable was not
connecting the PV220S with PE4400's
PERC3/DC card

4) Allowed the PE4400 to boot up fully
before powering it and the PV220S down.
(The operating system runs off PE4400's
disks).

5) Connected SCSI cable between PV220S
and PERC 3/DC's Ch-1 SCSI connector
(this controller has two SCSI connectors;
the other, unused one is Ch-0).

6) Turned on PV220S, then PE4400.

7) Error at startup:

Configuration of NVRAM and drives
mismatch

Run View/Add Configuration option
of Configuration Utility

8) At this point I reasoned that booting
with the SCSI cable unconnected at 3) above
had caused the NVRAM in the controller to
change to an incorrect configuration,
because the PV220S disks had not been
changed and the PERC 3/DC Ch-1 SCSI
connector was the one used originally.
Therefore I chose to use the disks'
configuration data -- see next.

9) Went into "View/Add Configuration",
selected "Disk", exited at once with
"Save". This presumably caused the
configuration at the disks' end to
synch to the NVRAM. I made the mistake
of not writing down the disk states
-- but I recall no FAILED errors.

ASIDE: there is some ambiguity in
chap7.htm#1056771 of "PERC 3/DC User's
Guide":

Select either Disk to use the
configuration data on the hard disk
or NVRAM to use the configuration
on the NVRAM.

Does selecting "Disk" restore the
data config _from_ the disk _to_
NVRAM? I hope so, and if not I hope
I can still get my RAID-5 virtual
disk data back...

10) Now, on reboot, "3 Logical drive(s)
found on host adapter" appeared but
after booting into Windows 2K3 Server,
the Event Viewer displayed "Virtual
Disk failed" PercPro 508 errors.

11) Shut down PE4400 and PV220S, turned
them back on.

12) Two messages:

The following SCSI IDs are not responding:
Channel-1 0,1,2,3,4,5,7,8,9

3 Logical drive(s) failed

appear. Going into the configuration utility, I find:

RAID Ch-0        RAID Ch-1

ID               ID
 0                0  FAIL  A00-01
 1                1  FAIL  A00-02
 2                2  FAIL  A00-03
 3                3  FAIL  A02-01
 4                4  FAIL  A02-02
 5                5  FAIL  A02-03
 6                6
 7                7  FAIL  A01-01
 8                8  FAIL  A01-02
 9                9  FAIL  A01-03

13) Turned off everything and plugged
SCSI cable into PERC 3/DC's Ch-0
connector and now get, following a
message:

RAID Ch-0        RAID Ch-1

ID               ID
 0  READY         0  FAIL  A00-01
 1  READY         1  FAIL  A00-02
 2  READY         2  FAIL  A00-03
 3  READY         3  FAIL  A02-01
 4  READY         4  FAIL  A02-02
 5  READY         5  FAIL  A02-03
 6  PROC          6
 7  READY         7  FAIL  A01-01
 8  READY         8  FAIL  A01-02
 9  READY         9  FAIL  A01-03
10  PROC
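
To make the state in step 13 easier to read, here is a
rough Python sketch that tabulates the listing by array.
It is only an illustration: the data is transcribed by hand
from the listing above, the function names are my own, and
I am assuming that a label such as "A00-01" means array 00,
member 01. It shows that every disk marked FAIL on Ch-1 has
a matching ID reported READY on Ch-0.

```python
from collections import defaultdict

# Ch-1 column of the step 13 listing, transcribed by hand:
# (SCSI ID, state, label as displayed).
CH1 = [
    (0, "FAIL", "A00-01"), (1, "FAIL", "A00-02"), (2, "FAIL", "A00-03"),
    (3, "FAIL", "A02-01"), (4, "FAIL", "A02-02"), (5, "FAIL", "A02-03"),
    (7, "FAIL", "A01-01"), (8, "FAIL", "A01-02"), (9, "FAIL", "A01-03"),
]

# Ch-0 column: IDs shown as READY (IDs 6 and 10 show PROC, so they are left out).
CH0_READY = {0, 1, 2, 3, 4, 5, 7, 8, 9}


def summarize(ch1, ch0_ready):
    """Group Ch-1 entries by array label (assumed to be the part before '-')."""
    arrays = defaultdict(list)
    for scsi_id, state, label in ch1:
        array = label.split("-")[0]
        # Third field: is the same SCSI ID reported READY on Ch-0?
        arrays[array].append((scsi_id, state, scsi_id in ch0_ready))
    return dict(arrays)


def failed_but_present(ch1, ch0_ready):
    """SCSI IDs marked FAIL on Ch-1 that nevertheless appear READY on Ch-0."""
    return sorted(i for i, state, _ in ch1 if state == "FAIL" and i in ch0_ready)


if __name__ == "__main__":
    for array, members in sorted(summarize(CH1, CH0_READY).items()):
        print(array, members)
    print("FAIL on Ch-1 but READY on Ch-0:", failed_but_present(CH1, CH0_READY))
```

If that reading of the labels is right, all three arrays
have all three members marked FAIL on Ch-1, while the same
nine IDs respond on Ch-0.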

At this point I've become confused about the
next step to take because I'd assumed that,
by 10), I'd have recovered the virtual disks
-- which hold some recent data not on backup
tape, so getting it off is the main priority.

Does anyone have experience of this? Is the
problem now that the metadata on both the
NVRAM and the disks (they are now in sync
following step 9 above) is incorrect? Is a
"rebuild" the next step? I'm slightly lost
because in all these years I've never
experienced a RAID array failure. There are a
few concepts I am trying to get my head around.


Cheers,
passata


1 REPLY
Passata Sotto
Occasional Visitor

Re: Failed virtual disks in direct attached storage

--------
After checking each physical disk in the BIOS
to verify that they were OK, I performed a
"force online".