HPE EVA Storage

David Borojevic
Trusted Contributor

W2K Disk error for HSG80 hosted unit

Hello,

We have an Exchange 2000 database hosted on a W2K server, using a unit from an HSG80 SAN. The system event log reports an error every night, but the time varies a bit. It doesn't always coincide with backups or indexing.

The error is Source:Disk, Event id:11, Text: The driver detected a controller error on \Device\Harddisk3\DR3.

It is a raid 5 set with all disks reporting as "normal".

The server in question is also running the steam agent, if that is relevant. The HSG80 doesn't report any errors at the times of the event log errors.

Any ideas on how I should proceed or other diagnostics I could initiate?

Thanks, Dave.
Mike Naime
Honored Contributor
Solution

Re: W2K Disk error for HSG80 hosted unit

Find out which controller the LUN is online to. Plug a laptop into that controller's CLI port and log the output. (Run SHOW Dxx; if it says ONLINE TO THIS CONTROLLER, you have the correct CLI port.) The next morning you can see what the HSG80 reported.

Also find out what Event ID 11 is.
VMS SAN mechanic
David Borojevic
Trusted Contributor

Re: W2K Disk error for HSG80 hosted unit

Yes, there are numerous errors for the same disk - Instance Code: 0328450A...
SCSI Sense Data:
Error Code: 112.(70) {current command execution}
Information field is valid
Segment: 0.(00)
Sense Key: 1.(01) RECOVERED ERROR

So I assume that disk is on the way out and should be replaced.

I haven't replaced a "Normal" disk before, only failed disks. From reading the manuals, I think I should do the following:

set r3 nopolicy
reduce r3 disk50200 (is this correct/necessary? - I have done without it before?)
set r3 remove=disk50200
delete disk50200

Then hot-swap in the new disk, and then I assume:

run config
set r3 replace=disk50200
set r3 policy=...

Thanks

PS: Event ID 11, source Disk, is, from what I found, typically a disk error passed up from a RAID controller or HBA, as in this case.
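
The sense data quoted in this post uses a "decimal.(hex)" notation for each field. As a side note, a minimal Python sketch for decoding lines in that notation is given here. The function name parse_sense_lines and the SENSE_KEYS table name are made up for illustration; the sense key names themselves are the standard SCSI ones, and the sketch assumes every field line follows the "Name: dec.(hex)" pattern seen above.

```python
import re

# Standard SCSI sense key names (4-bit field in fixed-format sense data).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
}

# Matches "Name: 112.(70)": a field name, a decimal value, then hex in parentheses.
_FIELD = re.compile(r"^\s*([^:]+):\s*(\d+)\.\(([0-9A-Fa-f]+)\)")


def parse_sense_lines(text):
    """Return {field name: integer value} for every 'Name: dec.(hex)' line."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match is None:
            continue  # free-text lines such as "Information field is valid"
        name = match.group(1).strip()
        dec_value = int(match.group(2))
        hex_value = int(match.group(3), 16)
        if dec_value != hex_value:
            raise ValueError(f"decimal and hex disagree for {name!r}")
        fields[name] = dec_value
    return fields
```

Looking up SENSE_KEYS[fields["Sense Key"]] for the excerpt above gives the same RECOVERED ERROR text that the controller printed, which means the command completed after the drive or controller took some recovery action.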

David Borojevic
Trusted Contributor

Re: W2K Disk error for HSG80 hosted unit

Oh I forgot,

Because of the EISA partition problem, should I DILX the new disk before the set replace??

Thanks in anticipation.
Wayne Rippy
Occasional Advisor

Re: W2K Disk error for HSG80 hosted unit

I typically DILX new disks before placing them into a RAID set, as long as the other disks were also DILX'd prior to creating the set. Otherwise you end up with an error if the disk sizes vary, or the set reconstructs to the smallest size in the set.

When installing Exchange 5.5 on NT4 with an MA8000 and two Microsoft clustered servers, I ran into exactly the same type of error.

We found out two things in the end: use KGPSA HBAs that are 66 MHz, not 33 MHz, and make sure they are in slot 3 (DEC/Compaq recommended) and slot 5, never the first or last slot. Can't hurt to try.
Uwe Zessin
Honored Contributor

Re: W2K Disk error for HSG80 hosted unit

David,
it sounds like R3 is a RAID 5 set (raidset). The REDUCE command works only on mirrorsets, to split off a clone.

set r3 nopolicy ! so that no spare disk is pulled in automatically
set r3 remove=disk50200
delete disk50200
- swap disk
add disk disk50200 5 2 0
set r3 replace=disk50200
set r3 policy=?

I don't see a need to erase the disk because of the 'EISA partition' (I really wonder which 'genius' has forced this mess on us...), as it will be overwritten by the RAID 5 reconstruct anyway.

It is nevertheless a good idea to use DILX to put some load on the disk before it goes into use. One of my colleagues encountered a new disk this week that almost crashed the whole storage system when it was accessed.

--
Wayne,
I have never seen DILX reduce a physical disk's size. As far as I know, you can only 'DILX' a unit. Can you explain a bit, please?
.
David Borojevic
Trusted Contributor

Re: W2K Disk error for HSG80 hosted unit

Thanks to all. The new disk is in and happy, so I expect no more errors !?!?

Yes Uwe, the first line of the command reference does state that reduce is only for mirrorsets - I obviously missed that bit.

Note that the commands required were slightly different. The set r3 remove=disk50200 puts the disk in the failedset, so it has to be deleted from the failedset before the delete disk50200 will work. I also put the new disk into a unit to DILX it, just to be sure.

Thanks again.
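
For anyone repeating this, the sequence as it ended up in this thread can be written down as a small Python sketch that just builds the command lines in order. The function name replacement_steps is hypothetical, the exact "delete failedset" syntax is an assumption based on the description above (check it against the CLI reference for your ACS version), and the policy value is left as a parameter because it was not stated in the thread. The optional step of putting the new disk into a temporary unit for DILX is not included.

```python
def replacement_steps(raidset, disk, port, target, lun, policy):
    """Return the CLI lines, in order, for swapping one member of a raidset.

    Lines starting with "!" are comments, not commands.
    """
    return [
        f"set {raidset} nopolicy",            # stop a spare being pulled in automatically
        f"set {raidset} remove={disk}",       # member moves to the failedset
        f"delete failedset {disk}",           # assumed syntax: take it out of the failedset
        f"delete {disk}",                     # now the disk definition can be deleted
        "! physically swap the disk here",
        f"add disk {disk} {port} {target} {lun}",
        f"set {raidset} replace={disk}",      # starts the reconstruct onto the new disk
        f"set {raidset} policy={policy}",     # restore whatever policy the set had before
    ]
```

Called with "r3", "disk50200" and the port, target and LUN values 5, 2, 0 used earlier in the thread, this returns the same commands that were discussed, with the failedset deletion placed before the delete of the disk itself.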
Uwe Zessin
Honored Contributor

Re: W2K Disk error for HSG80 hosted unit

Oh, sorry I missed that. I must have still been focused on the REDUCE somehow. My apologies - it is good to hear that you could handle it anyway.
.