RA8000

 
Ben Pettit
Occasional Advisor

RA8000

I have an RA8000 FCAL with 2 NT servers, MSCS, 2 x 7-port fibre hubs, and dual redundant controllers, both using port 1 only, in an active/active, multiple-bus failover configuration with Secure Path. One of the disks in a RAIDset has failed and I need to replace it. I know how to do this using the CC but am unsure which controller to quiesce in order to physically remove the drive. Any advice?
23 REPLIES
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

Assuming you have an HSZ80, you should be able to identify the controller from the WWID of the disk to be removed. Correct me if I am wrong.

greetings,

Michael
John Silk
Advisor

Re: RA8000

Questions:
1) You don't mention whether you have a hot spare.
2) You don't mention what type of RAIDset it is, how many drives you have in the system, or whether you want to maintain the layout.
3) Does the drive have an amber LED on?

Assumptions:
1) RAID 5 or RAID 1
2) No spare
3) Amber LED on

Suggestions:
1) Enable AUTOSPARE
2) Remove the drive
3) Replace the drive
4) Monitor the rebuild
Striving for perfection is a worthwhile goal
Ben Pettit
Occasional Advisor

Re: RA8000

Michael, sorry, I don't understand what you mean. Could you elaborate? Are the disks labelled according to the controller?

John, I don't have a hot spare in place, though I do have a spare drive. It is a RAID 5 RAIDset, the amber light is on on the drive, and the drive is showing as failed in the Command Console.

I have read the manual and am clear on the commands and the process for removing the drive from the failedset and so forth using the console.

What I am unsure of is which controller in my configuration I need to quiesce in order to physically remove the drive. As I understand it, in my configuration both controllers are connected to the storage unit using port 1, so there is only one bus?

In other words, how do I know which controller to quiesce in order to stall the bus on which the faulty disk is located (and the new disk, once I have replaced the failed drive) in order to do a warm swap?

I hope this explains things more clearly. Unfortunately I have inherited this storage unit and it is now out of warranty. Fortunately the volume which is broken is not mission critical.

Many thanks for your help.
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

All I could find was that simply removing the drive may interrupt the loop.

Hi Ben,

Do you have an HSZ80? If so, can you post the output of show disk full for that disk?

greetings,

Michael
Ben Pettit
Occasional Advisor

Re: RA8000

The controllers are HSG80; disk map attached.

As explained, I need to know the process for removing a disk from an active/active controller configuration.
Ben Pettit
Occasional Advisor

Re: RA8000

In case anyone is in a similar situation to my own: I rang HP and spoke to a techie who told me that I would need to press the quiesce buttons on both controllers for the bus that my faulty drive was connected to. I tried this and nothing happened. I tried again with just one controller, the quiesce sequence started, and I was able to swap my faulty drive. From this "experiment" I deduce that either controller will do, as they are both on the same bus. And don't listen to HP support personnel....
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

I have found this in the manual for the HSG80. It does not say anything about quiescing for a disk replacement; quiescing is only mentioned in cases related to the controller or cache.

greetings,

Michael




2-16 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide
Removing a Failed RAIDset or Mirrorset Member
Use the following steps to remove a failed RAIDset or mirrorset member:
1. Connect a PC or terminal to the controller maintenance port that accesses the reduced RAIDset or mirrorset.
2. Enable AUTOSPARE with the following command:
SET FAILEDSET AUTOSPARE
With AUTOSPARE enabled, any new disk drive (one that has not been in an array before) inserted into the Port-Target-LUN (PTL) location of a failed disk drive is automatically initialized and placed into the spareset.
3. Remove the failed di
Rob Buxton
Honored Contributor

Re: RA8000

I have never quiesced an HSG80 Controller prior to removing a drive.

We still have some old HSD Controllers here and on those you quiesce the channel.

In our configuration we have a pair of HSG80s in multibus failover mode with both channels connected, running Windows 2000 and Windows 2003 servers. No NT clusters though.

Also, I've never quiesced an HSG80 controller when adding drives, and I've performed that action a number of times.
Ben Pettit
Occasional Advisor

Re: RA8000

http://wwss1pro.compaq.com/support/reference_library/viewdocument.asp?source=OD011115_CW01.xml&dt=3

I read this as saying that you can hot swap, but warm swap is the recommended course of action for data protection.
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

To me it looks like warm swap is for removing working disks, or for the exceptions described in the requirements for hot swap. In your case I would enable autospare, delete all references to the failed disk, remove it and, if possible, replace it with an unused disk. The system should then take care of it automatically. But I guess warm swap is never a bad thing to do.

greetings,

Michael
Andrew_168
Regular Advisor

Re: RA8000

If you look closely at the other drives when you add a drive, all activity stops while the new drive spins up. This can cause delayed write errors on the servers. The answer is to always quiesce the bus involved.
Ben Pettit
Occasional Advisor

Re: RA8000

The saga continues.... I replaced my drive, using the quiesce function, only for it to fail within a few days. I have since replaced that drive too, but the new one doesn't go into the spareset or into the reduced array. I have tried to use the replace function:

upper >set r0 nopolicy

upper >show r0

Name          Storageset                     Uses             Used by
------------------------------------------------------------------------------

R0            raidset                        DISK20000        D1
                                             DISK30000
                                             DISK40000
                                             DISK50000
                                             DISK60000
        Switches:
          NOPOLICY (for replacement)
          RECONSTRUCT (priority) = NORMAL
          CHUNKSIZE = 256 blocks
        State:
          REDUCED
          DISK20000 (member 1) is NORMAL
          DISK30000 (member 2) is NORMAL
          DISK40000 (member 3) is NORMAL
          DISK50000 (member 4) is NORMAL
          DISK60000 (member 5) is NORMAL
        Size: 88824180 blocks

upper >set r0 rep=disk10000

Error 3190: Unable to replace DISK10000 in R0
upper >

But I get error 3190, and I can't find any related fault for this number. Is it more likely that there is a fault with the controller port?
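
For anyone reading the show r0 output above and wondering where the gap is, here is a minimal Python sketch that lists the member numbers present and reports any that are absent. It only works on the member lines pasted in the post. The idea that member numbering starts at 0 is an assumption, not something the thread confirms, and the variable names are made up for illustration.

```python
import re

# Member lines copied from the show r0 output in the post.
show_r0 = """\
DISK20000 (member 1) is NORMAL
DISK30000 (member 2) is NORMAL
DISK40000 (member 3) is NORMAL
DISK50000 (member 4) is NORMAL
DISK60000 (member 5) is NORMAL
"""

# Map member number -> (disk name, state).
members = {
    int(number): (disk, state)
    for disk, number, state in re.findall(
        r"(DISK\d+)\s+\(member\s+(\d+)\)\s+is\s+(\w+)", show_r0
    )
}

# Assumption: member numbers start at 0, so a gap below the highest
# listed number marks the slot that the failed disk used to fill.
missing = [n for n in range(max(members) + 1) if n not in members]

print("present:", sorted(members))
print("missing member slots:", missing)
```

Under that assumption the empty slot is member 0, which would fit DISK10000 having been the first member of R0 before it failed.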

Leon Rosier
Respected Contributor

Re: RA8000

With the later HSOF versions for the HSG80 controller you don't have to quiesce the bus anymore.
Did you do an "init disk10000" before you tried to add it to the RAIDset?
You should do this; perhaps the disk has old RAID information on it.

Leon
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

Can you post the portion of show disk full for disk10000 and disk20000?

Michael
Ben Pettit
Occasional Advisor

Re: RA8000

Yes, I did initialise the disk first, but this didn't seem to make any difference. After the attempt to replace the disk failed, the controller put the disk into the failedset, as shown below.

show disk10000

Name          Type          Port Targ  Lun                    Used by
------------------------------------------------------------------------------

DISK10000     disk             1    0    0                    FAILEDSET
          COMPAQ ST19171WC 9A10
        Switches:
          NOTRANSPORTABLE
          TRANSFER_RATE_REQUESTED = 20MHZ (synchronous 20.00 MHZ negotiated)
        Size: NOT YET KNOWN

upper >
show disk20000

Name          Type          Port Targ  Lun                    Used by
------------------------------------------------------------------------------

DISK20000     disk             2    0    0                    R0
          DEC RZ2DD-LS (C) DEC 0306
        Switches:
          NOTRANSPORTABLE
          TRANSFER_RATE_REQUESTED = 20MHZ (synchronous 20.00 MHZ negotiated)
        Size: 17769177 blocks
upper >
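
The two listings above differ in one important field: DISK20000 reports a block count, while DISK10000 reports "NOT YET KNOWN". As a rough sketch (not a controller tool), the Python below pulls the Size field out of pasted show output and compares the two. The function name and the shortened sample strings are illustrative only.

```python
import re

def parse_size_blocks(show_output):
    """Return the block count from a 'Size:' line, or None if the
    controller has not reported one (e.g. 'Size: NOT YET KNOWN')."""
    match = re.search(r"Size:\s*(\d+)\s+blocks", show_output)
    return int(match.group(1)) if match else None

# Shortened copies of the two listings from the post.
disk10000 = """DISK10000 disk 1 0 0 FAILEDSET
COMPAQ ST19171WC 9A10
Size: NOT YET KNOWN"""

disk20000 = """DISK20000 disk 2 0 0 R0
DEC RZ2DD-LS (C) DEC 0306
Size: 17769177 blocks"""

new_size = parse_size_blocks(disk10000)
member_size = parse_size_blocks(disk20000)

if new_size is None:
    verdict = "size unknown: cannot confirm the replacement is large enough"
elif new_size >= member_size:
    verdict = "replacement is at least as large as the existing member"
else:
    verdict = "replacement is smaller than the existing member"

print(verdict)
```

With the output as posted, the comparison cannot be made at all, because the controller never reported a size for DISK10000.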
Leon Rosier
Respected Contributor

Re: RA8000

Take the disk out of the failedset, redo the init disk10000 and compare its size with that of disk20000. Maybe there is a problem with the size.

Leon
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

try this:
delete fail disk10000
run config
set r0 rep=disk10000

greetings,

Michael
Ben Pettit
Occasional Advisor

Re: RA8000

I followed your instructions, Michael, but no joy. I got the message below back. Repair code 54 says: "The device may be in a state that prevents adding the device as a replacement member or may not be large enough for the storageset. Use another device for the add action and perform repair action 57 for the device that failed to be added." Action 57 just tells you to put the disk into the spareset to be re-used.
The disk is the same size as those already in there, the only difference being that it's a Seagate and the others are original DEC drives. Maybe the disk or its firmware is causing the problem?

Config - Normal Termination
upper >set r0 rep=disk10000

%EVL--upper >--28-JAN-2004 09:33:07-- Instance Code: 02695401
Template: 81.(51)
Occurred on 28-JAN-2004 at 09:33:07
Power On Time: 4. Years, 88. Days, 1. Hours, 16. Minutes, 43. Seconds
Controller Model: HSG80
Controller Model: HSG80
Software Version: V83G(53)
Unit Number: 1.(0001)
Unit Software Version: 1.(01) Unit Hardware Version: 54.(36)
Retry Level: 1. Retries: 1.
Port: 1. Target: 0. LUN: 0.
SCSI Device Type: 0.(00)
Device ID: "ST19171WC" Device Serial Number: "LA422802"
Device Software Revision Level: "9A10"
SCSI Command Opcode: 0.(00)
Sense Data Qualifiers: 0.(00)
SCSI Sense Data:
Error Code: 112.(70) {current command execution}
Information field is valid
Segment: 0.(00)
Sense Key: 6.(06) UNIT ATTENTION
ILI: 0 EOM: 0 FM: 0
Information: 00000000
Additional Sense Length: 0.(00)
Command-Specific Information: 00000000
ASC: 160.(A0) ASCQ: 7.(07)
FRU: 0.(00) Sense-Key Specific: 000000
Instance Code: 02695401
Error 3190: Unable to replace DISK10000 in R0
upper >
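
Many values in the event log above are printed as a decimal number followed by the same value in hex, for example 112.(70). The short Python sketch below checks that each such pair agrees and splits the instance code into bytes. It is only a reading aid for the pasted log. Treating the third byte of the instance code as the repair action is an assumption, made because it matches the repair code 54 quoted in the post.

```python
import re

# Matches pairs such as "112.(70)": decimal value, then hex in parentheses.
PAIR = re.compile(r"(\d+)\.\(([0-9A-Fa-f]+)\)")

def check_pairs(line):
    """Return (decimal, hex_value, consistent) for each 'N.(HH)' pair."""
    return [
        (int(dec), int(hx, 16), int(dec) == int(hx, 16))
        for dec, hx in PAIR.findall(line)
    ]

# Lines copied from the event log in the post.
log_lines = [
    "Template: 81.(51)",
    "Unit Software Version: 1.(01) Unit Hardware Version: 54.(36)",
    "Error Code: 112.(70) {current command execution}",
    "Sense Key: 6.(06) UNIT ATTENTION",
    "ASC: 160.(A0) ASCQ: 7.(07)",
]

for line in log_lines:
    for dec, hexval, ok in check_pairs(line):
        print(f"{dec:>3} = 0x{hexval:02X} {'ok' if ok else 'MISMATCH'}")

# Split the 8-digit instance code into four bytes.
instance_code = "02695401"
fields = [instance_code[i:i + 2] for i in range(0, len(instance_code), 2)]
print(fields)  # ['02', '69', '54', '01']

# Assumption: the third byte is the repair action code.
assumed_repair_action = fields[2]
print("assumed repair action:", assumed_repair_action)
```

All pairs in the pasted log are consistent, so the ASC/ASCQ to look up is A0/07 in hex.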
Leon Rosier
Respected Contributor

Re: RA8000

Can you also place a printout here of:

Show disk10000
Show disk20000

In the previous output the size of disk10000 is not shown (Size: NOT YET KNOWN).

Thx, Leon
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

disk10000 is a 9.1 GB disk, so it should be large enough. Unbelievable.
Try add spare disk10000.
Please post the output of show this full.

Michael
Leon Rosier
Respected Contributor

Re: RA8000

No, it is not that simple. It's the number of blocks available that makes a disk suitable or not.

Leon
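
To put numbers on the block-count point, here is a small Python calculation using only the sizes posted earlier in the thread. Two things are assumptions and not stated anywhere in the thread: a block size of 512 bytes, and R0 having six members in total (the five listed plus the failed DISK10000).

```python
BYTES_PER_BLOCK = 512       # assumption: not stated anywhere in the thread

raidset_blocks = 88824180   # "Size:" from show r0
member_blocks = 17769177    # "Size:" from show disk20000
total_members = 6           # assumption: five listed members + failed DISK10000

# RAID 5 spends one member's worth of space on parity.
data_members = total_members - 1
blocks_used_per_member = raidset_blocks // data_members

member_bytes = member_blocks * BYTES_PER_BLOCK

print("blocks used per member:", blocks_used_per_member)
print("raw blocks on DISK20000:", member_blocks)
print("headroom in blocks:", member_blocks - blocks_used_per_member)
print("DISK20000 capacity in GB (decimal):", round(member_bytes / 1e9, 2))
```

Under these assumptions each member contributes 17764836 blocks to R0, a little less than the 17769177 blocks DISK20000 reports, so two drives both sold as "9.1 GB" could still fall on either side of that line.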
Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

I have found one more thing. According to this page:
http://www.seagate.com/support/disc/specs/scsi/st19171w.html

disk10000 is Ultra Wide SCSI, whereas the existing drives are described as:

DEC RZ2DD-LS 9.1GB 10K RPM Ultra2DskDrv(ES40) - LVD Carrier only

I have searched a lot, but I have not come to a final conclusion. Does the RA8000 support only LVD drives?

Michael

Michael Schulte zur Sur
Honored Contributor

Re: RA8000

Hi,

One more thing: post a link to this thread in the Tru64 forum. Perhaps Ralf Puchner has more insight into this. I don't know whether he looks into the other forums.

Michael