Operating System - OpenVMS

hsz70 ctl, lost pwr to storage box

 
Dean McGorrill
Valued Contributor

I have DS20s connected to StorageWorks boxes:
two in a 6' rack, and one other added on the side. Each has 24 SCSI disks; I don't know the model
and am too po'd right now to remember.

Lab folks decided to move the systems a couple
of feet and thought it could be done with the power on.

Problem: during the move, the side box's power got
removed for a while. When I finally got the
HSZ70 console connected again, all the side storage disks had been failed out into the failedset.

The question: all the failed disks are blinking
like they are bad. Does the HSZ70 controller make them blink bad if they are in the failedset? If so, then if I remove them from the
failedset I can rebuild all my mirrors etc. Thanks, all posts pointed. *sigh* Thanks, Dean
Hein van den Heuvel
Honored Contributor
Solution

Re: hsz70 ctl, lost pwr to storage box

I think they are blinking for want of some TLC. Sounds like you are not ready for that just now. Take a deep breath and give them the attention they bleep for. Given the scenario, you know that the drives are likely just fine, even though moving spinning disks around is not a great idea. The HSZ, however, just saw them disappear and miraculously come back. Thus they are labelled suspect. I think you can just remove them from the failedset and reassign them to the spareset. Check out the HSZ70 - HSOF doc, for example page 4-28 in:
ftp://ftp.compaq.com/pub/products/storageworks/techdoc/controllers/EK-HSZ70-CG-B01.pdf

hth,
Hein.
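
The failedset-to-spareset step suggested above could be scripted roughly as follows. This is only a sketch, not a tested procedure: it assumes the HSOF CLI verbs `SHOW FAILEDSET`, `DELETE FAILEDSET`, `ADD SPARESET` and `SHOW SPARESET` as described in the HSZ70 documentation, and the disk names (`DISK10000`, `DISK20000`) and the helper function are hypothetical. It only prints the command lines to type at the controller console; it does not talk to the controller.

```python
def failedset_recovery_commands(disks):
    """Build HSOF CLI lines that move each disk from the failedset to the spareset.

    `disks` is a list of controller disk names (hypothetical examples below).
    """
    commands = ["SHOW FAILEDSET"]  # check what the controller failed out
    for disk in disks:
        name = disk.strip().upper()
        commands.append(f"DELETE FAILEDSET {name}")  # take the disk out of the failedset
        commands.append(f"ADD SPARESET {name}")      # offer it back as a spare
    commands.append("SHOW SPARESET")  # confirm the disks are now spares
    return commands


if __name__ == "__main__":
    # Hypothetical disk names; substitute the ones SHOW FAILEDSET reports.
    for line in failedset_recovery_commands(["disk10000", "disk20000"]):
        print(line)
```

Check the exact syntax against the controller guide before using it, and only put a disk back in the spareset if you believe it is healthy.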
Hoff
Honored Contributor

Re: hsz70 ctl, lost pwr to storage box

I'd start with a serious and careful system upgrade to the lab servants. This potentially involves replacement of all memory contents, and potentially also a cold reboot to clear all existing errant processes within the servants. Even servant replacement. I'd also check the backup and related processing, as cases such as this one tend to have additional and parallel secondary manifestations within the BACKUP processing; these sorts of servant failures tend to involve multiple systems within the environment.

As for the HSZ70, you are likely going to end up rebuilding the RAID sets. The blinking disks failed, after all, secondary to the operational failures of the servants.

The documentation for the controller has the sequence for clearing out and re-adding disks, assuming your data still exists.

The documentation on recovering from servant failure and -- if deemed necessary -- servant replacement is generally available separately, and not included with the OpenVMS nor the HSZ70 documentation.

:-)

Dean McGorrill
Valued Contributor

Re: hsz70 ctl, lost pwr to storage box

Gentlemen,
I was able to clear the
blinking lights by removing all the drives
from the failedset. Fortunately, I had configured this storage box only for mirrors, thus my system is up for use. I now must rebuild all my mirrorsets. Thank you! Dean
Dean McGorrill
Valued Contributor

Re: hsz70 ctl, lost pwr to storage box

There are some footnotes to the power loss story. It turns out the BA370 storage box was not
moved, but its power had to be cut to untangle the wiring maze. Power was removed from one side only (redundant supplies) and returned, then the other side was
removed and returned. The power switches were
on but there was no juice after that. Why? Flipping both switches reset the power. Next, I did
lose 2 disks from the power supply maneuver. That surprised me! They were
under warranty, and their replacements came
today. Power hits are common; has anyone else
lost disks this way? I would not think
they would be that fragile. Dean
Ian Miller.
Honored Contributor

Re: hsz70 ctl, lost pwr to storage box

Power blips can often lead to hardware failure, especially for systems which have been on a long time.
____________________
Purely Personal Opinion
Rob Leadbeater
Honored Contributor

Re: hsz70 ctl, lost pwr to storage box

If they're in a BA370 shelf, chances are the drives have been spinning and their bearings wearing out for a *long* time.

It's not at all surprising that some didn't spin back up after a power outage. Pulling the drives and giving them a slight knock may well have got them spinning again...

Cheers,

Rob
Dean McGorrill
Valued Contributor

Re: hsz70 ctl, lost pwr to storage box

Rob,
these were new drives out of the box.
They could have been sitting a few years
at the vendor, but they had not been spinning long. It really surprised me
how fragile they were! -Dean
Wim Van den Wyngaert
Honored Contributor

Re: hsz70 ctl, lost pwr to storage box

I've already been on site for 22 hours (with a short sleep in the middle). We had FDDI problems. The card was to blame. We replaced the card. PCI power went dead. We replaced the power. The new card is malfunctioning. Old card again. Dead. The neighbour GS160 member also wants to fail. Just as the first node is functioning again, the neighbour node starts with a hang. Power cycle. Node boots but gives memory ECC errors after a while. And power problems in the error log. And we're still busy.

It had been powered on for a few years.

Do not touch the nodes if possible !

Wim
Ian Miller.
Honored Contributor

Re: hsz70 ctl, lost pwr to storage box

Dean,
what model of drives were they?
They may have been made a while ago, and they may have been repaired (recycled).
____________________
Purely Personal Opinion