Disk Enclosures

Unable to delete/reinitialize RAID array on HSJ40 controller pair

SOLVED
Brad Emrich
Occasional Visitor

As requested I am starting a new thread instead of replying to an existing one.

I lost a RAID array in a power failure. I do not need to recover any lost data; I just need to re-initialize the array and wipe the drives. However, none of the commands work, and I am unable to free up any of the spindles. I am following my normal procedure for adding and removing RAID arrays, but I get stuck in a loop at this point:

HSJ014>
HSJ014>delete d1443
Error 9410: Cannot delete unit -- LOST_DATA error exists on unit that
must be cleared first. To clear error type:
CLEAR_ERRORS D1443 LOST_DATA
HSJ014>
HSJ014>CLEAR_ERRORS D1443 LOST_DATA
Error 7120: CLEAR LOST_DATA attempt failed on D1443.
HSJ014>
HSJ014>delete d1443
Error 9410: Cannot delete unit -- LOST_DATA error exists on unit that
must be cleared first. To clear error type:
CLEAR_ERRORS D1443 LOST_DATA
HSJ014>
HSJ014>CLEAR_ERRORS D1443 LOST_DATA
Error 7120: CLEAR LOST_DATA attempt failed on D1443.
HSJ014>
HSJ014>

Here are requested responses from the main controller:

Copyright Compaq Computer Corporation 1993, 1998. All rights reserved.
HSJ40 Firmware version V37J-1, Hardware version H09

Last fail code: 018700A0

Press " ?" at any time for help.


HSJ014>
HSJ014>
HSJ014>
HSJ014>show this full
Controller:
HSJ40 (C) DEC ZG54612319 Firmware V37J-1, Hardware H09
Configured for dual-redundancy with ZG54412018
In dual-redundant configuration
SCSI address 7
Time: 13-APR-2010 17:03:01
Host port:
Node name: HSJ014, valid CI node 3, 32 max nodes
System ID 420010031922
Path A is ON
Path B is ON
MSCP allocation class 5
TMSCP allocation class 5
CI_ARBITRATION = SYNCHRONOUS
MAXIMUM_HOSTS = 15
Cache:
32 megabyte write cache, version 2
Cache is GOOD
Battery is FAILED
No unflushed data in cache
CACHE_FLUSH_TIMER = DEFAULT (10 seconds)
CACHE_UPS
Licensing information:
RAID (RAID Option) is ENABLED, license key is VALID
WBCA (Writeback Cache Option) is ENABLED, license key is VALID
MIRR (Disk Mirroring Option) is DISABLED, license key is INVALID
Extended information:
Terminal speed 9600 baud, eight bit, no parity, 1 stop bit
Operation control: 00000000 Security state code: 77685
Configuration backup disabled
HSJ014>
HSJ014>
HSJ014>show other full
Controller:
HSJ40 (C) DEC ZG54412018 Firmware V37J-1, Hardware H07
Configured for dual-redundancy with ZG54612319
In dual-redundant configuration
SCSI address 6
Time: 13-APR-2010 17:03:14
Host port:
Node name: HSJ14B, valid CI node 4, 32 max nodes
System ID 42001004F128
Path A is ON
Path B is ON
MSCP allocation class 5
TMSCP allocation class 5
CI_ARBITRATION = SYNCHRONOUS
MAXIMUM_HOSTS = 15
Cache:
32 megabyte write cache, version 2
Cache is GOOD
Battery is FAILED
No unflushed data in cache
CACHE_FLUSH_TIMER = DEFAULT (10 seconds)
CACHE_UPS
Licensing information:
RAID (RAID Option) is ENABLED, license key is VALID
WBCA (Writeback Cache Option) is ENABLED, license key is VALID
MIRR (Disk Mirroring Option) is DISABLED, license key is INVALID
Extended information:
Terminal speed 9600 baud, eight bit, no parity, 1 stop bit
Operation control: 00000000 Security state code: 77739
Configuration backup disabled
HSJ014>
HSJ014>
HSJ014>show storage full
Name Storageset Uses Used by
------------------------------------------------------------------------------

R0 raidset DISK210 D1410
DISK320
DISK410
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK410 (member 0) is NORMAL
DISK210 (member 1) is NORMAL
DISK320 (member 2) is NORMAL
Size: 35529666 blocks

R2 raidset DISK110 D1411
DISK220
DISK330
DISK440
DISK550
DISK600
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK600 (member 0) is NORMAL
DISK110 (member 1) is NORMAL
DISK220 (member 2) is NORMAL
DISK330 (member 3) is NORMAL
DISK440 (member 4) is NORMAL
DISK550 (member 5) is NORMAL
Size: 88824180 blocks

R3 raidset DISK120 D1412
DISK230
DISK340
DISK450
DISK500
DISK610
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK500 (member 0) is NORMAL
DISK610 (member 1) is NORMAL
DISK120 (member 2) is NORMAL
DISK230 (member 3) is NORMAL
DISK340 (member 4) is NORMAL
DISK450 (member 5) is NORMAL
Size: 41879900 blocks

R4 raidset DISK130 D1413
DISK240
DISK350
DISK400
DISK530
DISK620
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK400 (member 0) is NORMAL
DISK130 (member 1) is NORMAL
DISK240 (member 2) is NORMAL
DISK350 (member 3) is NORMAL
DISK530 (member 4) is NORMAL
DISK620 (member 5) is NORMAL
Size: 41879900 blocks

R5 raidset DISK140 D1414
DISK250
DISK300
DISK420
DISK510
DISK630
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK140 (member 0) is NORMAL
DISK250 (member 1) is NORMAL
DISK300 (member 2) is NORMAL
DISK420 (member 3) is NORMAL
DISK510 (member 4) is NORMAL
DISK630 (member 5) is NORMAL
Size: 41879900 blocks

R6 raidset DISK150 D1415
DISK200
DISK310
DISK520
Switches:
POLICY (for replacement) = BEST_PERFORMANCE
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK150 (member 0) is NORMAL
DISK200 (member 1) is NORMAL
DISK310 (member 2) is NORMAL
DISK520 (member 3) is NORMAL
Size: 53294505 blocks

R7 raidset DISK100 D1443
DISK430
DISK540
DISK650
Switches:
NOPOLICY (for replacement)
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK100 (member 0) is NORMAL
DISK430 (member 1) is NORMAL
DISK540 (member 2) is NORMAL
DISK650 (member 3) is NORMAL
Size: 53294505 blocks

SPARESET spareset

FAILEDSET failedset
Switches:
NOAUTOSPARE
HSJ014>
HSJ014>
HSJ014>show unit full
MSCP unit Uses
--------------------------------------------------------------

D1410 R0
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to the other controller
No exclusive access
PREFERRED_PATH = OTHER_CONTROLLER
Size: 35529666 blocks
D1411 R2
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to the other controller
No exclusive access
PREFERRED_PATH = OTHER_CONTROLLER
Size: 88824180 blocks
D1412 R3
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to the other controller
No exclusive access
PREFERRED_PATH = OTHER_CONTROLLER
Size: 41879900 blocks
D1413 R4
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to this controller
No exclusive access
PREFERRED_PATH = THIS_CONTROLLER
Size: 41879900 blocks
D1414 R5
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to this controller
No exclusive access
PREFERRED_PATH = THIS_CONTROLLER
Size: 41879900 blocks
D1415 R6
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to this controller
No exclusive access
PREFERRED_PATH = THIS_CONTROLLER
Size: 53294505 blocks
D1443 R7
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
INOPERATIVE
Unit has lost data
PREFERRED_PATH = OTHER_CONTROLLER
WRITE_PROTECT - DATA SAFETY
Size: 53294505 blocks
D1455 DISK640
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
ONLINE to the other controller
No exclusive access
PREFERRED_PATH = OTHER_CONTROLLER
Size: 8378028 blocks
HSJ014>
HSJ014>


I am running OpenVMS Alpha 7.3-1. Two nodes are giving this response:

sho dev/full $5$dua1443:

Disk $5$DUA1443: (HSJ14B), device type MSCP served SCSI disk array, is online,
file-oriented device, shareable, available to cluster, error logging is
enabled.

Error count 0 Operations completed 0
Owner process "" Owner UIC [ZEBEC,SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G:R,W
Reference count 0 Default buffer size 512
Host name "HSJ14B" Host type, avail HSJ4, yes
Allocation class 5


...while the other node is giving this response:

sho dev/full $5$dua1443:

Disk $5$DUA1443: (HSJ14B), device type MSCP served SCSI disk array, is online,
file-oriented device, shareable, available to cluster, error logging is
enabled.

Error count 36 Operations completed 33
Owner process "" Owner UIC [ZEBEC,SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G:R,W
Reference count 0 Default buffer size 512
Total blocks 53294505 Sectors per track 169
Total cylinders 5256 Tracks per cylinder 60
Host name "HSJ14B" Host type, avail HSJ4, yes
Allocation class 5


The drive in question seems to be controlled by the backup controller. I have run the command "clear_errors other_controller invalid_cache destroy_unflushed_data" many times, but it does not seem to clear the cache successfully. Is there any way to wipe this RAID array without wiping the whole controller full of drives? Any help is appreciated. Thanks!


4 REPLIES
cnb
Honored Contributor
Solution

Re: Unable to delete/reinitialize RAID array on HSJ40 controller pair

Hi Brad,


D1443 R7
Switches:
NORUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
NO VOLUME MOUNTED
Unit has lost data
PREFERRED_PATH = OTHER_CONTROLLER
WRITE_PROTECT - DATA SAFETY
Size: 53294505 blocks
HSJ014>set d1443 run
HSJ014>show d1443
MSCP unit Uses
--------------------------------------------------------------

D1443 R7
Switches:
RUN NOWRITE_PROTECT READ_CACHE
WRITEBACK_CACHE
MAXIMUM_CACHED_TRANSFER_SIZE = 32
State:
INOPERATIVE
Unit has lost data
PREFERRED_PATH = OTHER_CONTROLLER
WRITE_PROTECT - DATA SAFETY
Size: 53294505 blocks
HSJ014>show r7
Name Storageset Uses Used by
------------------------------------------------------------------------------

R7 raidset DISK100 D1443
DISK430
DISK540
DISK650
Switches:
NOPOLICY (for replacement)
RECONSTRUCT (priority) = NORMAL
CHUNKSIZE = 256 blocks
State:
NORMAL
DISK100 (member 0) is NORMAL
DISK430 (member 1) is NORMAL
DISK540 (member 2) is NORMAL
DISK650 (member 3) is NORMAL
Size: 53294505 blocks
HSJ014>
HSJ014>

The drive is write-protected and issuing the command from the non-preferred path won't clear the errors.
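As a rough illustration of how to spot which controller holds the affected unit, here is a minimal Python sketch that scans captured SHOW UNIT FULL text for units reporting lost data and reads their PREFERRED_PATH. The function names and the idea of saving the console output to a string are assumptions for illustration only; the sample text is an excerpt of the output posted above.

```python
import re

# Excerpt of the SHOW UNIT FULL output quoted in this thread.
SAMPLE = """\
D1415 R6
State:
ONLINE to this controller
No exclusive access
PREFERRED_PATH = THIS_CONTROLLER
Size: 53294505 blocks
D1443 R7
State:
INOPERATIVE
Unit has lost data
PREFERRED_PATH = OTHER_CONTROLLER
WRITE_PROTECT - DATA SAFETY
Size: 53294505 blocks
"""

# A unit header is the unit name followed by what it uses, e.g. "D1443 R7".
UNIT_HEADER = re.compile(r"^(D\d+)\s+(\S+)$")


def parse_units(text):
    """Collect the lost-data flag and preferred path for each unit."""
    units = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = UNIT_HEADER.match(line)
        if header:
            current = units.setdefault(
                header.group(1), {"lost_data": False, "preferred_path": None}
            )
            continue
        if current is None:
            continue
        if line == "Unit has lost data":
            current["lost_data"] = True
        elif line.startswith("PREFERRED_PATH"):
            current["preferred_path"] = line.split("=", 1)[1].strip()
    return units


def lost_data_units(text):
    """Return (unit, preferred_path) pairs for units that report lost data."""
    return [
        (name, info["preferred_path"])
        for name, info in parse_units(text).items()
        if info["lost_data"]
    ]


print(lost_data_units(SAMPLE))
```

For the output in this thread, the only unit flagged is D1443, and its preferred path points at the other controller, which is why the command has to be issued there.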

Access the controller that currently owns the drive, either through the physical maintenance cable port or with this VMS DCL command:

$ SET HOST/DUP/SERV=MSCP$DUP/TASK=CLI HSJ14B

Now try to clear your LOST_DATA errors.

HSJ14B> CLEAR_ERRORS D1443 LOST_DATA
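
The sequence can be written out as a small Python sketch that builds the command lines for a given unit. This is only an illustration: the helper name and the node mapping are assumptions taken from the SHOW THIS and SHOW OTHER output in this thread (as seen from the HSJ014 console), and the CLEAR_ERRORS argument order follows the hint printed with Error 9410. The final DELETE discards the unit, which is what the original poster wanted; leave it out if the data matters.

```python
# Node names as reported by SHOW THIS / SHOW OTHER from the HSJ014 console.
# Assumption: other sites will have different node names.
NODE_FOR_PATH = {
    "THIS_CONTROLLER": "HSJ014",
    "OTHER_CONTROLLER": "HSJ14B",
}


def recovery_steps(unit, preferred_path):
    """Build the commands to clear LOST_DATA on the owning controller."""
    node = NODE_FOR_PATH[preferred_path]
    unit = unit.upper()
    return [
        # Open a CLI session on the controller that owns the unit.
        "$ SET HOST/DUP/SERV=MSCP$DUP/TASK=CLI " + node,
        # Clear the error from that controller's prompt.
        node + "> CLEAR_ERRORS " + unit + " LOST_DATA",
        # Only once the error is cleared can the unit be deleted.
        node + "> DELETE " + unit,
    ]


for step in recovery_steps("d1443", "OTHER_CONTROLLER"):
    print(step)
```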

Let me know if that doesn't work.

Rgds,

If it does, please assign points. http://forums11.itrc.hp.com/service/forums/helptips.do?#33
Brad Emrich
Occasional Visitor

Re: Unable to delete/reinitialize RAID array on HSJ40 controller pair

I tried the command from the backup controller and it worked great! After replacing 2 of the 4 drives and initializing, I am now ready to go!

Thanks a BUNCH!!!!!
cnb
Honored Contributor

Re: Unable to delete/reinitialize RAID array on HSJ40 controller pair

Glad you're back in business!

Rgds,
Louis Henninger_1
Regular Advisor

Re: Unable to delete/reinitialize RAID array on HSJ40 controller pair

You should also consider replacing the cache batteries in both controllers; both report "Battery is FAILED".

Things don't always work as expected when controllers have batteries flagged as bad.

Regards,

Louis H.
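
For anyone who captures the console output to a file, a minimal Python sketch along these lines could flag the battery state reported by SHOW THIS FULL or SHOW OTHER FULL. The function name is hypothetical, and the sample is an excerpt of the Cache section posted earlier in this thread.

```python
# Excerpt of the Cache section from SHOW THIS FULL in this thread.
SAMPLE = """\
Cache:
32 megabyte write cache, version 2
Cache is GOOD
Battery is FAILED
No unflushed data in cache
"""


def battery_state(show_controller_text):
    """Return the word after "Battery is", or None if no such line exists."""
    prefix = "Battery is "
    for raw in show_controller_text.splitlines():
        line = raw.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


state = battery_state(SAMPLE)
if state != "GOOD":
    print("Cache battery needs attention, reported state:", state)
```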