HPE 3PAR StoreServ Storage


 

help fixing botched drive replacement

A while back, someone attempted to replace a couple of our drives but did not follow the correct procedure. I am trying to bring the new drives online but have been unsuccessful. I assume I need to move chunklets and run other commands, but I am not clear on what steps to take, since the proper procedure was not followed to start with.

I am hoping an expert can determine what I need to do from the following info:

 

3par.lctn.org cli% servicemag status -d
Cage 0, magazine 0:
A servicemag resume command failed on this magazine.
The command completed at Tue Jan 2 11:56:32 2018.
The output of the servicemag resume was:
servicemag resume 0 0
... mag 0 0 already onlooped
... firmware is current on pd WWN [2000B45253744ED3] Id [ 0]
... firmware is current on pd WWN [2000B45253744FCB] Id [ 6]
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 0
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 0
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 0
... checking for valid disks...
Failed --
disk WWN [2000B45253744FCB] Id [ 6] is not normal. Please use showpd -s to see details of disk state
servicemag resume 0 0 -- Failed

Cage 0, magazine 6:
A servicemag resume command failed on this magazine.
The command completed at Wed Dec 20 08:14:51 2017.
The output of the servicemag resume was:
servicemag resume 0 6
... onlooping mag 0 6
Failed --
Unable to access drive magazine via one or more loops in cage cage0. No drive magazine is present or detected in that position.
Failed --
Unable to access drive magazine via one or more loops in cage cage0. No drive magazine is present or detected in that position.
servicemag resume 0 6 -- Failed

Cage 0, magazine 12:
A servicemag resume command failed on this magazine.
The command completed at Tue Jan 2 12:40:44 2018.
The output of the servicemag resume was:
servicemag resume 0 12
... mag 0 12 already onlooped
... firmware is current on pd WWN [2000B45253744ED3] Id [ 0]
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 12
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 12
... checking for valid disks...
... checking for valid disks...
... disks not normal yet..trying admit/onloop again
... onlooping mag 0 12
... checking for valid disks...
Failed --
disk WWN [2000B45253744ED3] Id [ 0] is not normal. Please use showpd -s to see details of disk state
servicemag resume 0 12 -- Failed

 

3par.lctn.org cli% showpd -s
Id CagePos Type -State- -------------------------------Detailed_State-------------------------------
0 0:12:0 FC failed vacated,notready,not_available_for_allocations,invalid_media,failed_hardware
1 0:3:0 NL normal normal
2 0:4:0 FC normal normal
3 0:7:0 NL normal normal
4 0:8:0 FC normal normal
5 0:11:0 NL normal normal
6 0:0:0 FC failed vacated,notready,invalid_media,servicing
7 0:15:0 NL normal normal
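
For a larger system, the same listing can be filtered mechanically. The following is a minimal Python sketch, not part of the 3PAR CLI: it assumes the showpd -s output has been saved as text in the five-column layout shown above, and the function names are made up for illustration.

```python
# Sample text copied from the showpd -s output above.
SHOWPD_S = """\
Id CagePos Type -State- -------------------------------Detailed_State-------------------------------
0 0:12:0 FC failed vacated,notready,not_available_for_allocations,invalid_media,failed_hardware
1 0:3:0 NL normal normal
2 0:4:0 FC normal normal
3 0:7:0 NL normal normal
4 0:8:0 FC normal normal
5 0:11:0 NL normal normal
6 0:0:0 FC failed vacated,notready,invalid_media,servicing
7 0:15:0 NL normal normal
"""


def parse_showpd_s(text):
    """Return one dict per disk row; header and other lines are skipped."""
    disks = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a data row has exactly five columns and a numeric Id.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        pd_id, cage_pos, dev_type, state, detail = fields
        disks.append({
            "id": int(pd_id),
            "cage_pos": cage_pos,
            "type": dev_type,
            "state": state,
            "flags": detail.split(","),  # Detailed_State is comma-separated
        })
    return disks


def not_normal(disks):
    """Keep only the disks whose State column is not 'normal'."""
    return [d for d in disks if d["state"] != "normal"]


if __name__ == "__main__":
    for d in not_normal(parse_showpd_s(SHOWPD_S)):
        print(d["id"], d["cage_pos"], d["state"], " ".join(d["flags"]))
```

Run against the listing above, this reports only PD 0 (0:12:0) and PD 6 (0:0:0), the two disks in the failed state.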

 

 

 

 

Torsten.
Acclaimed Contributor

Re: help fixing botched drive replacement

Something went wrong. What does "incorrectly" mean - what exactly was done?

What 3PAR model is it?

0:0:0 and 0:12:0 are obviously failed and should be replaced.

There is probably no disk in 0:6:0
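
That guess can be checked against the showpd -s listing in the first post. Here is a minimal Python sketch, assuming CagePos reads as cage:magazine:disk; the function name is hypothetical and the positions are copied by hand from that output.

```python
# CagePos values copied from the showpd -s output in the first post.
CAGE_POSITIONS = [
    "0:12:0", "0:3:0", "0:4:0", "0:7:0",
    "0:8:0", "0:11:0", "0:0:0", "0:15:0",
]


def populated_magazines(cage_positions, cage):
    """Return the sorted magazine numbers in `cage` that hold a disk."""
    mags = set()
    for pos in cage_positions:
        # Assumption: CagePos is cage:magazine:disk.
        c, mag, _disk = (int(part) for part in pos.split(":"))
        if c == cage:
            mags.add(mag)
    return sorted(mags)


if __name__ == "__main__":
    mags = populated_magazines(CAGE_POSITIONS, cage=0)
    print("populated magazines in cage 0:", mags)
    print("magazine 6 populated:", 6 in mags)
```

With the positions above, cage 0 has disks in magazines 0, 3, 4, 7, 8, 11, 12 and 15, and none in magazine 6.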


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   

Re: help fixing botched drive replacement

The drives were replaced with new ones without going through the proper steps.

Model F200

 

I believe the replacement drives are good. At least they report a drive size and other details, and there are no amber lights. I can run any recommended commands to confirm.

Torsten.
Acclaimed Contributor

Re: help fixing botched drive replacement

>> without going through proper steps.

What has been done?

Pulling the drives without running servicemag?

Please show

CLI%  showpd -failed -degraded


Hope this helps!
Regards
Torsten.


Re: help fixing botched drive replacement

Correct. servicemag was not run, and the old drives are no longer available.

 

showpd -failed -degraded
                            -Size(MB)--  ----Ports----
Id CagePos Type RPM State    Total Free  A      B      Capacity(GB)
 0 0:12:0  FC    15 failed  285440    0  0:0:1* 1:0:1           300
 6 0:0:0   FC    15 failed  285440    0  0:0:1  1:0:1*          300

I will need to order new drives if these are indeed bad.

Torsten.
Acclaimed Contributor

Re: help fixing botched drive replacement

Please confirm - the original bad disks were pulled and the new disks were inserted without running any command?

Please post the output of

 

cli%  showcage -d cage0


Hope this helps!
Regards
Torsten.


Re: help fixing botched drive replacement

No commands were run.

 

showcage -d cage0
Id Name LoopA Pos.A LoopB Pos.B Drives Temp RevA RevB Model Side
0 cage0 0:0:1 0 1:0:1 0 8 33-34 08 08 DC3 n/a

-----------Cage detail info for cage0 ---------

Position: bay9 D2
-----------Midplane Info------------
VendorId,ProductId 3PARdata,DC3
Assembly Serial_Num OPS69907C01C3C6
Node_WWN 20000050CC01C3C6
TempSensor_State OK
TempSensor_Value 35
OpsPanel_State OK
Audible_Alarm_State Muted
ID_Switch 1
Cage_State OK

Interface Board Info LoopA LoopB
Firmware_status Current Current
Product_Rev 08 08
IFC_State OK OK
ESH_State OK OK
Master_CPU Yes No
Loop_Map valid valid
Link_Speed 4Gbps 4Gbps
Port0_State OK OK
Port1_State No_SFP No_SFP
Port2_State No_SFP No_SFP
Port3_State No_SFP No_SFP

Power Supply Info State Fan State AC Assy_Part
ps0 OK MedSpeed OK --
ps1 OK MedSpeed OK --

--------------Drive Info--------------- ----LoopA----- ----LoopB-----
Drive NodeWWN State Temp(C) ALPA LoopState ALPA LoopState
0:0 2000b45253744fcb Degraded N/A 0xe1 OK 0xe1 OK
3:0 2210000a33015be8 Normal 33 0xda OK 0xda OK
4:0 2000b452537450d9 Normal 34 0xd9 OK 0xd9 OK
7:0 2210000a33015b5d Normal 34 0xd4 OK 0xd4 OK
8:0 2000b452537438ab Normal 34 0xd3 OK 0xd3 OK
11:0 2210000a33015b42 Normal 34 0xce OK 0xce OK
12:0 2000b45253744ed3 Degraded N/A 0xcd OK 0xcd OK
15:0 2210000a33015b1b Normal 34 0xca OK 0xca OK
3par.lctn.org cli%

Torsten.
Acclaimed Contributor

Re: help fixing botched drive replacement

Please try a

cli% admithw -checkonly


Hope this helps!
Regards
Torsten.


Re: help fixing botched drive replacement

admithw -checkonly
Checking for drive table upgrade packages
Checking nodes...

Checking volumes...

Checking system LDs...

After an upgrade from 2.2.4 or earlier, if sufficient space is present,
admithw will automatically remove and recreate the preserved data LDs
with a larger set size to improve their availability.

Checking ports...

Checking state of disks...
The following disks are NOT in an acceptable state:
Id CagePos Type -State- -------------------------------Detailed_State-------------------------------
0 0:12:0 FC failed vacated,notready,not_available_for_allocations,invalid_media,failed_hardware
6 0:0:0 FC failed vacated,notready,invalid_media,servicing
----------------------------------------------------------------------------------------------------
2 total

Re: help fixing botched drive replacement

Bump

Dennis Handly
Acclaimed Contributor

Re: help fixing botched drive replacement

> % servicemag status -d

 

Did you do a servicemag start?  Or was it automatically done?

What OS version do you have?  How far back is "a while back"?

 

> 0 0:12:0 FC failed vacated,notready,not_available_for_allocations,invalid_media,failed_hardware

> 6 0:0:0 FC failed vacated,notready,invalid_media,servicing

 

For PD 6, it seems you just need to put a good drive there.

After that's fixed, try a servicemag start on 0.

 

What does "showpd -i" show?  Do these WWNs match those under "showcage -d cage0"?

 

Re: help fixing botched drive replacement

Release version 3.1.3 (MU1)
Patches: P09,P10

Component Name Version
CLI Server 3.1.3 (P10)
CLI Client 3.1.3
System Manager 3.1.3 (P10)
Kernel 3.1.3 (MU1)
TPD Kernel Code 3.1.3 (MU1)
TPD Kernel Patch 3.1.3 (P10)

 

3par.lctn.org cli% showpd -i
Id CagePos State ----Node_WWN---- --MFR-- -----Model------ ----Serial---- -FW_Rev-- Protocol MediaType
0 0:12:0 failed 2000B45253744ED3 SEAGATE SEGLE0300GBFC15K 6SJ3GF73 3P00 FC Magnetic
1 0:3:0 normal 2210000A33015BE8 Hitachi HUA722020ALA330 JK11B1BFJ17TXF A3GG,1610 SATA Magnetic
2 0:4:0 normal 2000B452537450D9 SEAGATE SEGLE0300GBFC15K 6SJ3GDMC 3P00 FC Magnetic
3 0:7:0 normal 2210000A33015B5D Hitachi HUA722020ALA330 JK11B1BFJ1DT5F A3GG,1610 SATA Magnetic
4 0:8:0 normal 2000B452537438AB SEAGATE SEGLE0300GBFC15K 6SJ3GGX0 3P00 FC Magnetic
5 0:11:0 normal 2210000A33015B42 Hitachi HUA722020ALA330 JK11B1BFJ14VXF A3GG,1610 SATA Magnetic
6 0:0:0 failed 2000B45253744FCB SEAGATE SEGLE0300GBFC15K 6SJ3GDH9 3P00 FC Magnetic
7 0:15:0 normal 2210000A33015B1B Hitachi HUA722020ALA330 JK11B1BFJ1HAEF A3GG,1610 SATA Magnetic
8 1:0:0 normal 2000B4525374526E SEAGATE SEGLE0300GBFC15K 6SJ3GE0T 3P00 FC Magnetic
9 1:3:0 normal 2210000A33015BCD Hitachi HUA722020ALA330 JK11B1BFJ1B1EF A3GG,1610 SATA Magnetic
10 1:4:0 normal 2000B45253745214 SEAGATE SEGLE0300GBFC15K 6SJ3GDXK 3P00 FC Magnetic
11 1:7:0 normal 2210000A33015B84 Hitachi HUA722020ALA330 JK11B1BFJ17TWF A3GG,1610 SATA Magnetic
12 1:8:0 normal 2000B4525374359A SEAGATE SEGLE0300GBFC15K 6SJ3GFF2 3P00 FC Magnetic
13 1:11:0 normal 2210000A33015B91 Hitachi HUA722020ALA330 JK11B1BFJ1HAVF A3GG,1610 SATA Magnetic
14 1:12:0 normal 2000B45253743BA2 SEAGATE SEGLE0300GBFC15K 6SJ3G7LX 3P00 FC Magnetic
15 1:15:0 normal 2210000A33015BCA Hitachi HUA722020ALA330 JK11B1BFJ1BJ5F A3GG,1610 SATA Magnetic
16 2:0:0 normal 2000B45253743415 SEAGATE SEGLE0300GBFC15K 6SJ3GFCC 3P00 FC Magnetic
17 2:3:0 normal 2210000A33015BEA Hitachi HUA722020ALA330 JK11B1BFJ1910F A3GG,1610 SATA Magnetic
18 2:4:0 normal 2000B45253743C91 SEAGATE SEGLE0300GBFC15K 6SJ3GHDH 3P00 FC Magnetic
19 2:7:0 normal 2210000A330123B0 Hitachi HUA722020ALA330 JK11B1BFJ1B68F A3GG,1610 SATA Magnetic
20 2:8:0 normal 2000B452537298EC SEAGATE SEGLE0300GBFC15K 6SJ3CTPS 3P00 FC Magnetic
21 2:11:0 normal 2210000A33015BF7 Hitachi HUA722020ALA330 JK11B1BFJ11YWF A3GG,1610 SATA Magnetic
22 2:12:0 normal 2000B452537452B2 SEAGATE SEGLE0300GBFC15K 6SJ3GE3C 3P00 FC Magnetic
23 2:15:0 normal 2210000A33015BD1 Hitachi HUA722020ALA330 JK11B1BFJ1B36F A3GG,1610 SATA Magnetic
24 3:0:0 normal 2000B45253743A86 SEAGATE SEGLE0300GBFC15K 6SJ3DRZB 3P00 FC Magnetic
25 3:3:0 normal 2210000A33015BFE Hitachi HUA722020ALA330 JK11B1BFJ19UGF A3GG,1610 SATA Magnetic
26 3:4:0 normal 2000B452537433F3 SEAGATE SEGLE0300GBFC15K 6SJ3GFBX 3P00 FC Magnetic
27 3:7:0 normal 2210000A33015BEB Hitachi HUA722020ALA330 JK11B1BFJ19BGF A3GG,1610 SATA Magnetic
28 3:8:0 normal 2000B45253743A25 SEAGATE SEGLE0300GBFC15K 6SJ3E0XH 3P00 FC Magnetic
29 3:11:0 normal 2210000A33015A9D Hitachi HUA722020ALA330 JK11B1BFJ19XBF A3GG,1610 SATA Magnetic
30 3:12:0 normal 2000B45253744F7F SEAGATE SEGLE0300GBFC15K 6SJ3GDFZ 3P00 FC Magnetic
31 3:15:0 normal 2210000A33015BA4 Hitachi HUA722020ALA330 JK11B1BFHX0LBF A3GG,1610 SATA Magnetic
-------------------------------------------------------------------------------------------------------
32 total
3par.lctn.org cli% showcage -d cage0
Id Name LoopA Pos.A LoopB Pos.B Drives Temp RevA RevB Model Side
0 cage0 0:0:1 0 1:0:1 0 8 34-35 08 08 DC3 n/a

-----------Cage detail info for cage0 ---------

Position: bay9 D2
-----------Midplane Info------------
VendorId,ProductId 3PARdata,DC3
Assembly Serial_Num OPS69907C01C3C6
Node_WWN 20000050CC01C3C6
TempSensor_State OK
TempSensor_Value 36
OpsPanel_State OK
Audible_Alarm_State Muted
ID_Switch 1
Cage_State OK

Interface Board Info LoopA LoopB
Firmware_status Current Current
Product_Rev 08 08
IFC_State OK OK
ESH_State OK OK
Master_CPU Yes No
Loop_Map valid valid
Link_Speed 4Gbps 4Gbps
Port0_State OK OK
Port1_State No_SFP No_SFP
Port2_State No_SFP No_SFP
Port3_State No_SFP No_SFP

Power Supply Info State Fan State AC Assy_Part
ps0 OK MedSpeed OK --
ps1 OK MedSpeed OK --

--------------Drive Info--------------- ----LoopA----- ----LoopB-----
Drive NodeWWN State Temp(C) ALPA LoopState ALPA LoopState
0:0 2000b45253744fcb Degraded N/A 0xe1 OK 0xe1 OK
3:0 2210000a33015be8 Normal 35 0xda OK 0xda OK
4:0 2000b452537450d9 Normal 34 0xd9 OK 0xd9 OK
7:0 2210000a33015b5d Normal 35 0xd4 OK 0xd4 OK
8:0 2000b452537438ab Normal 34 0xd3 OK 0xd3 OK
11:0 2210000a33015b42 Normal 35 0xce OK 0xce OK
12:0 2000b45253744ed3 Degraded N/A 0xcd OK 0xcd OK
15:0 2210000a33015b1b Normal 35 0xca OK 0xca OK
3par.lctn.org cli%
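
To answer the WWN question without comparing by eye, here is a minimal Python sketch. It is not a 3PAR tool: the cage 0 values are copied by hand from the two outputs above, the function name is made up, and it assumes the slot key in showcage -d is the magazine:disk part of CagePos.

```python
# (PD Id, CagePos, Node_WWN) for cage 0, copied from the showpd -i output.
SHOWPD_I = [
    (0, "0:12:0", "2000B45253744ED3"),
    (1, "0:3:0", "2210000A33015BE8"),
    (2, "0:4:0", "2000B452537450D9"),
    (3, "0:7:0", "2210000A33015B5D"),
    (4, "0:8:0", "2000B452537438AB"),
    (5, "0:11:0", "2210000A33015B42"),
    (6, "0:0:0", "2000B45253744FCB"),
    (7, "0:15:0", "2210000A33015B1B"),
]

# Drive slot (magazine:disk) -> NodeWWN, copied from showcage -d cage0.
SHOWCAGE_D = {
    "0:0": "2000b45253744fcb",
    "3:0": "2210000a33015be8",
    "4:0": "2000b452537450d9",
    "7:0": "2210000a33015b5d",
    "8:0": "2000b452537438ab",
    "11:0": "2210000a33015b42",
    "12:0": "2000b45253744ed3",
    "15:0": "2210000a33015b1b",
}


def wwn_mismatches(pd_rows, cage_drives, cage=0):
    """Return (pd_id, cage_pos, pd_wwn, cage_wwn) for each disk whose WWNs differ."""
    mismatches = []
    for pd_id, cage_pos, pd_wwn in pd_rows:
        c, slot = cage_pos.split(":", 1)
        if int(c) != cage:
            continue
        cage_wwn = cage_drives.get(slot)
        # showpd prints WWNs in upper case and showcage in lower case,
        # so compare case-insensitively.
        if cage_wwn is None or cage_wwn.lower() != pd_wwn.lower():
            mismatches.append((pd_id, cage_pos, pd_wwn, cage_wwn))
    return mismatches


if __name__ == "__main__":
    print(wwn_mismatches(SHOWPD_I, SHOWCAGE_D))
```

For the outputs posted here the list comes back empty: every cage 0 WWN in showpd -i, including the two failed disks, matches the WWN that showcage -d reports for the same slot.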

 

I will order a couple replacement drives.