3PAR StoreServ Storage

Failed and degraded disks in 7400

hpuser99
Occasional Contributor

Hello,

I am new to 3PAR. We have a 7400 system that is no longer under support. I noticed that the system has 4 failed disks, and I think they have been in a failed state for a while. I want to replace them one disk at a time. As far as I understand, the admin replaced a failed disk with a simple hot swap, but the system did not accept the new disk. I contacted HP support, since we used to have a contract, and they said disk 8 was replaced with a disk of the wrong type. I am not sure how that happened, because the disk that was ordered matches the part number displayed on the disk. I need some direction on how to replace these disks.

 

cli% showpd -s (pasted only ones with issue)
Id CagePos Type -State-- -------------------------------------------Detailed_State------------------------
8 0:8:0 FC failed vacated,invalid_media,servicing
95 3:23:0 SSD failed vacated,disabled_A_port,disabled_B_port,invalid_media,no_valid_ports,invalid,inquiry_failed,servicing
109 4:13:0? FC failed vacated,missing,invalid_media,no_valid_ports,servicing
126 0:8:0? SSD degraded missing,no_valid_ports,servicing

************************
cli% showpd -failed -degraded

---Size(MB)--- ----Ports----
Id CagePos Type RPM State Total Free A B Capacity(GB)
8 0:8:0 FC 10 failed 417792 0 1:0:1* 0:0:1 450
95 3:23:0 SSD 150 failed 189440 0 1:0:2- 0:0:2- 200
109 4:13:0? FC 10 failed 417792 0 ----- ----- --
126 0:8:0? SSD 150 degraded 189440 189440 ----- ----- --
-----------------------------------------------------------------------
4 total 1214464 189440
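
For readers who want to work with this table programmatically, here is a minimal Python sketch that splits each `showpd -failed -degraded` row into fields. The `PdRow` class and `parse_showpd_row` function are hypothetical names introduced for illustration. The sketch assumes whitespace-separated columns in the order shown above, and it treats a trailing `?` on CagePos as a flag that the position is a last-known one, which is an interpretation rather than something stated in the thread.

```python
from dataclasses import dataclass


@dataclass
class PdRow:
    pd_id: int
    cage: int
    mag: int
    pos: int
    last_known: bool  # ASSUMPTION: a trailing "?" marks a last-known position
    dev_type: str
    state: str
    total_mb: int
    free_mb: int


def parse_showpd_row(line: str) -> PdRow:
    """Parse one data row: Id CagePos Type RPM State Total Free A B Capacity."""
    f = line.split()
    cage_pos = f[1]
    # CagePos is cage:magazine:disk, e.g. "4:13:0?" is cage 4, magazine 13.
    cage, mag, pos = (int(x) for x in cage_pos.rstrip("?").split(":"))
    return PdRow(
        pd_id=int(f[0]), cage=cage, mag=mag, pos=pos,
        last_known=cage_pos.endswith("?"),
        dev_type=f[2], state=f[4],
        total_mb=int(f[5]), free_mb=int(f[6]),
    )


# Rows copied from the output above.
SAMPLE_ROWS = """\
8 0:8:0 FC 10 failed 417792 0 1:0:1* 0:0:1 450
95 3:23:0 SSD 150 failed 189440 0 1:0:2- 0:0:2- 200
109 4:13:0? FC 10 failed 417792 0 ----- ----- --
126 0:8:0? SSD 150 degraded 189440 189440 ----- ----- --
"""

rows = [parse_showpd_row(line) for line in SAMPLE_ROWS.splitlines()]
for r in rows:
    print(r.pd_id, f"cage={r.cage} mag={r.mag}", r.dev_type, r.state)
print("total MB:", sum(r.total_mb for r in rows))
```

The summed Total column should agree with the `4 total` line printed by the array, which is a quick check that no row was mis-split.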

 

sanhou01 cli% servicemag status -d
Cage 0, magazine 8:
The magazine was successfully brought offline by a servicemag start command.
The command completed Sat Jan 12 20:47:48 2019.
The output of the servicemag start was:
servicemag start -wait -pdid 126
... servicing disks in mag: 0 8
... normal disks:
... not normal disks: WWN [5000CCA02231FC5B] Id [ 8] diskpos [0]
.................... WWN [5000CCA0131BB443] Id [126]
... relocating chunklets to spare space...
... spinning down disk WWN [5000CCA02231FC5B] Id [ 8]
... bypassing mag 0 8
... bypassed mag 0 8
servicemag start -wait -pdid 126 -- Succeeded

Cage 3, magazine 23:
The magazine was successfully brought offline by a servicemag start command.
The command completed Fri May 17 10:19:48 2019.
The output of the servicemag start was:
servicemag start -wait -pdid 95
... servicing disks in mag: 3 23
... normal disks:
... not normal disks: WWN [5000CCA0131BAF63] Id [95] diskpos [0]
... relocating chunklets to spare space...
... spinning down disk WWN [5000CCA0131BAF63] Id [95]
... bypassing mag 3 23
... bypassed mag 3 23
servicemag start -wait -pdid 95 -- Succeeded

sanj_s
HPE Pro
Solution

Re: Failed and degraded disks in 7400

Hello,

For the disk at location 0:8:0

It looks like the disk that failed at this location was an SSD:

126 0:8:0? SSD 150 degraded 189440 189440 ----- ----- --

It was replaced with an FC disk:

8 0:8:0 FC 10 failed 417792 0 1:0:1* 0:0:1 450

The current servicemag status shows that "servicemag start" succeeded for the PD at 0:8:0:

sanhou01 cli% servicemag status -d
Cage 0, magazine 8:
The magazine was successfully brought offline by a servicemag start command.
The command completed Sat Jan 12 20:47:48 2019.
The output of the servicemag start was:
servicemag start -wait -pdid 126
... servicing disks in mag: 0 8
... normal disks:
... not normal disks: WWN [5000CCA02231FC5B] Id [ 8] diskpos [0]
.................... WWN [5000CCA0131BB443] Id [126]
... relocating chunklets to spare space...
... spinning down disk WWN [5000CCA02231FC5B] Id [ 8]
... bypassing mag 0 8
... bypassed mag 0 8
servicemag start -wait -pdid 126 -- Succeeded

The following actions need to be performed for the disk at location 0:8:0:

  1. Replace the PD with an SSD drive of the same capacity.
  2. Execute the command servicemag status -d 0 8
  3. Check whether the servicemag resume starts on its own.
  4. If it does not, manually trigger the resume with the command
    servicemag resume 0 8
  5. Please share the following output as a private message:

showpd -p -cg 0 -mg 8
showpd -p -cg 3 -mg 23
showpd -p -cg 4 -mg 13

showpd -i -p -cg 0 -mg 8
showpd -i -p -cg 3 -mg 23
showpd -i -p -cg 4 -mg 13

showpd -s -p -cg 0 -mg 8
showpd -s -p -cg 3 -mg 23
showpd -s -p -cg 4 -mg 13

servicemag status -d
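
Because each command takes the cage and the magazine as separate arguments, it is easy to transpose them when typing the list by hand. A small Python sketch, under the assumption that CagePos always reads cage:magazine:disk, can derive the arguments from the CagePos strings in the showpd output instead. The function name `verification_commands` is hypothetical; the flags it emits (`-p -cg -mg`, `-i`, `-s`) are only the ones already used in this thread.

```python
def verification_commands(cage_positions, views=("", "-i", "-s")):
    """Build the showpd checks for each affected magazine.

    cage_positions: CagePos strings as printed by showpd, e.g. "4:13:0?".
    views: extra showpd option per pass; "" means the default view.
    """
    cmds = []
    for view in views:
        for cage_pos in cage_positions:
            # "4:13:0?" -> cage "4", magazine "13" (disk position ignored)
            cage, mag, _disk = cage_pos.rstrip("?").split(":")
            parts = ["showpd"]
            if view:
                parts.append(view)
            parts += ["-p", "-cg", cage, "-mg", mag]
            cmds.append(" ".join(parts))
    cmds.append("servicemag status -d")
    return cmds


for cmd in verification_commands(["0:8:0", "3:23:0", "4:13:0?"]):
    print(cmd)
```

For the disk at 4:13:0 this yields `-cg 4 -mg 13`, matching the cage and magazine order used by `servicemag status -d`.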

We will continue with the next PD once we have these results.

Regards,

I am an HPE Employee

hpuser99
Occasional Contributor

Re: Failed and degraded disks in 7400

Hello, thank you. I will start with 0:8:0. I don't know what the original disk size was, i.e. 450 GB or 200 GB. Based on the output below, can I assume I need to buy an HRALP0200GBASSLC, which is 200 GB? Or is there a way to find out? Also, what is the command to make the LED blink at the PD location so I can identify which disk to take out?

sanhou01 cli% showpd -p -cg 0 -mg 8
--Size(MB)--- ----Ports----
Id CagePos Type RPM State Total Free A B Capacity(GB)
8 0:8:0 FC 10 failed 417792 0 1:0:1* 0:0:1 450
126 0:8:0? SSD 150 degraded 189440 189440 ----- ----- --
----------------------------------------------------------------------
2 total 607232 189440
sanhou01 cli% showpd -i -p -cg 0 -mg 8
Id CagePos State ----Node_WWN---- --MFR-- -----Model------ -Serial- -FW_Rev- Protocol MediaType
8 0:8:0 failed 5000CCA02231FC5B HITACHI HCBRE0450GBAS10K KMVWH68F 3P02 SAS Magnetic
126 0:8:0? degraded 5000CCA0131BB443 HITACHI HRALP0200GBASSLC XTVH7A2A 3P00 SAS --
---------------------------------------------------------------------------------------------------

sanhou01 cli% showpd -s -p -cg 0 -mg 8
Id CagePos Type -State-- ---------Detailed_State---------
8 0:8:0 FC failed vacated,invalid_media,servicing
126 0:8:0? SSD degraded missing,no_valid_ports,servicing
----------------------------------------------------------

sanj_s
HPE Pro

Re: Failed and degraded disks in 7400

Hello,

Action Plan

Replace the following disks:

 0:8:0  with SSD disk
 3:23:0 with SSD disk
 4:13:0 with FC disk

Execute the following commands to check the status of each new disk and its PD ID:

 showpd -p -cg 0 -mg 8
 showpd -p -cg 3 -mg 23
 showpd -p -cg 4 -mg 13

 showpd -i -p -cg 0 -mg 8
 showpd -i -p -cg 3 -mg 23
 showpd -i -p -cg 4 -mg 13

 showpd -s -p -cg 0 -mg 8
 showpd -s -p -cg 3 -mg 23
 showpd -s -p -cg 4 -mg 13

Check the status again with the commands below. You may have to check 2-3 times until the output shows the remaining time to complete the resume, which indicates that the servicemag resume has started. Upon completion, the new disk will be shown as normal and the old disk will be dismissed.

 servicemag status -d 0 8
 servicemag status -d 3 23
 servicemag status -d 4 13
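
When repeating the status check several times, it can help to reduce each output to a short label. The Python sketch below matches on three sentences that appear in servicemag status output quoted in this thread; the labels and the name `classify_servicemag_status` are hypothetical, and any other message (including whatever is printed after a successful resume, which the thread does not show) falls through to "unknown".

```python
# Phrases copied from servicemag status output quoted in this thread.
# The labels on the right are illustrative names, not 3PAR terminology.
STATUS_MARKERS = [
    ("A servicemag resume command failed", "resume_failed"),
    ("being brought online due to a servicemag resume", "resuming"),
    ("successfully brought offline by a servicemag start", "offline_awaiting_swap"),
]


def classify_servicemag_status(output: str) -> str:
    """Return a short label for one magazine's status text."""
    for marker, label in STATUS_MARKERS:
        if marker in output:
            return label
    return "unknown"  # do not guess at messages not seen in the thread


example = (
    "Cage 3, magazine 23:\n"
    "The magazine was successfully brought offline by a servicemag start command.\n"
)
print(classify_servicemag_status(example))
```

A magazine labelled "offline_awaiting_swap" is still waiting for the physical replacement, while "resuming" means the check should simply be repeated later.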

After completion, execute the following commands again to confirm that the old PD has been dismissed.

 showpd -p -cg 0 -mg 8
 showpd -p -cg 3 -mg 23
 showpd -p -cg 4 -mg 13

 showpd -i -p -cg 0 -mg 8
 showpd -i -p -cg 3 -mg 23
 showpd -i -p -cg 4 -mg 13

 showpd -s -p -cg 0 -mg 8
 showpd -s -p -cg 3 -mg 23
 showpd -s -p -cg 4 -mg 13

 showpd -c -p -cg 0 -mg 8
 showpd -c -p -cg 3 -mg 23
 showpd -c -p -cg 4 -mg 13

 

Regards,
I am an HPE employee

hpuser99
Occasional Contributor

Re: Failed and degraded disks in 7400

Hello,

I replaced the disk in 0:8 with a 200 GB SSD but got the error below from servicemag. Both SSDs are 200 GB. Please advise.

sanhou01 cli% servicemag status -d 0 8
The magazine is being brought online due to a servicemag resume.
The last status update was at Wed Oct 2 16:18:47 2019.
failed to retrieve time that relocation started, no estimate available
The cumulative output so far is:
servicemag resume 0 8
... mag 0 8 already onlooped
... firmware is current on pd WWN [5000CCA01331143B]
... firmware is current on pd WWN [5000CCA02231FC5B] Id [ 8]
... firmware is current on pd WWN [5000CCA0131BB443] Id [126]
... checking for valid disks...
sanhou01 cli%

sanhou01 cli%
sanhou01 cli% servicemag status -d 0 8
A servicemag resume command failed on this magazine.
The command completed at Wed Oct 2 16:19:47 2019.
failed to retrieve time that relocation started, no estimate available
The output of the servicemag resume was:
servicemag resume 0 8
... mag 0 8 already onlooped
... firmware is current on pd WWN [5000CCA01331143B]
... firmware is current on pd WWN [5000CCA02231FC5B] Id [ 8]
... firmware is current on pd WWN [5000CCA0131BB443] Id [126]
... checking for valid disks...
... checking for valid disks...
... disks in mag : 0 8
... normal disks: WWN [5000CCA01331143B] Id [146] diskpos [0]
... not normal disks: WWN [5000CCA02231FC5B] Id [ 8]
.................... WWN [5000CCA0131BB443] Id [126]
... verifying spare space for disks 8 and 146
Failed --
New disk 146 is smaller than replaced disk 8
servicemag resume 0 8 -- Failed
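
The failure message compares the new disk with PD 8, the 450 GB FC disk, rather than with PD 126, the 200 GB SSD. A minimal Python sketch of that comparison follows. The sizes for PD 8 and PD 126 come from the showpd output earlier in the thread. The size of PD 146 is not shown anywhere, so the value below is an assumption (same as PD 126), and the "new disk must be at least as large" rule is inferred from the error text alone, not from documentation.

```python
# Total size in MB, from the showpd output quoted earlier in the thread.
OLD_PDS_MB = {8: 417792, 126: 189440}

# ASSUMPTION: PD 146 is a 200 GB SSD reporting the same size as PD 126.
# The thread never shows showpd output for PD 146.
NEW_PD_146_MB = 189440


def can_replace(new_mb: int, old_mb: int) -> bool:
    """Inferred rule: the new disk must not be smaller than the old one."""
    return new_mb >= old_mb


for pd_id, old_mb in OLD_PDS_MB.items():
    verdict = "ok" if can_replace(NEW_PD_146_MB, old_mb) else "too small"
    print(f"PD 146 vs PD {pd_id}: {verdict}")
```

Under these assumptions the new SSD passes against PD 126 but not against PD 8, which is consistent with the "New disk 146 is smaller than replaced disk 8" message.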