HPE 3PAR StoreServ Storage
 
victor5120
Established Member

HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

There is a failed disk in the storage. I ran "showpd -failed -degraded" to check it, and the disk has been brought offline, as shown below:

The magazine was successfully brought offline by a servicemag start command.

The command completed at Sun Mar xxxxxxxx

servicemag start -wait -pdid 1 -- Succeeded.

-- Replaced the failed disk with a new hard disk and waited a while

I then ran servicemag resume 0 1:

... mag 0 1 already onlooped

...firmware is current on pd WWN [5000C500C1A21C48] Id [16]

...firmware is current on pd WWN [5000C5009991C588] Id [ 1]

...checking for valid disk...

...disks in mag : 0 1

...        normal disks: WWN [5000C500C1A21C48] Id [16] diskpos [0]

... not normal disks:  WWN [5000C5009991C588] Id [ 1]

...verifying spare space for disks 1 and 16

...playback chunklets from pd WWN [5000C500C1A21C48] Id [16]

The servicemag resume operation will continue in the background

---------------------------------------------------------------------------------

I checked the resume status today with "servicemag status", but it showed "No servicemag operations logged".

 

I ran "showpd" and the disk state for pdid 1 still shows as failed. How can I clear the failed state, and is it correct for two disks to show in the same magazine? Disk pdid 1 and disk pdid 16 now both show the same CagePos, 0:1:0.

 

Has anyone encountered this and can help, please?

Cali
Honored Contributor

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hi,

I can only answer your last question:

I ran "showpd" and the disk state for pdid 1 still shows as failed. How can I clear the failed state, and is it correct for two disks to show in the same magazine? Disk pdid 1 and disk pdid 16 now both show the same CagePos, 0:1:0.

Yes. Until the new disk has completely replaced the old disk, both IDs are shown.
For example, if chunklets were moved from the old disk to spare space, the system still needs a reference to them when you replace the disk. The system can then move the chunklets from ID 1 to ID 16. Once no data that was on ID 1 remains, ID 1 is freed.
That is the simplified version, roughly speaking.
Cali

ACP IT Solutions AG
I'm not an HPE employee, so I can be wrong.
victor5120
Established Member

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

@Cali Hi Cali,

Thanks for the info. The issue is that I have replaced the failed disk and run the resume command, but the replaced disk (ID 1) still shows as failed after the resume operation from ID 16 to ID 1 completed. What are the recommended steps? Should I re-run the resume command, or should I bring the disk down again, remove it, re-insert it, and then run the resume command? I was also informed that another disk has failed. Do you think the failed disk is causing more disks to fail, or is it just a hardware issue?

Victor

victor5120
Established Member

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hi All,

 

The steps I carried out last time are as follows:

1) Run "showpd -failed -degraded" to check failed disk:

                           --Size(MB)-- ----Ports----
Id CagePos Type RPM State    Total Free A      B      Capacity(GB)
 1 0:1:0?  FC    10 failed 1142784    0 -----  -----          1200
19 0:19:0  FC    10 failed 1715200    0 -----  -----          1800
------------------------------------------------------------------
 2 total                   2857984    0

2) Run "servicemag status" to confirm failed disk has been brought down

Cage 0, magazine 1:
The magazine was successfully brought offline by a servicemag start command.
The command completed at Sat Mar 10 08:42:39 2024.
servicemag start -wait -pdid 1 -- Succeeded

3) Replace the failed disk

4) Run "servicemag resume 0 1"

5) After the resume operation completed, the replaced disk still showed as failed, as below:
XXXXXXXX cli% showpd -s
Id CagePos Type -State- --------------------Detailed_State--------------------- -SedState--
 0 0:0:0   FC   normal  normal                                                  not_capable
 1 0:1:0?  FC   failed  vacated,missing,invalid_media,failed_hardware,servicing unknown
 2 0:2:0   FC   normal  normal                                                  not_capable
 3 0:3:0   FC   normal  normal                                                  not_capable
 4 0:4:0   FC   normal  normal                                                  not_capable
 5 0:5:0   FC   normal  normal                                                  not_capable
 6 0:6:0   FC   normal  normal                                                  not_capable
 7 0:7:0   FC   normal  normal                                                  not_capable
 8 0:8:0   FC   normal  normal                                                  not_capable
 9 0:16:0  FC   normal  normal                                                  not_capable
10 0:10:0  FC   normal  normal                                                  not_capable
11 0:11:0  FC   normal  normal                                                  not_capable
12 0:9:0   FC   normal  normal                                                  not_capable
13 0:13:0  FC   normal  normal                                                  not_capable
14 0:14:0  FC   normal  normal                                                  not_capable
15 0:15:0  FC   normal  normal                                                  not_capable
16 0:1:0   FC   normal  relocating,servicing                                    not_capable
17 0:17:0  FC   normal  normal                                                  not_capable
18 0:18:0  FC   normal  normal                                                  not_capable
19 0:19:0  FC   failed  vacated,missing,invalid_media,slow_drive,servicing      unknown
20 0:12:0  FC   normal  normal                                                  not_capable
-------------------------------------------------------------------------------------------
21 total

XXXXXXXX cli% showpd -i 1
Id CagePos State ----Node_WWN---- --MFR-- -----Model------ -Serial- -FW_Rev- Protocol MediaType -----AdmissionTime-----
1 0:1:0? failed 5000C5009991C588 SEAGATE STHB1200S5xeN010 **Confidential info erased** 3P03 SAS Magnetic 2017-01-11 03:40:32 SGT
------------------------------------------------------------------------------------------------------------------------
1 total

XXXXXXXX cli% showpd -c 1
                             -------- Normal Chunklets -------- ---- Spare Chunklets ----
                             - Used -- -------- Unused -------- - Used - ---- Unused ----
Id CagePos Type State  Total   OK Fail Free Uninit Unavail Fail  OK Fail Free Uninit Fail
 1 0:1:0?  FC   failed  1116    0    0    0      0    1026   90   0    0    0      0    0
-----------------------------------------------------------------------------------------
 1 total                1116    0    0    0      0    1026   90   0    0    0      0    0

XXXXXXXX cli% showpd -c 16
                             -------- Normal Chunklets -------- ---- Spare Chunklets ----
                             - Used -- -------- Unused -------- - Used - ---- Unused ----
Id CagePos Type State  Total   OK Fail Free Uninit Unavail Fail  OK Fail Free Uninit Fail
16 0:1:0   FC   normal  1116 1031    0    0      0      70    0  15    0    0      0    0
-----------------------------------------------------------------------------------------
 1 total                1116 1031    0    0      0      70    0  15    0    0      0    0

XXXXXXXX cli% showpdch -from 1
No chunklet information available.
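
For anyone who wants to spot this situation quickly, here is a minimal Python sketch that scans saved `showpd -s` text for disks that are not in the normal state and for cage positions reported by more than one PD (here pd 1 and pd 16 at 0:1:0). It assumes the columns are whitespace-separated as in the listing above. The function name `summarize_showpd_s` is made up for illustration, and treating a trailing "?" as a last-known position is an assumption, so check it against the CLI documentation for your 3PAR OS version.

```python
from collections import defaultdict

# A few rows copied from the `showpd -s` listing in this thread.
SAMPLE = """\
Id CagePos Type -State- --------------------Detailed_State--------------------- -SedState--
0 0:0:0 FC normal normal not_capable
1 0:1:0? FC failed vacated,missing,invalid_media,failed_hardware,servicing unknown
16 0:1:0 FC normal relocating,servicing not_capable
19 0:19:0 FC failed vacated,missing,invalid_media,slow_drive,servicing unknown
-------------------------------------------------------------------------------------------
21 total
"""


def summarize_showpd_s(text):
    """Return (not_normal, shared) parsed from `showpd -s` style text.

    not_normal: {pd_id: detailed_state} for disks whose State is not "normal".
    shared:     {cage_pos: [pd_ids]} for positions reported by more than one PD.
    """
    not_normal = {}
    by_pos = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields and start with a numeric PD id; this
        # skips the header, the dashed rule, and the "total" line.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        pd_id, cage_pos, _type, state, detail, _sed = fields
        # Assumption: a trailing "?" marks a last-known position, so it is
        # stripped before grouping PDs by position.
        by_pos[cage_pos.rstrip("?")].append(int(pd_id))
        if state != "normal":
            not_normal[int(pd_id)] = detail
    shared = {pos: ids for pos, ids in by_pos.items() if len(ids) > 1}
    return not_normal, shared


not_normal, shared = summarize_showpd_s(SAMPLE)
print(sorted(not_normal))  # [1, 19]
print(shared)              # {'0:1:0': [1, 16]}
```

A shared position only tells you that the old PD entry has not been cleaned up yet. It does not say whether waiting or manual action is the right next step.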

 

I have brought down the magazine again to carry out my next action plan as below:

Current status:
XXXXXXXX cli% servicemag status
Cage 0, magazine 1:
The magazine is being brought offline due to a servicemag start.
The last status update was at Mon Mar 25 09:09:11 2024.
Chunklets relocated: 15 in 29 minutes and 27 seconds
Chunklets remaining: 1012
Chunklets marked for moving: 1012
Estimated time for relocation completion based on 117 seconds per chunklet is: 1 days, 8 hours, 53 minutes and 24 seconds
servicemag start -pdid 1 -- is in Progress

 

Next action plan:

1. Bring down failed disk once again - in progress
servicemag start -pdid 1
2. Reinsert the disk
3. Stop allocation of chunklets on this pd
setpd ldalloc off 1
4. Reallocate any used chunklets to other disks of the same type
movepdtospare -f -vacate 1
5. Remove any spare chunklets
removespare 1
6. Remove the failed pd
dismisspd 1
admithw

Please let me know if I missed anything in my plan. Thank you.

Link that I referred to: https://community.hpe.com/t5/hpe-3par-storeserv-storage/3par-8200-cannot-dismisspd-quot-pd-is-in-use-quot/td-p/7069384

 

Cali
Honored Contributor

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Just wait.

This means:

Chunklets relocated: 15 in 29 minutes and 27 seconds
Chunklets remaining: 1012
Chunklets marked for moving: 1012

There are still 1012 chunklets that need to be moved, and each chunklet takes about 2 minutes (117 seconds, according to the status output).

Re-issue the status command and check whether the "Chunklets remaining" count is going down.
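
The estimate in the status output can be reproduced from the three "Chunklets" lines, which is a handy check when comparing two status snapshots. A minimal Python sketch follows. It assumes the elapsed time is always printed as "N minutes and N seconds" as in this thread, and the function names are illustrative only. Rounding the per-chunklet time down happens to match the 117 seconds shown here, but how the array actually rounds is an assumption.

```python
import re

# Lines copied from the `servicemag status` output in this thread.
STATUS = """\
Chunklets relocated: 15 in 29 minutes and 27 seconds
Chunklets remaining: 1012
Chunklets marked for moving: 1012
"""


def estimate_remaining(status_text):
    """Return (seconds_per_chunklet, remaining_seconds) from status text."""
    moved = re.search(
        r"Chunklets relocated: (\d+) in (\d+) minutes and (\d+) seconds",
        status_text,
    )
    left = re.search(r"Chunklets remaining: (\d+)", status_text)
    if not moved or not left:
        raise ValueError("expected 'Chunklets relocated' and 'Chunklets remaining' lines")
    count, minutes, seconds = map(int, moved.groups())
    if count == 0:
        raise ValueError("no chunklets relocated yet, cannot estimate a rate")
    elapsed = minutes * 60 + seconds
    # Assumption: round down to whole seconds per chunklet (1767 // 15 = 117).
    per_chunklet = elapsed // count
    return per_chunklet, per_chunklet * int(left.group(1))


def format_duration(total_seconds):
    """Format seconds in the same style as the servicemag estimate line."""
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days} days, {hours} hours, {minutes} minutes and {seconds} seconds"


per_chunklet, remaining = estimate_remaining(STATUS)
print(per_chunklet)                # 117
print(format_duration(remaining))  # 1 days, 8 hours, 53 minutes and 24 seconds
```

If "Chunklets remaining" drops between two runs of the status command, the relocation is progressing and the estimate should shrink accordingly.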

Cali

ACP IT Solutions AG
I'm not an HPE employee, so I can be wrong.
Ramesh_Kumar_P
HPE Pro

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hello,

Could you please share the output of the commands below so we can check the current status?

showpd -p -cg 0 -mg 1
showpd -failed -degraded
servicemag status -d

 

 

Thank you!

I'm an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
victor5120
Established Member

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hi Cali,

 

Thanks for the info; I appreciate your help.

The issue was that I never dismissed the old PD, which is why we were still seeing the failed state when checking.

 

Victor

victor5120
Established Member

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hi Ramesh,

Thanks for your reply but we managed to resolve it.

 

Victor 

Sunitha_Mod
Moderator

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hello @victor5120,

That's Awesome! 

We are glad to know you were able to find the solution, and we appreciate you keeping us posted.

It would be great if you could provide the steps you have taken to address the problem, as it will assist other users in the Community.

Thanks,
Sunitha G
I'm an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Ramesh_Kumar_P
HPE Pro

Re: HPE 3Par 8200 - Disk still showing failed after replaced the new harddisk

Hello Victor,

 

Nice to hear that the issue has been resolved. It was nice working with you. Please feel free to get in touch with us if you need any further assistance.

 

Thank you!

I'm an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]