HPE 3PAR StoreServ Storage

Bob19508
Occasional Contributor

Trying to remove bad disks from 3PAR

I have two failed disks in our 3PAR. We don't have support.

What is the cleanest way to remove these drives? We added one new one in, but I don't want to run a tunesys until I get these drives figured out. #9 should be replaced already ... I just need it out of the list. #24 we still need to fix.

The two drives show the following (output from showpd -s):

9 0:9:0? FC failed vacated,missing,invalid_media,smart_threshold_exceeded unknown
24 0:9:0? FC failed vacated,prolonged_missing unknown


I ran these commands:

cli% dismisspd 9
Error : Pd id 9 is referenced by chunklet 0:242

cli% dismisspd 24
Error : Pd id 24 is referenced by chunklet 3:241

after running this:

movech -perm -ovrd -f 0:241

just another chunklet pops up

cli% showpdch -spr 24
No chunklet information available.
cli% showpdch -spr 9
No chunklet information available.

3PAR - cli% dismisspd 9
Error : Pd id 9 is referenced by chunklet 0:242

admithw -checkonly output:

Checking for drive table upgrade packages
Checking nodes...

Checking volumes...

Checking system LDs...

After an upgrade from 2.2.4 or earlier, if sufficient space is present,
admithw will automatically remove and recreate the preserved data LDs
with a larger set size to improve their availability.

Checking ports...

Checking state of disks...
The following disks are NOT in an acceptable state:
Id CagePos Type -State- --------------------Detailed_State-------------------- -SedState-
9 0:9:0? FC failed vacated,missing,invalid_media,smart_threshold_exceeded unknown
24 0:9:0? FC failed vacated,prolonged_missing unknown
-----------------------------------------------------------------------------------------
2 total

Checking cabling...

Checking cage firmware...

Checking system health...
Checking alert
Checking cabling
Checking cage
Checking cert
Checking dar
Checking date
Checking fs
Checking host
Checking ld
Checking license
Checking network
Checking node
Checking pd
Checking port
Checking rc
Checking snmp
Checking task
Checking vlun
Checking vv
Component -------------------Description-------------------- Qty
Alert New alerts 4
Host Host ports not configured for virtual port support 2
License Licenses which have expired 1
Network Errors detected on network 1
PD PD count exceeds licensed quantity 1
PD Magazines with failed servicemag operations 1
Task Failed Tasks 2


admithw has completed.

Emmanuel_S1
HPE Pro

Re: Trying to remove bad disks from 3PAR

You have to confirm that the mag/drive is empty before doing any replacement.

Here is the typical replacement process to follow:

1. Use the "showpd -c" command and confirm that all the Used chunklets are 0.

If this is a 4-disk mag in a DC4 cage, you will have to vacate the other disks as well. You can then run

servicemag start -log -pdid <PDID> or servicemag start -pdid <PDID>

against the disk to be replaced to put it in service mode or offline mode.

You can then replace the disk physically. **Note: make sure you replace the disk physically within 20-30 minutes to prevent the logging LD from filling up.

2. Once replaced run:

servicemag resume  <cageID> <magID>

- to return the relocated data back to the replacement disk. The old disk entry will be automatically dismissed or you can dismiss it manually once the servicemag resume process is done.

I would recommend double checking the syntax of the command from the CLI that applies to the version of code you have.
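
As a rough sketch of the order of operations described above (not an official procedure), the helper below only prints the commands in sequence and runs nothing against the array. `print_replacement_plan` is a hypothetical name, and the PD id, cage and mag values passed to it are just the ones from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: echoes the replacement steps in order.
# It does NOT execute anything on the array.
print_replacement_plan() {
    local pdid=$1 cage=$2 mag=$3
    echo "showpd -c"                          # step 1: confirm Used chunklets are 0
    echo "servicemag start -pdid ${pdid}"     # put the mag in service mode
    echo "# physically swap the disk now"
    echo "servicemag resume ${cage} ${mag}"   # step 2: return data to the new disk
}

print_replacement_plan 9 0 9
```

Check each printed command against the CLI help for your code version before typing it in.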

 

I am an HPE Employee.


Bob19508
Occasional Contributor

Re: Trying to remove bad disks from 3PAR

Thank you Emmanuel_S1 for replying so quickly!

I think my 3PAR is drunk and confused. It shows "0:9:0?" on two drives. I attached the screenshot.

Here is the output of the commands:

DLSAN09 cli% servicemag start -log -pdid 9
Warning: The -log option of servicemag may reduce the redundancy level of RAID sets until servicemag resume completes successfully.
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
Pd 9 does not have a current cage position.
Its last known position was at cage 0 mag 9 pos 0
Using that to run servicemag on cage 0 mag 9
servicemag start -log -pdid 9
... servicing disks in mag: 0 9
... normal disks: WWN [5000C50068A1D6E8] Id [25] diskpos [0]
... not normal disks: WWN [5000C5006BC98B48] Id [ 9]
.................... WWN [5000C50068A1EF0C] Id [24]

The servicemag start operation will continue in the background.


DLSAN09 cli% servicemag resume 0 9
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag resume 0 9
... mag 0 9 already onlooped
... firmware is current on pd WWN [5000C50068A1D6E8] Id [25]
... firmware is current on pd WWN [5000C5006BC98B48] Id [ 9]
... firmware is current on pd WWN [5000C50068A1EF0C] Id [24]
... checking for valid disks...
... disks in mag : 0 9
... normal disks: WWN [5000C50068A1D6E8] Id [25] diskpos [0]
... not normal disks: WWN [5000C5006BC98B48] Id [ 9]
.................... WWN [5000C50068A1EF0C] Id [24]
... playback chunklets from pd WWN [5000C50068A1D6E8] Id [25]

The servicemag resume operation will continue in the background.

 

Dennis Handly
Acclaimed Contributor

Re: Trying to remove bad disks from 3PAR

> What is the cleanest way to remove these drives?

You don't "remove" drives, you replace them with servicemag.

 

> We added one new one in

You don't add drives.  If you haven't done an admitpd yet, remove that drive and wait until servicemag says to add it.

Looks like it is too late now.  :-(

 

>#9 should be replaced already

If you didn't do a servicemag, you'll have problems.

 

9 0:9:0? FC failed vacated,missing,invalid_media,smart_threshold_exceeded unknown
24 0:9:0? FC failed vacated,prolonged_missing unknown

These "?" say you mistakenly removed the drive without using servicemag.

 

> movech -perm -ovrd -f 0:241

> just another chunklet pops up

This is NOT how to use movech.  You must find ALL chunklets and do them at once, with Linux vector scripting.  But using servicemag is much easier.

 

> I think my 3PAR is confused.   it shows "0:9:0 ?" on two drives.

Only because you used the wrong commands/process.  Instead of using servicemag, you just replaced the drive.

Have you looked at this link:

https://community.hpe.com/t5/3PAR-StoreServ-Storage/Remove-a-PD-permanently/m-p/6965032#U6965127

The magic is:

movech -f -dr -nowait -ovrd -perm $(showpdch -mov -nohdtot | awk '{print $1 ":" $2}')

With "-dr", this will just print the chunklets to be moved.  If it looks reasonable, remove the "-dr" and redo.

You can use "showpdch -mov" to see how it progresses.

When done, you can do: dismisspd 9 24
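
To show what the `$(...)` part of that one-liner does, here is a small sketch that can be run in a local bash shell without touching the array. It assumes, as the awk expression implies, that column 1 of the showpdch output is the PD id and column 2 the chunklet number. The sample rows are made up for illustration (only the two leading columns matter), and `chunklet_pairs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Turn showpdch-style rows into "pd:chunklet" arguments for movech.
# Assumption: field 1 = PD id, field 2 = chunklet number.
chunklet_pairs() {
    awk 'NF >= 2 {print $1 ":" $2}'   # NF >= 2 skips blank lines
}

# Made-up rows standing in for `showpdch -mov -nohdtot` output.
sample='  9 242 placeholder
 24 241 placeholder'

pairs=$(printf '%s\n' "$sample" | chunklet_pairs)

# Left unquoted on purpose so each pair becomes its own argument.
# echo only prints the command that would be built.
echo movech -f -dr -nowait -ovrd -perm $pairs
```

With these sample rows the script prints `movech -f -dr -nowait -ovrd -perm 9:242 24:241`, which is the argument form movech is being handed in the real command.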

Bob19508
Occasional Contributor

Re: Trying to remove bad disks from 3PAR

When I run the "movech -f -dr -nowait -ovrd -perm $(showpdch -mov -nohdtot | awk '{print $1 ":" $2}')" command, I am getting:

can't read "1": no such variable

Dennis Handly
Acclaimed Contributor

Re: Trying to remove bad disks from 3PAR

This of course requires you to use a bash shell with the remote CLI client.

Bob19508
Occasional Contributor

Re: Trying to remove bad disks from 3PAR

I don't have a valid SAID to download that CLI software.

 

Dennis Handly
Acclaimed Contributor

Re: Trying to remove bad disks from 3PAR

> I don't have a valid SAID to download that CLI software.

 

Then you have to change the command to work with ssh.  Most likely just add "ssh 3paradm@<address>" to the start of each CLI command.
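
A minimal sketch of that ssh variant, run from a local bash shell on a workstation rather than from the array's own CLI prompt. The "can't read" message above has the form of a Tcl "no such variable" error, which suggests the array's CLI tried to interpret `$1` itself; here the local shell handles `$(...)` and awk runs locally. `ARRAY`, `run_cli` and `move_stale_chunklets` are hypothetical names, and `<address>` is left as a placeholder to fill in.

```shell
#!/usr/bin/env bash
# Assumption: replace <address> with your array's address.
ARRAY="3paradm@<address>"

# Run one 3PAR CLI command on the array over ssh.
run_cli() {
    ssh "$ARRAY" "$@"
}

# mode "dry" keeps -dr (print only); any other value drops it.
move_stale_chunklets() {
    local mode=$1 pairs
    # showpdch runs on the array; awk runs locally.
    pairs=$(run_cli showpdch -mov -nohdtot | awk '{print $1 ":" $2}')
    if [ -z "$pairs" ]; then
        echo "no chunklets listed" >&2
        return 1
    fi
    if [ "$mode" = dry ]; then
        run_cli movech -f -dr -nowait -ovrd -perm $pairs
    else
        run_cli movech -f -nowait -ovrd -perm $pairs
    fi
}

# Usage (not run automatically):
#   move_stale_chunklets dry    # review the output first
#   move_stale_chunklets real   # then redo without -dr
```

Start with the dry mode and only rerun without it once the printed chunklet list looks reasonable.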