HPE 3PAR StoreServ Storage

Re: Evicting Disks from 3PAR Array (dismisspd)

 
Torsten.
Acclaimed Contributor


I'm sure the root cause is the same WWN carried over with the "bridge" in the disk magazine.
The systems will "remember" this WWN as bad forever.

Hope this helps!
Regards
Torsten.

3padm
Advisor


Hi,

 

Sorry I haven't replied sooner; I'm working UK time. In short, it looked like the issue was resolved, as we managed to track down a new caddy. The array picked up the new disk in the new caddy, gave it a new ID, and off it went relocating chunklets... until this morning, when I ran servicemag status and this is what I got, grrrrr...

 

3PAR-001 cli% servicemag status -d 2 5
A servicemag resume command failed on this magazine.
The command completed at Thu Aug 29 00:26:20 2013.
The output of the servicemag resume was:
servicemag resume 2 5
... mag 2 5 already onlooped
... upgrading firmware on pd WWN [2210000A330077F7]...
... firmware is current on pd WWN [2210000A330066F4] Id [28]
... checking for valid disks...
... checking for valid disks...
...   disks in mag  : 2 5
...      normal disks:  WWN [2210000A330077F7] Id [ 4]  diskpos [0]
...  not normal disks:  WWN [2210000A330066F4] Id [28]
... verifying spare space for disks 28 and 4
... playback chunklets from pd WWN [2210000A330077F7] Id [ 4]
... All chunklets played back / relocated.
... cleared logging mode for cage 2 mag 5
... relocating chunklets from spare space
... chunklet 5:167 - move_error,move_failed, failed move
... chunklet 5:169 - move_error,move_failed, failed move
... chunklet 5:171 - move_error,move_failed, failed move
... chunklet 5:173 - move_error,move_failed, failed move
... chunklet 5:174 - move_error,move_failed, failed move
... chunklet 5:175 - move_error,move_failed, failed move
... chunklet 5:176 - move_error,move_failed, failed move
... chunklet 5:177 - move_error,move_failed, failed move

......................this goes on for a while....
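When the output runs this long, it can help to summarise the failed moves rather than read them line by line. Below is a minimal Python sketch that does this against the text pasted above. It is not a 3PAR tool; the function names are made up for illustration, and it assumes the number before the colon in "5:167" is the PD ID and the number after it is the chunklet number.

```python
import re
from collections import Counter

# Sample lines copied from the servicemag resume output in this post.
SAMPLE = """\
... relocating chunklets from spare space
... chunklet 5:167 - move_error,move_failed, failed move
... chunklet 5:169 - move_error,move_failed, failed move
... chunklet 5:171 - move_error,move_failed, failed move
"""

# Matches lines such as "... chunklet 5:167 - move_error,move_failed, failed move".
FAILED_MOVE = re.compile(r"chunklet\s+(\d+):(\d+)\s+-\s+.*failed move")


def failed_moves(text):
    """Return (pd_id, chunklet_number) pairs for every failed-move line.

    Assumption: the value before the colon is the PD ID.
    """
    matches = (FAILED_MOVE.search(line) for line in text.splitlines())
    return [(int(m.group(1)), int(m.group(2))) for m in matches if m]


def failures_per_pd(text):
    """Count failed moves per PD ID."""
    return Counter(pd_id for pd_id, _ in failed_moves(text))


if __name__ == "__main__":
    print(failed_moves(SAMPLE))
    print(failures_per_pd(SAMPLE))
```

Saving the full servicemag status output to a file and passing its contents to failures_per_pd() would show whether every failure involves the same PD.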

 

For around 4-6 hours it looked like it was OK and doing what it should. Within the GUI everything looks OK, but under the 'Device Protocol' column all devices are showing up as SATA apart from the new disk; this is showing up as FC. I have attached this as a screenshot.

 

 

 

Thanks

 

Dennis Handly
Acclaimed Contributor


>I'm sure the root cause is the same WWN carried over with the "bridge" in the disk magazine.
>The systems will "remember" this WWN as bad forever.

 

And of course I kind of agree and disagree.

Ah, it appears the F class does have a bridge, SATA<->FC:

40 3:5:0   failed 2210000A330066AB SEAGATE ST31000340NS 9QJ6S683 XR38,1610 SATA     Magnetic

 

That ",1610" is the bridge FW version.
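As a rough illustration of reading that field, here is a small Python sketch that splits the line quoted above into columns and separates the disk firmware from the bridge firmware. The column labels are my own guesses based on this single line, not official showpd headings, and the function names are hypothetical.

```python
# The PD line quoted above, copied verbatim.
LINE = ("40 3:5:0   failed 2210000A330066AB SEAGATE ST31000340NS "
        "9QJ6S683 XR38,1610 SATA     Magnetic")

# Assumed column labels, inferred from the one line quoted in this thread.
COLUMNS = ["id", "cage_pos", "state", "wwn", "vendor",
           "model", "serial", "fw", "protocol", "media"]


def split_firmware(field):
    """Split 'XR38,1610' into (disk_fw, bridge_fw); bridge_fw is None if absent."""
    disk_fw, sep, bridge_fw = field.partition(",")
    return disk_fw, (bridge_fw if sep else None)


def parse_pd_line(line):
    """Turn one whitespace-separated PD line into a dict keyed by COLUMNS."""
    record = dict(zip(COLUMNS, line.split()))
    record["disk_fw"], record["bridge_fw"] = split_firmware(record["fw"])
    return record


if __name__ == "__main__":
    pd = parse_pd_line(LINE)
    print(pd["model"], pd["disk_fw"], pd["bridge_fw"], pd["protocol"])
```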

 

But if sysmgr can't talk to the disk (loop problems), it will still remember the old WWN.

That's why a "showcage -d" is needed.  The cage may see the new WWN.  Or it may not see anything at all.

Dennis Handly
Acclaimed Contributor


>as SATA apart from the new Disk this is showing up as FC.

 

I've seen this happen with SAS when it has problems with disk models.

 

>chunklet 5:167 - move_error,move_failed, failed move

 

You may want to look at the event logs to see if there are more details.

3padm
Advisor


Hi,

 

The disk models appear to be the same, 'ST31000340NS', for the existing and new disk. Going back to the original question, how to evict a disk: I followed your steps and the output is attached. Though I was a little cheeky and ran 'removespare' before showpdch. Interestingly, it complains about the same chunklet '5:167' as you have below, which is the first one in the list...!

 

Thanks

Dennis Handly
Acclaimed Contributor


>I followed your steps and attached is the output.

 

You had listed PD 40 as bad but your steps have PD 4!

 

3Par-001 cli% removespare 4:a
Are you sure you want to remove spares?
select q=quit y=yes n=no: y
107 spares removed


3Par-001 cli% dismisspd 4
Error : Pd id 4 is referenced by chunklet 5:167
3Par-001 cli% showpdch -mov -from 4

 

Perhaps you were trying to work on PD 28 which moved to PD 4?

 I picked 40 since you hadn't done a servicemag on it.

 

>ran 'removespare' before showpdch.

 

The ordering isn't a problem because showpdch checks used before spare.  But you have removed the spares from the wrong disk.  Unfortunately createspare needs to be told exactly which chunklets to make spare on PD 4.

 

To put them back, I would need the output of:

$ showpdch -a 4

 

But this may be moot if PD 4 is bad too.

 

>it complains about the same chunklet '5:167' as you have below which is the first one in the list!

 

Yes, that's the first one to be moved back.

(I guess these were the ones moved from PD 28 to PD 5.)

 

Unfortunately we don't know what this means:

.. chunklet 5:167 - move_error,move_failed, failed move

 

Did it fail copying from PD 5?  Or copying to your new disk, meaning PD 4 is bad?

Anything more in the event logs or alerts?

3padm
Advisor


Hi,

 

Sorry, I should have stated: yes, PD 4 is bad too. See screenshot. Kinda stuck on this one.

 

 

 

I'll check the logs shortly.

 

Thanks

Dennis Handly
Acclaimed Contributor


>PD 4 is bad too.

 

I think PD 4 is the new number assigned when you did a servicemag on PD 28.  It has the same cage position.
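To check that reading against the servicemag output posted earlier, here is a minimal Python sketch that pulls the WWN and PD ID of each disk listed for the magazine. It only parses the pasted text; the regex and function name are my own and assume the "normal disks" / "not normal disks" line format shown in that output.

```python
import re

# The two disk lines from the earlier servicemag resume output for mag 2 5.
SAMPLE = """\
...      normal disks:  WWN [2210000A330077F7] Id [ 4]  diskpos [0]
...  not normal disks:  WWN [2210000A330066F4] Id [28]
"""

# Captures the state ("normal" or "not normal"), the WWN, and the PD ID.
DISK = re.compile(
    r"(not normal|normal) disks:\s+WWN \[([0-9A-F]+)\] Id \[\s*(\d+)\]"
)


def magazine_disks(text):
    """Return one dict per disk line found in the servicemag output."""
    disks = []
    for line in text.splitlines():
        m = DISK.search(line)
        if m:
            disks.append(
                {"state": m.group(1), "wwn": m.group(2), "pd_id": int(m.group(3))}
            )
    return disks


if __name__ == "__main__":
    for disk in magazine_disks(SAMPLE):
        print(disk)
```

On the pasted output this lists PD 4 as normal and PD 28 as not normal, each with a different WWN, both under the same magazine.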

 

>>Did it fail copying from PD 5?  Or copying to your new disk, meaning PD 4 is bad?

 

What do these show:

showpd -s 4 5

showpd -i 4 5

showcage -d cage2