
HP requests your feedback on 'fcmsutil replace_dsk'

 
marie-noelle jeanson_1
Trusted Contributor


Hi,

HP is seeking users' input on the usefulness and user-friendliness of the replace_dsk feature of fcmsutil.

Question:
Do you like the fact that the FC driver denies access to an FC device after detecting a WWN change (for instance, after a bad disk has been replaced), and that it requires the user to run the command 'fcmsutil replace_dsk' (or reboot) to re-enable access to the new device? (A log message is generated telling the user to do so.)
Are you aware of the replace_dsk command? Have you used it? Do you think it needs to be more clearly documented?

Would you prefer the FC driver to act differently? How?

Background:

The Fibre Channel driver uses the World Wide Name (WWN) to keep track of the FC devices it sees. When the driver is initialized at boot, it creates a mapping of all FC devices; each entry contains the device's WWN and its physical hardware path. The FC driver periodically performs a sanity check to make sure that all of the devices are still at the same physical hardware paths. If the driver detects that the WWN for a given device has changed, it issues an error message and then denies access to the device. This guards against devices such as disk drives being accidentally pulled from one slot and placed into a different one. You will also get these messages when you replace a bad FC disk drive.

In order to get around this, you can either run the fcmsutil command with the "replace_dsk -l" option or just reboot. The fcmsutil command tells the FC driver to forget the previous WWN mapping and create a new one. Basically, you're telling the FC driver that you have made a hardware change and that the change is correct. The reboot method works because these mappings are normally created at boot, so the FC driver assumes that the mapping it is creating is the correct one.
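
To make the behaviour above concrete, here is a small Python toy model of the bookkeeping. It is only a sketch of the logic as described in this post, not driver code: the class name, method names, path labels and WWN strings are all made up for illustration.

```python
class FcDriverModel:
    """Toy model of the WWN bookkeeping described above (not HP-UX driver code)."""

    def __init__(self, devices_at_boot):
        # Mapping built at initialization: hardware path -> WWN seen at boot.
        self.known_wwn = dict(devices_at_boot)
        # Paths to which access is currently denied.
        self.denied = set()

    def sanity_check(self, devices_now):
        """Compare the WWN now seen at each path with the recorded one."""
        messages = []
        for path, wwn in devices_now.items():
            expected = self.known_wwn.get(path)
            if expected is not None and expected != wwn:
                self.denied.add(path)
                messages.append(
                    f"WWN changed at {path}: expected {expected}, saw {wwn}"
                )
        return messages

    def can_access(self, path):
        return path not in self.denied

    def replace_dsk(self, path, devices_now):
        """Operator confirms the change: forget the old WWN, record the new one."""
        self.known_wwn[path] = devices_now[path]
        self.denied.discard(path)


# Hypothetical paths and WWNs, for illustration only.
driver = FcDriverModel({"slot1": "wwn-A", "slot2": "wwn-B"})
after_swap = {"slot1": "wwn-A", "slot2": "wwn-C"}  # disk in slot2 replaced

for line in driver.sanity_check(after_swap):
    print(line)
print(driver.can_access("slot2"))   # access denied after the mismatch

driver.replace_dsk("slot2", after_swap)
print(driver.can_access("slot2"))   # access restored
```

In this model a reboot corresponds to building a fresh FcDriverModel from whatever devices are present at that moment, which is why the new WWN is accepted without complaint.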

All thoughts welcome. Thanks!

Marie-No
Rita C Workman
Honored Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

It's been so long since any disk has gone bad. And now that all our sites run with SANs, I suppose this is something I need to get more familiar with.

My preference? Well, rebooting is rarely an option Mgmt likes. Since generally we're only replacing one disk and things continue to run via the mirror, I would have to say I would prefer to use the replace_dsk command.

The only concern I might have is that our disk hardware vendor could change out a bad disk when I'm not there. My concern is: since we have a mirrored environment, would this FC sanity check cause processing to halt (i.e. not continue off the mirror) while waiting for a response?
Running a 24 x 7 shop, that could be a problem.

Thanks,
Rita
marie-noelle jeanson_1
Trusted Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Rita,

This replace_dsk only affects the path on which a name authentication failure was observed. It does not affect the mirror if the mirror is using a different path. If the mirror is using the same path, then, it can affect the mirror as well.
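
As a rough illustration of that point, here is a tiny Python sketch. The function name and the path and leg labels are hypothetical, and it models only the statement above, not how LVM or the FC driver actually work.

```python
def usable_legs(mirror_legs, denied_paths):
    """Return the mirror legs whose path is not currently denied.

    mirror_legs: dict of leg name -> path label (hypothetical labels).
    denied_paths: set of paths on which a name authentication failure was seen.
    """
    return sorted(
        leg for leg, path in mirror_legs.items() if path not in denied_paths
    )


denied = {"path-1"}  # the path where the WWN change was detected

separate_paths = {"primary": "path-1", "mirror": "path-2"}
shared_path = {"primary": "path-1", "mirror": "path-1"}

print(usable_legs(separate_paths, denied))  # mirror leg is still usable
print(usable_legs(shared_path, denied))     # both legs sit on the denied path
```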

Regards,

Marie-Noelle
John Payne_2
Honored Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

I didn't even know there was a replace_dsk option. I think it needs to be documented, at least when to use it, etc.

It looks like this option would have fixed a problem I had 3 years ago, had I known.

We use SecurePath for our SAN LUNs. Would we ever need to use the command at this point?

John
Spoon!!!!
Deoncia Grayson_1
Honored Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Since we have been replacing disks on a pretty regular basis, the replace_dsk command has been highly useful. The driver keeps logging the error until you actually run replace_dsk, and since a reboot is not an option management enjoys, I find that particular command useful. I didn't even know about the replace_dsk option until we changed a disk, so more documentation would be appreciated.
If no one ever took risks, Michelangelo would have painted the Sistine floor. -Neil Simon
marie-noelle jeanson_1
Trusted Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Hi all,

Thanks for your input.
I truly recognize that there is a lack of documentation regarding the need to run the 'fcmsutil replace_dsk' command after a device has been replaced. It looks like only the man page explains it today. I will try to get this point addressed.
Thanks,

Marie-Noelle
Florian Heigl (new acc)
Honored Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Hey there - I've noticed that command option but never ever used it, first because we don't have FC10s or similar JBOD storage, so I never got into the situation of replacing a disk and ending up with a different WWN for it.

I'm quite glad to finally know what it does, but I think denying access is quite a harsh method. On the other hand, I *once* [1] actually mixed up the (SCSI root) disks I wanted to replace, and thinking about it a second time, this might be a really good idea. But awareness of this command would have to be raised a bit (think HW recovery manuals...).

Florian

[1] I still owe coffee to a lot of people, I think.
yesterday I stood at the edge. Today I'm one step ahead.
Devender Khatana
Honored Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Hi,

In a critical system environment like ours, we cannot afford a reboot after replacing a disk. Although we do not need this ourselves, as we are on a SAN with XPs, 'replace_dsk -l' is still the best option to provide for those who can use it. And obviously more documentation should be provided for such useful tools.

Regards,
Devender
Impossible itself mentions "I m possible"
Steven E. Protter
Exalted Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Interesting.

I like being able to handle these issues without rebooting. Management doesn't like boots, users don't either.

A note, because Rita's comment hit home.

We just had a SAN melt down and take 1.3 TB of data with it, two weeks ago this Tuesday.

It wasn't a disk that went, it was the backplane, I guess you'd call it, and the memory got corrupted, along with all our data.

Even though SANs are better and more reliable, they do occasionally fail.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Borislav Perkov
Respected Contributor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Hi Marie-Noelle,

I find this command useful when I have replaced bad disks without rebooting.

It should be more clearly documented.

But I have problems with CommandView (ver. A.1.05), where it reports to EMS that a disk is missing (it gives a message that the WWN is different); in fact, the WWN of the inserted disk is not registered. After a reboot, that stops.

Regards,
Borislav
Arief
Occasional Advisor

Re: HP requests your feedback on 'fcmsutil replace_dsk'

Just wanted to clarify that "replace_dsk" is actually not specific to FC disks or JBODs. It works with all types of FC devices.

It is needed anytime you replace an existing FC device without rebooting the host.

HP FC HBAs remember the WWN that should be at a particular N_Port ID. If an HBA sees a different WWN there, it will disallow I/Os to that N_Port ID until "replace_dsk" is run or the host is rebooted.