11-25-2004 10:04 PM
fcmsutil replace_dsk command to be issued?
Hi,
I'm seeing an explosion of errors from the kernel's FC driver in syslog.log.
For today alone we have already accumulated this many:
[root@saturn:/root]
# grep -cE '^Nov 26.+Fibre' /var/adm/syslog/syslog.log
20873
The errors indicate that the driver has a problem with the WWN.
[root@saturn:/root]
# tail /var/adm/syslog/syslog.log
Nov 26 11:33:49 saturn vmunix: fcmsutil(1M) command's replace_dsk option to allow the new device to be used.
Nov 26 11:33:49 saturn vmunix: 1/2/0/0: Fibre Channel Driver detected a parse error in the FLOGI/PLOGI response
Nov 26 11:33:49 saturn vmunix: returned by nport ID 0x772213. FLOGI/PLOGI Fail Code = 0x6.
Nov 26 11:33:49 saturn vmunix:
Nov 26 11:33:51 saturn vmunix: 1/2/0/0: 'World-wide name' (unique identifier) for device at nport ID 0x772213 has
Nov 26 11:33:51 saturn vmunix: changed. If the device has been replaced intentionally, please use the
Nov 26 11:33:51 saturn vmunix: fcmsutil(1M) command's replace_dsk option to allow the new device to be used.
Nov 26 11:33:51 saturn vmunix: 1/2/0/0: Fibre Channel Driver detected a parse error in the FLOGI/PLOGI response
Nov 26 11:33:51 saturn vmunix: returned by nport ID 0x772213. FLOGI/PLOGI Fail Code = 0x6.
Nov 26 11:33:51 saturn vmunix:
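The first line of each driver report carries the adapter's hardware path right after "vmunix:", so a tally per path shows which HBA the messages refer to. The following is only a sketch: it assumes every syslog entry sits on one physical line in the file (the line breaks in a terminal listing come from the screen width), and the function name fc_msgs_per_path is made up for illustration.

```shell
# Tally Fibre Channel driver reports per hardware path.
# Reads syslog text on stdin and prints "<hw path> <count>" per path.
# Only lines of the form "... vmunix: <hw path>: <text>" that mention the
# FC driver or a World-wide name are counted; continuation lines without
# a hardware path are skipped.
fc_msgs_per_path() {
    awk '
        $5 == "vmunix:" && $6 ~ /^[0-9]+(\/[0-9]+)*:$/ &&
        (/Fibre Channel Driver/ || /World-wide name/) {
            path = $6
            sub(/:$/, "", path)      # strip the trailing colon
            n[path]++
        }
        END { for (p in n) print p, n[p] }
    ' | sort
}

# Usage (hypothetical invocation on the affected node):
#   grep '^Nov 26' /var/adm/syslog/syslog.log | fc_msgs_per_path
```

If all reports name the same path, as in the excerpt above (1/2/0/0), only the links through that one adapter are involved.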
We recently had problems with a SAN switch (it all happened shortly after our SAN admin left, but I hope that is only a coincidence).
Because of that, the driver was constantly switching between the primary and alternate paths of the shared disks, which severely degraded disk I/O performance.
I'm not familiar with FC and SAN technology (because I've never had access to the devices and tools used by the SAN admins).
So once we had identified the flip-flopping SAN switch, the only immediate remedy for me was to vgreduce the primary path from all shared VGs and vgextend it again, just to swap the roles of the primary and alternate paths.
That seemed to help on the other cluster nodes, but not on one (saturn), where syslogd now logs the errors above.
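The path swap described above can be sketched as a dry run. This is an illustration, not the exact sequence that was run: the helper name swap_pv_link is hypothetical, the volume group and device names are taken from the vgdisplay output in this post, and the assumption is that removing the current primary link with vgreduce promotes the remaining link, so that vgextend re-adds the removed one as the alternate. The function only prints the commands so they can be reviewed first.

```shell
# Print (do not execute) the LVM commands that turn the current primary
# link of a multi-pathed PV into the alternate link.
# $1 = volume group, e.g. /dev/vgbz
# $2 = device file of the current primary link, e.g. /dev/dsk/c7t5d6
swap_pv_link() {
    vg=$1
    link=$2
    printf '%s\n' "vgreduce $vg $link"   # drop the link; the other path takes over
    printf '%s\n' "vgextend $vg $link"   # re-add it; it now ranks as alternate
}

# Usage (hypothetical, one call per LUN of the volume group):
#   for d in c7t5d6 c7t5d7 c7t6d0; do
#       swap_pv_link /dev/vgbz /dev/dsk/$d
#   done
```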
Thankfully, the error messages are quite clear: they name the faulty HW path and even suggest running "fcmsutil replace_dsk ...".
I asked our new SAN admin whether the WWN for this fabric had changed and, if so, to tell me the new one.
She gave me a WWN that differs from the one fcmsutil displays for the device to which the Tachyon controller at the above path is attached.
[root@saturn:/root]
# ioscan -knfCfc
Class  I  H/W Path  Driver  S/W State  H/W Type   Description
=================================================================
fc     0  0/4/0/0   fcT1    CLAIMED    INTERFACE  HP Fibre Channel Mass Storage Adapter
fc     4  1/0/0/0   td      CLAIMED    INTERFACE  HP Tachyon XL2 Fibre Channel Mass Storage Adapter
                    /dev/td4
fc     2  1/2/0/0   td      CLAIMED    INTERFACE  HP Tachyon XL2 Fibre Channel Mass Storage Adapter
                    /dev/td2
fc     1  1/10/0/0  fcT1    CLAIMED    INTERFACE  HP Fibre Channel Mass Storage Adapter
[root@saturn:/root]
# vgdisplay -v vgbz|grep PV\ Name
PV Name /dev/dsk/c10t5d6
PV Name /dev/dsk/c7t5d6 Alternate Link
PV Name /dev/dsk/c10t5d7
PV Name /dev/dsk/c7t5d7 Alternate Link
PV Name /dev/dsk/c10t6d0
PV Name /dev/dsk/c7t6d0 Alternate Link
[root@saturn:/root]
# ioscan -knf -H 1/2/0/0|awk '$1~/c7t5/{print$1}'
/dev/dsk/c7t5d0
/dev/dsk/c7t5d1
/dev/dsk/c7t5d2
/dev/dsk/c7t5d3
/dev/dsk/c7t5d4
/dev/dsk/c7t5d5
/dev/dsk/c7t5d6
/dev/dsk/c7t5d7
[root@saturn:/root]
# /opt/fcms/bin/fcmsutil /dev/td2
Vendor ID is = 0x00103c
Device ID is = 0x001029
XL2 Chip Revision No is = 2.3
PCI Sub-system Vendor ID is = 0x00103c
PCI Sub-system ID is = 0x00128c
Topology = PTTOPT_FABRIC
Link Speed = 2Gb
Local N_Port_id is = 0x771b13
N_Port Node World Wide Name = 0x50060b000021e0a5
N_Port Port World Wide Name = 0x50060b000021e0a4
Driver state = ONLINE
Hardware Path is = 1/2/0/0
Number of Assisted IOs = 68151
Number of Active Login Sessions = 0
Dino Present on Card = NO
Maximum Frame Size = 960
Driver Version = @(#) libtd.a HP Fibre Channel Tachyon TL/TS/XL2 Driver B.11.00.10 (AR1201) /ux/core/kern/wsio/td_glue.c: Oct 11 2001, 11:54:14
I would now like to know whether I have to run
fcmsutil replace_dsk
with the WWN that our new SAN admin gave me, to end this logging mess.
Is it safe to run this command on a production node?
Can I screw something up?
Well, theoretically it should then only affect one of the two paths.
Why doesn't the driver itself update to the correct WWN, e.g. by some autosensing mechanism?
As far as I can remember, we have never had to run fcmsutil for this.
So could it be that our new SAN admin inadvertently "misconfigured" something on her SAN side?
Regards
Ralph
Madness, thy name is system administration
11-26-2004 02:15 AM
Re: fcmsutil replace_dsk command to be issued?
You have correctly identified the problem. It's the SAN.
It might not have been treacherous. It may be that the SAN admin was doing something on a regular basis, and that stopped happening when he left.
You need to get someone on the SAN and make sure there are no errors.
I'd also make sure your fiber cards are working perfectly.
Once all that is out of the way, I'd plan a backup and possibly renumber your instances. A changing SAN environment may confuse your ioinit hardware database and kernel, and it might be time for a fresh start.
http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&admit=-1335382922+1098717421883+28353475&docId=200000067424466
I doubt you screwed anything up. You may have hardware/kernel confusion caused by the changing SAN environment.
For the alternate paths to work properly, the WWNs of both fiber cards need to be set up for the LUNs on the disk array. Both WWNs need full read/write access. I'm thinking one of the WWNs is not correctly permitted in the LUN configuration.
SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
11-28-2004 08:35 PM
Re: fcmsutil replace_dsk command to be issued?
Hi Steven,
Meanwhile, our SAN admin has informed me that they had indeed had a piece of HW replaced by the HITACHI disk subsystem vendor.
Because the shared disk subsystems are so critical, they have 24/7 support, and failed parts get exchanged immediately.
I only had to run the command that the FC driver had been logging to syslogd, viz.
# /opt/fcms/bin/fcmsutil /dev/td2 replace_dsk 0x772213
That stopped logging immediately.
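Note that the argument is the nport ID from the log message (0x772213), not a WWN. As a sketch of how the command line can be derived from the log rather than typed by hand: the function name suggest_replace_dsk is hypothetical, the hardware path to device file mapping (1/2/0/0 to /dev/td2) has to be looked up in the ioscan output and passed in, and each syslog entry is assumed to sit on one physical line. The function only prints the command for review and does not execute it.

```shell
# Print the fcmsutil replace_dsk command(s) suggested by the
# "World-wide name ... has changed" messages for one adapter.
# $1 = hardware path of the adapter, e.g. 1/2/0/0
# $2 = its device file from ioscan, e.g. /dev/td2
# Reads syslog text on stdin; prints one command per distinct nport ID.
suggest_replace_dsk() {
    awk -v p="$1:" -v d="$2" '
        $6 == p && /World-wide name/ {
            # the nport ID is the word following "nport ID"
            for (i = 7; i + 2 <= NF; i++)
                if ($i == "nport" && $(i + 1) == "ID" &&
                    $(i + 2) ~ /^0x[0-9a-fA-F]+$/)
                    ids[$(i + 2)] = 1
        }
        END {
            for (id in ids)
                print "/opt/fcms/bin/fcmsutil", d, "replace_dsk", id
        }
    ' | sort
}

# Usage (hypothetical invocation):
#   suggest_replace_dsk 1/2/0/0 /dev/td2 < /var/adm/syslog/syslog.log
```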
As I've learned, the driver wants the administrator to run this command explicitly, instead of fixing things itself, purely as a precaution against inadvertently overwriting a wrongly replaced disk.
Madness, thy name is system administration
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.
© Copyright 2025 Hewlett Packard Enterprise Development LP