Serviceguard and cmcld problem
10-13-2003 11:04 PM
I haven't posted here before, so please bear with me if I miss something important.
On an A500 running HP-UX 11.11 with SG A.11.14 and PHSS_27246, we received the following messages in syslog.log, followed by a crash or TOC and a successful failover to serv8. I'm puzzled by the initial crash/TOC on serv7.
Oct 13 19:45:45 serv7 cmcld: Warning: cmcld process was unable to run for the last 5 seconds
Oct 13 19:46:11 serv7 cmcld: Warning: cmcld process was unable to run for the last 22 seconds,
Oct 13 19:46:11 serv7 cmcld: which is longer than the node timeout (10 seconds)
Oct 13 19:46:11 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 19:46:11 serv7 cmcld: Node serv8 may have died
Oct 13 19:46:11 serv7 cmcld: Attempting to form a new cluster
Oct 13 19:46:17 serv7 cmcld: Attempting to adjust cluster membership
Oct 13 19:46:22 serv7 cmcld: Warning: cmcld process was unable to run for the last 4 seconds
Oct 13 19:46:13 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 19:46:22 serv7 cmcld: Resumed updating safety time
Oct 13 19:46:13 serv7 cmcld: Attempting to form a new cluster
Oct 13 19:46:22 serv7 cmcld: 2 nodes have formed a new cluster, sequence #15
Oct 13 19:46:22 serv7 cmcld: The new active cluster membership is: serv8(id=2), serv7(id=1)
Oct 13 19:46:39 serv7 cmcld: Warning: cmcld process was unable to run for the last 3 seconds
Oct 13 19:51:09 serv7 automountd[858]: caenfs1:/export/admin/misc/scripts server not responding: RPC: Timed out
Oct 13 19:49:26 serv7 cmcld: Warning: cmcld process was unable to run for the last 3 seconds
Oct 13 19:46:32 serv7 cmcld: Warning: cmcld process was unable to run for the last 4 seconds
Oct 13 19:51:40 serv7 above message repeats 2 times
Oct 13 19:52:35 serv7 cmcld: Warning: cmcld process was unable to run for the last 3 seconds
Oct 13 20:02:57 serv7 cmcld: Warning: cmcld process was unable to run for the last 25 seconds,
Oct 13 20:02:57 serv7 cmcld: which is longer than the node timeout (10 seconds)
Oct 13 20:02:57 serv7 cmcld: WARNING: In the last hour, the ServiceGuard daemon
Oct 13 20:02:57 serv7 cmcld: experienced 3 short OS hangs of 5 or more seconds.
Oct 13 20:02:57 serv7 cmcld: Multiple short hangs or a longer single hang could
Oct 13 20:02:58 serv7 cmcld: lead to a system TOC.
Oct 13 20:02:58 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 20:02:58 serv7 cmcld: Node serv8 may have died
Oct 13 20:02:58 serv7 cmcld: Attempting to form a new cluster
Oct 13 20:03:02 serv7 cmcld: Attempting to adjust cluster membership
Oct 13 20:03:04 serv7 cmcld: Resumed updating safety time
Oct 13 20:03:04 serv7 cmcld: 2 nodes have formed a new cluster, sequence #17
Oct 13 20:03:04 serv7 cmcld: The new active cluster membership is: serv8(id=2), serv7(id=1)
Oct 13 20:02:59 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 20:03:09 serv7 cmcld: Warning: cmcld process was unable to run for the last 3 seconds
Oct 13 20:06:57 serv7 cmcld: Warning: cmcld process was unable to run for the last 6 seconds
Oct 13 20:07:17 serv7 cmcld: Warning: cmcld process was unable to run for the last 17 seconds,
Oct 13 20:02:59 serv7 cmcld: Attempting to form a new cluster
Oct 13 20:07:17 serv7 cmcld: which is longer than the node timeout (10 seconds)
Oct 13 20:07:17 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 20:07:17 serv7 cmcld: Node serv8 may have died
Oct 13 20:07:17 serv7 cmcld: Attempting to form a new cluster
Oct 13 20:07:27 serv7 cmcld: Warning: cmcld process was unable to run for the last 8 seconds
Oct 13 20:07:27 serv7 cmcld: WARNING: In the last hour, the ServiceGuard daemon
Oct 13 20:07:27 serv7 cmcld: experienced 3 short OS hangs of 5 or more seconds.
Oct 13 20:07:27 serv7 cmcld: Multiple short hangs or a longer single hang could
Oct 13 20:07:27 serv7 cmcld: lead to a system TOC.
Oct 13 20:07:28 serv7 cmcld: Resumed updating safety time
Oct 13 20:07:30 serv7 cmcld: 2 nodes have formed a new cluster, sequence #18
Oct 13 20:07:30 serv7 cmcld: The new active cluster membership is: serv8(id=2), serv7(id=1)
Oct 13 20:07:41 serv7 xntpd[9676]: Previous time adjustment incomplete; residual -0.000002 sec
Oct 13 20:07:57 serv7 xntpd[9676]: Previous time adjustment incomplete; residual -0.000005 sec
Oct 13 20:10:00 serv7 cmcld: Warning: cmcld process was unable to run for the last 3 seconds
Oct 13 20:11:02 serv7 cmcld: Warning: cmcld process was unable to run for the last 14 seconds,
Oct 13 20:11:02 serv7 cmcld: which is longer than the node timeout (10 seconds)
Oct 13 20:11:02 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 20:11:02 serv7 cmcld: Node serv8 may have died
Oct 13 20:11:02 serv7 cmcld: Attempting to form a new cluster
Oct 13 20:11:04 serv7 cmcld: Resumed updating safety time
Oct 13 20:11:08 serv7 cmcld: 2 nodes have formed a new cluster, sequence #19
Oct 13 20:11:08 serv7 cmcld: The new active cluster membership is: serv8(id=2), serv7(id=1)
Oct 13 20:11:44 serv7 cmcld: Warning: cmcld process was unable to run for the last 32 seconds,
Oct 13 20:11:44 serv7 cmcld: which is longer than the node timeout (10 seconds)
Oct 13 20:11:44 serv7 cmcld: Communication to node serv8 has been interrupted
Oct 13 20:11:44 serv7 cmcld: Node serv8 may have died
And that is the last entry in syslog.log!
Any suggestions gratefully received.
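For anyone reading a log like the one above, a small script can pull out just the cmcld hang warnings and flag the ones that exceed the node timeout. This is only a rough sketch: the function name `find_hangs` and the regular expression are illustrative assumptions based on the message text quoted in this post, and the 10-second default is taken from the "node timeout (10 seconds)" lines.

```python
import re

# Pattern built from the warning text quoted in the post above, e.g.
# "Oct 13 19:46:11 serv7 cmcld: Warning: cmcld process was unable to run for the last 22 seconds,"
HANG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+cmcld:\s+"
    r"Warning: cmcld process was unable to run for the last (?P<secs>\d+) seconds"
)


def find_hangs(lines, node_timeout=10):
    """Return (timestamp, host, seconds, exceeded_timeout) for each hang warning.

    node_timeout is an assumption: 10 seconds, as reported in the log above.
    """
    hangs = []
    for line in lines:
        match = HANG_RE.match(line)
        if match:
            secs = int(match.group("secs"))
            hangs.append(
                (match.group("stamp"), match.group("host"), secs, secs > node_timeout)
            )
    return hangs


# Three lines copied from the log above; only the first two are hang warnings.
sample = [
    "Oct 13 19:45:45 serv7 cmcld: Warning: cmcld process was unable to run for the last 5 seconds",
    "Oct 13 19:46:11 serv7 cmcld: Warning: cmcld process was unable to run for the last 22 seconds,",
    "Oct 13 19:46:11 serv7 cmcld: Node serv8 may have died",
]

for stamp, host, secs, exceeded in find_hangs(sample):
    flag = "EXCEEDS node timeout" if exceeded else "below node timeout"
    print(f"{stamp} {host}: {secs}s ({flag})")
```

To run it against a real file, pass the lines of the file instead of `sample`, for example `find_hangs(open(path))` with `path` pointing at a copy of syslog.log or OLDsyslog.log.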
10-13-2003 11:07 PM
Re: Serviceguard and cmcld problem
10-13-2003 11:12 PM
You would need to look at what was going on around this time. The best method would be to log a call with your HP Response Centre to get your patching levels checked and the dump analyzed.
One other thing that may influence this is whether you have a single CPU or dual CPUs, and the amount of memory and/or buffer cache in use.
10-13-2003 11:16 PM
Re: Serviceguard and cmcld problem
There are several possibilities for this type of problem.
In a lot of cases like this, you need to contact HP to get a special troubleshooting program called "timer9", which can detect "mini-hangs" on a system (short hangs that wouldn't necessarily be noticed by users but which hold off cmcld long enough to matter).
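To illustrate the general idea of detecting mini-hangs (this is not timer9, whose internals are not described here), a process can sleep in short steps and record any step that took much longer than requested. The sketch below is a user-space approximation only; the name `detect_stalls` and all of its parameters are hypothetical, and it will not see everything a kernel-level tool would.

```python
import time


def detect_stalls(interval=0.1, threshold=1.0, duration=60.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Sleep in short steps and record steps that overran by more than `threshold`.

    Returns a list of (clock_reading, overrun_seconds). A large overrun means
    this process could not run for roughly that long, which is the same kind of
    symptom cmcld reports. All names and defaults here are illustrative.
    """
    stalls = []
    last = clock()
    end = last + duration
    while last < end:
        sleep(interval)
        now = clock()
        # Time that passed beyond the sleep we asked for.
        overrun = (now - last) - interval
        if overrun > threshold:
            stalls.append((now, overrun))
        last = now
    return stalls


if __name__ == "__main__":
    # Watch for two seconds and report any step delayed by more than one second.
    for when, overrun in detect_stalls(duration=2.0):
        print(f"stall of about {overrun:.1f}s detected at monotonic time {when:.1f}")
```

The `sleep` and `clock` parameters exist only so the loop can be exercised with a fake clock; in normal use the defaults are fine.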
There have been some cases where vhand needed to be patched, others where there was a machine on the network generating a storm of network requests.
Short of asking you to patch vhand, ARPA, LAN, Streams, SCSI, and LVM, I'd suggest that you open a support call with HP.
Since you have a TOC dump, it will allow the engineers to look at what was happening on the system.
They will also probably want the OLDsyslog.log file from the machine that died, syslog.log from the machine that lived, and the /tmp/scancl.out file, which is generated by running "cmscancl" on either node.
Hope this helps,
Best regards,
Kent Ostby
10-14-2003 02:07 AM