Operating System - HP-UX
08-13-2009 06:38 AM
Serviceguard membership error
Hopefully one of the Serviceguard gurus on here can shed some light on this. We had a network outage that caused complete loss of connectivity between the two nodes of a cluster. Things got into a rather funky state, which I'm still trying to piece together from syslog entries, but it appears that the passive node, host2 (the one that had network connectivity yanked out from under it), gained control of the lock disk during cluster reformation. Although it supposedly released it when it couldn't start the package because of the network problem, host1 never regained control automatically. In fact, host1 hung (no entries in syslog) for 20 minutes before it TOC'ed early in the episode. A couple of hours later, host2 was reconnected to the network, the following error occurred on host1, and host1 TOC'ed again.
daemon-err 2009-08-09 03:19:16 cmcld: Serviceguard fatal error in membership, sbd/sbd.c 556
daemon-err 2009-08-09 03:19:16 cmcld: Remote members: host2
daemon-err 2009-08-09 03:19:16 cmcld: Local members: host1
daemon-err 2009-08-09 03:19:16 cmcld: Could not enable safety time.
daemon-info 2009-08-09 03:19:16 cmcld: Aborting: sbd/sbd.c 672 (FATAL MEMBERSHIP ERROR DETECTED)
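For anyone piecing a timeline together from syslog the same way, here is a minimal Python sketch that pulls the cmcld entries out of lines shaped like the excerpt above and groups them by timestamp. The line layout (priority tag, date, time, daemon name, message) is assumed from this excerpt only; a raw syslog file on the node may look different, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Layout assumed from the excerpt in this post:
#   <tag> <YYYY-MM-DD> <HH:MM:SS> <daemon>: <message>
LINE_RE = re.compile(
    r"^(?P<tag>[\w-]+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<daemon>\w+):\s+(?P<message>.*)$"
)

def parse_cmcld_lines(lines):
    """Return one dict per line that matches the layout and was logged by cmcld."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("daemon") == "cmcld":
            events.append(match.groupdict())
    return events

def group_by_timestamp(events):
    """Map (date, time) to the list of messages logged in that second."""
    grouped = defaultdict(list)
    for event in events:
        grouped[(event["date"], event["time"])].append(event["message"])
    return dict(grouped)

SAMPLE = """\
daemon-err 2009-08-09 03:19:16 cmcld: Serviceguard fatal error in membership, sbd/sbd.c 556
daemon-err 2009-08-09 03:19:16 cmcld: Remote members: host2
daemon-err 2009-08-09 03:19:16 cmcld: Local members: host1
daemon-err 2009-08-09 03:19:16 cmcld: Could not enable safety time.
daemon-info 2009-08-09 03:19:16 cmcld: Aborting: sbd/sbd.c 672 (FATAL MEMBERSHIP ERROR DETECTED)
"""

if __name__ == "__main__":
    events = parse_cmcld_lines(SAMPLE.splitlines())
    for stamp, messages in group_by_timestamp(events).items():
        print(stamp, len(messages), "cmcld message(s)")
        for message in messages:
            print("   ", message)
```

Running the same grouping over the syslog from both nodes and sorting by timestamp is one way to line up what each node believed about membership at a given moment.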
--
Jeff Traigle
3 REPLIES
08-13-2009 06:44 AM
Re: Serviceguard membership error
Taken from "Managing Serviceguard Sixteenth Edition"
http://docs.hp.com/en/B3936-90140/ch03s01.html#d0e2638
>> The cmcld daemon sets a safety timer in the kernel which is used to detect kernel hangs. If this timer is not reset periodically by cmcld, the kernel will cause a system TOC (Transfer of Control) or INIT, which is an immediate system reset without a graceful shutdown. (This manual normally refers to this event simply as a system reset.) This could occur because cmcld could not communicate with the majority of the cluster's members, or because cmcld exited unexpectedly, aborted, or was unable to run for a significant amount of time and was unable to update the kernel timer, indicating a kernel hang.
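The behaviour the manual describes is the usual watchdog pattern: a deadline that keeps getting pushed out as long as the daemon checks in, and a reset once it stops. The Python sketch below is only a toy model of that idea, not Serviceguard's implementation; the function name, the timeout value, and the reset times are all hypothetical.

```python
def first_expiry(timeout, reset_times, end_time):
    """Toy watchdog model.

    The timer is armed at time 0 with the given timeout. Every reset that
    arrives before the current deadline moves the deadline to
    reset time + timeout. Returns the time at which the timer fires,
    or None if it has not fired by end_time.
    """
    deadline = timeout
    for t in sorted(reset_times):
        if t >= deadline:
            # The reset came too late: the timer already fired at `deadline`.
            return deadline
        deadline = t + timeout
    return deadline if deadline <= end_time else None

if __name__ == "__main__":
    # Daemon checks in regularly: no expiry within the observed window.
    print(first_expiry(timeout=5, reset_times=[2, 4, 6, 8], end_time=10))
    # Daemon stops checking in after t=4 (for example, it aborted): timer fires.
    print(first_expiry(timeout=5, reset_times=[2, 4], end_time=20))
```

In this model an aborting daemon and a hung daemon look the same to the timer: the resets stop and the deadline passes, which matches the manual's list of causes for a system reset.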
08-13-2009 07:31 AM
Re: Serviceguard membership error
It would help if we knew which SG version and which SG patch are installed on both nodes:
what /usr/lbin/cmcld
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
08-13-2009 07:33 AM
Re: Serviceguard membership error
It would indeed. Sorry about that.
HP-UX 11.11
Serviceguard A.11.16
#-> what /usr/lbin/cmcld
/usr/lbin/cmcld:
HP92453-02A.11.00 HP-UX SYMBOLIC DEBUGGER (END.O ILP32) $Revision: 75.02 $
Build date: Mon Nov 12 11:52:56 PST 2007
Build id: ibld_sg_a1116patch_1111_product
Build platform: hpux
Cluster Monitor Product $Revision: 82.2 $
Cluster Monitor Product Only $Revision: 82.2 $
Daemon
A.11.16.00 Date: 11/12/07 Patch: PHSS_36898
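Since the question was about both nodes, here is a rough Python sketch for pulling the version and patch out of saved `what /usr/lbin/cmcld` output so the two nodes can be compared. It assumes the version line looks like the one above ("A.11.16.00 ... Patch: PHSS_36898"); other Serviceguard releases may format it differently, and the function names are made up for illustration.

```python
import re

# Assumed from the output above: a line such as
#   "A.11.16.00 Date: 11/12/07 Patch: PHSS_36898"
VERSION_RE = re.compile(r"\b(A\.\d+\.\d+\.\d+)\b.*?\bPatch:\s*(PHSS_\d+)")

def extract_sg_version(what_output):
    """Return {'version': ..., 'patch': ...} from `what` output, or None if not found."""
    for line in what_output.splitlines():
        match = VERSION_RE.search(line)
        if match:
            return {"version": match.group(1), "patch": match.group(2)}
    return None

def same_level(output_a, output_b):
    """True only if both outputs parse and report the same version and patch."""
    a = extract_sg_version(output_a)
    b = extract_sg_version(output_b)
    return a is not None and a == b

HOST_OUTPUT = """\
/usr/lbin/cmcld:
HP92453-02A.11.00 HP-UX SYMBOLIC DEBUGGER (END.O ILP32) $Revision: 75.02 $
Build date: Mon Nov 12 11:52:56 PST 2007
Cluster Monitor Product $Revision: 82.2 $
A.11.16.00 Date: 11/12/07 Patch: PHSS_36898
"""

if __name__ == "__main__":
    print(extract_sg_version(HOST_OUTPUT))
    # In practice the second argument would be the output saved from the other node.
    print(same_level(HOST_OUTPUT, HOST_OUTPUT))
```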
--
Jeff Traigle