04-27-2008 03:37 PM
Serviceguard restart or reset ?
I have a two-node Serviceguard cluster. One host is named mercury and the other is named earth.
Both servers have Serviceguard A.11.16.00 installed and running.
Two days ago I installed the following patches:
HWEnable 0612
GOLDQPK 0712
Serviceguard patch PHSS_36898
I also upgraded the rp7420 firmware to version 4.10.
Yesterday there was no problem.
This morning I checked syslog.log and saw the following error messages:
---------------------------------------------
mercury syslog.log
Apr 28 00:03:19 mercury cmcld: Warning: cmcld process was unable to run for the last 19.81 seconds,
Apr 28 00:03:19 mercury cmcld: which is longer than the node timeout (5.00 seconds)
Apr 28 00:03:19 mercury cmcld: Timer_loop delayed: current state=1 pop=(0,12116943), now=(0,12118924), delta=19s(0,1981)
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous check_timers started at tsb (0,12116942) and lasted 19s (0,1982) executed 2 callbacks
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous sigwait started at tsb (0,12116923) and lasted 0s (0,19)
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous cm_lock started at tsb (0,12116942) and lasted 0s (0,0)
Apr 28 00:03:19 mercury cmcld: Timer_loop's last timer callback (type=8,id=-1) started at tsb (0,12116942) and lasted 19s (0,1982)
Apr 28 00:03:19 mercury cmcld: Timer_loop's last greater than 1s timer callback (type=8,id=-1) started at tsb (0,12116942) and lasted 19s (0,1982)
Apr 28 00:03:19 mercury cmcld: Could not send Heartbeat message to earth
Apr 28 00:03:19 mercury cmcld: Node earth may have died
Apr 28 00:03:19 mercury cmcld: Attempting to form a new cluster
Apr 28 00:03:19 mercury cmcld: Beginning standard election
Apr 28 00:03:19 mercury cmcld: timers delayed 16.20 seconds
Apr 28 00:03:19 mercury cmcld: Timer_loop delayed: current state=3 pop=(0,12117304), now=(0,12118924), delta=16s(0,1620)
Apr 28 00:03:19 mercury cmcld: Timer_loop has been executing timerp(type=41, id=0, poptime=(0,0))since tsb (0,12118924) for 0s(0,0)
Apr 28 00:03:19 mercury cmcld: Timer_loop has been executing check_timer since tsb (0,12118924) for 0s (0,0)
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous check_timers started at tsb (0,12116942) and lasted 19s (0,1982) executed 2 callbacks
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous sigwait started at tsb (0,12116923) and lasted 0s (0,19)
Apr 28 00:03:19 mercury cmcld: Timer_loop's previous cm_lock started at tsb (0,12116942) and lasted 0s (0,0)
Apr 28 00:03:19 mercury cmcld: Timer_loop's last timer callback (type=6,id=1) started at tsb (0,12118924) and lasted 0s (0,0)
Apr 28 00:03:19 mercury cmcld: Timer_loop's last greater than 1s timer callback (type=8,id=-1) started at tsb (0,12116942) and lasted 19s (0,1982)
Apr 28 00:03:19 mercury cmcld: Communication to node earth has been interrupted
Apr 28 00:03:19 mercury cmcld: Attempting to form a new cluster
Apr 28 00:03:19 mercury cmcld: Beginning standard election
Apr 28 00:03:21 mercury cmclconfd[5420]: Updated file /var/adm/cmcluster/frdump.cmcld.6 for node mercury (length = 512096).
Apr 28 00:03:21 mercury cmcld: Attempting to adjust cluster membership
Apr 28 00:03:21 mercury cmcld: Beginning standard partial election
Apr 28 00:03:22 mercury cmcld: Resumed updating safety time
Apr 28 00:03:22 mercury cmcld: 2 nodes have formed a new cluster, sequence #3
Apr 28 00:03:22 mercury cmcld: The new active cluster membership is: earth(id=1), mercury(id=2)
Apr 28 07:55:46 mercury cmclconfd[11882]: ERROR: The identd authenticated user name () did not match with the sender user name (root) while querying for node earth. Exiting.
----------------------------------------------
earth syslog.log
Apr 28 00:03:04 earth cmcld: Timed out node mercury. It may have failed.
Apr 28 00:03:04 earth cmcld: Attempting to adjust cluster membership
Apr 28 00:03:04 earth cmcld: Beginning standard partial election
Apr 28 00:02:54 earth vmunix: NFS fsstat failed for server mercury: RPC: Timed out
Apr 28 00:03:10 earth cmcld: Obtaining Cluster Lock
Apr 28 00:03:11 earth cmcld: Successfully obtained the Cluster Lock
Apr 28 00:03:11 earth cmcld: Turning off safety time protection since the cluster
Apr 28 00:03:11 earth cmcld: may now consist of a single node. If Serviceguard
Apr 28 00:03:11 earth cmcld: fails, this node will not automatically halt
Apr 28 00:03:21 earth cmcld: Enabling safety time protection
Apr 28 00:03:21 earth cmcld: Attempting to adjust cluster membership
Apr 28 00:03:21 earth cmcld: Beginning standard partial election
Apr 28 00:03:21 earth cmclconfd[4179]: Updated file /var/adm/cmcluster/frdump.cmcld.0 for node earth (length = 512096).
Apr 28 00:03:21 earth cmcld: Resumed updating safety time
Apr 28 00:03:21 earth cmclconfd[4179]: Updated file /var/adm/cmcluster/frdump.cmcld.1 for node earth (length = 12444).
Apr 28 00:03:22 earth cmcld: Clearing Cluster Lock
Apr 28 00:03:22 earth cmcld: 2 nodes have formed a new cluster, sequence #3
Apr 28 00:03:22 earth cmcld: The new active cluster membership is: earth(id=1), mercury(id=2)
Apr 28 00:03:23 earth cmcld: Successfully cleared Cluster Lock
-------------------------------------------------
What can I do about this condition?
Why did this happen?
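When reading two nodes' logs side by side like this, it can help to merge them into one timeline. The following is only a minimal sketch, not a Serviceguard tool: the function names `parse` and `merged_timeline` are made up for illustration, the sample lines are a few entries copied from the excerpts above, and it assumes both nodes' clocks are reasonably in sync and that month names are in English.

```python
from datetime import datetime

# A few lines copied from the two syslog.log excerpts in this post.
MERCURY = [
    "Apr 28 00:03:19 mercury cmcld: Could not send Heartbeat message to earth",
    "Apr 28 00:03:22 mercury cmcld: 2 nodes have formed a new cluster, sequence #3",
]
EARTH = [
    "Apr 28 00:03:04 earth cmcld: Timed out node mercury. It may have failed.",
    "Apr 28 00:03:10 earth cmcld: Obtaining Cluster Lock",
    "Apr 28 00:03:22 earth cmcld: 2 nodes have formed a new cluster, sequence #3",
]


def parse(line):
    """Split a syslog line into (timestamp, rest of line).

    The first 15 characters hold the timestamp, e.g. "Apr 28 00:03:19".
    syslog does not record the year, so strptime falls back to 1900;
    that is fine for ordering lines taken from the same day.
    """
    stamp = datetime.strptime(line[:15], "%b %d %H:%M:%S")
    return stamp, line[16:]


def merged_timeline(*logs):
    """Merge several lists of syslog lines into one list sorted by time."""
    entries = [parse(line) for log in logs for line in log]
    # sorted() is stable, so lines with the same second keep their input order.
    return sorted(entries, key=lambda entry: entry[0])


if __name__ == "__main__":
    for stamp, message in merged_timeline(MERCURY, EARTH):
        print(stamp.strftime("%H:%M:%S"), message)
```

In the merged view, earth times out mercury at 00:03:04 and mercury's first cmcld messages appear at 00:03:19, which fits with mercury's cmcld having been stalled for a number of seconds rather than the node being down.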
04-27-2008 06:53 PM
Re: Serviceguard restart or reset ?
Hi,
I suspect a possible hardware failure. I would suggest doing a proper shutdown and reboot, then checking whether any further errors appear.
If the symptom persists, it is better to log a case with HP.
WK
04-27-2008 07:41 PM
Re: Serviceguard restart or reset ?
Bill Hassell, sysadmin
04-27-2008 09:19 PM
Re: Serviceguard restart or reset ?
I discussed this case with an HP engineer.
He said that a scheduled backup was running at that time, so network I/O was busy.
As a result, the CPU failed to process the Serviceguard daemon.
He also said I have to update the igelan driver.
Thanks once more!
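To check whether this kind of stall happens again (for example during the next backup window), one option is to scan syslog.log for the cmcld warning and compare the stall with the node timeout. This is just a rough sketch under my own assumptions: the helper name `find_stalls` is hypothetical, and the regular expressions are based only on the message wording shown in this thread, which may differ in other Serviceguard versions.

```python
import re

# Sample lines copied from the mercury syslog.log excerpt in this thread,
# with the wrapped lines joined back together.
SAMPLE = """\
Apr 28 00:03:19 mercury cmcld: Warning: cmcld process was unable to run for the last 19.81 seconds,
Apr 28 00:03:19 mercury cmcld: which is longer than the node timeout (5.00 seconds)
Apr 28 00:03:19 mercury cmcld: timers delayed 16.20 seconds
"""

# Patterns are assumptions derived from the wording of the messages above.
STALL_RE = re.compile(r"cmcld process was unable to run for the last ([\d.]+) seconds")
TIMEOUT_RE = re.compile(r"node timeout \(([\d.]+) seconds\)")


def find_stalls(text):
    """Return a list of (stall_seconds, node_timeout_seconds) pairs.

    The timeout is taken from the first "node timeout" line that follows
    each stall warning; it is None if no such line follows.
    """
    results = []
    pending = None  # stall value still waiting for its timeout line
    for line in text.splitlines():
        match = STALL_RE.search(line)
        if match:
            if pending is not None:
                results.append((pending, None))
            pending = float(match.group(1))
            continue
        match = TIMEOUT_RE.search(line)
        if match and pending is not None:
            results.append((pending, float(match.group(1))))
            pending = None
    if pending is not None:
        results.append((pending, None))
    return results


if __name__ == "__main__":
    for stall, timeout in find_stalls(SAMPLE):
        over = timeout is not None and stall > timeout
        print(f"stall={stall}s timeout={timeout}s exceeded={over}")
```

In practice the text would come from reading the real log file instead of `SAMPLE`. On the excerpt above, the stall (19.81 seconds) is well beyond the 5.00 second node timeout, which matches earth timing out mercury.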
04-27-2008 09:21 PM