08-26-2006 03:57 PM
Serviceguard mystery problem
Recently I found a 2-node cluster in a very strange state. The cluster starts up clean and runs (with one Oracle and two NFS packages), but...
After about two hours of operation, one node begins to behave strangely. The node can still be pinged and reached via telnet/ssh, but 'cmviewcl' reports that the node is unreachable, and the services running on that node become unavailable.
I inspected the syslog and the cluster package logs on that machine and found nothing unusual.
The problem is also repeatable: it has come back after multiple attempts at halting the node and restarting the whole cluster.
Are there any hints as to what I should look into?
Many thanks!
08-26-2006 04:14 PM
Re: Serviceguard mystery problem
Could you post the error output shown on screen when you run cmviewcl?
Regards
nanan
08-26-2006 06:09 PM
Re: Serviceguard mystery problem
08-27-2006 01:57 AM
Re: Serviceguard mystery problem
Sounds to me like networking has crashed on the second node.
Take a look at /var/adm/syslog/syslog.log
Also while the second node is up, examine the network configuration in /etc/rc.config.d/netconf
cstm/mstm/xstm might be useful in pinpointing a network problem.
With no evidence other than circumstantial, my prime suspect is a bad NIC card.
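A minimal sketch of that first look at the log, assuming the Serviceguard cluster daemon logs under the name cmcld and that LAN-related entries mention an interface name such as lan0 or lan1. The function name and the search pattern are illustrative only; adjust the pattern to whatever your drivers actually write.

```shell
#!/bin/sh
# Hypothetical helper: print syslog lines that mention the Serviceguard
# cluster daemon (cmcld) or a LAN interface (lan0, lan1, ...).
# The pattern is an assumption, not an exhaustive list of relevant messages.
scan_syslog() {
    logfile=${1:-/var/adm/syslog/syslog.log}
    grep -i -E 'cmcld|lan[0-9]' "$logfile"
}

# Usage on the suspect node:
#   scan_syslog | tail -50
#   lanscan        # list the LAN interfaces and their hardware state
#   netstat -in    # per-interface packet and error counters
```

Comparing the timestamps of any matching lines with the moment cmviewcl starts reporting the node as unreachable may help narrow down whether a card or link event is involved.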
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
08-27-2006 04:53 AM
Re: Serviceguard mystery problem
1. From each node, run linkloop from the local card to the matching card on the other node. You'll need the MAC address for each card pair. This bypasses most of the networking software.
2. Ping each node from the other node.
3. Run traceroute between the nodes. This will rule out a possible switch or router problem, or a route config error in one of the nodes.
4. Verify each network service (telnet, ssh, remsh/rlogin, ftp).
Note that you need a good picture of each LAN card's connectivity to ensure that the network path is still correct (electrically as well as logically).
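The first three steps can be sketched as a small helper that only prints the commands, so they can be reviewed before anything is run. The function name, the PPA number, the MAC address and the host name below are all placeholders, not values from this cluster; take the real PPA and MAC from lanscan on each node.

```shell
#!/bin/sh
# Print the link-level and IP-level checks for one LAN card and one peer.
# Arguments (all supplied by the caller, nothing is discovered automatically):
#   $1  PPA number of the local card (as shown by lanscan)
#   $2  MAC address of the peer node's card on the same LAN
#   $3  host name or IP address of the peer node
connectivity_checks() {
    ppa=$1
    peer_mac=$2
    peer=$3
    echo "linkloop -i $ppa $peer_mac"   # link-level test, bypasses IP
    echo "ping $peer -n 3"              # HP-UX ping takes the count after the host
    echo "traceroute $peer"             # reveals an unexpected hop or route
}

# Usage (placeholder values):
#   connectivity_checks 0 0x001A4B000001 nodeB          # review the commands
#   connectivity_checks 0 0x001A4B000001 nodeB | sh -x  # then run them
```

Repeat this for every card pair and in both directions, since a fault can show up on only one side.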
Bill Hassell, sysadmin
08-27-2006 11:26 PM
Re: Serviceguard mystery problem
Since the node appears to be unreachable only to Serviceguard, it seems that Serviceguard cannot connect to the cmclconfd daemon via inetd.
Try 'inetd -k' followed by 'inetd'.
Does the problem stop?
Try the cmviewcl command from the other node in the cluster - does it work?
From the symptoms, this doesn't sound like a permission problem, but rather a connection problem, either physical or configuration-wise.
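Before restarting inetd, it may be worth confirming that inetd is configured to start cmclconfd at all. The sketch below counts the uncommented lines in inetd.conf that mention cmclconfd. It assumes that the Serviceguard entries name the cmclconfd binary; the function name is hypothetical.

```shell
#!/bin/sh
# Count the active (uncommented) inetd.conf lines that launch cmclconfd.
# Assumption: the Serviceguard entries contain the string "cmclconfd".
count_cmclconfd_entries() {
    conf=${1:-/etc/inetd.conf}
    grep -v '^[[:space:]]*#' "$conf" | grep -c 'cmclconfd'
}

# Usage on each node:
#   count_cmclconfd_entries   # 0 means inetd has no entry to start cmclconfd
#   inetd -k                  # stop inetd, as suggested above
#   inetd                     # start it again
```

If the count is non-zero on both nodes and restarting inetd makes no difference, the connection problem more likely lies in the network path than in the inetd configuration.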