07-17-2003 08:16 PM
cannot start service guard
[root@seacliff cmcluster]# cmruncl
Unable to open communications to configuration daemon: Connection refused
Unable to connect to configuration database.
Unable to open communications to configuration daemon: No such file or directory
cmruncl : Unable to determine the nodes on the current cluster
cmruncl : Either no cluster configuration file exists, or the file is corrupted, or cmclconfd is unable to run
[root@seacliff cmcluster]# ps -ef |grep cmclconfd
root 1696 1015 0 03:51 ? 00:00:00 cmclconfd -p
root 1698 1518 0 03:51 pts/0 00:00:00 grep cmclconfd
When trying on the other node it complained with this error:
[root@augusta cmcluster]# cmruncl
Cannot connect to configuration daemon (cmclconfd) on node seacliff
Unable to execute command remotely
I tried copying the whole /usr/local/cmcluster directory to the rebooted node so that I could reapply the ASCII config file, but no luck. cmviewcl on the rebooted node returned this error:
cmviewcl : Unable to query the package information: cluster may be reforming, try again: Communication error on send.
cmviewcl on the other node shows the cluster, nodes, and packages all down. /var/log/messages on the rebooted server shows the following:
Jul 18 03:17:15 seacliff cmcluster: Created /dev/deadman c 10 63
Jul 18 03:17:15 seacliff cmcluster: /usr/local/cmcluster/bin/cmresmond failed to start. See /usr/local/cmcluster/ResMonServer.log for details.
Jul 18 03:17:16 seacliff CM-CMD[1361]: /usr/local/cmcluster/bin/cmrunnode -v
Jul 18 03:17:16 seacliff cmcluster: ERROR: Unable to join cluster.
Any help on this is greatly appreciated. Thank you.
07-17-2003 08:37 PM
Re: cannot start service guard
1) piranha
2) Software Service Contract with HP. Service guard is powerful, but hard.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
07-17-2003 08:41 PM
Re: cannot start service guard
07-17-2003 09:05 PM
Re: cannot start service guard
Doesn't it require full remote-host equivalency for the user running Serviceguard (i.e. root can rsh, etc.)?
Also check to make sure the firewall isn't interfering.
07-17-2003 09:56 PM
Re: cannot start service guard
I can ping and rsh from either machine, and each machine can rsh to itself as well. Telnet also works for both machines. There are no firewalls between them, and both are on the same subnet.
07-18-2003 09:03 AM
Re: cannot start service guard
When you say rsh is working, I assume you mean rsh as root.
Are you sure there is no problem with the subnet configuration in the cluster configuration file? I have had problems with incorrect entries there. Can you check that as well?
-balaji (based on my experience with MC/ServiceGuard on HP-UX)
07-18-2003 12:34 PM
Re: cannot start service guard
Yes, I'm using root all the way. Can you point me toward how to troubleshoot the subnet configuration or connectivity?
Thanks
07-18-2003 12:45 PM
Re: cannot start service guard
Can you post the following?
1. ifconfig -a output from both machines.
2. The relevant heartbeat (HB) configuration on both machines.
Let me or the other forum members see if anything looks fishy.
-b-
07-18-2003 12:58 PM
Re: cannot start service guard
For node seacliff:
eth0 Link encap:Ethernet HWaddr 00:30:48:24:ED:E0
inet addr:172.31.50.233 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:389151 errors:0 dropped:0 overruns:0 frame:0
TX packets:213381 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:86488658 (82.4 Mb) TX bytes:43372291 (41.3 Mb)
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:2 Link encap:Ethernet HWaddr 00:30:48:24:ED:E0
inet addr:172.31.50.124 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:3 Link encap:Ethernet HWaddr 00:30:48:24:ED:E0
inet addr:172.31.50.125 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:4 Link encap:Ethernet HWaddr 00:30:48:24:ED:E0
inet addr:172.31.50.220 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:5 Link encap:Ethernet HWaddr 00:30:48:24:ED:E0
inet addr:172.31.50.126 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:23387 errors:0 dropped:0 overruns:0 frame:0
TX packets:23387 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2323018 (2.2 Mb) TX bytes:2323018 (2.2 Mb)
For node augusta:
eth0 Link encap:Ethernet HWaddr 00:30:48:23:86:0E
inet addr:172.31.50.208 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:17813240 errors:0 dropped:0 overruns:0 frame:0
TX packets:25421266 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:2779927657 (2651.1 Mb) TX bytes:3875068562 (3695.5 Mb)
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:1 Link encap:Ethernet HWaddr 00:30:48:23:86:0E
inet addr:172.31.50.127 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
eth0:2 Link encap:Ethernet HWaddr 00:30:48:23:86:0E
inet addr:172.31.50.124 Bcast:172.31.50.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17 Base address:0x4400 Memory:fc321000-fc321038
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:431850 errors:0 dropped:0 overruns:0 frame:0
TX packets:431850 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:51640582 (49.2 Mb) TX bytes:51640582 (49.2 Mb)
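Both listings above include 172.31.50.124 (eth0:2 on each node). A minimal sketch for spotting such overlaps, assuming each node's `ifconfig -a` output has been saved to a file and uses the older `inet addr:` format shown here. The function names and file names are made up for this example:

```shell
# extract_addrs: read ifconfig-style text on stdin and print each
# IPv4 address that follows "inet addr:", one per line, de-duplicated.
extract_addrs() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p' | sort -u
}

# shared_addrs FILE_A FILE_B: print addresses that appear in both
# saved listings, ignoring loopback. Each per-file list is already
# unique, so any line seen twice in the combined list is in both files.
shared_addrs() {
    { extract_addrs < "$1"; extract_addrs < "$2"; } \
        | sort | uniq -d | grep -v '^127\.'
}

# Example (hypothetical file names):
#   shared_addrs seacliff-ifconfig.txt augusta-ifconfig.txt
```

An address that is up on both nodes at the same time is worth a closer look, though nothing in this thread confirms it is related to the startup failure.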
07-18-2003 01:02 PM
Re: cannot start service guard
You have a lot of IP addresses.
Are you running that many packages? By the way, even if they are package IP addresses, they only get activated when the cluster and package come up, right?
Also, can you show the relevant snippets from your cluster configuration file?
-balaji
07-18-2003 01:36 PM
Re: cannot start service guard
Here are the HB configurations. At the moment I just have one physical card active. I ran out of ports on the switch for another card.
augusta:
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
BROADCAST=172.31.50.255
NETWORK=172.31.50.0
NETMASK=255.255.255.0
IPADDR=172.31.50.208
USERCTL=false
seacliff:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
BROADCAST=172.31.50.255
NETWORK=172.31.50.0
NETMASK=255.255.255.0
IPADDR=172.31.50.233
In each of the 5 package config files, I have:
SUBNET 172.31.50.0
In the cluster config file:
NODE_NAME augusta
NETWORK_INTERFACE eth0
HEARTBEAT_IP 172.31.50.208
NODE_NAME seacliff
NETWORK_INTERFACE eth0
HEARTBEAT_IP 172.31.50.233
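The heartbeat addresses and the package SUBNET entry should describe the same network. A small sketch for cross-checking the values above by deriving the network address from an IP and a netmask. `network_of` is a helper name made up for this example, not a Serviceguard command:

```shell
# network_of IP NETMASK: print the network address, computed as the
# bitwise AND of each IP octet with the matching netmask octet.
network_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both dotted quads on "." into positional parameters:
    # $1..$4 are the IP octets, $5..$8 are the netmask octets.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Example:
#   network_of 172.31.50.208 255.255.255.0
#   network_of 172.31.50.233 255.255.255.0
```

With the NETMASK of 255.255.255.0 from the interface files, both heartbeat addresses reduce to 172.31.50.0, which agrees with the SUBNET line in the package files.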
07-18-2003 01:39 PM
Re: cannot start service guard
07-18-2003 01:53 PM
Re: cannot start service guard
-balaji
07-18-2003 02:04 PM
Re: cannot start service guard
I found this in /usr/local/qs/log:
Jul 17 23:34:37:0:Request for lock /sg/cms_cluster succeeded. New lock owners: seacliff,augusta
Jul 18 00:20:26:0:Request for lock /sg/cms_cluster succeeded. New lock owners: seacliff,augusta
Jul 18 00:26:07:0:Request for lock /sg/cms_cluster succeeded. New lock owners: seacliff,augusta
Jul 18 01:51:08:0:Request for lock /sg/cms_cluster succeeded. New lock owners: augusta
[root@qsla1 log]# pwd
/usr/local/qs/log
[root@qsla1 log]#
The last lock request lists only augusta as an owner. Any ideas?
07-18-2003 02:50 PM
Re: cannot start service guard
By the way, which is your quorum server? augusta?
-balaji
07-21-2003 07:20 AM
Re: cannot start service guard
The quorum server is qs1. No, there are no cluster daemons up on that node. I had the AUTOSTART_CMCLD=1 parameter set, but none came up after the reboot.
07-21-2003 07:35 AM
Re: cannot start service guard
-balaji
07-21-2003 08:45 AM
Re: cannot start service guard
I got the message below.
File /usr/local/cmcluster/conf/cmresmond_config.xml does not exist
udp 0 0 0.0.0.0:5302 0.0.0.0:*
/etc/init.d/cmcluster.init: rm: command not found
cmrunnode : Unable to determine the nodes on the current cluster
cmrunnode : Either no cluster configuration file exists, or the file is corrupted, or cmclconfd is unable to run
Unable to open communications to configuration daemon: Connection refused
Unable to connect to configuration database.
Unable to open communications to configuration daemon: No such file or directory
ERROR: Unable to join cluster
07-21-2003 09:32 AM
Re: cannot start service guard
Any chance your files are not in order, or that there are incorrect path settings?
-balaji
07-21-2003 12:14 PM
Re: cannot start service guard
The files seem to be in order. I'm not sure why it can't find rm. The cmcluster.init script is untouched. It uses the absolute path for rm, which is /bin/rm, and it removes the /dev/deadman file with /bin/rm -rf. I verified this from the command line, both with the absolute path and with plain rm, which is in root's search path. I tried to remove seacliff from the cluster configuration, but I could not run cmgetconf or cmquerycl to create a new ASCII file. The cluster would not run from either node because it hangs while trying to probe seacliff, which the other node cannot reach. Are there other options for me to try? Thanks again.
07-21-2003 09:07 PM
Re: cannot start service guard
-balaji
07-22-2003 02:30 PM
Re: cannot start service guard
Thanks for all your help. We never found the culprit, but I did find that a function was being called even though it was already being called from the SERVICE_CMD parameter. Here's what I did to get it started again.
Rebooted both servers and found that neither could join the cluster anymore. Both were now getting the same message as seacliff.
- Copied the /usr/local/cmcluster directory to /usr/local/cmcluster.old on one node and deleted it on the other.
- Uninstalled Service Guard on both nodes.
- Reinstalled Service Guard on both nodes.
- Kept Quorum server as configured.
- Verified that cmviewcl ran on both nodes and reported that SG was not configured.
- Recreated the cluster ASCII file and plugged in the entries from the old file.
- Recreated the pkg config file and plugged in the entries from the old file.
- Ran cmcheckconf and cmapplyconf.
- It's now back up and running.
Thanks again for everyone's input.
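The command-line part of the steps above can be sketched as a dry-run script. This is only an outline under stated assumptions: the config file names are hypothetical, the uninstall and reinstall step is left out because the thread does not say how it was done, and the `-n`, `-q`, `-C`, and `-P` options are used as I understand them from the Serviceguard man pages, so check them against your installed version before running anything for real.

```shell
# Dry-run sketch of the rebuild sequence. With DRY_RUN=1 (the default
# here) each command is only printed, never executed.
DRY_RUN=${DRY_RUN:-1}

SGROOT=/usr/local/cmcluster                # install directory named in the thread
CLUSTER_ASCII=$SGROOT/conf/cluster.ascii   # hypothetical file name
PKG_CONF=$SGROOT/conf/pkg1.conf            # hypothetical file name
QS_HOST=qs1                                # quorum server as named in the thread

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_cluster() {
    # Keep a copy of the old configuration before removing anything.
    run cp -a "$SGROOT" "$SGROOT.old" || return 1
    # (Uninstall and reinstall Serviceguard here; method not given in the thread.)
    # Confirm the tools run and report an unconfigured cluster.
    run cmviewcl || return 1
    # Generate a fresh cluster ASCII file; old entries are merged in by hand.
    run cmquerycl -n augusta -n seacliff -q "$QS_HOST" -C "$CLUSTER_ASCII" || return 1
    # Validate, then apply. The package file is assumed to be recreated already.
    run cmcheckconf -C "$CLUSTER_ASCII" -P "$PKG_CONF" || return 1
    run cmapplyconf -C "$CLUSTER_ASCII" -P "$PKG_CONF" || return 1
}

# Example: print the planned commands without running them.
#   rebuild_cluster
```

Printing the commands first makes it easy to review the sequence on both nodes before setting DRY_RUN=0.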