MikeL_4
Super Advisor

Red Hat Cluster Systems

I am running Red Hat 5.3 (64-bit) and setting up servers with Red Hat's server cluster software.

I currently have a running Cluster on servers ntpcdb1.dcd.convergys.com and ntpcdb2.dcd.convergys.com

[root@ntpcdb1 ~]# clustat
Cluster Status for ntpcdb @ Mon Jun 15 14:21:05 2009
Member Status: Quorate

Member Name                   ID   Status
------ ----                   ---- ------
ntpcdb2.dcd.convergys.com        1 Online
ntpcdb1.dcd.convergys.com        2 Online, Local
[root@ntpcdb1 ~]#

When I start luci on ntpcrtr1.dcd.convergys.com to create a new cluster containing this server and ntpcrtr2.dcd.convergys.com,
I receive the following messages:

Status messages:
Host ntpcrtr1.dcd.convergys.com is already authenticated
Host ntpcrtr2.dcd.convergys.com is already authenticated

The following errors occurred:
ntpcrtr1.dcd.convergys.com reports it is a member of cluster "ntpcdb"
ntpcrtr2.dcd.convergys.com reports it is a member of cluster "ntpcdb"

I don't know how or why these servers report that they are in the ntpcdb cluster, but I need to remove them from it so that I can create the new ntpcrtr cluster with these 2 servers in it.

Any ideas?
Michael Steele_2
Honored Contributor

Re: Red Hat Cluster Systems

Sounds like you just copied over the ntpcdb config files and forgot to modify them for ntpcrtr.
MikeL_4
Super Advisor

Re: Red Hat Cluster Systems

The /etc/cluster/cluster.conf file on both ntpcrtr1/2 servers did contain all the entries for the working cluster, ntpcdb1/2...

I modified this file and changed all references to the new cluster and server names...

I then went back into the luci web site for the server, and selected cluster:

Cluster Name: ntpcrtr Start this cluster Restart this cluster Delete this cluster

Status: Not Quorate
Total Cluster Votes: 0
Minimum Required Quorum: 1

Nodes
ntpcrtr1.dcd.convergys.com
ntpcrtr2.dcd.convergys.com
Services
No Services Defined

I then selected to Start the Cluster and received:

Choose a cluster to administer
An error occurred when trying to contact any of the nodes in the ntpcrtr cluster.

When I looked at the cluster.conf file on both ntpcrtr1/2 servers, all the references I had changed from ntpcdb1/2 to ntpcrtr1/2 had been changed back to ntpcdb1/2.
MikeL_4
Super Advisor

Re: Red Hat Cluster Systems

Any time I try to configure or start the cluster, whether in the luci web interface or manually, the /etc/cluster/cluster.conf file is overwritten with the cluster.conf file from another cluster.
Solution

Re: Red Hat Cluster Systems

Are ntpcrtr1, ntpcrtr2, ntpcdb1 and ntpcdb2 different machines or just two machines with more than one IP address assigned to each?

In any case, two machines can form only one cluster: a cluster needs at least two nodes, and each node must be declared with a node name that matches the output of the "hostname" command.
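
The name check described above can be sketched in a few lines of Python. This is only an illustration, not a Red Hat tool: the function name check_cluster_conf is made up for this example, the sample XML is a hypothetical reconstruction (the real files were not posted intact in this thread), and it assumes the usual layout of a cluster element with a name attribute and clusternode elements carrying the node names.

```python
import xml.etree.ElementTree as ET


def check_cluster_conf(xml_text, expected_cluster, local_hostname):
    """Return a list of problems found in a cluster.conf document."""
    root = ET.fromstring(xml_text)
    problems = []

    # The root <cluster> element carries the cluster name.
    cluster_name = root.get("name")
    if cluster_name != expected_cluster:
        problems.append(
            "cluster name is %r, expected %r" % (cluster_name, expected_cluster)
        )

    # Every node is declared as a <clusternode name="..."> element.
    node_names = [node.get("name") for node in root.iter("clusternode")]
    if local_hostname not in node_names:
        problems.append(
            "hostname %r is not among the declared nodes %r"
            % (local_hostname, node_names)
        )
    return problems


# Hypothetical sample: a config copied from the ntpcdb cluster by mistake.
SAMPLE = """<cluster name="ntpcdb" config_version="1">
  <clusternodes>
    <clusternode name="ntpcdb1.dcd.convergys.com" nodeid="2" votes="1"/>
    <clusternode name="ntpcdb2.dcd.convergys.com" nodeid="1" votes="1"/>
  </clusternodes>
</cluster>"""

if __name__ == "__main__":
    for problem in check_cluster_conf(
        SAMPLE, "ntpcrtr", "ntpcrtr1.dcd.convergys.com"
    ):
        print(problem)
```

On a real node you would pass the contents of /etc/cluster/cluster.conf and the value of socket.gethostname() instead of the sample values.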

The second place to look for the error is the method for cluster communication.

RHCS uses broadcast as the default for communicating with other cluster nodes. That restricts you to a single cluster in a physical network, at least when you configure the cluster using the system-config-cluster utility.

Since you are using luci, the default communication type is multicast. In that case, for the second cluster you must specify a different multicast address. You can specify it in the multicast tab.
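
To compare the multicast settings of two clusters without clicking through luci, something like the following Python sketch could be run against both cluster.conf files. It assumes the address is declared as a multicast element with an addr attribute under the cman element; check that against your own files. The function names and the 239.192.10.x addresses are hypothetical, chosen only for the example.

```python
import ipaddress
import xml.etree.ElementTree as ET


def multicast_addr(xml_text):
    """Return the multicast address declared in a cluster.conf, or None."""
    root = ET.fromstring(xml_text)
    element = root.find("./cman/multicast")  # assumed location of the setting
    return element.get("addr") if element is not None else None


def multicast_conflict(conf_a, conf_b):
    """True if both configs explicitly declare the same multicast address."""
    addr_a = multicast_addr(conf_a)
    addr_b = multicast_addr(conf_b)
    for addr in (addr_a, addr_b):
        # Reject values that are not in the IP multicast range at all.
        if addr is not None and not ipaddress.ip_address(addr).is_multicast:
            raise ValueError("%s is not a multicast address" % addr)
    return addr_a is not None and addr_a == addr_b


# Hypothetical configs for the two clusters, reduced to the relevant part.
CONF_DB = (
    '<cluster name="ntpcdb" config_version="1">'
    '<cman><multicast addr="239.192.10.1"/></cman></cluster>'
)
CONF_RTR = (
    '<cluster name="ntpcrtr" config_version="1">'
    '<cman><multicast addr="239.192.10.2"/></cman></cluster>'
)

if __name__ == "__main__":
    print(multicast_conflict(CONF_DB, CONF_RTR))
```

The sketch only compares addresses that are written out explicitly; if a file has no multicast element, it reports no conflict rather than guessing at the default.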

Make sure to allow IGMP traffic on your ethernet switches.

Hope it helps.

Best regards.

Steven E. Protter
Exalted Contributor

Re: Red Hat Cluster Systems

Shalom,

What is your fencing setup?

You may need to issue this command:

fence_ack_manual -O -n nodename

Had the same problem with my home lab this morning.

SEP
MikeL_4
Super Advisor

Re: Red Hat Cluster Systems

I've finally been able to create the Cluster with nodes ntpcrtr1.dcd.convergys.com and ntpcrtr2.dcd.convergys.com....

However, only ntpcrtr1 is active.... ntpcrtr2 gives the error message:

The ricci agent for this node is unresponsive.

But ricci is running on the ntpcrtr2 server:
[root@ntpcrtr2 ~]# ps -ef |grep ricci
ricci 8423 1 0 09:52 ? 00:00:00 ricci -u 202
root 9464 7337 0 10:49 pts/1 00:00:00 grep ricci
[root@ntpcrtr2 ~]# service ricci status
ricci (pid 8423) is running...
[root@ntpcrtr2 ~]#

I've found some hits saying there is a bug fix for this, but both servers were updated and all available patches have been applied. Now I'm lost again.

Cluster Name: ntpcrtr

Status: Quorate
Total Cluster Votes: 1
Minimum Required Quorum: 1
Nodes
ntpcrtr1.dcd.convergys.com
ntpcrtr2.dcd.convergys.com

---------------------------------------
ntpcrtr
Node Name: ntpcrtr2.dcd.convergys.com
Status: This node is not responding

The ricci agent for this node is unresponsive.
Node-specific information is not available at this time.
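
A running ricci process does not by itself show that luci can reach it over the network; a firewall rule or a wrong address can produce the same "unresponsive" message. A minimal Python sketch for testing the TCP connection from the luci host is below. It assumes ricci listens on its default port 11111 (adjust RICCI_PORT if your site changed it), and the helper name can_connect is made up for this example.

```python
import socket

RICCI_PORT = 11111  # ricci's default TCP port; an assumption for your site


def can_connect(host, port=RICCI_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    for node in ("ntpcrtr1.dcd.convergys.com", "ntpcrtr2.dcd.convergys.com"):
        print(node, "reachable" if can_connect(node) else "NOT reachable")
```

If the port is reachable from the luci host, the network path is probably fine and the problem is more likely in ricci itself or its authentication; if not, check iptables on the node and any filtering between the hosts.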

Re: Red Hat Cluster Systems

Can you post your cluster.conf files?
MikeL_4
Super Advisor

Re: Red Hat Cluster Systems

[root@ntpcrtr1 ~]# cat /etc/cluster/cluster.conf


















[root@ntpcrtr1 ~]#