
First time setting up a cluster, getting an error

 
Ken Penland_1
Trusted Contributor

Ok, it is probably a dumb mistake somewhere, but I am just not seeing it. I have a two-node cluster. When I run "cmruncl" on one box, it comes up just fine. When I shut it down and try to run it on the other box, I get the error message:
"Unable to determine operating system version of node"
So it seems box A can communicate with box B, but not the other way around. We have /etc/cmcluster/cmclnodelist set up with both boxes listed in it, like it is supposed to be. What am I missing?
12 REPLIES
Mister_Z
Frequent Advisor

Re: First time setting up a cluster, getting an error

Ensure the hostname for boxA in boxB's cmclnodelist is correct.
I work for HP
Ken Penland_1
Trusted Contributor

Re: First time setting up a cluster, getting an error

Does MC/ServiceGuard rely on the hostname of a system? The name used to get to the box via DNS is one name, but the HOSTNAME on the system is another. I am starting to think that this is the problem.
Mister_Z
Frequent Advisor

Re: First time setting up a cluster, getting an error

Ken,

That's the root of the problem. Below is an excerpt of doc UMCSGKBRC00008185:

CAUSE: Hostname resolution services (whether local /etc/hosts or
DNS) may be supplying a mix of fully qualified domain names (FQDN)
with simple hostnames.

SOLUTION: Use 'netstat -i' on each node to see whether simple or FQDN
hostnames are used. ALL cluster-related files must reference the
hostname the way that the name service supplies it. Update either the
name service provider or the cluster-related file so that the same type
of reference is used. Simple hostnames are preferred.
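
To make the "same type of reference" check concrete, here is a minimal sketch in POSIX shell. It only looks at whether a name contains a dot; the function names and the example node names are made up for illustration and are not part of Serviceguard.

```shell
#!/bin/sh
# name_style NAME: print "fqdn" if NAME contains a dot, otherwise "simple".
name_style() {
    case "$1" in
        *.*) echo fqdn ;;
        *)   echo simple ;;
    esac
}

# mixed_styles NAME...: exit 0 if the given names mix simple and FQDN forms,
# exit 1 if they all use the same form.
mixed_styles() (
    seen_simple=0
    seen_fqdn=0
    for n in "$@"; do
        if [ "$(name_style "$n")" = fqdn ]; then
            seen_fqdn=1
        else
            seen_simple=1
        fi
    done
    [ "$seen_simple" -eq 1 ] && [ "$seen_fqdn" -eq 1 ]
)
```

For example, `mixed_styles nodeA nodeB.example.com && echo "mixed name styles"` would flag the pair, since one name is simple and the other is fully qualified. Feed it the names you see in 'netstat -i' and in your cluster files.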

I work for HP
Ken Penland_1
Trusted Contributor

Re: First time setting up a cluster, getting an error

Doing a netstat -i shows me the fully qualified name, so I tried using that. Still no joy.
Geoff Wild
Honored Contributor

Re: First time setting up a cluster, getting an error

Yes, MC/SG relies on the host name of the systems...

In our clusters:

Add each host name and IP address to the /etc/hosts file

In /etc/nsswitch.conf:
hosts: files [NOTFOUND=CONTINUE] dns

For security, add hostnames and userids (like root) to /etc/cmcluster/cmclnodelist

To make sure everything works, run:

cmquerycl -v -C /etc/cmcluster/yourcluster.ascii -n yournode1 -n yournode2
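
A quick cross-check of the first and third steps could look like the sketch below. It assumes the node list has one "hostname userid" pair per line and that the hosts file uses the usual "address name [alias...]" layout; the function name is hypothetical, and it only compares the two files, it does not talk to the cluster.

```shell
#!/bin/sh
# nodes_missing_from_hosts NODELIST HOSTS
# Print every node name from column 1 of NODELIST that does not appear as a
# name or alias (column 2 onward) in HOSTS. Text after "#" is ignored.
# Note: HOSTS must not be empty, or the NR == FNR test misreads NODELIST.
nodes_missing_from_hosts() {
    awk '
        NR == FNR {                      # first file given: the hosts file
            sub(/#.*/, "")
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        {                                # second file given: the node list
            sub(/#.*/, "")
            if (NF > 0 && !($1 in known)) print $1
        }
    ' "$2" "$1"
}
```

Something like `nodes_missing_from_hosts /etc/cmcluster/cmclnodelist /etc/hosts` should print nothing when every listed node can be resolved from the local hosts file.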


Rgds...Geoff

Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
Michael Steele_2
Honored Contributor

Re: First time setting up a cluster, getting an error

a) check permissions on both nodes for /etc/cmcluster/cmclnodelist.

b) copy the working node's cmclnodelist over to the alternate node.

c) attach:
# cmscancl -n nodeA -o /tmp/file
# cmscancl -n nodeB -o /tmp/file

d) Try just the hostname and not FQDN.
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: First time setting up a cluster, getting an error

a) check permissions on both nodes for /etc/cmcluster/cmclnodelist.

b) copy the working node's cmclnodelist over to the other node.

c) Try just the hostname and not FQDN.

d) attach:
# cmscancl -n nodeA -o /tmp/file
# cmscancl -n nodeB -o /tmp/file
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: First time setting up a cluster, getting an error

Please attach:

# cmquerycl -C config.ascii -n lr006b04 -n lr006b05
Support Fatherhood - Stop Family Law
Ken Penland_1
Trusted Contributor

Re: First time setting up a cluster, getting an error

Ok, attached is the output you requested...with the ip addresses changed to make our IA folks happy ;)

One thing to note, and I know this has to be the problem, but I am unsure how to get around it:
Our www box is our primary web server, and that is what we want to fail over. www is the hostname on the box, but the primary IP address of the system resolves to ourd1. We have a secondary IP address at lan2:1 which resolves to www (so when a user goes to the www address they can pull up the webpage). When I try to use the name ourd1, it can't connect. When I use the name www it works, and I can bring up the cluster, but only from the funwww box. If I try to start the cluster from the www box, it can't connect to funwww and gives the error:
Unable to determine operating system version of node funwww.
John Palmer
Honored Contributor

Re: First time setting up a cluster, getting an error

Your cluster should consist of the nodes:
funwww and ourd1.

www should be defined as your package IP address (IP[0] in your package control file).
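
As a rough sketch only, the relevant lines in a legacy-style package control file might look like this. The addresses are documentation placeholders, not the poster's real ones, and the SUBNET[0] line is an assumption based on the usual IP/SUBNET pairing in those files.

```shell
# Fragment of a package control file (not a complete script).
# IP[0] is the relocatable address that DNS resolves to "www";
# it moves with the package instead of being hard-coded on lan2:1.
IP[0]="192.0.2.50"        # placeholder address for www
SUBNET[0]="192.0.2.0"     # placeholder subnet that the address belongs to
```

With this in place, the cluster nodes keep their own names (funwww and ourd1) and users still reach the site through www, whichever node is running the package.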

Regards,
John
Stephen Doud
Honored Contributor
Solution

Re: First time setting up a cluster, getting an error

The "Cluster Configuration Planning" section of the "Managing Serviceguard" manual describes the NODE_NAME parameter which is used in the cluster ASCII file:

NODE_NAME The hostname of each system that will be a node in the cluster. The node name can contain up to 40 characters. The node name must not contain the full domain name. For example, enter ftsys9, not ftsys9.cup.hp.com.
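
The two rules quoted above (at most 40 characters, no domain part) are easy to check before editing the cluster ASCII file. A minimal sketch follows; the function name is made up, and it does not check anything beyond those two rules.

```shell
#!/bin/sh
# valid_node_name NAME: exit 0 if NAME is non-empty, contains no dot
# (so no domain part), and is at most 40 characters long.
valid_node_name() {
    case "$1" in
        ""|*.*) return 1 ;;
    esac
    [ "${#1}" -le 40 ]
}
```

For example, `valid_node_name ftsys9` succeeds, while `valid_node_name ftsys9.cup.hp.com` fails because of the domain part.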

Note the word "hostname". What is returned from the 'hostname' command?

I was able to simulate your cluster creation and successfully ran 'cmruncl' from the server which was given a hostname matching the alias of the subordinate IP - in my case, lan0:1

I don't recommend doing this.

The www alias should be associated with a relocatable IP address which would float with a package.
-sd
Ken Penland_1
Trusted Contributor

Re: First time setting up a cluster, getting an error

the hostnames as reported by the hostname command AND from:
grep ^HOSTNAME /etc/rc.config.d/netconf

for both systems are:
funwww and www
so that must be why the www name works and not the ourd1 name, even though ourd1 is the name associated to the ip address that is given...
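
For anyone who wants to script the same comparison, here is a small sketch. It assumes the netconf file holds a line of the form HOSTNAME="name"; the function names are made up for illustration.

```shell
#!/bin/sh
# netconf_hostname FILE: print the value assigned to HOSTNAME in FILE,
# with surrounding double quotes stripped (first match only).
netconf_hostname() {
    sed -n 's/^HOSTNAME=//p' "$1" | sed 's/^"//; s/"$//' | head -n 1
}

# node_name_matches NODE_NAME RUNNING_NAME FILE
# Exit 0 only if NODE_NAME equals both the running hostname and the
# HOSTNAME value configured in FILE.
node_name_matches() {
    [ "$1" = "$2" ] && [ "$1" = "$(netconf_hostname "$3")" ]
}
```

Run on each node as something like `node_name_matches ourd1 "$(hostname)" /etc/rc.config.d/netconf || echo "node name does not match hostname"`, substituting the node name used in the cluster file.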

The web folks wanted whichever box is running the package to have the hostname of www, because apparently some of their scripts look at the hostname for whatever reason, but that is starting to sound like it is not a possibility, since the node names rely on the hostname.

What I am going to do is see if we can get the hostname on the www box changed to ourd1 and drop the hard-coded www IP address configured on lan2:1. That should eliminate these strange problems.

Thanks for your help :)