Clustering Problem

Mousa55 · ‎01-03-2010

Dear All,

I am facing a problem creating a new cluster on two nodes (rp8420, 11.11 on both), as described below.
===============
# cmquerycl -v -C cmclconfig.ascii -n ruxerp01 -n ruxdb02

Begin checking the nodes...
Error: Permission denied to 10.8.1.51
Warning: Unable to determine local domain name for ruxerp01
Looking for other clusters ... Done
Error: Node ruxdb02 is refusing Serviceguard communication.
Please make sure that the proper security access is configured on node
ruxdb02 through either file-based access (pre-A.11.16 version) or role-based
access (version A.11.16 or higher) and/or that the host name lookup
on node ruxdb02 resolves the IP address correctly.
Failed to gather configuration information.
===============
I configured all the required files as described in the attached file. I exported and imported vg01.map successfully, and the VG is active on one node (the primary).

How can I solve this problem?

Thanks and best regards

johnsonpk · ‎01-04-2010

Are you able to rlogin from node 1 to node 2 without a password, and vice versa?

Also, one entry per server in cmclnodelist would be enough.

Rgds!
Johnson

Mousa55 · ‎01-04-2010

Hi All,

I have Serviceguard version 11.16, and the following products and patches are installed on both nodes:

swlist -l product | grep -i guard
ATS-CORE A.11.16.00 Serviceguard Advanced Tape Services
PHKL_27532 1.0 ; ServiceGuard/vsar incompatibility removed
PHKL_28114 1.0 timeout; ServiceGuard TOC
PHSS_32731 1.0 Serviceguard A.11.16.00
SG-Apache-Tool B.02.20 Serviceguard Apache Script Templates
SG-Informix-Tool B.02.20 Serviceguard Informix Script Templates
SG-NFS-Tool A.11.11.05 Serviceguard NFS Script Templates
SG-Oracle-Tool B.02.20 Serviceguard Oracle Script Templates
SG-Progress-Tool B.02.20 Serviceguard Progress Script Templates
SG-Samba-Tool B.02.20 Serviceguard Samba Script Templates
SG-Sybase-Tool B.02.20 Serviceguard Sybase Script Templates
SG-Tomcat-Tool B.02.20 Serviceguard Tomcat Script Templates
ServiceGuard A.11.16.00 ServiceGuard

thanks

Mousa55 · ‎01-04-2010

Hi,

Yes, I can rlogin without a password on both nodes.

thanks

R.K. # · ‎01-04-2010

Hi Nejad,

You need to check the following files and their format on both nodes:

/etc/cmcluster/cmclnodelist - must exist on both nodes, and entries for both nodes should be present.
/etc/nsswitch.conf - files dns

What is the output of:
# grep ident /etc/services
# grep ident /etc/inetd.conf

Don't fix what ain't broke

Mousa55 · ‎01-04-2010

Hi All,

/etc/cmcluster/cmclnodelist exists on both nodes,
but /etc/nsswitch.conf is available on the second node only. The file contains:
# more /etc/nsswitch.conf
#
# /etc/nsswitch.files:
#
# @(#)B.11.11_LR
#
# An example file that could be copied over to /etc/nsswitch.conf; it
# does not use any name services.
#
passwd: files
group: files
hosts: files
services: files
networks: files
protocols: files
rpc: files
publickey: files
netgroup: files
automount: files
aliases: files

What is :
# grep ident /etc/inetd.conf
This will allow the service to ignore identd.

thanks

R.K. # · ‎01-04-2010

Hi Nejad,

Actually, I wanted the output of the following commands:

# grep ident /etc/services
# grep ident /etc/inetd.conf

Don't fix what ain't broke

Mousa55 · ‎01-04-2010

Hi All,

on Node 1:

# grep ident /etc/services
ident 113/tcp authentication # RFC1413
# grep ident /etc/inetd.conf
auth stream tcp wait root /usr/lbin/identd identd

Node 2:

# grep ident /etc/services
ident 113/tcp authentication # RFC1413
# grep ident /etc/inetd.conf
auth stream tcp wait root /usr/lbin/identd identd

thanks

R.K. # · ‎01-04-2010

Hi Nejad,

My server has the following entries:

/etc/services:
ident 113/tcp auth tap # RFC 1413

/etc/inetd.conf:
ident stream tcp wait bin /usr/lbin/identd identd

Why not try changing auth to ident in inetd.conf?

Then:
# inetd -c
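
The comparison behind this suggestion can be sketched as a small shell function: take the service name on the identd line of inetd.conf and check whether /etc/services defines it (as a name or alias) on port 113/tcp. The function name check_identd and its two file arguments are illustrative, not part of Serviceguard or HP-UX, and whether a mismatch here affects Serviceguard at all is an assumption.

```shell
#!/bin/sh
# Sketch only: check_identd and its arguments are hypothetical names.
# check_identd SERVICES_FILE INETD_FILE
# Prints "ok NAME", "mismatch NAME" or "missing"; returns 0 only on "ok".
check_identd() {
    services_file=$1
    inetd_file=$2

    # Service name = first field of the uncommented line that starts identd.
    svc=$(awk '!/^[ \t]*#/ && /identd/ { print $1; exit }' "$inetd_file")
    if [ -z "$svc" ]; then
        echo "missing"
        return 1
    fi

    # Look for that name among the name and aliases of the 113/tcp entry.
    if awk -v svc="$svc" '
        !/^[ \t]*#/ && $2 == "113/tcp" {
            sub(/#.*/, "")                       # drop the trailing comment
            if ($1 == svc) found = 1
            for (i = 3; i <= NF; i++) if ($i == svc) found = 1
        }
        END { exit found ? 0 : 1 }' "$services_file"
    then
        echo "ok $svc"
    else
        echo "mismatch $svc"
        return 1
    fi
}
```

For example, `check_identd /etc/services /etc/inetd.conf` would report a mismatch for the entries posted above, where inetd.conf uses auth but /etc/services lists only ident and authentication.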

Don't fix what ain't broke

Mousa55 · ‎01-04-2010

Hi All,

The problem is still there.

Begin checking the nodes...
Error: Permission denied to 10.8.1.51
Warning: Unable to determine local domain name for ruxerp01
Looking for other clusters ... Done
Error: Node ruxdb02 is refusing Serviceguard communication.
Please make sure that the proper security access is configured on node
ruxdb02 through either file-based access (pre-A.11.16 version) or role-based
access (version A.11.16 or higher) and/or that the host name lookup
on node ruxdb02 resolves the IP address correctly.
Failed to gather configuration information.

# grep ident /etc/inetd.conf
ident stream tcp wait bin /usr/lbin/identd identd

thanks

R.K. # · ‎01-04-2010

Hi Nejad,

Check the following link; maybe you can find something useful.

http://bizsupport1.austin.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=3203314&prodTypeId=18964&objectID=c01680756

Don't fix what ain't broke

Horia Chirculescu · ‎01-04-2010

Hello,

You should also check syslog.log on both nodes.

There should be some information there that you can use.

Best regards,
Horia.

Best regards from Romania,
Horia.

Mousa55 · ‎01-04-2010

Hi All,

There is no important information in the syslog.log file,
only this:
"cmclconfd running with weak security (identd disabled)"

# grep ident /etc/inetd.conf
ident stream tcp wait bin /usr/lbin/identd identd

thanks

Rita C Workman · ‎01-04-2010

What does your /etc/resolv.conf show?

Do reverse lookups (nslookup) on both boxes: one for the IP and one for the hostname. Report what comes up.

Move your heartbeat addresses to the bottom of your /etc/hosts files.
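
Where no DNS is available, the same forward and reverse check can be sketched against a hosts-format file alone. The functions below are illustrative helpers (hosts_forward, hosts_reverse and hosts_roundtrip are made-up names), and they assume a simple first-match rule; real lookups also depend on /etc/nsswitch.conf and any DNS servers in use.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical.

# hosts_forward FILE NAME -> prints the IP of the first line that lists NAME
hosts_forward() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                       # strip comments
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }' "$1"
}

# hosts_reverse FILE IP -> prints the first name on the first line for IP
hosts_reverse() {
    awk -v ip="$2" '
        { sub(/#.*/, "") }
        NF >= 2 && $1 == ip { print $2; exit }' "$1"
}

# hosts_roundtrip FILE NAME -> prints "NAME -> IP -> NAME2";
# status 0 only if the reverse lookup returns the name we started with
hosts_roundtrip() {
    ip=$(hosts_forward "$1" "$2")
    [ -n "$ip" ] || { echo "$2 -> not found"; return 1; }
    back=$(hosts_reverse "$1" "$ip")
    echo "$2 -> $ip -> $back"
    [ "$back" = "$2" ]
}
```

Running `hosts_roundtrip /etc/hosts NAME` for each node name and each heartbeat name on both boxes gives a quick view of whether the two hosts files agree.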

A thought - hostfile entries generally look like:
IP

Thanks,
Rita

Rita C Workman · ‎01-04-2010

Forgot to add...

You don't have DNS running on "any" platform?

/rcw

Michael Steele_2 · ‎01-04-2010

Hi

You have problems with your FQDNs.

"...Serviceguard recognizes only the hostname (the first element) in a fully
qualified domain name (a name with four elements separated by periods, like those in the example above). This means, for example, that gryf.uksr.hp.com and gryf.cup.hp.com cannot be nodes in the same cluster, as Serviceguard would see them as the same host gryf..."
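
The rule in that quote, that only the first element of the name counts, can be illustrated with a minimal shell sketch. The function names short_name and same_short_name are made up for the illustration and are not Serviceguard commands.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers illustrating the quoted rule.

# short_name FQDN -> prints the part before the first dot
short_name() {
    printf '%s\n' "${1%%.*}"
}

# same_short_name A B -> status 0 if both names reduce to the same first element
same_short_name() {
    [ "$(short_name "$1")" = "$(short_name "$2")" ]
}
```

With the names from the quote, `same_short_name gryf.uksr.hp.com gryf.cup.hp.com` succeeds, because both reduce to gryf.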

A MCSB node is the same thing as a hostname or a FQDN.

Also, you need to use a DNS lookup first, before an /etc/hosts file lookup, or none of your package floating IPs will ever fail over correctly.

Support Fatherhood - Stop Family Law

Michael Steele_2 · ‎01-04-2010

Check the file "/etc/cmcluster/cmclnodelist"

Host1 root
Host2 root

Support Fatherhood - Stop Family Law

Rita C Workman · ‎01-04-2010

I think Michael saw what I saw (maybe).

What I didn't care for was that your heartbeat name is too close to your node name. It could maybe do a lookup and stop at the heartbeat and miss going to the actual line (below) for your server. Try changing your heartbeat name to something more distinct, like HB-ruxp01 or just HB01.

You can resolve to the hostfile then DNS in your resolv.conf. But both had better be working in harmony. So if you run reverse nslookups, do it for all potential lookup boxes (UNIX hostfile and all DNS servers). That way you ensure resolution is the same on all. Fix accordingly as needed.

For /etc/lvmrc (each to his own), but I generally just set it to Activate=1.
Remember for clustered volume groups you set them to vgchange -c y to give the cluster control.

/rcw

Emil Velez · ‎01-04-2010

I assume cmclnodelist is on both nodes.

Make sure it is in /etc/cmcluster.

Mousa55 · ‎01-04-2010

Hi All,

I don't have an /etc/resolv.conf file on either node.
I moved the heartbeat addresses to the bottom of /etc/hosts, and the problem is still there. I also don't have DNS running on "any" platform.

Node1:
/etc/hosts

127.0.0.1 localhost loopback
10.8.33.251 itprt
10.15.9.92 rhom rhom.sp.local # HP OpenView Operations Mgt
10.8.1.56 ruxpost
10.8.1.60 ruxdevdb
10.8.1.50 ruxdb01
10.8.1.69 ruxerp01
10.8.1.69 ruxerp01.sp.local
10.8.1.51 ruxdb02
10.8.1.51 ruxdb02.sp.local
10.15.1.26 rhhubcas02 hubcas02
10.8.1.2 DNS01 sp.local
10.8.1.3 DNS02 sp.local
192.168.1.40 ruxerp01-heartbeat
192.168.1.30 ruxdb02-heartbeat
==
# more /etc/cmcluster/cmclnodelist
ruxerp01 root
ruxdb02 root
127.0.0.1 root
==
# more /.rhosts
+
ruxerp01 root
ruxdb02 root
==================
Node2:
/etc/hosts
10.8.1.50 ruxdb01
10.8.1.69 ruxerp01
10.8.1.69 ruxerp01.sp.local
10.8.1.51 ruxdb02
10.8.1.51 ruxdb02.sp.local
127.0.0.1 localhost loopback
10.15.1.26 rhhubcas02 hubcas02
10.8.1.2 DNS01 sp.local
10.8.1.3 DNS02 sp.local
10.8.1.56 ruxpost
10.8.1.60 ruxdevdb
#192.168.1.20 ruxdb01-heartbeat
192.168.1.40 ruxerp01-heartbeat
192.168.1.30 ruxdb02-heartbeat
===
# more /etc/cmcluster/cmclnodelist
ruxerp01 root
ruxdb02 root
127.0.0.1 root
==
# more /.rhosts
+
ruxerp01 root
ruxdb02 root
===============

Thanks for all the replies.

sujit kumar singh · ‎01-04-2010

hi

# more /etc/cmcluster/cmclnodelist
ruxerp01 root
ruxdb02 root
127.0.0.1 root

You can remove the last line; it is not required.

Also check the pseudo-random number generator on both systems.

Check that the files are consistent:

ls /dev/random
ls /dev/urandom
lssf /dev/random
lssf /dev/urandom

swlist -l product -l bundle | grep -i krng

regards
sujit

Mousa55 · ‎01-04-2010

Hi All,

On 2 Node:
# ls /dev/random
/dev/random not found
# ls /dev/urandom
/dev/urandom not found
# lssf /dev/random
lssf: /dev/random: No such file or directory
# lssf /dev/urandom/dev/random
lssf: /dev/urandom/dev/random: No such file or directory

and there is no output from:
swlist -l product -l bundle | grep -i krng

thanks

Michael Steele_2 · ‎01-04-2010

vi "/etc/cmcluster/cmclnodelist"

ruxerp01 root
ruxerp02 root

both nodes.

Support Fatherhood - Stop Family Law

Mousa55 · ‎01-04-2010

Hi All,

I edited this file as shown below:
vi /etc/cmcluster/cmclnodelist

ruxerp01 root
ruxerp02 root

The problem is still there.
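
One thing that can be checked mechanically: the cmquerycl command at the top of the thread names the nodes ruxerp01 and ruxdb02, while the list above names ruxerp02. A minimal sketch of such a check follows. It assumes the one-pair-per-line "host user" format shown in this thread and that anything after a # is a comment; check_nodelist is a made-up helper name, and a missing entry may or may not be the cause of the error.

```shell
#!/bin/sh
# Sketch only: check_nodelist is a hypothetical helper.
# check_nodelist FILE USER NODE...
# Prints "ok NODE" or "missing NODE" per node; status 0 if none is missing.
check_nodelist() {
    list_file=$1
    user=$2
    shift 2
    rc=0
    for node in "$@"; do
        # A node passes if some line has it as field 1 and USER as field 2.
        if awk -v n="$node" -v u="$user" '
            { sub(/#.*/, "") }                   # assumed comment syntax
            $1 == n && $2 == u { found = 1 }
            END { exit found ? 0 : 1 }' "$list_file"
        then
            echo "ok $node"
        else
            echo "missing $node"
            rc=1
        fi
    done
    return $rc
}
```

For example, `check_nodelist /etc/cmcluster/cmclnodelist root ruxerp01 ruxdb02`, run on each node, would print "missing ruxdb02" for the file as edited above.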

thanks

shanmuhanandam · ‎01-04-2010

Hi,
As per your previous statement, please modify the /etc/nsswitch.conf file on the other node as well.

Thanks,
Shanmugam.

I am an HPE Employee
