
SOLVED
jmckinzie
Super Advisor

Unable to create a service guard cluster with 11.16..Please Help!

Ok, here is the scenario....
I have two hosts that will be running applications from NFS mounts that reside on a NETAPP. Both these hosts have the correct drives mounted via NFS...

I have installed SG11.16 on both...
I have gone through all the setup steps at
http://www2.itrc.hp.com/service/cki/search.do?category=c0&prevQueryString=KNC050897003&mode=id&searchString=umcsgkbrc00010342&searchCrit=allwords&docType=Security&docType=Patch&docType=EngineerNotes&docType=BugReports&docType=Hardware&docType=ReferenceMaterials&docType=ThirdParty&dateRange=all

except the LVM preparation because these packages will run from NFS mounts that are already mounted...

However, I am getting various errors when trying to create the cluster...

Here is the error I am getting...

# cmquerycl -C cfc.dev -n cfcap01d -n cfcap02d

Error: Permission denied accessing node cfcap01d.
Note: Disks were discovered which are not in use by either LVM or VxVM.
Use pvcreate(1M) to initialize a disk for LVM or,
use vxdiskadm(1M) to initialize a disk for VxVM.
Error: Permission denied accessing node cfcap01d.
Error: Permission denied accessing node cfcap01d.
Error: Unable to determine network configuration: failed to setup for probing the network on node cfcap01d: Permission denied
Failed to probe network
Warning: Network interface lan0 on node cfcap02d couldn't talk to itself.
Warning: Network interface lan1 on node cfcap02d couldn't talk to itself.
Warning: Network interface lan2 on node cfcap02d couldn't talk to itself.
Failed to gather configuration information.


Any ideas on where to begin with the permission denied statements?
19 REPLIES
melvyn burnard
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

You need to read the security manuals for 11.16 and ensure you have all configured LANs listed in your hosts file.
See:
http://docs.hp.com/en/5874/securingserviceguard_nov2005.pdf
http://docs.hp.com/en/6283/SGsecurityfiles.pdf

Also, you will need a cluster lock disc (on LVM) or a quorum server for a two-node cluster.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rita C Workman
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Melvyn has pointed you to the books...here's a couple things quick to look at, since your errors are mostly network related.

1. Make sure the hosts file on both nodes is right. I like to nslookup and make sure reverse lookup works right on everything.
2. Did you set up your heartbeat properly? Can you ping everything there too?
3. Did you set up so root can talk to the other box? I just use a .rhosts file. Do you have this set up on your systems?
4. Looks like you need to do some checking on the cfcap02d LAN cards. Are they configured right? Get out your LAN notes, because you may need to check them (lanscan, linkloop, netstat, lanadmin).

Rgrds,
Rita
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

1. Make sure the hosts file on both nodes is right. I like to nslookup and make sure reverse lookup works right on everything.

nslookup and ping work fine!

2. Did you set up your heartbeat properly? Can you ping everything there too?

How do I set up the heartbeat?

3. Did you set up so root can talk to the other box? I just use a .rhosts file. Do you have this set up on your systems?

I don't have .rhosts available on my systems...
however, I have these entries in
/etc/cmcluster/cmclnodelist
cfcap01d root
cfcap02d root

4. Looks like you need to do some checking on the cfcap02d LAN cards. Are they configured right? Get out your LAN notes, because you may need to check them (lanscan, linkloop, netstat, lanadmin).


cfcap02d lanscan
# lanscan
Hardware Station        Crd Hdw   Net-Interface NM MAC   HP-DLPI DLPI
Path     Address        In# State NamePPA       ID Type  Support Mjr#
0/0/0/0  0x000F201D9D63 0   UP    lan0 snap0    1  ETHER Yes     119
0/10/0/0 0x00306EEA341C 1   UP    lan1 snap1    2  ETHER Yes     119
0/12/0/0 0x00306EEA04CC 2   UP    lan2 snap2    3  ETHER Yes     119


cfcap01d lanscan

# lanscan
Hardware Station        Crd Hdw   Net-Interface NM MAC   HP-DLPI DLPI
Path     Address        In# State NamePPA       ID Type  Support Mjr#
0/0/0/0  0x000F201DADA2 0   UP    lan0 snap0    1  ETHER Yes     119
0/10/0/0 0x00306EEA04DC 1   UP    lan1 snap1    2  ETHER Yes     119
0/12/0/0 0x00306EEA345B 2   UP    lan2 snap2    3  ETHER Yes     119

root@cfcap01d:/etc/cmcluster
# lanadmin -x 1
Current Config = 100 Full-Duplex MANUAL
root@cfcap01d:/etc/cmcluster
# lanadmin -x 0
Current Config = 10 AUTONEG

root@cfcap02d:/etc/cmcluster
# lanadmin -x 0
Current Config = 100 Full-Duplex MANUAL


root@cfcap02d:/etc/cmcluster
# lanadmin -x 1
Current Config = 10 AUTONEG

Any other ideas?

I don't understand why I have network errors when my network is up: I can ping, and all the entries in /etc/hosts are correct.
What else is there?


melvyn burnard
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Warning: Network interface lan0 on node cfcap02d couldn't talk to itself.
Warning: Network interface lan1 on node cfcap02d couldn't talk to itself.
Warning: Network interface lan2 on node cfcap02d couldn't talk to itself.

These warnings tell you that there is a problem on your network. You need to investigate this and fix it first.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rita C Workman
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Jody,

cmclnodelist is not required in 11.16

I don't know what kind of LAN cards you have, but unless they are Gigabit, autonegotiation should probably be off. Again, I don't know what your cards are, so I can't confirm that.

You have a lot of work to do first to set up MC/SG. Some think SG is something you throw up easy with a couple of quick questions...and it really isn't. Plus once you get it up you'll need to understand it so you can modify as your environment changes. To do that, you need to know MC/SG.

The best advice I can give you is to please review docs.hp.com for Service Guard, http://docs.hp.com/en/ha.html and look to fixing your network issues first.

Rita
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

OK, I at least got cmcheckconf to run, but I am getting these errors...

Error: cfcap01d lan1 did not receive DLPI probe from itself.
Error: cfcap01d lan1 should not be included in configuration.
Failed to probe network
Error: cfcap01d lan1 can communicate with cfcap02d lan0 over subnet 161.127.235.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: cfcap02d lan0 can communicate with cfcap01d lan1 over subnet 161.127.235.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Failed to evaluate network
cmcheckconf : Unable to reconcile configuration file cfcdev.conf
with discovered configuration information.


I have also attached a copy of my cmscancl output...

Thanks in advance...
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Anyone out there??

Please help...
Ninad_1
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Jody,

On node cfcap02d, what is the output of
linkloop -i 0 0x000F201D9D63
linkloop -i 1 0x00306EEA341C
linkloop -i 2 0x00306EEA04CC

The output should show as OK

The errors in your output say that the LAN ports can communicate at the IP level but not at the data link level. There seems to be some network component, such as a layer 3 switch or router, between the interfaces that does not pass the link-level traffic Serviceguard requires. Can you check your network connectivity against the Serviceguard requirements?
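
If it helps when running several of these tests, here is a minimal sketch for summarizing linkloop results. It assumes the output looks like the samples posted in this thread (a "Link connectivity to LAN station:" line, followed by "-- OK" on success); the function name summarize_linkloop is made up for illustration and is not part of Serviceguard or HP-UX.

```shell
#!/bin/sh
# summarize_linkloop (hypothetical helper): read linkloop output on stdin
# and print one "<MAC> OK" or "<MAC> FAIL" line per station tested.
# Assumes each test starts with "Link connectivity to LAN station: <MAC>"
# and that a successful test prints a line containing "-- OK".
summarize_linkloop() {
    awk '
        /Link connectivity to LAN station:/ {
            if (mac != "") print mac, "FAIL"   # previous station never reported OK
            mac = $NF
            next
        }
        /-- OK/ { if (mac != "") { print mac, "OK"; mac = "" } }
        END     { if (mac != "") print mac, "FAIL" }
    '
}
```

For example: linkloop -i 0 0x000F201D9D63 2>&1 | summarize_linkloop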

Also refer to this link if it helps you
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=790109

Regards,
Ninad
sajeer_2
Regular Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Ok,

I checked the above link and thus, posted my cmscancl...

Next thing,

When reading the SG documentation on installing SG for a two node cluster, it tells me that I MUST have a lock file location that is shareable to all...

Is there a way around this?

Thanks,
Ninad_1
Honored Contributor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Have you resolved the earlier problem ? What helped in resolving the problem ?

Yes you can use a quorum server.
You need to install the quorum server software on any machine that is already operational, or on any Linux machine. But please see the manuals for the rules you should observe when selecting a quorum server; for example, it should be on a different power circuit and must be accessible to all nodes on the network.

What exactly do you mean by a workaround? Is it that you do not have a disk, or what exactly are you looking for?
The disk and VG you define for the cluster lock need not be dedicated to locking and can hold active volumes used by applications.
A cluster lock is a MUST for a two-node cluster, whether it is a cluster lock disk or a quorum server.

Regards,
Ninad
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Understood, the lock doesn't have to be dedicated. However, I am under the impression that both servers must have access to the same lock device...

Is this correct?

jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

These commands were run from cfcap02d...

CFCAP02d

linkloop -i 0 0x000F201D9D63
Link connectivity to LAN station: 0x000F201D9D63
-- OK


# linkloop -i 1 0x00306EEA341C
Link connectivity to LAN station: 0x00306EEA341C
-- OK

# linkloop -i 2 0x00306EEA04CC
Link connectivity to LAN station: 0x00306EEA04CC
error: expected primitive 0x30, got DL_ERROR_ACK
dl_error_primitive = 0x2d
dl_errno = 0x04
dl_unix_errno = 57
error - did not receive data part of message


Does lan2 have to be active?
I have two active ports, on lan0 and lan1... any ideas?
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Where can I find the quorum server software?
Does it have to be installed on a server outside of the cluster?
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

OK, I have the quorum server installed on a node outside the cluster, on the same network...

Any idea how to check things with the errors above regarding the DLPI issues?

-TIA
Ninad_1
Honored Contributor
Solution

Re: Unable to create a service guard cluster with 11.16..Please Help!

A few things.
You may choose not to use lan2 if you wish, since you already have two LAN ports. But it is usually better to have heartbeat and data traffic on separate LAN ports so that heavy data traffic does not cause missed heartbeats.
Next, as has already been suggested: where you have 10 Mb for lan1, why not turn off autoneg? Also check that the switch ports are configured for the same speed and duplex setting as the corresponding LAN ports.
Regarding the lan2 error: again, check the configuration on the LAN port side, the switch side, and the actual cables. The error suggests to me there could be a mismatch between the LAN port and switch port settings, or a faulty card. You can attach a loopback connector to lan2 and check linkloop.
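
To spot which ports are still autonegotiating, something like the following rough sketch may help. It only classifies "Current Config = ..." lines in the form quoted earlier in this thread from lanadmin -x; the function names classify_lan_config and report_lan_configs are made up for illustration, and other cards or drivers may print a different format.

```shell
#!/bin/sh
# classify_lan_config (hypothetical helper): given one "Current Config = ..."
# line, print "autoneg", "manual", or "unknown". The line format is assumed
# from the lanadmin -x output quoted in this thread.
classify_lan_config() {
    case "$1" in
        *AUTONEG*) echo "autoneg" ;;
        *MANUAL*)  echo "manual" ;;
        *)         echo "unknown" ;;
    esac
}

# report_lan_configs (hypothetical helper): read "<interface> <config line>"
# pairs on stdin and print "<interface> <classification>" for each.
report_lan_configs() {
    while read -r ifname config; do
        [ -n "$ifname" ] || continue
        echo "$ifname $(classify_lan_config "$config")"
    done
}
```

For example, on each node:
for ppa in 0 1 2; do echo "lan$ppa $(lanadmin -x $ppa 2>&1 | grep 'Current Config')"; done | report_lan_configs
Any port reported as autoneg should then be compared against the setting of its switch port.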

Another thing: you need to check whether lan0 and lan1 can communicate at the data link layer. So run linkloop from port 0 against the MAC of port 1, and from port 1 against the MAC of port 0. In essence, on cfcap02d:
linkloop -i 0 0x00306EEA341C
linkloop -i 1 0x000F201D9D63

Carry out similar tests on other node as well.
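
If typing the pairs by hand gets tedious, here is a minimal sketch that builds the cross-port linkloop commands from lanscan output. It assumes the lanscan layout shown earlier in this thread (MAC address in the second column, interface name such as lan0 in the fifth); the function name gen_linkloop_tests is made up for illustration.

```shell
#!/bin/sh
# gen_linkloop_tests (hypothetical helper): read lanscan output on stdin and
# print a linkloop command for every ordered pair of different interfaces.
# Assumes column 2 is the station (MAC) address and column 5 is the
# interface name (lan0, lan1, ...), as in the lanscan output in this thread.
gen_linkloop_tests() {
    awk '
        BEGIN { n = 0 }
        $2 ~ /^0x/ && $5 ~ /^lan[0-9]+$/ {
            ppa[n] = substr($5, 4)   # "lan1" -> "1"
            mac[n] = $2
            n++
        }
        END {
            for (i = 0; i < n; i++)
                for (j = 0; j < n; j++)
                    if (i != j)
                        print "linkloop -i " ppa[i] " " mac[j]
        }
    '
}
```

Running lanscan | gen_linkloop_tests only prints the commands so they can be reviewed first; pipe the result to sh to actually run them.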

Regarding the cluster lock: yes, both nodes need to have the same cluster lock, otherwise the whole purpose of having a lock is defeated, apart from the fact that it's not a valid configuration.
I suggest you read the Managing Serviceguard manual first.
http://docs.hp.com/en/ha.html#Serviceguard


Regards,
Ninad
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

OK, I now have the quorum server running and ready to go...

Now I must get the cluster formed. However, when I do a linkloop on the ports, I get this...

linkloop -i 0 0x000F201D9D63 (lan0 MAC): all works well.

# linkloop -i 1 0x00306EEA341C (lan1 MAC)
Link connectivity to LAN station: 0x00306EEA341C
error: expected primitive 0x30, got DL_ERROR_ACK
dl_error_primitive = 0x2d
dl_errno = 0x04
dl_unix_errno = 57
error - did not receive data part of message

Why am I getting these errors?

I have checked connectivity etc. and switched the ports on the host from lan2 to lan1... no dice...

Any ideas on what to check when linkloop doesn't work?

I also get this when I run cmscancl on cfcap02d...


------ lan0 to lan1 ------
PPA 0 link test to 0x00306EEA341C (NO CONNECTION)

------ lan0 to lan2 ------
PPA 0 link test to 0x00306EEA04CC (NO CONNECTION)

------ lan1 to lan0 ------
PPA 1 link test to 0x000F201D9D63 (NO CONNECTION)

------ lan1 to lan2 ------
PPA 1 link test to 0x00306EEA04CC (NO CONNECTION)

------ lan2 to lan0 ------
PPA 2 link test to 0x000F201D9D63 (NO CONNECTION)

------ lan2 to lan1 ------
PPA 2 link test to 0x00306EEA341C (NO CONNECTION)

Thanks,
jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

Would IPSec stop me from being able to create a cluster?

jmckinzie
Super Advisor

Re: Unable to create a service guard cluster with 11.16..Please Help!

IPSec was stopping the cluster from being created.