
Re: SG A11.17 CMAPPLYCONF gives io errors.

 
Chiel Voswijk
Occasional Advisor

SG A11.17 CMAPPLYCONF gives io errors.

Hi There,

We are suddenly getting I/O errors during cmapplyconf.
Does anybody know how to solve this problem?
HP advised us to install patch PHSS_33840, but that did not help.
We know that deleting the cluster and rebuilding it solves the problem...but for how long?

Here is the output:

Checking cluster file: /etc/cmcluster/clprd2.ascii
Checking nodes ... Done
Checking existing configuration ... Done
Gathering storage information
Found 24 devices on node hvudhg31
Found 24 devices on node hvudhg32
Found 24 devices on node hvudhg33
Analysis of 72 devices should take approximately 2 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Found 6 volume groups on node hvudhg31
Found 6 volume groups on node hvudhg32
Found 6 volume groups on node hvudhg33
Analysis of 18 volume groups should take approximately 1 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Gathering network information
Beginning network probing (this may take a while)
Completed network probing
Checking for inconsistencies
Modifying configuration on node hvudhg31
Modifying configuration on node hvudhg33
Modifying configuration on node hvudhg32

Modify the cluster configuration ([y]/n)? y
Marking/unmarking volume groups for use in the cluster
Unable to copy file to hvudhg31: I/O error
Unable to copy file to hvudhg32: I/O error
Unable to copy file to hvudhg33: I/O error
Unable to copy file to hvudhg33: I/O error
Unable to copy file to hvudhg31: I/O error
Unable to copy file to hvudhg32: I/O error
Unable to apply the configuration change: Permission denied
. Check the syslog file(s) for additional information.
cmapplyconf: Unable to apply the configuration

Regards
Chiel

IT_2007
Honored Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Unable to copy file to hvudhg31: I/O error
Unable to copy file to hvudhg32: I/O error
Unable to copy file to hvudhg33: I/O error
Unable to copy file to hvudhg33: I/O error
Unable to copy file to hvudhg31: I/O error
Unable to copy file to hvudhg32: I/O error
Unable to apply the configuration change: Permission denied
==============================

Check the permissions on the /etc/cmcluster directory and the files in it on hvudhg31, hvudhg32 and hvudhg33.

Also check syslog for any I/O errors. Do you have any hardware issues?
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

The syslog shows no hardware errors.
The packages are on an EVA4000 SAN. No issues there either.

The rights on /etc/cmcluster are:

drwxr-xr-x 28 bin bin 8192 Sep 21 11:31 cmcluster

All the files inside are owned by root:sys.

The strange thing is that before installing patch PHSS_33840 we could copy the cluster config files to another node and apply them from there without problems. Now we get the same error on all three nodes.

Thanks
Chiel
freddy_21
Respected Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Hello Chiel,
Maybe try moving the cmclconfig file to another directory and running cmapplyconf again. Your cmclconfig file may be corrupted.


thanks
freddy
IT_2007
Honored Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

After applying the patch, did you try to stop and restart the entire cluster, including the packages?

What modification are you trying to make to the cluster config file?
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Freddy,

hvudhg33 is our master configuration server for Serviceguard (MOTHER).
On which nodes should this move be done?
All of them?

IT_2007,

No...we upgraded SG node by node.
So cmhaltpkg -> cmhaltnode -> Patch.

The change in the ASCII file was to add some monitor and full control users, something we have done more than once.

We had the same problem in the past on three test nodes attached to another EVA. We rebooted the nodes and clustered them again, but that did not help. From that problem we know that deleting the cluster and rebuilding it makes the problem go away. But that is like shooting a man off his horse. You don't want that in a 24x7 environment.

Chiel


IT_2007
Honored Contributor
Solution

Re: SG A11.17 CMAPPLYCONF gives io errors.

OK.
1. Check the state of the patches and software on all nodes:

swlist -l fileset -a state |egrep -i "transient|corrupt|installed"

If you find any, fix them.
2. Query the cluster with cmquerycl; it will create a fresh ASCII file for you. Then make your changes (the monitor and full control users) in that file. I suspect something is wrong in the present ASCII file.

The other alternative is to roll back the patch and recreate the cluster.
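
A minimal sketch of the check in step 1, for anyone who would rather list every fileset that is not fully configured than grep for specific states. It assumes the usual two-column output of swlist -l fileset -a state (fileset name, then state) with header lines starting with '#'. The function name bad_filesets is made up for illustration.

```shell
# Sketch only: print "fileset state" for every fileset whose state is
# not "configured". Reads swlist output on stdin.
# Assumption: header/comment lines start with '#', data lines have the
# fileset name in the first column and the state in the last column.
bad_filesets() {
    awk '$1 !~ /^#/ && NF >= 2 && $NF != "configured" { print $1, $NF }'
}
```

Run it on each node as: swlist -l fileset -a state | bad_filesets. No output means every fileset reports the configured state.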
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

IT_2007,

Your suggestion about swlist was very interesting. Thanks.
The only result, on HVUDHG33 (MOTHER):

lsof.lsof-RUN corrupt

The other nodes show no failures.

lsof is a standalone utility and cannot be the problem.

Making a new ASCII file is the same as recreating the cluster. Same amount of work.

Chiel
IT_2007
Honored Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

I agree with you, but to make sure that you don't have any issues with SG, just run cmquerycl and capture the output. You don't have to run cmapplyconf. That will confirm whether you have any errors.

At least none of your software and patches are in a corrupt state except lsof, which is OK.
freddy_21
Respected Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Hello Chiel,
OK, run this on hvudhg33:
mv /etc/cmcluster/cmclconfig /tmp/cmclconfig.ori

Then run cmapplyconf again on hvudhg33.

Did you apply the ASCII file while the cluster was running?

Thanks
freddy
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

OK...it looks like cmclconfig has something to do with the errors.
I moved it away on HVUDHG31 and HVUDHG32 and ran cmapplyconf from HVUDHG33 (MOTHER).

Now only this message appeared:

Marking/unmarking volume groups for use in the cluster
Unable to copy file to hvudhg33: I/O error
Unable to copy file to hvudhg33: I/O error
Unable to apply the configuration change: Permission denied
. Check the syslog file(s) for additional information.

So permission is denied on her own node and not on the other cluster nodes.
How can I create a new cmclconfig file without recreating the whole cluster?


Chiel

IT_2007
Honored Contributor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Check syslog on the MOTHER node for more information on why you are getting permission denied.
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

Sep 22 08:55:12 hvudhg33 cmclconfd[5195]: WARNING: User root from ip address (192.168.240.133) does not have privileges to access this node. Either they are coming from a node without enhanced security or somebody may be attempting un-authorized access to this system.

OK, I know...this is a well-known message on this forum. I have read a lot about this issue, and the answer is that it is easy to solve:
create extensive hosts, .rhosts and cmclnodelist files and make a new nsswitch.conf. Well, we have had those on this cluster for a long time and never had problems with them.

The strange thing is that root has no privileges on her own node: HVUDHG33 is 192.168.240.133.

It still looks like a corrupt cmclconfig file.
For now I have to find out how to recreate that one on a live cluster.
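
To see which source addresses cmclconfd is rejecting, and how often, the warnings can be tallied from syslog. This is a rough sketch, assuming each message sits on a single line in the log file and follows the wording of the warning quoted above. The helper name denied_sources is made up for illustration.

```shell
# Sketch only: count cmclconfd "does not have privileges" warnings per
# source address. Prints one "address count" pair per line.
# $1 = path to a syslog file.
denied_sources() {
    sed -n '/cmclconfd/s/.*from ip address (\([0-9.]*\)) does not have privileges.*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

On HP-UX the log is normally /var/adm/syslog/syslog.log, so: denied_sources /var/adm/syslog/syslog.log. An address that belongs to the node itself, as here, points at name resolution or the node list rather than at an outside host.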

Thanks
CHiel
Chiel Voswijk
Occasional Advisor

Re: SG A11.17 CMAPPLYCONF gives io errors.

I deleted the cluster and rebuilt it with the old ASCII file.
No problems so far.
cmapplyconf works fine and the changes have been made.

Started the packages, with one failure:
the HVVDHG IP address was wrong in the /etc/hosts file.
Made the correction and tested cmapplyconf again with a change in the ASCII file.
It gave an access denied on one node, but the cmclconfig was distributed.

When enabling the package with cmmodpkg -e we got a new error:
"Unable to retrieve local cluster configuration"

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAARCHHHHHHHHHHHHH

By making a new rlogin to the MOTHER node we suddenly were able to run CMAPPLYCONF and CMMODPKG without problems.
From that point the old rlogin session is also doing well. ???????
F.CK.

Now...we have access to the nodes over two different LAN segments. The third is a dedicated heartbeat switch.
It looks like we are randomly reaching the nodes through one of the two NICs.

When you happen to get the right path, it works.

How to deal with this...I don't know.
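
One thing that might be worth ruling out, as a guess rather than a diagnosis, is whether a node name maps to more than one address in /etc/hosts, one per LAN segment. The sketch below lists every address a name maps to in a hosts-format file. The helper name addrs_for is made up for illustration.

```shell
# Sketch only: print every address that a given name (or alias) maps to
# in a hosts-format file.
# $1 = hosts file, $2 = node name to look up.
addrs_for() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$1"
}
```

For example: addrs_for /etc/hosts hvudhg33. If more than one address comes back, compare them with the address named in the cmclconfd warning in syslog.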

Chiel