Problems Reinstalling serviceguard for linux A.11.16

SOLVED
Ouattara
Advisor

Hello Everybody

I set up a Serviceguard A.11.16 cluster using two HP DL 580 G4 with RHEL4, two MSA 1000 and two Fibre Channel switches.
I did everything by the doc and it was working; the application was failing over.
After some time I had a problem: the system clock was running too fast. To solve it I upgraded the kernel on the two cluster nodes.
Unfortunately, before doing that I didn't stop the cluster service on the two nodes.
The upgrade solved the time problem but ServiceGuard stopped working. I couldn't get it to start on either node.
I then decided to uninstall and reinstall, and this is precisely where the problem started.
I AM NOT ABLE TO GET THE INSTALLATION TO COMPLETE SUCCESSFULLY.

This is the output returned by the servers:
17 REPLIES
Matti_Kurkela
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

Looks like both the pidentd and serviceguard RPMs have been successfully installed, but there are some problems with the deadman kernel module. (The message at the end of pidentd RPM installation is just a reminder.)

First, run:
depmod -a
modprobe -r deadman
modprobe deadman
If the last command is successful, good. If not, what's the error message? What does the output of the "dmesg" command say about the deadman driver?
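
If modprobe says the module is not found, it can also help to check whether a deadman.ko file exists at all under the module tree of the running kernel. A minimal sketch, assuming modules are installed under /lib/modules/<kernel version>; the find_module helper and its optional arguments are only illustrative and are not part of ServiceGuard:

```shell
# find_module NAME [KERNEL_VERSION] [MODULES_ROOT]
# Prints the path of NAME.ko under the module tree of KERNEL_VERSION
# (default: the running kernel), or returns 1 if there is none.
# The helper name and the MODULES_ROOT override are illustrative assumptions.
find_module() {
    fm_name=$1
    fm_kver=${2:-$(uname -r)}
    fm_root=${3:-/lib/modules}
    # No module tree at all for this kernel version
    [ -d "$fm_root/$fm_kver" ] || return 1
    fm_hit=$(find "$fm_root/$fm_kver" -name "$fm_name.ko" 2>/dev/null | head -n 1)
    [ -n "$fm_hit" ] || return 1
    printf '%s\n' "$fm_hit"
}
```

For example, `find_module deadman || echo "no deadman.ko for $(uname -r)"` tells you whether the driver was ever installed for the kernel you are actually running.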

Do you have the appropriate kernel-source package installed? The version of the kernel-source package should match the version of your running kernel.

To avoid the warning about the identd daemon when installing the serviceguard RPM, you should have started the identd daemon before installing the package. However, this is just a warning message: the installation did not stop; it just says "hey, you haven't done this yet; remember you'll need to do this before actually starting ServiceGuard."

In addition, ldconfig complains about /usr/lib/libcpqlsptransport.so.0.

Normally, most libraries in /usr/lib should have a three-part version number of the form /usr/lib/lib<name>.so.<X>.<Y>.<Z>.
The ldconfig command will automatically create and maintain a symbolic link named /usr/lib/lib<name>.so.<X>, pointing to lib<name>.so.<X>.<Y>.<Z>.

Now there's a /usr/lib/libcpqlsptransport.so.0 library which is not a symbolic link. There is probably also a library named something like libcpqlsptransport.so.0.*, making ldconfig want to create a symbolic link for it - but it cannot, because there is already a library using that name.

To fix this, you should first examine the output of "ls -l /usr/lib/libcpqlsptransport*".
If the libcpqlsptransport.so.0 is older/lower version than any other existing libcpqlsptransport.so.0.* file, you can probably delete (or move aside) the libcpqlsptransport.so.0 file and re-run /sbin/ldconfig to create a symbolic link in its place.
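
A quick way to confirm which case you are in is to test whether the .so.0 name is a symbolic link or a regular file. This is only a sketch; classify_lib is a made-up helper name, not an existing tool:

```shell
# classify_lib PATH
# Prints "symlink", "regular" or "missing" for PATH.
# A "regular" result for the .so.0 name is the conflict described above:
# ldconfig cannot put its symbolic link there because a file already
# occupies the name.
# (classify_lib is an illustrative helper name.)
classify_lib() {
    # Test -L first: -f follows symbolic links and would also match them.
    if [ -L "$1" ]; then
        echo symlink
    elif [ -f "$1" ]; then
        echo regular
    else
        echo missing
    fi
}
```

Running `classify_lib /usr/lib/libcpqlsptransport.so.0` before and after moving the file aside and re-running /sbin/ldconfig should show the result change from "regular" to "symlink".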

MK
Serviceguard for Linux
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

I didn't read all of Matti's response, but it could very well be on the right track. One thing to note: when the kernel version has changed, the deadman driver must be rebuilt. The regular install of SGLX does that, but it never hurts to check the kernel upgrade procedure and redo it.
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Thanks for your help Matti.

I was able to solve the library problem using your advice. In my first message I talked about an upgrade; it was rather an update, and I am also using the 64-bit version of Red Hat.
I still have a problem regarding identd and the deadman driver. This is the output I get when I follow your advice:

[root@bcluster ~]# depmod -a
[root@bcluster ~]# modprobe -r deadman
FATAL: Module deadman not found.
[root@bcluster ~]# modprobe deadman
FATAL: Module deadman not found.
[root@bcluster ~]#

I also tried to set up identd (still by the doc); here is the output:

[root@bcluster davy]# /sbin/chkconfig --level 35 identd on
[root@bcluster davy]# /etc/init.d/identd start
FATAL: Module pidentd not found.

I am waiting for more advice


Matti_Kurkela
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

The deadman and pidentd kernel modules are apparently not there for your current kernel. So, the first thing would be to re-create them.

The pidentd module (execute as root):
cd /usr/src/pidentd-3.0.15sg/drivers
make
make install
depmod -a

The deadman module (execute as root):
cd /usr/local/cmcluster/drivers
make
make install
depmod -a

Every time a new kernel update is installed, you will have to boot once without ServiceGuard, run these commands and reboot again to verify that the system will successfully boot to ServiceGuard mode without manual intervention.

The standard Makefiles in the pidentd and deadman module source directories will always create the module for the current kernel. That's a bit inconvenient.

I've made a small modification to both Makefiles in my SG/Linux installations to make the KERNEL_SOURCE and KERNEL_BUILD variables overrideable from the command line:
at the beginning of each Makefile, the assignments "KERNEL_SOURCE := ..." and "KERNEL_BUILD := ..." must be changed to "KERNEL_SOURCE ?= ..." and "KERNEL_BUILD ?= ..." respectively, i.e. just changing the colons into question marks.

After this change, the Makefiles work as before, but you have the option of adding KERNEL_SOURCE and KERNEL_BUILD variable assignments to the "make" command line, to create the modules for some kernel version *other* than the currently running one.
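
For reference, "?=" in GNU make assigns a value only if the variable is not already defined, so a value inherited from the environment is kept instead of being overwritten. The edit itself can be sketched as a small sed filter. This is not the patch file mentioned in this post, just an illustration; relax_kernel_vars is a made-up name, and the right-hand sides of the real Makefiles are not shown here:

```shell
# relax_kernel_vars
# Reads a Makefile on stdin and writes it to stdout with the
# KERNEL_SOURCE and KERNEL_BUILD assignments changed from ":=" to "?=".
# All other lines pass through unchanged.
# (Illustrative helper; review the result before replacing a real Makefile.)
relax_kernel_vars() {
    sed -E 's/^(KERNEL_SOURCE|KERNEL_BUILD)([[:space:]]*):=/\1\2?=/'
}
```

Something like `relax_kernel_vars < Makefile > Makefile.new` followed by a diff of the two files shows exactly which lines would change.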

To make it easier, I've made a patch file (sg-new-kernel-makefile.patch) to make this change automatically. I've also created a script (new-kernel) for easy re-generation of both modules. You'll find them in a .zip file attached to this message.

Extract the .zip file to e.g. /var/tmp.
Then change the Makefiles using the patch file:
cd /
patch -p0 < /var/tmp/sg-new-kernel-makefile.patch
Copy the new-kernel script to the /usr/local/sbin directory and make it executable:
cp /var/tmp/new-kernel /usr/local/sbin
chmod a+x /usr/local/sbin/new-kernel

Usage:
The command
new-kernel
will create and install the pidentd and deadman modules for the current kernel.

The command
new-kernel <kernel version>
will create and install the modules for the specified kernel version.

Example:
You run up2date and install RedHat's latest SMP kernel package for RHEL4. You also verify that the matching kernel-source package is installed. The kernel version number (the "uname -r" string) for the new kernel is "2.6.9-67.0.4.ELsmp".

After the up2date is completed but before rebooting, run:
new-kernel 2.6.9-67.0.4.ELsmp

When you reboot after this, the system will automatically start using the latest kernel version. If the "new-kernel" command was successful, the system should also be ready to re-join the ServiceGuard cluster automatically when it boots up.

The next step would be to make the ServiceGuard startup script detect that the kernel version has changed and run "new-kernel" automatically at boot if necessary. As the "hpasm" driver package already does something similar, it should be do-able.
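
One possible shape for that detection, as a rough sketch only: remember the kernel version the modules were last built for in a small stamp file and compare it with the running kernel at boot. The function names and the stamp file location are assumptions, not anything ServiceGuard or hpasm provides:

```shell
# kernel_changed STAMP_FILE [KERNEL_VERSION]
# Returns 0 (true) when STAMP_FILE is missing or records a different
# kernel version than KERNEL_VERSION (default: the running kernel).
kernel_changed() {
    kc_now=${2:-$(uname -r)}
    # No stamp yet: treat it as "modules need to be built"
    [ -r "$1" ] || return 0
    kc_last=$(cat "$1")
    [ "$kc_last" != "$kc_now" ]
}

# record_kernel STAMP_FILE [KERNEL_VERSION]
# Stores the kernel version the modules were last built for.
record_kernel() {
    printf '%s\n' "${2:-$(uname -r)}" > "$1"
}
```

A startup script could then run something like `if kernel_changed /var/lib/sg-modules.stamp; then new-kernel && record_kernel /var/lib/sg-modules.stamp; fi` before starting the cluster services, where the stamp path is just a placeholder.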

MK
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Thanks again for your help Matti.


I decided to start from scratch: I deleted everything in /usr/local/cmcluster/drivers and /usr/src/pidentd-3.0.15sg/drivers (after copying it to another folder) and reinstalled sgcmom, pidentd and serviceguard.
This is the output of my system.


[root@bcluster drivers]# cd /davy/
[root@bcluster davy]# ls
libcpqlsptransport.so.0 pidentd-3.0.15sg-1.x86_64.rpm
Mattipatch serviceguard-A.11.16.07-0.product.redhat.x86_64.rpm
new-kernel sgcmom-B.03.01.02-0.product.redhat.x86_64.rpm
patch sg-new-kernel-makefile.patch
[root@bcluster davy]# rpm -ivh sgcmom-B.03.01.02-0.product.redhat.x86_64.rpm
Preparing... ########################################### [100%]
1:sgcmom ########################################### [100%]
[root@bcluster davy]# rpm -ivh pidentd-3.0.15sg-1.x86_64.rpm
Preparing... ########################################### [100%]
1:pidentd ########################################### [100%]
CC [M] /usr/src/pidentd-3.0.15sg/drivers/pidentd.o
Building modules, stage 2.
MODPOST
CC /usr/src/pidentd-3.0.15sg/drivers/pidentd.mod.o
LD [M] /usr/src/pidentd-3.0.15sg/drivers/pidentd.ko
INSTALL /usr/src/pidentd-3.0.15sg/drivers/pidentd.ko

The pidentd driver has been built and
installed for kernel version 2.6.9-67.ELsmp. If you change and rebuild
the kernel you must also rebuild the pidentd driver. This is
because the kernel version and the drivers kernel mod will not
match and thus will not load.

[root@bcluster davy]# rpm -ivh serviceguard-A.11.16.07-0.product.redhat.x86_64.rpm
Preparing... ########################################### [100%]
1:serviceguard ########################################### [100%]
Validating the identd configuration...
Warning: ServiceGuard uses the identd daemon, which does not
appear to be running on this node. Please take the proper
steps to configure and run identd before attempting to
have this node re-join the cluster.

CC [M] /usr/local/cmcluster/drivers/deadman.o
/usr/local/cmcluster/drivers/deadman.c:167: warning: `MODULE_PARM_' is deprecated (declared at include/linux/module.h:552)
Building modules, stage 2.
MODPOST
CC /usr/local/cmcluster/drivers/deadman.mod.o
LD [M] /usr/local/cmcluster/drivers/deadman.ko
INSTALL /usr/local/cmcluster/drivers/deadman.ko
Could not load the deadman driver. This could mean
mean that the driver did not build properly. You will
not be able to run Serviceguard until this problem
is resolved. See the Serviceguard Documentation
on possible resolutions to this problem.
No lingering cmclconfd processes to kill.

To complete the SG/Linux installation:
- add "/usr/local/cmcluster/bin" to your path

[root@bcluster davy]#

Immediately after reinstalling the 3 components I issued these commands:

[root@bcluster drivers]# depmod -a
[root@bcluster drivers]#
[root@bcluster drivers]# modprobe -r deadman
FATAL: Module deadman not found.
[root@bcluster drivers]# modprobe deadman
FATAL: Module deadman not found.
[root@bcluster drivers]#


[root@bcluster drivers]# modprobe -r pidentd
FATAL: Module pidentd not found.
[root@bcluster drivers]# modprobe pidentd
FATAL: Module pidentd not found.
[root@bcluster drivers]# rpm -q pidentd
pidentd-3.0.15sg-1
[root@bcluster drivers]#



[root@bcluster drivers]# cd /
[root@bcluster /]# patch -p0
patching file /usr/local/cmcluster/drivers/Makefile
patching file /usr/src/pidentd-3.0.15sg/drivers/Makefile
[root@bcluster /]# uname -r
2.6.9-67.ELsmp
[root@bcluster /]# new-kernel 2.6.9-67.ELsmp
new-kernel: recompiling deadman driver
Building modules, stage 2.
MODPOST
INSTALL /usr/local/cmcluster/drivers/deadman.ko
new-kernel: SUCCESS: deadman driver recompiled
new-kernel: recompiling SG pidentd driver
Building modules, stage 2.
MODPOST
INSTALL /usr/src/pidentd-3.0.15sg/drivers/pidentd.ko
new-kernel: SUCCESS: SG pidentd driver recompiled
[root@bcluster /]#

After rebooting I issued the following commands:

[root@bcluster ~]# /sbin/chkconfig --level 35 identd on
[root@bcluster ~]# /etc/init.d/identd start
FATAL: Module pidentd not found.
[root@bcluster ~]# modprobe -r deadman
FATAL: Module deadman not found.
[root@bcluster ~]#


Looks like I still need your help Matti.
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

HELLO EVERYBODY
I am still experiencing problems reinstalling the Serviceguard application. I am thinking of recompiling the kernel. Would that help? Is there anything else I can do?
HELP!!
Serviceguard for Linux
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

Do not recompile the kernel. You will wind up with a system that cannot be supported by Red Hat or HP.

Better to make sure you have uninstalled all of the HP RPMs and then try reinstalling SG, making sure you go through the process carefully.
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Thanks for your help

It's OK about the kernel recompilation, but as for the HP RPMs, as you can see from the last exhibit, I have uninstalled SGCMOM, PIDENTD and SERVICEGUARD. Which other RPM could I possibly uninstall?
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Hello everybody.

SOS

I am stuck. Need help.
Serviceguard for Linux
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

I hope I may be on track with this question. (and sorry for previous comment).

When you upgraded the kernel, did you also upgrade kernel-devel? I think this is the most critical. Another related question: did you install any kernel src or source packages?

One way to get more help on this is to run

# rpm -qa | fgrep kernel > itrc_kernel

and then post that file.

This seems to still all be related to the fact that the deadman driver & pidentd aren't being made correctly.

Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Hello

I think I did upgrade kernel-devel, and I have not installed any kernel src package.

This is the output I obtained after running the command:

[root@bcluster march18]# rpm -qa |fgrep kernel >itrc_kernel
[root@bcluster march18]# ls
itrc_kernel
[root@bcluster march18]# vi itrc_kernel
kernel-doc-2.6.9-5.EL
kernel-hugemem-devel-2.6.9-5.EL
kernel-utils-2.4-13.1.48
kernel-devel-2.6.9-67.EL
kernel-smp-devel-2.6.9-5.EL
kernel-2.6.9-67.EL
kernel-smp-2.6.9-67.EL

What do you think about that?
Serviceguard for Linux
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

Another guess. Notice that kernel-smp-devel is not at the same level as kernel-smp.

That is certainly worth a quick try.

By the way, if you have support, this is definitely worth logging a call.
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Thanks for your help

Do you think updating kernel-smp-devel-2.6.9-5.EL to version 2.6.9-5.EL could solve the problem?
Serviceguard for Linux
Honored Contributor
Solution

Re: Problems Reinstalling serviceguard for linux A.11.16

I think you had a typo.

Try updating kernel-smp-devel-2.6.9-5.EL to kernel-smp-devel-2.6.9-67.EL. Certainly worth a try - right?
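
To spot this kind of mismatch quickly, the package list can be checked against the running kernel. The sketch below assumes the RHEL4-style naming seen in this thread, where the flavour suffix of "uname -r" (smp, hugemem) moves into the package name, and assumes "rpm -qa" prints plain name-version-release lines. Both function names are made up for illustration:

```shell
# devel_pkg_for KERNEL_VERSION
# Maps a "uname -r" string to the devel package name expected for it,
# e.g. the "smp" suffix selects kernel-smp-devel.
# This naming rule is an assumption based on the package list in this thread.
devel_pkg_for() {
    case $1 in
        *smp)     echo "kernel-smp-devel-${1%smp}" ;;
        *hugemem) echo "kernel-hugemem-devel-${1%hugemem}" ;;
        *)        echo "kernel-devel-$1" ;;
    esac
}

# has_matching_devel KERNEL_VERSION
# Reads a package list (one name per line) on stdin and succeeds only
# if the expected devel package appears as an exact line.
has_matching_devel() {
    grep -qxF "$(devel_pkg_for "$1")"
}
```

Usage would be along the lines of `rpm -qa | has_matching_devel "$(uname -r)" || echo "missing $(devel_pkg_for "$(uname -r)")"`.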
Matti_Kurkela
Honored Contributor

Re: Problems Reinstalling serviceguard for linux A.11.16

"Worth a try" is definitely an understatement - if you're using a SMP kernel with ServiceGuard, you *must* install the kernel-smp-devel package matching your kernel version.

MK
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

Thanks to both of you.
The problem was indeed related to the fact that kernel-smp and kernel-smp-devel were not at the same level. After updating the two packages to version 2.6.9-67.0.7.ELsmp I was able to successfully carry out the installation of identd and serviceguard and to start the identd service on one node.
I am about to do the same thing on the second one.
I think from now on things will go on smoothly.
Thank you VERY MUCH!! Matti and Mr "Serviceguard" for your precious support.
Ouattara
Advisor

Re: Problems Reinstalling serviceguard for linux A.11.16

After the last two pieces of advice I was able to solve the problem I had installing Serviceguard. I have yet to configure it again.