Error while extending a logical volume

BhanuI · ‎09-05-2011

Hi All,

Today we tried to extend a logical volume that belongs to a cluster package, but unfortunately we were not able to do so. We then reverted from the snap clone, but we now get the error messages below when bringing the package up.

Sep 5 07:19:29 - Node "qa1crmapp4": Starting rmtab synchronization process
Sep 5 07:19:29 - Node "qa1crmapp4": Adding IP address xxxxxxxxxxxxxx to subnet xxxxxxxxxxxxxxxx
Found duplicate PV PefOptNSsSbTG9Vcel21eIT18W7Ry0AG: using /dev/sdf1 not /dev/sdd1
umount2: Invalid argument
umount: /siebel_fs: not mounted
mount: can't find /siebel_fs in /etc/fstab or /etc/mtab
umount2: Invalid argument
umount: /siebel_fs: not mounted
mount: can't find /siebel_fs in /etc/fstab or /etc/mtab
ERROR: Function customer_defined_run_cmds; Failed to RUN customer commands
Sep 5 07:19:32 - Node "qa1crmapp4": Remove IP address xxxxxxxxxxx from subnet xxxxxxxxxxxxxx
Sep 5 07:19:32 - Node "qa1crmapp4": Stoping rmtab synchronization process
Sep 5 07:19:32 - Node "qa1crmapp4": Unexporting filesystem on *:/siebel_fs
Found duplicate PV PefOptNSsSbTG9Vcel21eIT18W7Ry0AG: using /dev/sdf1 not /dev/sdd1
Sep 5 07:19:32 - Node "qa1crmapp4": Unmounting filesystem on /siebel_fs
Sep 5 07:19:32 - Node "qa1crmapp4": Deactivating volume group vgpkg_nfs
Attempting to deltag to vg vgpkg_nfs...
deltag was successful on vg vgpkg_nfs.
###### Node "qa1crmapp4": Package start FAILED at Mon Sep 5 07:19:32 AST 2011 ######

Matti_Kurkela · ‎09-06-2011

What's your Serviceguard version? And the name and version of your Linux distribution?

Please run these commands and show the output:

uname -a
cat /etc/*release /etc/*version
cmversion

Extending an LV and its filesystem should have no direct effect on the ability to mount/unmount the filesystem, unless you're reaching the maximum filesystem size for your filesystem type & system architecture.

The error messages indicate the package is trying to unmount "/siebel_fs" while starting up the package, and detects a problem because /siebel_fs is not mounted at the moment. Unmounting filesystems while starting the package does not quite make sense.

The unmount command is being run by the customer_defined_run_cmds section of your Serviceguard package control script; can you please show that part of the script? If that script calls other scripts, it might be necessary to examine those scripts too.

When the customer_defined_run_cmds section fails, Serviceguard will automatically unmount the package filesystem(s), deactivate the volume group(s) and delete the VG tags used for VG protection on Serviceguard for Linux. All this seems normal.

The messages also include information about unexporting the /siebel_fs filesystem and about a "rmtab synchronization process"... I guess this package might be using some Serviceguard toolkit or extension?

The failed unmounting of /siebel_fs might be caused by the Serviceguard extension (probably attempting to exchange one filesystem for another), but I'd expect any Serviceguard extension to be prepared for the non-existence of the expected filesystems. But it might be a local customization too. In that case, you may have to read the script and try to understand what it is supposed to do. If someone has changed the customer_defined_run_cmds part of the package control script recently, you might want to find that person and ask him/her what these commands are trying to achieve.

(Remember that the package control scripts are not automatically synchronized to all the cluster nodes: you should compare the scripts in each node. A common Serviceguard admin mistake is to update the package control script in one node only. If the package failover is not tested after such a modification, this error may remain unnoticed until there is a problem that requires failover.)
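One way to spot a control script that was updated on one node only is to compare checksums across the nodes. The following is only a minimal sketch: the function name report_mismatch and its "node checksum" input format are made up for illustration and are not part of Serviceguard, and the node names and path in the commented usage are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: reads lines of "<node> <checksum>" on stdin and
# prints every node whose checksum differs from the first line's checksum.
# Returns 0 if all checksums match, 1 otherwise.
report_mismatch() {
  awk '
    NR == 1   { ref = $2 }
    $2 != ref { print "MISMATCH on " $1; bad = 1 }
    END       { exit bad ? 1 : 0 }
  '
}

# Possible way to feed it (node names and path are placeholders):
# for node in nodeA nodeB; do
#   echo "$node $(ssh "$node" md5sum /path/to/package/control.script | cut -d' ' -f1)"
# done | report_mismatch
```

A non-zero return code here only says the copies differ; you would still need to diff the files to see which node has the version you want.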

Thanks to HP's documentation reorganization (and probably also because Serviceguard for Linux is a discontinued product), I could not quickly find the Serviceguard for Linux NFS extension documentation. The HP-UX version of the same documentation suggests the NFS extension might be relying on autofs for some of its functionality. If your autofs configuration includes references to /siebel_fs, you might want to double-check that autofs is running correctly.

MK

BhanuI · ‎09-06-2011

Hi MK,

Thank you for the reply.

# uname -a

Linux qa1crmapp4 2.6.9-78.0.1.ELhugemem #1 SMP Tue Jul 22 18:23:25 EDT 2008 i686 i686 i386 GNU/Linux

[root@xxxxxxxxx nfs_siebel]# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 7)

[root@xxxxxxxx nfs_siebel]# cmversion
A.11.18.00

The error messages indicate the package is trying to unmount "/siebel_fs" while starting up the package, and detects a problem because /siebel_fs is not mounted at the moment. Unmounting filesystems while starting the package does not quite make sense.

As said above, while starting the package the script checks whether /siebel_fs is mounted, and if it is mounted, it tries to unmount it. So that's the reason we got the alert.

The unmount command is being run by the customer_defined_run_cmds section of your Serviceguard package control script; can you please show that part of the script? If that script calls other scripts, it might be necessary to examine those scripts too.

Please find the script below:

#!/bin/bash
for host in qa1crmapp1 qa1crmapp2
do
ssh $host /bin/umount -f /siebel_fs
sleep 1
ssh $host /bin/mount /siebel_fs
done

Here, when the cluster is down, this NFS export is also down. So trying to unmount first is fine, but the mount that follows fails, even though the log file shows the filesystem was successfully exported from the cluster server.

The main problem is: why is it trying to mount from /etc/fstab? The mount should come from the cluster server.

Please also find the main script for my cluster package.

Matti_Kurkela · ‎09-06-2011

Ah, I think I understand now.

The script you copy/pasted in your last post was /usr/local/cmcluster/conf/nfs_siebel/remount.sh, right?

So the umount/mount error messages were produced by remote umount/mount commands that were running on qa1crmapp1 and qa1crmapp2.

The remount.sh script essentially runs these commands:

ssh qa1crmapp1 /bin/umount -f /siebel_fs  # forced unmount
sleep 1
ssh qa1crmapp1 /bin/mount /siebel_fs
ssh qa1crmapp2 /bin/umount -f /siebel_fs  # forced unmount
sleep 1
ssh qa1crmapp2 /bin/mount /siebel_fs

I think the problem is that the remount.sh is too simple: it does not check if /siebel_fs is already unmounted, causing unnecessary errors in the unmount step. And it does not supply all the parameters to the remote mount command, so the mount command will have to look into /etc/fstab for more information. This will cause problems if the /siebel_fs mount point is not specified in /etc/fstab on qa1crmapp1 and qa1crmapp2.
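The "already unmounted" check mentioned above could look something like this. It is only a sketch under assumptions: is_mounted is a made-up helper name, it relies on the usual /proc/mounts layout (mount point in the second field), and it would have to exist on the NFS client itself (or be inlined into the ssh command) to be of use to remount.sh.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given directory appears as a mount
# point in a mounts table (default: /proc/mounts on Linux).
# The optional second argument exists only so the function can be tried
# against a sample file instead of the live table.
is_mounted() {
  local dir=$1 table=${2:-/proc/mounts}
  awk -v d="$dir" '$2 == d { found = 1 } END { exit found ? 0 : 1 }' "$table"
}

# Intended use on a client: unmount only when there is something to unmount.
# if is_mounted /siebel_fs; then /bin/umount -f /siebel_fs; fi
```

With a check like this the "not mounted" errors disappear from the package log, but the mount step still needs the full set of parameters if /etc/fstab has no entry for /siebel_fs.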

Because remount.sh contains no "return" or "exit" command at the end, the result code of the last command executed in remount.sh becomes the result code of the entire remount.sh script. The keyword "done" is not a command: it's only the last part of the "for ... in ...; do ...; done" clause.

So the result code of remount.sh will be the same as the result code of the "ssh qa1crmapp2 /bin/mount /siebel_fs" command. According to the log in your original post, it failed... so the "mount" command on qa1crmapp2 returned a non-zero result code, which was reported to this remount.sh script on qa1crmapp4 by the ssh command, and then to the package control script. And then the package control script "thought" something serious had gone wrong in the package start-up, and stopped the package.
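The result-code behaviour described above is easy to reproduce without any cluster. In this small sketch the function names and the hostA/hostB placeholders are invented for the demonstration; `true` and the `[ ... ]` test merely stand in for the umount and mount steps.

```shell
#!/bin/bash
# Stand-in for the original remount.sh: the loop body ends with a command
# that fails on the last iteration, and nothing follows the loop.
loop_without_exit() {
  local host
  for host in hostA hostB; do
    true                      # stands in for the umount step
    [ "$host" != "hostB" ]    # stands in for mount; fails for hostB
  done
}

# Same loop, but with an explicit success status at the end.
loop_with_exit() {
  local host
  for host in hostA hostB; do
    true
    [ "$host" != "hostB" ]
  done
  return 0
}

loop_without_exit && status=0 || status=$?
echo "without explicit exit: $status"
loop_with_exit && status=0 || status=$?
echo "with explicit exit: $status"
```

The first line printed reports status 1 (the failure of the last command in the last iteration), the second reports 0. That difference is exactly what decides whether the package control script treats the customer commands as failed.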

You should probably modify the remount.sh script like this:

#!/bin/bash
for host in qa1crmapp1 qa1crmapp2
do
  # make an obvious log message
  echo "Running a remount.sh operation on $host"
  ssh $host /bin/umount -f /siebel_fs 2>/dev/null # error messages ignored, this command may fail
  sleep 1
  # the full mount command with the package IP and all the required options here
  ssh $host /bin/mount -t nfs -o hard,intr 172.18.133.15:/siebel_fs /siebel_fs
  # then make a log message according to the result.
  if [ $? -ne 0 ]
  then
    echo "The remount.sh operation on $host failed. Continuing anyway..."
  else
    echo "The remount.sh operation on $host was successful."
  fi
done

# The previous mount commands may have produced errors, but those errors happened on remote hosts.
# They should not stop the package from starting. So always exit this script with a
# "OK" return code.

exit 0

The "-o hard,intr" options in the mount command may or may not be appropriate for you; choose the NFS mount options as required by your application.

MK

BhanuI · ‎09-08-2011

Hi MK,

Thanks a ton. We will try this and update you soon.
