Urgent

S.S.
Super Advisor

Urgent

Hi All,

I have newly created the following filesystems, and these are their /etc/fstab entries:

/dev/test3working/test3working /test3working vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4arch/test4arch /test4arch vxfs delaylog 0 2
/dev/test4db1/test4db1 /test4db1 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db2/test4db2 /test4db2 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db3/test4db3 /test4db3 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db4/test4db4 /test4db4 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4exe/test4exe /test4exe vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4redo/test4redo /test4redo vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4arcd/test4arcd /test4arcdata vxfs rw,suid,largefiles,delaylog,detainlog 0 2

On Feb 27th the system rebooted on its own:
11:00 Sat Feb 27 2010. Reboot after panic: SafetyTimer expired, isr.ior = 0'4240003.0'45514b68

After the reboot, the filesystems I had newly created were not mounted automatically, even though I had added them to /etc/fstab.

Also, for the panic reboot I found the following entries in the old syslog:
Feb 27 10:09:42 oradb1 sshd[11994]: Accepted password for RB03553 from 10.1.236.10 port 1789 ssh2
Feb 27 10:09:57 oradb1 su: + 1 RB03553-root
Feb 27 10:27:28 oradb1 sshd[13179]: Failed password for RB03553 from 10.1.236.10 port 3854 ssh2
Feb 27 10:27:33 oradb1 sshd[13179]: Accepted password for RB03553 from 10.1.236.10 port 3854 ssh2
Feb 27 10:28:12 oradb1 su: + 0 RB03553-root
Feb 27 10:29:02 oradb1 cmsrvassistd[12002]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmsrvassistd[12002]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Feb 27 10:29:02 oradb1 cmlvmd[12004]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmlvmd[12004]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
Feb 27 10:29:02 oradb1 cmclconfd[11998]: The Serviceguard daemon, /usr/lbin/cmcld[11999], died upon receiving signal number 9.
Feb 27 10:29:02 oradb1 cmtaped[12005]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmlvmd[12004]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmtaped[12005]: cmtaped terminating. (ATS 1.14)
Feb 27 10:29:02 oradb1 cmlvmd[12004]: CLVMD exiting
Feb 27 10:29:02 oradb1 cmlvmd[12004]: CLVMD exiting
Feb 27 10:30:32 oradb1 su: + tty?? root-oratest3

Could the reboot be due to these Serviceguard errors?

Please help as soon as possible.

Thanks!
15 REPLIES
Kapil Jha
Honored Contributor

Re: Urgent

The Serviceguard daemon was not able to contact the other node, and it aborted the connection.

What happened there: did only one machine go down, or both?

Why are you connecting the cluster issue with the fstab issue? If a file system is in the cluster, it should not be in fstab.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Michael Steele_2
Honored Contributor

Re: Urgent

You know absolutely nothing about MC/ServiceGuard. You should stop what you are doing and take the class, or hire someone else to do the work.

None, I repeat none, of the file systems used by MC/SG GO INTO /ETC/FSTAB.

That you are even using /etc/fstab says everything.

Give it up.
Support Fatherhood - Stop Family Law
S.S.
Super Advisor

Re: Urgent

Hi,

I checked the /etc/rc.log file, and here are the errors reported for the filesystems that failed to mount automatically:

vxfs mount: illegal -o suboption -- rw
vxfs mount: Usage:
mount [-l] [-v|-p]
mount [-F vxfs] [-eQ] -a
mount [-F vxfs] [-eQrV]
[-o [rw|ro] [suid|nosuid] [quota] [remount]
[log|delaylog|tmplog] [largefiles|nolargefiles]
[fsetname=fileset] [qio|noqio] [logiosize=iosize]
[mincache=direct|dsync|closesync|tmpcache|unbuffered]
[convosync=direct|dsync|closesync|delay|unbuffered]
[datainlog|nodatainlog] [blkclear] ] {special | mount_point}
mount [-F vxfs] [-eQrV]
[-o [rw|ro] [suid|nosuid] [quota] [remount]
[log|delaylog|tmplog] [largefiles|nolargefiles]
[fsetname=fileset] [qio|noqio] [logiosize=iosize]
[mincache=direct|dsync|closesync|tmpcache|unbuffered]
[convosync=direct|dsync|closesync|delay|unbuffered]
[datainlog|nodatainlog] [blkclear] ] special mount_point
mount [-F vxfs] [-eQrV]
[-o [ro] [suid|nosuid] [snapof=primary_special]
[snapsize=blocks] ] special mount_point
mountall: cannot mount /dev/test4arcd/test4arcd
mountall: diagnostics from mount

Please help me correct this so that the filesystems mount automatically.

Thanks!
S.S.
Super Advisor

Re: Urgent

Hi,

Yes Michael, as you said, these filesystems are not configured in the cluster.

The errors I posted above are what I found for the failed automatic mount.
Can you suggest how I should proceed to correct this?

Also, could someone help me understand why the reboot happened?

Waiting for your quick response.

Thanks!
Kapil Jha
Honored Contributor

Re: Urgent

>>>vxfs mount: illegal -o suboption -- rw

This suggests there is some issue with the fstab entries you made.
Can you try to mount them manually (just to check whether that works)?

Why did the system reboot?
Open a case with HP; there can be any number of reasons for this.
From your post it seems to be some cluster issue.
Is your cluster working fine now?

BR,
Kapil+
I am in this small bowl, I wane see the real world......
S.S.
Super Advisor

Re: Urgent

Hi Kapil,

Thank you!

I tried mounting the filesystems manually and that worked fine.

About the cluster issue:

The secondary node is down.
NODE      STATUS    STATE
oradb1    down      failed


Thanks & Regards


Michael Steele_2
Honored Contributor

Re: Urgent

Sigh. Question: Are you up and running on the other node?

As far as debugging this problem goes, refer to this message and search the ITRC for related Panics, like this one.

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=717943

11:00 Sat Feb 27 2010. Reboot after panic: SafetyTimer expired, isr.ior = 0'4240003.0'45514b68

Find your cluster lock disk and verify that it hasn't failed.

Check the node timeout value and consider raising it. It isn't that simple, though, so review the procedure first.

Refer to this manual.

http://docs.hp.com/en/B3936-90135/B3936-90135.pdf

CALL HP
Support Fatherhood - Stop Family Law
S.S.
Super Advisor

Re: Urgent

Hi All,

Just now my colleague told me that he had done something with the cluster, and that is what caused the reboot.

I have now started the node manually and it is working fine.

One thing I still want to know about the automatic mounting: what changes should I make to correct it?

Thank you Kapil and Michael.
Kapil Jha
Honored Contributor
Solution

Re: Urgent

I suppose if it's working fine with a manual mount, just check the fstab file and populate it correctly.

Again, I hope it's not part of a cluster package :)

Edit the fstab, and to check just do a mount -a
BR,
Kapil+
I am in this small bowl, I wane see the real world......
S.S.
Super Advisor

Re: Urgent

Hi Kapil,

Please see the attached file.

When creating the new filesystems I used the following command.
For example:

# newfs -F vxfs -o largefiles /dev/test4db2/rtest4db2


Thanks & Regards
Kapil Jha
Honored Contributor

Re: Urgent

If your file system mounts fine manually, then there is no error in how it was created.
The issue is with mounting it automatically when the system reboots, and the fstab file is what is responsible for that.

I suspect a stray whitespace character got into this file when you added the new FS.
So I would suggest removing the entries for these file systems from fstab and re-entering them carefully.

To check, you may have to umount the already mounted FS.
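
For anyone wanting to test the stray-character theory without retyping every entry, here is a minimal Python sketch. It is not an HP-UX tool: the function name and the sample lines are made up for illustration, and it simply reports any character that is not printable ASCII, a space, or a tab (for example a carriage return left by a DOS editor, or a non-breaking space pasted from a web page).

```python
def suspicious_characters(fstab_text):
    """Return (line_number, column, char) for every character that is not
    printable ASCII, a space, or a tab. Hypothetical helper for illustration."""
    findings = []
    # Split on "\n" only, so a trailing carriage return stays on its line
    # and gets reported instead of being swallowed as a line break.
    for number, line in enumerate(fstab_text.split("\n"), start=1):
        for column, char in enumerate(line, start=1):
            if char in (" ", "\t"):
                continue
            if not (33 <= ord(char) <= 126):
                findings.append((number, column, char))
    return findings


# Made-up sample: line 1 ends with a carriage return, line 2 contains a
# non-breaking space between the device and the mount point.
sample = (
    "/dev/test4db2/test4db2 /test4db2 vxfs delaylog 0 2\r\n"
    "/dev/test4db3/test4db3\u00a0/test4db3 vxfs delaylog 0 2\n"
)

for number, column, char in suspicious_characters(sample):
    print(f"line {number}, column {column}: {char!r}")
```

To run it against the real file, read the contents of /etc/fstab into a string and pass that in. An empty result only rules out odd characters; it says nothing about whether the option names themselves are spelled correctly.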

BR,
Kapil+
I am in this small bowl, I wane see the real world......
S.S.
Super Advisor

Re: Urgent

Hi Kapil,

Thank you.

Thanks to you I read the lines in /etc/fstab carefully and found that I had mistyped the word datainlog as detainlog.

Thanks for your great help! :-)
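
A typo like this can be caught before the next reboot by comparing each option against the suboptions printed in the vxfs mount usage message quoted earlier in this thread. The following is a minimal Python sketch under that assumption. The function name is made up, and the option lists are copied only from that usage output, so anything else (including generic fstab keywords that the usage text does not mention) will be flagged and needs a manual look.

```python
# Suboptions copied from the vxfs mount usage message quoted in this thread.
# Assumption: this list is complete enough for the entries being checked.
VXFS_FLAGS = {
    "rw", "ro", "suid", "nosuid", "quota", "remount",
    "log", "delaylog", "tmplog", "largefiles", "nolargefiles",
    "qio", "noqio", "datainlog", "nodatainlog", "blkclear",
}
VXFS_KEYED = {"fsetname", "logiosize", "mincache", "convosync", "snapof", "snapsize"}


def unknown_vxfs_options(fstab_text):
    """Return (line_number, option) pairs for vxfs options not in the lists above.
    Hypothetical helper for illustration."""
    problems = []
    for number, line in enumerate(fstab_text.splitlines(), start=1):
        fields = line.split()
        # Skip blank lines, comments, short lines, and non-vxfs entries.
        if len(fields) < 4 or fields[0].startswith("#") or fields[2] != "vxfs":
            continue
        for option in fields[3].split(","):
            name = option.split("=", 1)[0]
            if name not in VXFS_FLAGS and name not in VXFS_KEYED:
                problems.append((number, option))
    return problems


# Two of the entries from the first post in this thread.
sample = """\
/dev/test4db1/test4db1 /test4db1 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4arch/test4arch /test4arch vxfs delaylog 0 2
"""

print(unknown_vxfs_options(sample))  # prints [(1, 'detainlog')]
```

Note that only the entry without detainlog in the original post (test4arch) has nothing flagged, which matches the rc.log output complaining about the other filesystems.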



S.S.
Super Advisor

Re: Urgent

Thank you!
Kapil Jha
Honored Contributor

Re: Urgent

haha....I also checked for a misspelling but I could not see that one :)
You have a good eye.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Steve Post
Trusted Contributor

Re: Urgent

um......
correct me if I am wrong.....

I thought that when you use MC/ServiceGuard you never put a filesystem into the fstab file if it is part of a package, and that there is a process within MC/ServiceGuard that mounts the file systems that are part of the package.

If ServiceGuard was a person it would be saying "HEY! Who the heck already mounted this stuff? That's my job. Those filesystems are supposed to be on the alternate node right now."