SOLVED
S.S.
Super Advisor

Urgent

Hi All,

I recently created the following filesystems, with these /etc/fstab entries:

/dev/test3working/test3working /test3working vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4arch/test4arch /test4arch vxfs delaylog 0 2
/dev/test4db1/test4db1 /test4db1 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db2/test4db2 /test4db2 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db3/test4db3 /test4db3 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4db4/test4db4 /test4db4 vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4exe/test4exe /test4exe vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4redo/test4redo /test4redo vxfs rw,suid,largefiles,delaylog,detainlog 0 2
/dev/test4arcd/test4arcd /test4arcdata vxfs rw,suid,largefiles,delaylog,detainlog 0 2

On Feb 27th there was an auto reboot.
11:00 Sat Feb 27 2010. Reboot after panic: SafetyTimer expired, isr.ior = 0'4240003.0'45514b68

After the reboot, these newly created filesystems were not mounted automatically, even though I had added them to /etc/fstab.

Also,
For the panic reboot, I found the following entries in the old syslog:
Feb 27 10:09:42 oradb1 sshd[11994]: Accepted password for RB03553 from 10.1.236.10 port 1789 ssh2
Feb 27 10:09:57 oradb1 su: + 1 RB03553-root
Feb 27 10:27:28 oradb1 sshd[13179]: Failed password for RB03553 from 10.1.236.10 port 3854 ssh2
Feb 27 10:27:33 oradb1 sshd[13179]: Accepted password for RB03553 from 10.1.236.10 port 3854 ssh2
Feb 27 10:28:12 oradb1 su: + 0 RB03553-root
Feb 27 10:29:02 oradb1 cmsrvassistd[12002]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmsrvassistd[12002]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Feb 27 10:29:02 oradb1 cmlvmd[12004]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmlvmd[12004]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
Feb 27 10:29:02 oradb1 cmclconfd[11998]: The Serviceguard daemon, /usr/lbin/cmcld[11999], died upon receiving signal number 9.
Feb 27 10:29:02 oradb1 cmtaped[12005]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmlvmd[12004]: The cluster daemon aborted our connection.
Feb 27 10:29:02 oradb1 cmtaped[12005]: cmtaped terminating. (ATS 1.14)
Feb 27 10:29:02 oradb1 cmlvmd[12004]: CLVMD exiting
Feb 27 10:29:02 oradb1 cmlvmd[12004]: CLVMD exiting
Feb 27 10:30:32 oradb1 su: + tty?? root-oratest3

Could the reboot be due to these Serviceguard errors?

Kindly help me soon.

Thanks!
Kapil Jha
Honored Contributor

Re: Urgent

The Serviceguard daemon was not able to contact the other node, so it aborted the connection.

What happened there: did only one machine go down, or both?

Also, why are you connecting the cluster issue with the fstab issue? If a filesystem belongs to a cluster package, it should not be in /etc/fstab.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Michael Steele_2
Honored Contributor

Re: Urgent

You know absolutely nothing about MC/ServiceGuard. You should stop what you are doing and take the class, or hire someone else to do the work.

None, I repeat none, of the filesystems used by MC/SG go into /etc/fstab.

That you are even using /etc/fstab says everything.

Give it up.
Support Fatherhood - Stop Family Law
S.S.
Super Advisor

Re: Urgent

Hi,

I checked the /etc/rc.log file; here are the errors reported for the filesystems that fail to mount automatically:

vxfs mount: illegal -o suboption -- rw
vxfs mount: Usage:
mount [-l] [-v|-p]
mount [-F vxfs] [-eQ] -a
mount [-F vxfs] [-eQrV]
[-o [rw|ro] [suid|nosuid] [quota] [remount]
[log|delaylog|tmplog] [largefiles|nolargefiles]
[fsetname=fileset] [qio|noqio] [logiosize=iosize]
[mincache=direct|dsync|closesync|tmpcache|unbuffered]
[convosync=direct|dsync|closesync|delay|unbuffered]
[datainlog|nodatainlog] [blkclear] ] {special | mount_point}
mount [-F vxfs] [-eQrV]
[-o [rw|ro] [suid|nosuid] [quota] [remount]
[log|delaylog|tmplog] [largefiles|nolargefiles]
[fsetname=fileset] [qio|noqio] [logiosize=iosize]
[mincache=direct|dsync|closesync|tmpcache|unbuffered]
[convosync=direct|dsync|closesync|delay|unbuffered]
[datainlog|nodatainlog] [blkclear] ] special mount_point
mount [-F vxfs] [-eQrV]
[-o [ro] [suid|nosuid] [snapof=primary_special]
[snapsize=blocks] ] special mount_point
mountall: cannot mount /dev/test4arcd/test4arcd
mountall: diagnostics from mount

Kindly help me correct this so that the filesystems mount automatically.

Thanks!
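An editorial side note, hedged: comparing the fstab entries at the top of the thread with the usage text that mount printed, every option they use appears in the supported list except `detainlog`, which looks like a typo for the documented `datainlog`. (The error names `rw` as the illegal suboption, but the first token mount rejects is not always the real culprit, so `detainlog` is worth checking first.) A minimal sketch of the correction, shown on a sample line rather than the live /etc/fstab:

```shell
# Hypothetical fix: swap the invalid 'detainlog' option for the
# documented 'datainlog'. Demonstrated on one sample fstab line;
# on the real host this would be applied to /etc/fstab after
# taking a backup (e.g. cp /etc/fstab /etc/fstab.bak).
line='/dev/test4db1/test4db1 /test4db1 vxfs rw,suid,largefiles,delaylog,detainlog 0 2'
fixed=$(printf '%s\n' "$line" | sed 's/detainlog/datainlog/')
printf '%s\n' "$fixed"

# After editing the real file, verify that every entry mounts cleanly:
#   mount -a
```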
S.S.
Super Advisor

Re: Urgent

Hi,

Yes Michael, as you said, these filesystems are not configured in the cluster.

I found the errors posted above when they failed to mount automatically.
Can you suggest how I can proceed to correct this?

Also, would someone help me understand why the reboot happened?

Waiting for your quick response.

Thanks!
Kapil Jha
Honored Contributor

Re: Urgent

>>>vxfs mount: illegal -o suboption -- rw

This suggests there is some issue with the fstab entries you made.
Can you try to mount them manually (just to check whether that works)?

Why did the system reboot?
Open a case with HP; there can be any number of reasons for this.
From your post it seems like a cluster issue.
Is your cluster working fine now?

BR,
Kapil+
I am in this small bowl, I wane see the real world......
S.S.
Super Advisor

Re: Urgent

Hi Kapil,

Thank you!

I tried mounting the filesystems manually and they worked fine.

About cluster issue:

The secondary node is down:
NODE STATUS STATE
oradb1 down failed


Thanks & Regards


Michael Steele_2
Honored Contributor

Re: Urgent

Sigh. Question: Are you up and running on the other node?

As far as debugging this problem goes, search the ITRC for related panics, like this one:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=717943

11:00 Sat Feb 27 2010. Reboot after panic: SafetyTimer expired, isr.ior = 0'4240003.0'45514b68

Find your cluster lock disk and verify that it hasn't failed.

Look at the node timeout value and consider raising it; it isn't that simple, so review the procedure first.

Refer to this manual.

http://docs.hp.com/en/B3936-90135/B3936-90135.pdf

CALL HP
Support Fatherhood - Stop Family Law
S.S.
Super Advisor

Re: Urgent

Hi All,

Just now my colleague told me he had done something with the cluster, and that caused the reboot.

I have now started the node manually and it is working fine.

One thing I still want to know about the automatic mounting: what changes should I make to correct it?

Thank you Kapil and Michael.
Kapil Jha
Honored Contributor
Solution

Re: Urgent

I suppose that if it's working fine with a manual mount, you just need to check and populate the fstab file correctly.

I again hope that it's not part of a cluster package :)

Edit the fstab and, to check it, just do a mount -a.
BR,
Kapil+
I am in this small bowl, I wane see the real world......
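To illustrate the accepted advice (fix /etc/fstab, then test with `mount -a`), a hedged shell sketch; the mount point is one of this thread's examples, and the `mount -a`/`bdf` steps are shown as comments because they only make sense on the live host:

```shell
# Sanity-check a corrected fstab entry: the invalid 'detainlog'
# option should no longer appear anywhere in the line.
entry='/dev/test4db1/test4db1 /test4db1 vxfs rw,suid,largefiles,delaylog,datainlog 0 2'
if printf '%s\n' "$entry" | grep -q 'detainlog'; then
    echo 'entry still contains the invalid option'
else
    echo 'entry looks clean'
fi

# On the host itself, one would then run:
#   mount -a        # mount everything listed in /etc/fstab
#   bdf /test4db1   # confirm the filesystem is actually mounted
```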