Operating System - HP-UX

What happened to my reboot?

 
Bryan D. Quinn
Respected Contributor

What happened to my reboot?

Ok, this is in hopes that somebody knows more about the smoke and mirrors than I do.

I have an N-4000 server that took about 55 minutes to reboot last night. So basically it missed its bus, so to speak, because my PRD server tried to restart SAP on it and the box was not up yet. My question is this:
Does anybody know of any way of finding out what went on between the last entry in OLDsyslog.log and the first entry in syslog.log, other than watching the console?

I am just trying to find out what made it take 55 minutes to reboot, when it should have been around 10 minutes at the most. I have checked /etc/shutdownlog. I also have a log for the script that does the reboot, and it showed all of the shutdown information that you would normally see on the console. It went all the way to the last portion, which says "Transition to run-level 0 complete" "Shutdown at 19:14 (in 0 minutes) reboot:"

If anybody knows of something I am missing, please let me know.

Thanks,
-Bryan
Pete Randall
Outstanding Contributor

Re: What happened to my reboot?

Bryan,

I would check the GSP logs. If there are alerts issued during the initial bootup (for example, we get a memory error because the SIMMs are unbalanced between the memory carriers), there will be a delay until it times out and continues. Maybe you had multiple, recoverable GSP issues.


Pete
Leif Halvarsson_2
Honored Contributor

Re: What happened to my reboot?

Hi,
Anything in /etc/rc.log?

I would suspect a problem with a filesystem that caused a full fsck to run.
Stefan Farrelly
Honored Contributor

Re: What happened to my reboot?

What's the timestamp on /etc/rc.log? That is when the system startup finished. Also, the start of /etc/rc.log says when it started. Occasionally the rc process on our servers can take a long time, usually due to a network problem or SAN problem.
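
For what it's worth, here is a minimal sketch of reading that modification time from a script rather than from `ls -l`. It assumes Python is available on a machine that can see the file (it is not part of a stock HP-UX install), and `last_modified` is just an illustrative helper name, not an HP-UX tool.

```python
import os
from datetime import datetime

def last_modified(path: str) -> datetime:
    """Return the file's last-modification time as a local datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path))

# /etc/rc.log is the path mentioned in this thread; skip quietly if absent.
rc_log = "/etc/rc.log"
if os.path.exists(rc_log):
    print("rc.log last written:", last_modified(rc_log).strftime("%Y-%m-%d %H:%M:%S"))
```

Comparing this value with the start time recorded at the top of the file gives a rough duration for the rc startup phase only; it says nothing about time spent before rc began.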
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Bryan D. Quinn
Respected Contributor

Re: What happened to my reboot?

Pete-

I did check the GSP logs, but I did not see anything that was from that time. I might check over that again a little more carefully. Thanks!

Stefan-

The time stamp on the start of the rc.log is 20:09, which corresponds with the time stamp on the first entry of the syslog.log.

So my reboot started at 19:14 and my rc.log and syslog.log both have a start of 20:09. I will check the GSP again, a little more carefully.
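
As a quick sanity check on that gap, a small sketch of the arithmetic, assuming both times are local wall-clock times less than 24 hours apart (`elapsed_minutes` is a made-up helper name):

```python
from datetime import datetime, timedelta

def elapsed_minutes(start_hhmm: str, end_hhmm: str) -> int:
    """Minutes from start to end, assuming end is within 24 hours after start."""
    fmt = "%H:%M"
    start = datetime.strptime(start_hhmm, fmt)
    end = datetime.strptime(end_hhmm, fmt)
    if end < start:
        # The end time rolled past midnight.
        end += timedelta(days=1)
    return int((end - start).total_seconds() // 60)

# Shutdown message at 19:14, first rc.log/syslog.log entry at 20:09.
print(elapsed_minutes("19:14", "20:09"))  # 55
```

So the whole 55 minutes falls between the run-level 0 message and the start of rc, which is the window none of these logs cover.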

Thanks,
-Bryan
Marco Santerre
Honored Contributor

Re: What happened to my reboot?

Though I can't offer much help per se, as already mentioned, if the server had any kind of network trouble on its way down, it could have taken a little longer to close everything "properly" before it actually went into its reboot sequence, which would be why you don't see anything in /etc/rc.log or in the GSP logs. I'm not implying that there was actually something wrong with your server in the first place, but maybe it had trouble shutting down one of the processes and it took a while to bring it down. NFS does come to mind as well.
Cooperation is doing with a smile what you have to do anyhow.
Pete Randall
Outstanding Contributor

Re: What happened to my reboot?

Bryan,

Note that GSP log timestamps are in GMT - you'll have to translate.
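
A rough sketch of that translation, in case it helps. The timestamp format, the date, and the UTC offset below are all assumptions for illustration (the real GSP log layout may differ, and the offset depends on your site and on daylight saving time); `gsp_to_local` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def gsp_to_local(gmt_stamp: str, utc_offset_hours: float) -> datetime:
    """Convert a GMT timestamp string to local time at a fixed UTC offset.

    Assumes the stamp looks like 'YYYY-MM-DD HH:MM'; adjust the format
    string to match what the GSP log actually prints.
    """
    gmt = datetime.strptime(gmt_stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return gmt.astimezone(timezone(timedelta(hours=utc_offset_hours)))

# Made-up example: a site six hours behind GMT.
local = gsp_to_local("2004-01-10 01:14", -6)
print(local.strftime("%Y-%m-%d %H:%M"))  # 2004-01-09 19:14
```

Note that the local date can differ from the GMT date, so an evening reboot may show up under the next day's date in the GSP log.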


Pete
Steven E. Protter
Exalted Contributor

Re: What happened to my reboot?

Most likely your database had trouble going down or coming up.

So long as it didn't actually fail, the log files mentioned above may not note the event.

You might want to check the alert logs on the backend database for messages, because something had trouble on the way down or on the way up.

The time stamps in the /etc/rc.log file might be useful in pinpointing where you want to look (what time).

I make this suggestion because Oracle did this to me a few times last year, taking 45 minutes to shut down, with lots of disk I/O. It was cutting logs, as I recall.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Bryan D. Quinn
Respected Contributor

Re: What happened to my reboot?

Marco- Thanks, NFS had crossed my mind as well, but I have six other application servers, one of which is an identical N-4000, and none of the other boxes had any trouble. They all mount the exact same NFS mounts from our PRD box.

Pete- Thanks for the heads up on the GMT in GSP, I was not aware of that.

SEP- I thought about checking out Oracle, but it appears to have come down fine. None of the other application servers (the six others) had any problems. PRD stopped all of the SAP processes on all the app servers and then began the reboot of each. Then PRD rebooted itself, and when it came up it started SAP/Oracle and then started SAP on all the app servers. Since app server number 6 was not up, it failed when it tried to start SAP on that box. We caught it a few minutes later and manually started SAP on that box.

It looks like next weekend I will be sitting in front of the console for that box watching it reboot. That is, if it decides to do this again. It did not do it last week but did do it the week before that. Very strange situation.

Thanks,
-Bryan
Todd McDaniel_1
Honored Contributor

Re: What happened to my reboot?

On my N-4000/55, I have Oracle running on my boxes, and I occasionally have my DBAs shut down the DB so that it will not restart upon reboot... when I am doing patching...

If possible, shut down the DB and then reboot without it coming up... Then see how long the reboot takes...

If you have any other apps, try turning them off and rebooting to test it... if you have time to test it.


Also, how much attached storage do you have, if any? I had an old V-class that had a ton of disks in a SAN. I know it ran ioscan several times until I finally had a patch installed that stored the ioscan data and sped up the reboot time.
Unix, the other white meat.
Todd McDaniel_1
Honored Contributor

Re: What happened to my reboot?

One more thought: did you watch the shutdown procedure from the console?

I would watch that and see if something hangs or fails to shut down properly.
Unix, the other white meat.
Bryan D. Quinn
Respected Contributor

Re: What happened to my reboot?

Hey Todd,

No, I was at home when the server did its weekly reboot, so I did not get to see the shutdown. I did, however, capture part of the shutdown messages in my script log file. It captured everything down to the transition to run-level 0 message. Everything shut down fine. As for Oracle, I am not running Oracle on this box. It is an application server for our SAP/Oracle Production server. This has happened once before, and that was two weeks ago. So last week's reboot went fine; apparently whatever is getting "hosed" up did not get "hosed" up last weekend.
Anyways, next weekend I plan on being on-site to watch the console during the reboot.

Thanks,
-Bryan