Operating System - HP-UX

Re: Big Problem! - K420 will not fully boot after a simple reboot

 
Gavin Westermann
Frequent Advisor

Big Problem! - K420 will not fully boot after a simple reboot

I can't imagine what happened here. Last night I remotely rebooted a K420 running 11.11 and lost contact with it. I went to it, and at the console it was stuck halfway through the boot process. I rebooted to single-user mode and ioscan sees both SCSI drives, but on raising the run level to 6 I discovered that vg02 suddenly cannot be seen or is missing. Naturally the boot process hangs, as /opt is in that group along with other things. Any tips on how to recover from something like this?

Gavin
Helen French
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

I would do the following:
1) Boot system in single user mode or LVM maintenance mode.
2) Check all suspected disks for any I/O errors.
3) Check file systems with fsck.
4) Check old rc log files and see what is/was failing.
5) Try to activate vg02 manually and mount file systems.
6) Check system log files.
Life is a promise, fulfill it!
A. Clay Stephenson
Acclaimed Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Run level 6 is very unusual. If this was not a typo, I would first look for some very strange init.d commands that are triggered above run-level 3.

Look at /etc/inittab and see what the initdefault entry is set to; in almost all cases it is set to 3.
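
The initdefault check can be sketched as a small shell function. This is only an illustration: the name default_runlevel is made up here, and it assumes awk is available (which may mean /usr has to be mounted first) and that the file uses the usual colon-separated id:rstate:action:process layout.

```shell
# default_runlevel [FILE]: print the run level named by the initdefault entry.
# FILE defaults to /etc/inittab; comment lines are skipped.
default_runlevel() {
    awk -F: '$1 !~ /^#/ && $3 == "initdefault" { print $2; exit }' "${1:-/etc/inittab}"
}
```

Calling default_runlevel with no argument reads /etc/inittab; anything other than 3 would be worth a closer look.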

An ioscan -fn should indicate whether the disks themselves have disappeared or whether you have LVM errors. If a disk shows up as NO_HW then you are having physical disk problems.
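
To pick the NO_HW rows out of a long listing, a filter along these lines may help. It is a sketch only: no_hw_rows is a hypothetical name, and it assumes awk is available and that NO_HW appears as a whitespace-separated field in the ioscan output.

```shell
# no_hw_rows: read ioscan-style output on stdin and print only the rows
# in which some field is exactly NO_HW.
no_hw_rows() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "NO_HW") { print; next } }'
}
```

Typical use would be: ioscan -fn | no_hw_rows. No output means no device was reported as NO_HW.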


If it ain't broke, I can fix that.
Jeff Schussele
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,

Sun systems use run levels 5 & 6 for specialized shutdown scripting. HP doesn't. So I'm not sure just *why* you'd attempt run level 6.
Please explain.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Printaporn_1
Esteemed Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,
Check root's shell: has it been changed from /sbin/sh?
enjoy any little thing in my life
Rajesh G. Ghone
Regular Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,

It looks like a disk belonging to vg02 is having a problem. What you can do is go to single-user mode and try to activate vg02 manually. If it activates, run fsck on all the file systems that belong to that VG. If that works, I think the system should come up smoothly.

Regards,
Rajesh G.
Rajesh Ghone
Steven E. Protter
Exalted Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

I'm betting that vg02 can't get a quorum and isn't coming up.

There should be something about it in the rc.log or /var/adm/syslog/syslog.log

If the volume group is attached by fiber, check the fiber card

fcmsutil /dev/td0

the device depends on your configuration.

As far as run level goes, I typically deal with production issues with new software that doesn't want to start up right by adding a run level.

Though I prefer to do this on development machines, I did once run into a situation where I used Ignite to push out a production server and some Oracle product worked on the golden image server but not in production. I dealt with it by changing /etc/inittab and moving the start and kill scripts to run levels 4 and 3 respectively.

I don't run systems very long that way and recommend returning to a standard configuration as soon as possible.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Let's forget about the run level; it is not really an issue. I just picked something off the top of my head to get further in the boot process, and it still hangs at the same point. vg02 is definitely not happening. Someone mentioned starting it manually? How might I do that? I will have to talk someone through it, as I have no physical access to the box at the moment.
Michael Steele_2
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Gavin:

Display the disk contents of vg02.

strings /etc/lvmtab

-or-

vgcfgrestore -f /etc/lvmconf/vg02.conf -l

Test the disks for failure:

ioscan -fnkC disk | more
(* Check for CLAIMED status *)


dd if=/dev/dsk/cXtYdZ of=/dev/null count=1000000
(press Ctrl-C to interrupt)

If the prompt does not return, the disk is bad.

diskinfo /dev/dsk/cXtYdZ
Note type and size if bad, needed for replacement.

If you need vg02 up, then activate it without quorum.

vgchange -a y -q n /dev/vg02
Support Fatherhood - Stop Family Law
George Liu_2
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Given that you can still log in in single-user mode and vg02 is only for /opt, things are really not that bad. Here is what you need to do:

Log in in single-user mode.
Run ioscan to check whether the disk for vg02 is recognized. If not, work out whether the problem is in the hardware or the kernel. Have you applied any patches recently?

Run the vg commands (display, activate, and so on) to recover.
Kent Ostby
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

I would do the following:

1) boot to single user mode

2) compare your files in /dev/vg01 and /dev/vg02 (I'm assuming that you have a /dev/vg01 and it activates). There will be differences in the minor numbers to reflect that this is vg 02 and not vg 01, but the number of files per lvol and the general type of files should be the same.

3) Attempt to activate /dev/vg02 with:

vgchange -a y /dev/vg02

IF this works then the VG can activate and you probably don't have an LVM or HW problem.

IF this gives you warnings but still activates then deactivate it with
vgchange -a n /dev/vg02 .

IF this command didn't work, or you got warnings and have now deactivated it, then I would go through the process of doing the vgcfgrestore to all of the disks in /etc/lvmtab (another user above listed this).

After the vgcfgrestore has been done, redo the vgchange -a y /dev/vg02 command.

If this still fails then you may have a HW problem that you should get checked out by your HW folks.

4) Okay .. if the vgchange -a y /dev/vg02 works but you are still getting errors about vg02 not being available on a normal boot then you either have a problem in /etc/fstab or /etc/lvmrc.

Check /etc/fstab to make sure the lines included with /dev/vg02 match up there.

Also check /etc/lvmrc to see if it matches ( via ll or du ) other systems that you have.
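
As a rough aid for the /etc/fstab check, the entries that live on one volume group can be listed like this. It is a sketch, not a definitive tool: fstab_for_vg is a hypothetical name, and it assumes awk is available (likely only once /usr is mounted) and that the device is the first whitespace-separated field on each line.

```shell
# fstab_for_vg VG [FILE]: print "device mountpoint" for every fstab entry
# whose device path starts with /dev/VG/. FILE defaults to /etc/fstab.
fstab_for_vg() {
    awk -v p="/dev/$1/" '$1 !~ /^#/ && index($1, p) == 1 { print $1, $2 }' "${2:-/etc/fstab}"
}
```

For example, fstab_for_vg vg02 lists what the boot will try to mount from vg02; each device shown should correspond to a logical volume that actually exists under /dev/vg02.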

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

I had someone log in at the console in single-user mode and try some of the commands. The drives both show as CLAIMED. He tried diskinfo but it was not found; I had him give the full path, /usr/sbin/diskinfo, but it was still not found. A cd to /usr and an ls shows nothing, which leads me to believe that the contents of /usr might be in that group. I'll have to go up there myself to see this with my own eyes. The last thing I installed was ssh, with no issues; several days later I rebooted the box remotely and all this happened. Thanks for everyone's help.
A. Clay Stephenson
Acclaimed Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Note that /usr is not mounted in single-user mode unless an explicit mount /usr command is issued.
If it ain't broke, I can fix that.
Rajesh G. Ghone
Regular Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,
In single user mode you need to mount /usr

Regards,
Rajesh G.
Rajesh Ghone
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

So I spent a couple of hours in front of that machine last night and made some headway. The vg02 thing was a red herring of sorts, as vg02 was a group assigned to some external drives that have not been attached to the box for some time. vg00 and vg01 are the only two of concern here.

So, back to single-user mode: vg00 is fine, and I brought up vg01 with a command one person offered, vgchange, so thanks for that. I mounted each logical volume manually on its mount point, so now I can see everything. One by one I raised the run level until I got to 3 and things seem fine, but there is no network connectivity, and there is still an error message on the console, something to the effect of "INIT process spawning too fast, will wait 5 minutes and try again". Any thoughts on that?
Sridhar Bhaskarla
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,

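Another thing to check is that there are no problems with your startup configuration files. While you are in single-user mode, run:

. /etc/rc.config

(This sources the files under /etc/rc.config.d one after another. A plain ". /etc/rc.config.d/*" would source only the first file the wildcard expands to.)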

You should not get any errors.

If you did, then there is a problem with one of the scripts there and you will need to fix it.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Sri,

That sounds like a possibility. I had another admin install a mediator that forwards events from NNM over to CIC (Cisco Information Center). As the machine boots up I do see some reference to a startup script that refers to NCO, which is that mediator. Maybe he placed it too early in the sequence and it tries to launch too soon, hence the hang?
Sridhar Bhaskarla
Honored Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Hi Gavin,

There may be syntax errors in that script. Try this while you are in single-user mode:

for FILE in /etc/rc.config.d/*
do
echo $FILE
. $FILE
read
done

This sources each file after displaying its name, so you can find the file that is causing the issue. Keep pressing Enter until all the files have been sourced.
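
A non-interactive variant is to syntax-check each file without executing it. The sketch below is an illustration under assumptions: check_rc_config is a made-up name, and it relies on the standard -n option of sh, which parses a file without running it. It only catches syntax errors; a file that parses cleanly but launches something that never returns will still pass.

```shell
# check_rc_config [DIR]: run "sh -n" over every regular file in DIR
# (default /etc/rc.config.d) and report OK or FAIL per file.
# Returns 1 if any file fails to parse, 0 otherwise.
check_rc_config() {
    dir=${1:-/etc/rc.config.d}
    status=0
    for f in "$dir"/*
    do
        [ -f "$f" ] || continue
        if sh -n "$f" 2>/dev/null
        then
            echo "OK   $f"
        else
            echo "FAIL $f"
            status=1
        fi
    done
    return $status
}
```

Running check_rc_config with no argument checks /etc/rc.config.d; any FAIL line names a file to inspect by hand.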

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

I'll try that and see if the culprit is in there. I have rebooted this box a million times and never had this happen. One other question, if I can, apparently, raise the box to run level 3 successfully with everything mounted why does my networking not come up?
A. Clay Stephenson
Acclaimed Contributor

Re: Big Problem! - K420 will not fully boot after a simple reboot

You should have networking at run level 2. The "respawning too rapidly" message is a direct result of your network problems.


0) Carefully check network cables, switch port settings,
...

(I'm betting on a hardware/cabling problem)

1) Cd to /etc/rc.config.d

2) Carefully check netconf, hpetherconf, and probably hpbtlanconf. If any changes are made or needed, the easy method is to reboot at this point.
If it ain't broke, I can fix that.
Gavin Westermann
Frequent Advisor

Re: Big Problem! - K420 will not fully boot after a simple reboot

Solved the problem and the server is up. The other admin who had written the script to start the mediator had triggered it to start far too early in the boot sequence, causing the whole box to hang. I changed the name of the script he had placed in rc.config.d and rebooted the box, and it came up like clockwork. I spread points around to all who helped on this. A big thanks to all for the help and patience!

Gavin AT&T Canada
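
Since the culprit turned out to be a file under rc.config.d that started a program, a rough lint for that directory can be sketched as follows. This is a heuristic, not a definitive check: suspicious_rc_lines is a hypothetical name, it assumes grep -E is available (which may require /usr to be mounted), and it assumes that files in that directory normally hold only comments and variable assignments, so multi-line values or other legitimate constructs may be flagged as well.

```shell
# suspicious_rc_lines [DIR]: print file:line:text for every line in DIR
# (default /etc/rc.config.d) that is not blank, not a comment, and not a
# NAME=value, NAME[n]=value, or "export NAME=value" assignment.
suspicious_rc_lines() {
    dir=${1:-/etc/rc.config.d}
    for f in "$dir"/*
    do
        [ -f "$f" ] || continue
        # /dev/null is passed as a second file so grep always prints the file name.
        grep -n -E -v \
            -e '^[[:space:]]*(#.*)?$' \
            -e '^[[:space:]]*(export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*(\[[0-9]+\])?=' \
            "$f" /dev/null
    done
}
```

Any line it prints is something that would run as a command each time the configuration files are sourced during boot, and is worth reading before the next reboot.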