HPE 9000 and HPE e3000 Servers

Lee Walters_1
Advisor

Server Keeps rebooting

My HP server, an rp2470 (HP 9000), keeps rebooting with the same error messages.

We had a power cut on Thursday night, during which the machine went down dirty, and it hasn't come back up since.

The errors it gives me are at the bottom of this post. I've searched this forum and found the same problem a few times, but none of the help and advice given in those posts has worked.

I think the hard disk in the machine is dead, but one of you might be able to help a bit more. I can get to the GSP and ISL prompts, but that's about it. I can't boot into the OS, which is HP-UX 11.

Any help would be seriously appreciated.

Thanks

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:36:32
ALERT LEVEL: 12 = Software failure

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xA0E000C01100B000 00000000 000005E9 - type 20 = major change in system state
0x58E008C01100B000 0000680B 140E2420 - type 11 = Timestamp 12/20/2004 14:36:32
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:a
*****************************************

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:36:41
ALERT LEVEL: 3 = System blocked waiting for operator input

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xF8E000301100E000 00000000 0000E000 - type 31 = legacy PA HEX chassis-code
0x58E008301100E000 0000680B 140E2429 - type 11 = Timestamp 12/20/2004 14:36:41
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:a
*****************************************

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:37:49
ALERT LEVEL: 3 = System blocked waiting for operator input

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xF8E000301100EF00 00000000 0000EF00 - type 31 = legacy PA HEX chassis-code
0x58E008301100EF00 0000680B 140E2531 - type 11 = Timestamp 12/20/2004 14:37:49
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:
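
A note on the log entries above: the two data words on each "type 11 = Timestamp" line appear to encode the same date and time that is printed beside them. Below is a minimal Python sketch of that decoding. The byte layout (years since 1900, month counted from 0, then day, hour, minute, second) is an assumption inferred only from the three entries in this post, not taken from HP documentation, and the function name decode_timestamp is made up for illustration.

```python
from datetime import datetime


def decode_timestamp(word1: str, word2: str) -> datetime:
    """Decode the two 32-bit data words of a 'type 11 = Timestamp' entry.

    Assumed layout (inferred from the alerts in this post, not from HP docs):
      word1: 00 00 YY MM   YY = years since 1900, MM = month counted from 0
      word2: DD hh mm ss   day of month, hour, minute, second
    """
    w1 = bytes.fromhex(word1)
    w2 = bytes.fromhex(word2)
    year = 1900 + w1[2]
    month = w1[3] + 1
    day, hour, minute, second = w2  # unpacking bytes yields four integers
    return datetime(year, month, day, hour, minute, second)


# The three timestamp entries from the alerts above.
entries = [
    ("0000680B", "140E2420"),
    ("0000680B", "140E2429"),
    ("0000680B", "140E2531"),
]
for words in entries:
    print(decode_timestamp(*words).strftime("%m/%d/%Y %H:%M:%S"))
# prints:
# 12/20/2004 14:36:32
# 12/20/2004 14:36:41
# 12/20/2004 14:37:49
```

The decoded values match the DATE and TIME fields of the three alerts, which is the only evidence for the layout; other firmware revisions may differ.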
23 REPLIES
Victor BERRIDGE
Honored Contributor

Re: Server Keeps rebooting

Hi,
How are you configured? Mirrored system disks? Do you have a tape device?
At the boot console (BCH) prompt, when you type sea (search), which bootable devices does it list?
You may just have a corrupted boot disk. If you have an Ignite tape, I would try a restore to see whether it works. If it can't, your disk may well be dead...


All the best
Victor
martin_215
Frequent Advisor

Re: Server Keeps rebooting

I think that, since you see the system name listed in the error message
SYSTEM NAME: tv3
it looks like HP-UX has got past ISL loading and kernel loading.

Most probably files are missing because of filesystem corruption caused by the power outage.

I would put in the HP-UX CD-ROM and see if you can boot all the way from CD-ROM to a command prompt.
That would go some way towards confirming that your hardware and processors are OK.

Once you get past that, try hpux -is at the ISL> prompt and see if you can boot to single-user mode,
or hpux -lm to boot into LVM maintenance mode.
Lee Walters_1
Advisor

Re: Server Keeps rebooting


Update: I have been able to get the machine into single-user mode.

I've run fsck on the machine and it complains about all the volumes on the primary disk.

Here's the output from the boot and from fsck. Am I right in thinking that the main disk is dead?


System Console is on the Built-In Serial Interface
Entering cifs_init...
Initialization finished successfully... slot is 9
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 64, minor is 0x2; start = 0, size = 6291456
Starting the STREAMS daemons-phase 1
Checking root file system.
vxfs fsck: file system had I/O error(s) on meta-data.
log replay in progress
file system is not clean, full fsck required
pass0 - checking structural files
pass1 - checking inode sanity and blocks
vxfs fsck: fsck read failure bno = 10008, off = 0, len = 8192
file system check failure, aborting ...
Root check done.

Then come the system alerts shown in my first post. The result of fsck is:

# fsck
fsck: /dev/vg00/rlvol1: possible swap device (cannot determine)

continue (y/n)? y
Can't open /dev/vg00/rlvol1, errno = 6
vxfs fsck: Cannot open /dev/vg00/lvol3: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol4: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol5: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol6: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol7: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol8: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg01/lvol1: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvora: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvdata2: No such device or address
file system check failure, aborting ...
#

Thanks
martin_215
Frequent Advisor

Re: Server Keeps rebooting

I think your volume group is not activated.
Since you could boot into single-user mode, your disk is most likely OK.

It's your filesystems and volume groups that need maintenance.

Activate the volume group and run a full fsck.
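
To make the pattern in the fsck output easier to see, here is a rough Python sketch (not part of the original exchange, and meant to be run on any workstation rather than on the server) that groups the "cannot open" lines by volume group. The helper name failures_by_vg is made up for illustration, and the text is copied from the fsck output earlier in the thread.

```python
import errno
import re
from collections import defaultdict

# The "cannot open" lines copied from the fsck output posted in this thread.
FSCK_OUTPUT = """\
Can't open /dev/vg00/rlvol1, errno = 6
vxfs fsck: Cannot open /dev/vg00/lvol3: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvol4: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvol5: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvol6: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvol7: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvol8: No such device or address
vxfs fsck: Cannot open /dev/vg01/lvol1: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvora: No such device or address
vxfs fsck: Cannot open /dev/vg00/lvdata2: No such device or address
"""


def failures_by_vg(text):
    """Map each volume group name to the logical volumes fsck could not open."""
    failed = defaultdict(list)
    pattern = r"[Cc]an(?:'t|not) open /dev/(\w+)/(\w+)"
    for vg, lv in re.findall(pattern, text):
        failed[vg].append(lv)
    return dict(failed)


for vg, lvs in failures_by_vg(FSCK_OUTPUT).items():
    print(vg, len(lvs), lvs)

# errno 6 is ENXIO, whose usual message is "No such device or address".
print(errno.errorcode[6])
```

Every volume in both vg00 and vg01 fails at the open step with the same error, before fsck examines any filesystem contents. That fits a volume group that is not available, but on its own it does not rule out a failed disk.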
martin_215
Frequent Advisor

Re: Server Keeps rebooting

Also, you can't run fsck on a swap volume,
so don't worry if fsck gives an error on the swap volume.
Lee Walters_1
Advisor

Re: Server Keeps rebooting


I'm a complete noob at HP-UX. How do I activate my volume groups, etc.?

Thanks
Victor BERRIDGE
Honored Contributor

Re: Server Keeps rebooting

Lee,
Unless you've changed the way HP-UX is normally installed, it is a bad start if your system is complaining about rlvol1, since that is /stand; swap is always rlvol2...

So do you have a tape device and a make_recovery tape?
To get out of this you will need to get a sane vg00 up; things get easier after that...
Lee Walters_1
Advisor

Re: Server Keeps rebooting


Victor,

We haven't got a tape drive attached or a backup of the disk. I "inherited" the server, so to speak, a few months back and haven't got round to doing any maintenance on the OS yet.

I've got a full backup of our systems on there, so that's not a problem. If the disk is dead and needs replacing, then that's what I'll do.

martin_215
Frequent Advisor

Re: Server Keeps rebooting

vgchange -a y /dev/vgXXX

This may activate the volume group (vgXXX being your volume group, e.g. vg00).
I am not sure whether activating the VG and running fsck will fix it.

See what other members have to say.