HPE 9000 and HPE e3000 Servers

Server Keeps rebooting

My HP server, an rp2470 (HP 9000), keeps rebooting with the same error messages.

We had a power cut on Thursday night, during which the machine went down uncleanly, and it hasn't come back up since.

The errors it gives me are at the bottom of this post. I've searched this forum and found the same problem a few times, but none of the help and advice given in those posts has worked.

I think the hard disk in the machine is dead, but one of you might be able to help a bit more. I can get at the GSP and ISL prompts but that's about it. I can't boot into the OS, which is HP-UX 11.

Any help would be seriously appreciated.

Thanks

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:36:32
ALERT LEVEL: 12 = Software failure

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xA0E000C01100B000 00000000 000005E9 - type 20 = major change in system state
0x58E008C01100B000 0000680B 140E2420 - type 11 = Timestamp 12/20/2004 14:36:32
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:a
*****************************************

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:36:41
ALERT LEVEL: 3 = System blocked waiting for operator input

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xF8E000301100E000 00000000 0000E000 - type 31 = legacy PA HEX chassis-code
0x58E008301100E000 0000680B 140E2429 - type 11 = Timestamp 12/20/2004 14:36:41
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:a
*****************************************

************* SYSTEM ALERT **************
SYSTEM NAME: tv3
DATE: 12/20/2004 TIME: 14:37:49
ALERT LEVEL: 3 = System blocked waiting for operator input

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      FLASH  ON      ON
LED State: Unexpected Reboot. Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xF8E000301100EF00 00000000 0000EF00 - type 31 = legacy PA HEX chassis-code
0x58E008301100EF00 0000680B 140E2531 - type 11 = Timestamp 12/20/2004 14:37:49
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:
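
A note on the log entries above: each "type 11 = Timestamp" line appears to pack the date and time into its last two 32-bit fields, one byte per component. The sketch below decodes them under that assumption. The layout (years since 1900, month counted from zero, then day, hour, minute, second) is inferred only from the three entries in this post, not taken from HP documentation, and the function names hex_byte and decode_ts are made up for illustration.

```shell
#!/bin/sh
# hex_byte HEXSTRING N
# Print byte number N (1-based, two hex digits each) of HEXSTRING in decimal.
hex_byte() {
    start=$(( ($2 - 1) * 2 + 1 ))
    end=$(( start + 1 ))
    printf '%d' "0x$(echo "$1" | cut -c"${start}-${end}")"
}

# decode_ts WORD1 WORD2
# Assumed layout (inferred from the alerts in this thread):
#   WORD1 = 0000YYMM  YY = years since 1900, MM = month counted from 0
#   WORD2 = DDHHMMSS  day, hour, minute, second
decode_ts() {
    year=$(( $(hex_byte "$1" 3) + 1900 ))
    month=$(( $(hex_byte "$1" 4) + 1 ))
    day=$(hex_byte "$2" 1)
    hour=$(hex_byte "$2" 2)
    min=$(hex_byte "$2" 3)
    sec=$(hex_byte "$2" 4)
    printf '%02d/%02d/%04d %02d:%02d:%02d\n' \
        "$month" "$day" "$year" "$hour" "$min" "$sec"
}

# First alert above: 0x58E008C01100B000 0000680B 140E2420
decode_ts 0000680B 140E2420    # prints 12/20/2004 14:36:32
```

The decoded values match the DATE and TIME lines of all three alerts, which is the only evidence for the layout.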
Victor BERRIDGE
Honored Contributor

Re: Server Keeps rebooting

Hi,
How are you configured? Mirrored system disks? Do you have a tape device?
At the boot console prompt (BCH), when you type sea (search), which bootable devices are listed?
You may just have a corrupted boot disk. If you have an Ignite tape, I would try a restore to see whether the disk accepts it. If it can't, your disk may well be dead...


All the best
Victor
martin_215
Frequent Advisor

Re: Server Keeps rebooting

Since you see the system name listed in the error message
SYSTEM NAME: tv3
it looks like HP-UX has got past ISL loading and kernel loading.

Most probably files are missing because of filesystem corruption caused by the power outage.

I would put in the HP-UX CD-ROM and see if I can boot all the way from the CD-ROM to a command prompt.
That would go some way towards confirming that your hardware and processors are OK.

Once you get past that, at the ISL> prompt run hpux -is and see if you can boot to single-user mode,
or run hpux -lm to boot into LVM maintenance mode.

Re: Server Keeps rebooting


Update - have been able to get the machine into the single user mode.

I've run an FSCK on the machine and it complains about all the volumes on the primary disk.

Here's the output from the boot and from FSCK ... am I right in thinking that the main disk is dead?


System Console is on the Built-In Serial Interface
Entering cifs_init...
Initialization finished successfully... slot is 9
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 64, minor is 0x2; start = 0, size = 6291456
Starting the STREAMS daemons-phase 1
Checking root file system.
vxfs fsck: file system had I/O error(s) on meta-data.
log replay in progress
file system is not clean, full fsck required
pass0 - checking structural files
pass1 - checking inode sanity and blocks
vxfs fsck: fsck read failure bno = 10008, off = 0, len = 8192
file system check failure, aborting ...
Root check done.

Then come the system alerts as described above ... and the result of FSCK is

# fsck
fsck: /dev/vg00/rlvol1: possible swap device (cannot determine)

continue (y/n)? y
Can't open /dev/vg00/rlvol1, errno = 6
vxfs fsck: Cannot open /dev/vg00/lvol3: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol4: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol5: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol6: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol7: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvol8: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg01/lvol1: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvora: No such device or address
file system check failure, aborting ...
vxfs fsck: Cannot open /dev/vg00/lvdata2: No such device or address
file system check failure, aborting ...
#

Thanks
martin_215
Frequent Advisor

Re: Server Keeps rebooting

I think your volume group is not activated.
Since you could boot into single-user mode, your disk is most likely OK.

It's your filesystems and volume groups that need maintenance.

Activate the volume group and run a full fsck.
martin_215
Frequent Advisor

Re: Server Keeps rebooting

Also, you can't run fsck on a swap volume,
so don't worry if fsck gives an error for the swap volume.

Re: Server Keeps rebooting


I'm a complete noob at HP-UX - how do I activate my volume groups etc.?

Thanks
Victor BERRIDGE
Honored Contributor

Re: Server Keeps rebooting

Lee,
Unless you've changed the way HP-UX is normally installed, it is a bad start if your system is complaining about rlvol1, since this is /stand; swap is always rlvol2...

So do you have a tape device and a make_recovery tape?
To get out of this you will need to get a sane vg00 up; things get easier after that...

Re: Server Keeps rebooting


Victor,

We haven't got a tape drive attached or a backup of the disk. I "inherited" the server, so to speak, a few months back and haven't got round to doing any maintenance on the OS yet.

I've got a full backup of our systems on there so that's not a problem. If the disk is dead and needs replacing then that's what I'll do.

martin_215
Frequent Advisor

Re: Server Keeps rebooting

vgchange -a y /dev/vgXXX

This should activate the volume group.
I am not sure whether activating the VG and running fsck will fix it.

See what other members have to say.
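
For someone new to HP-UX, the sequence suggested in this thread (activate the volume group, then run a full fsck on each VxFS logical volume, skipping swap) could be sketched as the dry-run script below. It only prints the commands so they can be reviewed before being typed at the single-user prompt. The function name plan_vg_recovery is invented here, the volume names are taken from the fsck output earlier in the thread, and the fsck options (-F vxfs -o full) are an assumption to check against the fsck_vxfs man page on the system.

```shell
#!/bin/sh
# plan_vg_recovery VG LVOL...
# Print, without running, the commands to activate a volume group and
# run a full fsck on the raw device of each listed logical volume.
plan_vg_recovery() {
    vg=$1
    shift
    echo "vgchange -a y /dev/$vg"
    for lv in "$@"; do
        # fsck is pointed at the raw (character) device, named r<lvol>
        echo "fsck -F vxfs -o full /dev/$vg/r$lv"
    done
}

# lvol1 (/stand) and lvol2 (swap) are left out on purpose: swap cannot be
# checked with fsck, and /stand may not be a VxFS file system (assumption).
plan_vg_recovery vg00 lvol3 lvol4 lvol5 lvol6 lvol7 lvol8
```

Printing rather than executing keeps the sketch harmless on a machine whose disk may be failing; each printed line can then be run by hand, one at a time, while watching the output.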

Re: Server Keeps rebooting

Martin

I have to say a big thank you. Your suggestions have worked. The machine is back up again and the fault light is off. It's still not on the network, and the following errors are now being seen on reboot:

msgcnt 54 vxfs: mesg 016: vx_ilisterr - / file system error reading inode 1825
msgcnt 55 vxfs: mesg 016: vx_ilisterr - / file system error reading inode 1825
msgcnt 56 vxfs: mesg 016: vx_ilisterr - / file system error reading inode 1825
msgcnt 57 vxfs: mesg 016: vx_ilisterr - / file system error reading inode 1825

INIT: Command is respawning too rapidly.
Will try again in 5 minutes.
Check for possible errors.
id:samd "/usr/sam/lbin/samd # system mgmt daemon"

#
#

I'm guessing this is an error on reading the disk?




Victor BERRIDGE
Honored Contributor

Re: Server Keeps rebooting

Lee,
Does the slot match the address of your hardware?
Because this is strange:
Initialization finished successfully... slot is 9
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
64 means you are on LVM, but 0x3, is this true? Or should I read 0x3, in which case it could be correct.
But an I/O error on metadata plus a read failure isn't good...
How did you back up the system?

All the best
Victor

Re: Server Keeps rebooting

Victor

I have no idea. Like I said I'm new to HP-UX and so I couldn't tell you.

If the I/O read error is bad then maybe I'll just get a new disk and start from scratch on the machine.

Best course of action ?

martin_215
Frequent Advisor

Re: Server Keeps rebooting

I guess there is some error in inittab that produces the "Command is respawning too rapidly" error.

It may also be caused by root and /var being mounted read-only, which causes init to report errors.

Remount root and /var read-write and then run init q.

martin_215
Frequent Advisor

Re: Server Keeps rebooting

An inode read error could be caused by a bad disk block or a superblock error.
Maybe you will have to do an fsck followed by reboot -n.
Did you do reboot -n after you ran fsck the previous time?
Otherwise the reboot flushes memory to disk, which can overwrite the repairs fsck made.

Re: Server Keeps rebooting


Well, I've done a reboot -n after the fsck and I still get the same I/O error on an inode in the / file system.

So I'm guessing that the disk is dead in some way. I'm going to be calling my HP engineer this morning so I guess all that is left is to say thank you both for your help.

martin_215
Frequent Advisor

Re: Server Keeps rebooting

It's quite strange that both disks would go bad at the same inode,
since you have mirrored volumes.

I still believe bad disk media is unlikely.

Of course reinstalling would fix it,
but I believe something is still wrong with the filesystem.
Were you doing some sort of filesystem metadata operation, like extending a filesystem or a volume, when the power outage occurred?
You needn't reply to this.

Just post what fixed the problem and please do assign points.



Re: Server Keeps rebooting

Hi Lee,
I want to be sure you don't have a hardware problem (the GSP itself may be faulty). Did you replace it? This can happen after a power failure!

I believe that working as a group gives better and more efficient results.

Re: Server Keeps rebooting


Martin,

The disks are not mirrored at all. They are two standalone disks acting as separate volumes.

I've telephoned my local engineer and he's coming in tomorrow to have a look and, if need be, replace the disks and re-install the OS for me.

I'll let you know the outcome.

Re: Server Keeps rebooting


Engineer came - Disk was dead so he's replaced it and re-installed the OS. Trouble is now, he wasn't that great with HP-UX.

I now can't get my CDE running on a remote machine.

Any pointers or shall I start a new topic?
Robert-Jan Goossens
Honored Contributor

Re: Server Keeps rebooting

Start a new one in the HPUX section.

Refer with a link to this thread.

HPUX section
http://forums1.itrc.hp.com/service/forums/familyhome.do?familyId=117

Best regards,
Robert-Jan
martin_215
Frequent Advisor

Re: Server Keeps rebooting

It's probably that the daemon which accepts remote CDE connections is not running.
I'm not sure what the daemon name is.
Check inetd.conf and dtspcd.

Did you mean connecting to this server's CDE from a remote computer's CDE?



martin_215
Frequent Advisor

Re: Server Keeps rebooting

Check /var/adm/inetd.sec.
It has entries specifying which hosts are allowed to connect to dtspc.
By default only localhost is allowed.
Add an entry for your remote machine.
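
A quick way to see whether a given host appears on the dtspc allow line is sketched below. It assumes the usual inetd.sec layout of service name, then allow or deny, then a whitespace-separated host list, and it only does exact string matches (no wildcards or address ranges). The function name host_allowed and the sample addresses are made up for illustration.

```shell
#!/bin/sh
# host_allowed FILE SERVICE HOST
# Succeed if FILE has a "SERVICE allow ..." line that lists HOST literally.
host_allowed() {
    awk -v svc="$2" -v host="$3" '
        $1 == svc && $2 == "allow" {
            for (i = 3; i <= NF; i++) if ($i == host) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example against a throwaway file rather than the live /var/adm/inetd.sec
tmp=$(mktemp)
printf 'dtspc allow 127.0.0.1 192.0.2.10\n' > "$tmp"
host_allowed "$tmp" dtspc 192.0.2.10 && echo "192.0.2.10 is listed for dtspc"
rm -f "$tmp"
```

On the real server the first argument would be /var/adm/inetd.sec and the host would be the address of the remote machine; inetd generally has to re-read its configuration before an edited entry takes effect.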

Re: Server Keeps rebooting


Hi Martin

Sorry for the delay - holidays :-)

That didn't work. Thanks for the try though ..