HPE 9000 and HPE e3000 Servers

boot time error

 
Arunkumar_3
Advisor


Hi All,
I am using an HP 9000 server running HP-UX 11.11. It has two A6795A HBAs, which are connected to a Storage Area Network with FC cables. The system went down for some reason, so I rebooted it. It does not come up, though: the boot fails at one point with this error message:


alloc_pdc_pages: Relocating PDC from 0xffff800000 to 0x3fb00000.

************* SYSTEM ALERT **************
SYSTEM NAME: uninitialized
DATE: 11/14/2003 TIME: 22:23:38
ALERT LEVEL: 12 = Software failure

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 0 = no problem detail

LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
       FLASH  FLASH      OFF    ON      ON
LED State: Running non-OS code. Non-critical error detected.
Check Chassis and Console Logs for error messages.

0xF8E000C01100B800 00000000 0000B800 - type 31 = legacy PA HEX chassis-code
0x58E008C01100B800 0000670A 0E161726 - type 11 = Timestamp 11/14/2003 22:23:38
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:

After this message, the console started printing the following:

platform test 613F
platform test 613D
unknown, no source stated legacy PA HEX chassis-code CE00
unknown, no source stated legacy PA HEX chassis-code CE01
unknown, no source stated legacy PA HEX chassis-code CE62
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE63
unknown, no source stated legacy PA HEX chassis-code CE

If I remove the FC cables and reboot, the system boots normally with no problems. If I connect the cables and reboot, I get the problem above. If anyone knows about this, please help me.

thanks in advance
-Arun
6 REPLIES
Tony Horton
Frequent Advisor

Re: boot time error

Hi,

Not sure, but it looks like you have an L-class box. Was the system working with this FC connection before? If not, check the firmware version (actually, check it even if it was).

There was a problem with booting (with your card) that was fixed in firmware version 41.39.

Here's an extract from the firmware patch:

PHSS_25685:
The following features, fixes and enhancements appear in
server firmware revision 41.39:
- Earlier versions of PDC prevented online diagnostics
for reading PIM data. PDC revision 41.39 corrects this
problem.
- previous versions of PDC were unable to dump using the
A6795A 2Gb Fibre Channel HBA on all platforms, unable to
dump to disks on Point to Point Fabric with McData
switches and unable to boot/dump with Brocade 3800 2GB
Switches with A6795A cards installed. PDC revision 41.39
fixes these issues.
- HPMC chassis codes reported an incorrect PDC base
address. When the chassis code is sent for HPMC MONARCH
SELECTED it will now report the full 64-bit address.
- Changed IODC to allow DDS4 tape drives to operate in LVD
mode on the A5149A SCSI Card.
- Improved single memory error handling.
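
The check against 41.39 can be sketched as a simple numeric comparison. This is only an illustration: the helper names and the installed revision "41.02" are hypothetical, and it assumes firmware revisions are plain dot-separated numbers like the one quoted in the patch text.

```python
from typing import Tuple


def parse_revision(rev: str) -> Tuple[int, ...]:
    """Split a dotted revision such as '41.39' into a tuple of integers."""
    return tuple(int(part) for part in rev.strip().split("."))


def needs_update(installed: str, fixed_in: str = "41.39") -> bool:
    """Return True if the installed revision is older than the fixed one."""
    return parse_revision(installed) < parse_revision(fixed_in)


# "41.02" is a made-up installed revision, used only for illustration.
print(needs_update("41.02"))
```

Comparing tuples of integers avoids the trap of comparing the revisions as strings, where "41.9" would sort after "41.39".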

Regards,

Tony.
No man is an isthmus
Michael Steele_2
Honored Contributor

Re: boot time error

ALERT LEVEL: 12 on its own is incomplete; it is not enough to keep your server down. There are probably additional error messages. Go into the GSP logs and retrieve them:

Ctrl-B
sl (* show logs *)
e (* error *)
Press Return to step through each error message.

Note 1: Get every error message with a similar time stamp.

Note 2: The time stamp will be in GMT.
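
To line up log entries by time, the raw data words of a "type 11 = Timestamp" entry can be decoded directly. The sketch below is an assumption, not documented HP behaviour: the byte layout (two unused bytes, year minus 1900, zero-based month, day, hour, minute, second) is inferred only from the single sample in this thread, where 0000670A 0E161726 is shown as 11/14/2003 22:23:38. The function name is hypothetical.

```python
from datetime import datetime


def decode_chassis_timestamp(word_hi: str, word_lo: str) -> datetime:
    """Decode the two 32-bit hex data words of a timestamp log entry.

    Assumed layout (inferred from one sample, not from HP documentation):
    byte 0-1 unused, byte 2 = year - 1900, byte 3 = month (0-based),
    byte 4 = day, byte 5 = hour, byte 6 = minute, byte 7 = second.
    """
    raw = bytes.fromhex(word_hi + word_lo)
    if len(raw) != 8:
        raise ValueError("expected two 8-digit hex words")
    _, _, year_off, month0, day, hour, minute, second = raw
    return datetime(1900 + year_off, month0 + 1, day, hour, minute, second)


# Data words taken from the alert shown in the original post.
ts = decode_chassis_timestamp("0000670A", "0E161726")
print(ts.isoformat())  # 2003-11-14T22:23:38
```

Per Note 2, the decoded value should be read as GMT when comparing it with local-time logs.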
Support Fatherhood - Stop Family Law
Tobias Hartlieb
Trusted Contributor

Re: boot time error

Hi Arun,

When booting without the FC cables, you are able to get to HP-UX, right?
Do you see any crash dump (in /var/adm/crash)? Do you see a recent HPMC (cd /var/tombstones/ => grep -i time ts99 => any recent "Timestamp")?
Also check the FC patch level and the SAN switch firmware.

Regards.

Tobias
Stefan Stechemesser
Honored Contributor

Re: boot time error

Hi,

To get an idea of which error is responsible for the early system panic during boot, boot the server without the cables connected. After the server is up, connect the cables and try to generate some traffic through the cards (first check with fcmsutil /dev/td? whether the specific card has a link to the switch). Check /var/opt/resmon/log/event.log for Fibre Channel related errors. Maybe this helps. A common source of errors like this is duplicate loop IDs in an arbitrated loop, but this should not happen in fabric mode (PtP connection to the switch).
Arunkumar_3
Advisor

Re: boot time error

Hi Stefan Stechemesser,
Actually, I used the method you described: I removed the cables at boot time and reconnected them afterwards. The boot was successful. But this will be a pain, because for some test cases we have to reboot the system a couple of times or more per day.
Once the boot is finished and we reconnect the cables, everything works fine.

I checked /var/opt/resmon/log/event.log. It shows an informational message; if a cable is not connected, it shows a critical error message.
Arunkumar_3
Advisor

Re: boot time error

thanks