Integrity Servers

Richard Jordan
Regular Advisor

RX2620 error, LED 2 red "System firmware Hang"

Our 2620 is from the porting class and is not currently under a hardware support contract.

System info:
- VMS V8.2-1, ECOs through November 2008.
- 4GB RAM
- 2x 73GB internal drives
- one power supply
- dual processors, Madison9 1.6GHz/3MB
- Management processor card

Current firmware revisions:

MP FW : E.03.15
BMC FW : 03.47
EFI FW : 03.14
System FW : 03.17

I know the firmware is old; this is running the same config it came with from the porting class. It is clustered and has been idling except for running some internal test websites and minor program testing. About 3 days ago we moved a large project to it (BASIC compilations, not really taxing this box at all).

Yesterday the front panel started showing error status; the system light was flashing red, and LED 2 was solid red. The system remained up and running.

According to the manual, this status indicates a firmware error, "System Firmware Hang".

The VFP command from a telnet iLO session shows the 'fatal' status. The GUI iLO access shows everything nominal, with no errors. GUI system status also shows all normal. VMS has logged no errors.

The system was inadvertently powered down (cable management and heavy power cords do not mix well) while pulling it from the rack to see the internal status lights. It came back up without errors and is running fine.

I've attached the relevant section from the system event log. If there's other info that would be useful in determining what was wrong I'll get it posted. Any thoughts on dealing with this situation would be appreciated. Spurious one-shot? Likely to recur? Thanks.
BrianC
Valued Contributor

Re: RX2620 error, LED 2 red "System firmware Hang"

Richard

You can review the MP console SL log. Look for any type 7 fatal events for details.

Brian
Richard Jordan
Regular Advisor

Re: RX2620 error, LED 2 red "System firmware Hang"

The attachment didn't take; I'll try again. It had the entries from the event log corresponding to the time the error lights came on. Hopefully attached this time, unless the first attempt is still held up being 'checked'.
BrianC
Valued Contributor

Re: RX2620 error, LED 2 red "System firmware Hang"

The MP SEL log shows a hardware machine check. Normally on a reboot, the machine check data seen from the EFI-level "errdump mca" command is passed into the OpenVMS error log. It may appear in the VMS error log as a type 660, 670, or 680 entry for fatal machine check errors, or as a type 113 entry (Console Data Log). In some cases you may see both the 600-class entry and the 113 entry. Check the timestamps, as both entries may be reporting the same event.


If you have installed WEBES V5.1 or lower on the system, the SEA (System Event Analyser) tool can be used to view and analyse this fatal crash. It has both a GUI and a CLI mode.

Example from the CLI

$ WSEA v /analyze/since=27-jan-2009


If you do not have WEBES, the ANALYZE command may be able to give a basic bit-to-text translation of the hardware registers. It will probably take a hardware specialist to determine the source of the problem.

Example

$ analyze/error_log/ELV/since=27-jan-2009


Richard Jordan
Regular Advisor

Re: RX2620 error, LED 2 red "System firmware Hang"

Brian,
no WEBES installed. I didn't think to look at the VMS error log because no errors were showing at the DCL level (SHOW ERROR); my oversight. An excerpt translation from ELV is attached in case it's of use; translations are not provided for most of the environmental entries or for the actual machine check. The machine check is logged several minutes after the events in the MPC SE log, assuming time is properly synchronized between VMS and the console.

VMS never crashed so there is no dump to analyze. It went down when we slid the server out of the rack and the power cord pulled out.

Thanks for taking the time to reply.
BrianC
Valued Contributor

Re: RX2620 error, LED 2 red "System firmware Hang"

Richard

Sorry for confusing the fatal event (power failure) with the machine checks.

These non-fatal machine check events could be a correctable main memory or cache error, a temperature warning, a fan warning, or a loss of redundancy.

The way that OpenVMS "Analyze/error_log/ELV" formats the SEL log data makes it hard to decode.

Can you supply a dump of the hex data from the MP console? Make sure you have flow control set to Xon-Xoff or software flow control on your terminal emulation, since this can be a large printout.

Here is an example

MP Menu SL System log.

E System Events

D dump



Here is an example of the format of the dump:

482 OS 0 1 0x548016E100E02930 0000000000000001 OS_BOOT_COMPLETE
28 Jan 2009 08:41:41
483 BMC *7 0x20498033A9022950 FFFF026FCF090300 Type-02 096f02 618242
28 Jan 2009 10:30:01
484 BMC 2 0x20498033A9022960 CF00A370CF120300 Type-02 127003 1208323
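
Once a dump like this has been captured to a text file, picking out the high-alert entries can be scripted. The following is a minimal Python sketch, not an HP tool. The field layout is an assumption inferred only from the three sample lines above: entry number, source, an optional extra number, the alert level (sometimes prefixed with "*"), two hex words, then free text, with the timestamp on the line after the entry it belongs to. The names parse_sel_dump and SAMPLE are hypothetical, and the threshold of 7 follows the "type 7 fatal events" mentioned earlier in the thread.

```python
import re

# Entry layout assumed from the sample dump lines only (not from HP documentation):
#   <entry no> <source> [<optional number>] [*]<alert level> <0x hex word> <hex word> <text>
ENTRY_RE = re.compile(
    r"^(?P<num>\d+)\s+(?P<source>\S+)\s+(?:(?P<extra>\d+)\s+)?"
    r"(?P<flag>\*?)(?P<level>\d+)\s+"
    r"(?P<word1>0x[0-9A-Fa-f]{16})\s+(?P<word2>[0-9A-Fa-f]{16})\s+(?P<text>.+)$"
)
# Timestamp lines look like "28 Jan 2009 08:41:41"; kept as plain strings.
TIME_RE = re.compile(r"^\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2}$")


def parse_sel_dump(text):
    """Parse dump text into a list of dicts, one per log entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            entries.append({
                "num": int(match.group("num")),
                "source": match.group("source"),
                "level": int(match.group("level")),
                "flagged": match.group("flag") == "*",
                "data": (match.group("word1"), match.group("word2")),
                "text": match.group("text"),
                "timestamp": None,
            })
        elif TIME_RE.match(line) and entries:
            # Assumption: a timestamp line belongs to the entry just before it.
            entries[-1]["timestamp"] = line
    return entries


# The sample lines from the post above, used as test input.
SAMPLE = """\
482 OS 0 1 0x548016E100E02930 0000000000000001 OS_BOOT_COMPLETE
28 Jan 2009 08:41:41
483 BMC *7 0x20498033A9022950 FFFF026FCF090300 Type-02 096f02 618242
28 Jan 2009 10:30:01
484 BMC 2 0x20498033A9022960 CF00A370CF120300 Type-02 127003 1208323
"""

entries = parse_sel_dump(SAMPLE)
# Entries at alert level 7 or above are the ones worth reading first.
fatal_numbers = [e["num"] for e in entries if e["level"] >= 7]
print(fatal_numbers)
```

To use it on a real capture, read the saved terminal log and pass its contents to parse_sel_dump in place of SAMPLE. Lines that match neither pattern are skipped silently, so it is worth checking the entry count against the log if the real format differs from the sample.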




Brian
Richard Jordan
Regular Advisor

Re: RX2620 error, LED 2 red "System firmware Hang"

Brian,
unfortunately I can't get the log in hex format. I must have dumped or cleared it in the process of pulling all the other information because there's nothing there since the reboot.

Thanks for following up. With any luck this was a spurious error; it's still not happy-making that the GUI was showing no problems while the text mode console showed the faults.

We're still looking at upgrading VMS on the system in the next few weeks. All relevant firmware updates will be performed as well.
cnb
Honored Contributor

Re: RX2620 error, LED 2 red "System firmware Hang"

Firmware hangs were resolved by upgrading the firmware.

The release notes mention BMC hangs, machine checks, etc.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=447331&swItem=ux-58102-1&prodNameId=441910&swEnvOID=54&swLang=13&taskId=135&mode=4&idx=1

More than likely you'll resolve these intermittent issues with the planned upgrade you noted.

hth,
Richard Jordan
Regular Advisor

Re: RX2620 error, LED 2 red "System firmware Hang"

Thanks. I didn't think the hangs mentioned in the firmware release notes were relevant to what we were seeing but there's certainly a chance. I'm surprised we didn't see any problems before now; the system has been running for about 2 years solid.
Richard Jordan
Regular Advisor

Re: RX2620 error, LED 2 red "System firmware Hang"

The problem has not recurred and we now have the go-ahead to upgrade firmware and OS. Hopefully no more problems.