IPMI reboots DL360

sedahl
Occasional Visitor

I have a DL360 server, which is about 5 months old. This Sunday and again today, the server rebooted. There have been no updates or changes made to the server in the last 2 weeks. There are no logged problems in the Event Viewer.

The IML logs "An Unrecoverable System Error has occurred (Error code 0x0000002D, 0x00000000)". I have been unable to find out what this means.

The iLO 2 log shows the following: "BMC IPMI Watchdog Timer Timeout: Action=System Power Reset." The server has rebooted at exactly 14:55 both times, so I wouldn't be surprised if the server reboots again in 3 days' time! Any ideas as to why this is happening, and how I can stop it?

Svend-Erik
38 REPLIES
YAQUB_1
Respected Contributor

Re: IPMI reboots DL360

Hi Sedahl,

First, check your system firmware version against the latest available firmware version. If your firmware is old, apply the current version.

And if you are using MS Windows or Linux, apply the latest patches (service pack).

Hope it fixes your problem.

Thanks--Yaqub
HP Support!!!


Fabio_S
Advisor

Re: IPMI reboots DL360

Hello Sedahl,
I just had exactly the same issue, with the same error logged. This is the first time it has happened since the server went up (a couple of months ago).

Please let me know if you find anything more about that.

fabio
Suhi
Occasional Visitor

Re: IPMI reboots DL360

Hello to all

"An Unrecoverable System Error has occurred (Error code 0x0000002D, 0x00000000)": the same thing happened to me an hour ago on an ML370 G5 that is 2 months old. Any ideas?

Andrej
Wieser
Occasional Advisor

Re: IPMI reboots DL360

I'm also getting the same error. The hardware is ML370 G5 running Windows Server 2003 Enterprise.

If anyone finds a solution please post.
Dave Case
Occasional Visitor

Re: IPMI reboots DL360

Same here on a DL380 G5 running Windows 2k3 SP2 that's only about a month old (in production since 4/10). And to think we were pro-active in replacing 2 other servers because they were getting a little aged...they never rebooted for no reason.

3 unexplainable reboots in the last 2 weeks. On this last one we turned off the 'automatic reboot' and caught this error. Can't find anything on it.
Fabio_S
Advisor

Re: IPMI reboots DL360

I installed all the possible updates (Support Pack, drivers, firmware) but the issue is still there.

Is it possible to turn off the check/service that auto-reboots the server?

thanks
rafaelito
Advisor

Re: IPMI reboots DL360

I have a similar problem on Debian etch amd64 with a DL360.

From the syslog:
May 8 12:49:22 farallon -- MARK --
May 8 13:01:20 farallon kernel: ipmi_si(SI_CHECK_BMC): Failed to get Global Enables 0xc6.
May 8 13:02:45 farallon kernel: hpasmxld[4446]: segfault at 0000000000000031 rip 0000000000000031 rsp 00007fff1b307168 error 4
May 8 13:14:07 farallon syslogd 1.4.1#18: restart.

The iLo2 logs:

05/08/2008 13:12

BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.

hpasm 7.8.0-91.etch26
Joshua Pott
Occasional Advisor

Re: IPMI reboots DL360

My issue started three weeks ago, and I'm still stuck like all of you.

Now, one interesting thing about my situation: I have two DL360s, both running Win 03, but one has SP1 and the other has SP2. The box that keeps giving me issues is the SP2 box. I've debated installing SP2 on the box that is healthy. This would help determine if it's an SP2 issue. However, if that's the case, I would probably need to rebuild both servers. So who else is running SP2 with this issue?
Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

Hi all,

Linux users can check this thread:

http://forums12.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1210268949475+28353475&threadId=1135440

Error code (0x0000002D, 0x00000000) is, according to MS, a SCSI subsystem problem:

http://msdn.microsoft.com/en-us/library/ms793645.aspx
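
For anyone who wants to pull the codes out of an IML entry programmatically, here is a minimal Python sketch. It assumes, as the post above does, that the first value in the IML entry corresponds to a Windows bug check code; that link is not confirmed anywhere in this thread. Microsoft documents bug check 0x2D as SCSI_DISK_DRIVER_INTERNAL. The function and table names are made up for illustration.

```python
import re

# Windows bug check names. Only the code discussed in this thread is listed.
BUG_CHECK_NAMES = {0x2D: "SCSI_DISK_DRIVER_INTERNAL"}


def parse_iml_error(message):
    """Return the two error codes from an IML entry as integers, or None."""
    match = re.search(r"Error code (0x[0-9A-Fa-f]+),\s*(0x[0-9A-Fa-f]+)", message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def describe_iml_error(message):
    """Describe an IML entry, ASSUMING its first code is a Windows bug check code."""
    codes = parse_iml_error(message)
    if codes is None:
        return "no error code found"
    name = BUG_CHECK_NAMES.get(codes[0], "not in this sketch's table")
    return "code 0x%02X (%s), parameter 0x%08X" % (codes[0], name, codes[1])


if __name__ == "__main__":
    entry = ("An Unrecoverable System Error has occurred "
             "(Error code 0x0000002D, 0x00000000)")
    print(describe_iml_error(entry))
```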

If you all use G5 servers (or blades), they have the E200 and P400 SAS controllers, which can cause the issue...
Has anyone opened a case with HP support?

Pac

Joshua Pott
Occasional Advisor

Re: IPMI reboots DL360

Ok, you've nailed my setup. I am running a G5 with a P400i SCSI controller, and to answer your question, I have not opened a ticket.

But as I mentioned before, I have two DL360 G5s. Both are running the same driver. Sorry to sound like a noob, but are you saying we have bad hardware?
Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

Hi,

I can't tell you for sure, but I don't think it is a hardware error; it could be a bug in the firmware or something like that.

If you have the current firmware, system ROM and latest PSP, and so many people see the same error 0x0000002D in the IML, then it looks like either a revision problem or a firmware problem...

If the problem were only that the servers reboot without this 0x2D in the IML, then the G5s have known issues with the PSUs and this could be the cause (but not on the DL360 G5).

You can boot SmartStart and run an offline hardware diagnostic to check whether something comes up, but I doubt it will.

If we assume it is a storage-related issue, storport plays a role here:

http://support.microsoft.com/kb/941276

As for stopping the reboot, you can disable ASR from RBSU -> Server Availability -> ASR Status.
This can stop the server from rebooting, or it will show a bluescreen pointing to the problem (for Windows).

I don't think 0x2D is logged in the IML for Linux/ESX users, so there the 8.0 agents and uninstalling IPMI / stopping ASR will solve the problem.

Pac
Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

About the SP1 and SP2 thing: the main problem-related difference is storport again...

http://support.microsoft.com/kb/946448/en-us

Here is a newer version; you can give it a try. It solves a lot of problems with hangs and bluescreens.
Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

Sorry, me again :)

Is there a dump file saved in c:\windows\minidump, or does one turn up if you search c:\windows for *.dmp?

Has anyone analyzed the minidump file, or can someone attach it if it is less than 1MB?
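
A small Python sketch of the *.dmp search, for anyone who prefers a script to the Explorer search. The function name and the optional size filter are made up for illustration; the 1MB limit is just the attachment size mentioned above.

```python
from pathlib import Path


def find_dump_files(root, max_bytes=None):
    """Return (path, size) for every *.dmp file under root, newest first.

    If max_bytes is given, larger files are skipped.
    """
    found = []
    for path in Path(root).rglob("*.dmp"):
        try:
            info = path.stat()
        except OSError:
            continue  # file vanished or is not readable
        if max_bytes is None or info.st_size <= max_bytes:
            found.append((info.st_mtime, path, info.st_size))
    found.sort(key=lambda item: item[0], reverse=True)
    return [(path, size) for _, path, size in found]


if __name__ == "__main__":
    # Only list dumps small enough to attach to a forum post.
    for path, size in find_dump_files(r"C:\Windows", max_bytes=1024 * 1024):
        print(path, size, "bytes")
```

On Windows the pattern match is case-insensitive, so a MEMORY.DMP kernel dump is found as well when no size limit is set.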
Joshua Pott
Occasional Advisor

Re: IPMI reboots DL360

I appreciate your help with this. However, it appears that it has written a dump file. Is there something I need to enable on the iLO to get it to spit that out?
Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

So is there a dump or not?

The function should be enabled by default.
Control Panel -> System -> Advanced -> Startup and Recovery -> Settings.
Under "Write debugging information", a minidump (small memory dump), for example, should be selected.
From this screen you can disable the automatic restart too...

It can happen that the server crashes without a dump, but there should be one.
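
The same two settings can also be read from the registry. Below is a hedged Python sketch: the key HKLM\SYSTEM\CurrentControlSet\Control\CrashControl and its CrashDumpEnabled and AutoReboot values are standard Windows settings, but the function names are made up for illustration and the script has to run on the Windows server itself.

```python
# Meaning of CrashDumpEnabled on Windows Server 2003.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}


def describe_crash_control(crash_dump_enabled, auto_reboot):
    """Turn the two raw registry values into a readable sentence."""
    dump = DUMP_TYPES.get(crash_dump_enabled, "unknown (%d)" % crash_dump_enabled)
    reboot = "restarts automatically" if auto_reboot else "stays on the blue screen"
    return "dump type: %s; after a crash the server %s" % (dump, reboot)


def read_crash_control():
    """Read (CrashDumpEnabled, AutoReboot) from the registry. Windows only."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    key_path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        dump_type, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        auto_reboot, _ = winreg.QueryValueEx(key, "AutoReboot")
    return dump_type, auto_reboot


if __name__ == "__main__":
    print(describe_crash_control(*read_crash_control()))
```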
Joshua Pott
Occasional Advisor

Re: IPMI reboots DL360

Ok, had to change it from Kernel dump to small memory dump. We'll see what we get.
rafaelito
Advisor

Re: IPMI reboots DL360

Some more log information:

May 8 13:01:30 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:01:40 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:01:50 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:02:00 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:02:00 farallon hpasmxld[4446]: iLO 2 Communications Error - Attempting synchronization!
May 8 13:02:45 farallon hpasmxld[4446]: iLO 2 has responded to reset request . . .
May 8 13:02:45 farallon hpasmxld[4446]: Stopping the Watchdog Timer . . .
May 8 13:02:45 farallon hpasmxld[4446]: Resetting Internal Data structures . . .
May 8 13:02:45 farallon hpasmxld[4446]: Initializing Internal Data structures from iLO 2. . .
May 8 13:02:45 farallon hpasmxld[4446]: The iLO 2 reset / synchronization has completed successfully
May 8 13:02:45 farallon kernel: hpasmxld[4446]: segfault at 0000000000000031 rip 0000000000000031 rsp 00007fff1b307168 error 4

HP Debian software:
hpasm-7.8.0-91.etch26.amd64.deb
hp-OpenIPMI-7.8.0-108.etch26.amd64.deb
hprsm-7.8.0-104.etch26.amd64.deb
cmanic_7.9.0-5b.etch_amd64.deb
hpsmh-2.1.7-167.debian.amd64.deb
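
To see the pattern in these messages at a glance, a minimal Python sketch that extracts the hpasmxld timeout lines and the gaps between them. The regular expression is written against the log lines quoted in this thread only; the function names are made up, and the year has to be supplied because syslog does not record it (2008 is taken from the iLO 2 log entry quoted earlier).

```python
import re
from datetime import datetime

TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ hpasmxld\[\d+\]: "
    r"OsKcsExecCmd: IPMI NetFN (?P<netfn>0x[0-9a-fA-F]+) "
    r"CMD: (?P<cmd>0x[0-9a-fA-F]+) has timed out!"
)

# Lines copied from the syslog excerpt in this thread.
SAMPLE = """
May 8 13:01:30 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:01:40 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:01:50 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:02:00 farallon hpasmxld[4446]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
May 8 13:02:00 farallon hpasmxld[4446]: iLO 2 Communications Error - Attempting synchronization!
"""


def parse_timeouts(lines, year):
    """Return (timestamp, netfn, cmd) for every IPMI timeout line."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if not match:
            continue  # not a timeout message
        stamp = " ".join(match.group("stamp").split())  # collapse day padding
        when = datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")
        events.append((when, int(match.group("netfn"), 16), int(match.group("cmd"), 16)))
    return events


def gaps_in_seconds(events):
    """Seconds between consecutive timeout events."""
    times = [when for when, _, _ in events]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    events = parse_timeouts(SAMPLE.splitlines(), 2008)
    print(len(events), "timeouts, gaps in seconds:", gaps_in_seconds(events))
```

Run against a full /var/log/syslog, this would show whether the four-timeouts-then-resync pattern repeats before each watchdog reset.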
Fabio_S
Advisor

Re: IPMI reboots DL360

@ Pac: thanks for the tips. I just filled in the form needed to request the patch from Microsoft. As soon as I have it, I'll install it and check whether the issue is solved.
Indeed, my situation matches the one described in the link you posted (WS2003 x64, 6GB RAM, P400i storage controller).

Regarding disabling ASR: would it only disable the reboot, while the system crash would occur anyway?

Last thing: how do I launch RBSU? At system start, before Windows loads?

thanks

Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

Hi,

Disabling ASR will not stop the system from crashing, but it may point you to the problem if it shows a bluescreen.

You can access RBSU when you start the server by pressing F9 when prompted.
You can do the same thing without downtime from Windows: Control Panel -> System -> Advanced -> Startup and Recovery -> Settings, then UNCHECK the "Automatically restart" checkbox.

@rafaelito :

Check :

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c01330219

although you don't have RHEL5...

For the IPMI reset, Server BIOS/iLO2 Firmware and E200/P400 controller firmware are important...

Pac
rafaelito
Advisor

Re: IPMI reboots DL360

@ Pac: Thanks, I was aware of that advisory and there is another thread about it. It seems it's not RHEL 5 specific.
I will check the server BIOS and E200i controller firmware, since the iLO 2 is up to date.
rafaelito
Advisor

Re: IPMI reboots DL360

After upgrading the BIOS and the E200i firmware:

May 9 12:12:22 farallon kernel: ipmi_si(SI_CHECK_BMC): Failed to get Global Enables 0xc6.
May 9 12:12:32 farallon hpasmxld[4901]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
May 9 12:12:42 farallon hpasmxld[4901]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
May 9 12:12:52 farallon hpasmxld[4901]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
May 9 12:13:02 farallon hpasmxld[4901]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
May 9 12:13:02 farallon hpasmxld[4901]: iLO 2 Communications Error - Attempting synchronization!
May 9 12:13:47 farallon hpasmxld[4901]: iLO 2 has responded to reset request . . .
May 9 12:13:47 farallon hpasmxld[4901]: Stopping the Watchdog Timer . . .
May 9 12:13:47 farallon hpasmxld[4901]: Resetting Internal Data structures . . .
May 9 12:13:47 farallon hpasmxld[4901]: Initializing Internal Data structures from iLO 2. . .
May 9 12:13:47 farallon hpasmxld[4901]: The iLO 2 reset / synchronization has completed successfully

No reboot by ASR. It happened 2 times since upgrade.
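
Note that the command timing out changed between the two log excerpts: NetFN 0x36 CMD 0x2 before the upgrade, NetFN 0x6 CMD 0x25 after it. A minimal Python sketch for decoding these pairs follows. The table covers only the watchdog commands of the IPMI App network function (0x06) and the controller-specific OEM range (0x30 to 0x3F) from the IPMI specification; the function name is made up, and what HP's OEM command 0x2 does is not public, so the sketch does not guess.

```python
# Watchdog-related commands of the IPMI "App" network function (NetFn 0x06).
APP_COMMANDS = {
    0x22: "Reset Watchdog Timer",
    0x24: "Set Watchdog Timer",
    0x25: "Get Watchdog Timer",
}


def describe_ipmi_command(netfn, cmd):
    """Give a rough description of an IPMI request from its NetFn and command."""
    if netfn == 0x06:
        name = APP_COMMANDS.get(cmd, "application command 0x%02x" % cmd)
        return "App request: %s" % name
    if 0x30 <= netfn <= 0x3F:
        # Vendor-defined range; the meaning is not in the public IPMI spec.
        return "OEM/controller-specific request 0x%02x" % cmd
    return "NetFn 0x%02x, command 0x%02x (not covered by this sketch)" % (netfn, cmd)


if __name__ == "__main__":
    print(describe_ipmi_command(0x36, 0x2))   # before the upgrade
    print(describe_ipmi_command(0x06, 0x25))  # after the upgrade
```

So after the upgrade it is a plain "Get Watchdog Timer" query to the BMC that times out. That fits the watchdog theme of this thread but does not by itself identify the cause.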

Blazhev_1
Honored Contributor

Re: IPMI reboots DL360

I see you have the latest Value Add software for Debian; it seems the problem is solved only for RHEL.

Did you remove the OpenIPMI driver?

This seems to be the only way to get rid of the error under Debian. As far as I know, the 8.0 release for RHEL solved the problem, but there is no such release for Debian.

rafaelito
Advisor

Re: IPMI reboots DL360

No. I didn't remove it.
Joshua Pott
Occasional Advisor

Re: IPMI reboots DL360

So my question is, has this issue magically resolved itself for anyone else too? It started in early April, and my server was rebooting every week. Now it's been reboot-free for just over two weeks.