06-11-2007 10:17 AM
DL380 G5 / Linux RHEL 5 / Reboot by ASR
For the past couple of weeks I've been experiencing a strange problem on my brand-new HP server: it restarts by itself at random times. This has happened three times so far (and of course each time I was away from the office).
I'm running RHEL 5 on this system. Here are some lines I found in my /var/log/messages file:
Jun 11 16:26:54 callisto hpasmxld[5200]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jun 11 16:27:04 callisto hpasmxld[5200]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jun 11 16:27:14 callisto hpasmxld[5200]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jun 11 16:27:24 callisto hpasmxld[5200]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jun 11 16:27:24 callisto hpasmxld[5200]: iLO 2 Communications Error - Attempting synchronization!
Jun 11 16:28:09 callisto hpasmxld[5200]: iLO 2 has responded to reset request . . .
Jun 11 16:28:09 callisto hpasmxld[5200]: Stopping the Watchdog Timer . . .
Jun 11 16:28:09 callisto hpasmxld[5200]: Resetting Internal Data structures . . .
Jun 11 16:28:09 callisto hpasmxld[5200]: Initializing Internal Data structures from iLO 2. . .
Jun 11 16:28:09 callisto hpasmxld[5200]: The iLO 2 reset / synchronization has completed successfully
Jun 11 16:28:09 callisto kernel: hpasmxld[5200]: segfault at 0000000000000031 rip 0000000000000031 rsp 00007fffce427ab8 error 4
A couple of minutes later the server restarts itself, even though the operating system isn't frozen. I then receive an alert by e-mail:
Trap-ID=6025
An 'ASR Recover Complete' trap signifies that the system has been shutdown by the ASR feature and has just become operational again.
When I go to the System Management Homepage, I see this in the logs:
ASR Detected by System ROM 5/27/2007 7:06AM 5/27/2007 7:06AM 1
And finally, in the iLO 2 log, I have:
Informational iLO 2 06/11/2007 16:38 06/11/2007 16:38 1 Server power restored.
Informational iLO 2 06/11/2007 16:38 06/11/2007 16:38 1 BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.
So, based on this information, can somebody tell me what's wrong with my system? All system states are "OK" or "GREEN", and HP support (yes, I have a service contract) doesn't seem to be aware of the issue. They want me to run diagnostics from the SmartStart CD that may take up to three hours, and right now I'm 2000 miles away from my server.
If somebody can provide some information about this, it would be really appreciated.
Best Regards,
Yanick
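For anyone who wants to check their own /var/log/messages for the same pattern, here is a minimal Python sketch. It only counts the three hpasmxld message types quoted in the post above (IPMI timeouts, iLO 2 communications errors, segfaults) per host. The function name `summarize`, the `SAMPLE` list, and the assumption that lines use the classic "Mon DD HH:MM:SS host message" syslog layout are mine, not something HP provides.

```python
import re

# Classic syslog layout: "Jun 11 16:27:24 callisto hpasmxld[5200]: ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+(?P<msg>.*)$"
)

def summarize(lines):
    """Count hpasmxld timeouts, communication errors and segfaults per host."""
    summary = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m or "hpasmxld" not in m.group("msg"):
            continue  # not a line from (or about) the hpasmxld daemon
        counts = summary.setdefault(
            m.group("host"), {"timeouts": 0, "comm_errors": 0, "segfaults": 0}
        )
        msg = m.group("msg")
        if "has timed out" in msg:
            counts["timeouts"] += 1
        elif "Communications Error" in msg:
            counts["comm_errors"] += 1
        elif "segfault" in msg:
            counts["segfaults"] += 1
    return summary

# Three lines copied from the log excerpt quoted in the post.
SAMPLE = [
    "Jun 11 16:27:24 callisto hpasmxld[5200]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!",
    "Jun 11 16:27:24 callisto hpasmxld[5200]: iLO 2 Communications Error - Attempting synchronization!",
    "Jun 11 16:28:09 callisto kernel: hpasmxld[5200]: segfault at 0000000000000031 rip 0000000000000031 rsp 00007fffce427ab8 error 4",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it against a real log, pass `open("/var/log/messages")` instead of `SAMPLE`. A host showing timeouts followed by a segfault matches what is described here, but the script cannot tell you why it happens.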
07-04-2007 07:28 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I have the same problem and have found no answer yet. Everything looks right, with no yellow or red LEDs, yet the server reboots randomly at least once a week. This is the only message in the iLO log. Have you found anything wrong with your server? Any answer would be much appreciated.
Thanks,
Cosmin
07-05-2007 12:29 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I'm happy to see that I'm not alone in having this problem. For now I have only disabled the ASR auto-reboot feature from the System Management web page. I opened a case with HP, and I need to run the diagnostic tools from the SmartStart CD. I haven't had time to run them yet, because this server is online 24/7. I plan to run them this weekend.
If I find something, I will let you know. Also, if you find something before I do, please let me know.
Regards,
Yanick
07-17-2007 04:22 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Cheers
Fernando
07-17-2007 06:28 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Unfortunately I did not find anything solid. I just disabled the ASR feature from the System Management web page. Since then, the server has never rebooted itself and the OS has never hung. I also ran all the diagnostics from the SmartStart CD and no errors were reported.
If you find anything else, please let me know.
Regards,
Yanick
07-26-2007 07:36 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
08-01-2007 01:36 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Disabled ASR, as others did, while waiting for a fix.
Log entries:
Jul 31 08:09:22 fri-ww01 kernel: ipmi_si(SI_CHECK_BMC): Failed to get Global Enables 0xc6.
Jul 31 08:09:32 fri-ww01 hpasmxld[4645]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jul 31 08:09:42 fri-ww01 hpasmxld[4645]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jul 31 08:09:52 fri-ww01 hpasmxld[4645]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jul 31 08:10:02 fri-ww01 hpasmxld[4645]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Jul 31 08:10:02 fri-ww01 hpasmxld[4645]: iLO 2 Communications Error - Attempting synchronization!
Jul 31 08:10:47 fri-ww01 hpasmxld[4645]: iLO 2 has responded to reset request . . .
Jul 31 08:10:47 fri-ww01 hpasmxld[4645]: Stopping the Watchdog Timer . . .
Jul 31 08:10:47 fri-ww01 hpasmxld[4645]: Resetting Internal Data structures . . .
Jul 31 08:10:47 fri-ww01 hpasmxld[4645]: Initializing Internal Data structures from iLO 2. . .
Jul 31 08:10:47 fri-ww01 hpasmxld[4645]: The iLO 2 reset / synchronization has completed successfully
Jul 31 08:10:47 fri-ww01 kernel: hpasmxld[4645]: segfault at 0000000000000031 rip 0000000000000031 rsp 00007fff530279c8 error 4
10-11-2007 09:12 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
10-12-2007 03:51 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
The ASR feature can be disabled from the System Management Homepage without having to reboot. I did that, and since then the server has never rebooted.
After logging on to System Management, select the "Autorecovery" feature under the "Recovery" section, then change the status to "disable" and click "set".
Your problem should now be solved.
Regards,
Yanick
10-12-2007 05:12 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
10-15-2007 03:30 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I have exactly the same problem with 4 brand-new DL360 G5 and 2 DL380 G5 servers running RHEL 5 x86_64.
Unexpected reboots occurred on some of these servers (the latest was last Friday, on one of the 360s): 3 of the 4 360s showed this behaviour, and 1 of the 2 380s did too.
They all passed 72 hours of memtest86+ (v1.70) and 48 hours of HP diagnostics (from SmartStart CD 7.90) without problems before going into production; firmware and packages are all up to date.
The following lines appeared in /var/log/messages about 10 minutes before ASR rebooted the server (last reboot of a 360):
Oct 12 12:07:08 plam0043 kernel: ipmi_si(SI_CHECK_BMC): Failed to get Global Enables 0xc6.
Oct 12 12:07:18 plam0043 hpasmxld[5082]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
Oct 12 12:07:28 plam0043 hpasmxld[5082]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
Oct 12 12:07:38 plam0043 hpasmxld[5082]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
Oct 12 12:07:48 plam0043 hpasmxld[5082]: OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!
Oct 12 12:07:48 plam0043 hpasmxld[5082]: iLO 2 Communications Error - Attempting synchronization!
Oct 12 12:08:33 plam0043 hpasmxld[5082]: iLO 2 has responded to reset request . . .
Oct 12 12:08:33 plam0043 hpasmxld[5082]: Stopping the Watchdog Timer . . .
Oct 12 12:08:33 plam0043 hpasmxld[5082]: Resetting Internal Data structures . . .
Oct 12 12:08:33 plam0043 hpasmxld[5082]: Initializing Internal Data structures from iLO 2. . .
Oct 12 12:08:33 plam0043 hpasmxld[5082]: The iLO 2 reset / synchronization has completed successfully
Oct 12 12:08:33 plam0043 kernel: hpasmxld[5082]: segfault at 0000000000010000 rip 0000000000010000 rsp 00007fff75dea648 error 4
A call has been opened with HP Europe.
Regards,
Nathanaël
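To put a number on how quickly hpasmxld goes from the first IPMI timeout to the segfault, the timestamps in the excerpt above can be subtracted. This is only a sketch: the helper name `seconds_between` is made up, and because syslog timestamps carry no year, the `year` argument (defaulting to 2007, the year of these posts) is an assumption. It also assumes an English locale for the month abbreviations.

```python
from datetime import datetime

def seconds_between(first_ts, last_ts, year=2007):
    """Seconds elapsed between two syslog timestamps such as 'Oct 12 12:07:18'.

    Syslog lines omit the year, so one is supplied explicitly (assumption).
    """
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {first_ts}", fmt)
    end = datetime.strptime(f"{year} {last_ts}", fmt)
    return (end - start).total_seconds()

if __name__ == "__main__":
    # First "has timed out" line and the segfault line from the excerpt above.
    print(seconds_between("Oct 12 12:07:18", "Oct 12 12:08:33"))  # 75.0
```

The same 75 seconds separate the first timeout and the segfault in the two earlier excerpts in this thread (16:26:54 to 16:28:09, and 08:09:32 to 08:10:47), so the daemon appears to crash on a fixed schedule after the first timeout, well before the reboot itself.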
10-15-2007 04:20 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I ran the same tests and came up with nothing substantial. Everything seems normal.
Good luck.
Fernando
10-22-2007 10:58 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I think HP must take the time to look into this issue seriously, because it has become very frequent.
I have the same issue on my 2 DL380 G5 servers, which run Windows 2003.
Let us share our experience with this issue, in case anybody has a solution.
Cheers
Raymond
10-23-2007 12:07 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
10-30-2007 04:39 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Disabled ASR to see what happens. Interestingly though, in the management console the ASR 'log' showed no ASR events.
10-30-2007 06:08 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
11-08-2007 07:42 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Has HP been able to give you guys a complete fix or do you all still have open cases?
11-08-2007 07:56 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Yanick
11-11-2007 09:34 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I have the very same issue on 4 brand-new DL380 G5 servers in production.
All servers are running Red Hat Enterprise Linux 5 with the latest patches and updates (RHEL 5.1 as of a couple of days ago).
I also disabled ASR, because all 4 servers host production Oracle databases and they kept crashing every 18 hours or so.
HP definitely needs to fix this ASAP. Has anybody got a fix for this yet?
Here's what you can see in the iLO 2 log (listed from most recent to oldest, so reading from the bottom up: the BMC watchdog timeout comes first, then the power reset):
---
Informational iLO 2 11/11/2007 13:16 Server power restored.
Informational iLO 2 11/11/2007 13:15 Server power removed.
Informational iLO 2 11/11/2007 13:15 BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.
---
Patrick Monfette
11-12-2007 03:39 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Nov 12 10:00:55 bdprod-1 kernel: ipmi_si(SI_CHECK_BMC): Failed to get Global Enables 0xc6.
Nov 12 10:01:05 bdprod-1 hpasmxld[7373]: OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!
Nov 12 10:01:15 bdprod-1 hpasmxld[7373]: OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!
Nov 12 10:01:25 bdprod-1 hpasmxld[7373]: OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!
Nov 12 10:01:35 bdprod-1 hpasmxld[7373]: OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!
Nov 12 10:01:35 bdprod-1 hpasmxld[7373]: iLO 2 Communications Error - Attempting synchronization!
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: iLO 2 has responded to reset request . . .
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: Stopping the Watchdog Timer . . .
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: Resetting Internal Data structures . . .
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: Initializing Internal Data structures from iLO 2. . .
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: The iLO 2 reset / synchronization has completed successfully
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: Failed GET SENSOR READING, sensor 9
Nov 12 10:02:20 bdprod-1 hpasmxld[7373]: iLO 2 Communications Error - Attempting synchronization!
Nov 12 10:03:05 bdprod-1 hpasmxld[7373]: iLO 2 has responded to reset request . . .
Nov 12 10:03:05 bdprod-1 hpasmxld[7373]: Stopping the Watchdog Timer . . .
Nov 12 10:03:05 bdprod-1 hpasmxld[7373]: Resetting Internal Data structures . . .
Nov 12 10:03:05 bdprod-1 hpasmxld[7373]: Initializing Internal Data structures from iLO 2. . .
Nov 12 10:03:05 bdprod-1 hpasmxld[7373]: The iLO 2 reset / synchronization has completed successfully
Patrick Monfette
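The excerpts in this thread show three different NetFN/CMD pairs timing out (0x36/0x2, 0x6/0x25 and 0x4/0x2d). As far as I can tell from the IPMI specification, NetFn 0x04 / command 0x2D is Get Sensor Reading (which fits the "Failed GET SENSOR READING" line above), NetFn 0x06 / command 0x25 is Get Watchdog Timer, and NetFn values 0x30 to 0x3F are reserved for controller-specific OEM use, so the meaning of 0x36/0x2 is not defined by the spec. The sketch below just decodes the pairs from a log line; the names `describe_timeout` and `KNOWN` are illustrative, and the table deliberately covers only the pairs seen here.

```python
import re

# Pulls the two hex values out of lines like:
#   "... OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!"
TIMEOUT_RE = re.compile(
    r"IPMI NetFN (0x[0-9a-fA-F]+) CMD: (0x[0-9a-fA-F]+) has timed out"
)

# Only the standard (NetFn, command) pairs that appear in this thread.
KNOWN = {
    (0x04, 0x2D): "Sensor/Event: Get Sensor Reading",
    (0x06, 0x25): "App: Get Watchdog Timer",
}

def describe_timeout(line):
    """Return a short description of the timed-out IPMI request, or None."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None  # not an hpasmxld timeout line
    netfn, cmd = int(m.group(1), 16), int(m.group(2), 16)
    if (netfn, cmd) in KNOWN:
        return KNOWN[(netfn, cmd)]
    if 0x30 <= netfn <= 0x3F:
        # Controller-specific OEM range: meaning is vendor-defined.
        return "OEM/controller-specific request (not defined by the IPMI spec)"
    return "unknown"

if __name__ == "__main__":
    line = "Nov 12 10:01:05 bdprod-1 hpasmxld[7373]: OsKcsExecCmd: IPMI NetFN 0x4 CMD: 0x2d has timed out!"
    print(describe_timeout(line))
```

Since different requests time out on different servers, the common factor looks like the communication path to the iLO 2 / BMC rather than one particular command, though that is an inference from these logs and not something HP has confirmed here.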
11-12-2007 11:45 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
And this is happening on two new servers, one running MS SQL 2005 and another running Sharepoint 2007.
11-13-2007 02:06 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
For my part, I disabled it (ASR) using the System Management Homepage, and since then I haven't had any reboots. I can't check the setting in the BIOS, though, because those are live servers and I can't restart them.
However, IPMI has failed many times since then and would have rebooted the server because of that (I can see it in the iLO logs). But since ASR is disabled, it only seems to restart IPMI and carry on. I wonder why it doesn't do that by default instead of rebooting.
So the whole thing is still buggy, but at least I can use my servers without them rebooting every now and then.
It is very bad, though, that I had to disable ASR; I really need this feature working perfectly, especially for my disaster recovery site.
I know I am repeating myself, but HP really needs to fix this quickly.
Patrick Monfette
11-14-2007 12:17 AM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I am running DL380 G5s with:
CentOS 5 x86_64 (Xen kernel)
RHEL 5 x86_64
CentOS 4 i386 (Asterisk server)
All servers are on HP PSP 7.90 with the latest firmware versions (as of the 7.9.0 Firmware update CD).
I will also be logging a call shortly.
11-19-2007 09:34 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
I had the same problem on a DL360 G5 running RHEL 5 with hp-OpenIPMI-7.8.0-83.rhel5.
HP Level 2 support recommended these actions (in my case, after gathering information on my systems with cfg2html):
1) rpm -e hp-OpenIPMI (stopping the service and uninstalling can take some time)
2) chkconfig --level 35 ipmi on (this sets Red Hat's native IPMI daemon to start in runlevels 3 and 5)
3) service hpasm restart (this stops all SNMP agents and restarts hpasm with the hpasmlited service so that it uses Red Hat's native IPMI)
Then you can re-enable ASR if you disabled it to prevent reboots.
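If you have several servers to treat, the three steps above can be wrapped in a small script. The following is only a sketch under my own assumptions: the function name `apply_workaround` and its `dry_run` / `runner` parameters are invented for illustration, the commands are exactly the ones listed above, and actually executing them requires root on a RHEL 5 box with the HP packages installed. By default it runs nothing and just returns the commands so you can review them first.

```python
import subprocess

# The three commands from the steps above, in order.
STEPS = [
    ["rpm", "-e", "hp-OpenIPMI"],
    ["chkconfig", "--level", "35", "ipmi", "on"],
    ["service", "hpasm", "restart"],
]

def apply_workaround(dry_run=True, runner=subprocess.run):
    """Return the commands as strings; execute them only if dry_run is False.

    With dry_run=False each command is run in order and an exception is
    raised at the first one that exits with a non-zero status.
    """
    done = []
    for argv in STEPS:
        if not dry_run:
            runner(argv, check=True)  # requires root; stops on first failure
        done.append(" ".join(argv))
    return done

if __name__ == "__main__":
    for command in apply_workaround(dry_run=True):
        print(command)
```

Re-enabling ASR afterwards is still a manual step in the System Management Homepage; the script does not touch that setting.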
11-19-2007 10:12 PM
Re: DL380 G5 / Linux RHEL 5 / Reboot by ASR
Bernard's solution applies to RHEL. Does someone have a solution for a Windows environment?
Regards