04-28-2008 08:22 AM
os/ilo freezes, possible ilo problem on dl320?
DL320 G5:
W04 08/21/2007
1.43 12/12/2007
DL360 G5:
P58 11/13/2007
1.43 12/12/2007
Both DL320s became unresponsive within about 12 hours of each other. Both the ILO and the OS (Linux, bonded NICs) were unpingable. At first I suspected power, but when I went to the datacenter both systems were alive and the NICs were flickering away. When I hooked up a monitor, no video was being sent. When I rebooted via the power button on the front, there was still no video and the NICs never came up for either the ILO or the OS. I finally pulled the power from the back, which seemed to do the job. Both the ILO and OS came up.
The time/date on both the OS and ILO reverted to 8/20/2007, which was the BIOS build date (minus 1, for some reason). The OS logs showed nothing useful, but the ILO log was a bit more interesting.
I run a script from another system every hour that SSHes into the ILO on each box and gathers HW data to ensure there are no failures (temp, power, etc.). It isn't the most efficient script, as it makes about 10 separate SSH connections to the ILO at the top of every hour, but it works great.
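For anyone running a similar poller, here is a minimal sketch of one way to make it gentler on the management processor: spread the hosts across the hour instead of hitting them all at the top of it, and make every connection fail fast. This is not the script described above. The host names, user name, and the command string are placeholders, and whether this changes anything about the freezes is untested. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def build_ssh_argv(host, user, command, connect_timeout=10):
    """Build an OpenSSH command line that fails fast instead of hanging."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # key auth only, never wait on a password prompt
        "-o", f"ConnectTimeout={connect_timeout}",
        f"{user}@{host}",
        command,  # placeholder: whatever the management CLI expects
    ]


def stagger_offsets(hosts, window_seconds=3600):
    """Map each host to a start offset so polls are spread over the window."""
    step = window_seconds // max(len(hosts), 1)
    return {host: index * step for index, host in enumerate(hosts)}


def poll(host, user, command, run_timeout=30):
    """Run one command on one host and return (ok, output)."""
    try:
        result = subprocess.run(
            build_ssh_argv(host, user, command),
            capture_output=True,
            text=True,
            timeout=run_timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return result.returncode == 0, result.stdout
```

With four hypothetical hosts and a one-hour window, `stagger_offsets` schedules them 15 minutes apart, so a wedged management processor on one box never sees a burst of back-to-back logins.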
The ILO log for BOTH DL320s shows that exactly 26 hours before each system went down, my SSH connections were failing. The log claims authentication failed, but since I use SSH keys and not password authentication, I highly doubt it. The final two messages at the time the ILO/OS crashed on both systems are "server reset", then "server power restored".
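To check whether that 26-hour gap really is identical on every box, the comparison can be scripted once the log entries are exported. The sketch below assumes the entries have already been parsed into `(datetime, message)` pairs, which is a made-up format and not what the ILO emits. The match strings are taken from the messages quoted above and may need adjusting to the real log text.

```python
from datetime import datetime


def gap_before_reset(entries,
                     failure_text="authentication fail",
                     reset_text="server reset"):
    """Return the time from the first login failure to the first later reset.

    entries: iterable of (datetime, message) pairs (hypothetical format).
    Returns a timedelta, or None if either event is missing.
    """
    first_failure = None
    for stamp, message in sorted(entries):
        text = message.lower()
        if first_failure is None and failure_text in text:
            first_failure = stamp
        elif first_failure is not None and reset_text in text:
            return stamp - first_failure
    return None


# Invented sample data, only to show the call shape.
sample = [
    (datetime(2008, 4, 25, 10, 0), "SSH login: authentication failed"),
    (datetime(2008, 4, 25, 11, 0), "SSH login: authentication failed"),
    (datetime(2008, 4, 26, 12, 0), "Server reset"),
    (datetime(2008, 4, 26, 12, 0), "Server power restored"),
]
gap = gap_before_reset(sample)
```

Running it against each system's log and comparing the results would show whether the interval is a fixed countdown or a coincidence. Note that once the ILO clock resets, later timestamps are no longer comparable with earlier ones.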
No other systems in the datacenter logged any power events, and keep in mind the failures for each DL320 were offset by about 12 hours.
The DL360s, on the other hand, are faring only somewhat better. All their OSes are still up and running, but 3 of the 4 systems have broken ILOs (pingable, but I can't log in via http or ssh). The ILO that's still available had its clock reset and has the same "server reset" and "server power restored" messages as the DL320s, although in this case they were logged 2 days before the first DL320 logged its instance of the messages. In this one and only case the ILO seems to have survived. Oh, and there were no SSH login failures logged like there were on the DL320s.
So, that's pretty much it. If you made it this far, thanks for reading this book of a post ;-) Anyone have any ideas? I have since stopped my script that gets HW vitals and I'll probably open a ticket with HP, but I was hoping someone else has seen similar behavior from their systems and might have some advice.
Thanks!
04-28-2008 12:52 PM
Re: os/ilo freezes, possible ilo problem on dl320?
If the servers are on a UPS, can you try one of them on raw mains just to see if that makes any difference? There have been numerous reports about the G5 models being very fussy about the types of attached UPS models.
Regards,
Brian
04-28-2008 01:55 PM
Re: os/ilo freezes, possible ilo problem on dl320?
04-29-2008 01:20 AM
Re: os/ilo freezes, possible ilo problem on dl320?
Yes, the problems are mainly when the UPS swaps to battery power and the attached G5s reboot rather than stay up.
There are also a number of other PSU issues with the G5s, but that's generally with the DL38X and ML3XX chassis, which use the common PSU, unlike the 1U units (DL320, DL360).
The resets/date issues are certainly strange (just as if the CMOS reset switch had been flicked and it reverted back to the CMOS date) - odd.
Sorry I can't be of more use at this point but I'll keep looking.
Regards,
Brian
04-29-2008 11:58 AM