RHEL 5.5 hung on ProLiant DL380 G6
03-30-2011 02:34 AM
My ProLiant DL380 G6 server (running RHEL 5.5) hung last night.
No network connection was available.
This morning I tried to log in via the remote console: I typed "root", pressed Enter, and waited for a password prompt, but none appeared. I had to restart the system.
The message shown on the console (before I tried to log in) looked like a kernel trace. It included the following strings:
wait_for_completion
default_wake_function
:cciss:start_io
:cciss:cciss_ioctl
do_lookup
dput
kobject_get
exact_lock
:cciss:do_ioctl
:cciss:cciss_compat_ioctl
blkdev_open
blkdev_open
compat_blkdev_ioctl
compat_sys_ioctl
sysenter_do_call
After booting up I found no error messages in /var/log/messages.
The BIOS version is
Version: P62
Release Date: 03/01/2010
Firmware Revision: 1.81
Is there any way to get more information about this issue?
Is it possible to get a crash dump on these servers?
Thanks in advance
03-30-2011 10:55 AM
03-30-2011 06:53 PM
Re: RHEL 5.5 hung on ProLiant DL380 G6
What it does is automate doing SysRq+T (system traces).
There are knowledge base articles on the Red Hat support site on how to set up your kernel dump (/etc/kdump.conf), how to capture system traces during hang situations (SysRq+T), and finally how to force a crash/kernel dump (SysRq+C).
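On a system that still accepts commands, the same SysRq actions can be triggered without the keyboard by writing a letter to /proc/sysrq-trigger. The following is only a minimal sketch of that idea, assuming Python 3 and root privileges; the function names and the `SYSRQ_KEYS` table are made up for illustration. On a fully hung box you would still need the keyboard or iLO console route.

```python
SYSRQ_ENABLE = "/proc/sys/kernel/sysrq"   # sysctl that gates the keyboard combination
SYSRQ_TRIGGER = "/proc/sysrq-trigger"     # writing a letter here fires that SysRq action

# 't' dumps the stack traces of all tasks to the kernel log,
# 'c' forces a crash (and a vmcore, if kdump is configured).
SYSRQ_KEYS = {"task_traces": "t", "crash": "c"}


def sysrq_enabled(enable_path=SYSRQ_ENABLE):
    """Return True if the kernel.sysrq sysctl is non-zero."""
    with open(enable_path) as f:
        return int(f.read().strip()) != 0


def send_sysrq(action, trigger_path=SYSRQ_TRIGGER):
    """Write the SysRq letter for `action` to the trigger file (needs root)."""
    key = SYSRQ_KEYS[action]
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
    return key
```

Calling `send_sysrq("task_traces")` should leave the traces in the kernel log (dmesg, /var/log/messages). `send_sysrq("crash")` takes the system down immediately, so only use it once kdump has been set up and tested.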
03-30-2011 10:06 PM
Re: RHEL 5.5 hung on ProLiant DL380 G6
I didn't know about kdump.
The only message in the iLO log is from the reboot:
POST Error: 1719 - A controller failure event occurred prior to this power-up
In my original post I didn't explain that my server stopped logging any activity (/var/log/messages, sar, ...) at 21:18, BUT the running Oracle daemons (pmon, dbwr, ...) then started to log weird traces, i.e., Oracle logs were updated after 21:18 while system logs weren't. As the Oracle log files reside on external Fibre Channel disks, I guess this problem was caused by the SCSI disk controller (the string cciss in the trace is another clue).
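The cciss hint can be checked mechanically. In the console trace, functions that live in a loadable module appear as `:module:symbol`, so counting those prefixes shows which driver dominates the call chain. This is just a rough sketch; the function name and the shortened sample trace are illustrative, not a full copy of the console output.

```python
import re
from collections import Counter

# Matches the ":module:symbol" form seen in the console trace,
# e.g. ":cciss:start_io". Names must start with a letter or underscore
# so that timestamps such as 21:18:05 are not picked up.
MODULE_SYMBOL = re.compile(r":([A-Za-z_]\w*):([A-Za-z_]\w*)")


def modules_in_trace(trace_text):
    """Count how many trace entries belong to each loadable module."""
    return Counter(m.group(1) for m in MODULE_SYMBOL.finditer(trace_text))


trace = """\
wait_for_completion
default_wake_function
:cciss:start_io
:cciss:cciss_ioctl
do_lookup
:cciss:do_ioctl
:cciss:cciss_compat_ioctl
compat_sys_ioctl
"""

print(modules_in_trace(trace))  # Counter({'cciss': 4})
```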
The HP Document "HP ProLiant Servers Troubleshooting Guide" (Part Number: 375445-401 January 2011 Edition: 10) lists the POST codes and their meanings:
1719-Slot X Drive Array - A controller failure event occurred prior to this power-up
(previous lock-up code = 0x####)
Audible Beeps: None
Possible Cause: A controller failure event occurred before the server powered up.
Action: Install the latest version of controller firmware. If the condition persists, then replace the controller.
I'm going to use HP CLI commands to try to get the events and any info about internal RAID and SCSI controller.
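As a rough sketch of what that could look like when scripted: the snippet below assumes the CLI in question is `hpacucli` (the HP Array Configuration Utility CLI) and that it is on the PATH. The helper names and the `QUERIES` table are made up for illustration, and it assumes Python 3, which is newer than what RHEL 5.5 ships by default.

```python
import shutil
import subprocess

# Assumption: the HP CLI is hpacucli (Array Configuration Utility CLI).
HPACUCLI = "hpacucli"

# Read-only queries against all Smart Array controllers.
QUERIES = {
    "status": ["ctrl", "all", "show", "status"],
    "config": ["ctrl", "all", "show", "config", "detail"],
}


def build_command(query, tool=HPACUCLI):
    """Return the argument list for one of the named queries."""
    return [tool] + QUERIES[query]


def run_query(query, tool=HPACUCLI):
    """Run the query and return its output, or None if the tool is missing."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run(
        build_command(query, tool),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Saving the output of `run_query("status")` and `run_query("config")` right after the reboot gives something to compare against if the controller locks up again.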
03-31-2011 04:33 AM
Re: RHEL 5.5 hung on ProLiant DL380 G6
See below:
Version: 1.84 (26 Aug 2009)
Fixes
- Fixed an issue where a controller failure error 1783 is reported at POST, if a power cycle is initiated during a drive rebuild on a RAID 5 volume.
- Fixed a potential unresponsive condition related to controller lockup error code 0x83.