09-26-2006 01:11 AM
Generate NMI to hung server - get me a nice vmcore :-)
Hello,
We have a number of Intel ProLiant servers running Red Hat Enterprise Linux 3. From time to time, and for no apparent reason, these boxes just hang. We can't SSH to them, but they still respond to ping. We connect to the iLO and attempt to log in to the console, but after we put in the username and password it just hangs again and never gives us a command prompt.
In an effort to figure out why this is happening, we need to force a hung box in this state to perform a crash dump so we can send it off to whoever for some analysis.
So, I set up a netdump-server and set up the server which crashes as a client. Tested it and it all worked OK. I changed the kernel parameter kernel.unknown_nmi_panic from 0 to 1, so when I send it an NMI it should die on its arse and give me a nice dump...
However, the box went this morning. So I logged onto the iLO and generated an NMI, but all it did was dump a couple of log lines onto the netdump server; no actual crash dump was produced.
Have I missed anything out here? I did this on another couple of RHEL3 test boxes and got a lovely big vmcore file of about 4 GB on my netdump server, but I'm getting nothing on the server I actually WANT a crash dump from.
Thanks in advance - Lee
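For anyone wanting to confirm that the kernel.unknown_nmi_panic change described above is actually in effect before the next hang, here is a minimal Python sketch. It assumes the sysctl is exposed at /proc/sys/kernel/unknown_nmi_panic and that a Python 3 interpreter is available on the box, which is unlikely on a stock RHEL 3 install; the function name nmi_panic_enabled is made up for illustration.

```python
from pathlib import Path

# Assumed /proc location of the kernel.unknown_nmi_panic sysctl.
NMI_PANIC_PATH = "/proc/sys/kernel/unknown_nmi_panic"


def nmi_panic_enabled(path=NMI_PANIC_PATH):
    """Return True if the file at `path` holds a non-zero integer."""
    try:
        return int(Path(path).read_text().strip()) != 0
    except (OSError, ValueError):
        # Missing file or unexpected content: treat as not enabled.
        return False


if __name__ == "__main__":
    state = "enabled" if nmi_panic_enabled() else "NOT enabled"
    print("kernel.unknown_nmi_panic is " + state)
```

This only reports the running value. A value set at runtime is lost on reboot unless it is also written to the persistent sysctl configuration, so it is worth checking again after the hung box has been power-cycled.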
3 REPLIES
09-26-2006 03:51 PM
Re: Generate NMI to hung server - get me a nice vmcore :-)
It sounds like you are on an early version of RHEL 3, or at least one prior to RHEL 3 U7.
There is a known problem with memory management that will cause a silent hang. The fix is to get the updated memory manager that Red Hat released with Update 7 for RHEL 3.
The hang happens at unpredictable times and under varying loads. I've seen this problem more than a few times, and the update usually resolves the hangs.
Good luck!
CG
09-29-2006 02:51 AM
Re: Generate NMI to hung server - get me a nice vmcore :-)
Check /var/log/messages on the netdump server for netdump messages. You can run into this situation if:
- you don't have enough space in /var on the server to capture the dump
- you have a noisy network (netdump uses UDP)
- you have one of the many bugs in early versions of netdump :-)
There were issues prior to RHEL3 U6 that prevented netdump from working properly. Ensure that you are NOT using bonding on the netdump server interface OR the client if you are attempting this prior to U6.
Your second option is diskdump, which was enhanced starting with RHEL3 U6 to allow dumping to cciss (Smart Array) devices.
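The first two checks above (free space in /var and netdump entries in /var/log/messages) can be roughed out in Python as below. This is only a sketch under stated assumptions: it is meant to run on the netdump server with a Python 3 interpreter, the helper names has_room_for_dump and netdump_lines are made up for illustration, the 4 GiB figure is taken from the vmcore size mentioned in the original post, and matching on the word "netdump" is a guess at what the relevant log lines contain.

```python
import shutil


def has_room_for_dump(directory, dump_bytes):
    """True if the filesystem holding `directory` has at least dump_bytes free."""
    return shutil.disk_usage(directory).free >= dump_bytes


def netdump_lines(log_lines):
    """Keep only the log lines that mention netdump (case-insensitive)."""
    return [line for line in log_lines if "netdump" in line.lower()]


if __name__ == "__main__":
    four_gib = 4 * 1024 ** 3  # roughly the vmcore size reported in the thread
    try:
        print("room for a 4 GiB dump in /var:", has_room_for_dump("/var", four_gib))
        with open("/var/log/messages", errors="replace") as fh:
            for line in netdump_lines(fh):
                print(line.rstrip())
    except OSError as exc:
        # /var or the log may be missing or unreadable without root.
        print("check failed:", exc)
```

A dump can be as large as the client's RAM, so the size passed in should reflect the box you actually want the vmcore from, not the test boxes.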
Necessary questions: Why? What? How? When?
10-06-2006 10:32 PM
Re: Generate NMI to hung server - get me a nice vmcore :-)
What's your release?
If it's Red Hat, you need a fix or workaround.
See the notes on cciss and diskdump:
http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/release-notes/as-x86/RELEASE-NOTES-U6-x86-en.html
Bill
It works for me (tm)
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.
© Copyright 2025 Hewlett Packard Enterprise Development LP