06-16-2009 10:57 AM
N4000 - Unexpected HPMC
I have attached the console log of the HPMC message, taken through the GSP. The chassis logs didn't seem very informative to me, but I can post them if it would help.
Thanks,
Dan
06-16-2009 01:19 PM
Re: N4000 - Unexpected HPMC
06-16-2009 11:53 PM
Re: N4000 - Unexpected HPMC
As suggested above, the file named ts99 under /var/tombstones will help us analyze the cause of this HPMC.
Can you attach this file here?
Best Regards,
Prashanth
06-17-2009 06:06 AM
Re: N4000 - Unexpected HPMC
Here is the ts99 file.
Thank you for any insight you can give.
-Dan
06-17-2009 09:11 AM
Re: N4000 - Unexpected HPMC
Well, you definitely have a HW failure, but I can't pin it down. I suggest you have HP look at it. I'm sure they'll spot it in a sec.
Good luck.
06-17-2009 09:15 AM
Re: N4000 - Unexpected HPMC
Please attach the output of the following commands:
cstm<<-EOF
runutil logtool
rs
EOF
ioscan -fnk
tail /etc/shutdownlog
Thanks!
06-17-2009 10:06 AM
Re: N4000 - Unexpected HPMC
In addition, the PDC firmware is quite old and should be upgraded at some point to pick up critical fixes.
hth,
06-17-2009 01:11 PM
Re: N4000 - Unexpected HPMC
06-17-2009 05:50 PM
Re: N4000 - Unexpected HPMC
21:13 Mon Jun 08 2009. Reboot after panic: , isr.ior = 0'10340007.0'b42c0f8
22:19 Mon Jun 08 2009. Reboot after panic: , isr.ior = 0'4340003.0'338ef6b0
12:40 Tue Jun 09 2009. Reboot after panic: , isr.ior = 0'10340743.0'91e46148
22:21 Mon Jun 15 2009. Reboot after panic: , isr.ior = 0'10340003.0'6bdae918
23:28 Mon Jun 15 2009. Reboot after panic: , isr.ior = 0'340757.0'98a637b8
08:36 Wed Jun 17 2009. Reboot after panic: , isr.ior = 0'2401ec.0'26c9a178
You've had 7 panics since June 2nd.
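For anyone who wants to tally entries like these offline, here is a minimal Python sketch. It assumes the shutdownlog lines look exactly like the six quoted above (time, weekday, month, day, year, then a period); the function names `parse_panics` and `gaps_in_minutes` are made up for illustration, and the sample holds only the lines shown in this post, not the full log.

```python
from datetime import datetime

# The six "Reboot after panic" lines quoted in this post (not the whole log).
SAMPLE = """\
21:13 Mon Jun 08 2009. Reboot after panic: , isr.ior = 0'10340007.0'b42c0f8
22:19 Mon Jun 08 2009. Reboot after panic: , isr.ior = 0'4340003.0'338ef6b0
12:40 Tue Jun 09 2009. Reboot after panic: , isr.ior = 0'10340743.0'91e46148
22:21 Mon Jun 15 2009. Reboot after panic: , isr.ior = 0'10340003.0'6bdae918
23:28 Mon Jun 15 2009. Reboot after panic: , isr.ior = 0'340757.0'98a637b8
08:36 Wed Jun 17 2009. Reboot after panic: , isr.ior = 0'2401ec.0'26c9a178
"""


def parse_panics(text):
    """Return sorted datetimes for every 'Reboot after panic' line."""
    stamps = []
    for line in text.splitlines():
        if "Reboot after panic" not in line:
            continue
        # Assumed layout: the timestamp is everything before the first period.
        stamp = " ".join(line.split(".", 1)[0].split())
        # %a and %b expect English day/month abbreviations (default C locale).
        stamps.append(datetime.strptime(stamp, "%H:%M %a %b %d %Y"))
    return sorted(stamps)


def gaps_in_minutes(stamps):
    """Whole minutes between each pair of consecutive panic reboots."""
    return [int((b - a).total_seconds() // 60)
            for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    stamps = parse_panics(SAMPLE)
    print("panic reboots found:", len(stamps))
    for gap in gaps_in_minutes(stamps):
        print("minutes since previous panic:", gap)
```

Short gaps next to multi-day gaps are what a crash-then-crash-again-on-reboot pattern would look like in this kind of log.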
06-18-2009 07:09 AM
Re: N4000 - Unexpected HPMC
As I mentioned, the server will be fine for a few days before failing. After crashing, it will try to reboot, but appears to crash again while booting or shortly thereafter. After several cycles (as many as 10 over one weekend) it hangs during a boot and requires a reset, TOC, or power cycle to restart.
This is a test box, and not currently under HW support, so if I cannot figure it out, we will have to call somebody in.
I assume the next step would be to strip it down to a minimal config (i/o cards, RAM, and CPUs) and gradually add components to find the faulty device. Had this been a 'hard' (or at least a predictable) failure I would have begun that process already, but as it may run for several days before failing, it could be winter by the time I isolate the problem... ;)
I am open to any suggestions.
06-18-2009 07:23 AM
Re: N4000 - Unexpected HPMC
Look in
/etc/opt/resmon/log/
There are five or so logs there with daily event information. Some of the logs will be useless, but one for sure will have more information, probably the *.html log. When you have something you'd like me to look at, post it.
Regarding stripping the machine of I/O devices: that is not where your problem is occurring. If it were, it would show up in CSTM. You have something like a bad system board, CPU, or DIMM, something associated with a High Priority Machine Check (HPMC). I/O devices will usually go NO_HW before causing a panic like the one you're experiencing.
There should be events in your GSP logs as well. From the console, press Control-B, enter SL, and look at the time stamps and alert levels. Post these when you have them.
06-19-2009 09:18 AM
Re: N4000 - Unexpected HPMC
Attached are the last 20 chassis codes from the Error and Activity buffers. I have not had a chance to look at the resmon logs, but hope to do so today.
Thanks
06-19-2009 10:08 AM
Re: N4000 - Unexpected HPMC
and the last entry in reslog.html was in May 2008. The other 4 logs had timestamps within the last two days, but I didn't see much (except possibly some resmon configuration issues) in them. However, I have attached the last few entries of each, in case I am missing something.
06-19-2009 10:16 AM
Re: N4000 - Unexpected HPMC
06-19-2009 10:58 AM
Re: N4000 - Unexpected HPMC
Question: What do you do with c3t7d0 and c2t7d0?
From your 6/9/2009 Resmon / api.logs
0/0/2/1.7.0
ctl 2 0/0/2/1.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c2t7d0
0/5/0/0.7.0
ctl 3 0/5/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c3t7d0
Check your /etc/passwd file for any rscsi entries like:
rscsi:x:1999:1000:Tape:/export/home/rscsi:/opt/schily/sbin/rscsi
This is a tape entry from another server for remote scsi over ip to the tape.
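If it is easier to check a copy of the file offline, the lookup can be sketched in Python as below. This is only an illustration: `find_rscsi_entries` is a made-up helper name, the `root` line in the sample is a hypothetical placeholder, and the match rule (account named rscsi, or a login shell whose file name is rscsi) is my assumption about what counts as "an rscsi entry".

```python
def find_rscsi_entries(passwd_text):
    """Return (account, shell) pairs for passwd lines that look rscsi-related."""
    hits = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            # Not a standard 7-field passwd entry; skip rather than guess.
            continue
        name, shell = fields[0], fields[6]
        # Assumed rule: account is called rscsi, or its shell binary is rscsi.
        if name == "rscsi" or shell.rsplit("/", 1)[-1] == "rscsi":
            hits.append((name, shell))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        "root:x:0:3::/:/sbin/sh",  # hypothetical placeholder line
        "rscsi:x:1999:1000:Tape:/export/home/rscsi:/opt/schily/sbin/rscsi",
    ])
    for name, shell in find_rscsi_entries(sample):
        print(name, shell)
```

An empty result only means no such account exists in the file that was scanned; it says nothing about what is using the /dev/rscsi device files.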
Re: GSP logs. I don't see any high GSP alert messages in your GSP logs, only one alert 7 and 19 alert 2s, which are all informational. If you look at the source of the alerts, though, they're all system bus and local bus.
Looks like there's nothing wrong with your 0/10 bus and Hitachi disk array, where I suppose most of your data is kept.
Wild guess: you've got a bus issue. Start with the remote scsi devices, which I know nothing about, and remove them if you don't need them.
What do you do with c3t7d0 and c2t7d0?
Otherwise, I see nothing being recorded that is informative.
Please double check syslog.log for these times as well as ALL THE RESMON LOGS.
STM is fine.
11-05-2009 09:18 AM
Re: N4000 - Unexpected HPMC
In early Sept one of our Brocade Fibre-Channel switches began causing connectivity problems on a different server and was replaced. The N4000 server with the random crash/reboot problem was also connected to SAN disk thru this same FC switch.
Since the switch has been replaced the server has not crashed once! It has gone over 80 days without a crash.
As a comparison, during the Jan to Aug period it had crashed over 35 times, with about 15 of the reboots lasting less than 1 day. Average uptime was 9.2 days, and the longest was about 20 days.
I am calling this one fixed... Thanks to all for your input, and to you Michael for your time reviewing all the log files!
-Dan