07-25-2005 02:44 AM
I'm receiving the following Critical Error from the Event Monitor:
"Event 646: Partition being reset due to watchdog timeout expiring"
The entire message is attached to this post.
Should I be concerned about this? I receive it about once a month.
Also, any recommendations on what I can do to resolve it?
Any help/insight is appreciated.
07-25-2005 05:52 PM
Re: Event 646: Partition being reset due to watchdog timeout expiring
Have you done what the action statement asks you to do?
Action: Find out why the partition's OS had hung. The cause could be bad HW that crashed the partition, or in rare cases, a combination of events that caused the OS to be unable to refresh the watchdog timer. Look for other events preceeding the timeout for clues to the root cause of the partition being unresponsive.
Are there any errors in /var/adm/syslog/syslog.log, or does the dmesg output show any SCSI errors, etc.?
regards.
(p.s. please remember to assign points.
http://forums1.itrc.hp.com/service/forums/pageList.do?userId=CA1176297&listType=unassigned&forumId=1)
07-25-2005 10:13 PM
Re: Event 646: Partition being reset due to watchdog timeout expiring
I assume that the server is an rp7420 and the firmware is not the latest. I had a similar problem and after the message the MP was not receiving any further events.
You can try this workaround to recover from the watchdog reset and get the OS talking to the MP again.
(This is safe to do with the partition up and running.)
- Connect to the MP.
- Go into 'cm'.
- Reset the utility interface to the core cell of the partition using the 'ru' command.
Here is an example where Cell-0 is the core cell:
[test-mp] MP:CM> ru
This command resets the selected MP bus device.
B - BPS (Bulk Power Supply)
A - PACI (Partition Console Interface)
G - MP (Management Processor)
H - PDHC (Cell Board Controller)
Select device: h
Enter cell number: 0
Do you want to reset the Cell PDH Controller Slot 0? (Y/[N]) y
-> The selected MP bus device will be reset.
[test-mp] MP:CM>
Then you should plan for the firmware upgrade.
With regards,
Mohan.
07-26-2005 02:20 AM
Re: Event 646: Partition being reset due to watchdog timeout expiring
I checked syslog.log and it doesn't look like anything there caused the hang. I am rather new to this, so I attached a snippet of the log from the time of the problem.
dmesg: The file hasn't been updated since the 12th, so I don't think anything has been logged there. But I could be wrong since I've never looked at it. Actually, I don't know how to view the file correctly.
Mohan,
The watchdog eventually resets itself. I put a call through to support and they told me not to worry about it.
Regarding the firmware update: I am running an rp7420. Where can I find out whether there are firmware updates?
07-26-2005 03:07 PM
Re: Event 646: Partition being reset due to watchdog timeout expiring
The syslog.log snippet is too small to show whether there are any errors. You may want to grep that file for "warning" or "error".
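The grep suggested here could look something like the sketch below. The function name `scan_log` is only illustrative; the default path is the syslog location mentioned earlier in this thread, and the match is case-insensitive so that "WARNING" and "Error" are caught as well.

```shell
#!/bin/sh
# Print every line that mentions "warning" or "error" (any case),
# prefixed with its line number in the file.
# The first argument is the log to scan; it defaults to the HP-UX
# syslog path named in this thread.
scan_log() {
    log="${1:-/var/adm/syslog/syslog.log}"
    grep -i -n -E 'warning|error' "$log"
}

# Usage (run as a user who can read the log):
#   scan_log
#   scan_log /path/to/another.log
```

The line numbers make it easier to go back and read what was logged just before and after each hit, which is where clues to a hang usually are.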
Post your dmesg output:
# dmesg
For OS-installable firmware updates:
http://www2.itrc.hp.com/service/patch/search.do?BC=patch.breadcrumb.main|&pageContextName=firmware:
regards.
07-26-2005 06:29 PM
Re: Event 646: Partition being reset due to watchdog timeout expiring
You can check for the firmware on the ITRC site. Select the "patch/firmware database" section.
Select the "firmware" subsection.
Select "CPU" as the firmware type, type "rp7420" in the search string, and perform a search.
Or see if the URL below helps:
http://www5.itrc.hp.com/service/patch/patchDetail.do?BC=patch.breadcrumb.main|patch.breadcrumb.search|&patchid=PF_CRAIMED0310&context=firmware:cpu
You need to have a valid ITRC login to access this page. Firmware 3.10 is the latest for rp7420 and rp8420.
With regards,
Mohan.
07-27-2005 01:29 AM
Re: Event 646: Partition being reset due to watchdog timeout expiring
Mohan, how can I find out what firmware my CPU currently has?
Again, sorry everyone, I'm very new to the whole HP mainframe / Unix deal.
07-27-2005 01:38 AM
Solution
The current firmware version can be checked from the MP.
1) Connect to console and press
2) If prompted for a login and password, enter Admin/Admin. This is the default login and password.
3) Once you get the MP prompt, type "cm"
4) You should get a "CM>" prompt.
5) Type "sysrev" and capture that output.
6) The firmware documentation contains a matrix that shows which firmware you are on.
Otherwise, send us the sysrev output and we can tell you the firmware version.
Since you indicated that a call was logged with HP, you can also ask the CE to help ascertain your firmware version.
Please let us know if you have received any further messages in the MP live logs, error logs, or forward progress logs since you got the watchdog reset message.
With regards
Mohan.
07-27-2005 02:42 PM
Re: Event 646: Partition being reset due to watchdog timeout expiring
The last line of the dmesg output, "Line 1232 in /ux/core/kern/common/io/pat_psm.c: pat_heartbeat-send log - rc -1 status -5", suggests that you should update your PDC firmware if the error appears repeatedly:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=868351
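To get a rough idea of whether the error is repeating, one option is to count the matching lines in the dmesg output. This is only a sketch: the function name `count_heartbeat_errors` is made up for illustration, and it simply looks for the string "pat_heartbeat" from the message quoted above.

```shell
#!/bin/sh
# Count the lines on standard input that mention the pat_heartbeat
# failure quoted above. Reading from stdin lets the same function be
# used on live dmesg output or on a saved copy of it.
count_heartbeat_errors() {
    # grep -c prints the count; "|| true" keeps the exit status zero
    # when the count is 0 (grep itself returns 1 when nothing matches).
    grep -c 'pat_heartbeat' || true
}

# Usage:
#   dmesg | count_heartbeat_errors
```

Keep in mind that dmesg prints a fixed-size message buffer, so the count only covers whatever is still in that buffer, not the full history since boot.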
regards.
(p.s. please remember to assign points.)