08-02-2004 08:55 PM
isr.ior TOC panic
One of our V-class servers had a bit of a panic, and HP here can't seem to diagnose it.
Here's some detail. Does anybody have a clue?
Reboot after panic: isr.ior = 0'240020.0'cc730dd0
q4> trace event 0
stack trace for event 0
crash event was a TOC
preArbitration+0x2e4
wait_for_lock+0x120
sl_retry+0x1c
pset_get_num_spu+0x18
pset_idle_loop+0x70c
idle+0x114
swidle_exit+0x0
08-02-2004 09:04 PM
Re: isr.ior TOC panic
This usually indicates either an HPMC, a user-induced TOC, or a Serviceguard TOC.
As the q4 output says it is a TOC, I would suggest that someone TOC'ed the system, or that there was a very rare event in which a hardware issue caused the TOC.
I would continue to chase your local HP Response Centre to look into this.
08-02-2004 09:10 PM
Re: isr.ior TOC panic
We can rule out human error.
So that leaves our great friend MC/SG.
What is the accepted code level for MC/SG on 11i these days?
11.15? And 2.02 for the API?
08-02-2004 09:25 PM
Re: isr.ior TOC panic
Not a lot to go on.
Do you use Veritas?
How is the patch level?
Is there a tombstone?
A good patch level is especially important.
Steve Steel
08-02-2004 09:43 PM
Re: isr.ior TOC panic
If this is a Serviceguard TOC, then you need not worry about the TOC itself. You need to check syslog.log (or OLDsyslog.log) on both servers to determine whether there was a cluster reformation and why it happened.
My guess is that you had a problem with the cluster heartbeat, so one of the nodes obtained the lock and the other one TOC'ed. This is proper Serviceguard behaviour.
If that is not the case, then I would suspect your patch levels. The q4 trace seems to indicate some kind of deadlock (I am not sure). Get the system to a good patch level and involve HP to analyse the dump.
If it is an HPMC, then you need to involve HP. I presume there is no HPMC, as you have indicated that "HP here can't seem to diagnose."
I am sure an HPMC is the first thing they would have checked for, and they found nothing. So they might be looking for problems in other areas.
Cheers,
Mohan.
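The syslog check suggested in this reply could be sketched roughly as follows. This is a minimal Python sketch, not an HP tool. It assumes the logs sit in /var/adm/syslog (the usual HP-UX location) and that Serviceguard messages mention the cluster daemon name cmcld; the function names are hypothetical, and the exact message wording will depend on the Serviceguard version, so the sketch filters only on the daemon name.

```python
from pathlib import Path

# Assumed HP-UX syslog locations; adjust for the node being examined.
DEFAULT_LOGS = ("/var/adm/syslog/OLDsyslog.log", "/var/adm/syslog/syslog.log")


def find_cluster_messages(lines, keyword="cmcld"):
    """Return (line_number, text) pairs for lines that contain the keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if keyword in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_logs(paths=DEFAULT_LOGS, keyword="cmcld"):
    """Scan each log file that exists and collect the matching lines per file."""
    results = {}
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue  # skip logs that are not present on this host
        with path.open(errors="replace") as handle:
            results[name] = find_cluster_messages(handle, keyword)
    return results


if __name__ == "__main__":
    for name, hits in scan_logs().items():
        for number, text in hits:
            print(f"{name}:{number}: {text}")
```

Running it on every node that was active at the time of the TOC and comparing the timestamps of the matching lines should show whether a reformation took place and which node started it.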
08-02-2004 09:44 PM
Re: isr.ior TOC panic
Still running Cluster Monitor A.11.13, Cluster API A.01.03.
08-02-2004 09:51 PM
Re: isr.ior TOC panic
I'm pretty convinced this is MC/SG. I've seen this before: while working on a cluster with the accepted commands (cmhaltpkg, cmhaltnode), one of the other nodes decides that it needs some type of quorum, but instead of just removing itself from the cluster it panics.
Hence my question about the latest trusted version of CM.
08-02-2004 09:51 PM
Re: isr.ior TOC panic
Up your patch level.
There have been a lot of changes in 2 years.
Also upgrade your software from 11.13 to 11.14 if you can.
Steve Steel
08-02-2004 09:56 PM
Re: isr.ior TOC panic
I hear what you say about the patches, but due to the sensitivity of the servers in question (telco billing production servers), we have to integrate a patch update into a test stream, and we are constantly lagging 12 months behind. Our next patch update is scheduled for Oct 2004, and we will only be able to certify on the Dec2003 GoldQPK. See my problem? I have to pinpoint a patch so that I can motivate a fix-on-fail install.
But thanks for the responses: I sort of suspected MC/SG all along.
08-02-2004 10:04 PM
Re: isr.ior TOC panic
I see your point. But if this is a telecom environment, then the vendor would have already given you the recommended and tested versions.
Is it Nokia/Lucent/Logica who has given you this solution? They will provide you the details you wanted to know.
Cheers,
Mohan.
08-02-2004 11:08 PM
Re: isr.ior TOC panic
As Mohanasundaram already told you, if this TOC was caused by an expired safety timer, then you need to look at the syslogs of all cluster nodes first (those that were active at dump time, of course). You may also check whether there is a core dump in /var/adm/cmcluster; cmcld may have crashed if the patch level is old.
Best regards...
Dietmar.
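The core dump check mentioned in this reply could look something like the following. This is a minimal Python sketch under stated assumptions: the directory is the /var/adm/cmcluster path named above, a core file is assumed to have a name starting with "core", and the function name is hypothetical.

```python
from pathlib import Path


def find_core_files(directory="/var/adm/cmcluster"):
    """Return files whose names start with 'core', newest first by mtime.

    The 'core' name prefix is an assumption about how the dump is named.
    """
    root = Path(directory)
    if not root.is_dir():
        return []  # directory missing on this host: nothing to report
    cores = [p for p in root.iterdir() if p.is_file() and p.name.startswith("core")]
    return sorted(cores, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_core_files():
        print(path)
```

An empty result only means that no file matching the assumed name was found; it does not rule out a cmcld problem.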
08-02-2004 11:51 PM
Re: isr.ior TOC panic
Firstly: patch level Dec2003 is currently being tested in partnership with the vendor/supplier. They will only certify this way, since we are running a modified source of their app. We're talking about $13 million for each bill cycle, and there are 8 bill cycles per month. Do the math... not a system to play with!
If I cause a revenue loss, I think beheading will be my choice of punishment!
On the CM side: no core dump but, as I said previously, definite activity on OTHER nodes in the same cluster.
08-03-2004 01:20 AM
Re: isr.ior TOC panic
I understand that the telco systems are critical. We are not asking you to experiment with this system either.
We do not want your head on the chopping block :). But looking at some of the symptoms, it looks like somebody has already put your head on the block.
You said "no core dump". Was it not configured? Or was there inadequate space to dump? If the server is so critical, why are such things not monitored?
Then where did you run your q4?
Are you sure this TOC was not the result of a genuine network problem? That's what all the respondents here want to ascertain. Can you share with us what you found in syslog.log?
I am sorry if I could not be of big help to you.
Cheers,
Mohan.
08-03-2004 03:08 AM
Re: isr.ior TOC panic
Jakes, I didn't ask you to play with this cluster. I just asked for the syslog contents. Up to now we only know of "a definite activity on OTHER nodes". Sorry, that's not enough to tell anything. Currently we are all reading tea leaves. Please post syslog extracts that show the history of the reformation (which must have happened when one of the nodes died).
Best regards...
Dietmar.
08-03-2004 08:01 PM
Re: isr.ior TOC panic
Oops! I missed that, Dietmar.
Jakes, I hope you have found the root cause by this time. If so, just share it with us.
Cheers,
Mohan.
P.S. Just call me "MOHAN".
08-03-2004 08:10 PM
Re: isr.ior TOC panic
I'll post the syslog stuff in a couple of days: up to my eyeballs in work at the moment.
01-22-2006 12:28 AM
Re: isr.ior TOC panic
Anyway, I gave some points....
01-22-2006 09:58 PM
Re: isr.ior TOC panic
Don't tell me you resigned due to this problem :-)
With regards,
Mohan.
01-22-2006 11:02 PM