02-21-2007 11:19 PM
Two VMS servers are coming up as Duty
We have three VAX 4105 servers running in Duty, Hot and Warm modes, and all servers have been up for around 90 days. The problem we are facing is that whenever we switch Duty over to the Hot server, the Hot server appears to struggle and the Warm server also comes up as Duty, which leads to a two-Duty scenario. When we did very frequent switchovers last year (with only a few days of uptime), there were no such issues.
My question: is it because of the very long uptime that these old VAX servers are behaving like this? Do we need to reboot the servers at regular intervals?
02-21-2007 11:27 PM
More details are needed. Looking at the information in this posting, my conclusion is that there is some problem with the local procedures used to switch between the different roles.
The posting does not include any information about the details of how the roles are managed. I would not expect a problem with OpenVMS itself to be the issue.
It is possible that a resource usage problem is affecting things, but the reason for that problem should be tracked down. I would also avoid just re-booting, as that will likely destroy the evidence of what the problem actually is.
- Bob Gezelter, http://www.rlgsc.com
02-22-2007 01:09 AM
Can you post a
$ show cluster
from each node?
It may be related to the uptime if, for example, the nonpaged pool is too small and keeps expanding until it can't expand any further.
Usually VMS servers need to be rebooted every 18 years (the famous Irish Railways!) or every 22 years (a node in a restricted area, so HP will refuse to confirm it), or even less often.
02-22-2007 01:44 AM
Depending on your VMS version (before 7.3 or after), you can do
$ monitor rlock
and you have the great SDA extension
$ ana/sys
lck
to get more info.
If you can install AMDS on the 3 nodes, it could help you a lot.
Depending on your VMS version, check if you can do
$ ana/sys
sh lock/waiting
sh lock/blocking
sh resource/contention
02-22-2007 02:34 AM
I would prefer to better understand the trigger, as it is quite possible that the trigger is not sensitive to application or system uptime. Rebooting might not cure the problem and, applying Murphy's Law, it probably won't work exactly when you most need it to work.
If something like the distributed lock manager (DLM) is not used to coordinate the roles, it's potentially easy to get the applications into the wrong states. Proper use of the DLM greatly eases the effort of ensuring each node is in exactly one state. Having coded this stuff manually -- outside a configuration where DLM is available -- it's not easy to get this right, and there are all manner of odd corner cases.
But as Bob G. says, there's nowhere near enough here to go on. And I concur, this looks to be an application or application coordination issue. In particular, take a detailed look at how the applications are coordinating the roles. If it is not using the DLM or if this is not a cluster, then the first spot I'd look is for race conditions and sequencing errors within this area.
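As a rough illustration of the "exactly one node holds the role" idea, here is a minimal Python sketch. It is not the DLM and not the poster's application: `DutyArbiter`, `try_become_duty`, `give_up_duty` and the node names are hypothetical, and a single in-process `threading.Lock` merely stands in for an exclusive lock that a real lock manager would arbitrate across nodes.

```python
import threading


class DutyArbiter:
    """Hypothetical stand-in for a lock manager holding one exclusive Duty token."""

    def __init__(self):
        self._token = threading.Lock()  # plays the part of an exclusive-mode lock
        self._holder = None             # name of the node currently holding Duty

    def try_become_duty(self, node):
        """Non-blocking request; returns True only if this node now holds Duty."""
        if self._token.acquire(blocking=False):
            self._holder = node
            return True
        return False

    def give_up_duty(self, node):
        """Release the token; only the current holder may do so."""
        if self._holder != node:
            raise RuntimeError(f"{node} does not hold Duty")
        self._holder = None
        self._token.release()

    def holder(self):
        """Return the name of the current Duty node, or None."""
        return self._holder


arbiter = DutyArbiter()
assert arbiter.try_become_duty("NODE_A")      # first claimant wins
assert not arbiter.try_become_duty("NODE_B")  # second claimant is refused
arbiter.give_up_duty("NODE_A")                # switchover: old Duty releases first
assert arbiter.try_become_duty("NODE_B")      # only then can the next node take over
```

The point of the design is that a node never decides on its own that it is Duty; it becomes Duty only when the arbiter grants the token, so a second Duty cannot appear while the first still holds it.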
Stephen Hoffman
HoffmanLabs
02-22-2007 04:55 AM
I'll also second Labadie's idea of looking at local resources. Collect feedback, run AUTOGEN, and look at AGEN$PARAMS.REPORT. Don't set the parameters or reboot until you've looked over the recommendations.
Andy
02-22-2007 05:24 AM
In an OpenVMS Cluster, having two primaries is the logical equivalent of a partitioned cluster.
There may well be performance issues here in the underlying system, or some other hardware or software problem. This could be anything from system settings to RMS file internal fragmentation to disk fragmentation to process quotas to, well, you name it. And these can most certainly stretch the timing or stress the error paths and can open up a case where you have multiple primaries.
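To make the "stretched timing" point concrete, here is a small Python sketch of one way it can happen. It assumes a heartbeat-and-timeout scheme, which the thread does not confirm this application uses; `StandbyNode`, `timeout_s` and the times are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandbyNode:
    """Hypothetical standby that promotes itself when heartbeats stop arriving."""
    timeout_s: float
    last_heartbeat_s: float = 0.0
    role: str = "Standby"

    def on_heartbeat(self, now_s):
        # Record the arrival time of a heartbeat from the current Duty node.
        self.last_heartbeat_s = now_s

    def tick(self, now_s):
        # Promote to Duty if the last heartbeat is older than the timeout.
        if self.role == "Standby" and now_s - self.last_heartbeat_s > self.timeout_s:
            self.role = "Duty"
        return self.role


standby = StandbyNode(timeout_s=3.0)
duty_role = "Duty"          # the other node: alive, but stalled in a resource wait

standby.on_heartbeat(0.0)   # last heartbeat before the stall
assert standby.tick(2.0) == "Standby"   # still within the timeout
assert standby.tick(5.0) == "Duty"      # the stall outlasts the timeout

# The stalled node never gave up its role, so both now claim Duty.
assert [duty_role, standby.role].count("Duty") == 2
```

Nothing in this sketch is broken in the usual sense; the slow node is simply late, and a timeout alone cannot tell "late" from "dead".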
I'd clean out the cruft in MODPARAMS.DAT (in particular, look for any parameter settings where there is no identified reason for the value, and for absolute settings where ADD_, MIN_ or MAX_ should be used), perform a full AUTOGEN pass as a start, and take a look at the Performance Management manual to try to get a handle on what is going on. Start up a MONITOR recording task to see what's happening over time. Record most stuff at, say, ten or fifteen minute intervals, and look at the trends. Look at the error logs. And if the application(s) are dropping into MWAIT states, find out what particular mutex is involved. (Having the IDSM, the Internals and Data Structures Manual, can be very helpful here, as it details the implementation of mutexes on OpenVMS.)
Stephen Hoffman
HoffmanLabs
02-22-2007 10:15 PM
The MWAIT-type issue indicates some kind of performance problem or resource constraint. It's almost impossible to tell what it could be without taking a careful look at the systems and the application. It may be one of the triggers that start a sequence of events leading to the "multiple systems coming up as Duty" symptom.
Three-way automatic determination of status is not easy. The state machine and the transitions are complex. Two-way is hard enough to get right under all possible circumstances. Here are the typical state transitions for two-way, to help you understand the kind of logic that's required:
Machine A            Machine B
--------------------------------------
Off to Master        Off
Master               Off to Standby
Master               Standby to Off
Master to Off        Off to Master
Master to Off        Standby to Master
Master to Standby    Standby to Master
And so on. Of course, these states actually represent the application - not the physical machine. In complex cases there's sometimes a time lag between the start of a transition and the completion of a transition.
You can see the thinking - it's not just the states they're in, but the states they're transitioning to and what action needs to be taken. You also have to cater for machines that hang rather than die.
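The transition list above can be written down as an explicit table that the code checks before acting. The following Python sketch is only one way to do that; the `Role` enum, `ALLOWED` set and `step` function are hypothetical names, and the table holds just the six rows listed here, not a complete design.

```python
from enum import Enum


class Role(Enum):
    OFF = "Off"
    STANDBY = "Standby"
    MASTER = "Master"


O, S, M = Role.OFF, Role.STANDBY, Role.MASTER

# Each entry is ((A before, B before), (A after, B after)), one per table row.
ALLOWED = {
    ((O, O), (M, O)),  # A: Off to Master,     B: Off
    ((M, O), (M, S)),  # A: Master,            B: Off to Standby
    ((M, S), (M, O)),  # A: Master,            B: Standby to Off
    ((M, O), (O, M)),  # A: Master to Off,     B: Off to Master
    ((M, S), (O, M)),  # A: Master to Off,     B: Standby to Master
    ((M, S), (S, M)),  # A: Master to Standby, B: Standby to Master
}


def step(current, target):
    """Return target if the joint transition is allowed, else raise ValueError."""
    if target.count(Role.MASTER) > 1:
        raise ValueError("refused: target state has two Masters")
    if (current, target) not in ALLOWED:
        raise ValueError(f"refused: {current} -> {target} is not in the table")
    return target


state = step((O, O), (M, O))   # A starts up as Master
state = step(state, (M, S))    # B joins as Standby
state = step(state, (S, M))    # planned switchover
```

Keeping the table as data makes it easier to review every row by hand, and any transition that is not listed is refused rather than silently accepted.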
I've designed quite a few real-time control systems in my time - and this is probably the most difficult area of the whole system design to get right and to test. I once found code that I thought was correct to have a small timing flaw that hardly ever showed up - and it finally happened 7 years after the system went into operation. It's now "perfect" because we revisited the design and went through the entire state machine, the transitions and the actions to be performed very very carefully (again).
If you have trouble understanding how the application works then you may need to involve the original supplier / designers, or seek external help.
Good luck.
Cheers, Colin (www.xdelta.co.uk).
02-23-2007 02:37 AM
Machine A            Machine B
--------------------------------------
Off to Master        Off
Master               Off to Standby
Master               Standby to Off
Master to Off        Off to Master
Master to Off        Standby to Master
Master to Standby    Standby to Master
and so on. There are also transitions that shouldn't happen which you need to guard against - as you've seen.
02-23-2007 10:35 PM
from your Forum Profile:
I have assigned points to 239 of 321 responses to my questions.
Most of the threads with unassigned answers date back to 2005.
Maybe you can find some time to do some assigning?
http://forums1.itrc.hp.com/service/forums/helptips.do?#33
Mind, I do NOT say you necessarily need to give lots of points. It is fully up to _YOU_ to decide how many. If you consider an answer is not deserving any points, you can also assign 0 ( = zero ) points, and then that answer will no longer be counted as unassigned.
Consider, that every poster took at least the trouble of posting for you!
To easily find your streams with unassigned points, click your own name somewhere.
This will bring up your profile.
Near the bottom of that page, under the caption "My Question(s)" you will find "questions or topics with unassigned points " Clicking that will give all, and only, your questions that still have unassigned postings.
If you have closed some of those streams, you must "Reopen" them to "Submit points". (After which you can "Close" again)
Thanks on behalf of your Forum colleagues.
PS. - nothing personal in this. I try to post it to everyone with this kind of assignment ratio in this forum. If you have received a posting like this before - please do not take offence - none is intended!
PPS. - Zero points for this.
Proost.
Have one on me.
jpe
02-24-2007 11:10 PM
I would amplify Colin's comment. There is a significant possibility that the communications problem is caused by the MWAIT. The MWAIT itself would be caused by some completely unrelated bug in the code.
As was noted in your last post, these systems are not in an OpenVMS cluster.
My experience in situations like this is the same as that of Colin and Hoff: this type of logic is very sensitive to small errors, and it requires painstaking, careful analysis to ensure that all of the cases are properly taken care of.
- Bob Gezelter, http://www.rlgsc.com
03-02-2007 03:11 AM
We have found the bug. The problem happens only when the current Duty server is serving the external interfaces.