09-13-2010 07:53 AM
cluster transition timeout??
Any time we have to remove a node from our cluster, it causes issues on the other nodes. It appears that adding or removing cluster nodes kills a lot of our detached processes. They all fail with:
%RMS-E-RRF, recovery unit recovery failed
-RMS-F-ACC_AIJ, after image journal can not be accessed
%RMS-F-BUG, fatal RMS condition (00000004), process deleted
We see the same effect every time nodes are added to or removed from the cluster. It seems to depend on which AIJ files are being accessed at the time.
Is there some kind of timeout that we can extend to make things a bit more persistent/resilient during the cluster transitions? Thanks
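For readers unfamiliar with the message layout, here is a minimal sketch that splits a run-together OpenVMS message chain like the one above into its individual messages. It assumes the usual %FACILITY-SEVERITY-IDENT, text layout, with linked messages starting with "-" instead of "%". The function name split_messages and the returned field names are made up for illustration and are not part of any OpenVMS tool.

```python
import re

# Assumed layout: %FACILITY-S-IDENT, text  (linked messages start with "-").
MESSAGE_RE = re.compile(r"([%-])([A-Z0-9$_]+)-([SIWEF])-([A-Z0-9_]+), ")

SEVERITY = {
    "S": "success",
    "I": "informational",
    "W": "warning",
    "E": "error",
    "F": "fatal",
}


def split_messages(chain):
    """Split a message chain into one dict per message (hypothetical helper)."""
    matches = list(MESSAGE_RE.finditer(chain))
    messages = []
    for i, m in enumerate(matches):
        # The text of a message runs up to the start of the next prefix.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(chain)
        messages.append({
            "primary": m.group(1) == "%",
            "facility": m.group(2),
            "severity": SEVERITY[m.group(3)],
            "ident": m.group(4),
            "text": chain[m.end():end].strip(),
        })
    return messages


if __name__ == "__main__":
    chain = ("%RMS-E-RRF, recovery unit recovery failed "
             "-RMS-F-ACC_AIJ, after image journal can not be accessed "
             "%RMS-F-BUG, fatal RMS condition (00000004), process deleted")
    for msg in split_messages(chain):
        print(msg["facility"], msg["severity"], msg["ident"], "|", msg["text"])
```

Read this way, the chain shows a recovery-unit failure (RRF) whose linked cause is the inaccessible after-image journal (ACC_AIJ), followed by a separate fatal RMS bugcheck that deletes the process.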
09-13-2010 10:22 AM
Re: cluster transition timeout??
Are these files located on disks that are served by the system that is shutting down?
09-13-2010 01:19 PM
Re: cluster transition timeout??
Lots more detail is required here for anything better than wild guesses: OpenVMS version, architecture, number of cluster nodes, storage technology, locations of RMS data files and journals (direct access or served disks?).
I suspect RMS is permanently losing access to the journal files, so timeouts won't help. Also realise that during a cluster state transition, all user mode processes are suspended, so timeouts don't apply.
09-14-2010 01:09 AM
Re: cluster transition timeout??
The cluster has 6 nodes, all Alpha servers running OpenVMS 8.3. The disks are all directly attached from the SAN, and the RMS files are on these.
We've also received the same RMS error messages a few other times when not going through a cluster transition.
09-14-2010 01:14 PM
Re: cluster transition timeout??
>We've also received the same RMS error
>messages a few other times when not going
>through cluster transition
In other words, the issue is independent of the cluster state transition... Perhaps a problem with the SAN which is exacerbated by a state transition, maybe because the nodes stop talking to the SAN for a while?
The error message ACC_AIJ means exactly what it says: the system can't see the AIJ disk.
I'd be looking at logs for errors on the SAN, and/or re-checking the SAN configuration.
09-14-2010 03:52 PM
Re: cluster transition timeout??
From the accounting records, you can find the exact time at which those processes were deleted. Check for messages in OPERATOR.LOG at those times. Also check ERRLOG.SYS; there should be non-fatal RMS bugcheck entries reported.
Find out whether any other events were reported at exactly the same time as those RMS errors.
Volker.
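The correlation suggested above can be sketched roughly as follows, assuming the process deletion times and the log entries have already been extracted by hand into Python datetime values. The function name events_near, the default 5-second window, and the sample entries are all hypothetical; nothing here reads real accounting, OPERATOR.LOG, or ERRLOG.SYS data.

```python
from datetime import datetime, timedelta


def events_near(deletions, events, window_seconds=5):
    """For each deletion time, list the log events within the window.

    deletions: list of datetime values (e.g. copied from accounting records)
    events:    list of (datetime, text) pairs (e.g. copied from OPERATOR.LOG)
    """
    window = timedelta(seconds=window_seconds)
    return {
        deleted_at: [text for when, text in events
                     if abs(when - deleted_at) <= window]
        for deleted_at in deletions
    }


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    deletions = [datetime(2010, 9, 13, 7, 50, 0)]
    events = [
        (datetime(2010, 9, 13, 7, 49, 58), "sample operator message A"),
        (datetime(2010, 9, 13, 8, 15, 0), "sample operator message B"),
    ]
    for deleted_at, nearby in events_near(deletions, events).items():
        print(deleted_at, "->", nearby)
```

A small window is used rather than an exact match because timestamps written by different facilities may differ by a second or two; tighten or widen it to suit the logs being compared.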