Re: Logical name timeout on OVMS8.3?

 
Milton Baar
Frequent Advisor

Re: Logical name timeout on OVMS8.3?

Hi Volker

I will use Wireshark and see where I get, but how can I tell which timers to tweak? As far as I can tell from the OVMS NFS client documentation, I have already experimented with all of them. Could you give me some idea of how to tell, from a TCP trace, which timers may be involved other than those I already know?

Cheers
Volker Halle
Honored Contributor

Re: Logical name timeout on OVMS8.3?

Milton,

by watching the packets on the wire and looking at the timestamps, one may find certain packets repeating at certain intervals. Then it's time to look for timer settings in the various protocol levels.
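As a rough illustration of that timestamp check, here is a minimal Python sketch. It assumes you have already exported the capture times (in seconds) of the packets that repeat the same request, for example from a Wireshark packet list. The function names and the 20% tolerance are made up for this example and are not part of any tool.

```python
from typing import List


def retransmit_gaps(timestamps: List[float]) -> List[float]:
    """Return the time gaps (seconds) between successive repeats of one request."""
    ordered = sorted(timestamps)
    return [round(later - earlier, 3) for earlier, later in zip(ordered, ordered[1:])]


def classify_gaps(gaps: List[float], tolerance: float = 0.2) -> str:
    """Guess whether the repeats follow a fixed timer or a doubling backoff."""
    if len(gaps) < 2:
        return "too few repeats"

    def close(value: float, expected: float) -> bool:
        # Relative comparison, so capture jitter does not hide the pattern.
        return abs(value - expected) <= tolerance * expected

    if all(close(gap, gaps[0]) for gap in gaps[1:]):
        return "fixed interval"
    if all(close(later, 2 * earlier) for earlier, later in zip(gaps, gaps[1:])):
        return "doubling backoff"
    return "irregular"
```

A fixed interval suggests a single retry timer at one protocol level. A doubling pattern suggests a backoff scheme, so the setting to look for is the initial timeout plus a retry count.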

OpenVMS does a DIR, which translates to a QIO to the mounted NFS disk. This has to generate some TCPIP (RPC?) traffic to the NFS server, which is probably being repeated because no responses are received.

Volker.
John Gillings
Honored Contributor

Re: Logical name timeout on OVMS8.3?

Hi Milton,

Not a solution to the timeout setting, but something that might help.

When using systems that depend on network connections, you need to think in terms of network breakages being EXPECTED conditions, rather than rare exceptions. Your systems need to be able to detect and adapt to any connectivity issues.

Your issue is that after detection, your processes don't remember that a breakage has occurred, so they have to rediscover it multiple times (which costs significant time). You need a way to flip a switch to place the system into a "degraded" state until the issue is resolved. You may then need a mechanism to recover any necessary state and reintroduce the lost resource.

Assuming this is a system wide logical name, you could have a process monitor the connections to the search list entries. If it finds one getting slow, or lost, redefine the logical name, either dropping the dead entry, or changing the order (putting it on the end of the list). Also send an alert to someone to fix whatever's wrong with it.
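The reordering logic might look something like the Python sketch below. It is only an outline of the idea. The logical name WEBDATA and the device names DNFS1: and DNFS2: are hypothetical, the probe assumes the NFS server accepts TCP connections on port 2049, and on a real system the resulting DEFINE/SYSTEM command would be issued from DCL by the monitoring process.

```python
import socket
from typing import Callable, List

NFS_PORT = 2049  # standard NFS port; assumes the server offers NFS over TCP


def nfs_reachable(host: str, timeout: float = 2.0) -> bool:
    """Crude health probe: can a TCP connection to the NFS port be opened in time?"""
    try:
        with socket.create_connection((host, NFS_PORT), timeout=timeout):
            return True
    except OSError:
        return False


def reorder_search_list(entries: List[str], is_healthy: Callable[[str], bool]) -> List[str]:
    """Keep healthy entries in their original order and move failed ones to the end."""
    status = [(entry, is_healthy(entry)) for entry in entries]
    healthy = [entry for entry, ok in status if ok]
    failed = [entry for entry, ok in status if not ok]
    return healthy + failed


def define_command(logical_name: str, entries: List[str]) -> str:
    """Build the DCL command that redefines the system-wide search list."""
    return "DEFINE/SYSTEM {} {}".format(logical_name, ",".join(entries))
```

Moving a failed entry to the end rather than dropping it means lookups still reach it as a last resort, at the cost of one slow timeout when everything ahead of it misses. Dropping it avoids that delay but requires the monitor to add it back once the server recovers.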

Getting back to the timeout... be careful about reducing it too far. You risk introducing the issue of false triggers. A one second delay to a network device is hopefully rare, but it's entirely possible it's a transient. If it were generally "a good plan"(tm) to set defaults that low, that's what they would be!
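One common way to keep a short probe interval without tripping on transients is to require several consecutive failures before declaring a server dead. The Python sketch below shows that idea only. The class name and the threshold of 3 are arbitrary choices for illustration, not anything provided by OpenVMS or TCPIP Services.

```python
class FailureDebouncer:
    """Treat a server as down only after `threshold` consecutive failed probes."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result; return True while the server counts as down."""
        if probe_ok:
            # Any success resets the count, so isolated glitches never add up.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

With a one-second probe and a threshold of 3, a single dropped response is ignored, while a real outage is still flagged after about three seconds.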
A crucible of informative mistakes
Milton Baar
Frequent Advisor

Re: Logical name timeout on OVMS8.3?

Hi John

Nice to hear from you after such a long time :)

The particular issue here is that I have a clustered system of DS10Ls serving a web site running WASD. There is no real redundancy in the "common" file system because there isn't one: the data source lives on an NFS server and each cluster node gets its data from it.

Although I have front-end redundancy (multiple DS10Ls), I want to get back-end redundancy and the simplest (?) way seems to be to have multiple NFS servers. The NFS servers are synchronised in real-time (sub-second anyway) and the site never changes between updates, so there is no issue with one WASD server showing different pages to the other.

Anyhow, I need some mechanism so that if one of the NFS servers dies, OVMS will default to the other without manual interaction. This is not possible within the WASD functionality and the best solution appeared to be a logical search list - sigh.

Of course, I could have a batch job running that, every second, checks that both NFS servers exist and, if one doesn't, alters the system-wide logical name for the data source. But I was hoping for something more elegant within OVMS!!
Hoff
Honored Contributor

Re: Logical name timeout on OVMS8.3?

This is a re-post. The previous attempt to submit this (as has happened with ITRC) failed. Apologies for any duplicate postings.

The classic OpenVMS approach here would be a cluster, with round-robin DNS or load-balancing web appliances out front or other such distribution. You'd bring the contents of your storage on-line as cluster host(s) or storage. Not as NFS. Perhaps via FC SAN with some MSA widgets, or via HBVS with client-local spindles.

Or you'd see some form of replication; how that might work depends on what sort of data synchronization requirements are here. There are various replication offerings and options here, though comparatively few are seen on OpenVMS Alpha.

More recently, you might see hosting on EC2 with EBS, or with another (distributed) content provider. But that's the entire retirement of these WASD boxes.

I'd not expect to find searchlists of NFS devices used for this sort of thing. Which leads me to guess you have some particular reason here for using NFS and not clustering.

Ping HP customer support directly and see if they have suggestions beyond the NFS MOUNT /TIMEOUT and /RETRIES stuff. (The /CACHE stuff is around how long existing data within the client is stashed away for another I/O before a fresh fetch and a fresh copy of the data is needed; this is not the server timeouts, BTW. If the stuff is not changing often, then keeping the data bits in the cache is no big deal.)
Milton Baar
Frequent Advisor

Re: Logical name timeout on OVMS8.3?

Hi

Um, Hobbyist licenses on a small cluster, serving web pages for a dancing group. I do it because I have used OVMS since 1978 and still like to keep my skills up - or not, in this case :) So, no ability to use any of the "correct" technology, just some robust DS10Ls that run WASD and just don't stop, relatively immune from nasty people trying to do things, static web pages that only change after a dancing competition, but over 200 GB of data (mainly photos, stretching back to 1997).

So, I realise this is not the way to do it, but I just don't have the ability to use the better/correct tools and technologies. Hence, in the spirit of all hackers/DECUS members (since 1979), I am trying to kludge/force something to work with what I have or what I can get :)
Hoff
Honored Contributor

Re: Logical name timeout on OVMS8.3?

Scrounge up a multi-host SCSI controller for these old AlphaServer DS10L boxes (for the open PCI slot) and some disk and a shelf? The hobbyist license has the license PAK for clustering and the other licenses you need here.

Or (if you're within shipping range, or have a good friend who is) pick up one of the c. US$200 HP rx2600 boxes that have been available, and retire the existing Alpha boxes.

Alpha prices are in free-fall (AlphaServer ES45 at US$500), and the Integrity prices are often low, so spending much on parts for old boxes is something you'll want to carefully consider.

Milton Baar
Frequent Advisor

Re: Logical name timeout on OVMS8.3?

Sigh - here in the unfashionable part of the universe - that is Australia, prices are high, the dollar is low, and shipping and import tax from the US is approximately the price of a MacPro!

I actually have 1.8 TB of shelf, but the FC cards, switches etc. just make it too expensive, and there are no cheap Integrity units here. I haven't seen a multi-host SCSI card; I will do some checking.
Milton Baar
Frequent Advisor

Re: Logical name timeout on OVMS8.3?

So, I have now done a lot of reading in the last few days and also spent a couple of hours on Hoff's site, HP, OpenVMS.org and, of course, Google. It seems that:
1. OpenVMS has poor support for NFS clients and uses outdated versions (no surprise there) that perform poorly.
2. Ditto SMB client.
3. Ditto iSCSI, even if I was using Itanic (which I'm not and can't really afford to).

I just need 3 DS10Ls to access a 250 GB data source that will be replicated for redundancy. That data source will be populated by OSX and Windows (whatever I am using at the time), so SMB/AFP/FTP are all methods to get data onto it. It currently sits on replicating FreeNAS servers because a) they work, b) they are cheap, c) it is a single manageable data repository that all cooperating OSs can share (Windows via SMB, OSX via SMB or AFP, OVMS via NFS). Easy to manage, maintain and back up.

Apart from migrating servers to Itanic (not happening - far too expensive), installing new SCSI cards and replacing the shelf with an MA500 (same problems as former option), or dumping OVMS/WASD and moving to the Dark Side, can anyone offer other suggestions? Again, this is a self-funded, non-income community service and I really want to try to do it within the technologies at hand, and I *really* want OVMS to be in there!
Hoff
Honored Contributor

Re: Logical name timeout on OVMS8.3?

Given there's apparently no SCSI gear here and given the ATA disk addressing limits (c. 137.4 gigabytes) within SYS$DQDRIVER, you're rather limited on what storage you can share in a cluster (and with what you can shadow) among your OpenVMS Alpha and AlphaServer DS10L boxes.
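For reference, the c. 137.4 gigabyte figure matches the ceiling of 28-bit ATA logical block addressing with 512-byte sectors. That this is the origin of the SYS$DQDRIVER limit is an assumption here, inferred from the number itself. The arithmetic, as a small Python sketch with illustrative names:

```python
LBA_BITS = 28        # classic ATA logical block address width
SECTOR_BYTES = 512   # traditional sector size


def ata_limit_bytes(lba_bits: int = LBA_BITS, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest capacity addressable with the given LBA width and sector size."""
    return (2 ** lba_bits) * sector_bytes


def fits_on_one_disk(data_bytes: int) -> bool:
    """Would a data set of this size fit under the 28-bit ATA ceiling?"""
    return data_bytes <= ata_limit_bytes()


limit = ata_limit_bytes()
print(limit)                      # 137438953472 bytes
print(round(limit / 10 ** 9, 1))  # 137.4 (decimal gigabytes)
print(limit // 2 ** 30)           # 128 (binary gibibytes)
print(fits_on_one_disk(250 * 10 ** 9))  # False
```

So a 250 GB data set, as described earlier in the thread, would not fit on a single ATA disk under that limit.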

I went through a fairly similar decision process a while back, and posted up some notes from that. Though you are fairly well stuck here, given the available hardware gear and the budget and the application requirements and the current implementation. Or you can continue to operate with NFS and such.

The (used) HP rx2600 series boxes available in the US are running c. $200, plus the cost of a DVD and the requisite SCSI disks, and scrounging up an OpenVMS I64 DVD kit. The cost here is going to be the shipping. And if the aggregate costs including the shipping costs and the import fees are up in the Mac Pro range, well, then the various alternatives here probably aren't really fodder for this forum.