Best Practices on OpenVMS patches on a static system

 
Gary H. Mims
Occasional Contributor

Definition of static system from my perspective: We are running OpenVMS 8.3-1H1, Cache, and Multinet 5.3A. We are not adding additional products, especially OpenVMS products. The variables are the users who log in to application software written for Cache. The Cache version is new. Essentially, this is my definition of a static OpenVMS system. Our current OpenVMS system came online in February of 2010. Since that time we have had problems due to Cache and Multinet. Both have been resolved. For maintenance, I reboot everything once a month.

My boss thinks that we need to be 100% up to date on all of the OpenVMS patches, regardless of their priority.

On our previous hardware and OpenVMS 7.3-2, I basically ignored all patches, except for a firmware patch that was necessary for our SAN. In essentially ignoring the OpenVMS patches for 7.3-2, we never experienced a single downtime incident due to OpenVMS. That alone attests to the stability of OpenVMS.

Back on track, we reboot our current static system once a month.

As to my thinking on any OpenVMS patches, I take the side of "if it ain't broke, let's don't try to fix it", regardless of the latest patches from HP on OpenVMS.

From my perspective, any introduction of a patch introduces an element of chaos to a static system. We don't know how it will affect Cache or Multinet. Proof of stability lies in the monthly reboots without crashes in between.

So what is the "Best Practice" in this environment? I say that if we upgrade Cache, or Multinet, then it is time to look at the OpenVMS patches at all levels.

But, in between those times, I believe the safest "Best Practice" is to ignore the monthly OpenVMS updates, and to only apply the big patches when they are released.

I can say that with our previous hardware and OpenVMS 7.3-2, I did minimal patching, and in 6 years, we never had a single crash that was due to OpenVMS. On our new system, we are simply using newer versions of Cache and Multinet.

My response to my boss is that in being on the leading edge of OpenVMS with patches, we expose ourselves to the "bleeding edge."

To summarize, I would like to hear from other OpenVMS systems managers on Best Practices in applying patches to a static system.





14 REPLIES
Jan van den Ende
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary,

(when WE still controlled things) we tried a somewhat less conservative stratagem:

1. Preferably be some 6 months behind on new versions and patches.
2. Actively scan (OpenVMS.org, ITRC, comp.os.vms) for any adverse effects. Skip any reported patches.
3. Install patches to a sandbox. Try to throw everything at it that may be relevant to production (ah, but reality is a surprisefull bitch!)
4. Evaluate "critical" patches for relevance to OUR config. Skip if irrelevant; until included in UPDATE bundles. (but the relevant patches in there usually were installed before the bundle)
5. NEVER shut down - except single nodes for "rolling reboots".
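As a rough illustration only, steps 1 to 4 of the list above can be read as a yes/no filter on each patch. The sketch below is in Python rather than anything OpenVMS-specific. The `Patch` fields, the kit name, and the 182-day reading of "some 6 months" are all assumptions made for the example, not anything from the poster's site. It also simplifies step 4 by treating relevance as a plain flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "some 6 months behind" is approximated as 182 days.
MIN_AGE = timedelta(days=182)


@dataclass
class Patch:
    name: str              # hypothetical kit name, for illustration only
    released: date         # date the kit was published
    adverse_reports: bool  # any problems reported in forums or newsgroups?
    relevant: bool         # does it touch anything in OUR configuration?
    sandbox_passed: bool   # did it survive testing in the sandbox?


def ready_for_production(patch: Patch, today: date) -> bool:
    """Return True only if the patch clears every check in steps 1 to 4."""
    old_enough = today - patch.released >= MIN_AGE
    return (old_enough
            and not patch.adverse_reports
            and patch.relevant
            and patch.sandbox_passed)


# Usage with made-up data: a clean, relevant kit released seven months ago.
kit = Patch("EXAMPLE_KIT", date(2010, 1, 1), False, True, True)
print(ready_for_production(kit, date(2010, 8, 1)))
```

Step 5 (rolling reboots only) is an operational rule rather than a per-patch check, so it is left out of the filter.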

Full 365 * 24 service for 10 years.

Of course, that all changed drastically when IT was outsourced, and a totally different stratagem was dictated which essentially reduced us to mindless drones.

-- I no longer have any knowledge of how the site has fared over the last couple of years.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian Miller.
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

I would look at using the UPDATE patches. When a new UPDATE kit appears, wait for a few weeks, download it and install on a test system. Do testing for a few months then install on production (probably about the time the next UPDATE kit is announced).
This would then be a quarterly patch regime. You could stretch this to 6 months or a year by looking at every 2nd UPDATE kit or every 3rd UPDATE kit.
Note that UPDATE kits are supposed to consist of previously released patches.
You can buy a patch service as part of your support contract with HP.
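
To make the timing concrete, the UPDATE-kit regime described above can be sketched as simple date arithmetic. This is a Python sketch under stated assumptions: "a few weeks" is taken as 3 weeks and "a few months" of testing as 90 days. Both are placeholder values, and the release date in the usage lines is made up.

```python
from datetime import date, timedelta


def update_kit_schedule(released: date,
                        wait_weeks: int = 3,
                        test_days: int = 90) -> dict:
    """Return the test-install and production-install dates for one UPDATE kit.

    wait_weeks and test_days are assumed values standing in for
    "a few weeks" and "a few months".
    """
    install_on_test = released + timedelta(weeks=wait_weeks)
    install_on_production = install_on_test + timedelta(days=test_days)
    return {"test": install_on_test, "production": install_on_production}


# Usage with a hypothetical release date.
schedule = update_kit_schedule(date(2011, 1, 10))
print(schedule["test"], schedule["production"])
```

With these defaults the production install lands about 16 weeks after release, which is roughly when the next quarterly kit would be announced.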
____________________
Purely Personal Opinion
Robert Gezelter
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary,

I essentially agree with Jan and Ian. Being at the "bleeding edge" is potentially a problem for a production system.

However, I would not consider a system "static". While the system may be "static" in the grand-scale sense, this is almost never true in the details. For example, consider a bug relating to disk error handling on a particular device. If your system never hits an error, then the bug is never a problem. When one encounters the error, it is a different story.

With OpenVMS, I have rarely needed a regular reboot. I have encountered applications that have memory leaks, and need restarting, but not the underlying system.

Quarterly patch kits applied to a sandbox 30 days after release, then tested for two to three months on the test system, is certainly one balance.

The OP does not note which platform (Alpha/Integrity) is in use. Certainly, virtual environments make creating test systems far easier.

- Bob Gezelter, http://www.rlgsc.com
SDIH1
Frequent Advisor

Re: Best Practices on OpenVMS patches on a static system

I agree with the other respondents, patch conservatively.

Also consider this: most patches are published for a reason: someone else ran into a problem with the software you are running before you ran into that problem yourself. The software is broken. By not applying patches you choose not to fix what is broke.

I do not understand the monthly reboot at all, unless you have found a specific issue and found no other way to solve it than by a reboot.
SDIH1
Frequent Advisor

Re: Best Practices on OpenVMS patches on a static system

Something else: most people consider it bad practice to combine changes that have no strict dependency. You said you would consider VMS patches at the time an application update is due. If this means that you would install VMS patches at the same time as the application updates, then you would really complicate finding the root cause of any problems arising after that: it could be VMS, it could be the application.
Hoff
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

What are your production downtime costs?

As for your question...

You have at least two choices here.

+ Do what your boss wants. Yes, this is an approach that is not without risks. This approach can also be somewhat more expensive (without any patch-related problems), as (if you don't "batch" the patches together and apply them in bigger granules) the usual pre- and post-upgrade data archives do lengthen your production downtime.

+ Follow a more conservative approach and patch on a less frequent schedule (eg: quarterly) in the production environment (obviously barring specific process-critical errors), and do not apply any "new" patches.

Monthly reboots can be an entirely reasonable practice in some production environments, and particularly if you're clustered. These reboots allow you to test the resumption of operations, and can allow you to roll in upgrades and patches (the so-called rolling upgrade), and to get (good, complete, consistent) copies of otherwise active disks.

As for best-practices, get copies of your data before and after the patch application. This can be via BACKUP or via splitting out (quiescent) shadowset member volumes, etc. This is your recovery path if Bad happens. The before-patch archive is the rollback path; your exit strategy should a patch go wrong. The after-patch archive is to get a consistent archive for DT purposes.

And get a test environment available before deploying patches into a production environment, too.

No, I don't trust any software vendor to always work. Not for a production application, and specifically not for life-critical applications, nor various classes of manufacturing applications.

This decision is the boss's call regardless, and s/he may have good reason(s) for asking for this fast-patch procedure.
Hoff
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary H. Mims
Occasional Contributor

Re: Best Practices on OpenVMS patches on a static system

Thanks for all of the responses, they have been helpful.

We have a single new Itanium server which replaced our Alpha ES-47. So I don't have a test environment to work with.

To clarify one point: I would never mix patches and new software at the same time. Our current standing on OpenVMS is a quarterly patch, and two others which, upon reading the details, will not touch anything we use. Each patch requires a reboot. None of these patches are critical.

On the other hand, I have two Multinet patches which are important, but not critical. Our IT Security group is messing with FTP's and somehow the patches got corrupted. I missed my maintenance window in trying to get them ready.

I want the Multinet patches to exist in the system for a while before applying the OpenVMS patches.

Based on your great responses in the thread, the consensus is that you would only recommend applying the quarterly patches, and at the least try to track down the importance of the two other patches.

Thanks,

Gary
Hoff
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

For the basic OS testing stuff, you can snag a used rx2600-class box and some disks and a single- or dual-core VMS license for around US$2000.

The hardware itself regularly shows up used for US$300 to US$500, plus some parts, or somewhat more than that if you want a vendor that will support the stuff and answer your calls.

When last I checked, the base FOE license was around US$900 per core, and (for testing) you don't really need both cores.

I haven't checked the OpenVMS I64 base license prices again since the base license got renamed BOE.

If you're running critical production and are concerned around maintaining uptime, do yourself a favor and ask for a target-practice box.
Bob Blunt
Respected Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary, I worked in premium support for twelve years. We recommended to all of our customers that patches were important and that loading them was intended to make their environment more stable. Those most serious about uptime would maintain a system or set of systems where patches were introduced and "burned-in" before being installed in production. The patching schemes some used were incredibly elaborate, but I digress.

Your site has been *incredibly* lucky that you have had few downtime issues. I can say from direct experience that you could easily see that change as the amount of load or complexity of utilization increases. I would consider patches as a path to help prevent problems and develop some compromise with your boss. No matter what the individual patches are intended to fix, risk is always involved, and your site would have to decide if any specific one was important to your needs. For instance, a TCPIP Services patch might be a moot issue if you have Multinet or another stack. Or a patch to support or fix a very unique piece of hardware might not be important if your equipment doesn't include that peripheral or controller type. That was part of the deliverables we provided in premium support. We knew customer configurations and we'd recommend patches based on that information. Some of my most demanding customers, at least, installed the UPDATE patches on a quarterly basis after testing. Usually by that time it was well known if any problems existed with that patch bundle.

As long as your workload and luck hold out you could go years without problems. Keep in mind that when or if you do experience a software-related crash, the first questions generally relate to the version of the O/S you're using and what patches you've installed. If you're not relatively current, the first recommendation is usually to update the patches to current and call back if you're still having trouble. Ultimately it's your decision. If you're comfortable with loading patches before getting a "software" problem investigated and possibly fixed, that's fine.

bob
Gary H. Mims
Occasional Contributor

Re: Best Practices on OpenVMS patches on a static system

I continue to appreciate the input I am getting from all. I would simply point out that in our environment, I have never had the luxury of a test system, or a true "test environment" for our operating system and the applications that run on that system. We do have a "test system" at the applications level in the main system environment. I would suspect that larger companies would have a test systems environment and a test environment at the application level.

Putting it all in perspective, we are small, but based on what we do, the choice by those who fund us, is to minimize funding support for hardware, software, and people to maintain the same.

I can say that in our current position, the posture has worked very well. Upper management has been willing to risk the minimum downtimes that we have experienced.

But as our "business" responsibilities grow, and as our main application software becomes more complex, with the addition of modules and connections to other systems, I have become more concerned about the base core support at the OpenVMS level that could be provided without a real "test environment."

Considering all of the factors and what I have to work with, I think everyone responding has underscored that the non-critical OpenVMS patches should wait. Basically, those patches should wait until they are included in the quarterly UPDATE patch. And those UPDATE patches should be considered only after researching whether any problems have been reported in the various OpenVMS forums.

From everyone's input, have I correctly defined the Best Practices for Patches in my situation at the OpenVMS level?

Thanks for your continued comments,

Gary
Sloan Essman
Occasional Advisor

Re: Best Practices on OpenVMS patches on a static system

This was a good thread. I have an ES40 that has been in continuous production since 2002. In that time, MOST of my downtime has been facilities or infrastructure related but I've had 2 incidents that were not.

One was a bad batch of memory that didn't rear its head until about the 3rd year of use. The other was Audit-induced patching.

They insisted, against our protests, that we make the system 100% up to date on all patches immediately instead of letting us check everything out and give the released patches some time to mature. Well, we did so and about 3 weeks later the system started crashing. All indications pointed to a hardware issue, and even HP thought that too at first, but it was finally tracked down to a corrupt SCSI driver in one of the applied patches. Replacing the driver cured our ills, and the writeup that I presented to our Audit group resulted in them listening to us the next time they wanted to dictate our job practices.

I will say that it was cool in a way to see a new SCSI/Fibre patch get released a few weeks later with a fix for the bad patch we installed and say, "Look, there's OUR patch. We did that!" LOL
Man's flight through life is sustained by the power of his knowledge.
Robert Gezelter
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary,

One topic that did not get explored is the question of "Why would I need patches for hardware that I do not have?"

I will relay a story from a long-ago client. He had a stable environment, and fell far behind in patches, particularly those that were not "relevant" for his site.

A hardware casualty ensued, and I had to navigate provisioning replacement hardware in a sequence to deal with the fact that the client's unpatched system disks could not deal with some of the replacement hardware.

One can get into what pilots refer to as "behind the power curve" when a mandatory patch in one layered product has a series of prerequisites that were not deemed important. It is something to watch for.

A similar problem often occurs with software version updates.

- Bob Gezelter, http://www.rlgsc.com
John Gillings
Honored Contributor

Re: Best Practices on OpenVMS patches on a static system

Gary,

A bit of history... (bearing in mind that things may have changed a bit in recent times) OpenVMS "patches" have always been very heavily tested. From a quality perspective they're closer to most organisations' "production" grade software than "patch" grade.

In all the time I was involved in OpenVMS support, I can only remember ONE patch being withdrawn after release, and it was very fast, within a few days of release. So, if a patch is more than a week or so old, you can be fairly certain it's safe.

I can also only remember ONE patch which I would have recommended customers drop everything and install immediately (the NFS floating point bug). In general it's very rare that patches are that urgent.

In terms of your situation, I don't think the "bleeding edge" is quite as sharp as you might think, as anything which is released is generally safe. On the other hand, there are very few patches which are so urgent and critical that you are taking any significant risk by delaying installation.

"Best Practice" is a matter of opinion. I'd suggest you schedule downtime according to your business needs. At each scheduled downtime, plan to install any UPDATE kits, plus the current "Rating 1" kits. Check the Rating 2 kits to see if they're relevant to your system, and consider anything lower.

FWIW, the analogy I used for ratings was a pharmacy...

Rating 0 - like an epi-pen, get it in as quick as you can! (extremely rare).

Rating 1 - like an immunisation, apply to everyone "just do it"

Rating 2 - like a prescription drug, only take if you have the specific symptoms it treats, and possibly under supervision of your customer support centre ("doctor").

Rating 3 - like an over-the-counter medication, your choice to take if you like the sound of it.
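
For anyone who wants to turn the ratings above into a checklist, here is a minimal Python sketch of the triage. The action strings and the `relevant` flag are assumptions chosen for illustration. The only things taken from the post are the four rating levels and the advice to install Rating 1 kits at scheduled downtime, check Rating 2 kits for relevance, and treat anything lower as optional.

```python
def triage(rating: int, relevant: bool = False) -> str:
    """Map a patch rating (0-3) to an action, following the pharmacy analogy."""
    if rating == 0:
        # "Epi-pen": do not wait for scheduled downtime.
        return "install immediately"
    if rating == 1:
        # "Immunisation": everyone gets it at the next scheduled downtime.
        return "install at next scheduled downtime"
    if rating == 2:
        # "Prescription": only if the system shows the symptoms it treats.
        return "install at next scheduled downtime" if relevant else "skip"
    if rating == 3:
        # "Over the counter": the site's own choice.
        return "optional"
    raise ValueError(f"unknown rating: {rating}")


# Usage: a Rating 2 kit that does not apply to this configuration.
print(triage(2, relevant=False))
```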
A crucible of informative mistakes