Re: Upgrading Agent on Infrastructure and satellite servers

I wouldn't say "supposed to"; I'd describe it as more of a "nice to have".

If an agent upgrade on a core or satellite server were critical to the success of the CORD patch deployment, I would expect either a) the upgrade to be automated as part of the CORD deployment, or b) the release notes to call out the activity explicitly.

Conversely, upgrading simply for the sake of upgrading tends to create opportunities for failure. This is much more the case on a core server than on a satellite, mainly because users run the Agent Upgrade custom extension. There's no official word that using this custom extension on a core or satellite is supported. One of the big risks of an agent upgrade on a core or satellite has to do with node attachments / service levels, i.e., they can end up detached. Time-consuming reconstruction of the device record is the path of least resistance at that point.

The other risk (for core servers only; this isn't a risk for satellites) is that an opswgw.args file can get deployed. It may not have an immediate detrimental effect, but as soon as services get restarted, the effects become obvious and significant. If the service restarts don't occur until weeks or months after the agent upgrade, the connection between cause and effect is not immediately obvious.
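
Because the symptoms can show up long after the upgrade, one way to catch this early is to compare a snapshot taken before the agent upgrade with one taken right after. The following is only a rough sketch of that idea, not an official tool: the function names are made up for illustration, and the directory to search is left as a parameter because the post doesn't say where the file lands.

```python
import os
from pathlib import Path


def find_named_files(root, name="opswgw.args"):
    """Return {path: mtime} for every file called `name` under `root`.

    `root` is whatever directory tree you choose to search (an assumption;
    the location of the file is not stated here).
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            path = Path(dirpath) / name
            found[str(path)] = path.stat().st_mtime
    return found


def newly_deployed(before, after):
    """Paths that are new, or whose mtime changed, between two snapshots."""
    return sorted(p for p, mtime in after.items() if before.get(p) != mtime)
```

The intended use would be to call find_named_files() before the upgrade, call it again afterwards, and pass both results to newly_deployed(). Anything it returns is worth looking at before the next service restart rather than after.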

One other incorrect result that we see from time to time is that a new MID gets created for the core server in question and written over the existing value in /etc/opt/opsware/agent/mid. Once the agent restarts, the server has effectively been dissociated from the original core server device record. So long as the original core server device record still exists (and hasn't been deleted by some well-meaning admin doing cleanup), the recovery is pretty straightforward: just edit /etc/opt/opsware/agent/mid and put the MID of the original core server device record there, then restart the agent. (Make a backup of the interim mid file, just in case there's a need to retrace footsteps.) Make absolutely certain you're restoring the correct device record's MID to the mid file, as specifying one for any other server could have unpredictable results.
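
For anyone who would rather script the backup-and-restore step than do it by hand, here is a minimal sketch. It assumes the mid file contains nothing but the MID as plain text, which I believe is the case but you should confirm on your own core first. The function name and the backup file naming are just illustrative. It deliberately does not restart the agent; do that yourself once you've checked the result.

```python
import shutil
import time
from pathlib import Path

MID_FILE = Path("/etc/opt/opsware/agent/mid")  # path as given above


def restore_mid(original_mid, mid_file=MID_FILE):
    """Back up the interim mid file, then write the original MID back.

    Returns the backup path, or None if the file already holds the
    original MID. Assumes the file contains only the MID as text.
    """
    mid_file = Path(mid_file)
    original_mid = str(original_mid).strip()
    if not original_mid:
        raise ValueError("original MID must not be empty")

    raw = mid_file.read_text()
    if raw.strip() == original_mid:
        return None  # nothing to restore

    # Keep the interim file so the steps can be retraced later.
    backup = mid_file.with_name(f"{mid_file.name}.interim.{int(time.time())}")
    shutil.copy2(mid_file, backup)

    # Preserve whatever trailing-newline convention the file already used.
    newline = "\n" if raw.endswith("\n") else ""
    mid_file.write_text(original_mid + newline)
    return backup
```

Run it as a user with write access to the agent directory, passing the MID taken from the original core server device record, and double-check that value before you run it.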

I can see wanting to keep all devices' agents at exactly the same version level, but core and satellite servers should be treated as special cases. At the very least, don't use the Agent Upgrade custom extension on cores/satellites; this really should be a manual process, one server at a time.