Re: Preventing CONFIGURE startup

EdgarZamora · ‎10-10-2007

SYCONFIG is older than that and predates SYSMAN. If I recall correctly, SYSMAN probably came out in the 5.0 timeframe.

Anyway...

Bob,

I agree with you in principle and I'm a big fan of SYSMAN STARTUP. I especially like the resiliency of SYSMAN in tolerating startup errors. If I had a brand new VMS system I wouldn't even modify the SYxxx procedures, but too many people get confused by SYSMAN startup, especially in a cluster once you start enabling and disabling procedures on a per-node basis (which you don't see unless you do a SHOW/FULL). Most people favor the original way of using SYSTARTUP and the rest of the SYxxx procedures.

This is really an old religious debate, and I think it boils down to personal preference. In this case the SYCONFIG way is faster (if Bart's SYCONFIG.COM is in SYS$COMMON). Bart could either:

1. edit SYCONFIG.COM
2. delete SYCONFIG.COM;0 afterwards

OR

1. create xxx.com containing the same commands
2. insert it into SYSMAN startup for execution during the "initial" phase.
3. remove from SYSMAN startup when done.
4. delete xxx.com

If he has multiple sys$specific syconfig files then your way wins.

Robert Gezelter · ‎10-10-2007

Edgar,

Indeed. The practice of modifying the various SYS$STARTUP:SY*.COM files goes back to the beginning (okay, not quite 17 November 1858).

SYSMAN/STARTUP appeared, if I recall correctly (I am writing this without reference to the manuals or my systems), at 5.0, which is a bit later.

The files are activated with an implicit SYS$STARTUP prefix, so the file searching obeys the SYS$SPECIFIC/SYS$COMMON hierarchy.

Proper checking of STARTUP controlled files DOES require checking BOTH the enable/disable status and the parameters to each file. While this needs to be done, it is not particularly different from dealing with the SYS$SPECIFIC/SYS$COMMON hierarchy (where oie does need to remember WHICH SYS$SPECIFIC applies; I have had people assure me that the crashed cluster member did not have anything in SYS$SPECIFIC, and when I checked, I found that they were looking at the RUNNING NODE'S SYS$SPECIFIC).

In the end, I try to use the STARTUP database where possible. It is far safer to disable a product (using SYSMAN) than it is to edit SYSTARTUP_VMS.COM.
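For anyone who has not used it, a minimal sketch of that kind of SYSMAN session follows. The file name PRODUCT$STARTUP.COM and the node name NODEA are placeholders, not anything from this thread, and STARTUP$STARTUP_LAYERED is assumed to be the database holding the entry.

```dcl
$ MCR SYSMAN
SYSMAN> STARTUP SET DATABASE STARTUP$STARTUP_LAYERED
SYSMAN> ! List every entry with its phase, mode, parameters and per-node status
SYSMAN> STARTUP SHOW FILE /FULL
SYSMAN> ! Disable the (placeholder) product startup file on one node only
SYSMAN> STARTUP DISABLE FILE PRODUCT$STARTUP.COM /NODE=NODEA
SYSMAN> ! Re-enable it once the work is done
SYSMAN> STARTUP ENABLE FILE PRODUCT$STARTUP.COM /NODE=NODEA
SYSMAN> EXIT
```

The SHOW FILE /FULL step is the one that matters in a cluster, since the per-node enable/disable state is not visible without it.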

A side effect of using the STARTUP database is that with a modest degree of care, the time to system availability is substantially reduced.

- Bob Gezelter, http://www.rlgsc.com

Robert Gezelter · ‎10-10-2007

To all,

There is a typo in my previous posting:

... (where oie does need to remember ...

should read:

... (where one does need to remember ...
^^

My apologies for any confusion.

- Bob Gezelter, http://www.rlgsc.com

Jon Pinkley · ‎10-10-2007

I am not convinced that avoiding the startup of the CONFIGURE process is sufficient to prevent MSCP served devices from being seen. I did the following on a satellite node, and it did prevent the CONFIGURE process from starting, but it did NOT prevent the node from seeing all the MSCP served devices present at the time the node booted.

$ create sys$specific:[sys$startup]VMS$INITIAL-050_CONFIGURE.COM ! prevent CONFIGURE process from starting on this node
$! NOTE: this didn't prevent MSCP served devices from being seen on the next reboot!

See the previous thread:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1114738

The main problems are:

1. Once a device is set to served, it can't be set /noserved, whether as a result of mscp_serve_all or explicit set device/served. Setting the device /noavailable does not prevent it being MSCP served either.

2. Most VMS disk devices cannot be disconnected, like LD virtual devices can be.

3. There is no way I am aware of to selectively exclude MSCP served devices from CONFIGURE's point of view.

Is there a problem other than the devices showing up, but not being usable? I too wish there was a better solution than rebooting multiple times, but other than the devices showing up, and using some memory for the io database, is there a problem with having the devices in the io database?

Jon

it depends

Bart Zorn_1 · ‎10-10-2007

The thread Jon refers to confirms my suspicion that you cannot exclude DGA devices. CONFIGURE will configure anything it sees.

I have here a test cluster of two nodes. One DGA device has been made invisible by the storage team. One cluster member still MSCP serves this DGA device.
When I reboot the other member while preventing CONFIGURE from starting up, this DGA device does not get configured. As soon as I start CONFIGURE afterwards, the DGA device appears.

For a first test I prevented the CONFIGURE startup by editing SYS$STARTUP:VMS$INITIAL-050_CONFIGURE.COM. I will decide on a more sophisticated way later.

However, my first attempt to do it in a more supported way, by setting STARTUP$AUTOCONFIGURE_ALL to 0 and doing the SYSMAN IO commands in SYCONFIG, resulted in CONFIGURE not being started, but the DGA device did get configured!
I have no idea (yet) why.
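
For reference, a rough sketch of what that kind of SYCONFIG.COM setup can look like. The exact commands used in the test above are not shown in this thread, so the device name DKA500 in the /EXCLUDE list is purely a placeholder, and the SYSMAN help for IO AUTOCONFIGURE should be checked for which device types /EXCLUDE actually applies to.

```dcl
$! SYS$MANAGER:SYCONFIG.COM (sketch only, not the procedure used above)
$! Tell STARTUP not to autoconfigure all devices by itself.
$ STARTUP$AUTOCONFIGURE_ALL == 0
$! Autoconfigure by hand instead; DKA500 is a hypothetical device to skip.
$ MCR SYSMAN IO AUTOCONFIGURE /EXCLUDE=(DKA500)
```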

Note that not starting CONFIGURE does not interfere with a correct system startup. Of course, if the accessibility of disks depends on MSCP, that is another story.

Bart

Bart Zorn_1 · ‎10-10-2007

In response to Jon's last question:

Other than the fact that all those devices litter the IO database, I concur that there is no impact.

However, the next step will be that the remaining DGA devices will be moved from our current HDS storage boxes to new XP24000 boxes. That means (again) that a whole bunch of DGA devices will become obsolete and other new ones will appear.

The primary reason for planning the reboots is installing the latest ECOs. Because I am rebooting anyway, I thought that it would be a good idea to get rid of those zombies.

Bart

Jon Pinkley · ‎10-11-2007

STACONFIG runs early in boot in several cases; you can see the evidence of this on the console while you are booting. For example, if you do a minimum boot of a cluster node from a SAN device, it will still run.

My test yesterday was from a satellite that had no direct (non-MSCP) paths to the SAN storage. CONFIGURE never ran, yet I was able to see all the MSCP served devices. It was booting from an HBVS system disk.

You saw a different behavior, specifically that when CONFIGURE was not started, you saw only devices with direct attachments. Did you notice if the SAN devices that were loaded had MSCP served paths?

Regarding SYSMAN IO SET EXCLUDE: Comments in STACONFIG state that it supports the permanent exclusion list (PEL), whereas CONFIGURE makes no explicit claim to honor the PEL. However, I put the following in my satellite's IO exclude list:

$ sysman io show excl

%SYSMAN-I-OUTPUT, command execution on node DELTA
%SYSMAN-I-IOEXCLUDE, the current permanent exclusion list is: $1$DGA6902:
$

And prevented CONFIGURE startup with the empty sys$specific:VMS$INITIAL-050_CONFIGURE.COM hack, but the device still shows up (and it is not a member of a shadowset).

See attachment for evidence.

Jon

it depends

Jess Goodman · ‎10-11-2007

I'm pretty sure that if the CONFIGURE process is not running on node A then if node B reboots it will not "see" devices being MSCP served by node A, even if CONFIGURE is running on node B.

If the CONFIGURE process is later started on node A then node B will soon notice those devices.

So CONFIGURE apparently "announces" the availability of MSCP devices. I do not know what process on the system "hears" these announcements but it is not CONFIGURE anymore.

I say anymore because, IIRC, this behavior is somewhat new (VMS 6.2?). In earlier VMS versions, if CONFIGURE was not running on a node, that node would not find devices being MSCP served by other nodes.

So I think all you have to do is STOP CONFIGURE on all nodes and do a rolling reboot, assuming that the rebooted nodes do not need access to other MSCP served devices.

I have one, but it's personal.

Jon Pinkley · ‎10-11-2007

In my note dated Oct 11, 2007 15:35:07 GMT, I stated that I had excluded device $1$DGA6902 with sysman io set exclude=($1$DGA6902:) and it had no effect. However, I hadn't read all the help, specifically this from
--------
$ mcr sysman help io set exclude description ! from Alpha VMS 7.3-2
...
You cannot use the SYSMAN IO SET EXCLUDE command to exclude any
of the following device types:

o SCSI class-driver devices (DK, MK, GK) whose names include a
port allocation class or an HSZ allocation class

o Fibre Channel class-driver devices (PG, DG, GG)

This restriction also applies to SCSI devices on OpenVMS Alpha
Version 7.1 systems, if the SCSI device names include a port
allocation class.
---------

So STACONFIG may honor the Permanent Exclusion List (PEL), but the PEL doesn't honor all devices for exclusion.

The "zombie" devices are also a problem when an EVA snapshot of a device is presented and later unpresented.

Does anyone know why there is no ability to set a device /noserved when it is not mounted, or why DU/DK/DG devices can't be removed from the IO database? If they were cloned from a template device, would they then be disconnectable like LD devices, or is it just a coincidence that all the devices I can think of that can disappear from the IO database have xxx0: template devices (LTA0, VTA0, LDA0, RTA0, etc.)? However, none of those devices can be MSCP served.

It seems that MSCP serving a device is a binding contract to continue to serve that device for the duration of the boot. At the time MSCP was designed, devices had more permanence than the virtual devices that can be dynamically created and destroyed with things like snapshots on an EVA. The ability to clean up the zombie devices from the IO database without multiple reboots of all cluster members is something I think many VMS customers would appreciate.

Jon

it depends

Bart Zorn_1 · ‎10-11-2007

Jess,

Stopping CONFIGURE on the other nodes does not prevent the rebooting node from seeing the MSCP served devices. I just tested that.

Jon,

So it looks like there is a certain unpredictability about when MSCP served devices are configured. Once CONFIGURE is running, they will get configured eventually, but STACONFIG may or may not be able to do it. This may be an explanation of what I saw.

The DGA devices that were configured also did have the MSCP path from the other node.

Thanks!

Bart

Bart Zorn_1 · ‎10-11-2007

I am closing this thread now.

I will go for a temporary hack in SYS$STARTUP:VMS$INITIAL-050_CONFIGURE.COM to conditionally start CONFIGURE based on the USERD2 SYSGEN parameter.
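
A minimal sketch of what such a guard could look like, assuming the convention that USERD2 = 1 means "do not start CONFIGURE on this boot". That convention, the message text and the placement at the top of the procedure are illustrative assumptions, not the actual edit that was made.

```dcl
$! Hypothetical guard added near the top of
$! SYS$STARTUP:VMS$INITIAL-050_CONFIGURE.COM
$! Assumption: USERD2 = 1 means "skip CONFIGURE on this boot".
$ IF F$GETSYI("USERD2") .EQ. 1
$ THEN
$     WRITE SYS$OUTPUT "USERD2 is 1, skipping CONFIGURE startup"
$     EXIT
$ ENDIF
$! ... original contents of the procedure continue unchanged here ...
```

With a guard like this, USERD2 would be set to 1 before the reboot and put back to 0 afterwards. It only controls whether the CONFIGURE process is started; it says nothing about what other components configure.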

Thanks for all your thoughts!

Bart

Bart Zorn_1 · ‎12-19-2007

Although I already closed this topic, I want to report here that my solution did not work.

The CONFIGURE processes did not start, but the obsolete disk devices appeared nonetheless, of course only as MSCP served devices from the other nodes.

So it seems that there are two solutions to get rid of those devices:

- an entire cluster reboot
- a rolling reboot of all members, performed twice

Neither is nice.

It *would* be nice if the MSCP server stopped advertising devices which are either inaccessible or SET /NOAVAIL!

Regards,

Bart
