Madgoat Watcher

SOLVED
James T Horn
Frequent Advisor

Madgoat Watcher

Has anyone seen any strange behavior running Watcher V3.2-1 on the I64 platform?
We have run Watcher for years with no issues, until recently. When we moved from AXP to I64, Watcher on our development system acted as if it ignored the configuration file we had set up, while our production system worked fine. Then recently the production system also started ignoring its configuration file.
17 REPLIES
Craig A Berry
Honored Contributor

Re: Madgoat Watcher

Have you enabled tracing to see if it gives you any clues about what's going on? Does the control program show you the rules that you think you've entered in the configuration file, e.g.,

$ mcr watcher_dir:wcp show watch

?

Note that WATCHER_STARTUP.COM does a RUN/DETACHED with a number of quotas explicitly set. The values that were appropriate on Alpha might well be inappropriate on Itanium.

I've only dabbled with Watcher but haven't rolled it out because I couldn't get the following very simple rule working. Basically this should allow everyone to stay logged in during the day but kick them out at night:

WATCH *
EXCLUDE */DURING=(PRIMARY:6-20,SECONDARY:6-20)

After enabling tracing and reading the code I was pretty sure that UIC wildcarding doesn't work for the EXCLUDE directive (and probably never has). But I don't know BLISS so I'm not at all sure I've understood the code correctly.

But in answer to your question, no, I have not seen a general inability to read the configuration file.
James T Horn
Frequent Advisor

Re: Madgoat Watcher

I've used "SET DEBUG=31", and according to the log it does set up my exclusions:

Example:
EXCLUDE HORN/DURING=(PRIMARY:0-23,SECONDARY=8-17)

Here I believe it is checking to see if HORN is excluded and finds the exclude record:
( 8-NOV-2010 13:58:04.15) Process: PID=000886BF, user=HORN, term=ERP01$FTA397:, accpor=
( 8-NOV-2010 13:58:04.15) -- Searching exclude list...
( 8-NOV-2010 13:58:04.15) -- Username: exclude=SYSTEM, process=HORN
( 8-NOV-2010 13:58:04.15) -- Username: exclude=FIELD, process=HORN
( 8-NOV-2010 13:58:04.18) -- Username: exclude=HORN, process=HORN
( 8-NOV-2010 13:58:04.18) -- UIC: exclude=[0,0], process=[2,5]
( 8-NOV-2010 13:58:04.24) -- Exhausted list: no match.
( 8-NOV-2010 13:58:04.24) -- Process found on count list
( 8-NOV-2010 13:58:04.24) -- Searching override list...
( 8-NOV-2010 13:58:04.25) -- Username: exclude=SAMINFO, process=HORN
( 8-NOV-2010 13:58:04.25) -- Username: exclude=NGL_SAMINFO, process=HORN
( 8-NOV-2010 13:58:04.25) -- Exhausted list: no match.
( 8-NOV-2010 13:58:04.25) -- Queueing count record for checking



From this I believe HORN has been idle for almost 55 minutes and, since no exclusion matched, will be terminated once the idle time passes the force threshold.

( 8-NOV-2010 13:58:08.71) Check for warn/force: PID=000886BF, user=HORN
( 8-NOV-2010 13:58:08.71) -- Force check: Is 00:54:55.68 GTR 01:10:00.00?
( 8-NOV-2010 13:58:08.71) -- Warn check: Is 00:54:55.68 GTR 01:00:00.00?
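
The two checks at the end of that trace read like plain comparisons of the idle time against a warn threshold and a force threshold. Here is a minimal Python sketch of that reading, assuming the "Is X GTR Y?" lines mean "is idle time X greater than threshold Y". The function names and the returned labels are made up for illustration and are not taken from Watcher:

```python
from datetime import timedelta


def parse_delta(text: str) -> timedelta:
    """Parse an 'HH:MM:SS.cc' string as printed in the trace."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def check_idle(idle: str, warn: str = "01:00:00.00", force: str = "01:10:00.00") -> str:
    """Return 'force', 'warn', or 'none' (hypothetical labels)."""
    idle_time = parse_delta(idle)
    # Test the larger threshold first, mirroring the order in the log.
    if idle_time > parse_delta(force):
        return "force"
    if idle_time > parse_delta(warn):
        return "warn"
    return "none"


print(check_idle("00:54:55.68"))  # idle time taken from the trace above
```

Under this reading, 00:54:55.68 is still below both thresholds, so neither a warning nor a forced logout would fire on this pass. That would only happen once the idle time grows past 01:00:00 and then 01:10:00.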
Jan van den Ende
Honored Contributor

Re: Madgoat Watcher

James,

from your Forum Profile:


I have assigned points to 53 of 112 responses to my questions.


Maybe you can find some time to do some assigning?

http://forums1.itrc.hp.com/service/forums/helptips.do?#33

Mind, I do NOT say you necessarily need to give lots of points. It is fully up to _YOU_ to decide how many. If you consider an answer is not deserving any points, you can also assign 0 ( = zero ) points, and then that answer will no longer be counted as unassigned.
Consider, that every poster took at least the trouble of posting for you!

To easily find your streams with unassigned points, click your own name somewhere.
This will bring up your profile.
Near the bottom of that page, under the caption "My Question(s)" you will find "questions or topics with unassigned points " Clicking that will give all, and only, your questions that still have unassigned postings.
If you have closed some of those streams, you must "Reopen" them to "Submit points". (After which you can "Close" again)

Do not forget to explicitly activate "Submit points", or your effort gets lost again!!

Thanks on behalf of your Forum colleagues.

PS. - nothing personal in this. I try to post it to everyone with this kind of assignment ratio in this forum. If you have received a posting like this before - please do not take offence - none is intended!

PPS. - Zero points for this.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Craig A Berry
Honored Contributor

Re: Madgoat Watcher

James, I think your reading of the trace makes sense. Which means, I *think*, that it's not ignoring the config file. You're not using wildcard UICs in the exclusion list, so there's no reason to believe your problem is related to mine.

Does anything appear in the log file? Does it say that it logged out processes that are in fact still there?
James T Horn
Frequent Advisor

Re: Madgoat Watcher

I changed the config to use your example (exclude * /during=(primary:6-20,secondary:8-18)) and it is still "ignoring" the exclusions:


( 9-NOV-2010 10:28:02.25) Process: PID=0005E987, user=HORN, term=ERP01$FTA403:, accpor=
( 9-NOV-2010 10:28:02.25) -- Searching exclude list...
( 9-NOV-2010 10:28:02.25) -- Username: exclude=SYSTEM, process=HORN
( 9-NOV-2010 10:28:02.25) -- Username: exclude=*, process=HORN
( 9-NOV-2010 10:28:02.25) -- UIC: exclude=[0,0], process=[2,5]
( 9-NOV-2010 10:28:02.25) -- Exhausted list: no match.
( 9-NOV-2010 10:28:02.25) -- Process found on count list
( 9-NOV-2010 10:28:02.25) -- Searching override list...
( 9-NOV-2010 10:28:02.25) -- Username: exclude=SAMINFO, process=HORN
( 9-NOV-2010 10:28:02.25) -- Username: exclude=NGL_SAMINFO, process=HORN
( 9-NOV-2010 10:28:02.25) -- Exhausted list: no match.
( 9-NOV-2010 10:28:02.25) -- Queueing count record for checking
( 9-NOV-2010 10:28:02.51) Check for warn/force: PID=0005E987, user=HORN
( 9-NOV-2010 10:28:02.51) -- Force check: Is 00:07:13.17 GTR 01:10:00.00?
( 9-NOV-2010 10:28:02.51) -- Warn check: Is 00:07:13.17 GTR 01:00:00.00?

I am currently using "SET NOACTION" so it does not actually stop processes, but the log seems to say it would stop all of them.


Craig A Berry
Honored Contributor
Solution

Re: Madgoat Watcher

I can now reproduce your problem with ignored EXCLUDE directives using both wildcarded and explicitly specified usernames. And I think I'm starting to understand what the trace means. It matches on username, but fails to match on UIC. Since we're not specifying a UIC, it should default to a wildcarded UIC which matches everything. However, if you look at what it thinks it has like so:

$ mcr watcher_dir:wcp show exclude

You'll see records like

Username: BACKUP, UIC: [0,200]
Device: *, Port name: *
Running image: *
Times: MONDAY:(0-23),TUESDAY:(0-23),WEDNESDAY:(0-23),THURSDAY:(0-23),FRIDAY:(0-23),SATURDAY:(0-23),SUNDAY:(0-23)


This derives from the record that looks like:

EXCLUDE BACKUP

in the sample configuration file. The design appears to be that all the fields not specified explicitly default to wildcard values. However, a UIC of [0,200] is definitely not a wildcard UIC. Not sure if it's supposed to display as [*,*] or [0,0], but [0,200] is definitely not right. Any attempt to specify a UIC explicitly looks like this:

WCP> exclude backup/uic=[1,1]
%WCP-W-UICERR, error translating UIC "[1,1]"
-LIB-F-SYNTAXERR, string syntax error detected by LIB$TPARSE
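
To make the symptom in the trace concrete: a record only excludes a process when every field matches, so a UIC field that was meant to be a wildcard but holds a literal value defeats a username that does match. Here is a rough Python sketch, assuming a record matches only if both username and UIC match, with `None` standing in for a wildcard UIC component. The `ExcludeRecord` class and its fields are hypothetical, and this is not how Watcher's BLISS code is written:

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional, Tuple


@dataclass
class ExcludeRecord:
    username: str                  # may contain '*' wildcards
    group: Optional[int] = None    # None means "any group"
    member: Optional[int] = None   # None means "any member"


def uic_matches(rec: ExcludeRecord, uic: Tuple[int, int]) -> bool:
    group, member = uic
    group_ok = rec.group is None or rec.group == group
    member_ok = rec.member is None or rec.member == member
    return group_ok and member_ok


def is_excluded(rec: ExcludeRecord, username: str, uic: Tuple[int, int]) -> bool:
    # Both the username and the UIC have to match for the exclusion to apply.
    return fnmatchcase(username, rec.username) and uic_matches(rec, uic)


# What the trace suggests: the "wildcard" UIC was stored as a literal [0,0].
broken = ExcludeRecord("HORN", group=0, member=0)
# What the configuration presumably intended: a true wildcard UIC.
intended = ExcludeRecord("HORN")

print(is_excluded(broken, "HORN", (2, 5)))    # False
print(is_excluded(intended, "HORN", (2, 5)))  # True
```

With the literal [0,0] the username comparison succeeds but the UIC comparison fails, which is the same "no match" pattern the trace shows for process UIC [2,5].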

The use of LIB$TPARSE might be a clue here, as I believe it's translated VAX code and probably ought to be replaced with LIB$TABLE_PARSE. I thought I'd have a go at doing that, so I installed this:

$ prod show hist *bliss*
------------------------------------ ----------- ----------- --- -----------
PRODUCT                              KIT TYPE    OPERATION   VAL DATE
------------------------------------ ----------- ----------- --- -----------
HP I64VMS BLISSI64 V1.12-72          Full LP     Install     (U) 11-NOV-2010
------------------------------------ ----------- ----------- --- -----------

1 item found

Is that really the latest? It's dated 2006 and it's hard to believe there haven't been any tweaks to the Itanium BLISS compiler since then, but this was the latest I could find; I got it from

ftp://ftp.hp.com/pub/openvms/freeware/bliss

Then I grabbed the sources for Watcher v4.0 (which appears identical to v3.2-1 except for the license text). I got it from:

http://vms.process.com/scripts/fileserv/fileserv.com?WATCHER

Before making any code changes, I tried compiling like so:

$ @[.source]compile
$ BLISS/LIBR=FIELDS.L32I FIELDS.R32
$ BLISS/LIBR=WATCHER.L32I WATCHER.R32
; %MESSAGE: Structure GBLDEF size: 296 bytes
; %MESSAGE: Structure TRMDEF size: 206 bytes
; %MESSAGE: Structure EXCDEF size: 507 bytes
; %MESSAGE: Structure IDDEF size: 72 bytes
$ BLISS/LIBR=WATCHER_PRIVATE.L32I WATCHER_PRIVATE.R32
; %MESSAGE: Structure CTRDEF size: 535 bytes
; %MESSAGE: Structure PRCDEF size: 469 bytes
; %MESSAGE: Structure PGBLDEF size: 10 bytes
; %MESSAGE: Structure CHKDEF size: 12 bytes
; %MESSAGE: Structure MIOSBDEF size: 8 bytes
$ BLISS/OBJECT=[-.BIN-IA64]WATCHER.OBJ WATCHER.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]WATCHER.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]COLLECT.OBJ COLLECT.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]COLLECT.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]LOG.OBJ LOG.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]LOG.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]FORCE.OBJ FORCE.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]FORCE.OBJ
$ BLISS/LIBR=WCP.L32I WCP.R32
; %MESSAGE: Structure DFLTDEF size: 41 bytes
$ BLISS/OBJECT=[-.BIN-IA64]CONFIG.OBJ CONFIG.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]CONFIG.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]DECW_DISPLAY.OBJ DECW_DISPLAY.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]DECW_DISPLAY.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]MEM.OBJ MEM.B32
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]MEM.OBJ
$ MESSAGE /OBJECT=[-.BIN-IA64]WATCHER_MSG.OBJ WATCHER_MSG.MSG
$ LIBRARY/REPLACE [-.BIN-IA64]WATCHER.OLB [-.BIN-IA64]WATCHER_MSG.OBJ
$ BLISS/OBJECT=[-.BIN-IA64]PERFORM_DISCONNECT.OBJ PERFORM_DISCONNECT.B32

PRESERVE=NO);
.............................^
%BLS32-W-TEXT, Illegal register number 0 in LINKAGE declaration
at line number 183 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1

PRESERVE=NO);
.............................^
%BLS32-W-TEXT, Illegal register number 1 in LINKAGE declaration
at line number 183 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1

PRESERVE=NO);
.............................^
%BLS32-W-TEXT, Illegal register number 2 in LINKAGE declaration
at line number 183 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1

NEWIPL=.TMP, PRESERVE=NO);
..........................................^
%BLS32-W-TEXT, Illegal register number 0 in LINKAGE declaration
at line number 190 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1

NEWIPL=.TMP, PRESERVE=NO);
..........................................^
%BLS32-W-TEXT, Illegal register number 1 in LINKAGE declaration
at line number 190 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1

NEWIPL=.TMP, PRESERVE=NO);
..........................................^
%BLS32-W-TEXT, Illegal register number 2 in LINKAGE declaration
at line number 190 in file D0:[craig.WATCHER.source]perform_disconnect.b32;1
$

All I can find on this is that you can't use registers 0, 1, and 2 in BLISS on IA64, but I have no idea why it thinks this code is doing that.

Clearly someone has built Watcher on Itanium, but it seems unlikely they did so using the released sources with the released BLISS compiler. I'm a bit stuck here unless some BLISS gurus can spot something I'm doing wrong.
James T Horn
Frequent Advisor

Re: Madgoat Watcher

I have it down to two rules:

wcp show exclu
%WCP-I-READCFG, read configuration from file SYS2:[WATCHER]WATCHER_CONFIG.WCFG;146

Exclude records:

Username: SYSTEM, UIC: [0,0]
Device: *, Port name: *
Running image: *
Times: MONDAY:(8-23),TUESDAY:(8-23),WEDNESDAY:(8-23),THURSDAY:(8-23),FRIDAY:(8-23),SATURDAY:(8-17),SUNDAY:(8-17)

Username: *, UIC: [0,0]
Device: *, Port name: *
Running image: *
Times: MONDAY:(6-20),TUESDAY:(6-20),WEDNESDAY:(6-20),THURSDAY:(6-20),FRIDAY:(6-20),SATURDAY:(8-18),SUNDAY:(8-18)
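
The Times line shows how the /DURING clause was expanded on this system: the PRIMARY range landed on Monday through Friday and the SECONDARY range on Saturday and Sunday. Here is a small Python sketch of checking a timestamp against such a table, assuming each range is an inclusive span of whole hours. That inclusiveness is a guess, and `in_window` is only an illustrative name:

```python
from datetime import datetime

# Hour ranges copied from the Times line of the second record above.
TIMES = {
    "MONDAY": (6, 20),
    "TUESDAY": (6, 20),
    "WEDNESDAY": (6, 20),
    "THURSDAY": (6, 20),
    "FRIDAY": (6, 20),
    "SATURDAY": (8, 18),
    "SUNDAY": (8, 18),
}

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]


def in_window(when: datetime, times=TIMES) -> bool:
    """True if 'when' falls inside that day's hour range (assumed inclusive)."""
    start, end = times[DAY_NAMES[when.weekday()]]
    return start <= when.hour <= end


# Timestamp of the 9-NOV-2010 trace earlier in the thread (a Tuesday).
print(in_window(datetime(2010, 11, 9, 10, 28)))
```

If the time ranges are applied this way, the process in the 9-NOV trace was well inside the exclusion window, which points at the UIC field rather than the /DURING clause as the reason the record did not match.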

I figured it had something to do with matching USERNAME and UIC, and I was willing to code the exclusions with /UIC=... but you can't specify the UIC without getting that error.

Given our environment, my workaround is to run Watcher only between 7 pm and 5 am so that idle processes get cleaned up overnight, and to leave processes alone during the day, with no exclusions configured.

I'm hoping someone with Bliss knowledge can assist.
Hoff
Honored Contributor

Re: Madgoat Watcher

I'm not inclined to power up and boot an Itanium box to go poke at this, but (from a cursory look) it appears that either the syntax of the $devicelock and $deviceunlock macros has changed, or possibly that the macros aren't being found during the compilation.

If you've not rebuilt the Bliss system require libraries after the most recent OpenVMS upgrade or since installing Bliss, here are the 32-bit library builds...

$ BLISS /LIBRARY=SYS$COMMON:[SYSLIB]LIB.L32 -
SYS$LIBRARY:LIB.REQ+SYS$LIBRARY:STARLET.REQ
$ BLISS /LIBRARY=SYS$COMMON:[SYSLIB]STARLET.L32 -
SYS$LIBRARY:STARLET.REQ

Have a look at the macro declarations within the LIB.REQ file (IIRC, though if it's not there go look in STARLET.REQ), as the REQ files are directly readable; they're source code.
James T Horn
Frequent Advisor

Re: Madgoat Watcher

OK, so I am a little excited about the help on this issue, and I might have been a little too generous with points.
Craig A Berry
Honored Contributor

Re: Madgoat Watcher

I executed the commands Hoff posted. Both emitted a large number of warnings and informational messages. The warnings were about illegal register numbers like so:


JSB ( REGISTER = 2, ! rsn
...........................^
%BLS32-W-TEXT, Illegal register number 2 in LINKAGE declaration
at line number 102486 in file SYS$COMMON:[SYSLIB]LIB.REQ;1

and the informational messages were, as far as I could see, all about numeric literal overflow like so:


literal VA$M_VRNX = %X'F000000000000000';
......................^
%BLS32-I-TEXT, Numeric literal overflow
at line number 22858 in file SYS$COMMON:[SYSLIB]STARLET.REQ;1

I attempted to build Watcher again anyway, but now I don't even get as far as I did before:


$ BLISS/OBJECT=[-.BIN-IA64]DECW_DISPLAY.OBJ DECW_DISPLAY.B32

LIBRARY 'SYS$LIBRARY:LIB';
.............................^
%BLS32-W-TEXT, Warnings issued during LIBRARY precompilation: 263
at line number 58 in file D0:[craig.WATCHER.source]decw_display.b32;1

Luckily I can easily get back to where I was by simply deleting the .L32 files in SYS$LIBRARY that I just created. So, no forward progress yet, but no backward either.
Craig A Berry
Honored Contributor

Re: Madgoat Watcher

Finally read all the way to the *end* of the BLISS release notes, which is where it tells you how to install the package containing the release notes. The gist of it is that I think the incantation Hoff was recommending should be unnecessary because the installation now does that, and if it is necessary, some of the details have changed:



CHAPTER 5

HOW TO INSTALL THE COMPILERS

5.1 Installing the compilers on OpenVMS

This is a PCSI kit. See the appropriate PCSI documentation for
information on how to install it.

5.2 Building the Starlet and Lib .L32 and .L64 libraries

As part of the PCSI installation the Starlet and Lib .L32 and .L64
libraries will be built and placed in SYS$LIBRARY by default. If you
chose not to have the installation build these files and wish to build
them by hand yourself, below are the commands required to do so:

$bliss/i32/terminal=noerrors/lib=sys$common:[syslib]:starlet.l32 sys$library:starlet.req
$bliss/i32/terminal=noerrors/alpha_register_mapping/lib=sys$common:[syslib]:lib.l32 sys$library:lib.req
$bliss/i64/lib=sys$common:[syslib]:starlet.l64 sys$library:starlet.r64
$bliss/i64/alpha_register_mapping/lib=sys$common:[syslib]:lib.l64 sys$library:lib.r64

You will need to be logged into an account with system privileges to
successfully write the files to SYS$LIBRARY.


or, to mimic exactly what the installer does, use the procedure the installer uses (substituting sys$common for pcsi$destination):

$ product extract file/select=BLISSI64$LIBINSTAL.COM blissi64/version=1.12-72

$ type BLISSI64$LIBINSTAL.COM
$! Copyright 2003 Hewlett-Packard Company
$!
$! Builds the Bliss system libraries
$!
$! Turn off errors to the terminal because building starlet and lib
$! with BLISS32 will cause lots of informational overflow messages
$! which are entirely normal.
$!
$ bliss/i32/terminal=noerrors/lib=pcsi$destination:[syslib]starlet.l32 sys$library:starlet.req
$ bliss/i32/terminal=noerrors/alpha_register_mapping/lib=pcsi$destination:[syslib]lib.l32 -
sys$library:starlet.req+sys$library:lib.req
$ bliss/i64/lib=pcsi$destination:[syslib]starlet.l64 sys$library:starlet.r64
$ bliss/i64/alpha_register_mapping/lib=pcsi$destination:[syslib]lib.l64 sys$library:starlet.r64+sys$library:lib.r64
$ exit $status


So far that's just BLISS basics. On to struggling with the Watcher sources.

Craig A Berry
Honored Contributor

Re: Madgoat Watcher

I think I fixed it. I had to do this to complete the set-up of my BLISS environment:

$ BLISS/I32/TERMINAL=NOERRORS/LIB=SYS$LIBRARY:TPAMAC.L32 SYS$LIBRARY:TPAMAC.REQ

And I had to add the /ALPHA_REGISTER_MAPPING option when compiling DECW_DISPLAY.B32 and PERFORM_DISCONNECT.B32.

I discovered that the switch from LIB$TPARSE to LIB$TABLE_PARSE had already been made via macros like:

%IF %BLISS (BLISS32E) OR %BLISS (BLISS32I) %THEN
MACRO LIB$TPARSE = LIB$TABLE_PARSE%;
%FI

which means just use LIB$TABLE_PARSE as a drop-in replacement for LIB$TPARSE. I think that mostly works *except* when creating user action routines, and Watcher makes extensive use of a user-action routine called CVT_ASCTOID_STORE to process UICs. According to the docs at:

http://h71000.www7.hp.com/doc/82final/5932/5932pro_050.html#tp_sec_3_1

LIB$TPARSE has the VAX-like expectation that the action routine can just process arguments as offsets from the argument pointer, but "LIB$TABLE_PARSE uses the standard calling mechanism and passes the argument block, by reference, as the only argument to the action routine."

Watcher very cleverly has a TPA_ROUTINE macro which does both types of argument handling. BUT it was not using the correct option on Itanium. I made the following change:

$ diff [.source]wcp.r32
************
File D0:[craig.WATCHER.source]wcp.r32;2
88 %IF %BLISS(BLISS32E) OR %BLISS(BLISS32I) %THEN
89 %IF NOT %DECLARED (TPA_ARGCNT) %THEN
******
File D0:[craig.WATCHER.source]wcp.r32;1
88 %IF %BLISS(BLISS32E) %THEN
89 %IF NOT %DECLARED (TPA_ARGCNT) %THEN
************

Number of difference sections found: 1
Number of difference records found: 1

DIFFERENCES /MERGED=1-
D0:[craig.WATCHER.source]wcp.r32;2-
D0:[craig.WATCHER.source]wcp.r32;1

Then I recompiled and appear to be in business:

$ wcp/nofile
WCP> exclude system
WCP> show exclude

Exclude records:

Username: SYSTEM, UIC: [*,*]
Device: *, Port name: *
Running image: *
Times: MONDAY:(0-23),TUESDAY:(0-23),WEDNESDAY:(0-23),THURSDAY:(0-23),FRIDAY:(0-23),SATURDAY:(0-23),SUNDAY:(0-23)

WCP> exclude */uic=[1,*]
WCP> show exclude

Exclude records:

Username: SYSTEM, UIC: [*,*]
Device: *, Port name: *
Running image: *
Times: MONDAY:(0-23),TUESDAY:(0-23),WEDNESDAY:(0-23),THURSDAY:(0-23),FRIDAY:(0-23),SATURDAY:(0-23),SUNDAY:(0-23)

Username: *, UIC: [1,*]
Device: *, Port name: *
Running image: *
Times: MONDAY:(0-23),TUESDAY:(0-23),WEDNESDAY:(0-23),THURSDAY:(0-23),FRIDAY:(0-23),SATURDAY:(0-23),SUNDAY:(0-23)

WCP>

It seems to be handling UICs correctly now. I have not (yet) tested whether the exclusions are handled properly, but I suspect they will be. New Itanium images only (not complete kits) are attached. Use with caution until more testing has been done.

I will try to get the attention of someone who can release a new kit.

James T Horn
Frequent Advisor

Re: Madgoat Watcher

I am running the new versions on our development system, and if all works we should roll out the changes on our production system tomorrow.
Craig A Berry
Honored Contributor

Re: Madgoat Watcher

Looks like exclusions are now working correctly on our test system. I haven't tested overrides but they use the same logic.

I've been in touch with Tim Sneddon off-forum and he plans to release a new version with my change soonish and also has other changes in the works. If I hear of a release that hasn't already been posted, I'll try to remember to add details here.
James T Horn
Frequent Advisor

Re: Madgoat Watcher

Finished testing on the development server and the changes worked. Implemented them on the production server and it is working great. Thank you for all the assistance.

Re: Madgoat Watcher

Maybe you could email the solution to Hunter Goatley? The V4.0 he is distributing is not working. I had some email contact with him about Watcher.
Does anybody know whether WATCHER handles ACMS users correctly?

 

Resistance is not an option.

Re: Madgoat Watcher

Thanks to Bart and Craig, I have updated my distribution of WATCHER to include this fix. I also modified the DESCRIP.MMS and LINK.COM so they actually work for Alpha and I64.

I have numbered my update WATCHER V4.1.

ftp://ftp.process.com/vms-freeware/fileserv/watcher.zip

Hunter