
Mike Smith_33
Super Advisor

Authorize commands hanging and suspended processes

Three-node network cluster, one system disk per node. SYSUAF, RIGHTSLIST, etc. are on a three-member shadow set, with each node contributing one member.

Several processes went into SUSP state on one node. On this same node, modifications to SYSUAF hang, but they work fine from the other two. There are no SUSP processes on the other nodes either. All system processes appear to be fine: audit_server, security_server, job_control, etc. My first thought was that security_server might have died.

I have not found the cause of this situation, but all of the suspended processes went into SUSP during login. I determined this from $ ANAL/SYS.

We know it is node specific and that it is related to access of the sysuaf on this node.

Any thoughts out there would be appreciated. Anyone see anything similar?
Hein van den Heuvel
Honored Contributor

Re: Authorize commands hanging and suspended processes

The (system) disk holding the audit log is too full and the audit_server is doing fun stuff?!

Hein.
Volker Halle
Honored Contributor

Re: Authorize commands hanging and suspended processes

Mike,

the system processes are only 'fine' if they are in HIB most of the time. If any of them is in LEF, you might need to investigate.

Is the disk containing the Audit Server log file filling up?

Volker.
Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

Disk space is fine across the cluster on all disks. That was one of the things I checked early on.
Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

I was biased against security_server; audit_server is the one that is in LEF state.

I am going to look closer at that one.
Hein van den Heuvel
Honored Contributor

Re: Authorize commands hanging and suspended processes

Even if the disk space 'looks' fine, how about trying to create a largish file right next to the audit log (COPY/ALL=xxx NL: yyy)?

I guess you have read $ HELP SET AUDIT /THRESHOLD
and $ HELP SET AUDIT /EXCLUDE

With security auditing known to work, use it to audit suspend events?
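
For reference, the threshold and exclusion settings belong to the SET AUDIT command. What follows is only a minimal sketch of how one might review them and repeat the allocation test. The file name AUDIT_SPACE_TEST.TMP and the 100000-block size are placeholders, not values from this thread, and SECURITY privilege is assumed for SHOW AUDIT.

```
$ ! Show the current auditing settings, including the journal
$ ! destination, resource monitoring and the exclusion list
$ SHOW AUDIT/ALL
$ ! Help for the resource-monitoring thresholds and process exclusion
$ HELP SET AUDIT /THRESHOLD
$ HELP SET AUDIT /EXCLUDE
$ ! Allocation test: create an empty file with preallocated space
$ ! (size in 512-byte blocks), then delete it again.
$ ! Run this with the default set to the directory holding the audit log.
$ COPY/ALLOCATION=100000 NL: AUDIT_SPACE_TEST.TMP
$ DELETE AUDIT_SPACE_TEST.TMP;*
```

If the COPY succeeds, free space on that disk is probably not what the audit server is waiting for.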

Hein.


Volker Halle
Honored Contributor

Re: Authorize commands hanging and suspended processes

Mike,

did someone turn on lots of audit messages or alarms? While each process sends its audit messages, it could be put in SUSP. If AUDIT_SERVER cannot keep up with the number of messages in its mailbox, you might see these symptoms...

Volker.
Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

Large file copy worked.

No one has modified auditing.

Still looking for the threshold qualifier.
Volker Halle
Honored Contributor

Re: Authorize commands hanging and suspended processes

Mike,

if AUDIT_SERVER seems to be in LEF (either constantly or very often), what is it waiting for or doing?

SDA> SHOW PROC/CHAN ! look for busy channels
SDA> SHOW PROC/LOCK ! look for waiting locks
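
These SDA commands act on SDA's current process context, so it helps to select AUDIT_SERVER first. A sketch of the full session on the running system, assuming the process is named AUDIT_SERVER and that you hold the privilege (CMKRNL) needed to analyze the running system:

```
$ ANALYZE/SYSTEM
SDA> SET PROCESS AUDIT_SERVER   ! make AUDIT_SERVER the current SDA process
SDA> SHOW PROCESS/CHANNEL       ! look for channels marked busy
SDA> SHOW PROCESS/LOCKS         ! look for locks that are waiting or converting
SDA> EXIT
```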

Volker.
DECxchange
Regular Advisor

Re: Authorize commands hanging and suspended processes

Hello,
So are your SYSUAF, RIGHTSLIST, and other system-level files each shared as one file in SYS$COMMON among your three nodes? Is this a mixed-architecture cluster (VAX and Alpha) or just one architecture? And do I take it that each system has a shadow set member disk physically inside each machine?
This might sound unbelievable, but could it be that some of your application programs are getting started on the wrong node of the cluster? Are certain applications cluster member dependent?
Hoff
Honored Contributor

Re: Authorize commands hanging and suspended processes

DECxchange may be on the right track here. There are twenty-some files that should be configured as shared within a cluster. See SYLOGICALS.TEMPLATE on V7.2 and later for the list of usual suspects. Skews among these files can tend to lead to various weirdnesses.
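
One way to look for such skews is to compare the relevant logical names on every node. This is a sketch using SYSMAN; the three logical names shown are only examples from the SYLOGICALS.TEMPLATE list, so extend it to the rest as needed:

```
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER           ! run the following on all members
SYSMAN> DO SHOW LOGICAL SYSUAF
SYSMAN> DO SHOW LOGICAL RIGHTSLIST
SYSMAN> DO SHOW LOGICAL VMS$AUDIT_SERVER
SYSMAN> EXIT
```

Each node should report the same translation. A node that reports no translation is presumably falling back to the default file in its own SYS$SYSTEM, which is worth checking on a cluster with one system disk per node.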

There have also been a few fixes over the years for storage allocation bugs in auditing, where auditing saw a spike or transient in the traffic, made a bad decision, and wedged subsequent activity due to lack of these added resources. This is corrected on current and recent commonly used releases (e.g., V7.3-2 with ECOs); the problem tends to apply to older releases and to systems lacking the DEFCON-1 ECOs.

Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

OK, we have gotten an update from HP, and they believe this is a problem they saw with another customer. The audit_server is having a locking problem with another node. According to HP, we are running version 4.0 of the audit server patch, which came in VMS update 12. Each week we run a script to pull out audit data and create a new audit file. HP has asked that we stop creating the new file until we get version 5 of the audit server patch installed.

They have requested that we reboot the node in order to release the lock.

I am going to hand out some points now and I will leave this open for any additional comments.
Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

SYSUAF, RIGHTSLIST, etc. are all shared. All nodes are Alphas.

I believe update 13 is the newest VMS update for V7.3-2. We are on update 12.
Volker Halle
Honored Contributor

Re: Authorize commands hanging and suspended processes

Mike,

the most recent publicly available ECO for AUDIT_SERVER is VMS732_AUDSRV-V0400 (from 22-FEB-2007), which is included in VMS732_UPDATE-V1200 and -V1300.

The release notes of the VMS732_AUDSRV-V0400 kit describe symptoms similar to those you may have experienced. The -V0400 kit seems to have been just a re-release of the -V0300 kit due to a kit problem (old AUDIT_SERVER.EXE shipped in -V0300). If you look at the VMS732_AUDSRV-V0400 release notes, you may get the impression that the -V0400 kit still ships the same OLD AUDIT_SERVER.EXE file (link date 17-JUL-2006 14:28:36.46). This may explain why you are still seeing the problem and why HP is talking about VMS732_AUDSRV-V0500...

If you have not yet rebooted (or forced a crash to document the problem symptoms), you might want to check whether ASTs are disabled for the AUDIT_SERVER process:

SDA> SHOW PROC/PHD AUDIT_SERVER

Look for "ASTs enabled KESU" (ASTs enabled in kernel, executive, supervisor, and user mode).

Volker.
Volker Halle
Honored Contributor
Solution

Re: Authorize commands hanging and suspended processes

Mike,

VMS732_AUDSRV-V0500 has just been announced. From the problem description section, it seems to address the problem you may have been seeing...

Volker.
Mike Smith_33
Super Advisor

Re: Authorize commands hanging and suspended processes

Thanks, I will definitely check into it.