Operating System - OpenVMS

OpenVMS TASK 100% CPU LOAD - HELP!

 
SOLVED
SMSC
Occasional Advisor



Hello,
I have a problem with OpenVMS 7.2.
A task is running that takes 100% of the CPU.

When I use PS I get:
20800424 TASK_145A0094

But what is TASK_145A0094?
How can I get information about this task?

Please help!
19 REPLIES
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


This is the top CPU load list (attached).




Heinz W Genhart
Honored Contributor
Solution

Re: OpenVMS TASK 100% CPU LOAD - HELP!

Hi dfdfdsf

First of all, welcome to HP ITRC OpenVMS Forum.

Task_xxxxx is a network process. It is created when you invoke non-transparent task-to-task communication over DECnet.

What you could do is the following:

- Use SDA to find out which files this task has opened
- set proc/suspend/id= ! Suspend the process, so it stops looping and you can check why it's looping
- stop/id= ! Stop the looping process

If you find the login directory of your task process (with SDA), you will find there a file called NETSERVER.LOG (DECnet V4) or NET$SERVER.LOG (DECnet OSI). This file may tell you more about what is going wrong.
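As a rough sketch, the suspend/stop steps above could be typed as follows. The PID is the one from the original post; acting on a process owned by another user typically needs GROUP or WORLD privilege, and the order shown is only one reasonable way to proceed.

```dcl
$! Sketch only. 20800424 is the PID from the original post; use your own.
$ SHOW PROCESS/ID=20800424                ! confirm this is the process you mean
$ SET PROCESS/SUSPEND/ID=20800424         ! freeze it so it stops consuming CPU
$! ... inspect it here, for example with ANALYZE/SYSTEM ...
$ SET PROCESS/RESUME/ID=20800424          ! either let it run again ...
$ STOP/ID=20800424                        ! ... or delete the process
```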

hope that helps

Regards

Geni
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


Thanks for your fast reply, Geni.
Fast and clear, except for how to use SDA.

Can you please post an example? I don't know what SDA is!

Thanks again!
Heinz W Genhart
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

Hi dfdfdsf

SDA stands for System Dump Analyzer.

Here is a small example:

OBELIX_GENI $ analyze/system

OpenVMS system analyzer

SDA> show summary

Current process summary
-----------------------

Extended  Indx Process name    Username     State   Pri PCB/KTB  PHD       Wkset
-- PID -- ---- --------------- ------------ ------- --- -------- -------- ------
20200101  0001 SWAPPER         SYSTEM       HIB      16 820E37C8 820E3200      0
20201402  0002 IAM_SERVER_SUB  IAM_STARTUP  LEF       6 82D1B800 8992A000    377
20201403  0003 IAM_ACTION      IAM_STARTUP  LEF       6 82D37500 8992C000     98
20201404  0004 IAM_STARTUP     IAM_STARTUP  LEF       4 82D24880 8992E000    367
20200106  0006 CLUSTER_SERVER  SYSTEM       HIB      12 82A076C0 898C2000    103
20200107  0007 SHADOW_SERVER   SYSTEM       HIB       6 82A091C0 898C4000    118
20200108  0008 CONFIGURE       SYSTEM       HIB       8 828F3EC0 898BE000     20

.........

2020015F  005F GFR_PAGER_SRV5  SYSTEM       HIB       8 82CB3400 89920000    116
20204260  0060 APACHE$SWS000B  APACHE$WWW   LEF       6 82D7F340 8994E000   1151

Press RETURN for more.

SDA> set proc/index=0060 ! APACHE$SWS000B
SDA> show proc/channel ! Show the channels (open files) the process has

Process index: 0060 Name: APACHE$SWS000B Extended PID: 20204260
--------------------------------------------------------------------


Process active channels
-----------------------

Channel  CCB       Window    Status  Device/file accessed
-------  ---       ------    ------  --------------------
0010     7FF70000  00000000          DSA7:
0020     7FF70020  8374D340          DSA1:[WEBSERVER.APACHE]APACHE$HTTPD.EXE;1
0030     7FF70040  829F60C0          DSA0:[VMS$COMMON.SYSLIB]DECC$SHR_EV56.EXE;1 (section file)
0040     7FF70060  829F5940          DSA0:[VMS$COMMON.SYSLIB]DPML$SHR.EXE;1 (section file)
0050     7FF70080  82A00780          DSA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0060     7FF700A0  83BD01C0          DSA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;442 (section file)
0070     7FF700C0  832E6600          DSA1:[WEBSERVER.APACHE.SPECIFIC.OBELIX]APACHE$SWS000B.LOG;662
0080     7FF700E0  82CFB1C0          DSA1:[WEBSERVER.APACHE.SPECIFIC.OBELIX]APACHE$SWS000B.COM;1
0090     7FF70100  829F41C0          DSA0:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file)
00A0     7FF70120  829F1E40          DSA0:[VMS$COMMON.SYSLIB]LIBOTS.EXE;1 (section file)

.........

03D0     7FF70780  00000000          MBA37266:
03E0     7FF707A0  00000000          MBA37266:
03F0     7FF707C0  00000000          MBA37267:
0480     7FF708E0  00000000          MBA37271:

Total number of open channels : 79.
SDA> exit

Regards

Geni
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


I found a DCL script stuck in a loop. That's why the TASK process was taking 100% CPU.
Ten points for your reply! ;)

Thanks Geni, you saved my life! :D
Jan van den Ende
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

dfdfdsf,

before you kill the process as per Geni's advice, try to get as much info as possible. SUSPENDing the process might stop the activity, but it also rules out a lot of possibilities for investigation.
You might first try $ SET PROCESS/ID=20800424/PRIORITY=0
It will still consume any available CPU cycles, but at least the other processes get them first.
Then $ SHOW PROCESS/ID=20800424/ALL will give a lot of info, and you may well be able to find out which remote process is (or WAS!!) causing this. It may be that this process is trying to send something to a remote process that is no longer there.

Yes, searching for these cases can be something of a trail-seeking exercise, with every bit of info you find pointing the direction for the next step.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


Too late, Jan.
The customer asked me to stop the process and gave me no time to analyze it. Look at the following channel list.
DSA1:[SMSC.SCRIPTS]BLACKLIST.COM was called every 15 minutes from another remote node, by another script using GOLD:
$@gld_lib:lib_exec_remote SMWI31 dsa1:[smsc.scripts]blacklist.com
Normally I have no problem with it!

Process active channels
-----------------------

Channel  CCB       Window    Status  Device/file accessed
-------  ---       ------    ------  --------------------
0010     7FED0000  00000000          DSA1:
0020     7FED0020  820CBD00          DSA1:[SMSC.SCRIPTS]BLACKLIST.COM;18
0030     7FED0040  81FD0254  Busy    NET6406:
0040     7FED0060  821C34D4          NET6407:
0050     7FED0080  81AFD300          DSA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0060     7FED00A0  81AEDF80          DSA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;257 (section file)
0070     7FED00C0  00000000          NLA0:
0080     7FED00E0  81C46040          DSA0:[VMS$COMMON.SYSEXE]NET$SERVER.COM;1
Robert Gezelter
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

dfdfdsf,

I would recommend two things:

- $ SHOW PROCESS/ALL/ID=20800424
- $ SHOW DEVICE/FILES on the various disks, particularly the disk that is the default device for the account that owns that process

The symptoms are those of a program in a long or perhaps an infinite loop. Tracking down the problem is a step-by-step process of understanding. In the end, it is always clear what is happening, but it can take several steps to get there.

- Bob Gezelter, http://www.rlgsc.com
Volker Halle
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

dfdfdsf,

to prepare for the next time this happens, consider having a look at the BLACKLIST.COM DCL procedure.

Maybe you can find out just by code inspection whether there are possible infinite loops in the procedure itself or in any procedures or images it may invoke.

If this happens again, start with SDA> SHOW PROC/CHAN to document the open files before you stop the process. Run $ MONITOR MODE to find out whether the CPU loop is in Supervisor Mode; if it is, you know it's a loop within a DCL procedure.

If it comes to that, consider reading the following VTJ article, which explains how to find the current DCL command with SDA:

http://h71000.www7.hp.com/openvms/journal/v1/dcl.html
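
The two checks suggested above (MONITOR first, then SDA) could be run roughly like this. The 5-second interval is an arbitrary choice and nnnn is a placeholder for the process index shown in the Indx column of SHOW SUMMARY. ANALYZE/SYSTEM normally needs elevated privileges.

```dcl
$! Sketch only; interval and index are placeholders, not values from this case.
$ MONITOR MODES/INTERVAL=5                ! watch the Supervisor Mode line; Ctrl/Z exits
$ ANALYZE/SYSTEM                          ! then, at the SDA> prompt:
$!   SDA> SHOW SUMMARY                    ! find the Indx value of the looping process
$!   SDA> SET PROCESS/INDEX=nnnn
$!   SDA> SHOW PROCESS/CHANNEL            ! document the open files
$!   SDA> EXIT
```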

Volker.
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


I would like to thank everyone for the valuable advice.
I have saved this useful thread on my hard disk for future use!

There is a loop in the blacklist.com script that reads a text file line by line:
.....
.....
$ START:
$ OPEN /ERROR=OPEN_ERROR HFILE3 tmp_black.txt
$ OPEN /WRITE HFILE4 black.prl
$ LOOP:
$ READ/END_OF_FILE=CLOSEFILE1 HFILE3 black
$ IF black .NES. "EXIT"
$ THEN
$ WRITE HFILE4 "ADD BLACK ''black'"
$ ENDIF
$ GOTO LOOP
$ OPEN_ERROR:
$ GOTO START
.....
.....

Probably the "tmp_black.txt" file was still in use by the previous instance and could not be opened or read... so the procedure went into an infinite loop!
Volker Halle
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

Consider adding some code here:

$ OPEN_ERROR:
$ SHOW SYMB $STATUS
$ WAIT 0:0:1
$ GOTO START

This would at least show (in the NETSERVER.LOG file) whether there had been an error during the OPEN. It will also give you the error status, and it will wait 1 second before retrying the OPEN. This would at least lower the CPU load if this error should happen again.

You could also implement a retry counter and send mail to someone or send an OPCOM message if the OPEN fails after n retries...

Whether it makes any sense at all to go back to START: if the open fails is another question altogether. Answering it would need a lot more background about this procedure.
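
A retry counter with an operator notification, as suggested above, might look roughly like the sketch below. The limit of 5 attempts, the 10-second wait, the label OPENED and the message text are assumptions for illustration. REQUEST sends the text to OPCOM; a MAIL command could be used in its place.

```dcl
$! Sketch only: limits, wait time, label names and message text are assumptions.
$ max_retries = 5
$ retry_count = 0
$ START:
$ OPEN/READ/ERROR=OPEN_ERROR HFILE3 tmp_black.txt
$ GOTO OPENED
$ OPEN_ERROR:
$ open_status = $STATUS                   ! save it before another command changes it
$ retry_count = retry_count + 1
$ WRITE SYS$OUTPUT "OPEN failed, status ''open_status', attempt ''retry_count'"
$ IF retry_count .GE. max_retries
$ THEN
$     REQUEST "BLACKLIST.COM: cannot open tmp_black.txt, status ''open_status'"
$     EXIT 'open_status'                  ! give up and return the OPEN status
$ ENDIF
$ WAIT 00:00:10
$ GOTO START
$ OPENED:
$! ... continue with the READ loop ...
```

Exiting with the saved status means the failure is also visible to whatever invoked the procedure.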

Volker.
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


Interesting solution. Anyway, I modified BLACKLIST.COM as follows:

This will wait 1 minute and then retry, and after 3 retries it will send a TRAP (using the DCLSIG app) if the failure still occurs.

$ START:
$ OPEN /ERROR=OPEN_ERROR HFILE3 tmp_black.txt
$ OPEN /WRITE HFILE4 smsc$root:[tmp]black.prl
$ LOOP:
$ READ/END_OF_FILE=CLOSEFILE1 HFILE3 black
$ IF black .NES. "EXIT"
$ THEN
$ WRITE HFILE4 "ADD BLACK ''black'"
$ ENDIF
$ GOTO LOOP
$ OPEN_ERROR:
$ ErrorCount=ErrorCount+1
$ if ErrorCount .EQS. "4" THEN
$ DCLSIG "Error Opening tmp_black.txt. Trying to backup it as tmp_black.fail" "FA"
$ copy tmp_black.txt tmp_black.fail
$ goto CLOSEFILE1:
$ END IF
$ SHOW SYMB $STATUS
$ WAIT 00:01:00
$ CLOSE HFILE3
$ CLOSE HFILE4
$ GOTO START
$ CLOSEFILE1:
Volker Halle
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

SMSC,

there are a couple of errors in the DCL code shown:

Make sure to initialize ErrorCount=0

$ GOTO CLOSEFILE1: ! remove ':'

$ SHOW SYMB $STATUS must be moved to the line immediately following the $ OPEN_ERROR: label.

$ CLOSE HFILE3 ! will give %DCL-W-UNDFIL as the open failed.

Volker.
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


Volker,
it was just a try, not a final version, but anyway, thanks for your WARNING! ;)

For "CLOSE HFILE3", I must be sure that all handles were closed before looping again, so I'll also add:

$ ON ERROR THEN CONTINUE

Just to be sure that the DCL script ends and to avoid errors.
Jan van den Ende
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!



>>>
For "CLOSE HFILE3", I must be sure that all handles were closed before looping again, so I'll also add:
<<<

You can first test it:
$ if f$trnlnm("hfile3") .nes. "" then close hfile3

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
SMSC
Occasional Advisor

Re: OpenVMS TASK 100% CPU LOAD - HELP!


It works, thanks!
I forgot the f$trnlnm lexical function!
Hehehe! I need to refresh my DCL knowledge!
Robert Gezelter
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

SMSC,

It is also a good safety check to make sure that the file is not already opened when opening it.

It is particularly useful when manually debugging.

- Bob Gezelter, http://www.rlgsc.com
Hoff
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

>>>It is also a good safety check to make sure that the file is not already opened when opening it.<<<

I usually slam the file with a CLOSE/NOLOG, just before issuing an OPEN command.
Volker Halle
Honored Contributor

Re: OpenVMS TASK 100% CPU LOAD - HELP!

SMSC,

and there is another error:

$ copy tmp_black.txt tmp_black.fail

Why would you expect this command to work if a DCL OPEN failed?

Something like:

$ IF F$SEARCH("tmp_black.txt") .NES. ""
$ THEN
$ BACK/IGN=INTER tmp_black.txt tmp_black.fail
$ ENDIF

may have a better chance to succeed.

Volker.
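
Pulling together the corrections suggested in this thread, the error-handling part of BLACKLIST.COM might end up looking something like the sketch below. It is not SMSC's final procedure: the block-style IF/THEN/ENDIF and the numeric comparison are tidy-ups not discussed above, DCLSIG is SMSC's own tool and is called exactly as posted, and whatever follows CLOSEFILE1 was never shown and is left out.

```dcl
$! Sketch only; combines the suggestions from this thread, untested.
$ ErrorCount = 0                          ! initialize the retry counter
$ START:
$ CLOSE/NOLOG HFILE3                      ! no warning if the file is not open
$ CLOSE/NOLOG HFILE4
$ OPEN/READ/ERROR=OPEN_ERROR HFILE3 tmp_black.txt
$ OPEN/WRITE HFILE4 smsc$root:[tmp]black.prl
$ LOOP:
$ READ/END_OF_FILE=CLOSEFILE1 HFILE3 black
$ IF black .NES. "EXIT" THEN WRITE HFILE4 "ADD BLACK ''black'"
$ GOTO LOOP
$ OPEN_ERROR:
$ SHOW SYMBOL $STATUS                     ! first, so it shows the OPEN status
$ ErrorCount = ErrorCount + 1
$ IF ErrorCount .GE. 4                    ! initial attempt plus 3 retries
$ THEN
$     DCLSIG "Error Opening tmp_black.txt. Trying to backup it as tmp_black.fail" "FA"
$     IF F$SEARCH("tmp_black.txt") .NES. "" THEN BACKUP/IGNORE=INTERLOCK tmp_black.txt tmp_black.fail
$     GOTO CLOSEFILE1                     ! no colon after the label name
$ ENDIF
$ WAIT 00:01:00
$ GOTO START
$ CLOSEFILE1:
$! ... rest of the procedure as before ...
```

The CLOSE/NOLOG lines before the OPEN replace the unconditional CLOSE commands in the error path, so no %DCL-W-UNDFIL warning is produced when the OPEN never succeeded.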