Operating System - OpenVMS

Re: My program is not stable

 
St.wing
Advisor

My program is not stable

I have run into a very strange problem: my program breaks down frequently at certain times but runs fine at others. The whole environment, including the input and the hardware, is the same. Is it possible that something is wrong with the machine itself, such as a bad spot in memory?
By the way, does anyone know of tools on OpenVMS to check the memory? Is there a way to check the health of the machine's hardware?

I look forward to your answers; it is urgent.

Thanks

19 REPLIES
Karl Rohwedder
Honored Contributor

Re: My program is not stable

Check the error counts (SHOW ERROR) and look into the error log (ANA/ERR/ELV, DIAGNOSE or SEA). Anything strange?

What do you mean by 'breakdown'? Are there error messages when the application stops, or does it create a process dump?

Does it depend on how long the application has been running before it breaks (possibly an exhausted resource)?

regards Kalle
Volker Halle
Honored Contributor

Re: My program is not stable

St.wing,

welcome to the OpenVMS forum.

OpenVMS is normally very good at providing error information, if something goes wrong. This includes both software and hardware problems.

If you could include this information, it would help a lot in determining what problem you are seeing. Consider including the error messages as an attached .TXT file in your next reply.

What kind of 'instability' are you seeing: a system crash? The program terminating due to some error?

OpenVMS includes a suite of software tools called UETP (User Environment Test Package), a collection of test tools that can put a certain load on the system and 'stress test' it. You could run this if you believe the system has a general problem.

See the UETP documentation in chapter 5 of the System Manager's Manual Volume 2:

http://h71000.www7.hp.com/doc/82FINAL/aa-pv5nj-tk/aa-pv5nj-tk.HTMl

Volker.
St.wing
Advisor

Re: My program is not stable

Thanks for your help.
I will run some experiments right away to collect error information, so that next time I can give more details of the error.
As Karl mentioned, the cause looks like some kind of resource exhaustion. Something happens (I don't know exactly what), and then the working set and the virtual pages of the process increase at an abnormal rate (the working set grows by about 40 per second). When the working set reaches about 3400, the process breaks down. In my opinion the problem occurs at random; I cannot find any pattern in it. When the process is in its normal state, all of these parameters are stable.
My program does not depend on the time, and it does not record any state of its own.
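
To make the numbers above concrete, here is a minimal Python sketch of the arithmetic: it estimates a growth rate from two or more readings and the time left until a limit is reached. Only the rate of about 40 per second and the failure point of about 3400 come from the observations above; the sample readings (1000 and 1400), the function names, and the assumption of steady linear growth are hypothetical.

```python
def growth_rate(samples):
    """Average growth per second between the first and last sample.

    samples: list of (seconds, size) tuples in time order, where size is
    the working set reading in whatever unit the display reports.
    """
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing time")
    return (s1 - s0) / (t1 - t0)


def seconds_until(limit, current, rate):
    """Seconds left before `current` reaches `limit` at a steady rate."""
    if rate <= 0:
        return None  # not growing, so no exhaustion is predicted
    return max(0.0, (limit - current) / rate)


# Hypothetical readings taken 10 seconds apart.
readings = [(0, 1000), (10, 1400)]
rate = growth_rate(readings)
remaining = seconds_until(3400, readings[-1][1], rate)
```

If the estimated time matches when the process actually fails, that supports the resource-exhaustion theory; if the process fails at very different sizes, the growth is probably a symptom rather than the cause.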
St.wing
Advisor

Re: My program is not stable

Today I tried to dump some error files to collect useful information, but I failed, because the program seems to be in a strange state. It no longer does any work, but it does not shut down either. It seems that my program is waiting for something, such as a specific resource that is locked. While it is in this abnormal state, neither the CPU time nor the memory size increases.
Can anyone give me some advice?
It is very frustrating.
Thank you!
Ian Miller.
Honored Contributor

Re: My program is not stable

What state is the process in?
(LEF, HIB, etc.)
____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: My program is not stable

St.wing,

what you're describing sounds like a 'hung process'.

As Ian said, look for the process state. You can also examine a hung process with SDA in the running system:

$ ANA/SYS
SDA> SET PROC/IND=
SDA> SHOW PROC
SDA> SHOW PROC/CHAN ! look for busy channels
SDA> SHOW PROC/LOCK ! is first lock shown in WAITING state ?
SDA> CLUE CALL ! show current call stack

Volker.
Ian Miller.
Honored Contributor

Re: My program is not stable

Volker,
"SDA> SHOW PROC/CHAN ! look for busy channels
SDA> SHOW PROC/LOCK ! is first lock shown in WAITING state ?"

Shouldn't these be done in the reverse order as the SHOW /CHAN can hang if the process is waiting for certain locks?
____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: My program is not stable

Ian,

good point! It doesn't matter in a crash, but on the running system it may cause SHOW PROC/CHAN to hang if SDA tries to translate the FID to the file name!

Thanks,

Volker.
St.wing
Advisor

Re: My program is not stable

The state of my process is HIB. This morning I am going to run the experiment.
Thanks for your hints, Volker. I am a beginner.
I need your support.
St.wing
Advisor

Re: My program is not stable

Maxjobs:         0  Fillm:      4096  Bytlm:     64000000
Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
Maxdetach:       0  BIOlm:       300  JTquota:       8192
Prclm:          10  DIOlm:       300  WSdef:         8192
Prio:            4  ASTlm:       500  WSquo:         8192
Queprio:         0  TQElm:       200  WSextent:     16384
CPU:        (none)  Enqlm:      2000  Pgflquo:    2097152

The table above shows the quotas of my account. My process is a medium-sized real-time process that should run all the time. Can you suggest how to optimize the configuration?
The machine is used only for my process.

Thanks
St.wing
Advisor

Re: My program is not stable

This morning I checked the state of my process. The attached file contains some details of the strange state of the process.
Volker Halle
Honored Contributor

Re: My program is not stable

St.wing,

your process is multithreaded (using PTHREADS). The HIB state would be a normal state, if there are no runnable threads.
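
As a rough illustration of why a multithreaded process with no runnable threads sits idle without using CPU, here is a minimal sketch of a worker that blocks on a condition variable, with a timeout so that a stalled worker can at least be noticed. It uses Python's threading module as a stand-in for the POSIX threads API; the function names and the work queue are hypothetical and are not taken from the program under discussion.

```python
import threading


def wait_for_work(cond, queue, timeout):
    """Return the next queued item, or None if nothing arrives in time.

    While waiting, the thread is blocked and uses no CPU. If every thread
    of a process is blocked like this, the process as a whole is idle.
    """
    with cond:
        # wait_for re-checks the predicate after each wakeup, which
        # guards against spurious wakeups and missed notifications.
        if cond.wait_for(lambda: len(queue) > 0, timeout=timeout):
            return queue.pop(0)
    return None  # timed out: the caller can log that the worker is idle


def submit(cond, queue, item):
    """Queue an item and wake one waiting worker."""
    with cond:
        queue.append(item)
        cond.notify()
```

A worker that calls an untimed wait and is never notified looks exactly like a hung process from the outside, so a timed wait plus a log line is one cheap way to tell "idle" from "stuck".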

You can check the status of the threads in your process with the SDA PTHREADS extension:

SDA> SET PROC/IND=
SDA> PTHREAD t

You can get a little help for the PTHREAD SDA extension with: SDA> PTHREAD help

Please also note, that your process seems to run as a detached process and does not use your account's quotas.

Volker.
St.wing
Advisor

Re: My program is not stable

Yes, as you say, my program is multithreaded. In fact, my program always shuts down on its own, without any intervention. Sometimes it just appears to hang, as I mentioned above. I will upload
the output of your commands.

I look forward to your reply as soon as possible.

Thanks
Volker Halle
Honored Contributor

Re: My program is not stable

St.wing,

note, that if you are running OpenVMS Alpha V7.3 or higher, you could take a forced process dump of a hung process with the command:

$ SET PROC/DUMP=NOW/ID=

This command will create a .DMP process dump file in the current default directory of the process. The process will NOT be terminated; if necessary, you can terminate it with STOP/ID afterwards.

You can then look at this process dump with ANAL/PROCESS.

Looking at a process dump is about the same as looking at the hung process in the running system. You can also (from the DBG> prompt) enter SDA to examine the process dump with SDA commands.

The advantage is, that you can start your process again and don't have to do all of your troubleshooting, while your process is hung and cannot do useful work.

Volker.
St.wing
Advisor

Re: My program is not stable

A few days ago my program became unstable. This morning I changed a hardware component of my computer: I replaced the Gigabit fiber network card with an ordinary 100 Mbit Ethernet card, and the problem was resolved. While using the fiber card I noticed something: when receiving from different multicast addresses at the same time, my process receives normally on one address, but on another address it occasionally fails to receive packets. Can you tell me whether a fiber network card can run into such a problem, and what the mechanism is? Is the fiber card broken, or have I made a configuration error?
Please tell me.
Thank you!
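
For reference, the receive pattern described above (one process listening on two multicast addresses at once) can be sketched in Python as follows, with a per-group counter to see which address is losing packets. This is only a sketch under stated assumptions: the group addresses and ports in the comment are hypothetical, the function names are invented for illustration, and the real program presumably uses the C socket API rather than Python.

```python
import select
import socket
import struct
import time


def build_mreq(group, interface="0.0.0.0"):
    """Pack an ip_mreq: group address followed by local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(interface))


def open_receiver(group, port, interface="0.0.0.0"):
    """Return a UDP socket bound to `port` and joined to one group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # With a wildcard bind, some stacks deliver every joined group that
    # arrives on this port, so distinct ports keep the counts separate.
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface))
    return sock


def count_per_group(receivers, seconds):
    """Count datagrams per receiver for roughly `seconds`.

    receivers: dict mapping a label to a socket. Returns label -> count.
    """
    counts = {label: 0 for label in receivers}
    labels = {sock: label for label, sock in receivers.items()}
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return counts
        ready, _, _ = select.select(list(labels), [], [], remaining)
        for sock in ready:
            sock.recvfrom(65535)
            counts[labels[sock]] += 1


# Hypothetical usage:
# receivers = {"A": open_receiver("239.1.1.1", 5001),
#              "B": open_receiver("239.1.1.2", 5002)}
# print(count_per_group(receivers, 10))
```

Comparing the counts with what the sender transmitted on each group shows whether the loss is tied to one address, which helps separate an application problem from a card or switch problem.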
Anton van Ruitenbeek
Trusted Contributor

Re: My program is not stable

St.wing,

Are both cards Digital, er, HP cards?
Please give me the part numbers or the actual names!
I don't know anything yet about your hardware and release versions.
By the way, as far as I (now) know about V7.3 and its drivers, there are no problems of this sort with the latest patches.

AvR
NL: Meten is weten, maar je moet weten hoe te meten! - UK: Measurement is knowledge, but you need to know how to measure!
St.wing
Advisor

Re: My program is not stable

Yes. My switch is a Cisco Catalyst, my fiber card is a DEGPA-SA, and I use auto-negotiation. The machine is a DS25. When I reboot the machine, I see lots of error packets in the report from our switch, but the total number of error packets does not increase afterwards. Is this normal?
Anton van Ruitenbeek
Trusted Contributor
Solution

Re: My program is not stable

Always turn off auto-negotiation!
This _IS_ the most common problem in network environments. It works fine if all your hardware (switches, cards) is Digital, but unfortunately that is never the case.

This can be done at the boot prompt:
>>> set exxx_mode fastfd
You can also do it within LANCP:
LANCP> SET DEVICE Exxx /SPEED=100 /FULL_DUPLEX

Don't forget to set the port on the Catalyst to 100 Mb/full duplex as well.

AvR
NL: Meten is weten, maar je moet weten hoe te meten! - UK: Measurement is knowledge, but you need to know how to measure!
Guinaudeau
Frequent Advisor

Re: My program is not stable

Volker,

We use POSIX threads in our application ourselves. I am new to this too and have only now discovered the interesting SET PROC/DUMP=NOW option, which takes a snapshot on the fly in troublesome situations.

Concerning the SDA extension PTHREAD: "pthread help" gives very little information, so any additional information would be welcome. I have read that SDA extensions are "undocumented", but has anyone started writing some help and could share it?

Louis