Operating System - OpenVMS

Re: Error activating image ...

 
Klaus Heim
Advisor

Re: Error activating image ...

I have changed all batch jobs to include the DCL command SET WATCH. On Friday I got the error again. The image las$cre_sh_batch.exe (foreign command cre_sh_batches) is started with different input parameters:

cre_sh_batches 901 !ok
cre_sh_batches 902 !ok
..
cre_sh_batches 918 !ok
cre_sh_batches 919 !error
cre_sh_batches 920 !error
cre_sh_batches 921 !error

See the attachment for part of the log file.

The image syscom.exe with the file id (3483,5277,0) is a writeable common and is installed with /open/header/share/write.

The writeable image is linked to the shareable image lvsshr.exe. In the batch logfiles I've not found any reference to the file or the file id.
H.Becker
Honored Contributor

Re: Error activating image ...

>>>
The image syscom.exe with the file id (3483,5277,0) is a writeable common and is installed with /open/header/share/write.

The writeable image is linked to the shareable image lvsshr.exe. In the batch logfiles I've not found any reference to the file or the file id.
<<<

That's OK, you will not find any XQP activity like open or close, because it is installed /open.

That there are writeable shareable images involved, wasn't mentioned before. Is this a pattern?

It's still not obvious what's happening. /header moves the DYNTBL into the non-paged pool, so at activation time it is not read from the image file. It looks like there is a corruption in the pool after the activation of "cre_sh_batches 918 !ok". However, I do not see a match with the previous entries in the thread. The last entry shows that (all?) subsequent activations fail. I was under the impression that the next activation would succeed.

Are there other problems with/on this system with devices, drivers or even pool corruption, etc.?
Klaus Heim
Advisor

Re: Error activating image ...

>>>That there are writeable shareable images involved, wasn't mentioned before. Is this a pattern?

I forgot this fact. The image didn't use the writeable shareable images. These are linked to lvsshr.exe which is linked to lvs$tashr.exe. And lvs$tashr.exe is used by the image las$cre_sh_batch.exe.

>>>The last entry shows that (all?) subsequent activations fail. I was under the impression that the next activation would succeed.

I agree, and I was surprised too. This is a relatively new batch job. This time the error occurs when the same image is started with different input parameters. In case of error this batch job will continue (set noon). The other batch jobs stop after an error. The next activation would succeed - in another DCL or batch environment.

>>>Are there other problems with/on this system with devices, drivers or even pool corruption, etc.?

I don't know of any other problem and haven't seen one. Can we monitor something? I will check the systems for errors.
H.Becker
Honored Contributor

Re: Error activating image ...

It looks like this problem can't be solved in this forum. You may want to get in contact with the local HP office. There is expertise on this topic outside of HP, but access to the system and a look at the images and the whole environment seem necessary. As already said, this is not a common or known problem. It doesn't look like an image activation problem, but a person with knowledge in this area should have a close look at the system.

That said,
>>>
The image didn't use the writeable shareable images. These are linked to lvsshr.exe which is linked to lvs$tashr.exe. And lvs$tashr.exe is used by the image las$cre_sh_batch.exe.
<<<
using an image is transitive.
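
To make the transitivity point concrete, here is a minimal sketch in Python. The dependency table is written by hand from the link relationships described in this thread, and its direction (main image, then lvs$tashr.exe, then lvsshr.exe, then syscom.exe) is an assumption about how to read them. The function name `images_activated` is hypothetical; this only illustrates reachability and is not how the image activator is implemented.

```python
from collections import deque

# Direct dependencies as described in this thread.
# The direction of each link is an assumption, not taken from the images.
DIRECT_DEPS = {
    "las$cre_sh_batch.exe": ["lvs$tashr.exe"],
    "lvs$tashr.exe": ["lvsshr.exe"],
    "lvsshr.exe": ["syscom.exe"],
    "syscom.exe": [],
}


def images_activated(main_image, deps):
    """Return every image reachable from main_image, breadth first."""
    seen = [main_image]
    queue = deque([main_image])
    while queue:
        image = queue.popleft()
        for dep in deps.get(image, []):
            if dep not in seen:  # each image is listed only once
                seen.append(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    for name in images_activated("las$cre_sh_batch.exe", DIRECT_DEPS):
        print(name)
```

Under that reading, syscom.exe ends up in the list for las$cre_sh_batch.exe even though the main image never references it directly.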

>>>
This time the error occurs when the same image is started with different input parameters. In case of error this batch job will continue (set noon).
<<<

With an installed image, this is expected. If the dynamic table in the non-paged pool is corrupt it will stay corrupt until the image is de-installed (manually or with a shutdown). The big question is, what happens after successfully activating the image (cre_sh_batches 918) and before trying to activate the very same images in (cre_sh_batches 919)?
John McL
Trusted Contributor

Re: Error activating image ...

I'm puzzled. In the log file that you included, why are images xxxx$TASHR.EXE;118 and FHSHR.EXE;159 being activated after TPSHR.EXE;5 in the successful batch job? These don't appear in the failed batch jobs, which report a problem with image SYSCOM.EXE just after the reference to TPSHR.EXE;5.

You said that LIB$FIND_IMAGE_SYMBOL wasn't being used, so surely we should see SYSCOM.EXE activated in the successful job. I think you said that one of your other images is linked against SYSCOM - why does this comment system not allow me to see the entire thread as I type this? - and my understanding is that the image should therefore be activated.

Why also do we see all the "deaccess" diagnostics in the failing jobs but not in the successful job? I also see that the deaccess diagnostics are almost in the order of activation, except for DBMSHR.EXE;7 which is deaccessed last of the 6 rather than 4th.

Does file DSA3:[xxxxLVS0.][LVSLIB]SYSCOM.EXE have FID (76514,5,0)? It seems to be trying to access that file twice, which suggests a failure or corruption of the image activator work list. The first access to SYSCOM might be resolving the dynamic table, but when it comes to the second access the image might be recognised as already present and the DYNTBL data is accessed in the already activated SYSCOM.EXE.

Have you checked
(a) for any logical names that might be pointing to incorrect images?
(b) whether any of your image names clash with system images? (This shouldn't be a problem in normal situations, but your crashes don't seem normal.)
(c) whether DBMSHR.EXE is your own code or from SYS$SHARE? If the latter, does its fixup vector give us any clues?

Can you get a process dump and examine the internal lists for the Image Activator and (I think) the image fixup list?

We're looking for anything odd, like duplicate FIDs or FIDs to images that we don't believe should be activated. (I'm writing this from memory of the image activation described in the Internals and Data Structures book, but maybe things have changed. I'm sure someone will correct me if I'm wrong!)
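
A check for odd FIDs could be sketched roughly as follows in Python. It assumes the (file name, FID) pairs have already been pulled out of the logs or a dump by hand; no particular log format is assumed, and the function name `find_fid_conflicts` is hypothetical. A FID seen under two names is not automatically an error, since a file can legitimately have more than one directory entry, so the output is a list of things to look at rather than a verdict.

```python
from collections import defaultdict


def find_fid_conflicts(pairs):
    """Group (file_name, fid) pairs and report anything ambiguous.

    pairs: iterable of (file_name, fid), where fid is a tuple
           such as (3483, 5277, 0).
    Returns (shared_fids, split_names):
      shared_fids maps a FID to the names seen with it (two or more names),
      split_names maps a name to the FIDs seen with it (two or more FIDs).
    """
    names_by_fid = defaultdict(set)
    fids_by_name = defaultdict(set)
    for name, fid in pairs:
        key = name.upper()  # file names are compared case-insensitively here
        names_by_fid[fid].add(key)
        fids_by_name[key].add(fid)
    shared_fids = {f: sorted(n) for f, n in names_by_fid.items() if len(n) > 1}
    split_names = {n: sorted(f) for n, f in fids_by_name.items() if len(f) > 1}
    return shared_fids, split_names


if __name__ == "__main__":
    # Hypothetical observations, typed in by hand for illustration only.
    observed = [
        ("SYSCOM.EXE", (3483, 5277, 0)),
        ("LVSSHR.EXE", (3484, 12, 0)),
        ("syscom.exe", (3483, 5277, 0)),
    ]
    print(find_fid_conflicts(observed))
```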

Finally, are you running any software like a disk defragger that might be trying to move .EXEs while you are trying to use them? There shouldn't be a problem but if we find that this software is running then try running your jobs with that other software stopped.

Klaus Heim
Advisor

Re: Error activating image ...

H.Becker>>> Please don't give up. We opened a case with HP Support months ago. The case has been escalated, but a solution is still far away. I have had more help from the experts in this forum. Please hold the line and let us keep looking for the needle in the haystack.

John McL>>>
>>>Does file DSA3:[xxxxLVS0.][LVSLIB]SYSCOM.EXE have FID (76514,5,0)?

No, SYSCOM.EXE has FID (3483,5277,0). FID (76514,5,0) is the log file LAS_FAKTURA.LOG from the batch job.

>>>Have you checked for (a),(b)
It's difficult. I have no idea what to check.
>>>(c)
DBMSHR.EXE is our own code

>>>Can you get a process dump and examine the internal lists for the Image Activator and (I think) the image fixup list.
I can get the process dump, but I can't list the image activator or image fixup lists. I have also had trouble analysing a process dump in another case.

>>>We're looking for anything odd, like duplicate FIDs or FIDs to images that we don't believe should be activated.
Do you believe that duplicate FIDs or something like this is possible?

>>>Finally, are you running any software like a disk defragger that might be trying to move .EXEs while you are trying to use them?
No, there is no disk defragger or anything like it.

Please, let us start from the beginning. On Alpha everything is fine. On Integrity we have the problems. What has changed between these systems?

1. The OS and the whole environment (new installation and configuration)
2. On AXP there is 1 CPU, on IA64 there are 4 CPUs
3. The application (compiled and linked)
4. The database ADABAS is completely new.
5. The foreign application "ctmagent" is new. The host system uses this agent to start the batch jobs on OpenVMS.

I fear I am missing only a small thing that causes this big problem.
Hoff
Honored Contributor

Re: Error activating image ...

If the folks with the VMS source code and likely also with the application source code haven't yet resolved this, then it's unlikely that folks here (lacking one or both of these resources) will get to the bottom of this. This error probably doesn't have a trivial trigger.

This is a latent bug in the code, a nasty bug in VMS, or a software or hardware error somewhere in the system environment.

This case has been open for months?

That's, well, sad.

This on-going open-case status implies that most or all of the so-called low-hanging fruit and most or all of the obvious triggers have already been considered.

Has this case been escalated?

Talk to your boss, too.

And there are always options.
Hoff
Honored Contributor

Re: Error activating image ...

ps: shut down three processors, as a test, and see if you can reproduce this.

SMP is a common trigger for subtle errors (in VMS, and more commonly in application code), and any application with writeable shareable images or global sections is immediately going to be suspect code.
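
Why unsynchronised writes to shared data become suspect with more than one CPU can be shown with a small deterministic simulation in Python. Nothing here is taken from the application in this thread: the counter, the CPU names and the function `run_schedule` are hypothetical, and the interleaving is replayed from a fixed list rather than produced by real threads.

```python
def run_schedule(schedule, start=0):
    """Replay read/write steps of several CPUs against one shared counter.

    schedule: list of (cpu, op) with op "read" or "write".
    A "read" copies the shared value into that CPU's private register;
    a "write" stores register + 1 back, as an unlocked increment would.
    """
    shared = start
    registers = {}
    for cpu, op in schedule:
        if op == "read":
            registers[cpu] = shared
        elif op == "write":
            shared = registers[cpu] + 1
        else:
            raise ValueError(f"unknown op: {op}")
    return shared


# One CPU finishes its increment before the other starts.
serial = [("cpu0", "read"), ("cpu0", "write"),
          ("cpu1", "read"), ("cpu1", "write")]

# Both CPUs read the old value before either one writes.
interleaved = [("cpu0", "read"), ("cpu1", "read"),
               ("cpu0", "write"), ("cpu1", "write")]

if __name__ == "__main__":
    print(run_schedule(serial))       # 2: both increments are kept
    print(run_schedule(interleaved))  # 1: one increment is lost
```

The second schedule is the kind of overlap that becomes far more likely once several CPUs really run in parallel, which is why running on a single processor is a useful test.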
H.Becker
Honored Contributor

Re: Error activating image ...

I still think that it is better to have a look at your images and at the system. If you want, I can do that. Drop me a line. My mail address is (temporarily) in my profile.
Klaus Heim
Advisor

Re: Error activating image ...

>>>Has this case been escalated?

Yes, this case has been escalated. We first opened the case with HP support. After some weeks I posted it in this forum.

Thank you very much for all your help.