Operating System - OpenVMS

pacab
Advisor

ACCVIO register stack dump

Greeting to all,
I have spent countless hours trying to figure out this access violation issue and have had enough. So I am seeking this forum's expertise to shed some light.

I have attached a file with the stack dump the application issues and a snippet of the C++ code.

This is a legacy C++ application that was moved during a hardware and VMS OS upgrade from 6.2 to 7.3-2. Each time our business users use the printing function more than once, they get kicked out of the application and get the stack dump. If they print once, go back to the main menu (which is 2 screen levels out), and select an option to switch to a different entity, which calls a new screen, they also get kicked out with a stack dump. I have searched and read documents and manuals with no luck... PLEASE help!

Thank you all in advance for your time and feedback...Abel
Willem Grooters
Honored Contributor

Re: ACCVIO register stack dump

I have seen this before and lost quite some time to find out that the problem was in a called routine, that used an undocumented feature. It was buried deep - so it took a lot of time to locate the source.

If possible, I'd rebuild the application, including the (non-system) libraries used. Read the release notes (all of them) between 6.2 and 7.3-2 to see if any changes in the code are required.

Do you get a traceback? This may shed some light on where the error occurs.
Willem Grooters
OpenVMS Developer & System Manager
Volker Halle
Honored Contributor

Re: ACCVIO register stack dump

Abel,

this looks like a stack overflow problem. The failing virtual address 7AE2C000 points into P1 space, where the user stack resides.

Can you start the image with RUN/DUMP, or set the process with SET PROC/DUMP/ID=xxx after the image has been started? It should then write an image dump (SYS$LOGIN:image_name.DMP) when an improperly handled condition occurs. You can look at the dump with ANAL/PROC or even ANAL/CRASH to find out what's going on.

To rule out a simple shortage of PGFLQUOTA, which may prevent automatic stack expansion, try increasing the pagefile quota for the user.

Volker.
pacab
Advisor

Re: ACCVIO register stack dump

Willem,
I will check the release notes to see if anything jumps out. Rebuilding the application would be a bit of an undertaking, and at this time we are just trying to keep the application stable. But you are right on the money when you say some of this code needs to be cleaned up.

Volker, I will run this with /DUMP and take a closer look. I had actually already increased PGFLQUOTA to 700000, and it helped to allow more than 2-3 prints, but it still produces the same ACCVIO error.

Any other suggestions?...

Thank you for all of your comments, time and feedback.

Abel
Ian Miller.
Honored Contributor

Re: ACCVIO register stack dump

I saw
"increased PGFLQUOTA to 700000, and it helped to allow more than 2-3 prints"

and wondered if you tried increasing it any more?
____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: ACCVIO register stack dump

Abel,

if increasing PGFLQUOTA helps 'a little bit', increasing it even more may be a temporary workaround. But it also confirms that the underlying problem is a STACK consumption problem: some piece of code seems to allocate STACK space and not return it, or to allocate STACK space based on some variable which does not get reset and gets bigger from call to call...

You should also be able to 'see' this by running SHOW PROC/CONT against such a process and having the user issue a couple of prints. Virtual Pages used should increase.

You could examine the stack in the running system with SDA and see it growing from print call to print call, but a process dump would certainly be better.

Volker.
pacab
Advisor

Re: ACCVIO register stack dump

I increased the PGFLQUOTA to 900000 and am not sure it helped. I monitored the process and created a dump of the error. Attached is the output from both SET PROC/DUMP and SHO PROC/CONT.

I too feel this is a stack size issue; I am just not sure where it is or what parameters to change to prevent this problem from happening.

I also included the printing code used in this application that is giving me nightmares... Abel
Volker Halle
Honored Contributor

Re: ACCVIO register stack dump

Abel,

before the final failure, the SP was at 7AE2A0C0 (same value as at start of application). The failing VA is 7AE2C000, just one ALPHA page further DOWN the stack. Stacks GROW towards LOWER virtual addresses, but the failing VA is at a HIGHER address.
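The page arithmetic behind this observation can be checked with a short sketch. It assumes the 8 KB (0x2000-byte) page size used by OpenVMS on Alpha hardware; the names kAlphaPageSize and page_base are purely illustrative and do not come from the application or from any VMS header.

```cpp
#include <cstdint>

// Assumption: 8 KB pages, as used by OpenVMS on Alpha hardware.
constexpr std::uint32_t kAlphaPageSize = 0x2000;

// Round a virtual address down to the start of the page that contains it.
constexpr std::uint32_t page_base(std::uint32_t va) {
    return va & ~(kAlphaPageSize - 1);
}

constexpr std::uint32_t kSp      = 0x7AE2A0C0;  // SP before the final failure
constexpr std::uint32_t kFaultVa = 0x7AE2C000;  // failing virtual address

// The SP lies in the page that starts at 7AE2A000 ...
static_assert(page_base(kSp) == 0x7AE2A000, "SP page");
// ... and the failing VA is the first byte of the next page UP.
static_assert(page_base(kFaultVa) == page_base(kSp) + kAlphaPageSize,
              "fault is one page above the SP page");
```

Under that page-size assumption, the fault address is the first byte of the page directly above the one holding the SP, which is the opposite direction from stack growth.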

With the following data, you can confirm, that it really is a STACK UNDERFLOW, which explains why setting PGFLQUOTA higher does NOT help much.

At the DBG prompt, enter SDA, then
SDA> SHOW PROC/PAGE 7AE2A0C0;4000

Look at the Read Writ columns for pages 7AE2A000 (page of USER stack) and 7AE2C000 (page causing ACCVIO).

If this doesn't work in the dump, try it against that process in the running system.

You can check the USER stack start address with:

SDA> EXA CTL$GQ_STACK+18 ! bottom of USER stack

Then let's have a look at the instruction stream leading to the ACCVIO:

DBG> EXA/INS @PC-40:@PC

Volker.
pacab
Advisor

Re: ACCVIO register stack dump

Volker,
Thank you for all your input and time...

I have attached the output of the SDA commands you referred me to. Can you please advise on the results? Also, are there any HP VMS docs you can refer me to so I can learn some of these administrative tools?

Thank you once again...Abel
Volker Halle
Honored Contributor

Re: ACCVIO register stack dump

Abel,

there is the OpenVMS Debugger Manual and the System Analysis Tools Manual (for SDA), but the problem is, that while those books will extensively describe all available commands, they don't tell you the troubleshooting techniques and internals of OpenVMS, which you need in this case to troubleshoot the problem from a process (or system) dump. But I'm here to help you ;-)

The failing instruction is:

LDQ_U R24,(R17)

It tries to load the un-aligned QUADWORD from the virtual address pointed to by R17 (=7AE2C000) into R24 and fails with ACCVIO accessing the page pointed to by R17 - not in memory !

Looking at the instructions preceding the failing one (and having the Alpha architecture manual ready to look up what those instructions really do), one can see that the code seems to be searching memory (byte by byte, R17 pointing to the current byte, R6 containing some byte counter) for a byte value 0x2C (that's a comma ',' in ASCII code!):

ADDL R6,#X01,R6
LDA R17,#X0001(R17)
XOR R20,#X2C,R20
BEQ R20,#X000011
LDQ_U R22,(R17)
EXTBL R22,R17,R22
XOR R22,#X2C,R22 <<< R22 = 0x2C ?
ADDL R6,#X01,R6 <<< incr R6
LDA R17,#X0001(R17) <<< incr R17
BEQ R22,#X00000B <<< branch if R22=0x2C
LDQ_U R24,(R17) <<< ACCVIO here

In your dump, R6 should be something like the 'String length' or number of bytes already checked (R6 is 00002B5F in your register dump attached to your first note).

So the 'string' may have started at 7AE2C000-2B5F ! Try SDA> EXA 7AE2C000-2B5F;100 and see if you could identify the data shown...
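For reference, the subtraction can be worked out ahead of time. This is only a sketch: it assumes R6 really counts the bytes scanned since the start of the string, which is a guess from the instruction stream, and string_start is a made-up helper name.

```cpp
#include <cstdint>

// Hypothetical helper: address where the scan would have started if
// 'bytes_scanned' bytes were examined before reaching 'fault_va'.
constexpr std::uint32_t string_start(std::uint32_t fault_va,
                                     std::uint32_t bytes_scanned) {
    return fault_va - bytes_scanned;
}

// Failing VA 7AE2C000 minus R6 = 00002B5F.
static_assert(string_start(0x7AE2C000, 0x2B5F) == 0x7AE294A1,
              "candidate start of the scanned string");
```

Under that assumption, the data to examine would begin at 7AE294A1.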

You might also want to check your C code for a loop that looks for a ',' in a string. Make sure that there ALWAYS IS a comma, and that the loop is TERMINATED at the end of the string by checking for the terminating 0 in the case of an ASCIZ string!
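A minimal sketch of such a guarded search is shown below. It is not taken from the attached application code; find_comma and its parameters are hypothetical, and the real loop may look quite different.

```cpp
#include <cstddef>

// Hypothetical helper: return the index of the first ',' in buf, or -1 if
// there is none. The scan stops at the terminating NUL or after max_len
// bytes, whichever comes first, so it can never run off the end of the
// buffer into an unmapped page.
std::ptrdiff_t find_comma(const char* buf, std::size_t max_len) {
    for (std::size_t i = 0; i < max_len; ++i) {
        if (buf[i] == '\0') {
            return -1;  // end of ASCIZ string reached, no comma found
        }
        if (buf[i] == ',') {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;  // length limit reached, no comma found
}
```

For a string that is known to be NUL-terminated, the standard std::strchr(s, ',') gives the same protection, since it returns a null pointer when the terminator is reached before a comma. In either case the caller has to handle the "no comma" result instead of assuming one is always present.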

Volker.