Operating System - OpenVMS

Re: Process dump error meaning - BR R31,#X000228

 
sathya prabu
Advisor

Process dump error meaning - BR R31,#X000228

Hi,

While trying to debug a TDB file, I got an ACCVIO system error message. I took a process dump and analyzed it. The following is as much as I could get out of it. Can anybody tell me what this means?

analyze/proc set.dmp

OpenVMS Alpha Debug64 Version V8.3-009


%DEBUG-I-NODSTS, no Debugger Symbol Table: no DSF file found and
-DEBUG-I-NODSTIMG, no symbols in DISK$ALPHASYS:[VMS$COMMON.SYSEXE]SET.EXE;1
%DEBUG-I-NOLOCALS, image does not contain local symbols
%DEBUG-I-NOGLOBALS, some or all global symbols not accessible
%SYSTEM-F-IMGDMP, dynamic image dump signal at PC=000000000004C8D4, PS=00000018
break on unhandled exception preceding SHARE$SET_DATA2+116948
DBG> exa/inst 313556
SHARE$SET_DATA2+116948: BR R31,#X000228
DBG> exa/inst 313555
SHARE$SET_DATA2+116947: BLBC R0,#X02286B
DBG> exa/inst 313554
SHARE$SET_DATA2+116946:
DBG> exa/inst 313553
SHARE$SET_DATA2+116945: LDBU R3,#X5A51(R11)
DBG> exa/inst 313552
SHARE$SET_DATA2+116944: JSR R26,(R26)
DBG> exa/inst 313551
SHARE$SET_DATA2+116943:
DBG> exa/inst 313550
SHARE$SET_DATA2+116942:
DBG>
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

This was the actual ACCVIO message while debugging:

Task is in SERVER SERVER_1
Task is in the task debugger
Task is in SERVER SERVER_2
%DEBUG-I-DYNMODSET, setting module INIT_1
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=000000000140C9F3, PC=0000000000F6FD6C, PS=0000001B
break on unhandled exception at INIT_1\INIT_1\%LINE 305+16
305: Initialize INIT_1_WKSP.
DBG>
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Hi,

looks like the code intentionally created an image dump by signalling SS$_IMGDMP.

Try:

DBG> SET MODE HEX
DBG> EXA/INS 4C8D4-20:4C8D4

This should show you the instruction stream. The updated PC (probably after calling LIB$SIGNAL) points to the following instruction (0x04C8D4 = decimal 313556):

SHARE$SET_DATA2+116948: BR R31,#X000228

but you would be interested in the previous instructions...

Volker.
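
The radix mix-up in the first post is easy to check outside the debugger. The following is only a small sketch (the variable and function names are illustrative, not part of any OpenVMS tool) showing that the decimal addresses from the dump analysis are the same locations as the hex PC in the IMGDMP message:

```python
# Decimal values as they appear in the original debugger session.
pc_decimal = 313556        # from "exa/inst 313556"
offset_decimal = 116948    # from "SHARE$SET_DATA2+116948"

def to_hex(value):
    """Format a number as upper-case hex, the way the debugger prints it in hex radix."""
    return format(value, "X")

pc_hex = to_hex(pc_decimal)
offset_hex = to_hex(offset_decimal)

# The PC matches the one reported in %SYSTEM-F-IMGDMP, and the offset
# matches the SHARE$SET_DATA2+1C8D4 shown later in hex radix.
print(pc_hex, offset_hex)
```

Since Alpha instructions are 4 bytes long, stepping the decimal address down by 1 (313555, 313554, ...) lands in the middle of instructions, which is presumably why those examines showed nothing useful.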
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

So you're really trying to figure out the context of the ACCVIO. Try this:

DBG> SET MODE HEX
DBG> EXA/INS 0F6FD6C

Then look at the instruction opcode and figure out which register contains a 'bad' or unexpected value. The access violation happened when trying to WRITE to virtual address 140C9F3 (that is what reason mask=04 indicates).

Volker.
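
As a rough illustration of how the "reason mask" in the ACCVIO message can be read, here is a minimal sketch. It assumes the meaning of the low three bits as I recall them from the ACCVIO message help text (bit 0: length violation, bit 1: page table entry not accessible, bit 2: intended access was a modify); the function name is hypothetical and higher bits are ignored.

```python
def decode_accvio_reason(mask):
    """Decode the low three bits of an ACCVIO reason mask (assumed layout)."""
    return {
        "length_violation": bool(mask & 0x1),   # bit 0
        "pte_inaccessible": bool(mask & 0x2),   # bit 1
        "access": "write/modify" if mask & 0x4 else "read",  # bit 2
    }

# reason mask=04 from the message in this thread
reason = decode_accvio_reason(0x04)
print(reason)
```

With mask 04 only bit 2 is set, which is consistent with a write to a page that exists but does not permit that access.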
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi,

I tried examining the instructions, but I got the following messages.

DBG> set rad hex
DBG> exa/inst 140C9F4
%DEBUG-E-NOACCESSR, no read access to address 000000000140C9F4
DBG> exa/inst 140C9F3
%DEBUG-E-NOACCESSR, no read access to address 000000000140C9F3
DBG> exa/inst 4C8D4-20:4C8D4
SHARE$SET_DATA2+1C8B4: LDQ R26,#XFF60(R2)
SHARE$SET_DATA2+1C8B8: ADDL R31,R9,R16
SHARE$SET_DATA2+1C8BC: BIS R31,R31,R17
SHARE$SET_DATA2+1C8C0: LDA R18,#X04FC(R31)
SHARE$SET_DATA2+1C8C4: BIS R31,#X03,R25
SHARE$SET_DATA2+1C8C8: LDQ R27,#XFF68(R2)
SHARE$SET_DATA2+1C8CC: LDQ_U R31,(SP)
SHARE$SET_DATA2+1C8D0: JSR R26,(R26)
SHARE$SET_DATA2+1C8D4: BR R31,#X000228

Please let me know what this says. I take it that I cannot read the specified address. Also, every time I debug the TDB file, the debugger gives an ACCVIO at either 140C9F3 or 140C9F4. Does this have anything to do with quotas? I have tried raising BYTLM, WSEXTENT and other memory-limiting values, but I still get the reason mask=04 ACCVIO. How should I approach this issue?
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi,

I got some more info.

The following is what I have got. The addresses from 140C9EF through 140C9F3 show no values; only 140C9EE does. Can anybody tell me why the ACCVIO for that address happened during a write operation, and why nothing is displayed for the addresses after 140C9EE? What should I do to prevent this?

DBG> set rad hex
DBG> exa/inst 140C9F3
ACM$SERVER_1+0A3:
DBG> exa/inst 140C9F2
ACM$SERVER_1+0A2:
DBG> exa/inst 140C9F1
ACM$SERVER_1+0A1:
DBG> exa/inst 140C9F0
ACM$SERVER_1+0A0:
DBG> exa/inst 140C9ef
ACM$SERVER_1+9F:
DBG> exa/inst 140C9ee
ACM$SERVER_1+9E: FBGE F11,#X006BFC
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

The code at SHARE$SET_DATA2+1C8D0 is signalling SS$_IMGDMP by calling LIB$SIGNAL or LIB$STOP (most likely).

SHARE$SET_DATA2+1C8C0: LDA R18,#X04FC(R31)
loads 04FC into R18, this is the SS$_IMGDMP status code. You can verify this with $ EXIT %X4FC

I would assume, that this is code in some condition handler, which tries to force a process dump to enable the underlying exception to be diagnosed.

DBG> SHOW CALL

should show the call stack and the original exception (ACCVIO).

You need to concentrate on the ACCVIO. If EXAMINE of the failing virtual address also fails in the debugger (don't use EXA/INS, but just use EXA 140C9F3), you need to locate the failing instruction stream, as I said before:

DBG> EXA/INS 0F6FD6C-20:0F6FD6C

then decode the instruction shown at 0F6FD6C and look at the register contents. Then figure out why some register contains an invalid value pointing to a non-accessible virtual address.

Then find the source code for that instruction stream.

Volker.
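
To see why 04FC reads as a fatal SYSTEM status, the condition value can be split into its fields. This is a small sketch assuming the standard OpenVMS condition value layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27); the function and table names are illustrative only.

```python
# Severity codes of an OpenVMS condition value (low three bits).
SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}

def decode_condition(value):
    """Split a condition value into (facility, message number, severity letter)."""
    severity = value & 0x7
    message_number = (value >> 3) & 0x1FFF
    facility = (value >> 16) & 0xFFF
    return facility, message_number, SEVERITY.get(severity, "?")

# Value loaded into R18 by "LDA R18,#X04FC(R31)"
facility, message_number, severity = decode_condition(0x04FC)
print(facility, message_number, severity)
```

Facility 0 is the SYSTEM facility and severity 4 is fatal, which agrees with the %SYSTEM-F-IMGDMP text in the first post.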



sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thanks a lot for the help. I was able to get more information on the issue.

Please let me know what the output below means.

DBG> show call
module name routine name line rel PC abs PC
*INIT_1 INIT_1 305 000000000000031C 0000000000F6FD6C
SHARE$ACMTWPSHR 0000000000037BBC 0000000007CA5BBC
SHARE$ACMTWPSHR 0000000000039588 0000000007CA7588
SHARE$ACMTWPSHR 000000000003E08C 0000000007CAC08C
SHARE$ACMTWPSHR 000000000003DDCC 0000000007CABDCC
SHARE$ACMTWPSHR 000000000003A330 0000000007CA8330
SHARE$ACMTWPSHR 0000000000037DEC 0000000007CA5DEC
SHARE$ACMTWPSHR 00000000000393A0 0000000007CA73A0
*SERVER_1 ACM$SERVER_1 00000000000000D4 000000000140C9D4
SQL$PROC_2_A93A76_241CC1 0000000001927E18 0000000001937E18
----- the above looks like a null frame in the same scope as the frame below
0000000000079040 0000000000089040
FFFFFFFF80385D54 FFFFFFFF80385D54

DBG> set rad hex
DBG> exa SQL$PROC_2_A93A76_241CC1
RDB$SUBROUTINE_1_SR\SQL$PROC_2_A93A76_241CC1: 01403099

DBG> exa 1937E18
SQL$PROC_2_A93A76_241CC1+658: 23DE0010A79E0008
DBG> exa 1927E18
RDB$SUBROUTINE_2_SR\SQL$PROC_1_A8EB7D_9563C1\%LINE 309+18: LDL R18,#X00B0(R3)

DBG> exa SQL$PROC_1_A8EB7D_9563C1
RDB$SUBROUTINE_2_SR\SQL$PROC_1_A8EB7D_9563C1: 06803099


-sathya
Volker Halle
Honored Contributor
Solution

Re: Process dump error meaning - BR R31,#X000228

Sathya,

this is getting confusing.

In the initial post, you've analyzed a process dump with a %SYSTEM-F-IMGDMP at PC=04C8D4.

In your last response, your SHOW CALLS output does not match this situation. It should have also listed SHARE$SET_DATA2+116948

Please ALWAYS use DBG> SET RADIX HEX, so that the addresses shown always are in hex.

The ACCVIO seems to have been reported at
*INIT_1 INIT_1 305 000000000000031C 0000000000F6FD6C

Now examine

DBG> SET MODE HEX
DBG> EXA/INS 0F6FD6C-20:0F6FD6C

Please note that the failing VA of 0140C9F3 is pretty close to the following return address on the stack:

*SERVER_1 ACM$SERVER_1 00000000000000D4 000000000140C9D4

Volker.
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thanks a lot for your support.

That was my mistake: the first time, the PC address was from a different code path, and the second time, the PC address was from another call. But both times the write operation on the specific VA x140C9F0 + x0003 is failing.

I could get the exact place where the instruction failed.

Line 305: LDQ_U R24,#X0003(R3)
: LDQ_U R19,(R3)
: MSKLH R24,R3,R24
: MSKLL R19,R3,R19
-> : STQ_U R24,#X0003(R3)

In the entire routine, R3 is unmodified; from the beginning it holds the same value. When STQ_U is attempted with R24 at x0003(R3), the ACCVIO is raised. Can you tell me what kind of problem this is and how I can solve it? Why is R3 holding this value when the location it points to cannot be written?

The corresponding line was an INITIALIZE statement in one subroutine.

In the other, the corresponding line was a MOVE operation.

Thanks,
-Sathya
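
The address arithmetic behind the failing STQ_U can be sketched as follows. This assumes R3 = 140C9F0 as stated in the post above and relies on the Alpha rule that LDQ_U/STQ_U ignore the low three bits of the computed address; the function name is made up for illustration.

```python
def stq_u_target(base, displacement):
    """Return (computed address, quadword actually accessed) for LDQ_U/STQ_U."""
    effective = base + displacement
    aligned = effective & ~0x7   # low three address bits are ignored
    return effective, aligned

R3 = 0x140C9F0                   # register value quoted in the post
effective, aligned = stq_u_target(R3, 0x3)
print(hex(effective), hex(aligned))
```

The computed address is the 140C9F3 reported in the ACCVIO message, and the quadword touched starts at 140C9F0. The LDQ_U/MSKLH/MSKLL/STQ_U pattern is what a compiler typically emits for a store whose alignment it cannot prove, so the sequence itself looks ordinary; the open question is why R3 points into a page that cannot be written.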
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

you need to find out the caller of this piece of code and figure out HOW R3 is getting loaded with this value.

If you can reproduce this, you might want to set a breakpoint at the failing instruction and look at the value of R3 when you hit that breakpoint. You may be getting to this code with correct values of R3 as well.

Once you hit this breakpoint and R3 looks bad or unexpected, issue a DBG> SHOW CALL to find out where this routine has been called from.

Volker.
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

can you find out, which data structures are at address 140C9F0 ? The data at that address range is readable, but not writeable from USER mode.

You could try a couple of EXAMINE commands at that address range. If it is ASCII data, maybe you can guess where it's being used from.

Volker.
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thanks a lot.

But I have some more info. The process behaves very weirdly. Please shed some light on it. This is still dragging on, and I want to get a handle on it.

at the start of the debugging
exa/ps 140C9F0
ACM$READ_SERVER+0A0:
SP_ALIGN IPL VMM CM IP SW
0 25 0 KRNL 0 0

after entering the module :-
exa/ps 140C9F0
ACM$READ_SERVER+0A0:
SP_ALIGN IPL VMM CM IP SW
0 25 0 KRNL 0 0

R3 content when entering into the specific module
INIT_PPVM\INIT_PPVM\%R3: 000000000140C9F0

ascii content at 140C9F0
ACM$READ_SERVER+0A0: "`Ù......"

hex content at 140c9f0
ACM$READ_SERVER+0A0: 000103180008D960

Actual exception:
stepped to INIT_PPVM\INIT_PPVM\%LINE 305
INIT_PPVM\INIT_PPVM\%R3: 000000000140C9F0
INIT_PPVM\INIT_PPVM\%R24: 2020202020000000
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=000000000140C9F3, PC=0000000000F6FD6C, PS=0000001B
break on unhandled exception at INIT_PPVM\INIT_PPVM\%LINE 305+10

examine after exception:
ACM$READ_SERVER+0A0: 000103180008D960

ps after exception:
ACM$READ_SERVER+0A8:
SP_ALIGN IPL VMM CM IP SW
0 9 0 SUPR 0 0
ACM$READ_SERVER+0A0:
SP_ALIGN IPL VMM CM IP SW
0 25 0 KRNL 0 0

exa/ps 140C9F3
ACM$READ_SERVER+0A3:
SP_ALIGN IPL VMM CM IP SW
1 24 0 KRNL 0 0

after an ASCII deposit at address 140C9F3 (the process was allowed to deposit):
ACM$READ_SERVER+0A3: 0109415948544153
ACM$READ_SERVER+0A3: "SATHYA.."

-sathya
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

to prove that memory at virtual address 140C9F3 is NOT writeable by user mode do this:

- start your application with the debugger
- DBG> SET BR INIT_PPVM\INIT_PPVM\%LINE 305
- DBG> GO

Once you've hit the breakpoint, log in as a privileged user (you need the CMKRNL privilege), then invoke SDA with ANAL/SYS. In SDA, set process context to the process being debugged:

SDA> SET PROC/ID=
SDA> SHOW PROC/PAGE 140C9F3;10

then look at the data in columns 'Read' and 'Writ' and post this data here

SDA> EXIT

- in your debug session, issue a SHOW CALLS. This will tell you from which routine INIT_PPVM has been called. Consider setting a breakpoint at the call to INIT_PPVM. Look at the value of R3 in the context of the caller.

Volker.
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thank you very much for the info.

Here is the data for the address space
Process address space
---------------------

 Mapped Address      PTE Address           PTE         Type  Read Writ MLOA GH PgTyp   Loc           Bak         RefCnt   WSLX
----------------- ----------------- ----------------- ----- ---- ---- ---- -- ------- ------ ----------------- ------ --------
00000000.0140C000 FFFFFEFC.00005030 000C0C02.00160F01 VALID KESU NONE M-U-  0 PROCESS ACTIVE FF000000.00000000   0001 00000555

Please let me know what it signifies and how I can get the code to work.

-Thanks
Sathya
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

as expected, the protection of the page in question is Writ: NONE, so your code cannot write to that page!

The only conclusion from this is that some previous code calculated a wrong address or address offset before calling your final failing routine, which incurs the ACCVIO.

So my previous advice is still valid: run the code with the debugger and wait for the ACCVIO, then issue a DBG> SHOW CALLS command to see the call stack.

Or set a breakpoint at the first instruction in the failing routine (INIT_PPVM ?) and look at the value in R3. If it looks 'bad', determine the caller. For the next execution of your program, set a breakpoint in the calling routine and step through it until it calls INIT_PPVM. Look at how it calculates R3.

Does your program always fail when entering this routine ? Did it ever work before ? If so, what changed ?

Volker.
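
How the SDA line relates to the failing address can be sketched like this. The sketch assumes an 8 KB page size (the usual Alpha page size, not stated in the thread) and reads the Read/Writ columns as lists of access-mode letters (K, E, S, U) or the word NONE; the function names are hypothetical.

```python
PAGE_SIZE = 0x2000  # assumption: 8 KB pages

def page_base(va):
    """Start address of the page containing the given virtual address."""
    return va & ~(PAGE_SIZE - 1)

def mode_allowed(field, mode):
    """True if the access-mode letter appears in a Read/Writ column value."""
    return field != "NONE" and mode in field

failing_va = 0x140C9F3
base = page_base(failing_va)

# Column values taken from the SDA output above.
user_can_read = mode_allowed("KESU", "U")
user_can_write = mode_allowed("NONE", "U")
print(hex(base), user_can_read, user_can_write)
```

The page base comes out as the 0140C000 shown in the "Mapped Address" column, readable from user mode but writable from no mode, which matches the write ACCVIO.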
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thanks for the quick info.

Actually, the code was working fine before.
After that, a lot of changes have been made. I could not pinpoint exactly which one is responsible. But I was able to notice something weird (to me, at least) again.

While in debug, the task is calling INIT_PPVM without any actual call to it being made from the task (previously I thought the routine INIT_PPVM was supposed to be called from the task).

There is one specific step in the task where it is always mistakenly called.
When I step through the task, it performs normally. When it reaches that specific step, I did a 'set break' on the routine that is actually called from the task, but the ACMS debugger does not show me the routine on which I set the break; instead it shows INIT_PPVM, and as I step through, it gives an ACCVIO at line 305 for STQ_U on 140C9F3.

Do you have any idea why the task is calling uninvited guests?

-Thanks,
Sathya
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

I don't have any experience with ACMS debugging.

If you are only looking at the source code, you may not see the actual 'call' if it's being generated by the compiler. You would need to look at the machine code listing of the module.

Once you've hit the ACCVIO, try

DBG> SHOW CALL
DBG> EXA/INS @R26

Volker.
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

and by the way:

The fact that the page is Writ: NONE is consistent with the fact that this seems to be a code page, as pointed out earlier:

Please note that the failing VA of 0140C9F3 is pretty close to the following return address on the stack:

*SERVER_1 ACM$SERVER_1 00000000000000D4 000000000140C9D4

So something may have gone wrong on the stack and some code address has been picked up incorrectly and used as a data address.

What does this return ?

DBG> EXA/INS 0140C9F3-10:0140C9F3+10

Volker.
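
The "pretty close" observation can be quantified with a little arithmetic on the SHOW CALLS line. This is only a sketch; it assumes the "rel PC" column is an offset from some base of the module or routine, so that abs PC minus rel PC gives that base.

```python
failing_va = 0x140C9F3        # virtual address from the ACCVIO message
return_address = 0x140C9D4    # abs PC of the ACM$SERVER_1 frame
rel_pc = 0xD4                 # rel PC of the same frame

distance = failing_va - return_address     # bytes between the two
base = return_address - rel_pc             # assumed base for the rel PC
offset_from_base = failing_va - base       # failing VA relative to that base

print(hex(distance), hex(base), hex(offset_from_base))
```

The failing address lies only 0x1F bytes beyond the return address, inside the same region as the server code, which supports the idea that a code address ended up being used as a data pointer.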
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

I am getting the following instructions for that range:

DBG> exa/ins 0140C9d3:0140C9F3+10
ACM$READ_SERVER+83: FBLT F16,#XF41E6B
ACM$READ_SERVER+87: BGT R8,#X041E43
ACM$READ_SERVER+8B: STT F16,#X0847(R0)
ACM$READ_SERVER+8F: STQ_C R16,#X10A7(R0)
ACM$READ_SERVER+93: STQ_C R16,#X18A5(R0)
ACM$READ_SERVER+97: FBEQ F0,#XF41EA7
ACM$READ_SERVER+9B: BGT R4,#X000143
ACM$READ_SERVER+9F:
ACM$READ_SERVER+0A3:
ACM$READ_SERVER+0A7:
ACM$READ_SERVER+0AB:
ACM$READ_SERVER+0AF:
ACM$READ_SERVER+0B3:

and 'show calls' having

module name routine name line rel PC abs PC
*INIT_PPVM INIT_PPVM 305 000000000000031C 0000000000F6FD6C
SHARE$ACMTWPSHR 0000000000037BBC 0000000007CA5BBC
SHARE$ACMTWPSHR 0000000000039588 0000000007CA7588
SHARE$ACMTWPSHR 000000000003E08C 0000000007CAC08C
SHARE$ACMTWPSHR 000000000003DDCC 0000000007CABDCC
SHARE$ACMTWPSHR 000000000003A330 0000000007CA8330
SHARE$ACMTWPSHR 0000000000037DEC 0000000007CA5DEC
SHARE$ACMTWPSHR 00000000000393A0 0000000007CA73A0
*READ_SERVER ACM$READ_SERVER 00000000000000D4 000000000140C9D4
SQL$PROC_2_A45A23_241DC1 0000000001927E18 0000000001937E18
----- the above looks like a null frame in the same scope as the frame below
0000000000079040 0000000000089040
FFFFFFFF80385D54 FFFFFFFF80385D54

-Thanks
sathya
Volker Halle
Honored Contributor

Re: Process dump error meaning - BR R31,#X000228

Sathya,

this is NOT a valid instruction stream. Alpha instructions always start at longword boundaries. It's my fault that I asked you to display the I-stream starting at 0140C9F3-10.

So let's try this again:

READ_SERVER ACM$READ_SERVER 00000000000000D4 000000000140C9D4

So DBG> EXA/INS 0140C9D4-10:0140C9D4

should report a valid instruction stream. If so, then try:

DBG> EXA/INS 0140C9D4-10:0140C9F3+11

Also please try:

DBG> EXA/QUAD 140C9F0
DBG> EXA/LONG @140C9F0
DBG> EXA/LONG @(140C9F0+4)

And just for confirmation:

DBG> EXA/INS 0F6FD6C
DBG> EXA/INS 07CA5BBC-10:07CA5BBC

and

DBG> EXA R26 ! should display return address
DBG> EXA/INS @R26

Trying to do a debug session via a forum is going to be a long and tedious process, I hope you are aware of this...

Volker.
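
The longword-boundary rule is easy to check before typing an EXA/INS range. A minimal sketch, with illustrative function names, comparing the start address that produced the garbage stream with the one suggested above:

```python
def is_longword_aligned(address):
    """Alpha instructions are 4 bytes long and start on 4-byte boundaries."""
    return address % 4 == 0

def align_down(address):
    """Round an address down to the previous longword boundary."""
    return address & ~0x3

start_bad = 0x140C9F3 - 0x10    # start derived from the failing VA
start_good = 0x140C9D4 - 0x10   # start derived from the return address

print(hex(start_bad), is_longword_aligned(start_bad))
print(hex(start_good), is_longword_aligned(start_good))
print(hex(align_down(start_bad)))
```

Starting at 140C9E3 decodes bytes that straddle instruction boundaries, while 140C9C4 (or the rounded-down 140C9E0) begins on a real instruction boundary.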

sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Hi Volker,

Thanks a lot for all your support. I understand. Let me carry on from here.

Maybe if I can't, I might start a new thread again ;) (just joking).

Thanks again for all your support.

-Sathya
sathya prabu
Advisor

Re: Process dump error meaning - BR R31,#X000228

Solution: maybe in some other thread. This is going to be a tedious process, an entire application debug, so I am temporarily giving up, but I will be back with a solution.