04-11-2005 03:59 AM
Periodic access violation
Thank you in advance for any and all help with this issue. We are running OpenVMS V7.1 on an Alpha 4100 in a process control/manufacturing facility, with programs written in Fortran and an Oracle database for production storage.
The problem did not begin until after a recent patch/minor upgrade to Oracle to fix a bug (imagine that!). The program in question runs every hour: it reads data from one instance, does some manipulation, and inserts the data into another instance. It crashes regularly, every 3 to 3.5 days.
I have several examples of the crash output, and each one seems to be in a different location of the program task. Of the 14 examples I have, the reason mask (00), the PC, and the PS are the same in all of them, and the virtual address is the same in all but 2.
Below is a copy of one of the access violation crash outputs. Again, thanks for any and all help/suggestions.
Steve
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=000000007AFA37C0, PC=FFFFFFFF8091A7A0, PS=0000001B
Improperly handled condition, image exit forced.
Signal arguments: Number = 0000000000000005
Name = 000000000000000C
0000000000010000
000000007AFA37C0
000000008091A7A0
000000000000001B
Register dump:
R0 = 0000000002607934 R1 = 00000000026444E8 R2 = 0000000000812600
R3 = 00000000026444E8 R4 = 00000000008126E0 R5 = 0000000000000000
R6 = 0000000000000001 R7 = 0000000002607934 R8 = 00000000008801B8
R9 = 0000000000000000 R10 = 0000000000008000 R11 = 0000000002651080
R12 = 000000007AFA4F6C R13 = 00000000026B4868 R14 = 000000000260EB60
R15 = 0000000000000000 R16 = 00000000026444E8 R17 = 00000000008126E0
R18 = 0000000000000000 R19 = 0000000000000001 R20 = 0000000000000000
R21 = 0000000002607938 R22 = 0000000000000100 R23 = 0000000000000100
R24 = 0000000000000100 R25 = 0000000000000000 R26 = FFFFFFFF8091C07C
R27 = 0000000000812528 R28 = FFFFFFFF8091A67C R29 = 0000000000000003
SP = 000000007AFB0000 PC = FFFFFFFF8091A7A0 PS = 000000000000001B
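Comparing the 14 crash outputs field by field, as described above, can be automated. The following is only a minimal Python sketch, assuming each crash log contains the one-line `%SYSTEM-F-ACCVIO` summary in the format shown; the function names `parse_accvio` and `summarize` are hypothetical and not part of any OpenVMS tool.

```python
import re
from collections import Counter

# Matches the one-line ACCVIO summary in the format shown in the post.
ACCVIO_RE = re.compile(
    r"reason mask=(?P<mask>[0-9A-F]+), "
    r"virtual address=(?P<va>[0-9A-F]+), "
    r"PC=(?P<pc>[0-9A-F]+), PS=(?P<ps>[0-9A-F]+)"
)


def parse_accvio(line):
    """Return the four hex fields of an ACCVIO line as integers, or None."""
    match = ACCVIO_RE.search(line)
    if match is None:
        return None
    return {name: int(value, 16) for name, value in match.groupdict().items()}


def summarize(lines):
    """Count how often each (virtual address, PC) pair occurs."""
    counts = Counter()
    for line in lines:
        fields = parse_accvio(line)
        if fields is not None:
            counts[(fields["va"], fields["pc"])] += 1
    return counts


# The summary line from the crash output quoted above.
sample = (
    "%SYSTEM-F-ACCVIO, access violation, reason mask=00, "
    "virtual address=000000007AFA37C0, PC=FFFFFFFF8091A7A0, PS=0000001B"
)

fields = parse_accvio(sample)
print({name: hex(value) for name, value in fields.items()})
```

Feeding the summary lines of all crash logs to `summarize` would show at a glance which crashes share the same failing address and PC and which ones differ.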
04-11-2005 04:30 AM
Re: Periodic access violation
Steve,
The reason mask value in the Signal Array (10000) indicates an access violation during an instruction fetch.
This exception happens in a system routine (PC=8091A7A0). You can determine which instruction was being executed, and which routine it belongs to, by issuing the following commands on the running system (the same one on which this program ran):
$ ANAL/SYS ! needs CMKRNL privilege
SDA> READ/EXEC
SDA> EXA/INS 8091A7A0
SDA> EXA/INS 8091A7A0-30;40
SDA> EXIT
Then please post the output of the EXA/INS commands.
You should also set up the process to write a process dump by issuing $ SET PROC/DUMP before running the image in this process. If the process then aborts with an improperly handled condition, a process dump is written (as imagename.DMP in the current default directory).
Analysing process dumps on V7.1 with the PC in system space may prove quite complicated. More recent versions of OpenVMS have improved the IMGDMP facility and allow much better analysis.
Volker.
04-11-2005 06:16 AM
Re: Periodic access violation
Volker, thanks for your help.
Below is the output from the SDA commands you recommended. I also wanted to set up this process to take a process dump when it crashes, but apparently SET PROC/DUMP is only valid for the current process. The EXEs that run on my system are managed by a process management program that stops, starts, restarts, and otherwise controls the images.
SDA> EXA/INS 8091A7A0
FFFFFFFF.8091A7A0: STQ R27,(SP)
SDA> EXA/INS 8091A7A0-30;40
FFFFFFFF.8091A770: LDQ R3,#X0018(FP)
FFFFFFFF.8091A774: LDQ R4,#X0020(FP)
FFFFFFFF.8091A778: LDQ R5,#X0028(FP)
FFFFFFFF.8091A77C: LDQ R6,#X0030(FP)
FFFFFFFF.8091A780: LDQ R7,#X0038(FP)
FFFFFFFF.8091A784: LDQ R8,#X0040(FP)
FFFFFFFF.8091A788: LDQ FP,#X0048(FP)
FFFFFFFF.8091A78C: LDA SP,#X0050(SP)
FFFFFFFF.8091A790: RET R31,(R26)
FFFFFFFF.8091A794: BIS R31,R31,R31
FFFFFFFF.8091A798: LDA SP,#XF520(SP)
FFFFFFFF.8091A79C: BIS R31,R16,R1
FFFFFFFF.8091A7A0: STQ R27,(SP)
FFFFFFFF.8091A7A4: LDA R16,#X0060(SP)
FFFFFFFF.8091A7A8: STQ R26,#X0A78(SP)
FFFFFFFF.8091A7AC: LDA R0,#X0081(SP)
FFFFFFFF.8091A7B0: STQ R2,#X0A80(SP)
SDA> exit
04-11-2005 06:43 PM
Re: Periodic access violation
Steve,
Although the information in the signal array (reason mask) and the instruction stream are not 100% consistent, I would assume a stack overflow as a possible reason for this process crash.
The ACCVIO happened on an instruction trying to write to the user stack. The stack pointer (SP) had just been decremented by 2784 (decimal) bytes.
Failing virtual address: 7AFA37C0
SP decrement: FFFFF520
Previous SP value: 7AFA42A0
So the previous SP pointed to a different page. Each Alpha page is 8 KB in size (hex 2000), so the stack pointer crossed the page boundary at 7AFA4000. The ACCVIO seems to have happened on the attempt to access the next-lower page.
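The arithmetic behind these numbers can be checked with a short Python sketch. It assumes only the values quoted in this thread: the failing virtual address, the 16-bit displacement from `LDA SP,#XF520(SP)`, and the 8 KB page size. The helper names are hypothetical.

```python
PAGE_SIZE = 0x2000  # 8 KB Alpha page, as stated in the post


def sign_extend_16(value):
    """Interpret a 16-bit displacement as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def page_base(address):
    """Return the base address of the page that contains `address`."""
    return address - (address % PAGE_SIZE)


failing_va = 0x7AFA37C0                 # virtual address from the ACCVIO message
displacement = sign_extend_16(0xF520)   # from LDA SP,#XF520(SP); negative value

# The LDA added the (negative) displacement to SP, so undo it to get the old SP.
previous_sp = failing_va - displacement

# The two SP values lie in different pages if their page bases differ.
crossed_page = page_base(previous_sp) != page_base(failing_va)

print(f"SP decrement:      {-displacement} bytes")
print(f"Previous SP:       {previous_sp:08X}")
print(f"Previous SP page:  {page_base(previous_sp):08X}")
print(f"Failing VA page:   {page_base(failing_va):08X}")
print(f"Crossed a page:    {crossed_page}")
```

The sketch only reproduces the address arithmetic; it says nothing about why the lower page was not accessible.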
The user stack normally expands automatically towards lower addresses in P1 space. Stack expansion may be limited by VIRTUALPAGECNT or the process PGFLQUOTA.
Try to watch this process during normal operation using SHOW PROC/CONT/ID= and look for increasing values of Virtual Pages, and check whether this value approaches the VIRTUALPAGECNT system parameter.
You could also use SDA to check the P1 space low address:
$ ANAL/SYS
SDA> SET PROC/ID=
SDA> SHOW PROC/PHD
...
First free P1 VA 00000000.7AE38000 ...
...
If this address continually decreases while the process is in operation, this may indicate a stack overflow problem.
You can also check the process's remaining page file quota using
$ SHOW PROC/QUOTA/ID=
If Paging File Quota continuously decreases, you can also predict that the process is going to crash at some point in time...
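A rough way to turn such readings into a prediction is a linear extrapolation. The Python sketch below is an illustration only: it assumes the remaining Paging File Quota has been noted by hand from SHOW PROC/QUOTA at known times, that the decrease is roughly linear, and the sample numbers are invented, not taken from this system. The function name `hours_until_exhausted` is hypothetical.

```python
def hours_until_exhausted(samples):
    """Estimate hours until the quota reaches zero.

    `samples` is a list of (hours_elapsed, remaining_quota) pairs in time
    order. Returns None if there are too few samples or no decrease.
    """
    if len(samples) < 2:
        return None
    (first_time, first_quota) = samples[0]
    (last_time, last_quota) = samples[-1]
    elapsed = last_time - first_time
    if elapsed <= 0:
        return None
    # Average consumption per hour between the first and last reading.
    rate = (first_quota - last_quota) / elapsed
    if rate <= 0:
        return None  # quota is flat or growing; no exhaustion predicted
    return last_quota / rate


# Invented readings: (hours since first reading, remaining quota in pages).
readings = [(0, 48000), (24, 36000), (48, 24000)]

estimate = hours_until_exhausted(readings)
if estimate is None:
    print("No steady decrease observed.")
else:
    print(f"Quota would run out in about {estimate:.1f} hours at this rate.")
```

If the estimate lines up with the observed 3 to 3.5 day crash interval, that would support the stack or virtual memory growth theory; if the quota stays flat, another cause is more likely.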
Volker.