
System hang

 
Willem Grooters
Honored Contributor

Re: System hang

What hardware? In particular: what processor type (pre-EV56, EV56, EV6?)
I had similar problems on an EV56-based system with a 3DLabs Oxygen VX1 in graphics mode (a hung DECW$SERVER process looping heavily); it turned out the driver was EV6-optimized and wouldn't work on EV56.
Since the Radeon is even 'newer', it could well be that the driver for this card is optimized for EV6 as well and won't run properly on EV56 or earlier.
It didn't matter whether I had the console set to Graphics or Serial; the only way around it was disabling DECwindows or installing a different card.
Willem Grooters
OpenVMS Developer & System Manager
Wim Van den Wyngaert
Honored Contributor

Re: System hang

Willem: he is asking in DCL if TCP should be started. So, DECW isn't started yet.

Just tried to boot minimal with my AS500. After 5 minutes the machine was blocked.
I power cycled it and tried again. No blocking.
I booted normal and then minimal again. No blocking.

I noted that job_control was doing a few IO's while I was waiting and doing show sys.

Nothing special in oplog.

???

Wim
Wim
Volker Halle
Honored Contributor

Re: System hang

Olive,

as you seem to be able to reproduce this easily, did you ever try to force a crash while the system is hung like that?

CTRL-P would not work on the graphics console, so you need to find and press the HALT button. Do you get the console prompt (>>>)? Can you then enter CRASH and press RETURN to force a crash?

I would not expect any obvious problem to show up in such a forced crash, but even this test may show whether the problem lies between the graphics hardware and the console firmware and/or OpenVMS.

Trying to get a valid crashdump when booted from the CD would require some setup work (DOSD), but even without that, it could maybe tell more about where the problem may be.

Volker.
olive_wide
Frequent Advisor

Re: System hang

This has become a terrible problem. Today I rebooted the server 10 times and it hung 7 times!
The base hardware of the server is:
Alpha ES45 Model 2
Alpha 21264C, EV68, 1250 MHz
1024 MB * 2 memory
Radeon 7500 graphics card
OpenVMS V7.3-2
Tomorrow I will force a crash, analyze the dump file, and post the results here.

Thanks all.
Olive
olive_wide
Frequent Advisor

Re: System hang

Here is the crash information.
Please help me analyze it. What's wrong with the system?

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 1
Crashdump Summary Information:



Crash Time: 22-AUG-2005 15:34:38.48
Bugcheck Type: OPERCRASH, Operator forced system crash
Node: BU (Cluster)
CPU Type: AlphaServer ES45 Model 2B
VMS Version: V7.3-2
Current Process: NULL
Current Image:
Failing PC: FFFFFFFF.80126F74 SCH$IDLE_C+00214
Failing PS: 00000000.00000303
Module: PROCESS_MANAGEMENT (Link Date/Time: 9-AUG-2004 12:55:04.33)
Offset: 00000F74

Boot Time: 22-AUG-2005 15:31:14.00
System Uptime: 0 00:03:24.48
Crash/Primary CPU: 00/00
System/CPU Type: 2608
Pagesize: 8 KByte (8192 bytes)
Physical Memory: 2048 MByte (4194304 PFNs, discontiguous memory)
Dumpfile Pagelets: 47138 blocks
Dump Flags: writecomp,errlogcomp
Dump Type: compressed,selective,shared_mem
EXE$GL_FLAGS: poolpging,init,bugdump
Paging Files: 1 Pagefile and 1 Swapfile installed

Stack Pointers:
KSP = FFFFFFFF.83743FF8 ESP = 00000000.00000000 SSP = 00000000.00000000
USP = 00000000.00000000

General Registers:
R0 = 00000000.00000003 R1 = FFFFFFFF.8154C000 R2 = FFFFFFFF.81009790
R3 = FFFFFFFF.810E11E0 R4 = FFFFFFFF.81AC8A00 R5 = 00000000.00000001
R6 = FFFFFFFF.8154C000 R7 = 00000000.00000000 R8 = 00000000.00000000
R9 = 00000000.00000000 R10 = 00000000.00000004 R11 = FFFFFFFF.81AC8A00
R12 = 00000000.7BD019C8 R13 = FFFFFFFF.810DC6C8 R14 = 00000000.00000001
R15 = 00000000.00000000 R16 = 00000000.0000064C R17 = 00000000.00000018
R18 = 00000000.00217380 R19 = 00000000.00000000 R20 = FFFFFFFF.80126DD8
R21 = 00000000.0000000F R22 = FFFFFFFF.81008000 R23 = 00000000.00000000
R24 = 00000001.58B3595B AI = 00000000.00000003 RA = FFFFFFFF.810D0B8C
PV = FFFFFFFF.810D0B40 R28 = FFFFFFFF.8100804C FP = FFFFFFFF.84899FC0
PC = FFFFFFFF.800EE068 PS = 38000000.00001F00

Halt Registers saved in per-CPU Slot (by console):
Halt PCBB = 00000000.0154C080 R25 (AI) = 00000000.00000000
Halt PC = FFFFFFFF.80126F74 R26 (RA) = FFFFFFFF.81008000
Halt PS = 00000000.00000303 R27 (PV) = 00000000.00000000
Halt Code = 00000000.00000001 --> "Console Operator Induced Crash"

Halt Registers saved in System HWPCB (by console):
KSP = FFFFFFFF.84899FC0 PTBR = 00000000.0003FFF8
ESP = FFFFFFFF.8489B000 AST{SR/EN} = 00000000.00000000
SSP = FFFFFFFF.84897000 ASN = 00000000.00000000
USP = FFFFFFFF.84897000 FEN = 80000000.00000000

System Registers:

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 2
Crashdump Summary Information:



Page Table Base Register (PTBR) 00000000.0003FFF8
Processor Base Register (PRBR) FFFFFFFF.8154C000
Privileged Context Block Base (PCBB) 00000000.00002180
System Control Block Base (SCBB) 00000000.00000398
Software Interrupt Summary Register (SISR) 00000000.00000000
Address Space Number (ASN) 00000000.00000000
AST Summary / AST Enable (ASTSR_ASTEN) 00000000.00000000
Floating-Point Enable (FEN) 00000000.00000000
Interrupt Priority Level (IPL) 00000000.0000001F
Machine Check Error Summary (MCES) 00000000.00000008
Virtual Page Table Base Register (VPTB) FFFFFEFC.00000000

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 3
Crashdump Summary Information:



Failing Instruction:
SCH$IDLE_C+00214: LDQ R26,#XFEC8(R13)

Instruction Stream (last 20 instructions):
SCH$IDLE_C+001C4: LDQ R26,#XFEC8(R13)
SCH$IDLE_C+001C8: LDL R1,#X001C(R6)
SCH$IDLE_C+001CC: LDL R5,#X0020(R6)
SCH$IDLE_C+001D0: LDA R26,#X2800(R26)
SCH$IDLE_C+001D4: LDL_L R24,(R26)
SCH$IDLE_C+001D8: BIC R24,R5,R24
SCH$IDLE_C+001DC: STL_C R24,(R26)
SCH$IDLE_C+001E0: BEQ R24,#X0001E9
SCH$IDLE_C+001E4: LDQ R26,#XFFC8(R13)
SCH$IDLE_C+001E8: LDQ R27,#XFFD0(R13)
SCH$IDLE_C+001EC: JSR R26,(R26)
SCH$IDLE_C+001F0: BR R31,#XFFFFA8
SCH$IDLE_C+001F4: LDL R23,#X0694(R6)
SCH$IDLE_C+001F8: BNE R23,#X00016D
SCH$IDLE_C+001FC: LDQ R22,#XFEC8(R13)
SCH$IDLE_C+00200: BIS R31,#X03,R0
SCH$IDLE_C+00204: ADDQ R22,#X40,R28
SCH$IDLE_C+00208: S4ADDL R0,R28,R28
SCH$IDLE_C+0020C: LDL R27,(R28)
SCH$IDLE_C+00210: BNE R27,#X00016D
SCH$IDLE_C+00214: LDQ R26,#XFEC8(R13)
SCH$IDLE_C+00218: LDL R17,#X27B0(R26)
SCH$IDLE_C+0021C: BLBS R17,#X000192
SCH$IDLE_C+00220: LDQ R24,#XFEC8(R13)
SCH$IDLE_C+00224: LDL R23,#X16F8(R24)

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 4
Current Registers: Process index: 0000 Process name: NULL PCB: 810D97C8 (CPU 0)



R0 = 00000000.00000003
R1 = FFFFFFFF.8154C000 MP_CPU (CPU Id 0)
R2 = FFFFFFFF.81009790 SCH$GQ_ACTIVE_PRIORITY
R3 = FFFFFFFF.810E11E0 SCH$WAIT_ANY_MODE
R4 = FFFFFFFF.81AC8A00 PCB (Username SYSTEM, Procnam SECURITY_SERVER)
R5 = 00000000.00000001
R6 = FFFFFFFF.8154C000 MP_CPU (CPU Id 0)
R7 = 00000000.00000000
R8 = 00000000.00000000
R9 = 00000000.00000000
R10 = 00000000.00000004
R11 = FFFFFFFF.81AC8A00 PCB (Username SYSTEM, Procnam SECURITY_SERVER)
R12 = 00000000.7BD019C8
R13 = FFFFFFFF.810DC6C8 SCH$IDLE
R14 = 00000000.00000001
R15 = 00000000.00000000
R16 = 00000000.0000064C
R17 = 00000000.00000018
R18 = 00000000.00217380
R19 = 00000000.00000000
R20 = FFFFFFFF.80126DD8 SCH$IDLE_C+00078
R21 = 00000000.0000000F
R22 = FFFFFFFF.81008000 EXE$GR_SYSTEM_DATA_CELLS
R23 = 00000000.00000000
R24 = 00000001.58B3595B
AI = 00000000.00000000
RA = FFFFFFFF.81008000 EXE$GR_SYSTEM_DATA_CELLS
PV = 00000000.00000000
R28 = FFFFFFFF.8100804C SCH$GL_MFYCNT+00008
FP = FFFFFFFF.84899FC0
PC = FFFFFFFF.800EE068 IOC_STD$CREATE_UCB_C+003A8
PS = 38000000.00001F00 Kernel Mode, IPL 31

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 5
Stack Decoder:



System Stack (NULL Process):
Stack Pointer FFFFFFFF.84899FC0
Stack Limits (low) FFFFFFFF.84898000
(high) FFFFFFFF.8489A000

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 6
Bugcheck Stack:



Stack Pointer SP => FFFFFFFF.84899FC0

Stack Frame:
PV FFFFFFFF.84899FC0 FFFFFFFF.810DC6C8 SCH$IDLE
Entry Point FFFFFFFF.80126D60 SCH$IDLE_C
FFFFFFFF.84899FC8 00000000.00000000
FFFFFFFF.84899FD0 FFFFFFFF.81AC8A00
return PC FFFFFFFF.84899FD8 FFFFFFFF.80152ABC SCH$INTERRUPT+00C3C
saved R3 FFFFFFFF.84899FE0 FFFFFFFF.810E11E0 SCH$WAIT_ANY_MODE
saved R13 FFFFFFFF.84899FE8 FFFFFFFF.810E0070 SYS$HIBER
saved R14 FFFFFFFF.84899FF0 00000000.00000001
saved FP FFFFFFFF.84899FF8 00000000.00000000

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 7
System Configuration:



System Information:
System Type AlphaServer ES45 Model 2B Primary CPU ID 00
Cycle Time 0.8 nsec (1250 MHz) Pagesize 8192 Byte

Memory Configuration:
Cluster PFN Start PFN Count Range (MByte) Usage
#00 0 629 0.0 MB - 4.9 MB Console
#01 629 261506 4.9 MB - 2047.9 MB System
#02 262135 9 2047.9 MB - 2048.0 MB Console

Per-CPU Slot Processor Information:
CPU ID 00 CPU State rc,pa,pp,oh,cv,pv,pmv,pl
CPU Type EV68CB Pass 4.0 (21264C)
PAL Code 1.98-43 Halt PC FFFFFFFF.80126F74
CPU Revision .... Halt PS 00000000.00000303
Serial Number JA50500313 Halt Code "Console Operator Induced Crash"
Console Vers V6.9-2 Halt Request "Warm Bootstrap Request"


OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 8
Adapter Configuration:



TR Adapter ADP Hose Bus BusArrayEntry Node CSR Vec/IRQ Port Slot Device Name / HW-Id
-- ----------- ----------------- ---- ----------------------- ---- ---------------------- ---- ---- ---------------------------
1 KA2608 FFFFFFFF.8161C9C0 0 BUSLESS_SYSTEM
2 PCI FFFFFFFF.8161CE40 0 PCI
FFFFFFFF.8161D220 38 FFFFFFFF.84A09800 40 7 ACER 1543 PCI-ISA Bridge
FFFFFFFF.8161D258 40 FFFFFFFF.84A1E000 90 PKA: 8 Symbios 895
FFFFFFFF.8161D290 48 FFFFFFFF.84A20800 A0 9 Memory Channel
FFFFFFFF.8161D2C8 50 FFFFFFFF.84A25000 40 10 00000000.B1548086 (..T?
FFFFFFFF.8161D418 80 FFFFFFFF.84A2E000 38 16 00000000.00000ACE (?)
3 ISA FFFFFFFF.8161D880 0 ISA
FFFFFFFF.8161DB18 0 FFFFFFFF.84A0C000 0 0 EISA_SYSTEM_BOARD
4 XBUS FFFFFFFF.8161DEC0 0 XBUS
FFFFFFFF.8161E118 0 FFFFFFFF.84A0C000 C 0 MOUS
FFFFFFFF.8161E150 1 FFFFFFFF.84A0C000 1 1 KBD
FFFFFFFF.8161E188 2 FFFFFFFF.84A0C000 3 TTA: 2 Serial Port
FFFFFFFF.8161E1C0 3 FFFFFFFF.84A0C000 4 TTB: 3 NS16450 Serial Port
FFFFFFFF.8161E1F8 4 FFFFFFFF.84A0C000 7 LRA: 4 Line Printer (parallel port)
FFFFFFFF.8161E230 5 FFFFFFFF.84A0C000 6 DVA: 5 Floppy
5 MEMCHAN FFFFFFFF.8161EA40 0 PCI
FFFFFFFF.8161EC98 48 FFFFFFFF.84A20800 A0 MCA: 9 Memory Channel
6 PCI FFFFFFFF.8161ED40 0 PCI
FFFFFFFF.8161F078 220 FFFFFFFF.84A28000 70 EIA: 4 DE602-BA i82558 100BaseTX
FFFFFFFF.8161F0B0 228 FFFFFFFF.84A2A800 74 EIB: 5 DE602-BA i82558 100BaseTX
7 PCI FFFFFFFF.8161F780 0 PCI
FFFFFFFF.8161FA18 80 FFFFFFFF.84A2E000 38 DQA: 16 ACER 5229 IDE Controller
FFFFFFFF.8161FA50 81 FFFFFFFF.84A2E000 3C DQB: 16 ACER 5229 IDE Controller
8 PCI FFFFFFFF.8161FAC0 1 PCI
FFFFFFFF.8161FD50 8 FFFFFFFF.84A32800 B0 EWA: 1 DEGXA-TB (Gigabit Ethernet)
FFFFFFFF.8161FD88 10 FFFFFFFF.84A37000 C0 EWB: 2 DEGXA-TB (Gigabit Ethernet)
9 PCI FFFFFFFF.816205C0 2 PCI
FFFFFFFF.81620850 8 FFFFFFFF.84A3E800 40 FGA: 1 KGPSA-** (Emulex LP9802)
FFFFFFFF.81620888 10 FFFFFFFF.84A43000 50 EWC: 2 DEGXA-TB (Gigabit Ethernet)
10 PCI FFFFFFFF.816210C0 3 PCI
FFFFFFFF.81621350 8 FFFFFFFF.84A4A800 D0 GHA: 1 ATI Radeon 7500
FFFFFFFF.81621388 10 FFFFFFFF.84A4F000 E0 FGB: 2 KGPSA-** (Emulex LP9802)

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 9
Paging File Usage (blocks):




Swapfile (Index 1) Device DKA0:
PFL Address FFFFFFFF.81A51EC0 UCB Address FFFFFFFF.8162DD80
Free Blocks 88064 Bitmap FFFFFFFF.81A51F60
Total Size (blocks) 88064 Flags inited,swap_file
Total Write Count 0 Total Read Count 0
Smallest Chunk (pages) 5504 Largest Chunk (pages) 5504
Chunks GEQ 64 Pages 1 Chunks LT 64 Pages 0

Pagefile (Index 254) Device DKA0:
PFL Address FFFFFFFF.81A77A40 UCB Address FFFFFFFF.8162DD80
Free Blocks 4202368 Bitmap FFFFFFFF.7FEE6008
Total Size (blocks) 4202368 Flags inited
Total Write Count 0 Total Read Count 0
Smallest Chunk (pages) 262648 Largest Chunk (pages) 262648
Chunks GEQ 64 Pages 1 Chunks LT 64 Pages 0

Summary: 1 Pagefile and 1 Swapfile installed

Total Size of all Swap Files: 88064 blocks
Total Size of all Paging Files: 4202368 blocks
Total Committed Paging File Usage: 34992 blocks

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 10
Memory Management Statistics:



Pagefaults: Non-Paged Pool:
Total Page Faults 9238 Successful Expansions 0
Total Page Reads 5922 Unsuccessful Expansions 0
I/O's to read Pages 2502 Failed Pages Accumulator 0
Modified Pages Written 0 Total Alloc Requests 9969
I/O's to write Mod Pages 0 Failed Alloc Requests 0
Demand Zero Faults 4425
Global Valid Faults 1021 Paged Pool:
Modified Faults 2643 Total Failures 0
Read Faults 40 Failed Pages Accumulator 0
Execute Faults 764 Total Alloc Requests 1330
Failed Alloc Requests 0

Direct I/O 40962 Cur Mapped Gbl Sections 295
Buffered I/O 4995 Max Mapped Gbl Sections 295
Split I/O 110 Cur Mapped Gbl Pages 3386
Hits 39645 Max Mapped Gbl Pages 3386
Logical Name Transl 22447 Maximum Processes 18
Dead Page Table Scans 0 Sched Zero Pages Created 0

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 11
Memory Management Statistics:



Distributed Lock Manager: Local Incoming Outgoing
$ENQ New Lock Requests 24112 88 5
$ENQ Conversion Requests 2080 54 1
$DEQ Dequeue Requests 23421 34 0
Blocking ASTs 2 0 1
Directory Functions 7947 4562
Deadlock Messages 0 0

$ENQ Requests that Wait 17 Deadlock Searches Performed 0
$ENQ Requests not Queued 1 Deadlocks Found 0

MSCP Statistics: Total IOs 17
Count of VC Failures 1 Split IOs 0
Count of Hosts Served 1 IOs that had to Wait (Buf) 0
Count of Disks Served 1 Requests in MemWait Queue 0
MSCP_BUFFER (SYSGEN) 1024 Max Req ever in MemWait 0
MSCP_CREDITS (SYSGEN) 32

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 12
Memory Management Statistics:



File System Cache: Current SYSGEN Param Hits Misses Hitrate
File Header Cache (ACP_HDRCACHE = 1402) 1434 566 71.7%
Storage Bitmap Cache (ACP_MAPCACHE = 350) 198 45 81.5%
Directory Data Cache (ACP_DIRCACHE = 1402) 3757 60 98.4%
Directory LRU (ACP_DINDXCACHE= 350) 2354 26 98.9%
FID Cache (ACP_FIDCACHE = 64) 8 4 66.7%
Extent Cache (ACP_EXTCACHE = 64) 43 10 81.1%
Quota Cache (ACP_QUOCACHE = 703) 0 0 0.0%

Volume Synch Locks 8034 Window Turns 23
Volume Synch Locks Wait 0 Currently Open Files 198
Dir/File Synch Locks 10508 Total Count of OPENs 742
Dir/file Synch Locks Wait 11 Total Count of ERASE QIOs 38
Access Locks 3098
Free Space Cache Wait 1

Global Pagefile Quota 196297 GBLPAGFIL (SYSGEN) Limit 196608

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 13
Process DCL Recall Buffer:



%CLUE-W-NOACC, either no CLI or process area paged out

OpenVMS (TM) Operating System, Version V7.3-2 -- System Dump Analysis 22-AUG-2005 15:34:38.48 Page 14
Active XQP Processes:



Extended Indx Process name Username PCB XQP Threads Outstanding
--PID--- ---- --------------- ------------ -------- --active--- -BIO---DIO-
%CLUE-I-NOACTIVE, there are no active XQP processes

Volker Halle
Honored Contributor

Re: System hang

Olive,

the CLUE output generally does not help in case of forced crashes. Pool and page/swap look o.k.
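
As a rough cross-check of that reading, here is a minimal Python sketch. The numbers are copied from the Paging File Usage and Memory Management Statistics sections of the listing above; the variable and function names are made up for illustration. It only confirms that neither file has any blocks in use and that no pool allocation failed.

```python
# Figures copied from the CLUE listing above (paging file and pool sections).
files = {
    "swapfile": {"total_blocks": 88064, "free_blocks": 88064},
    "pagefile": {"total_blocks": 4202368, "free_blocks": 4202368},
}
pool = {
    "nonpaged_failed_alloc_requests": 0,
    "paged_failed_alloc_requests": 0,
}

def used_blocks(entry):
    """Blocks in use = total size minus free blocks."""
    return entry["total_blocks"] - entry["free_blocks"]

usage = {name: used_blocks(entry) for name, entry in files.items()}
pool_ok = all(count == 0 for count in pool.values())

for name, used in usage.items():
    print(f"{name}: {used} blocks in use")
print("pool allocation failures:", "none" if pool_ok else "present")
```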

Your STARTUP process is supposed to hang on the READ from OPA0:

$ ANAL/CRASH sys$system
SDA> SHOW SUMM
SDA> SET PROC/IND=xxx ! index (Indx column) of the hanging process
SDA> SHOW PROC/CHAN

This should show a busy channel on OPA0:

Volker.
olive_wide
Frequent Advisor

Re: System hang

Volker,
Thank you
But I don't understand what you mean :)
Please tell me what I should do next. This is a really serious problem for me :(
olive
Volker Halle
Honored Contributor

Re: System hang

Olive,

please issue the commands I posted in the previous reply. In the SHOW SUMM output you should see a process named STARTUP; take the process index from column 2 (Indx) of that line and enter it here:

SDA> SET PROC/IND=xxxx

Then please post the output of the SHOW PROC/CHAN command.
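
If the SHOW SUMM listing is long, the lookup can also be done mechanically. This is a minimal Python sketch, assuming the listing was saved as text and that each process line starts with the extended PID, the index and the process name. The sample lines are invented for illustration and are not taken from this crash dump; `find_index` is a hypothetical helper name.

```python
import re

# Hypothetical excerpt of SDA SHOW SUMMARY output (NOT from the real dump).
SAMPLE = """\
Extended Indx Process name    Username    State   Pri   PCB/KTB    PHD    Wkset
-- PID -- ---- --------------- ----------- ------- --- -------- -------- ------
20200401 0001 SWAPPER         SYSTEM      HIB      16 810DA480 810DA000      0
20200407 0007 STARTUP         SYSTEM      LEF       4 81AC1200 84A60000    312
"""

# A process line: 8 hex digits (PID), 4 hex digits (Indx), then the name.
LINE = re.compile(r"^([0-9A-F]{8})\s+([0-9A-F]{4})\s+(\S+)")

def find_index(listing, process_name):
    """Return the Indx column of the first matching process line, else None.

    Only handles process names without embedded spaces, such as STARTUP.
    """
    for line in listing.splitlines():
        match = LINE.match(line)
        if match and match.group(3) == process_name:
            return match.group(2)
    return None

indx = find_index(SAMPLE, "STARTUP")
print(f"SDA> SET PROC/IND={indx}")
```

The value printed is the one to use in the SET PROC/IND command before issuing SHOW PROC/CHAN.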

Volker.

PS: Please be aware that analysing a system hang through this kind of forum may be a long step-by-step process, and it may not lead to a result! If possible, try to get local consulting from someone experienced in OpenVMS crash analysis.
Willem Grooters
Honored Contributor

Re: System hang

Sorry, I overlooked that you had specified the data already...
Have you installed the graphics patches? IIRC, there have been issues with this card.
Do you have problems when booting from disk? If not, there might be a problem with the code on the CD (which obviously lacks the patches :-))
If you have a spare disk (or can create a small one on your storage, if applicable), copy the CD onto it and boot the system standalone from that disk; you may then be able to apply the required graphics patches to that system, and booting is likely to be much faster.
It may help.
Willem Grooters
OpenVMS Developer & System Manager
Willem Grooters
Honored Contributor

Re: System hang

additional data, from the Graphics 0400 patchkit:

To provide maximum performance, OpenVMS can set certain
RADEON 7500 parameters to optimize bus usage. However,
after extensive testing in multiple configurations, it
was determined that there are some configurations in
which the RADEON 7500 may interact with other devices on
the same bus in such a way as to cause the system to hang
or crash. Therefore, the default settings for the
DECW$RADEON_FAST_DMA environmental option have been
changed as follows:

Just for your information.

Willem
Willem Grooters
OpenVMS Developer & System Manager