HPE 9000 and HPE e3000 Servers

SCSI Error
Regular Advisor

Help Server Panic

Hi,

I would like to ask what the problem with the server is, based on the crash dumps below.

The server just rebooted due to a panic.

q4> load mpinfo_t from mpproc_info max nmpinfo
loaded 8 mpinfo_ts as an array (stopped by max count)
q4> trace pile
processor 0 claims to be idle
stack trace for event 6
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff408cc0
can not get struct frame_marker at 0x0.0xff`ff408aa0
idle+0xa74

processor 1 claims to be idle
stack trace for event 7
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40acc0
can not get struct frame_marker at 0x0.0xff`ff40aaa0
idle+0xa74

processor 2 claims to be idle
stack trace for event 4
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40ccc0
can not get struct frame_marker at 0x0.0xff`ff40cbf0
sd_strategy+0x3c

processor 3 claims to be idle
stack trace for event 5
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40ed10
can not get struct frame_marker at 0x0.0xff`ff40eaf0
idle+0xa58

processor 4 claims to be idle
stack trace for event 1
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff410cc0
can not get struct frame_marker at 0x0.0xff`ff410aa0
idle+0xb08

processor 5 claims to be idle
stack trace for event 2
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff412cc0
can not get struct frame_marker at 0x0.0xff`ff412aa0
idle+0xa7c

processor 6 was running process at 0x0`667a6040 (pid 2292), thread at 0x0`667a7040 (tid 2395)
stack trace for event 0
crash event was an HPMC
PC (0xad9cc00.0x0`00003368) was in user space

processor 7 was running process at 0x0`667a2040 (pid 2201), thread at 0x0`667a3040 (tid 2303)
stack trace for event 3
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff416cc0
can not get struct frame_marker at 0x0.0xff`ff416bd0
biowait+0x178
q4> q
no such symbol "q"
q4> quit
# ls
.q4_hist INDEX image.1.1 image.1.2 image.1.3.gz image.2.1.gz image.2.2 image.2.3.gz image.3.1.gz vmunix
# cd ..
# ls -lrt
total 165904
drwxr-xr-x 2 root root 8192 Sep 19 15:09 crash.40
drwxr-xr-x 2 root root 8192 Sep 19 16:25 crash.42
drwxr-xr-x 2 root root 8192 Sep 20 12:24 crash.43
drwxr-xr-x 2 root root 8192 Sep 20 15:59 crash.41
drwxr-xr-x 2 root root 8192 Oct 6 01:42 crash.44
drwxr-xr-x 2 root root 8192 Oct 6 09:23 crash.45
drwxr-xr-x 2 root root 8192 Oct 31 09:33 crash.46
-rwxr-xr-x 1 root root 2 Oct 31 09:47 bounds
-rw-r--r-- 1 root sys 84858880 Oct 31 09:52 crash.tar
drwxr-xr-x 2 root root 8192 Oct 31 10:05 crash.47
# cd crash.46
# ls -lrt
total 193856
-rw-r--r-- 1 root root 7981970 Oct 31 09:31 vmunix.gz
-rw-r--r-- 1 root root 8698226 Oct 31 09:31 image.1.1.gz
-rw-r--r-- 1 root root 4725776 Oct 31 09:31 image.1.2.gz
-rw-r--r-- 1 root root 5808752 Oct 31 09:32 image.1.3.gz
-rw-r--r-- 1 root root 648052 Oct 31 09:32 image.1.4.gz
-rw-r--r-- 1 root root 21979830 Oct 31 09:32 image.2.1.gz
-rw-r--r-- 1 root root 43883540 Oct 31 09:33 image.2.2.gz
-rw-r--r-- 1 root root 5464487 Oct 31 09:33 image.2.3.gz
-rw-r--r-- 1 root root 106 Oct 31 09:33 image.3.1.gz
-rw-r--r-- 1 root root 1408 Oct 31 09:33 INDEX
# gunzip vmunix.gz
# cat I
# more
# cat INDEX
comment savecrash crash dump INDEX file
version 2
hostname
modelname 9000/800/N4000-55
panic , isr.ior = 0'182406ab.40000000'a9862448
dumptime 1193782820 Wed Oct 31 09:20:20 EDT 2007
savetime 1193783477 Wed Oct 31 09:31:17 EDT 2007
release @(#) $Revision: vmunix: vw: -proj selectors: CUPI80_BL2000_1108 -c 'Vw for CUPI80_BL2000_1108 build' -- cupi80_bl 2000_1108 'CUPI80_BL2000_1108' Wed Nov 8 19:24:56 PST 2000 $
memsize 8589934592
chunksize 134217728
module /stand/vmunix vmunix 19000208 2066622737
image image.1.1 0x0000000000000000 0x0000000007ff4000 0x0000000000000000 0x0000000000046a4f 3530883753
image image.1.2 0x0000000000000000 0x0000000007ff8000 0x0000000000046a50 0x000000000004ea47 1459972055
image image.1.3 0x0000000000000000 0x0000000007ff5000 0x000000000004ea48 0x0000000000078ec7 2262858736
image image.1.4 0x0000000000000000 0x000000000037d000 0x0000000000078ec8 0x000000000007ffff 3554446994
image image.2.1 0x0000000000000000 0x0000000007feb000 0x0000000000100000 0x000000000018fe07 3919034855
image image.2.2 0x0000000000000000 0x0000000007ffc000 0x000000000018fe08 0x00000000001a7b57 38330710
image image.2.3 0x0000000000000000 0x000000000078c000 0x00000000001a7b58 0x00000000001fffff 3241226112
image image.3.1 0x0000000000000000 0x0000000000010000 0x0000000000280000 0x00000000002fffff 4215202376
# q4pxdb vmunix
q4pxdb64: vmunix is already preprocessed
PXDB aborted.
# q4 .
@(#) q4 $Revision: 11.X B.11.23d Thu May 6 18:05:11 PST 2003$ 0
Reading kernel symbols ...
Reading data types ...
Initialized PA-RISC 2.0 address translator ...
Initializing stack tracer ...
script /usr/contrib/Q4/lib/q4lib/sample.q4rc.pl
executable /usr/contrib/Q4/bin/perl
version 5.006001
SCRIPT_LIBRARY = /usr/contrib/Q4/lib/q4lib_11.11
perl will try to access scripts from directory
/usr/contrib/Q4/lib/q4lib_11.11

loaded spinlock_depth.pl
run Print_ProcState
q4: (warning) No loadable modules were found
q4: (warning) No loadable modules were found
Try to load dumpdev_t to find dump space configured
Total Dump space CONFIGURED: 4194304 KB 4096 MB
Try to load dumpimage_t to find dump space actually used
Total Dump space USED: 666492 KB 650 MB
Okay, Dump space appears to be sufficient : about 3445 MB extra
q4> examine panicstr using s
, isr.ior = 0'182406ab.40000000'a9862448
q4> runningprocs
010 8 0x8
q4> load mpinfo_t from mpproc_info max nmpinfo
loaded 8 mpinfo_ts as an array (stopped by max count)
q4> trace pile
processor 0 was running process at 0x0`65cff040 (pid 4410), thread at 0x0`65cf6040 (tid 4876)
stack trace for event 5
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff408cc0
can not get struct frame_marker at 0x0.0xff`ff408c30
vx_irwlock2+0x50

processor 1 claims to be idle
stack trace for event 4
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40acc0
can not get struct frame_marker at 0x0.0xff`ff40aaa0
idle+0xb08

processor 2 claims to be idle
stack trace for event 6
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40ccc0
can not get struct frame_marker at 0x0.0xff`ff40caa0
idle+0xb08

processor 3 claims to be idle
stack trace for event 7
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff40ecc0
can not get struct frame_marker at 0x0.0xff`ff40eaa0
idle+0xb08

processor 4 claims to be idle
stack trace for event 1
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff410cc0
can not get struct frame_marker at 0x0.0xff`ff410aa0
idle+0xa74

processor 5 was running process at 0x0`65cba040 (pid 4407), thread at 0x0`65cbb040 (tid 4873)
stack trace for event 0
crash event was an HPMC
PC (0x6a82000.0x0`000027e0) was in user space

processor 6 was running process at 0x0`6594e040 (pid 4408), thread at 0x0`6594f040 (tid 4874)
stack trace for event 2
crash event was a TOC
PC (0x6511800.0x0`c00294a4) was in user space

processor 7 was running process at 0x0`65925040 (pid 4409), thread at 0x0`65926040 (tid 4875)
stack trace for event 3
crash event was a TOC
can not get struct frame_marker at 0x0.0xff`ff416cc0
can not get struct frame_marker at 0x0.0xff`ff416cc0
pgcopy_pcxu_loop+0x34


Thanks
whiteknight
Honored Contributor

Re: Help Server Panic

Hi

The event is a TOC. May I know if you have Serviceguard running? Please check your OLDsyslog.log.

my 2 cents

WK
Problem never ends, you must know how to fix it
SCSI Error
Regular Advisor

Re: Help Server Panic

Hi,

There's no Serviceguard running on the system. It's a standalone system running an Oracle database.
melvyn burnard
Honored Contributor

Re: Help Server Panic

This has NOTHING to do with Serviceguard.

Looks to me like you had an HPMC:
processor 6 was running process at 0x0`667a6040 (pid 2292), thread at 0x0`667a7040 (tid 2395)
stack trace for event 0
crash event was an HPMC

Log a call with your local HP Response Centre.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
SCSI Error
Regular Advisor

Re: Help Server Panic

Do you think processors 5 and 6 are defective?
Dennis Handly
Acclaimed Contributor

Re: Help Server Panic

>Do you think processors 5 and 6 are defective?

Is this the same event?
The first says 6 has HPMC and 5 has TOC.
And the second part says the opposite.
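
To check which processor took the HPMC in each dump without reading the traces by eye, something like the rough Python sketch below could help. It is not a q4 feature. It only scans saved `trace pile` text for the "processor N" and "crash event was ..." lines quoted in this thread. The function names and the two shortened excerpts are made up for illustration, and it assumes the output was pasted into a string or read from a file.

```python
import re

# Line shapes taken from the `trace pile` output quoted in this thread.
PROC_RE = re.compile(r"^processor (\d+) ")
EVENT_RE = re.compile(r"^crash event was an? (\w+)")


def crash_events(trace_text):
    """Return {processor number: crash event type} found in trace text."""
    events = {}
    current = None
    for raw in trace_text.splitlines():
        line = raw.strip()
        proc = PROC_RE.match(line)
        if proc:
            current = int(proc.group(1))
            continue
        event = EVENT_RE.match(line)
        if event and current is not None:
            events[current] = event.group(1)
    return events


def hpmc_processors(trace_text):
    """Return the sorted processor numbers whose crash event was an HPMC."""
    return sorted(p for p, e in crash_events(trace_text).items() if e == "HPMC")


# Shortened excerpts of the two traces posted above (hypothetical variable names).
FIRST_DUMP = """
processor 5 claims to be idle
stack trace for event 2
crash event was a TOC
processor 6 was running process at 0x0`667a6040 (pid 2292), thread at 0x0`667a7040 (tid 2395)
stack trace for event 0
crash event was an HPMC
"""

SECOND_DUMP = """
processor 5 was running process at 0x0`65cba040 (pid 4407), thread at 0x0`65cbb040 (tid 4873)
stack trace for event 0
crash event was an HPMC
processor 6 was running process at 0x0`6594e040 (pid 4408), thread at 0x0`6594f040 (tid 4874)
stack trace for event 2
crash event was a TOC
"""

if __name__ == "__main__":
    print("first dump HPMC on:", hpmc_processors(FIRST_DUMP))
    print("second dump HPMC on:", hpmc_processors(SECOND_DUMP))
```

On these excerpts it reports processor 6 for the first trace and processor 5 for the second, which is the mismatch pointed out above. It says nothing about whether either CPU is actually faulty; that still needs the HPMC data looked at by the Response Centre.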