12-21-2002 12:58 PM
repeated kernel panic "Trap Type 15 (Data page fault)"
12-21-2002 01:17 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Reboot your server and get to the PDC. Check the PIM output (processor internal memory) for a valid timestamp and logged error codes, and check the MDT (memory deallocation table) for any logged memory module errors. If you find anything there, call HP.
If not, try patching your system with the latest GR first and then the latest HW enablement from the SAME support CD (so that they have the same release date). Also check whether the installed FC driver is the latest (you have B.11.11.06).
You may also use STM (Support Tools Manager) to check the CPUs and memory and view its logs.
Eugeny
12-21-2002 04:15 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Bill Hassell, sysadmin
12-21-2002 04:39 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
There is another patch that may not be related but that you should have anyway, because its absence caused kernel panics for me when reading large files from Ultrium tape drives.
PHKL_27753
It was not in the last Certified Bundle I installed which was July or September.
If you have a support contract, they'll always be happy to analyze q4 output and give you precise answers. No matter how smart(assed) I think I am, I'd never want to run systems without a support contract. It's the best money you can ever invest. See my thread on great support stories.
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
12-23-2002 11:52 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
and Hardware Enablement patch
bundles, as well as all the NFS
critical patches.
It still keeps happening.
Any ideas?
12-23-2002 12:17 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Eugeny
12-23-2002 12:27 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
http://www1.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000062903332
HTH
Marty
12-23-2002 12:38 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Wow.
All the NFS stuff too?
Check dmesg; certain hardware faults can trigger a panic. I had a disk trigger what looked like I/O software panics a while back.
Also, get support's dump team to read the q4 dump. Those guys are wizards.
Steve
12-23-2002 01:31 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
There are no memory errors in the
tombstones file.
I searched the individual patches for
"data page fault" and have applied
a few more.
I had previously applied PHKL_27753.
I believe I have a later version of the
cumulative streams patch.
Here is the total list of the individual
patches I have applied over the last
few days:
# BUNDLE B.11.00 Patch Bundle
BUNDLE.PHKL_23225 1.0 Fix for dqput() data page fault panic
BUNDLE.PHCO_24777 1.0 mountall cumulative patch.
BUNDLE.PHNE_28089 1.0 cumulative ARPA Transport patch
BUNDLE.PHNE_27703 1.0 Cumulative STREAMS Patch
BUNDLE.PHNE_27218 1.0 ONC/NFS General Release/Performance Patch
BUNDLE.PHNE_25388 1.0 LAN product cumulative patch
BUNDLE.PHNE_24403 1.0 HP-PB 100Base-T.
BUNDLE.PHKL_28267 1.0 thread perf, user limit, cumulative VM
BUNDLE.PHKL_28096 1.0 SCSI IO Cumulative Patch
BUNDLE.PHKL_27830 1.0 VxFS cumulative;VxFS 3-way deadlock;sendfile
BUNDLE.PHKL_27825 1.0 Cumulative VM,Psets,Preemption,PRM
BUNDLE.PHKL_27766 1.0 early boot,Psets,vPar,Xserver,T600 HPMC KRNG
BUNDLE.PHKL_27753 1.0 audit subsystem cumulative patch
BUNDLE.PHKL_27751 1.0 Fibre Channel Mass Storage Patch
BUNDLE.PHKL_27682 1.0 diag0 cumulative patch.
BUNDLE.PHKL_27317 1.0 detach; NOSTOP, Abort; Psets; slpq1 perf
BUNDLE.PHKL_27304 1.0 SCSI Tape (stape) cumulative
BUNDLE.PHKL_27179 1.0 Corrected reference to thread register state
BUNDLE.PHKL_27172 1.0 vPars panic; Syscall cumulative
BUNDLE.PHKL_27152 1.0 I/O Cumulative, PA 8700 2.2, vPar, PCI-X
BUNDLE.PHKL_27096 1.0 VxVM,EMC,Psets&vPar,slpq1,earlyKRS
BUNDLE.PHKL_27094 1.0 Psets Enablement Patch, slpq1 perf
BUNDLE.PHKL_27091 1.0 Core PM, vPar, Psets Cumulative, slpq1 perf
BUNDLE.PHKL_27025 1.0 SCSI Ultra160 Driver with OLAR support
BUNDLE.PHKL_26698 1.0 umount-mkfs panic; HFS mount/umount perf
BUNDLE.PHKL_26032 1.0 New audio h/w support + cumulative fixes
BUNDLE.PHKL_25729 1.0 signals,threads enhancement,Psets Enablement
BUNDLE.PHKL_25602 1.0 Fix panic in ccio_alloc_shared_mem
BUNDLE.PHKL_25233 1.0 select(2) and poll(2) hang
BUNDLE.PHKL_24507 1.0 fix for data page fault in pstat_getstream()
BUNDLE.PHKL_24343 1.0 Data Page Fault panic in DNLC
BUNDLE.PHKL_23957 1.0 Boot panic (w/Fiber Ch. & Gig. Ethernet) fix
I just added these:
BUNDLE.PHNE_28089 1.0 cumulative ARPA Transport patch
BUNDLE.PHKL_28267 1.0 thread perf, user limit, cumulative VM
Is there anything left to try before
sending the dump to support?
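With a list this long it is easy to lose track of which patches are already on the system. As a rough sketch only, the listing can be parsed and queried by patch ID. The sample lines are copied from the list above; the function names and the regular expression are assumptions made for illustration, not part of any HP tool.

```python
import re

# A few lines copied from the listing above; paste the full list to check all of it.
LISTING = """\
BUNDLE.PHNE_28089 1.0 cumulative ARPA Transport patch
BUNDLE.PHNE_27218 1.0 ONC/NFS General Release/Performance Patch
BUNDLE.PHKL_28267 1.0 thread perf, user limit, cumulative VM
BUNDLE.PHKL_27753 1.0 audit subsystem cumulative patch
BUNDLE.PHNE_28089 1.0 cumulative ARPA Transport patch
"""

# Assumed line shape: "BUNDLE.<patch id> <revision> <description>".
PATCH_LINE = re.compile(r"^BUNDLE\.(PH[A-Z]{2}_\d+)\s+(\S+)\s+(.*)$")


def parse_patches(listing):
    """Map each patch ID to its description, keeping the first occurrence."""
    patches = {}
    for line in listing.splitlines():
        match = PATCH_LINE.match(line.strip())
        if match:
            patch_id, _revision, description = match.groups()
            patches.setdefault(patch_id, description)
    return patches


def is_installed(patches, patch_id):
    """True if the given patch ID appears in the parsed listing."""
    return patch_id in patches


if __name__ == "__main__":
    patches = parse_patches(LISTING)
    for wanted in ("PHKL_27753", "PHNE_27218"):
        print(wanted, "listed" if is_installed(patches, wanted) else "not listed")
```

This only reports what the listing says; it does not check whether a patch has been superseded or is actually configured.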
12-23-2002 02:17 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
If you have quite a few dumps you can narrow it down a bit. Using q4, trace the crash event in about three of the dumps, e.g.:
q4> trace event 0
If the stack traces are all different then you are probably looking at hardware. If you run the following q4 commands on each of the dumps you may see a pattern on a particular processor:
q4> load mpinfo_t from mpproc_info max nmpinfo
q4> trace pile
processor 0 was running process at 0x580f00 (pid 794)
stack trace for event 1
crash event was a panic
.....
.....
processor 1 claims to be idle
stack trace for event 2
crash event was a TOC
The panic on one processor will send a TOC to the others. If you see a pattern, then most likely one of the registers on that processor is on the way out. If not, log a call and we can examine the dump in greater detail.
Regards,
James.
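The comparison described above (same stack trace in every dump versus a different one each time) can be sketched in a few lines. This is an illustration under assumptions: it expects the q4 trace output to be saved as text with one `symbol+0xoffset` frame per line, compares function names only and ignores offsets, and the helper names are made up. The first two samples are abbreviated from a trace posted later in this thread; the third is hypothetical.

```python
def frames(trace_text):
    """Return the function names from q4-style 'symbol+0xoffset' lines."""
    names = []
    for line in trace_text.splitlines():
        line = line.strip()
        if "+0x" in line:
            names.append(line.split("+0x", 1)[0])
    return names


def same_crash_path(traces):
    """True if every trace shows the same sequence of function names."""
    paths = {tuple(frames(text)) for text in traces}
    return len(paths) == 1


# Abbreviated sample traces, one per dump.
dump_a = "panic+0x6c\ntrap+0xedc\nbcopy_pcxu_method+0x4\nxdrmblk_getbytes+0x5c"
dump_b = "panic+0x6c\ntrap+0xedc\nbcopy_pcxu_method+0x4\nxdrmblk_getbytes+0x5c"
# Hypothetical trace that ends somewhere else entirely.
dump_c = "panic+0x6c\ntrap+0xedc\nsome_other_function+0x10"

if __name__ == "__main__":
    print("a and b share a path:", same_crash_path([dump_a, dump_b]))
    print("a and c share a path:", same_crash_path([dump_a, dump_c]))
```

Identical paths across dumps point toward software; paths that differ every time are more consistent with failing hardware, as noted above.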
12-23-2002 03:31 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
They are always at the same point:
q4> trace event 0
stack trace for event 0
crash event was a panic
panic+0x6c
report_trap_or_int_and_panic+0x94
trap+0xedc
nokgdb+0x8
bcopy_pcxu_method+0x4
xdrmblk_getbytes+0x5c
xdr_opaque+0x78
xdr_bytes+0xd4
xdr_READ3resok+0x74
xdr_READ3res+0x38
xdr_replymsg+0xd4
clnt_clts_kcallit_addr+0x570
clnt_clts_kcallit+0x28
rfscall+0x27c
rfs3call+0x78
nfs3read+0x100
nfs3_do_bio+0x108
async_daemon+0x4ec
coerce_scall_args+0xe0
syscall+0x204
$syscallrtn+0x0
Maybe this is just a bug in NFS version 3.
Does it look that way to you?
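To make the path through the kernel easier to see, the frames in the trace above can be grouped by name prefix. This is a rough sketch: the prefix-to-layer mapping is an assumption based on common naming (xdr for XDR decoding, clnt_ for the RPC client, rfs and nfs3 for the NFS client), not something taken from HP documentation.

```python
from collections import Counter

# The trace exactly as posted above.
TRACE = """\
panic+0x6c
report_trap_or_int_and_panic+0x94
trap+0xedc
nokgdb+0x8
bcopy_pcxu_method+0x4
xdrmblk_getbytes+0x5c
xdr_opaque+0x78
xdr_bytes+0xd4
xdr_READ3resok+0x74
xdr_READ3res+0x38
xdr_replymsg+0xd4
clnt_clts_kcallit_addr+0x570
clnt_clts_kcallit+0x28
rfscall+0x27c
rfs3call+0x78
nfs3read+0x100
nfs3_do_bio+0x108
async_daemon+0x4ec
coerce_scall_args+0xe0
syscall+0x204
$syscallrtn+0x0
"""

# Assumed mapping from symbol prefix to layer, for illustration only.
LAYERS = [
    ("xdr", "XDR decode"),
    ("clnt_", "RPC client"),
    ("rfs", "NFS client call"),
    ("nfs3", "NFS v3 client"),
]


def layer_of(symbol):
    """Return the layer label for a symbol, or 'other' if no prefix matches."""
    for prefix, layer in LAYERS:
        if symbol.startswith(prefix):
            return layer
    return "other"


def summarize(trace_text):
    """Count how many frames fall into each assumed layer."""
    symbols = [
        line.strip().split("+0x", 1)[0]
        for line in trace_text.splitlines()
        if "+0x" in line
    ]
    return Counter(layer_of(symbol) for symbol in symbols)


if __name__ == "__main__":
    for layer, count in summarize(TRACE).most_common():
        print(f"{layer}: {count}")
```

The grouping shows the panic occurring while an NFS version 3 read reply was being decoded, which is the observation that prompted the question above.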
12-24-2002 12:00 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
You may have a corner case, or have found a new bug -}
12-24-2002 02:02 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
I found an almost exact match for your panic; however, a successor to the patch that fixed it is already on your system, so you should probably get the dump in to confirm.
Just one final question though....what kind of server(s) are you connecting with via NFS?
regards,
James.
12-24-2002 09:08 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
What patch was the one
which looked closest to the
symptoms?
12-24-2002 09:21 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Patch PHNE_23502 was the one that fixed the previous issue and the latest successor is:
PHNE_27218 1.0 ONC/NFS General Release/Performance Patch
which you already have installed. I imagine the full trace was left out of the patch text because it is so large; however, it was the same stack trace. See JAGad35150 in the patch text for a description.
If you are connecting to a Linux system it will probably be a new issue so I imagine the labs will be keen to see the dump.
regards,
James.
12-24-2002 12:37 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Bill Hassell, sysadmin
12-26-2002 08:46 AM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
What is happening now is
a lockup in the processes
which read from NFS. They
are permanently sleeping.
I also saw some of this
happen with version 3, but
was more concerned with the
panics.
How can I diagnose these
processes which are hung
(apparently in NFS reads)?
Please help. These NFS
problems are real show-stoppers.
Thanks,
David
12-26-2002 01:22 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
12-26-2002 01:57 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
I see what you mean. Sorry to ask so many questions when you are trying to solve this one, but....
Have you tried using UDP?
Are there any errors in the message buffer/syslog?
Does this only happen when the client/server is Linux?
You said this also happened with version 3 - were the symptoms as severe?
I suppose you could try tracing the processes in q4 too....
# ied q4 /stand/vmunix /dev/kmem
q4> load proc_t from proc_list next p_factp max nproc
q4> keep p_pid ==
q4> trace pile
Also, I'm sure you know about this document on NFS performance tuning, but here is the link anyway.
http://www.docs.hp.com/hpux/onlinedocs/1435/NFSPerformanceTuninginHP-UX11.0and11iSystems.pdf
regards,
James.
12-26-2002 02:36 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Here is what I know at this time.
The problem may not be due to
NFS, but rather to TCP. The TCP
connection which appears to be
hung, is between two processes
which send backup data to tape. It
is not an NFS-TCP connection.
I've tried using UDP for the NFS
connections using version 3, but
that still took the panic, so I changed
the version back to 2. All the open
NFS files on the system now are
accessed by UDP connections.
There are no error messages in
the syslog. This is so strange.
Here is what I get from the q4
analysis of both the 'read' and the
'write' processes:
q4> keep p_pid == 3549
kept 1 of 181 proc_t's, discarded 180
q4> trace pile
stack trace for process at 0x0`4c2b5040 (pid 3549), thread at 0x0`4c2b6040 (tid 3718)
process was not running on any processor
_swtch+0xc4
_sleep+0x4e0
read_sleep+0x1b8
hpstreams_read_int+0x1e8
streams_read_uio+0x28
soreceive+0x3ec
soo_rw+0x40
read+0x10c
syscall+0x204
$syscallrtn+0x0
q4> load proc_t from proc_list next p_factp max nproc
loaded 180 proc_ts as a linked list (stopped by null pointer)
q4> keep p_pid == 3550
kept 1 of 180 proc_t's, discarded 179
q4> trace pile
stack trace for process at 0x0`5022f040 (pid 3550), thread at 0x0`4c2cb040 (tid 3719)
process was not running on any processor
_swtch+0xc4
_sleep+0x4e0
write_sleep+0x184
streams_write_uio+0x3bc
sosend+0x4d4
soo_rw+0x80
write+0x108
syscall+0x204
$syscallrtn+0x0
Is there anything else I can gather?
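The two traces above can be summarized mechanically: whether each process is asleep, which direction of the socket it is blocked in, and whether any NFS frame appears. This is a sketch only. The symbol names come from the traces as posted; the `describe` helper and the prefixes used to recognize NFS frames are assumptions for illustration.

```python
# Traces exactly as posted above for pid 3549 (reader) and pid 3550 (writer).
READER = """\
_swtch+0xc4
_sleep+0x4e0
read_sleep+0x1b8
hpstreams_read_int+0x1e8
streams_read_uio+0x28
soreceive+0x3ec
soo_rw+0x40
read+0x10c
syscall+0x204
$syscallrtn+0x0
"""

WRITER = """\
_swtch+0xc4
_sleep+0x4e0
write_sleep+0x184
streams_write_uio+0x3bc
sosend+0x4d4
soo_rw+0x80
write+0x108
syscall+0x204
$syscallrtn+0x0
"""


def symbols(trace_text):
    """Return the function names from 'symbol+0xoffset' lines."""
    return [
        line.strip().split("+0x", 1)[0]
        for line in trace_text.splitlines()
        if "+0x" in line
    ]


def describe(trace_text):
    """Summarize one trace: asleep or not, socket direction, NFS involvement."""
    names = symbols(trace_text)
    if "soreceive" in names:
        direction = "socket receive"
    elif "sosend" in names:
        direction = "socket send"
    else:
        direction = "unknown"
    return {
        "asleep": "_sleep" in names,
        "direction": direction,
        # Assumed prefixes for NFS-related frames.
        "touches_nfs": any(n.startswith(("nfs", "rfs", "xdr")) for n in names),
    }


if __name__ == "__main__":
    print("reader:", describe(READER))
    print("writer:", describe(WRITER))
```

Both processes come out as asleep in the socket layer with no NFS frames on the stack, which is consistent with the suspicion above that this hang involves the TCP connection rather than NFS.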
12-26-2002 03:07 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Not sure if I can be much help; this is not exactly my area of expertise! I suppose we should look at the streams subsystem for any errors though....
In the directory /var/adm/streams there is an error log named in the form error.dd.mm; are there any errors reported? Also see the streams binaries in /usr/bin:
# ll /usr/bin/str*
-r-xr-xr-x 1 bin bin 16384 Nov 14 2000 /usr/bin/strace
-r-xr-xr-x 1 bin bin 20480 Nov 14 2000 /usr/bin/strchg
-r-xr-xr-x 1 bin bin 16384 Nov 14 2000 /usr/bin/strclean
-r-xr-xr-x 1 bin bin 16384 Nov 14 2000 /usr/bin/strconf
-r-xr-xr-x 1 bin sys 118784 Nov 14 2000 /usr/bin/strdb
-r-xr-xr-x 1 bin bin 16384 Nov 14 2000 /usr/bin/strerr
-r-xr-xr-x 1 bin bin 20480 Nov 14 2000 /usr/bin/strvf
strvf verifies the streams installation; the others have manpages that describe their use.
Although I'm not sure this is the issue, I'm reading some streams internals notes and will see what I can find.
Regards,
James.
12-26-2002 03:49 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
it!
There are no error messages in
the streams log directory.
With netstat I can see some "packet"
statistics occasionally increase on the
loopback interface (lo0), but I have
not found any way to trace contents
of the packets. Tcpdump complains
"no such device /dev/lo0" when I
try to trace them.
I think you are on the right track with
looking at the streams queues.
If we can also see the tcp packets
on lo0, then perhaps we can narrow
it down a bit...
12-26-2002 04:02 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Attached are some tools written by HP labs. Hopefully I won't be in trouble for posting these! :) Could you run each one and attach the output?
The three tools are:
strshow
tcpipstreams
crashinfo
The first two should be obvious; the third gives a general view of the system. It is normally used on dumps (you can run it on your dumps by changing directory to the dump dir and executing it). These are normally put in /usr/contrib/bin and can be run without any options.
Regards,
James.
12-26-2002 04:08 PM
Re: repeated kernel panic "Trap Type 15 (Data page fault)"
Just realised you can't attach binaries.....personal email is james@carrera-blue.com if you want them.
Cheers,
James.