
Ignite core dump problem

 
Fabian Briseño
Esteemed Contributor

Hello guys, I'm getting this message when running an Ignite backup with make_net_recovery:



/dev/vg25/lvsapdata65 /oracle/SMP/sapdata65 0

** 0 - The Volume Group or Filesystem is Not included in the
System Recovery Archive
** 1 - The Volume Group or Filesystem is Partially included in the
System Recovery Archive
** 2 - The Volume Group or Filesystem is Fully included in the
System Recovery Archive

* Checking Versions of Recovery Tools
Memory fault(coredump)


I'm using Ignite-UX C.7.9.260. The problem started when I installed this new version. I have since uninstalled and reinstalled it, and the problem persists. All clients run the same version, and this is the only one giving me problems. I also ran the following command:

swlist -l fileset Ignite-UX
# Initializing...
# Contacting target "msmxpd"...
#
# Target: msmxpd:/
#

# Ignite-UX C.7.9.260 HP-UX System Installation Services
Ignite-UX.BOOT-COMMON-PA C.7.9.260 Common Boot Components for PA-RISC clients
Ignite-UX.BOOT-KRN-11-11 C.7.9.260 Installation Boot Kernel for B.11.11 clients
Ignite-UX.BOOT-SERVICES C.7.9.260 Network Boot Services for System Installations
Ignite-UX.CD-TOOLS C.7.9.260 Tools for creating CD media
Ignite-UX.CD-TOOLS-DOC C.7.9.260 Documentation for CD-TOOLS
Ignite-UX.CD-TOOLS-SRC C.7.9.260 Source for CD-TOOLS
Ignite-UX.DVD-TOOLS C.7.9.260 Tools for creating DVD media
Ignite-UX.DVD-TOOLS-DOC C.7.9.260 Documentation for DVD-TOOLS
Ignite-UX.DVD-TOOLS-SRC C.7.9.260 Source for DVD-TOOLS
Ignite-UX.FILE-SRV-11-11 C.7.9.260 File Archives Used By Clients During HP-UX Install
Ignite-UX.IGNITE C.7.9.260 Graphical Ignite User Interface for Installations
Ignite-UX.IGNT-ENG-A-MAN C.7.9.260 Ignite-UX Manual Pages
Ignite-UX.IGNT-JPN-E-MAN C.7.9.260 Ignite-UX Japanese EUC Manual Pages
Ignite-UX.IGNT-JPN-S-MAN C.7.9.260 Ignite-UX Japanese SJIS Manual Pages
Ignite-UX.MGMT-TOOLS C.7.9.260 Tools for Managing Data Files on an Ignition Server
Ignite-UX.RECOVERY C.7.9.260 System disaster recovery tools

As you can see, there are no software version problems.
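The same check can be done mechanically by counting the distinct revision strings in the swlist output. This is only a minimal sketch, assuming the fileset lines look like the listing above (fileset name in the first column, revision in the second). The function name `distinct_revisions` and the embedded sample lines are illustrative, not part of Ignite-UX.

```shell
#!/bin/sh
# Count the distinct revisions among Ignite-UX filesets.
# Reads swlist-style lines on stdin: "<fileset> <revision> <description...>".
# Lines whose first field does not start with "Ignite-UX." (comments, the
# product header, blank lines) are ignored.
distinct_revisions() {
    awk '$1 ~ /^Ignite-UX\./ { print $2 }' | sort -u | wc -l | tr -d ' '
}

# Sample input copied from the listing above (header plus three filesets).
sample='# Ignite-UX C.7.9.260 HP-UX System Installation Services
Ignite-UX.BOOT-COMMON-PA C.7.9.260 Common Boot Components for PA-RISC clients
Ignite-UX.MGMT-TOOLS C.7.9.260 Tools for Managing Data Files on an Ignition Server
Ignite-UX.RECOVERY C.7.9.260 System disaster recovery tools'

count=$(printf '%s\n' "$sample" | distinct_revisions)
echo "distinct fileset revisions: $count"
```

On the client itself the input would come from `swlist -l fileset Ignite-UX | distinct_revisions`; a result other than 1 would point at a mixed installation.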


I also checked /var/adm/syslog/syslog.log and found nothing. Any help you could provide would be great. If you need any more information, please let me know.

Thanks in advance.
Knowledge is power.
Raj D.
Honored Contributor

Re: Ignite core dump problem

Fabian,

What syntax are you using with make_net_recovery?

Check this out:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=812024


Also:
- whether the backup completed successfully at the end.
- what the last message in the recovery.log file is.


Hth,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Michael Steele_2
Honored Contributor

Re: Ignite core dump problem

Hi

From a thread:

"...but fixed this by increasing the kernel parameter maxssiz from 8 MB to 16 MB:
received a SIGSEGV for stack growth failure.
Possible causes: insufficient memory or swap space,
or stack size exceeded maxssiz."

What is maxssiz and related kernel parms?
Support Fatherhood - Stop Family Law
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

Hello guys, thanks for the reply.
Box config.
O.S: HPUX 11.11
RAM: 30 GB

Syntax I am using:
/opt/ignite/bin/make_net_recovery -s ig5470 -A -n 3 -P s -x inc_entire=vg00 -x exclude=/SAParchives -x exclude=/usrtemporal -x exclude=/var/adm/crash -v

Output of swapinfo -tam:
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        4096       0    4096    0%       0       -    1  /dev/vg00/lvol2
dev        5008      98    4910    2%       0       -    0  /dev/vg00/lvswap2
dev        5008      99    4909    2%       0       -    0  /dev/vg00/lvswap3
dev        5008      98    4910    2%       0       -    0  /dev/vg00/lvswap4
dev        5008      98    4910    2%       0       -    0  /dev/vg00/lvswap5
reserve       -   23128  -23128
memory    23886    5161   18725   22%
total     48014   28682   19332   60%       -       0    -


kernel values
maxssiz 0x800000
maxssiz_64bit 0x800000
maxswapchunks 12065
maxtsiz 0x4000000
maxtsiz_64bit 0x40000000

I am checking the thread you sent me.
Knowledge is power.
Michael Steele_2
Honored Contributor

Re: Ignite core dump problem

I'm confused, you have duplicate max entries.

Also, please list them in decimal.
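
For reference, the hexadecimal values can be converted with the shell's printf instead of by hand. This is a small sketch using the values from the "kernel values" listing in the previous post. The helper name `to_dec` is made up for illustration.

```shell
#!/bin/sh
# Convert a hexadecimal tunable value (with 0x prefix) to decimal.
to_dec() {
    printf '%d\n' "$1"
}

# name=value pairs copied from the "kernel values" listing.
for entry in maxssiz=0x800000 maxssiz_64bit=0x800000 \
             maxtsiz=0x4000000 maxtsiz_64bit=0x40000000; do
    name=${entry%%=*}    # text before the first '='
    hex=${entry#*=}      # text after the first '='
    dec=$(to_dec "$hex")
    echo "$name $dec bytes ($((dec / 1048576)) MB)"
done
```

By this conversion maxssiz is 8388608 bytes, that is 8 MB, which is the starting value mentioned in the quoted thread about raising it to 16 MB.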
Support Fatherhood - Stop Family Law
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

attaching recovery.log

Knowledge is power.
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

max_thread_proc 256
xx maxdsiz 990056448
xx maxdsiz_64bit 4294967296
xx maxfiles 1024
xx maxfiles_lim 2048
xx maxssiz 8388608
xx maxssiz_64bit 8388608
xx maxswapchunks 12065
xx maxtsiz 67108864
xx maxtsiz_64bit 1073741824
Knowledge is power.
Michael Steele_2
Honored Contributor

Re: Ignite core dump problem

Hi

SIGSEGV is a segmentation fault and is associated with stack growth.



http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1379087&admit=109447626+1255725561412+28353475

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=587531

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=11110

Possible patches
PHCO_38226

Didn't see anything in that attachment. What about these?

server:/var/opt/ignite/clients/0x00306E4B1A17/install.log

client: /var/opt/ignite/local/install.log
Support Fatherhood - Stop Family Law
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>Memory fault(coredump)

Have you analyzed your corefile with gdb? Gotten a stack trace and register dump?
gdb ignite-executable core
(gdb) bt
(gdb) info reg
(gdb) disas $pc-4*16 $pc+4*4
(gdb) q

>Michael: SIGSEGV is a segmentation fault and is associated with stack growth.

Possibly. But since it didn't say that, it isn't so likely.
That's why you need to look at the core file.
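
The gdb commands above can also be kept in a command file and replayed non-interactively. This is a hedged sketch: it assumes the installed gdb accepts the standard GDB options `-batch` and `-x`, which I have not verified for HP WDB. The command file name is hypothetical, and the executable path is the one used elsewhere in this thread. The `disas` line is kept exactly as posted. As far as I know, newer GDB releases expect a comma between the two addresses.

```shell
#!/bin/sh
# Save the post-mortem commands so the same analysis can be repeated.
cmdfile=${TMPDIR:-/tmp}/core_triage.gdb    # hypothetical file name
cat > "$cmdfile" <<'EOF'
bt
info reg
disas $pc-4*16 $pc+4*4
EOF

exe=${1:-/opt/ignite/bin/make_net_recovery}    # path taken from this thread
core=${2:-core}

# Only call gdb when it is installed and both files exist.
if command -v gdb >/dev/null 2>&1 && [ -f "$exe" ] && [ -f "$core" ]; then
    gdb -batch -x "$cmdfile" "$exe" "$core"
else
    echo "gdb, executable, or core file not found; commands saved in $cmdfile"
fi
```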
Luitzen
Occasional Advisor

Re: Ignite core dump problem

Hello,

I've got the same problem. We have two kinds of systems: one with HP-UX EOE 11.23 and the T2775BA bundle, and one with HP-UX EOE 11.23 and the T2777BA bundle.

On all systems with the T2775BA bundle make_net_recovery runs correctly, but those with the T2777BA bundle fail with the same "Memory fault(coredump)".

Falling back to the C.7.4.155 Ignite-UX.RECOVERY fileset solves the problem but gives the expected warning about the older version of the software...
Raj D.
Honored Contributor

Re: Ignite core dump problem

Fabian,


You have Ignite-UX 7.9.260, which looks like the very latest version.
>Do you have the pax cumulative patch installed correctly, as required by the latest Ignite-UX version?

# swlist -l product | grep -i pax


- What is the model of the server?


- Also, the release notes say that a few core dump issues have been fixed in this version:

Ignite-UX version C.7.9.260 contains the following new features:
http://docs.hp.com/en/5992-5096/5992-5096.pdf

- Fixes for the following Change Requests:
* QXCR1000930732. Ignite-UX installs and rescan_hw_host involving
B.11.23 will no longer core dump and terminate when multi-pathed
devices are inventoried and some paths to the disk are "bad". Such
disks will be presented for install, and the itool interface will
no longer display disk device paths which are known to be "bad" for
possible selection.



Hth,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Luitzen
Occasional Advisor

Re: Ignite core dump problem

Hello Fabian,

we didn't have the latest pax patch, we're running with the 2004 version:

PHCO_31634 1.0 Cumulative changes to pax(1)

I've updated to the latest version:

PHCO_39677 1.0 Cumulative changes to pax(1)

and tried to run make_net_recovery again. This is the result:

# /opt/ignite/bin/make_net_recovery -A -P s -s igniteserver -x exclude=/var/adm/crash -x exclude=/tmp -x exclude=/var/tmp

* Creating NFS mount directories for configuration files.

======= 10/26/09 10:02:46 MET Started /opt/ignite/bin/make_net_recovery. (Mon
Oct 26 10:02:46 MET 2009)
@(#)Ignite-UX Revision C.7.9.260
@(#)ignite/net_recovery (opt) Revision:
/branches/IUX_RA0909_WEB/ignite/src@78846 Last Modified: 2009-08-13
14:15:12 -0600 (Thu, 13 Aug 2009)

* Checking Versions of Recovery Tools
Memory fault(coredump)

The message following the Version Check should be:

* Scanning system for IO devices...

I don't know if pax is in the picture; there's been no archiving done yet.

Kind regards, Luitzen
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>Luitzen: I don't know if pax is in the picture, there's been no archiving done yet.

Please create your own thread so you can assign points. If there is something useful here, you can include a URL.

If you use "file core" you can see whether it was created by pax.
Luitzen
Occasional Advisor

Re: Ignite core dump problem

Sorry to interrupt this thread, I thought it would be useful to hear about someone having the same problem... I will follow this thread silently...

Kind regards, Luitzen
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>Luitzen: I thought it would be useful to hear about someone having the same problem.

Only if you have a suggestion/solution. :-)
Sorry, which I guess you did:
Falling back to the C.7.4.155 ... solves the problem
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

Thanks for the replies.
My O.S. is 11.11 running on an HP 9000 rp4440.

I had Ignite version C.7.5.142 installed and everything worked fine.
I upgraded to version C.7.9.260, and that's when the problems started.

The reason I upgraded is that I updated the Ignite version on all Ignite clients and the server; this is the only server that is giving me problems.


Hello Dennis, I haven't analyzed the core file with gdb.

Where do I download gdb?
I found HP WDB, is this the same as gdb?

Luitzen, what is bundle T2775BA?


Raj,
I have patch PHCO_32438 1.0 pax(1) cumulative patch installed on the server that has the problem.

I checked a server that I recently patched, and it has pax patch PHCO_35998 1.0 pax(1) cumulative patch. Maybe I could patch to see if it works, but Luitzen still gets the problem after updating pax?

This is the set of patches I have installed:

Gold Applications Patches for HP-UX 11i v1, December 2004
Gold Base Patches for HP-UX 11i v1, December 2004

And if I do a bdf, this is displayed:
ig5470:/var/opt/ignite/clients
20480000 15095296 5343928 74% /var/opt/ignite/recovery/client_mnt

I'm not seeing this mounted on other clients.


I will continue to analyze the server.

Thanks a lot for your input.
Knowledge is power.
Luitzen
Occasional Advisor

Re: Ignite core dump problem

Fabian,

T2775BA is a Serviceguard CFS (veritas 4.1) bundle. I've no problems on those servers.

For our Oracle RAC clusters we need a more expensive version, T2777BA, which is the same including RAC extensions. With that bundle the Ignite problem starts....

Kind regards, Luitzen
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>gdb, where do I download this? I found this, HP WDB, is this the same as GDB?

http://www.hp.com/go/wdb
Yes, provided you invoke it as gdb and not through the wdb GUI.
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

I have run gdb. This is the output; if I'm running it wrong, please let me know. As always, thanks for your help.

msmxpd:/opt/ignite/bin> /opt/langtools/bin/gdb64 /opt/ignite/bin/make_net_recovery -c core
Detected 32-bit executable.
Invoking /opt/langtools/bin/gdb32.
HP gdb 6.0 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11i
and target hppa1.1-hp-hpux11.00.
Copyright 1986 - 2009 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 6.0 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /opt/ignite/bin/make_net_recovery has been stripped

(no debugging symbols found)...

warning: core file may not match specified executable file.
Core was generated by `make_net_recov'.

warning: make_net_recov is 14 characters in length. Due to a limitation
in the HP-UX kernel, core files contain only the first 14 characters
of an executable's name. Check if make_net_recov is a truncated name.
If it is so, core-file, packcore and other commands dealing with
core files will exhibit incorrect behavior. To avoid this, issue
exec-file and symbol-file commands with the full name of the executable
that produced the core; then issue the core-file, packcore or other
core file command of interest.

Program terminated with signal 11, Segmentation fault.

(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
#0 0xc0197998 in + 0x8 () from /usr/lib/libc.2
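
The name shown in "Core was generated by" is consistent with the 14-character limit that the warning describes: cutting make_net_recovery to 14 characters gives make_net_recov. A one-line check, with a helper name (`truncate14`) that is only illustrative:

```shell
#!/bin/sh
# Show what a 14-character limit does to an executable name, as described
# in the gdb warning above.
truncate14() {
    printf '%.14s\n' "$1"
}

truncate14 make_net_recovery
```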
Knowledge is power.
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

Hello.
A bit more information

msmxpd:/> what /usr/lib/libc.2
/usr/lib/libc.2:
malloc.c $Date: 2007/01/09 16:25:15 $Revision: r11.11/14 PATCH_11.11 (PHCO_35743)
$ PATCH_11.11/PHCO_35743 Jan 22 2007 04:25:33 $
Knowledge is power.
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>This is the output, if I'm running it wrong please let me know,

Some general commands to issue:
(gdb) bt
(gdb) info reg
(gdb) disas $pc-4*16 $pc+4*4
(gdb) q
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

Hello Dennis.

As always thanks for your help.

Here is the output of the commands you told me to run.

I keep seeing /usr/lib/libc.2 a lot. I checked and there is a new patch available.

I checked another server that has the same patch as this one, and Ignite is working on that server. The only difference I see is that the server that is working has BASE and QTY patches from 2006.

The one that is failing has BASE and QTY patches from 2004. Could this be related in any way?


(gdb) bt
#0 0xc0197998 in + 0x8 () from /usr/lib/libc.2
#1 0xc01989d4 in + 0x22c () from /usr/lib/libc.2
#2 0xc019b6ec in free+0x30c () from /usr/lib/libc.2
#3 0x1d60c in + 0x2584 ()
#4 0x28bc4 in + 0xa94 ()


(gdb) info reg
flags: 2f000041
r1: 210 rp/r2: c01989d7 r3: 7b0416e8 r4: 7b041378
r5: 0 r6: 4006419c r7: 838 r8: 0
r9: 400649d4 r10: dfa0 r11: 2 r12: 27f7f2a08
r13: 31836 r14: 31836000382e8 r15: 40004558 r16: 4000d1d0
r17: 4000d5d0 r18: 4000d5d1 r19: 7b03a794 r20: d0c4
r21: c88c r22: 400b0e3c arg3/r23: 7b043108 arg2/r24: 838
arg1/r25: e8000bc8 arg0/r26: 8210681 dp/gp/r27: 40005558 ret0/r28: 0
ret1/ap/r29: 1 sp/r30: 7f7f31a0 mrp/r31: 7b043108 sar/cr11: 3f
pcoqh: c0197998 pcsqh: 5b6ec00 pcoqt: c019799c pcsqt: 5b6ec00
eiem/cr15: ffffffffffffffff iir/cr19: 4b5f0018 isr/cr20: aa7c000 ior/cr21: 821068d
ipsw/cr22: ff0004fa1f goto: 64 sr4: aa7c000 sr0: 5b6ec00
sr1: aa7c000 sr2: 0 sr3: 0 sr5: e2f0c00
sr6: 5b6ec00 sr7: 5b6ec00 rctr/cr0: 1871 pidr1/cr8: fffffff0ffffffff
pidr2/cr9: 127996d60 ccr/cr10: 153acfb00 pidr3/cr12: d27488 pidr4/cr13: 1c2d6b0
cr24: 800000f cr25: 1c700e0 cr26: 12792d960 mpsfu_high: 7af89060
mpsfu_low: b58338 mpsfu_ovflo: 0 pad: 5fc59ab901c70000 fpsr: 8000000
fpe1: 0 fpe2: 0 fpe3: 0 fpe4: 0
fpe5: 0 fpe6: 0 fpe7: 0



(gdb) disas $pc-4*16 $pc+4*4
Dump of assembler code from 0xc0197958 to 0xc01979a8:
0xc0197958 : stw %r24,0(%r31)
0xc019795c : ret
0xc0197960 : ldo -0x80(%sp),%sp
0xc0197964 : ldo 8(%r25),%r26
0xc0197968 : ldw 8(%r25),%r25
0xc019796c : cmpb,<>,n %r0,%r25,0xc0197844
0xc0197970 : ldw 0xc(%r25),%ret1
0xc0197974 : stw %r24,0(%r26)
0xc0197978 : stw %r31,0(%r24)
0xc019797c : stw %r0,8(%r24)
0xc0197980 : stw %r0,4(%r24)
0xc0197984 : ret
0xc0197988 : ldo -0x80(%sp),%sp
0xc019798c : break 0,0
0xc0197990 : cmpb,=,n %r0,%r26,0xc01979dc
0xc0197994 : cmpb,=,n %r0,%r25,0xc01979e0
0xc0197998 : ldw 0xc(%r26),%r31
0xc019799c : ldw 0xc(%r25),%r23
0xc01979a0 : cmpb,>>,n %r31,%r23,0xc01979c8
0xc01979a4 : cmpb,<> %r31,%r23,0xc01979e8
Knowledge is power.
Dennis Handly
Acclaimed Contributor

Re: Ignite core dump problem

>Here is the output of the commands you told me to run.

This indicates you have heap corruption; it is dying after free is called.

0xc0197998 : ldw 0xc(%r26),%r31 < abort here
arg0/r26: 8210681

The value in R26 is misaligned and doesn't have the high nibble set.
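
The arithmetic behind that statement can be reproduced from the posted register dump. This is a sketch only, assuming a word load (ldw) needs a 4-byte aligned address. The helper names `low_bits` and `high_nibble` are made up for illustration.

```shell
#!/bin/sh
# Values copied from the register dump and the faulting instruction:
#   0xc0197998 : ldw 0xc(%r26),%r31      arg0/r26: 8210681
r26=0x8210681
disp=0xc

# Low two bits of an address; non-zero means it is not 4-byte aligned.
low_bits() { echo $(( $1 & 3 )); }

# Top four bits (high nibble) of a 32-bit value.
high_nibble() { echo $(( ($1 >> 28) & 0xf )); }

# Effective address of the load: base register plus displacement.
ea=$(( $r26 + $disp ))

printf 'r26 low two bits:  %s\n' "$(low_bits "$r26")"
printf 'r26 high nibble:   %s\n' "$(high_nibble "$r26")"
printf 'effective address: %x\n' "$ea"
```

The low two bits come out non-zero and the high nibble is zero, and the computed effective address is the same value that ior/cr21 shows in the register dump.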

>The one that is failing has BASE and QTY from 2004, could this be related in any way?

Yes.
Fabian Briseño
Esteemed Contributor

Re: Ignite core dump problem

Hello guys.
Management allowed me to patch the system, but not until December, so I will update this thread then.


thanks for your help.
Knowledge is power.
Stephan.
Honored Contributor
Solution

Re: Ignite core dump problem

I know this is an old thread, but I stumbled upon it because I have had the same problem on several machines since last week.

The answer to my support call was "known issue" with that specific version.

Check
http://www.itrc.hp.com/service/cki/docDisplay.do?docId=QXCR1000978239
for details.

In the meantime, 7.9.261 has also become available, and in the release notes (http://www.docs.hp.com/en/5992-5168/5992-5168.pdf) you can find:

"An array out-of-bounds memory corruption defect in
make_net_recovery and make_tape_recovery introduced"

Hopefully this will save you the time to update the box.

I'm going to install 7.9.261 right now and test if the problem is gone.
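
To find which clients still carry the revision discussed here, the RECOVERY fileset line from swlist can be checked directly. This is a rough sketch based only on what this thread reports (C.7.9.260 failing). It does not try to judge any other revision, and the function name `check_recovery_revision` is hypothetical.

```shell
#!/bin/sh
# Flag the RECOVERY fileset if it is at the revision this thread reports as
# failing. Reads swlist-style lines on stdin:
#   "<fileset> <revision> <description...>"
check_recovery_revision() {
    awk '$1 == "Ignite-UX.RECOVERY" {
             if ($2 == "C.7.9.260") print "affected: " $2
             else                   print "not flagged: " $2
         }'
}

# Sample line copied from the swlist listing in the first post.
printf '%s\n' 'Ignite-UX.RECOVERY C.7.9.260 System disaster recovery tools' |
    check_recovery_revision
```

On a client the input would come from `swlist -l fileset Ignite-UX | check_recovery_revision`.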