
AntBark
Occasional Advisor

Memory ???

Hi - I recently completed a migration from a D320 server to a B1000 workstation, 32-bit to 64-bit. The new system is running HP-UX 11.00 and Oracle 8.1.6 64-bit. The machine has 512 MB of memory. My problem is that although the C programs start up fine, at some stage they core dump. I suspected memory problems and saw somewhere that one must check for paging. When I check vmstat, the amount of free memory steadily decreases. Any ideas?
Elmar P. Kolkman
Honored Contributor

Re: Memory ???

If you run file on the core files, what is the signal that caused the core file?
Every problem has at least one solution. Only some solutions are harder to find.
AntBark
Occasional Advisor

Re: Memory ???

Unfortunately I had to wipe the core file (LV overflow). I will check it as soon as it happens again.
G. Vrijhoeven
Honored Contributor

Re: Memory ???

Hi,

I think it is a good idea to install GPM/Glance (60-day eval) and look into the process that dumps core. This way you can pinpoint your problem. You can find it on your application CD-ROMs.

Gideon

Judy Traynor
Valued Contributor

Re: Memory ???

Are the C programs using shared libraries? If so, relink them.
Also, check the old system tunables in the /stand/system file on the old machine; perhaps you need to tweak the kernel.

Sail With the Wind
RAC_1
Honored Contributor

Re: Memory ???

As soon as a core file is dumped, run file core.

This will give details about which signal caused the core dump.

Depending on that, you can troubleshoot.
There is no substitute to HARDWORK
Patrick Wallek
Honored Contributor

Re: Memory ???

Did you tune your kernel appropriately? My initial guess would be that you are hitting one of the max?siz (maxdsiz, maxtsiz, etc) limits. You should probably compare the settings on your D320 to your B1000 and adjust where appropriate.
A. Clay Stephenson
Acclaimed Contributor

Re: Memory ???

First of all, 512MB is a tiny amount of memory for 64-bit Oracle BUT a UNIX box doesn't really care about how much physical memory is in a box but rather the amount of virtual memory. You could very well be hitting maxdsiz or maxssiz limits OR you may simply be a victim of bad code. You really need to use a debugger and do a stack trace on the core file. That will tell you exactly why your program is crashing ---- anything else is a shot in the dark.
If it ain't broke, I can fix that.
AntBark
Occasional Advisor

Re: Memory ???

Thanks everyone - file core showed that SIGBUS was the signal interrupting the C program. What does this mean? Sorry I'm only coming back now, but I'm running around for other stuff as well.

Will be back 22 Dec

Thanx

PS: maxdsiz and maxtsiz and their 64-bit mates are set to 256000000, half the memory size. Is that fine?
Bill Hassell
Honored Contributor

Re: Memory ???

SIGBUS is a programming error. If the programs ran OK in the past, they may have some underlying code that took advantage of an incompatible feature from a previous release. You may also have development libraries that were copied from an earlier version of HP-UX rather than installed from the current CDs. You'll need to trace the program to see where it fails.


Bill Hassell, sysadmin
AntBark
Occasional Advisor

Re: Memory ???

Hi again
This program loops the whole time with sleep(5) at the end. What is funny is that it runs for a couple of days before it core dumps; that's why I initially thought it might be a memory issue. I can't see that it uses resources without freeing them. When compiling, it gives these warnings:
cc: "alarms_monitor.c", line 356: warning 604: Pointers are not assignment-compatible.
cc: "alarms_monitor.c", line 356: warning 563: Argument #2 is not the correct type.

Line 356 : signal(SIGINT,sigint_handler);

signal.h is included, and sigint_handler is a procedure which lets the program exit safely if it gets interrupted (e.g. Ctrl-C), and it works fine?

Thanx again
Saurav_1
Valued Contributor

Re: Memory ???

Hi,

HP 9000 systems use memory page deallocation, which deallocates a particular memory page if it generates errors, so that the system will not use that portion of memory the next time. Unfortunately, it is only enabled when STM is installed. I suggest you install the latest Support Tools Manager if it is not installed. The latest version can be downloaded from:
http://www.software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=B6191AAE

This will reduce the chances of memory errors. You can check for memory parity errors by looking at the latest /var/tombstones/ts* file, and once you reboot you can verify the system hardware status in the SERVICE, INFORMATION menu.

Errors that occurred can be tracked in the syslog.log file. If core dumps still keep occurring, please update the compilers or install the latest patches.

saurav
James Lynch
Valued Contributor

Re: Memory ???

What else is the program doing in the loop besides the sleep(5)?

Have you ruled out the possibility that some external process and/or user is sending a signal directly to your C program?

Are you handling any signals other than SIGINT?

Did you recompile the code for 64-bit?

JL
Wild turkey surprise? I love wild turkey surprise!
A. Clay Stephenson
Acclaimed Contributor

Re: Memory ???

Because you have the source code, this task is very simple. Compile the code with the -g option and without optimization. Let it crash and then you need to use a debugger (gdb) to display a stack trace. It will point you to the offending line of source code -- if you compile with -g.

Your SIGINT handler has nothing to do with the problem although the warnings you are getting indicate a less than well-disciplined approach to those pesky little things like typing --- and that may be indicative of a problem of a more fundamental nature.

If it ain't broke, I can fix that.
Steven E. Protter
Exalted Contributor

Re: Memory ???

There is a process you must go through with Oracle to migrate data from 32-bit word size to 64-bit word size. I'm attaching a document on the subject.

Oracle 8.1.6 is not supported by Oracle any more. I believe that 8.1.7 support starts to go away in about 9 days.

If you haven't done the word size conversion that will cause these problems.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
AntBark
Occasional Advisor

Re: Memory ???

Hi

Thank you guys for all your help...
Saurav -
The STM is installed - I got it on a PLUS disk from March 2000. Does it need to be patched, and with which patch? I tried to download the newer version but our net is too slow at the moment.

James -
The program retrieves values from tags on a Centum (YOKOGAWA equipment). The code ran fine for 2 years on the old system. No external programs send any interrupts to this program. Only SIGINT is handled. What do I need to include to compile for 64-bit?

Steven -
"If you are changing word-size during a migration, upgrade, or downgrade operation, then no additional action is required. The word-size is changed automatically during any of these operations." -- from the document ?????
AntBark
Occasional Advisor

Re: Memory ???

Hi
The program is first precompiled by the Oracle precompiler. I see that $ORACLE_HOME/precomp/lib has an equivalent ../lib64 dir. Do I have to change something in $ORACLE_HOME/precomp/lib/env_precomp.mk for it to use lib64?

cheers
James Lynch
Valued Contributor

Re: Memory ???

To compile a program for 64-bit you would use the "+DA2.0W" compiler option. This tells the compiler that you want to produce code for the PA-RISC 2.0 wide (64-bit) architecture. The compiler will also make sure that the 64-bit version of the standard libraries are used instead of the 32-bit versions.

If your code has been compiled with the "-g" option, then you could run adb on it and the core file and get a quick stack trace. That would tell you what routine it was in when it core dumped.

To get a stack trace using adb, you need to give it the executable program and the core file that you want to debug:

adb path_name_of_executable core
$c
$q

The "$c" command tells adb to generate a stack backtrace, thus giving you the function calling sequence of your program when it died. The "$q" tells adb to quit and exit.

Hopefully that should give you a starting point of where to look.

JL
Wild turkey surprise? I love wild turkey surprise!
Bill Hassell
Honored Contributor

Re: Memory ???

Oracle should have extensive docs on compiling a 64bit version. Recompiling Oracle for 64bit may involve several changes or perhaps an entirely different installation file. NOTE: all supporting or middleware applications must be 64bit compatible if they access shared memory (SGA).


Bill Hassell, sysadmin