
Processes are dying

 
MARK LING
Occasional Contributor


I have a K460 running HP-UX 10.20 that has hosted SAP R/3 4.0B for 5 years. Recently, SAP R/3 started to hang after running for about 20 to 30 hours: all of the dialog work processes are dying. In GlancePlus, I can see these processes quickly die, respawn, and die again. Nothing has been changed at the OS, SAP R/3, or database level. I replaced the CPU boards on the recommendation of the HP Response Center, but the symptom remains. Any ideas? Thanks.
7 REPLIES
Michael Tully
Honored Contributor

Re: processes are dying

Perhaps the system has encountered something that the OS cannot cope with. This could be a bug in the OS, and since you have not patched in a while, patching might be the place to start. I assume this system is on 10.x, which is not really supported anymore, but you can still get patches.
Anyone for a Mutiny ?
MARK LING
Occasional Contributor

Re: processes are dying

Thank you, Michael. My server crashed last week and I used RS to reboot (I should have used TC), so I don't have any log file. After that, this symptom occurred. I forwarded the tombstone to the Response Center; because the tombstone does not have a valid time stamp, they suspected that it might be a CPU issue. But after replacing the CPU, the problem remains.
Sridhar Bhaskarla
Honored Contributor

Re: processes are dying

Hi Mark,

Just checking: you didn't lose any swap areas right after the crash? Did you check whether the system was running out of swap at the time the processes were dying?
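
A minimal sketch of that swap check in Python, in case it helps to script it. It assumes the output of `swapinfo -t` has been saved to a text file or string; the sample text, its column layout, and the function names (`swap_devices`, `swap_total`, `swap_low`) are illustrative assumptions, not captured from a real system, so compare against your own output before relying on it.

```python
# Illustrative sample only: the column layout is an assumption modeled on
# typical `swapinfo -t` output and may differ on your system.
SAMPLE = """\
             Kb      Kb      Kb   PCT  START/      Kb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev     1048576   20480 1028096    2%       0       -    1  /dev/vg00/lvol2
reserve       -  204800 -204800
total   1048576  225280  823296   21%       -       0    -
"""


def swap_devices(text):
    """Return the names of all 'dev' rows, to compare with the swap areas you expect."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "dev":
            names.append(fields[-1])
    return names


def swap_total(text):
    """Return (avail, used, free) from the 'total' row, or None if there is no such row."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "total":
            avail, used, free = (int(f) for f in fields[1:4])
            return avail, used, free
    return None


def swap_low(text, min_free_fraction=0.10):
    """True when free swap is below the given fraction of available swap."""
    totals = swap_total(text)
    if totals is None:
        raise ValueError("no 'total' row found")
    avail, _used, free = totals
    return free < avail * min_free_fraction


if __name__ == "__main__":
    print(swap_devices(SAMPLE))
    print(swap_total(SAMPLE), swap_low(SAMPLE))
```

Saving the output every few minutes and running this over each capture would show whether a swap device disappeared after the crash and whether free swap shrinks as the 20 to 30 hour mark approaches.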

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
A. Clay Stephenson
Acclaimed Contributor

Re: processes are dying

There are not enough data here to spot anything, but a box that runs for 20 to 30 hours before developing problems sounds like a slow memory leak, especially if some of your memory modules went bad and are now mapped out so that the memory pressure is more severe. Another thought is that a temp filesystem is nearly full and the processes are dying because they can't write temp files. It's rather unusual for a hardware problem to behave as yours does, because it seems to be picking on only one program (or group of related programs). You really need to use Glance during problem periods and look at system tables, file tables, locks, and available swap and memory. You might also have one client PC that is constantly retrying and filling up the process table or some data structure within SAP. Unless you get some data, we are shooting in the dark.
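
One way to turn "slow memory leak" into something measurable is to record a process's size at intervals and fit a straight line through the samples. The Python sketch below does that with a plain least-squares slope. The function names and the sample numbers are hypothetical; the sizes would have to be noted by hand from Glance or a similar tool.

```python
def growth_per_hour(samples):
    """Least-squares slope of size over time.

    samples: list of (hours_since_start, size_kb) pairs.
    Returns the fitted growth in KB per hour.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_s = sum(s for _, s in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (s - mean_s) for t, s in samples)
    return cov / var_t


def hours_until(limit_kb, samples):
    """Rough hours until size reaches limit_kb if the linear trend continues.

    Returns None when the fitted trend is flat or shrinking.
    """
    slope = growth_per_hour(samples)
    if slope <= 0:
        return None
    _last_t, last_s = samples[-1]
    return max(0.0, (limit_kb - last_s) / slope)


if __name__ == "__main__":
    # Hypothetical readings: (hours since restart, process size in KB).
    readings = [(0, 100000), (5, 110000), (10, 120000), (15, 130000)]
    print(growth_per_hour(readings))
    print(hours_until(160000, readings))
```

A steadily positive slope across several restarts, with a projected time-to-limit close to the observed 20 to 30 hours, would support the leak theory; a flat line would point elsewhere.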

If it ain't broke, I can fix that.
MARK LING
Occasional Contributor

Re: processes are dying

Thanks for your help.

When the SAP dispatch processes die, there is no swap activity in Glance. swapinfo shows plenty of space. The /tmp space is not full.

On the Glance memory report, I saw some page fault activity but no KB reactivation, which indicates no RAM shortage. Which parameters should I pay special attention to for a memory leak or other memory-related issue? Thanks again.
RAC_1
Honored Contributor

Re: processes are dying

Is this behaviour a regular occurrence?
When the processes die, do they create a core file?

Also, it would help to run tusc on those processes when they are approaching 30 hours of runtime (just before they die).
Maybe tusc will give some more information.

You can get tusc here:
http://hpux.connect.org.uk

Anil
There is no substitute to HARDWORK
Sandeep Kapare
Advisor

Re: processes are dying

Since you are on 10.x, make sure that you at least have the latest version of stm/cstm. Memory leaks are detected by stm / syslogd. If you don't see any errors in syslog, then there is probably no memory leak. As a long-term plan, if it is possible and supported by SAP, you can think about upgrading to 11i with the latest GoldBase installed.
You can monitor critical kernel parameters such as nproc, nfile, npty, shmmax, shmseg, nflocks, and maxdsize, as these parameters determine the number of shared segments and related memory allocations, and errors caused by hitting them often are not trapped in any logs.
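
As a rough illustration of that kind of monitoring, the Python sketch below flags any table whose current usage is close to its configured limit. The function name `near_limit`, the 90% threshold, and all the numbers are hypothetical; the used/limit pairs would have to be read from Glance's system tables report or a similar source, and size limits such as shmmax or maxdsize only fit this check if you can obtain a comparable "used" figure for them.

```python
def near_limit(usage, threshold=0.90):
    """Return names whose used/limit ratio is at or above threshold.

    usage: dict mapping a parameter name to a (used, limit) pair.
    The result is sorted with the fullest table first.
    """
    flagged = []
    for name, (used, limit) in usage.items():
        if limit <= 0:
            raise ValueError(f"limit for {name} must be positive")
        ratio = used / limit
        if ratio >= threshold:
            flagged.append((ratio, name))
    flagged.sort(reverse=True)
    return [name for _ratio, name in flagged]


if __name__ == "__main__":
    # Hypothetical readings: parameter -> (entries in use, configured limit).
    readings = {
        "nproc": (1180, 1200),
        "nfile": (2000, 4000),
        "nflocks": (190, 200),
    }
    print(near_limit(readings))
```

Running a check like this on readings taken shortly before the work processes start dying would show whether any kernel table is filling up even though nothing is written to the logs.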
Nothing is impossible