Operating System - HP-UX

SOLVED
Coolmar
Esteemed Contributor

statdaemon and very high load

Hi,
I have a system that has a load up around 15. It is connected to a SAN and I am sure it is not a disk bottleneck. I have read a couple of threads and in one of them the guy said that he solved the problem as he found a process spawning every second. How could he find that? Is there a way to tell what processes are spawning and how quickly?

Here is some info:
System: skregi25 Tue Jan 10 14:05:59 2006
Load averages: 15.80, 15.84, 15.89
314 processes: 296 sleeping, 18 running
Cpu states:
CPU   LOAD   USER   NICE    SYS    IDLE  BLOCK  SWAIT   INTR   SSYS
 0   15.80   0.0%   0.0%   0.0%  100.0%   0.0%   0.0%   0.0%   0.0%
---   ----  -----  -----  -----   -----  -----  -----  -----  -----
avg  15.80   0.0%   0.0%   0.0%  100.0%   0.0%   0.0%   0.0%   0.0%

Memory: 1209424K (426548K) real, 2115692K (1136696K) virtual, 51692K free Page# 1/10

CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
0 ? 3 root 128 20 32K 32K sleep 1318:26 4.47 4.46 statdaemon
0 ? 1486 root 152 20 249M 15236K run 68:19 0.50 0.49 java
0 ? 1455 root 152 20 246M 7256K run 109:17 0.35 0.35 prm3d
0 ? 40 root 152 20 6624K 6624K run 34:12 0.31 0.31 vxfsd
0 ? 1433 root -16 20 34344K 12580K run 126:02 0.17 0.17 midaemon
0 ? 3115 root 152 20 15960K 1632K run 3:31 0.16 0.16 rep_server
0 ? 3163 root 152 20 13392K 1272K run 2:40 0.16 0.16 agdbserver
0 ? 3164 root 152 20 13592K 1076K run 4:54 0.12 0.12 alarmgen
0 ? 1755 root 152 20 23284K 1104K run 2:45 0.10 0.10 vxsvc
0 pts/3 16560 root 178 20 6868K 4944K run 0:00 2.00 0.10 top
0 ? 17823 root 152 20 209M 10168K run 7:57 0.10 0.10 java
0 ? 1408 root 152 10 5432K 480K run 0:12 0.08 0.08 memlogd
0 ? 10648 root 152 20 13192K 2984K run 0:19 0.07 0.07 mad
0 ? 16145 root 152 20 8592K 732K run 0:00 0.05 0.05 sshd:
0 ? 21136 oracle 156 20 438M 1792K sleep 0:02 0.05 0.05 ora_ckpt_PFRA
0 ? 1090 root 168 20 3624K 216K sleep 23:53 0.04 0.04 sendmail:
0 ? 1441 root 152 20 12888K 280K run 0:04 0.04 0.04 perflbd
0 ? 21196 oracle 156 20 304M 2860K sleep 0:02 0.03 0.03 ora_ckpt_REP
0 ? 2 root 128 20 32K 32K sleep 8:12 0.03 0.03 vhand
0 ? 0 root 127 20 32K 0K sleep 448:23 0.02 0.02 swapper
0 ? 1 root 168 20 496K 204K sleep 0:38 0.02 0.02 init
0 ? 4 root 128 20 32K 32K sleep 0:50 0.02 0.02 unhashdaemon
0 ? 20 root 147 20 32K 32K sleep 0:06 0.02 0.02 lvmkd
0 ? 21 root 147 20 32K 32K sleep 0:07 0.02 0.02 lvmkd
0 ? 22 root 147 20 32K 32K sleep 0:07 0.02 0.02 lvmkd
0 ? 23 root 147 20 32K 32K sleep 0:08 0.02 0.02 lvmkd
0 ? 24 root 147 20 32K 32K sleep 0:07 0.02 0.02 lvmkd
0 ? 25 root 147 20 32K 32K sleep 0:07 0.02 0.02 lvmkd
0 ? 27 root 100 20 32K 32K sleep 2:10 0.02 0.02 smpsched
0 ? 28 root 100 20 32K 32K sleep 2:09 0.02 0.02 smpsched
0 ? 29 root 100 20 32K 32K sleep 2:08 0.02 0.02 smpsched
0 ? 30 root 100 20 32K 32K sleep 2:06 0.02 0.02 smpsched
0 ? 31 root 100 20 32K 32K sleep 2:08 0.02 0.02 smpsched
skregi25:/usr/local/bb/bbc1.9e-btf/etc# sar 1 5

HP-UX skregi25 B.11.11 U 9000/800 01/10/06

14:07:02    %usr    %sys    %wio   %idle
14:07:03       0       0       1      99
14:07:04       0       0       0     100
14:07:05       0      19       0      81
14:07:06       0       0       0     100
14:07:07       0       0       0     100

Average        0       4       0      96
Steven E. Protter
Exalted Contributor
Solution

Re: statdaemon and very high load

Shalom Sally,

The amount of CPU time that process has used is inordinate. Way too high.

I would look at patches for the performance monitoring daemon.

Next I would look at Java patches or more current versions. PRM is using too many resources in comparison to, say, Oracle reports, which is probably what is taxing Java.

Load average is not a good indicator though. It roughly shows how many processes are waiting for cpu.

Why are they waiting for CPU? Because statdaemon, java and prm are eating your system alive.

I'd also make sure Oracle was patched up.

SEP
David Gourley_3
New Member

Re: statdaemon and very high load

Use prospect (available from the HP-UX website for download) to tell you if you've large numbers of threads/processes being spawned and where they are being spawned (also what their lifespan is).

Once you've identified which process is spawning the threads, tusc is a good tool for working out what's happening in a specific process.

My understanding is that the load placed on statdaemon will be related to the number of threads running in a system so this could be indicative of a huge number of threads in the system.
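
For a rough look at what is being spawned without installing anything, successive ps snapshots can be compared. The following is only a sketch, not a replacement for prospect: the function names `new_procs` and `watch_spawns` are made up for illustration, it assumes the XPG4 `-o` options of ps (enabled on HP-UX by setting UNIX95), and it will miss any process that starts and exits between two snapshots.

```shell
# new_procs OLD NEW
# OLD and NEW are files of "PID COMMAND" lines.
# Prints every line of NEW whose PID does not appear in OLD.
new_procs() {
    awk 'NR == FNR { seen[$1] = 1; next } !($1 in seen)' "$1" "$2"
}

# watch_spawns INTERVAL COUNT
# Takes a ps snapshot every INTERVAL seconds, COUNT times, and prints
# the processes that appeared since the previous snapshot.
watch_spawns() {
    interval=$1
    count=$2
    old=/tmp/ps_old.$$
    new=/tmp/ps_new.$$
    # UNIX95=1 is assumed to be needed for -o on HP-UX; harmless elsewhere.
    UNIX95=1 ps -e -o pid= -o comm= > "$old"
    while [ "$count" -gt 0 ]; do
        sleep "$interval"
        UNIX95=1 ps -e -o pid= -o comm= > "$new"
        new_procs "$old" "$new"
        mv "$new" "$old"
        count=$((count - 1))
    done
    rm -f "$old"
}
```

Something like `watch_spawns 1 30 | awk '{ print $2 }' | sort | uniq -c | sort -rn` then gives a count per command name. The ps taking each snapshot shows up as a new process every time, so ignore that entry; anything else near the top of the list is a candidate for tusc.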
Alzhy
Honored Contributor

Re: statdaemon and very high load

For starters - Kill your prm3d daemon.

Second, trace those java processes. I have a feeling java is being used by some HP software for monitoring your FC connections (you most likely have an EVA or XP array behind that SAN?).

And I see that you possibly have the ISEE agent deployed on your server. Maybe ISEE components are misbehaving? And possibly java is called by ISEE?
Hakuna Matata.
Coolmar
Esteemed Contributor

Re: statdaemon and very high load

Is prm3d not something you want running? I will kill it, I just want to make sure that it will not clobber something else.

I think I might have found the culprit...Multilink, a UPS monitoring software that uses Java. Once I stopped it the load went right down. So I guess I have to figure out why it is hammering this system and not the others.

If someone can get back to me on that prm3d question, that would be great.
Thanks
Alzhy
Honored Contributor

Re: statdaemon and very high load

Aha... another case of a misbehavin' java. I actually thought it was ISEE or SecurePath or some SAN subsystem you have.

If you are not using PRM then it would be wise to just turn it off. Edit /etc/rc.config.d/prm* and stop or kill the PRM process.

It was a resource hog on our systems very early on, and that was solved via a patch.
Hakuna Matata.
Coolmar
Esteemed Contributor

Re: statdaemon and very high load

Well, the other system we have has the same problem; however, removing Multilink did not solve its problem. It does have Java running, but it needs to run for Oracle OEM. We have versions 1.2, 1.3 and 1.4 all installed...however, Oracle installs Java when OEM is installed. I am trying to find out what version is running (java -version) and it keeps dumping a core file and won't tell me anything. I figure it might need to be patched but if I can't find what version....argh!
Coolmar
Esteemed Contributor

Re: statdaemon and very high load

oh, and one more thing...on this system where I was able to bring down the load by turning off Multilink...statdaemon is still up at the top of "top" ... is there a way to HUP it or restart it without screwing up the system?

CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
0 ? 3 root 128 20 32K 32K sleep 1370:05 4.06 4.05 statdaemon
0 pts/tz 22889 pf20903 154 20 27576K 3456K sleep 0:00 2.62 0.58 quiz
Alzhy
Honored Contributor

Re: statdaemon and very high load

Sally,

Remember that at any one time you may have several versions of the Java runtime or SDK running on your system. Most suites, tools and software bundle their own Java release.

Most likely the java that's causing all the woes comes from your SAN or ISEE monitors. I believe you have ISEE agents installed.

Run

UNIX95=1 ps -efH

to ascertain which tool or software each root-owned Java process belongs to.
Hakuna Matata.
Coolmar
Esteemed Contributor

Re: statdaemon and very high load

I only see oracle using java...here is what I found:

root 3666 5980 1 12:23:50 pts/1 00:00 grep jdk
oracle 2953 2932 0 08:52:34 ? 03:51 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Djava.secu
oracle 2954 2932 0 08:52:34 ? 00:50 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Djava.secu
oracle 3069 2932 0 08:53:36 ? 00:25 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Dsun.rmi.d
oracle 3046 2932 0 08:53:19 ? 00:18 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Xmx256m -D
oracle 2701 2700 0 08:50:52 pts/0 00:45 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Xnoclassgc
psi3:/homeroot# UNIX95=1 ps -efH |grep jre
root 3694 5980 1 12:23:57 pts/1 00:00 grep jre
psi3:/homeroot# UNIX95=1 ps -efH |grep java
root 3814 5980 0 12:24:13 pts/1 00:00 grep java
oracle 2953 2932 0 08:52:34 ? 03:51 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Djava.secu
oracle 2954 2932 0 08:52:34 ? 00:50 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Djava.secu
oracle 3069 2932 0 08:53:36 ? 00:26 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Dsun.rmi.d
oracle 3046 2932 0 08:53:19 ? 00:18 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Xmx256m -D
oracle 2701 2700 0 08:50:52 pts/0 00:45 /u04/app/oracle/product/OEM/jdk/bin/PA_RISC2.0/java -Xnoclassgc
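
Note that the listings above also pick up the grep command itself, and the command column is truncated. A variant that matches only on the executable name and prints each process with its parent PID might look like the sketch below. The function names `filter_by_exe` and `java_procs` are hypothetical, and it again assumes the XPG4 `-o` options of ps that UNIX95 enables on HP-UX.

```shell
# filter_by_exe PATTERN
# Reads "PID PPID ARGS..." lines on stdin and keeps those whose
# executable (the third field) matches the regular expression PATTERN.
# Matching on the third field only means a "grep java" or an awk whose
# arguments mention java is not reported.
filter_by_exe() {
    awk -v pat="$1" '$3 ~ pat'
}

# java_procs
# Lists PID, PPID and command line of every process whose executable
# is named java, with or without a leading path.
java_procs() {
    UNIX95=1 ps -e -o pid= -o ppid= -o args= | filter_by_exe '(^|/)java$'
}
```

Running `ps -fp` on a PPID that `java_procs` reports (2932 in the listing above) shows the parent that started those JVMs, which is usually enough to tell which product owns them.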
Alzhy
Honored Contributor

Re: statdaemon and very high load

Then Java can be discounted as the culprit for the high load.

What does glance tell you? If you have glance, display the processes that have high I/O rates. Maybe there are clues in there. Or how about stopping the ISEE agents? Again, I think you have ISEE on your system.
Hakuna Matata.
Coolmar
Esteemed Contributor

Re: statdaemon and very high load

Stupid question...where do you stop ISEE? I can't find it anywhere in the init files.