Operating System - HP-UX

Re: fuser uses too much CPU

 
Fedon Kadifeli
Super Advisor

Re: fuser uses too much CPU

Dennis,

The system has 32 GB of memory, and:

# kctune | grep filecache
filecache_max 979635732 3% Imm (auto disabled)
filecache_min 326545244 1% Imm (auto disabled)


The system was last patched 5 months ago with the HP recommended patch bundle.
Johnson Punniyalingam
Honored Contributor

Re: fuser uses too much CPU

Check the link below for tips on filecache:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1287943

I checked on one of my servers (a dev database server):

Phys Mem : 8.0gb

# kctune |grep filecache
filecache_max 651795824 8% Imm (auto disabled)
filecache_min 244423434 3% Imm (auto disabled)
Problems are common to all, but attitude makes the difference
MaSa_3
Advisor

Re: fuser uses too much CPU

Hello Fedon,
I am currently facing a similar problem, which is how I found your posting.

I have six production HP-UX boxes (11.11) running almost the same application software, generally with very low system load.
Three of the boxes were patched about 3 months ago with HP recommended patches that are about 5 months old; on these, fuser runs very slowly with high CPU load in system mode.

The three other boxes were patched more than a year ago and have also been up for more than a year; those systems do not show this issue.
On them you can run fuser without impacting system performance. I also have another test system where I can do what I like, but I cannot reproduce the problem there.

Rebooting one of the impacted boxes is not an option at this stage of the analysis.

A wild guess: a specific patch combined with an uptime longer than about 100 days.
We first noticed the issue about two weeks ago in a file system cleanup job that uses fuser and had never used significant CPU resources before.

What I noticed is that the fuser command is running "pstat" system calls in a loop, changing just the last parameter until the function returns with return code "0".
It then continues with another "pstat" call (see the tusc output below).
On the boxes that are OK this takes just a few iterations; on the others it sometimes takes several tens of thousands of iterations.

I have just started the analysis and will keep you informed of the progress.


#time /usr/sbin/fuser /etc/hosts
/etc/hosts:

real 0m51.06s
user 0m0.04s
sys 0m50.50s
#what /usr/sbin/fuser
/usr/sbin/fuser:
fuser.c $Date: 2004/03/27 22:08:43 $Revision: r11.11/2 PATCH_11.11 (PHCO_30075)
$Revision: @(#) all R11.11_BL2004_0422_1 PATCH_11.11 PHCO_30075
Thu Apr 22 05:57:31 PDT 2004 $

#tusc /usr/sbin/fuser /etc/hosts

....snip....
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27324) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27325) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27326) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27327) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27328) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27329) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27330) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27331) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27332) ....................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 24924, 27333) ....................... = 1

... snip.....
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 83) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 84) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 85) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 86) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 87) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 88) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 89) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 90) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 91) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 92) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 93) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 94) .......................... = 1
pstat(PSTAT_PROC_VM, 0x7eff3f80, 104, 11571, 95) .......................... = 0
pstat(PSTAT_PROC, 0x7eff0aa8, 1176, 10, 184) .............................. = 0
write(2, "\n", 1) ......................................................... = 1
sysconf(_SC_CPU_VERSION) .................................................. = 532
brk(0x400015f0) ........................................................... = 0
brk(0x400035d8) ........................................................... = 0
brk(0x40006000) ........................................................... = 0
ioctl(1, TCGETA, 0x7eff0cd8) .............................................. ERR#25 ENOTTY

write(1, "\n", 1) ......................................................... = 1
exit(0) ................................................................... WIFEXITED(0)
#
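
A long trace like the one above is easier to read when condensed to one line per PID. The following is only a rough sketch, assuming the trace has been saved to a file and that every line has the layout shown above (PID as the fourth argument, region index as the fifth). The helper name summarize_pstat is made up for illustration and is not part of tusc.

```shell
#!/bin/sh
# Summarize PSTAT_PROC_VM calls in a tusc trace read from stdin.
# Assumed line layout (taken from the trace in this thread):
#   pstat(PSTAT_PROC_VM, <buf>, <size>, <pid>, <index>) .... = <ret>
# summarize_pstat is a hypothetical helper name.
summarize_pstat() {
    awk '
        /^pstat\(PSTAT_PROC_VM,/ {
            line = $0
            sub(/\).*/, "", line)          # drop everything from the closing paren on
            n = split(line, arg, /, */)    # arg[4] = PID, arg[5] = region index
            if (n < 5) next
            pid = arg[4]
            idx = arg[5] + 0
            calls[pid]++
            if (!(pid in max) || idx > max[pid]) max[pid] = idx
        }
        END {
            # Note: the order of PIDs in the output is not defined.
            for (pid in calls)
                printf "pid=%s calls=%d highest_index=%d\n", pid, calls[pid], max[pid]
        }
    '
}
```

Usage would be something like `summarize_pstat < trace.txt`, where trace.txt is a placeholder name for the saved trace. A PID with a very high highest_index is the process whose regions fuser spends its time walking.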
Fedon Kadifeli
Super Advisor

Re: fuser uses too much CPU

Hi MaSa,

Although my system is 11.31 (not the same as yours), I think we are facing the same problem.

I have seen a high number of pstat system calls in Glance.

The problem may not be directly related to the fuser executable itself, which has not changed for quite some time:

# what /usr/sbin/fuser
/usr/sbin/fuser:
$Revision: 92453-07 linker linker crt0.o B.11.16.01 030415 $
$Revision: B.11.31_LR

# ll /usr/sbin/fuser
-r-xr-xr-x 1 bin bin 32768 Feb 15 2007 /usr/sbin/fuser


MaSa_3
Advisor

Re: fuser uses too much CPU

Hi,
you are right: it is not the same HP-UX version, it is not the same fuser executable, and the problem is not related to that executable.

I just noticed that the "lsof" command shows the same behavior.
It takes minutes of CPU time in system mode to run to completion.

lsof and fuser need the same data from the HP-UX kernel; lsof needs even more.
The lsof trace also shows the same pstat loops as fuser.

Do you see the same behavior?

I am not really sure what I should learn from these pstat calls.
Return Value "1" is usually "EPERM" -- insufficient permission/not super user.
Is it busy-waiting for some locked resources/system tables to become available?

What I will try next is to shut down some subsystems not related to end-user business and see if things change.

Ideas welcome! Thanks in advance!
MaSa_3
Advisor
Solution

Re: fuser uses too much CPU

Hi,
shutting down unneeded subsystems pushed me closer to a solution!

Stopping and restarting "utild", which is started from "init", solves the problem.
After killing this daemon, "fuser" and "lsof" behave normally and no longer consume significant CPU resources.
Restarting the daemon did not bring the problem back.

Does your system run utild as well?

If your inittab looks like mine, you can just kill the running utild and init will restart it (see the commands below).

It works on all of my impacted systems.

The next step might be to look for patches or updates for that piece of software, but not today!

So I recommend giving it a try!
Good Luck!

# uptime
1:18pm up 116 days, 3:45, 5 users, load average: 0.05, 0.07, 0.08
# time /usr/sbin/fuser /etc/hosts
/etc/hosts:
real 0m40.56s
user 0m0.04s
sys 0m40.37s <<<<<<<<<<<<< Before !!!!!

# grep utild /etc/inittab
util:23456:respawn:/usr/lbin/utild # Utilization Provider daemon

# ps -ef | grep utild
root 26952 1 0 Jan 12 ? 192:29 /usr/lbin/utild
# kill 26952
# ps -ef | grep utild
root 13647 1 0 16:00:27 ? 0:00 /usr/lbin/utild
# time /usr/sbin/fuser /etc/hosts
/etc/hosts:
real 0m0.04s
user 0m0.00s
sys 0m0.03s <<<<<<<<<<<<<<<<<< After !!!


#what /usr/lbin/utild
/usr/lbin/utild:
01.08.03.01
#swlist -l file | grep utild
utilProvider.UTIL-ENG-A-MAN: /usr/share/man/man1m.Z/utild.1m
utilProvider.UTIL-RUN: /usr/lbin/utild
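
To pick the daemon's PID without also matching the `grep utild` process itself, the `ps -ef | grep` step above could be replaced by a small filter. This is only a sketch under assumptions: pids_of is a made-up helper name, and the filter guesses that the cumulative TIME column is the first field of the form minutes:seconds and that the command path follows it directly, as in the listings in this thread.

```shell
#!/bin/sh
# pids_of CMD: read `ps -ef` output on stdin and print the PID of every
# process whose command (first word after the TIME column) is exactly CMD.
# pids_of is a hypothetical helper name. The TIME column is located
# heuristically because STIME may be one field ("16:00:27") or two ("Jan 12").
pids_of() {
    awk -v cmd="$1" '
        {
            for (i = 5; i < NF; i++)
                if ($i ~ /^[0-9]+:[0-9][0-9]$/) {   # cumulative TIME, e.g. 192:29
                    if ($(i + 1) == cmd) print $2   # $2 is the PID column
                    break
                }
        }
    '
}
```

For example, `ps -ef | pids_of /usr/lbin/utild` should print only the daemon's PID, which can then be passed to kill; with the respawn entry in inittab shown above, init starts a fresh utild afterwards.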
Fedon Kadifeli
Super Advisor

Re: fuser uses too much CPU

Yes MaSa!

You are right! utild seems to be the culprit.

# time lsof >/tmp/lo

real 2:31.9
user 1.6
sys 2:28.9

# grep utild /etc/inittab
util:23456:respawn:/usr/lbin/utild # Utilization Provider daemon

# ps -ef | grep utild
root 3528 1 0 Nov 4 ? 1526:55 /usr/lbin/utild
root 9965 10626 0 17:38:10 pts/2 0:00 grep utild

# kill 3528

# ps -ef | grep utild
root 9987 1 2 17:38:18 ? 0:00 /usr/lbin/utild
root 10002 10626 0 17:38:19 pts/2 0:00 grep utild

# time lsof >/tmp/lo

real 3.7
user 1.5
sys 2.1

# what /usr/lbin/utild
/usr/lbin/utild:
$Revision: 92453-07 linker linker crt0.o B.11.16.01 030415 $
HP aC++ B3910B X.03.37.01 C++ Standard Library (RogueWave Version 1.2.1)
HP aC++ B3910B X.03.37.01 Classic Iostream Library
HP aC++ B3910B X.03.37.01 Language Support Library
01.08.01.01

# swlist -l file | grep utild
utilProvider.UTIL-ENG-A-MAN: /usr/share/man/man1m.Z/utild.1m
utilProvider.UTIL-RUN: /usr/lbin/utild

# ll /usr/lbin/utild
-r-xr-xr-x 1 bin bin 806912 Nov 16 2008 /usr/lbin/utild

# time fuser -u /var/adm/crash
/var/adm/crash:

real 0.0
user 0.0
sys 0.0
Dennis Handly
Acclaimed Contributor

Re: fuser uses too much CPU

>MaSa: What I noticed is that the fuser command is running "pstat" system calls in a loop, changing just the last parameter until the function returns with return code "0".

Yes, that's how you can call pstat_getprocvm(2) and iterate over the segments for PID 24924. And it seems that this PID has way too many memory regions, > 27333!

>I am not really sure what I should learn from these pstat calls.

You can know exactly what it is doing. It is probably scanning for mmapped files?

>Return Value "1" is usually "EPERM"

No, it means it returned one memory region.

>Stopping and restarting "utild"

And that was PID 24924?
Fedon Kadifeli
Super Advisor

Re: fuser uses too much CPU

Yes,

It seems that utild has many memory regions (MEMMAP). I checked from Glance.

Is this a general behavior for utild?
Dennis Handly
Acclaimed Contributor

Re: fuser uses too much CPU

>Is this a general behavior for utild?

I'm not sure. It seems like an awful lot. You should contact the Response Center so you can mention this problem with utild(1m) and fuser/lsof.