Operating System - HP-UX

Lag in output from "ps -ef"

 
TrustNo1
Regular Advisor

Occasionally during the day, if I run "ps -ef" and grep for a significant number of processes, I see a 2-5 second "hang" in the terminal output. I have run "bdf" at the same time (in a separate window), and bdf returns immediately while the ps -ef is still hung. Sometimes this also happens when an "ll" is performed. I do see a degradation in the %rcache and %wcache output from "sar -b 10 5" during these episodes. Normal for %rcache is 90-97% and normal for %wcache is 85-90%. During the episodes, %rcache might dip to 55-80% and %wcache to 65-87%.
Any ideas?
The only recent change was a data migration to a new EMC box. "sar -d" shows the I/O is evenly spread over two HSC-FWD SCSI cards and never approaches the sustainable MB transfer limit.
Thanks, joe
Dare to Dream
Madhu Sudhan_1
Respected Contributor

Re: Lag in output from "ps -ef"

Joseph:
I had a similar problem with ll hanging: after I tar-ed 45 GB of data into a single directory and ran ll on that directory, it hung, and even "root" couldn't kill it, as Glance showed it in kernel priority mode. I think this is a sure sign of filesystem corruption. I suggest running fsck or rebooting the system.
Hope this helps.

...Madhu
Think Positive
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

One additional note, while the "ps -ef" or "ll" is hung, Glance-Plus shows the process as sleeping!
~jdk
Dare to Dream
Madhu Sudhan_1
Respected Contributor

Re: Lag in output from "ps -ef"

What about CPU utilization ?

...Madhu
Think Positive
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

CPU is showing no more than 80%-85% (the highest across the three processors).
~jdk
Dare to Dream
Madhu Sudhan_1
Respected Contributor

Re: Lag in output from "ps -ef"

It looks to me like it may be a CPU bottleneck.

You should also check whether that is system-mode or user-mode percentage.
If it is user-mode percentage, then reduce the number of applications you are running on the server or, if you have scheduled batch jobs, reschedule them to non-peak hours.

If it is system-mode percentage, you need to investigate what the system is doing. Why is it spending so much time in system mode? Maybe it is thrashing excessively and doing nothing other than swapping. If so, then it is a memory bottleneck.

I think perfview is your best bet to know what is going on.

Hope this helps.

...Madhu
Think Positive
Dan Hetzel
Honored Contributor
Solution

Re: Lag in output from "ps -ef"

Hi Joseph,

Are you running NIS or NIS+?
'ps -ef' looks up usernames, and if your NIS server is slow to respond, this could introduce the delay.
The same applies to 'll', as UIDs are converted to user names.

Did you ever try a 'ps -e' instead? If it runs without delay, that will confirm what I said above. You could also replace your 'll' with a simple 'ls' (not 'ls -l') to investigate further.
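
The comparison above can be put side by side with a small timing helper. This is only a sketch: the function name `elapsed` is made up for illustration, it assumes a `date` that supports `+%s` (older HP-UX releases may not, in which case prefixing each command with `time` or `timex` gives the same information), and /tmp is an arbitrary directory to list.

```shell
# elapsed CMD [ARGS...]: run CMD, discard its output, print the whole
# seconds it took. Hypothetical helper; assumes `date +%s` is available.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || :
    end=$(date +%s)
    echo $((end - start))
}

# No username lookup vs. username lookup (the latter may go through NIS).
echo "ps -e : $(elapsed ps -e)s"
echo "ps -ef: $(elapsed ps -ef)s"

# Plain listing vs. long listing (the latter maps UIDs to names).
echo "ls    : $(elapsed ls /tmp)s"
echo "ls -l : $(elapsed ls -l /tmp)s"
```

If the "-ef" and "-l" variants are consistently slower by a few seconds while the plain ones return at once, that points at the name lookup rather than at disk, memory, or CPU.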

Best regards,

Dan
Everybody knows at least one thing worth sharing -- mailto:dan.hetzel@wildcroft.com
Alan Nicholson
Occasional Advisor

Re: Lag in output from "ps -ef"

I would agree with some of the other posts here, this is most likely to be a memory bottleneck of some description.

If your system isn't actually "out of memory", you could try increasing the dbc_max_pct parameter in the kernel to allow the system to allocate more memory for disk buffers. You may also want to look at the ninode parameter to allow the kernel to "remember" more files (which may slightly alleviate the hang in ps and ll).


It's worth bearing in mind two further points:

1) Lack of memory often manifests itself as a CPU bottleneck.
2) There will always be a bottleneck somewhere in the system; e.g., you have probably created a memory bottleneck by installing the EMC to remove a disk bottleneck. The best you can do is try to decide where you can tolerate the problem.
In time, on budget, good quality. Pick any two.
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

Dan!
Give that man a Kewpie doll!
You nailed the problem right on the head!
"ps -ef" hung while "ps -e" spewed very fast.
The same lag shows up with "ypcat" and login attempts.
Other than myself and our Oracle DBA seeing the same lag, I haven't heard of any user problems, and I couldn't see any resource shortage.
Now I have to figure out what changed (NIS, I guess) to create the problem. Possibly a NIS server enhancement requires me to visit the patch barn.
Man o' man I can't thank you enough......
Thank you to everyone that replied!!
I hope I can be as helpful to you some time!!!
~jdk
Dare to Dream
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

Alan,
I didn't get to mention we have 3.4GB of memory and 4.57GB of swap....I was reeeealllly hoping memory was not a problem.....
Thank you for your timely response!
~jdk
Dare to Dream
Dan Hetzel
Honored Contributor

Re: Lag in output from "ps -ef"

Hi Joseph,

Glad it helped !

You'll most probably have to get the latest patch for ONC/NIS. As you didn't say whether you are running 10.20 or 11.0, here are the two links:
10.20:
http://europe-support2.external.hp.com/wpsl/bin/doc.pl/screen=wpslDisplayPatch/sid=bc6739711a970469a1?PACH_NAM=PHNE_21375&HW=s800&OS=10.20

11.0:
http://europe-support2.external.hp.com/wpsl/bin/doc.pl/screen=wpslDisplayPatch/sid=3918536e01da56f681?PACH_NAM=PHNE_22125&HW=s800&OS=11.00


Best regards,

Dan
Everybody knows at least one thing worth sharing -- mailto:dan.hetzel@wildcroft.com
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

One last note worthy of mention......
dbc_max_pct and dbc_min_pct are both set to 2.
The reasoning here is to let the EMC do the caching, as it seems to be faster. The EMC has 3GB of cache memory. The 2% setting was determined by tweaking and watching it over many weeks.
Dare to Dream
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

Yes, disclosure of the O.S. version would have been a prudent move....HP-UX 10.20.
Thanks for the link!
Have a GREAT day!!
~jdk
Dare to Dream
TrustNo1
Regular Advisor

Re: Lag in output from "ps -ef"

Final Resolution:
Apparently, on the last reboot (Sunday), ypbind picked a "slave" NIS server that had a poor network connection. After changing YPBIND_OPTIONS from "" to "-ypset" and YPSET_ADDR from "" to "KnownGoodNISservADDR", I forced the change by stopping and restarting nis.client. This fixed the problem completely!
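
For anyone finding this later, the change would look roughly like the fragment below. It is a sketch, not a copy of the actual file: it assumes the two variables live in /etc/rc.config.d/namesvrs and that the client script is /sbin/init.d/nis.client (the post names neither path), and KnownGoodNISservADDR is the placeholder used above, not a real address.

```shell
# /etc/rc.config.d/namesvrs  (assumed location; not named in the post)
YPBIND_OPTIONS="-ypset"              # was ""
YPSET_ADDR="KnownGoodNISservADDR"    # was ""; placeholder, use your own server

# Then, as root (assumed script path):
#   /sbin/init.d/nis.client stop
#   /sbin/init.d/nis.client start
#   ypwhich     # should now report the server set in YPSET_ADDR
```

Pinning the client to one server gives up ypbind's automatic choice among servers, so the server named in YPSET_ADDR needs to stay reachable.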
~jdk
Dare to Dream
Dan Hetzel
Honored Contributor

Re: Lag in output from "ps -ef"

Well done, Joseph !!

Dan
Everybody knows at least one thing worth sharing -- mailto:dan.hetzel@wildcroft.com