- Re: a nice enigma!
05-23-2002 11:53 AM
I have two computers which are configured "exactly" the same (you know what I mean). However, when I run "top" I sometimes see that one is using lots of "nice" CPU & virtually no "user" CPU, while the other is reversed, namely lots of "user" & no "nice"! It is not always consistent, which only adds to the puzzle.
My first thought was that my processes were suffering from priority degradation, which will only get worse with time. However, I thought "nice" & HPUX priorities were separate entities (I could be wrong here).
o Can anyone set me straight on these issues? Explain the issues at hand (in simple, low-syllable-count terms that management have a chance of understanding)?
o Is there a way of fixing the priorities of these processes (at, say, 154 or something) or of stopping them degrading with time? (I cannot use rtprio or rtsched to give them a Real Time or POSIX priority [<127], as this will/may cause a ServiceGuard failover at the busy periods!!! Trust me, I HAVE seen this before.)
o I do give all my advisers points, the more advice, the more points (check out my stats)
Any takers
Tim
05-23-2002 12:12 PM
Re: a nice enigma!
HTH
mark
05-23-2002 12:30 PM
Re: a nice enigma!
I believe the nice column is only showing CPU time for processes that have been "nice-altered", i.e. whose nice value deviates up or down from the default (20).
I could be wrong here.
It could be as simple as a good number of users having started procs in the background, which imposes a nice hit of 5, I think.
Almost everything runs at 20 by default except some of the logging daemons.
What do the actual process lines show - do you have a lot of procs that aren't at 20?
I usually only pay attention to the user & system columns anyway - they tell the story.
Rgds,
Jeff
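To answer the question of how many procs are not at 20, something like the following could be run on each node. It is only a sketch: the function name nice_histogram is made up for illustration, and it assumes the "ps -el" header has a column labelled NI (the columns up to NI are always filled in, so counting fields is safe for that column).

```shell
# nice_histogram: read "ps -el" output on stdin and print "<nice> <count>"
# for each nice value seen, lowest nice value first.
# The NI column is located from the header line rather than hard-coded.
nice_histogram() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "NI") col = i
            next
        }
        # Only count rows where the NI field really is a number.
        col && $col ~ /^-?[0-9]+$/ { count[$col]++ }
        END { for (n in count) print n, count[n] }
    ' | sort -n
}
```

Usage would be "ps -el | nice_histogram" on both nodes. A box where nearly everything sits at 20 and a box with a large group at some other value should stand out immediately.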
05-23-2002 01:06 PM
Re: a nice enigma!
1 - The computers are built identically: filesystems, kernels, software, patches, storage, network, cards in slots with the same instance numbers, the whole shooting match. There are occasionally "slight" differences, but this is normally administrator error.
2 - There are no "users" as such. It runs an application that deals with phone call routing/processing/handling; the only people who fiddle are administrators.
3 - By way of an example, there is a "daemon" process called "pmd". On one box it had a nice value of 20 & on the other 24. These processes SHOULD (I will not discount admin error) start automatically using various "identical" configuration files.
4 - The only major variation is that the "services/daemons" (pmd etc.) get restarted at different times, so one machine/set of daemons could have been running for 7 days untouched, whereas the other may only have run for 1 day.
5 - There is a database (Informix), but this runs on its own server/computer & is connected to via the network.
Basically I'm 70% sure there is some priority degradation going on, BUT I thought this had nothing to do with the nice value, as I believe/understand they are separate entities! If I'm right, this implies that someone may be using "renice" on running processes, in which case I need to "re-educate" them urgently. If I'm wrong, then I need to explain why processes seem to have a nice value of > 20, and hopefully fix it (if possible).
Tim
Any more suggestions
05-23-2002 01:12 PM
Re: a nice enigma!
Also, you could have a processor failing, or some other type of bottleneck on the one system that you have not yet seen. This could cause contention for the processes also.
It's just a thought.
Hope it helps
John
05-23-2002 01:14 PM
Re: a nice enigma!
This, most likely, is your culprit. Remember that the nice values are not intrinsic measurements of any one thing, they are relative values of processing time compared to the other processes on the system. Unless you are seeing other symptoms like swap issues or i/o binding, I wouldn't worry too much about it.
HTH
Mark
05-23-2002 01:25 PM
Re: a nice enigma!
Well, if an admin restarts the daemon from the command line & in the background using "&", it would run at a nice of 24.
I was incorrect earlier: "&" imposes a nice hit of 4, not 5.
That would be my educated guess, & the solution would be to "instruct" the admins not to start/restart it using the executable directly but to run the startup script in /sbin/init.d....hopefully it has one.
Rgds,
Jeff
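For anyone wondering where the hit of 4 comes from: as far as I know it is the "bgnice" option of ksh and the POSIX shell, which lowers the priority of background jobs and is normally on in interactive shells. The sketch below shows one way to start something in the background without that penalty. Treat it as an assumption-laden example: the function name start_bg_default_nice is made up, and the startup script in /sbin/init.d remains the safer route.

```shell
# start_bg_default_nice: start a command in the background without the
# extra nice increment that ksh applies to background jobs (bgnice).
# Note: in ksh this turns bgnice off for the rest of the calling shell.
start_bg_default_nice() {
    # Probe in a subshell first: shells that have no bgnice option
    # report an error here, and we then leave the options alone.
    if (set +o bgnice) 2>/dev/null
    then
        set +o bgnice
    fi
    # Run the command given as arguments; its PID is left in $!.
    "$@" &
}
```

After calling it, "$!" holds the PID of the new process, so its nice value can be checked straight away in the NI column of "ps -el".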
05-24-2002 01:13 AM
Re: a nice enigma!
Many thanks for the responses... Here is some more info, attached in the nice.txt file. I have supplied 3 things for each node:
o top, showing how the CPU is split
o glance, showing pmd (The "Daddy" daemon proc)
o glance, showing pmd (The "Daddy" daemon proc) cumulatively
From your answers there seem to be two possibilities:
1 - INITIATION METHOD; node 1 was started as a background process and node 2 was not. As the nice value is inherited (I believe) this will explain the difference.
2 - PRI & NICE DEGRADE; There is some priority degradation, which also degrades the nice value (which I did not think happened, but we live and learn).
Unfortunately there is evidence for either:
o The "niced" node has had pmd running for over 7 weeks and the other has only been running for two weeks.
o A quick check by myself shows that all the procs I checked do indeed have a nice value of 24. These are child procs of pmd.
I will be digging a bit deeper....
Regards
Tim
05-24-2002 01:33 AM
Re: a nice enigma!
Hi Tim,
No two servers are identical. Just as a quick check, is the output from
swlist -l fileset | wc -l
the same on both servers?
Cheers,
Stefan
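A line count only says that the two lists differ, not where. If the output of "swlist -l fileset" from each node is saved to a file, a rough comparison could look like the sketch below. The function name fileset_diff and the file names in the usage line are hypothetical, and it assumes swlist header lines start with "#".

```shell
# fileset_diff: compare two saved "swlist -l fileset" listings and print
# the lines that appear in only one of them.
#   "< line" = only in the first file, "> line" = only in the second.
fileset_diff() {
    a=${TMPDIR:-/tmp}/fsdiff.a.$$
    b=${TMPDIR:-/tmp}/fsdiff.b.$$
    # Drop comment and blank lines, squeeze whitespace, then sort.
    grep -v '^#' "$1" | awk 'NF { $1 = $1; print }' | sort > "$a"
    grep -v '^#' "$2" | awk 'NF { $1 = $1; print }' | sort > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}
```

For example, "fileset_diff node1.filesets node2.filesets" after copying both listings to one machine. Because the revision is part of each line, a fileset installed at different versions shows up on both sides.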
05-24-2002 02:23 AM
Re: a nice enigma!
I also awarded you 3 points, in retrospect this should be more (7), sorry... put a dummy reply in and I'll give you 4 more...
Tim
05-24-2002 02:32 AM
Re: a nice enigma!
Hi Tim,
aha, so they do have different numbers of filesets (patches+software) installed. The only way to ensure the software install is identical is to start by ensuring the same number of installed filesets. Just curious - by how many filesets did they differ?
05-24-2002 02:53 AM
Re: a nice enigma!
sn2b --> 1734
I have checked "patches", which is probably more important, and there are many differences. I'm not wholly convinced by the patch stuff, but I will dig a bit deeper.
On a slightly different tack, I looked at another cluster running similar (but different version) software and found that, despite the fact it had been running for some 7-8 weeks, the priorities are 20.
My current favorite is the background process theory, as ALL the processes that are fathered by pmd have a nice value of 24, even the ones with a priority of 0 (zero)...
Any more thoughts, anyone? Generosity is my middle name....
Tim
05-24-2002 04:07 AM
Re: a nice enigma!
From your 'top' and 'glance' samples, pmd is only active on the 2nd node. It may be interesting to know which processes on the 1st node are consuming CPU "nicely".
Also, if the patches on the two nodes are different, that *may* be the cause. Have you also checked with 'swlist -l fileset -a state' if all patches are configured?
Mladen
05-24-2002 04:27 AM
Solution
Nice values don't change over time. They are set when a process starts or, indeed, inherited from the parent.
The thing that does change is priority (see top). When (Time Shared) processes run, they lose priority, and they regain priority as they wait their turn to run. A process's nice value is used as a factor in calculating how fast a process regains priority.
Priority queues:
 -32 -  -1 : Real time (POSIX)
   0 - 127 : HPUX real time (rtprio)
 128 - 251 : Time Share procs
 252 - 255 : Swapped processes
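If it helps when reading the PRI column, the ranges listed above can be turned into a small lookup. This is just a sketch that encodes the table as given in this post; the function name pri_class is made up for illustration.

```shell
# pri_class: print the scheduling class for a priority value, using the
# ranges from the priority queue table in this post.
pri_class() {
    pri=$1
    if   [ "$pri" -lt -32 ] || [ "$pri" -gt 255 ]; then echo "out of range"
    elif [ "$pri" -le -1 ];  then echo "POSIX real time"
    elif [ "$pri" -le 127 ]; then echo "HPUX real time"
    elif [ "$pri" -le 251 ]; then echo "time share"
    else echo "swapped"
    fi
}
```

So "pri_class 154" reports a time share process, while a priority of 0 falls in the HPUX real time range, whatever the nice value of the process happens to be.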
HtH,
Mark
05-24-2002 04:30 AM
Re: a nice enigma!
You can also check the differences between the files /var/adm/sw/swagent.log on the two nodes.
Another useful check may be the output from 'kmtune'. Any differences may point you further in terms of how the two nodes are different.
As for the CPU utilization, can you list top 2 or 3 processes that consume most of the CPU on each system?
Mladen
05-24-2002 04:35 AM
Re: a nice enigma!
For the usr / sys and nice figures to match, two identical machines running the same jobs would have to have "exactly" the same processes running, at the same point of execution, at the same time.
Even this is unlikely, as the hardware throughput of devices (CPU, memory, etc.), whilst rated the same, is not.
So even if you have processes "NICED" exactly the same on both machines, the value of nice from top or glance will never match.
Paula
05-24-2002 04:36 AM
Re: a nice enigma!
As far as the configured state of the software goes, everything is "configured". There are a few items in the "installed" state, but I can explain these; nothing is "partial" or "corrupt".
regards
Tim
05-24-2002 04:52 AM
Re: a nice enigma!
I have however done the following:
# ps -el | awk '$8=="24"{print $0}'
This shows that ALL the processes started by pmd have a nice value of 24. As I believe nice is an inherited value, I think this is damning evidence that someone either re-niced pmd or started the application as a background process.
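The inheritance argument can be checked from the other direction too: start at pmd and list the nice value of everything beneath it. The sketch below does that from "ps -el" output. It is illustrative only: the name tree_nice is made up, and it assumes the header labels the columns PID, PPID and NI and that the command name is the last field.

```shell
# tree_nice: read "ps -el" output on stdin and print "PID NI COMD" for the
# process whose PID is given as $1 and for all of its descendants.
tree_nice() {
    awk -v root="$1" '
        NR == 1 {
            # Locate the columns we need from the header line.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")  p = i
                if ($i == "PPID") pp = i
                if ($i == "NI")   n = i
            }
            next
        }
        { parent[$p] = $pp; ni[$p] = $n; cmd[$p] = $NF; order[++cnt] = $p }
        END {
            for (k = 1; k <= cnt; k++) {
                pid = order[k]
                # Walk up the parent chain until we reach root or run out.
                depth = 0
                for (q = pid; q != "" && depth < 64; q = parent[q]) {
                    if (q == root) { print pid, ni[pid], cmd[pid]; break }
                    depth++
                }
            }
        }
    '
}
```

Run as "ps -el | tree_nice <pid of pmd>". If every line shows the same nice value as pmd itself, the value was inherited from pmd rather than picked up by each child separately.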
Paula - I'm not sure what you are saying.
a) No two machines are alike therefore you would not expect to see usr/nice the same. I partially agree, but I would not expect to see the pattern in the nice.txt file which is totally reversed.
b) The machines are different, so the nice values will be different. I disagree, I would expect to see a nice value of 20 across the board, it is the same software/binaries (with some minor exceptions)
I'm still figuring that someone started the application in the background or re-niced pmd.
Regards
Tim
05-24-2002 05:01 AM
Re: a nice enigma!
How about nicing the process to what it should be and then monitoring it?
I use a script that picks up certain logins and nices them down, as their routines can cause load problems on my main server.
I am sure that you can modify it to monitor the nice value of your process.
--------------------------------------------
#!/bin/ksh
# Automatically nice down the ftpbbs universe routines
######################################################
# PJFC 2001
######################################################
# renice_uv: find the root-owned universe (uv) process belonging to the
# login whose PID is $1 and nice it down if it is still at the default.
######################################################
renice_uv() {
    login_pid=$1
    # Nothing to do if no login PID was passed in
    [ -n "$login_pid" ] || return 0
    # In "ps -efl" output field 4 is the PID and field 8 the nice value
    ps -efl | grep "$login_pid" | grep -v grep | grep -v sh | grep root | grep uv |
    awk '{print $4, $8}' |
    while read pid ni
    do
        # If nice value = 20 then a restart has occurred so nice it down
        if [ "$ni" = 20 ]
        then
            renice -n 19 "$pid"
        fi
    done
}
######################################################
# Get parent pids (field 7 of "who -u") and check each ftpbbs login
######################################################
for p in `who -u | grep ftpbbs | awk '{print $7}'`
do
    renice_uv "$p"
done
echo "Renice ran"
exit 0
---------------------------------------------
HTH
Paula
05-24-2002 05:08 AM
Re: a nice enigma!
Processes only end up niced if they are niced when started from the command line, or if someone changes them after they have started running.
Nicing is a people thing; HPUX does not just nice processes because it feels like it.
05-24-2002 06:39 AM
Re: a nice enigma!
Many thanks for the script. I was going to manually "renice" pmd then do a "rolling restart" of the app, but I can use the script to "renice" everything with no restart.
It is also good to know that [unlike HPUX timeshare priority] "nice" does not degrade with time. So my original understanding about them being separate was correct.
Many thanks for the input.
Tim