sar cpu stats - confused!

Hopefully someone out there can help clarify something strange for me. I've attached some sar -q and sar -u output for a typical backup window on our production GS80 (Tru64 5.1 PK5). During this window we run our backups (AdvFS clones streamed to 4 tape drives via vdump) as well as some production reporting. I realise that running production reporting and backups at the same time will cause all sorts of performance issues, but that's the way it has to be.

The application is a fairly tubby Oracle 9i database. The reports are fairly intense SQL which could probably do with some tuning, but that's not my question.

All things considered, the box runs very well during this interval and I have no real issues with the way it's tuned. My confusion is over what sar -q is reporting: the run queue figures look unreasonably high for this box, yet no real CPU problems are evident.

What exactly does the run queue figure mean in this context?
3 REPLIES
Joris Denayer

Re: sar cpu stats - confused!

Hi Stuart,

I don't know the sar output very well, but these values do seem pretty high.
How many CPUs/RADs are configured on your GS80?

Do you see the same high run queue values with "vmstat 5" or "collect -sc -i5"?

Have you always had these high numbers, or is this a recent phenomenon?

Joris

To err is human, but to really foul things up requires a computer

Re: sar cpu stats - confused!

Thanks for your reply, Joris. The host has 5 1001 MHz CPUs: 3 in one quad, 2 in the other.

This has been happening for some time, at least since the upgrade from 4.0G to 5.1 (we didn't run sar before upgrading). The host itself runs well, so I'm assuming the stats are not reflective of actual performance but are some sort of aberration.

I'll collect some more stats tonight using collect and vmstat and post them tomorrow.

Re: sar cpu stats - confused!

Last night I gathered some more statistics for the offending period. The attached file shows the following:

a. sar -q output as collected by sa1 from cron every 10 minutes
b. output from sar -q 5, with stdout captured to a file
c. output from collect -sc -i 5, captured to a file.

It looks to me as though there's a fundamental difference between sar and sa1, either in the information being captured or in the way it's reported. Neither of them agrees with collect.
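
One way to put 5-second samples on the same footing as 10-minute sa1 records is to average the short samples over matching 10-minute windows before comparing. The following is only a sketch under stated assumptions: the `(seconds, runq)` tuple layout and the function name `bucket_means` are invented for illustration, the sample values are hypothetical, and parsing of real sar or collect output is left out because the attachment format is not shown here.

```python
from collections import defaultdict

def bucket_means(samples, bucket_seconds=600):
    """Average (timestamp_seconds, value) samples per fixed-width bucket.

    Returns a dict mapping each bucket's start time to the mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        # Integer-divide to find the start of the bucket this sample falls in.
        start = (ts // bucket_seconds) * bucket_seconds
        sums[start] += value
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}

# Hypothetical 5-second samples: (seconds since start, run-queue length).
fine = [(0, 3), (5, 5), (10, 4), (600, 6), (605, 2)]
print(bucket_means(fine))
```

If the bucketed averages still disagree badly with the sa1 figures for the same windows, the difference is unlikely to be down to sampling interval alone.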

The collect stats look correct to me: runq is in the region of 3-5, and the user/sys/wait I/O ratios are about what I'd expect.

I think I'll raise a question with the Tru64 engineering guys, as I don't trust the sar data any more.
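
A rough way to sanity-check a run-queue figure is to normalise it by the number of CPUs. The sketch below is illustrative only: the sample values are hypothetical placeholders in the 3-5 range (not taken from the attached output), `runq_per_cpu` is a made-up helper name, and the 5-CPU count comes from the configuration described earlier in the thread.

```python
from statistics import mean

def runq_per_cpu(samples, ncpus):
    """Return the mean run-queue length divided by the CPU count."""
    if ncpus <= 0:
        raise ValueError("ncpus must be positive")
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples) / ncpus

# Hypothetical run-queue samples in the 3-5 range, on a 5-CPU host.
collect_like = [3, 4, 5, 4, 4]
ratio = runq_per_cpu(collect_like, ncpus=5)
print(f"mean runq per CPU: {ratio:.2f}")
```

As a rule of thumb, a ratio at or below about 1 means there are, on average, no more runnable threads than CPUs, which would be consistent with a box that feels healthy.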