Operating System - Linux

Very high load peaks on a DL740 with RH AS 3

 
Domenico Viggiani
Super Advisor

Very high load peaks on a DL740 with RH AS 3

Hi,
I'm experiencing very high load peaks (even 40-50!) that last only a few seconds.
During these peaks, user CPU is usually low but system CPU is ~80-99%.
The system is a DL740 with 8 CPUs and 32 GB of RAM running Red Hat Advanced Server 3.0 (kernel 2.4.21-9.0.3.ELhugemem), connected by a 2 Gbps SAN to an EVA3000 disk array.
It is an Oracle-only machine; no other significant processes are running.
I tuned it using the usual parameters from Oracle (if needed, I can attach a list of kernel parameters).
There seems to be no exceptional I/O or paging/swap activity during the peaks, but I'm not sure (I'm using dstat - http://dag.wieers.com/home-made/dstat/ - to monitor several metrics at the same time).

How can I identify what the system is doing when the load is high?

Any help will be appreciated.
16 REPLIES
Ivajlo Yanakiev
Respected Contributor

Re: Very high load peaks on a DL740 with RH AS 3

lsof
top
ps -ef |more
sar -d 1 100
vmstat
Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

It is not that simple.
I'm using dstat because it summarizes the results from all these utilities, so I can look at load, CPU, I/O, interrupts, context switches, paging and swapping at the same moment.
But I'm unable to understand what the system is doing during the peaks.
Vitaly Karasik_1
Honored Contributor

Re: Very high load peaks on a DL740 with RH AS 3

> How can I identify what system does when load is high?

- Do you have any cron jobs?
- Do you have any scheduled tasks in Oracle?
Mike Jagdis
Advisor

Re: Very high load peaks on a DL740 with RH AS 3

32GB and Oracle suggests you will have serious amounts of cached disk data. One question: are you running the HP daemons to monitor the system? If yes, I know why, and even have a workaround :-)
Mike Jagdis
Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Ok, this isn't the only "something sucking system time" thread so I'll paste the same here as in the others...

-----------------------------------------------
The explanation is somewhat involved...

If you trace cma*d you'll find that it doesn't do anything but open the device, ioctl, close. Admittedly rather more times than should be necessary but that's just incidental bad design.

You'll find the delay - and system time consumption - seems to happen on the close. From here you need a fairly good working knowledge of the Linux kernel...

Ok? still with me then?

Run oprofile for a while and you'll find the cpu time is being consumed by invalidate_bdev. Which is interesting :-).

Invalidate_bdev is called from kill_bdev. Kill_bdev is called from the block device release code. Release is what happens on last close. Now the monitoring daemon is opening the unpartitioned disk device which it is pretty certain nothing else has open. (Off hand I'm not sure if even having an fs on the device counts as it being open. There are subtle differences and I *think* I'm right in saying that block device access and fs access is considered different at this level. Don't quote me or blame me!)

So, each close triggers invalidate_bdev. Why is this so bad? Well, the idea is that when the last close happens on a device you need to flush any cached data because, with much PC HW, you can't be sure when the media gets changed. Invalidate_bdev isn't *meant* to be called often. It works by scanning through the entire list of cached data for block devices to find and drop data related to the device being closed. So it sucks system time and the amount is proportional to the amount of cached (from any device) data you have.
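
Since the cost scales with the amount of cached data, it can help to check how much the kernel is caching at the moment. A minimal sketch, assuming the usual Buffers: and Cached: lines of /proc/meminfo (cache_kb is a made-up helper name, not an existing tool):

```shell
# Sum the Buffers and Cached lines (in kB) from meminfo-style input on stdin.
# cache_kb is a hypothetical helper name used only for illustration.
cache_kb() {
    awk '$1 == "Buffers:" || $1 == "Cached:" { sum += $2 } END { print sum + 0 }'
}

# Usage on a live system:
#   cache_kb < /proc/meminfo
```

The larger this number, the more there is to scan on each last close, if the explanation above applies to your system.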

WORKAROUND:
All you need to do is to make sure that each time the cma*d daemon closes the device it isn't the *last* close - i.e. some other process has the device open. The other process doesn't even need to *do* anything. Try something along the lines of:

sh -c 'kill -STOP $$' < /dev/cciss/c0d0 > /dev/null 2>&1 &

Hope that's all clear! (As mud... :-) )

(HP: As well as blind debugging I do Linux & OSS consultancy. I happen to know the answer to this one as it came up at a major investment bank...)
Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Thank you for all responses.
I know that blind debugging is not easy, but I'm sure that on these forums there are a lot of smart people who can give me some advice.

Vitaly: I have no cron jobs and no Oracle scheduled tasks.

Mike: I'm beginning to hate the Linux VM! It caches a lot of file data and fills 32 GB even if you just run 'ls'! It is the 'if you bought RAM, use it' policy, but I don't like it.
I don't run any HP utility. Can you give me some details?
xyko_1
Esteemed Contributor

Re: Very high load peaks on a DL740 with RH AS 3

Hi Domenico,

What is your backup policy? Do the peaks happen around your backup time?

Are you running rman? rman is a very CPU-intensive task.

Is it possible for you to start top right at the moment of a peak, to see which task is consuming the most?

regards,
Xyko
Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

xyko, I'm absolutely sure that no extra task is running.

I'm not ruling out that Oracle is at fault: sometimes I see 10-15 oracle processes running for a few seconds, bringing the load to 8-15 and then disappearing.
But is it normal for such a workload to drive a big system like this to these peaks?
Vitaly Karasik_1
Honored Contributor

Re: Very high load peaks on a DL740 with RH AS 3

I suggest you run "top" via cron every minute and save its output, so that we'll be able to
catch the process that is eating CPU/RAM.
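
A rough sketch of such a snapshot script, assuming a top that supports batch mode (-b) and an iteration count (-n). The helper name snapshot and the log path are placeholders:

```shell
# Append a timestamped copy of a command's output to a log file.
# $1 is the log file; the remaining arguments are the command to run.
snapshot() {
    log=$1
    shift
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        "$@"
    } >> "$log" 2>&1
}

# Example call (hypothetical log path):
#   snapshot /var/tmp/top.log top -b -n 1
```

Saved as a script and called from a once-a-minute crontab entry, this would append one top snapshot per minute. Peaks shorter than the interval can still slip between two snapshots.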

Which driver do you use for storage?

Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Vitaly,
sometimes the processes run for only a few seconds.

I use driver 7.00.03 (from HP): I'm planning to upgrade both kernel and driver to the latest versions (2.4.21-20 and 7.01.01, respectively).

xyko_1
Esteemed Contributor

Re: Very high load peaks on a DL740 with RH AS 3

Domenico,

you have a situation that needs deep inspection. Some time ago I suggested acct for a similar problem, and I think it may help you too.

Please look at my last reply in http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=713322

Hope it helps you.

Regards,
Xyko
Mike Jagdis
Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Ok, the next step is to figure out what processes are involved. They aren't necessarily ones that are runnable - since it's a spike in system cpu the process(es) triggering it may well be blocked, in 'S' or more likely 'D' state.
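
One way to spot such processes is to filter ps output on the state column. A minimal sketch, assuming a procps-style ps that accepts -eo with the stat, pid, wchan and comm columns (filter_dstate is a made-up name):

```shell
# Keep the header line plus every process whose state starts with D
# (uninterruptible sleep). Reads ps-style output on stdin.
filter_dstate() {
    awk 'NR == 1 || $1 ~ /^D/'
}

# Usage on a live system; the wchan column hints at where in the kernel
# each process is sleeping:
#   ps -eo stat,pid,wchan,comm | filter_dstate
#
# Because the peaks are short, it may be worth sampling once a second:
#   while sleep 1; do date; ps -eo stat,pid,wchan,comm | filter_dstate; done
```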

Then you need to know what's going on. Install oprofile (get it from oprofile.sf.net and build it if necessary). Run it for a while and then use opreport to examine the suspect processes and see what bits of the kernel are getting hammered.

I still suspect a linear scan of buffer heads for some reason. I'm not that familiar with Oracle set up but you should be mounting filesystems it uses with the noatime option and you should have adjusted the bdflush sysctl values to spread I/O out rather than trying to do it in bursts?
Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

xyko, Mike,
I'm now running acct (installed by default in RH) and I'll take a look at oprofile too.
Now I'm seeing a lot of Oracle processes in D state, for only a few seconds, bumping the load up to 15; what is D state?

Anyway, I'm currently using the following (default) values for bdflush:
50 500 0 0 500 3000 80 50 0
but I tried also:
60 2000 0 0 500 3000 87 50 0
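
For reference, the nine numbers can be labelled as follows. This is only a sketch: it assumes the 2.4 kernel's field order (nfract, ndirty, two unused slots, interval, age_buffer, nfract_sync, nfract_stop_bdflush, one unused slot), so check the documentation shipped with your kernel before relying on it. label_bdflush is a made-up helper name:

```shell
# Print the nine bdflush values read from stdin with their assumed
# 2.4-kernel field names. interval and age_buffer are in jiffies.
label_bdflush() {
    read -r nfract ndirty unused1 unused2 interval age_buffer nfract_sync nfract_stop unused3
    printf 'nfract=%s\n' "$nfract"
    printf 'ndirty=%s\n' "$ndirty"
    printf 'interval=%s\n' "$interval"
    printf 'age_buffer=%s\n' "$age_buffer"
    printf 'nfract_sync=%s\n' "$nfract_sync"
    printf 'nfract_stop_bdflush=%s\n' "$nfract_stop"
}

# Usage on a 2.4 system:
#   label_bdflush < /proc/sys/vm/bdflush
```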

Do you really suggest disabling atime updates for the Oracle filesystems? Can you point me to an 'official' reference for this?

I'm beginning to think that perhaps a 32bit architecture is very inefficient with a lot of RAM :-(
Mike Jagdis
Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Try googling for "linux oracle tuning" and you'll find plenty of "advice". Probably http://www.oracle.com/technology/oramag/webcolumns/2002/techarticles/scalzo_linux01.html is more reasonable than some :-)

D state is uninterruptible sleep. 99% of the time that means waiting for disk I/O of some description.

Practically nothing uses atime so it's generally the first to go on loaded filesystems. Most fs' put the inode table at one end of the disk, away from the data so updating atime tends to encourage head movement for no reason.
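
To see which filesystems are still updating atime, the mount options can be checked. A small sketch, assuming /proc/mounts-style input (device, mount point, type, options). mounts_without_noatime is a made-up name and /u01 is a placeholder mount point:

```shell
# Print the mount point of every entry whose option list lacks noatime.
# Reads /proc/mounts-style lines on stdin; pseudo filesystems such as
# proc will show up too and can be ignored.
mounts_without_noatime() {
    awk '$4 !~ /(^|,)noatime(,|$)/ { print $2 }'
}

# Usage on a live system:
#   mounts_without_noatime < /proc/mounts
#
# Remounting one filesystem with noatime (as root, placeholder path):
#   mount -o remount,noatime /u01
```

Adding noatime to the options column in /etc/fstab makes the change survive a reboot.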

And, yeah, large amounts of memory on a 32bit system is not great for performance. For one thing the address extension from the cpu side is something of a hack. For another, if you don't have 64bit PCI with a 64bit PCI controller and drivers that know about 64bit capable PCI, *every* I/O will involve copying to/from bounce buffers in the lowest 4GB...
Domenico Viggiani
Super Advisor

Re: Very high load peaks on a DL740 with RH AS 3

Solved!!!

It was a known issue with my kernel and Oracle version.

Oracle doc 262004.1:

- Much higher system time both while running and during connect/disconnect.
The problem gets worse as more users concurrently connect and disconnect.
The high system time can cause system instability depending on what's
running on the machine. If you are facing this issue you should:

1. Update to RHEL 3 U3

2. export DISABLE_MAP_LOCK=1 (set this so that oracle and the listener
inherit this)

3. Install the patch 3596858 (available for 9.2.0.4 & 9.2.0.5,
fixed on 10g)
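
Whether a running process actually inherited DISABLE_MAP_LOCK can be checked from its environment in /proc. A minimal sketch: env_has_var is a made-up helper, and the PID in the usage line is a placeholder:

```shell
# Succeed if the NUL-separated environment file $1 defines variable $2.
env_has_var() {
    tr '\000' '\n' < "$1" | grep -q "^$2="
}

# Usage (as root or the process owner), with 1234 standing in for the
# PID of an oracle or listener process:
#   env_has_var /proc/1234/environ DISABLE_MAP_LOCK && echo "inherited"
```

/proc/PID/environ shows the environment the process was started with, so the variable has to be exported before the listener and the database are started.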

Also:

bug 3570979:

PERFORMANCE PROBLEM (HIGH SYS TIME) WHEN USING REMAP_FILE_PAGES() ON RAMFS



Bye

Domenico Viggiani
xyko_1
Esteemed Contributor

Re: Very high load peaks on a DL740 with RH AS 3

Congratulations Domenico,

and thanks for posting the solution.

Regards,
Xyko