Operating System - Linux
Jose Molina
Advisor

yet another disk io question

Hello,

I've been browsing the forums and gathering some data about how to "trace" an i/o problem with the disks.

There are a lot of questions about this, and after reading the responses I'm still a bit lost about how to trace my problem.

We have a DL380 with disks in RAID1 running RH 7.3 (old, I know). There's a custom application on it, and the developers always see a delay of about 4 ms on every write the app does.

I'm trying to find out whether this is an OS problem or an app problem, and I've started looking at I/O data by following your posts about it.
Running iostat -x 300 10, I get these results:
Device:            rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s  rkB/s  wkB/s avgrq-sz avgqu-sz  await  svctm %util
/dev/cciss/c0d0      0.00   1.34  0.00  2.00   0.00  26.85   0.00  13.43    13.43     0.62 311.00  62.83  1.26
/dev/cciss/c0d0p5    0.00   1.34  0.00  2.00   0.00  26.85   0.00  13.43    13.43     0.62 311.00  62.83  1.26
/dev/cciss/c0d1      0.01   0.07  0.01  0.52   0.19   4.77   0.09   2.39     9.36     0.31 584.91 340.88  1.81


If I read this correctly, a mean await of around 300 ms seems too big?
CPU is 99% idle, mem and net are OK. The cciss driver version is 2.4.50 (from running strings on cciss.o) on kernel 2.4.18-3smp.
Ivan Ferreira
Honored Contributor

Re: yet another disk io question

The first thing I would like to see is the CPU iowait value. Obtain this value from the output of vmstat. If you have a high iowait value, then there is something strange with your disk subsystem. What is the value of the "wa" column from vmstat?
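For reference, this is the kind of invocation I mean (just a sketch; the exact columns shown depend on your kernel and procps version):

# 5-second interval, 3 samples; on kernels that expose it, the last cpu column ("wa")
# is the percentage of time the CPUs sit idle waiting for I/O to complete
vmstat 5 3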
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Jose Molina
Advisor

Re: yet another disk io question

Hmm, maybe RH 7.3 has an old version of vmstat, but I don't see any "wa" column. With vmstat 300 10 I get these results:
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 0 4864 8320 206276 0 0 1 15 116 25 0 0 100
0 0 0 0 3944 8624 206692 0 0 2 16 119 58 0 0 100
Hmmm...
Advisor

Re: yet another disk io question

The 'wa' column was introduced in kernel 2.5.41; before that it was always zero. I'm not sure whether it has been backported to the 2.4 series kernels.

You could test the 2.4.20 kernels from fedoralegacy.org

http://ftp.funet.fi/pub/mirrors/download.fedoralegacy.org/redhat/7.3/updates/i386/kernel-smp-2.4.20-46.7.legacy.athlon.rpm
http://ftp.funet.fi/pub/mirrors/download.fedoralegacy.org/redhat/7.3/updates/i386/kernel-smp-2.4.20-46.7.legacy.i586.rpm
http://ftp.funet.fi/pub/mirrors/download.fedoralegacy.org/redhat/7.3/updates/i386/kernel-smp-2.4.20-46.7.legacy.i686.rpm

You may find other important updates for the Red Hat 7.3 box on that FTP site.
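If you do try one of those kernels, a rough sketch of how I'd put it on (the exact package depends on your CPU; treat this as an untested assumption for your box):

# install the legacy kernel alongside the current one (-i rather than -U),
# so the old kernel remains available as a fallback boot entry
rpm -ivh kernel-smp-2.4.20-46.7.legacy.i686.rpm
# then check the boot loader config (lilo.conf or grub.conf) so the old kernel stays selectable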
Jose Molina
Advisor

Re: yet another disk io question

The thing is, we're stuck with this kernel due to compatibility constraints. There must be some way to verify that the system is operating correctly... even if the answer is that it's overloaded, that would be fine, because then I could certify that what we need is new hardware to get better I/O. Is there any other tool I could run besides vmstat and iostat that would give you more info?
Ivan Ferreira
Honored Contributor

Re: yet another disk io question

What I can see from the first iostat output is that you are using an 8k block size for I/O and your disks are not heavily used (%util). But that is no guarantee that you don't have any problems.

What I would do is test the performance of the raw device using the dd command or software like Iometer, and try to stress the disks to identify the maximum I/O the subsystem can deliver. If you can obtain high I/O rates with performance measuring tools, then the problem should be traced from the application side.
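As a first, non-destructive pass, something along these lines would time a plain sequential read straight from the logical drive (just a sketch; adjust bs and count to your setup):

# sequential read of about 1 GB from the RAID1 logical drive; reads only, so nothing is overwritten
time dd if=/dev/cciss/c0d0 of=/dev/null bs=8k count=131072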
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Jose Molina
Advisor

Re: yet another disk io question

Thanks for the replies. I'm going to check whether I can get high I/O rates, and if I do, I'll consider the platform "healthy". One last question: there's no tweaking to be done to cciss, nor a tool to do it, is there? What I mean is, if I have a problem, we're talking about the Smart Array controller or whatever, but not a bad cciss configuration. I've never had to modify cciss, and since I'm being pressed by the developers I want to make sure this is the usual way to work with it (i.e. install and "forget").

Again, thanks a lot for your time; it's hard to get support for this kind of thing.
Ivan Ferreira
Honored Contributor
Solution

Re: yet another disk io question

The kernel documentation does not provide any tuning options for the driver, see:

http://www.mjmwired.net/kernel/Documentation/cciss.txt

You may try tuning the file system. I don't know whether you can tune the I/O elevators in that kernel version, but see the article here:

http://www.redhat.com/magazine/008jun05/features/schedulers/
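On a 2.4 kernel, if the elevator can be adjusted at all, it would be with elvtune from util-linux rather than the 2.6 schedulers that article describes; a hedged sketch, assuming elvtune is installed on the RH 7.3 box:

# print the elevator's current read/write latency settings for the device
elvtune /dev/cciss/c0d0
# example of changing them (the values here are only illustrative, not a recommendation)
elvtune -r 1024 -w 2048 /dev/cciss/c0d0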
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Jose Molina
Advisor

Re: yet another disk io question

Well, I did some tests with dd, different block sizes and all. Running this dd several times gave these sustained results:
time dd if=/dev/zero of=./rawfile bs=8k count=204800
Time:
real 1m0.838s
user 0m0.220s
sys 0m12.900s
1.6G rawfile

While vmstat 10 showed this:
io system cpu
bi bo in cs us sy id
0 11 116 32 0 1 99
0 13140 211 93 1 17 83
1 24780 237 64 0 22 77
1 25468 237 56 0 23 76
10 25449 244 79 1 23 76
4 26967 245 85 1 23 77
1 25205 257 81 0 24 76
5 23331 257 75 0 17 83
6 29 127 47 0 1 98

On a new and faster server I got around 80000 bo's. So it seems a bit slow... I don't know; we're talking about 4-year-old disks versus a new platform. I don't have many numbers to compare against...
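As a quick sanity check on my own numbers: 204800 blocks x 8 KiB = 1,638,400 KiB, so roughly 1.6 GB written in about 61 seconds, which works out to around 26 MB/s sustained.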
Ivan Ferreira
Honored Contributor

Re: yet another disk io question

The results show about 26 MB/s, but that is for a single-threaded test. I think that is not too bad for a two-disk configuration (RAID1). The problem is that you are testing with a single-threaded application, running just one instance, and you are also going through the file system, so much of the data can go to the buffers/cache.

For a better test you should use a spare partition and bypass the file system, with commands like these:


Write performance:

for I in `seq 5`; do
dd if=/dev/zero of=/dev/cciss/c0d0p6 bs=8k count=131072 &
done

Read performance:

for I in `seq 5`; do
dd if=/dev/cciss/c0d0p6 of=/dev/null bs=8k count=131072 &
done
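While those loops are running, it's worth watching the disks from another terminal with the same iostat options as before, so you can compare the figures directly:

# extended per-device statistics every 5 seconds while the dd processes run
iostat -x 5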
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Jose Molina
Advisor

Re: yet another disk io question

Thanks, with all this info I think I have enough. I knew I also had to do a parallel test, but I don't have any spare partition to write to directly. I'm closing this thread; thanks a lot for the replies.
Jose Molina
Advisor

Re: yet another disk io question

Got enough info to be able to verify the system.