Operating System - Linux

Extremely slow io on cciss raid6

I've installed RHEL 5.1 on a DL320S server with a Smart Array P400 controller and 6 SATA disks in a RAID 6 (ADG) setup. The write speed is terrible: I normally get about 8 MB/s, which is not what I expect from such hardware.

I've tried different kernels and parameters to increase the speed, and it has helped with the read speed, which is now at about 200 MB/s sustained as long as no writes occur during the read. As soon as a write occurs, the read speed drops drastically.

I've noticed that a lot of people seem to have the same problem, but so far I haven't seen any good solutions apart from replacing the array controller. The iowait is above 90% while writing to the disk, and this makes the whole system incredibly slow. Just listing the files in a directory can take 20 s because of the iowait.
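For reference, iowait like this can be confirmed without extra tools by sampling /proc/stat directly. A minimal sketch (field positions follow the standard Linux /proc/stat cpu line; the 1-second interval is arbitrary):

```shell
#!/bin/sh
# Sample the aggregate "cpu" line of /proc/stat twice and report the
# share of time spent in iowait between the two samples.
# cpu line fields: user nice system idle iowait irq softirq ...
read_cpu() { awk '/^cpu /{print $2+$3+$4+$5+$6+$7+$8, $6}' /proc/stat; }

set -- $(read_cpu)
total1=$1 iowait1=$2
sleep 1
set -- $(read_cpu)
total2=$1 iowait2=$2

dt=$((total2 - total1))   # total ticks elapsed
dw=$((iowait2 - iowait1)) # iowait ticks elapsed
if [ "$dt" -gt 0 ]; then
    echo "iowait: $((100 * dw / dt))%"
else
    echo "iowait: 0%"
fi
```

On the system described above this should print a value near the 90%+ the `wa` column of vmstat is reporting.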

I'm running the latest of everything now (firmware, kernel, etc.) but the problem is still there. I've tried the cciss.sf.net driver and the vanilla kernel driver. All the same.

System information:

uname:
Linux someserver 2.6.25 #1 SMP Wed Jun 11 21:21:21 CEST 2008 i686 i686 i386 GNU/Linux

from dmesg:
HP CISS Driver (v 3.6.14)
ACPI: PCI Interrupt 0000:0a:00.0[A] -> GSI 16 (level, low) -> IRQ 16
cciss0: <0x3230> at PCI 0000:0a:00.0 IRQ 217 using DAC
blocks= 4294967296 block_size= 512
blocks= 5860333808 block_size= 512
heads=255, sectors=32, cylinders=718179

blocks= 5860333808 block_size= 512
heads=255, sectors=32, cylinders=718179

cciss/c0d0: p1 p2

/proc/interrupts:
CPU0 CPU1
0: 255 0 IO-APIC-edge timer
1: 8 0 IO-APIC-edge i8042
3: 1 0 IO-APIC-edge
4: 2 0 IO-APIC-edge
8: 3 0 IO-APIC-edge rtc
9: 0 0 IO-APIC-fasteoi acpi
12: 131 0 IO-APIC-edge i8042
21: 990599 0 IO-APIC-fasteoi uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb4, ehci_hcd:usb6
22: 22324 0 IO-APIC-fasteoi ipmi_si
23: 166 0 IO-APIC-fasteoi uhci_hcd:usb5
215: 5802 2005 PCI-MSI-edge eth0
217: 512723 0 PCI-MSI-edge cciss0
NMI: 0 0 Non-maskable interrupts
LOC: 3144939 3144944 Local timer interrupts
RES: 1045 34959 Rescheduling interrupts
CAL: 209 653 function call interrupts
TLB: 445 478 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
MIS: 0

/proc/driver/cciss/cciss0:
cciss0: HP Smart Array P400 Controller
Board ID: 0x3234103c
Firmware Version: 4.12
IRQ: 217
Logical drives: 1
Current Q depth: 0
Current # commands on controller: 16
Max Q depth since init: 19
Max # commands on controller since init: 24
Max SG entries since init: 31
Sequential access devices: 0

cciss/c0d0: 3000.49GB RAID ADG

/sys/block/cciss\!c0d0/queue/read_ahead_kb:
128

/sys/block/cciss\!c0d0/queue/max_sectors_kb:
512
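These queue tunables are per-device and writable at runtime. A small sketch that walks /sys/block and prints the relevant values; the commented write at the end is illustrative only, and the value 256 is an example, not a recommendation:

```shell
#!/bin/sh
# Print the queue tunables for every block device. Note that sysfs
# replaces the '/' in cciss device names with '!' (hence cciss!c0d0).
for q in /sys/block/*/queue; do
    dev=${q%/queue}; dev=${dev##*/}
    ra=$(cat "$q/read_ahead_kb" 2>/dev/null)
    ms=$(cat "$q/max_sectors_kb" 2>/dev/null)
    sched=$(cat "$q/scheduler" 2>/dev/null)
    echo "$dev: read_ahead_kb=$ra max_sectors_kb=$ms scheduler=$sched"
done

# To experiment as root, e.g. doubling the read-ahead on the array:
#   echo 256 > '/sys/block/cciss!c0d0/queue/read_ahead_kb'
```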

vmstat -a 1 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
0 9 128 13176 931408 57280 0 0 5938 2580 245 1682 3 2 5 90 0
1 9 128 13124 931260 57268 0 0 3848 20088 455 3236 2 3 0 95 0
0 9 128 13124 931260 57268 0 0 0 1516 65 2131 0 0 0 100 0
0 6 128 13044 931548 57268 0 0 8712 3564 667 4246 5 3 0 93 0
1 4 128 13208 931564 57268 0 0 0 6216 66 2084 0 1 0 99 0

Any ideas apart from changing the array adapter?

24 REPLIES
fschicker
Advisor

Re: Extremely slow io on cciss raid6

Hi Ulrik,

we have the same problem!
Did you find a solution yet?
We see the issue on different HP servers with different P400 controllers; every time it's the same...

my questions:
- what hardware revision is your controller? (lspci output)
- did you try putting the controller in another PCI-X slot?
- what is the output of "lshw" for the PCI slot the controller is in?

My Post: http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1240003

thank you!

I hope we can solve this problem :(

greets
Ivan Ferreira
Honored Contributor

Re: Extremely slow io on cciss raid6

I would not expect much performance from a RAID 6 configuration on a small controller. Do you have write-back cache? How many disks do you have? Have you considered using RAID 5 + spare instead of RAID 6? What performance testing tool do you use? What block size is used?

Can you create a RAID 0, first with one disk and then with all disks, for performance testing? With this you could identify the performance of each disk, then of all disks in a stripe configuration, and then compare with the RAID 6 performance.

Use iostat -x to identify the "service time" (svctm) on each situation. Post your results.
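As a sketch of what to look at, the await/svctm/%util columns can be pulled out of `iostat -x` output with awk. The sample line here is the c0d0 row posted later in this thread, and the column layout assumes the same sysstat version; in practice you would pipe `iostat -x 1` through the same filter:

```shell
#!/bin/sh
# Extract await, svctm and %util for one device from `iostat -x` output.
sample='cciss/c0d0 0.00 1440.00 0.00 51.50 0.00 6.24 248.00 144.74 2835.11 19.43 100.05'

echo "$sample" | awk '{
    # Column layout of this iostat -x variant:
    # dev rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
    printf "dev=%s await=%s ms svctm=%s ms util=%s%%\n", $1, $10, $11, $12
}'
```

An average wait of nearly three seconds per request at 100% utilisation, as in this sample, points at a saturated device rather than a filesystem problem.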
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
fschicker
Advisor

Re: Extremely slow io on cciss raid6

hi ivan,

It's not an issue with the RAID level.
We tried it on about six servers, with RAID 1, 5 and 6; every time the same result.

Please read the post I linked; I think it shows the issue a little better.
Ivan Ferreira
Honored Contributor

Re: Extremely slow io on cciss raid6

I see in your test that you ran over a file system. You should run your tests over the raw device. Was this FS ext3? Journaling enabled? For filesystem tests, use IOzone or Bonnie.

What would the performance be over a single disk?

A large block size won't always be better.
fschicker
Advisor

Re: Extremely slow io on cciss raid6

hi ivan,

thanks for your answer.

I know about the possibilities of tweaking block sizes and filesystems, but I think 8 MB/s writes have their cause somewhere else :)

I can't run Bonnie because the server load gets too high and the services on it go offline if I start writing too much to the disk.
Ivan Ferreira
Honored Contributor

Re: Extremely slow io on cciss raid6

Then, let's wait for Ulrik Holmén's results.
fschicker
Advisor

Re: Extremely slow io on cciss raid6

Hi Ivan,

Now I could run Bonnie and some other tests.

Here are my results:

- directly to the disk, without ext3:

sync; time sh -c "dd if=/dev/zero of=/dev/cciss/c0d0p3 bs=1024k count=1000; sync"
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 677.96 seconds, 1.5 MB/s

real 11m18.088s
user 0m0.000s
sys 0m2.584s

- bonnie:

bonnie -b -s 1100 -d /tmp/ -u root
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
our.server.na 1100M  7366  15  4572   0  3456   0 26078  53 76173   5 135.3   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   396   0 +++++ +++   263   0   227   0 +++++ +++  1564   2
our.server.name,1100M,7366,15,4572,0,3456,0,26078,53,76173,5,135.3,0,16,396,0,+++++,+++,263,0,227,0,+++++,+++,1564,2
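For what it's worth, Bonnie's trailing CSV line is easier to post-process than the table. A sketch that extracts the sequential block write and block read rates (field positions per the Bonnie 1.03 CSV layout, using the line reported above):

```shell
#!/bin/sh
# Pull the block-oriented throughput figures out of a Bonnie 1.03 CSV line.
csv='our.server.name,1100M,7366,15,4572,0,3456,0,26078,53,76173,5,135.3,0,16,396,0,+++++,+++,263,0,227,0,+++++,+++,1564,2'

echo "$csv" | awk -F, '{
    # Fields: machine,size,putc K/s,%CP,block-write K/s,%CP,rewrite K/s,%CP,
    #         getc K/s,%CP,block-read K/s,%CP,seeks/s,%CP,...
    printf "block write: %d K/sec (%.1f MB/s)\n", $5, $5/1024
    printf "block read:  %d K/sec (%.1f MB/s)\n", $11, $11/1024
}'
```

Roughly 4.5 MB/s block writes against 74 MB/s block reads matches the asymmetry reported everywhere else in this thread.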

I don't know Bonnie very well, but this doesn't look good. The server had 0.0 load before.

greets,
florian

Re: Extremely slow io on cciss raid6

I've tried both the raw disk and ext3, and the problem is not related to FS issues, as I've seen suggested elsewhere for the same problem.

I know I cannot expect lightning speed with RAID 5/6, but more than 8 MB/s is not expecting too much. The speed is actually not the biggest issue. The frustrating problem is that the server is totally locked up while writing to disk. The server is going to be a slave database server, but that is simply not possible with the current performance.

To test the performance I run:

read:

time dd of=/dev/zero if=/dev/mapper/VolGroup00-test bs=1M count=3000
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 15.6588 seconds, 201 MB/s

real 0m15.713s
user 0m0.005s
sys 0m4.264s

write:

time dd if=/dev/zero of=/dev/mapper/VolGroup00-test bs=1M count=3000
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 426.12 seconds, 7.4 MB/s

real 7m6.139s
user 0m0.003s
sys 0m4.418s
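One way to separate page-cache effects from controller behaviour is to compare a buffered write flushed with fdatasync against an O_DIRECT write that bypasses the page cache. The sketch below runs against a scratch temp file so it is safe to try anywhere; TESTFILE and the 64 MB size are illustrative, and on the real system you would point TESTFILE at a file on the cciss volume:

```shell
#!/bin/sh
# Compare a buffered write (timed until fdatasync completes) with an
# O_DIRECT write. With a working battery-backed write cache on the
# controller, neither should collapse to single-digit MB/s.
TESTFILE=$(mktemp)   # placeholder target; use a file on c0d0 for real tests

# Buffered write, flushed to the device before dd reports its rate:
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync 2>&1 | tail -1
size=$(wc -c < "$TESTFILE")   # 64 MiB if the run completed

# Direct IO, bypassing the page cache (may fail on filesystems such as
# tmpfs that do not support O_DIRECT):
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 oflag=direct 2>&1 | tail -1 || true

rm -f "$TESTFILE"
```

If the O_DIRECT rate is just as bad as the buffered one, the bottleneck is below the page cache, which on this hardware points at the controller's write cache being disabled or its battery missing.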

Re: Extremely slow io on cciss raid6

While writing (The LVM is located on cciss/c0d0p2):

Device:       rrqm/s  wrqm/s   r/s   w/s  rMB/s  wMB/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d0      0.00 1440.00  0.00 51.50   0.00   6.24   248.00   144.74 2835.11  19.43 100.05
cciss/c0d0p1    0.00    0.00  0.00  0.00   0.00   0.00     0.00     0.00    0.00   0.00   0.00
cciss/c0d0p2    0.00 1440.00  0.00 51.50   0.00   6.24   248.00   144.74 2835.11  19.43 100.05