Operating System - Linux

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

 
Yaroslav_4
Advisor

iowait 100% linx redhat ES3 2.4.21-9.ELsmp

I have RH ES3 with kernel 2.4.21-9.ELsmp installed on a DL380 G3. It has 6 hard disks (72 GB) in RAID 5. During any read or write activity I get 100% iowait. Could you advise what to do? I am using the latest cciss driver, cpq_cciss-2.4.54-14.rhel3.i686.
Stuart Browne
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

What are you doing on the server?

We need some information. Run 'vmstat -ba 1 5' and show us the output. Also show us the content of '/proc/meminfo'.

Basically, what I'm looking for here is swap activity and how much memory you've got.

The first thing I'd get you to do, however, is update to the latest errata kernel (32.EL) and associated packages. There are a number of I/O and memory-handling updates which might be beneficial.

Also, what RAID controller are you using? Are you using the onboard SmartArray 5i, or another controller?
One long-haired git at your service...
Gopi Sekar
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp


I believe it could be related to a memory/swap issue. Please post /proc/meminfo and, if possible, the 'top' command output.

Never Never Never Giveup
Yaroslav_4
Advisor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

I'm using the onboard Smart Array 5i.
Please look at the output of the related commands:
CPU states:   cpu    user    nice  system     irq softirq  iowait    idle
            total    0.0%    0.0%    0.0%    0.0%    0.0%  100.0%    0.0%
            cpu00    0.0%    0.0%    0.0%    0.0%    0.0%  100.0%    0.0%
            cpu01    0.0%    0.0%    0.0%    0.0%    0.0%  100.0%    0.0%
            cpu02    0.0%    0.0%    0.0%    0.0%    0.0%  100.0%    0.0%
            cpu03    0.0%    0.0%    0.0%    0.0%    0.0%  100.0%    0.0%
Mem: 4898932k av, 3027948k used, 1870984k free, 0k shrd, 35852k buff
860296k active, 2029808k inactive
Swap: 10241428k av, 0k used, 10241428k free 2813784k cached

[root@Zeus storage]# cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 5016506368 4998258688 18247680 0 39489536 4750659584
Swap: 10487222272 0 10487222272
MemTotal: 4898932 kB
MemFree: 17820 kB
MemShared: 0 kB
Buffers: 38564 kB
Cached: 4639316 kB
SwapCached: 0 kB
Active: 862832 kB
ActiveAnon: 553452 kB
ActiveCache: 309380 kB
Inact_dirty: 3696872 kB
Inact_laundry: 105020 kB
Inact_clean: 79600 kB
Inact_target: 948864 kB
HighTotal: 4063204 kB
HighFree: 1088 kB
LowTotal: 835728 kB
LowFree: 16732 kB
SwapTotal: 10241428 kB
SwapFree: 10241428 kB
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 2048 kB


[root@Zeus storage]# vmstat -a 1 5
procs                        memory      swap          io      system             cpu
 r  b   swpd   free   inact  active   si   so    bi    bo    in    cs  us  sy  id  wa
 1  3      0  18164 3704904  935904    0    0  1240   947   361   492   1   1  79  19
 0  5      0  18200 3704156  936568    0    0  4692  8548  1376   665   1   1   0  98
 1  1      0  18020 3700672  940196    0    0 42208   184 10256 10674   1  10  13  75
 1  2      0  17876 3698336  942860    0    0 46004    72 11217 11616   2   8  14  76
 0  3      0  17908 3697520  944100    0    0 44840   208 10990 15024   1   9   4  86
Gopi Sekar
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp


Your system looks clean of memory/swap issues, and the CPU is not busy.

Is there any activity on the hard disks (LEDs continuously lit)? This might happen if one of the hard disks is bad and the RAID is trying to rebuild it.

Regards,
Gopi
Never Never Never Giveup
Ivan Ferreira
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

It looks like an erratic application, because the high system time is not related to memory problems, as seen in the output.

What is running on your server? Is the RAID software or hardware?
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Stuart Browne
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

*nod*nod* yup.. No swap activity, but nasty disk IO.
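
For anyone reading along, here is a minimal Python sketch of how that reading falls out of the vmstat columns. The sample lines are copied from the `vmstat -a 1 5` output posted above; the helper names `parse_vmstat` and `summarize` are made up for illustration, and the sketch assumes the column header line is passed in along with the samples.

```python
# Sample lines copied from the vmstat output posted in this thread
# (the "procs memory swap ..." group line is left out).
VMSTAT = """\
r b swpd free inact active si so bi bo in cs us sy id wa
1 3 0 18164 3704904 935904 0 0 1240 947 361 492 1 1 79 19
0 5 0 18200 3704156 936568 0 0 4692 8548 1376 665 1 1 0 98
1 1 0 18020 3700672 940196 0 0 42208 184 10256 10674 1 10 13 75
1 2 0 17876 3698336 942860 0 0 46004 72 11217 11616 2 8 14 76
0 3 0 17908 3697520 944100 0 0 44840 208 10990 15024 1 9 4 86
"""


def parse_vmstat(text):
    """Turn vmstat text (header line + samples) into a list of dicts."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def summarize(samples):
    """Report swap activity (si/so), peak block I/O (bi/bo) and mean iowait (wa)."""
    return {
        "swapping": any(s["si"] or s["so"] for s in samples),
        "max_bi": max(s["bi"] for s in samples),
        "max_bo": max(s["bo"] for s in samples),
        "avg_wa": sum(s["wa"] for s in samples) / len(samples),
    }


samples = parse_vmstat(VMSTAT)
summary = summarize(samples)
print(summary)
```

With si and so at zero in every sample the box is not swapping, while bi and wa are both high, which is the "no swap, heavy disk IO" picture.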

What is this machine doing ?!? The name is 'storage'. Is this perhaps a database storage server? And if so, how big are the queries it's getting hammered with ?!
One long-haired git at your service...
Yaroslav_4
Advisor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

This machine runs an Oracle database instance. All datafiles are 2 GB. I tried shutting down the database and just copying a single file: same result. I even tried this test with Oracle down:
cat some_file > /dev/null
and got the same thing. Before that I replaced all the disks and reinstalled Linux; nothing changed.
Yaroslav_4
Advisor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

When I copy some files the LEDs start flashing, but otherwise they look normal.
Ivan Ferreira
Honored Contributor

Re: iowait 100% linx redhat ES3 2.4.21-9.ELsmp

Can you give information about your partitions, filesystem type (ext3?), block size, RAID stripe size and cache options, and journal mount options?

Are you really using RAID 5 or ADG (RAID 5 with 2 parity disks)?

Do you have the chance to add a new disk without RAID (JBOD), mount it, and run the tests? Better still, put it on another SCSI controller.

This way you can know if the problem is related to the RAID itself.


Recommendations:

When creating the filesystem, use:

-O dir_index
-R stride=N
-O sparse_super

where N*4k equals the RAID stripe size.
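
As a rough illustration of the arithmetic, here is a small Python sketch that derives N from a stripe size and assembles the options listed above. The function names and the 64 KB stripe size are assumptions for the example only; check the real stripe size in the array configuration, and the 4 KB default simply mirrors the "N*4k" rule.

```python
def stride_blocks(stripe_kb, block_kb=4):
    """Return N such that N * block_kb equals the RAID stripe size (in KB)."""
    if stripe_kb <= 0 or block_kb <= 0:
        raise ValueError("sizes must be positive")
    if stripe_kb % block_kb:
        raise ValueError("stripe size is not a multiple of the block size")
    return stripe_kb // block_kb


def mke2fs_options(stripe_kb, block_kb=4):
    """Build the option list from the recommendations, with stride filled in."""
    n = stride_blocks(stripe_kb, block_kb)
    return ["-O", "dir_index", "-R", f"stride={n}", "-O", "sparse_super"]


# 64 KB is a hypothetical stripe size, not a value reported in this thread.
print(stride_blocks(64))
print(" ".join(mke2fs_options(64)))
```

The option list is only printed here, never executed, so nothing is formatted by running the sketch.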

Use writeback journaling when mounting the filesystem (if it is ext3).
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?