Disk Enclosures

Please help a newcomer - new FC SAN, latency issues.

 
SOLVED
MikkelNielsen
Occasional Contributor

Hello guys.

I'm turning to you as a last resort before I call HP.

I just installed a P2000 G3 dual-controller FC storage array with 24 10K disks.

One refurbished Brocade SilkWorm 300 switch.

A single-port HBA in each ESX host (we have 4 hosts in total).


As I understand it, FC should be low latency. But we are experiencing latency peaks of 200 ms in the evening hours, when almost no users are logging into their terminal servers. We have one 12-disk RAID 10 with a 64 KB chunk size, and one 12-disk RAID 50 configured as three 4-disk RAID 5 sets, with a chunk size of 64 x 3 = 192 KB.

As I said, the latency peaks during low-usage hours. Is this to be expected?

Testing with Iometer in a 60/40 real-life test, I get 541 IO/sec and 4.18 MB/sec, with an average response time of 46 ms and a maximum response time of 1304 ms. I don't think 541 IO/sec is good enough. In a 100% read test we get over 20,000 IO/sec and above 600 MB/sec, with a 3.2 ms average response time, which is awesome. But things have to be balanced, and I can't use 20,000 IO/sec for anything if the 60/40 result is this poor.

I have upgraded the firmware on the HBAs (Brocade 815) and am using driver version 2.1.1.3 in VMware.
I changed the queue depth to 64 on all ESX hosts, and also set Disk.SchedNumReqOutstanding to 64 on each ESX host.

I have tried RAID 5 with 12 and 16 disks; same problem.

Each HBA is zoned with each host port, so there should be no storms.

I have 4 LUNs on the 12-disk RAID 50, 678 GB each, with only 2 VMs on each LUN.

Maybe something is configured wrong at the switch level? I don't know. Maybe the spikes are all normal?
Path selection is MRU. I tried Round Robin and Fixed to each controller port. That didn't help either.

See attached screenshots and let me know what you think.

Thanks
6 REPLIES
Bill Costigan
Honored Contributor

Re: Please help a newcomer - new FC SAN, latency issues.

I suspect you are looking in the wrong place. The latency is not just the latency through the switch, but also how long it takes the disk array to return the data.

I would look at what the array is doing at the time. You also have to be careful setting up the path selection. If you send every other I/O to a different controller, the controllers may have to do more work to switch which controller owns the LUN. It is often better to always access a LUN through the same controller unless there is a path failure.
MikkelNielsen
Occasional Contributor

Re: Please help a newcomer - new FC SAN, latency issues.

Hello.

Well, it is definitely not doing much, really. Say I run Iometer and pull 20,000 IO/sec and then log in to a server on the same LUN: the spikes do not occur.

Path selection is set to MRU, which defaults to the preferred controller and port.

When I had it set to Fixed, I only fixed the LUNs to the owning controllers, but on port A1 or A2 or B1 or B2.
Bill Costigan
Honored Contributor
Solution

Re: Please help a newcomer - new FC SAN, latency issues.

I'd still look at the array. Latency through FC is measured in microseconds (maybe ~2 µs), which is 0.002 milliseconds.

You have 10K disks. Let's say you can get 150 I/Os per second per disk; across 12 disks that is 1,800 I/Os per second in total. A write to RAID 5 requires 4 disk I/Os, so 1,800 / 4 = 450 RAID 5 writes per second.
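
The estimate above can be written out as a short calculation. This is only a rough sketch: the 150 I/Os per second per disk and the write penalty of 4 are the figures from this post, the 12-disk count comes from the original question, and the function name is made up for illustration.

```python
def raid_write_iops(disks, iops_per_disk, write_penalty):
    """Rough ceiling on host writes per second for a RAID group.

    write_penalty is the number of disk I/Os needed per host write
    (4 for RAID 5 in the estimate above).
    """
    raw_iops = disks * iops_per_disk  # total I/Os the spindles can do
    return raw_iops / write_penalty


# Assumed inputs: 12 disks at 150 I/Os per second each, RAID 5 penalty of 4.
raid5_writes = raid_write_iops(disks=12, iops_per_disk=150, write_penalty=4)
print(raid5_writes)  # 450.0
```

This ignores controller cache and any read share in the workload, so treat it as an order-of-magnitude figure rather than a prediction.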

If the array takes on average 2 ms to complete each write and you have a queue of 25 writes pending, then each write will take 50 ms by the time it gets its turn and completes.
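
The queueing arithmetic can be sketched the same way. This assumes the simplest possible model, in which writes are serviced strictly one at a time with a fixed service time; the function names are hypothetical and the 2 ms and 25-deep queue are the example numbers from this post.

```python
def queued_latency_ms(service_time_ms, queue_depth):
    """Latency seen by the last request in a queue served one at a time."""
    return service_time_ms * queue_depth


def queue_depth_for_latency(latency_ms, service_time_ms):
    """Inverse: how deep the queue must be to produce a given latency."""
    return latency_ms / service_time_ms


print(queued_latency_ms(2, 25))           # 50
print(queue_depth_for_latency(200, 2))    # 100.0
```

Under the same simplified model, a 200 ms peak like the one reported in the original question would correspond to roughly 100 writes waiting at 2 ms each. A real array services requests in parallel across disks and absorbs bursts in cache, so this is only a way to reason about the scale of the effect.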

The 20,000 reads must be using large read-ahead buffers. A read from RAID 5 only takes a single I/O, so you could get 1,800-2,000 physical reads per second. If the array is sequentially fetching 10 times the requested amount of data, then the next 9 reads will come from cache with almost no latency.
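
A minimal sketch of that read-ahead reasoning, assuming every physical read pulls in exactly 10 requests' worth of sequential data and that cache hits cost nothing. Both are simplifications, and the function name is hypothetical.

```python
def effective_read_iops(physical_iops, readahead_factor):
    """Host-visible reads per second when each physical read
    also satisfies the next (readahead_factor - 1) requests from cache."""
    return physical_iops * readahead_factor


# Physical read range from the post: 1,800 to 2,000 reads per second.
for physical in (1800, 2000):
    print(physical, effective_read_iops(physical, readahead_factor=10))
# 1800 18000
# 2000 20000
```

With those assumptions the host would see 18,000 to 20,000 reads per second, which is in line with the 20,000 figure from the 100% read test. A random workload would get little or no benefit from read-ahead, so the factor drops toward 1.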
MikkelNielsen
Occasional Contributor

Re: Please help a newcomer - new FC SAN, latency issues.

So what are you suggesting? Using RAID 10 instead?

Yes, I am getting 20,000 IO/sec with a 64 KB read-ahead size and a 32 KB request size. You might be right about the cache, and that may be why I get so much I/O. But why do other people not get that amount of I/O when they test the same SAN? (I can see in the unofficial storage thread that others with a P2000 are not getting as much as I am.)

Thanks
Bill Costigan
Honored Contributor

Re: Please help a newcomer - new FC SAN, latency issues.

I'm not suggesting anything - just trying to make sense of what you are seeing in your tests.

If you switched to RAID 10, you should see twice the write performance, but I don't know if that will be enough to satisfy your requirements. The trade-off is that you will not have as much disk space as you have with RAID 5.
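
To make the trade-off concrete, here is a rough comparison for a 12-disk group. It assumes the commonly quoted write penalties of 2 disk I/Os per write for RAID 10 and 4 for RAID 5, the 150 I/Os per second per disk used earlier in this thread, and the RAID 50 layout from the original question (three 4-disk RAID 5 sets). The variable names are made up for illustration.

```python
DISKS = 12
IOPS_PER_DISK = 150  # assumed figure for a 10K disk, as used earlier

# Write penalty = disk I/Os needed per host write (assumed typical values).
raid10_write_iops = DISKS * IOPS_PER_DISK / 2
raid5_write_iops = DISKS * IOPS_PER_DISK / 4

# Usable capacity, counted in disks' worth of space.
raid10_usable_disks = DISKS // 2        # half the disks hold mirror copies
raid50_usable_disks = 3 * (4 - 1)       # three 4-disk RAID 5 sets, one parity each

print(raid10_write_iops, raid5_write_iops)        # 900.0 450.0
print(raid10_usable_disks, raid50_usable_disks)   # 6 9
```

So on these assumptions RAID 10 doubles the write ceiling from about 450 to about 900 writes per second, at the cost of 6 usable disks instead of 9.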

As for why others do not see the same 20K read rate... I don't know. Perhaps their reads are more random so the read ahead is not as effective. Perhaps they have fewer disks in the RAID group.
MikkelNielsen
Occasional Contributor

Re: Please help a newcomer - new FC SAN, latency issues.

Well, okay. I can live with the reduced space if it will help with the latency spikes.

Are you sure this is not a switch problem?

Is it "normal" to see high spikes?