DL585 G7 - Random high disk latency with multiple LSI HBAs

 
dba_jeff

For several months I have been battling a very odd disk I/O latency issue. The server is a DL585 G7 with quad Opteron 6174 CPUs, 128 GB RAM, and the PCIe expansion board, running Windows Server 2008 R2 SP1. It is outfitted with six LSI 9200-8e SAS/SATA HBAs connected to 24 Samsung 830 SSDs. Under load I always see high latency on one or two of the HBAs and normal latency on the others. The odd thing is that after a reboot the high-latency problem can migrate from one HBA to another.

My test runs IOMeter configured for 1 MB random reads with 8 workers and a queue depth of 32, a basic high-read-I/O scenario that is highly repeatable. I originally saw the problem in Oracle using ASM, so it's not an IOMeter-specific quirk; it is a real-world performance problem for my data warehouse. I should be getting ~10 GB/s of read throughput, but I am only seeing ~5 GB/s because of this problem.
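
To put those numbers in per-device terms, here is a rough back-of-the-envelope sketch. It assumes the load is spread evenly across the six HBAs and 24 SSDs (the even split is an assumption, not something I have measured), and the variable names are just for illustration.

```python
# Rough throughput budget using the figures quoted in this post.
# Assumption: load is spread evenly across all HBAs and SSDs.
HBAS = 6
SSDS = 24
EXPECTED_GBPS = 10.0  # ~10 GB/s expected aggregate read throughput
OBSERVED_GBPS = 5.0   # ~5 GB/s actually observed


def per_unit_mbps(total_gbps: float, units: int) -> float:
    """Even share of the aggregate throughput per unit, in MB/s (1 GB = 1000 MB)."""
    return total_gbps * 1000.0 / units


expected_per_hba = per_unit_mbps(EXPECTED_GBPS, HBAS)
expected_per_ssd = per_unit_mbps(EXPECTED_GBPS, SSDS)
shortfall = 1.0 - OBSERVED_GBPS / EXPECTED_GBPS

print(f"expected per HBA: {expected_per_hba:.0f} MB/s")
print(f"expected per SSD: {expected_per_ssd:.0f} MB/s")
print(f"shortfall vs. expected: {shortfall:.0%}")
```

So the target works out to roughly 1.7 GB/s per HBA and a bit over 400 MB/s per SSD, and the observed total is about half of that.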

Latency on the good HBAs is around 1 ms under load, while it's around 400 ms on the bad HBA(s).
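
For a sense of what 400 ms does to throughput, Little's law says throughput equals the data in flight divided by the latency. The sketch below applies that to 1 MB I/Os. The figure of 32 outstanding I/Os on a single HBA is a hypothetical chosen for illustration, since I don't know exactly how IOMeter's workers map onto each HBA.

```python
def throughput_mbps(outstanding_ios: int, io_size_mb: float, latency_ms: float) -> float:
    """Little's law: throughput (MB/s) = data in flight (MB) / latency (s)."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return outstanding_ios * io_size_mb / (latency_ms / 1000.0)


# Hypothetical: 32 outstanding 1 MB reads on one HBA at the "bad" latency.
bad_hba = throughput_mbps(outstanding_ios=32, io_size_mb=1.0, latency_ms=400.0)
print(f"throughput at 400 ms latency: {bad_hba:.0f} MB/s")
```

Under that assumption an HBA stuck at 400 ms moves only a small fraction of what a healthy one does, which fits with the aggregate number dropping so far.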

It's not the disks: other SSD brands show the same issue, and swapping particular disks for others makes no difference. It's not the cables, because I have swapped those as well. I've tried three different LSI driver and BIOS versions, so it's somewhat unlikely to be an LSI-only issue. I also ran a brief test on a Supermicro server; the issue, if it existed at all, was far less severe, and I saw ~8 GB/s.

This could be an LSI problem - I have asked for their help as well - but I wanted to see if any HP users have seen (and hopefully solved) the same issue.

All ideas welcome!