Operating System - OpenVMS

some IO ops with large latency in SDA fc perf

shcts
Occasional Advisor

 

Hi,

 

Hopefully this is the right category, I'm not used to this new interface.

 

Summary:

Using SDA FC PERFORMANCE, I am seeing some I/O operations take a REALLY long time to complete, with up to SECONDS of latency. Is anybody else seeing this?

 

 

Background:

I am trying to commission a new EVA P6500 (fully loaded, 450 600 GB SAS drives). I added VMS volumes to an application cluster (via HB shadowing). After about a day, user complaints prompted us to drop these shadow members. FC PERF indicated really poor latency for a small (but user-noticeable) percentage of I/Os. I have also noticed the problem on our older EVAs; this one just seems to have it much worse.

 

I have since tried to simplify troubleshooting by eliminating possible SAN buffer credit issues, 8G fillword issues, etc., by plugging this array directly into the (4G) switches where the hosts are connected. I added a few non-critical shadow members back, and I'm still seeing the latency. The only tool I can see this with is SDA FC PERF, because all the other tools (evaperf, portperfshow, etc.) deal with averages, so the problem is masked.

 

 

I have been working the problem with HP storage, but thought I would also post here, as little progress is being made. What do your numbers look like? Are they anything like this? THANKS!

 

-Tom O'Toole

 

$1$dga12176 (write)
Using EXE$GQ_SYSTIME to calculate the I/O time
    accumulated write time = 33815000us
    writes = 10550
    total blocks = 806494

LBC     <2ms     <4ms     <8ms    <16ms    <32ms    <64ms   <128ms   <256ms   <512ms      <2s      <4s
=== ======== ======== ======== ======== ======== ======== ======== ======== ======== ======== ========
  1     2678        6        2        3        4        2        2        1        3        5        -     2706
  2        3        -        -        -        -        -        -        -        -        -        -        3
  4       13        -        -        -        -        -        -        -        -        -        -       13
  8      696        6        3        3        1        -        -        -        1        5        -      715
 16      324        2        2        -        3        -        -        -        -        2        -      333
 32      456        4        3        1        1        -        1        -        1        2        -      469
 64      265        -        1        -        -        3        -        1        3        1        -      274
128     6004       12        2       11        -        7        -        -        -        -        1     6037

       10439       30       13       18        9       12        3        2        8       15        1    10550
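
To put numbers on the "averages mask it" point, here is a rough Python sketch that uses only the figures from the output above (the accumulated write time, the write count, and the bottom totals row). The function and variable names are made up for illustration, and it assumes each column counts the I/Os that finished under that bound but not under the previous one.

```python
# Column headings and totals copied from the bottom row of the FC PERF output above.
BUCKETS = ["<2ms", "<4ms", "<8ms", "<16ms", "<32ms", "<64ms",
           "<128ms", "<256ms", "<512ms", "<2s", "<4s"]
COUNTS = [10439, 30, 13, 18, 9, 12, 3, 2, 8, 15, 1]

ACCUMULATED_WRITE_TIME_US = 33_815_000  # "accumulated write time" line
TOTAL_WRITES = 10_550                   # "writes" line


def mean_latency_ms(total_us, ops):
    """Average time per I/O in milliseconds."""
    return total_us / ops / 1000.0


def tail_fraction(buckets, counts, first_slow_bucket):
    """Fraction of I/Os that landed in first_slow_bucket or any slower bucket.

    Assumption: buckets are listed fastest to slowest and do not overlap.
    """
    start = buckets.index(first_slow_bucket)
    return sum(counts[start:]) / sum(counts)


if __name__ == "__main__":
    mean = mean_latency_ms(ACCUMULATED_WRITE_TIME_US, TOTAL_WRITES)
    print(f"mean write latency: {mean:.2f} ms")
    for bucket in ("<4ms", "<256ms", "<2s"):
        share = tail_fraction(BUCKETS, COUNTS, bucket)
        print(f"writes in {bucket} or slower: {share:.2%}")
```

The mean works out to roughly 3.2 ms per write, which looks perfectly healthy, while 16 of the 10550 writes still fall in the last two columns. A tool that reports only the mean would not show those.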


7 REPLIES

Re: some IO ops with large latency in SDA fc perf

Hi Tom,

 

Not sure if this is useful, but the attached file shows some stats for one node of a Blade cluster using an EVA 4100.

There are 28 physical disks, all SCSI, and I think 10,000 rpm.

No HB shadowing - all logical disks are configured as mirror sets in the EVA.

I'd guess you've already checked this, but are all cluster nodes directly attached (via fibre) to the EVA, that is, none accessing it via MSCP for example? We had one case where a node lost its fibre connections and was being served by another node.

 

Cheers,

 

Mark

shcts
Occasional Advisor

Re: some IO ops with large latency in SDA fc perf

 

Yes Mark, very useful thanks...

 

I can see you have some of the same problem I do (but maybe not as bad)... I would love to find out what causes an I/O to take 512 ms, 1 second, or even 2 seconds to complete. Our application is possibly sensitive to this latency; we have intermittent failures to get a database lock before it times out.

shcts
Occasional Advisor

Re: some IO ops with large latency in SDA fc perf

 

Does anybody reading these boards and running VMS have a p6500?

Hoff
Honored Contributor

Re: some IO ops with large latency in SDA fc perf

As is my common rejoinder of late, "Call HP support".

 

Ask them to determine the trigger for this HP P6500 EVA (HSV360) performance oddity.

 

These are complex controllers, and implementation details such as "thin provisioning" may well cause some I/O operations to have unusually longer latencies, as might parallel operations and I/O activity from other FC SAN hosts sharing the controller.

shcts
Occasional Advisor

Re: some IO ops with large latency in SDA fc perf

 

Hi Hoff,

 

I have been working it with HP for quite some time, but progress has been slow, to understate the situation.

 

This array is new and not using thin provisioning. It was serving one app cluster and a mail server that does 100 ops/sec when it's busy... As it stands, I can't keep it in production on this one app cluster because of sluggish response.

 

No fancy stuff at the moment, just basic Vraid1.

 

My primary motivation for posting here is to get input from other people who have P6500s or P6300s, and to see their FC PERF outputs.

 

Thanks all.

 

 

Hoff
Honored Contributor

Re: some IO ops with large latency in SDA fc perf

Replace it.

 

Ask questions later.

shcts
Occasional Advisor

Re: some IO ops with large latency in SDA fc perf

 

Thanks for the response. Replacing it may be what has to happen. It is still being worked through HP, and a couple of things have been isolated, but there is no definite fix at this point. We'll see...