
Analyzing EVA5K Disk Throughput

 
Jack Trachtman
Super Advisor

Analyzing EVA5K Disk Throughput

VMS V7.3-2 w/HBVS (Host Based Volume Shadowing)
2 x EVA5K, one with 36GB 10K RPM disks, the other with 146GB 10K RPM disks

Before we began using the EVAs, my rule of thumb was that if I saw a consistent disk-queue size of > 0, I would begin investigating a possible throughput bottleneck and look at changing my disk config or file placement. With the EVAs, I'm somewhat confused.
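For a quick spot check outside of ECP, I can also watch the raw per-disk numbers with the standard MONITOR utility (a sketch; the interval is just what I'd pick):

$! Live per-disk queue length and I/O rate from MONITOR.
$ MONITOR DISK/ITEM=QUEUE_LENGTH/INTERVAL=5
$ MONITOR DISK/ITEM=OPERATION_RATE/INTERVAL=5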

Right now I use ECP (Enterprise Capacity Planner) to collect usage data. I generate daily reports for our systems that summarize the 10-11am hour for CPU, Mem, and disk usage.
(Any suggestions on other products for getting more disk-usage info?)

I am seeing our two busiest systems with disk queue lengths greater than 0. This would seem to be a problem, but I'm not sure it really is - i.e., I may have to change my definition of what constitutes a throughput problem when using EVAs (comments please).

Besides adding more drives (to get more spindles to spread the load), is there really anything else that can be "tuned" on an EVA, given that "all disks are virtual and spread across all spindles"?

I have attached partial outputs from two ECP reports for examination.

Any thoughts (both practical and philosophical) would be greatly appreciated.
Garry Fruth
Trusted Contributor

Re: Analyzing EVA5K Disk Throughput

I didn't see the attached ECP report, and you did not state how far above 0 the disk queue was. But "high" disk queues on an EVA disk would be a problem. I wish I could offer a rule of thumb.

As far as tuning opportunities go, there are a few. If you have multiple paths to the disks, they can be balanced (SET DEVICE/PATH); VMS does some of this automatically, and as I recall the algorithm is documented in the System Manager's manual. The algorithm does not lead to the same path balance from boot to boot (actually, from mount to mount), so it may be worthwhile to come up with your own method to maintain consistent performance.
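For example (the path string below is made up; SHOW DEVICE/FULL lists the real ones for your device):

$! List the available I/O paths, then force the device onto one.
$ SHOW DEVICE/FULL $1$DGA6101:
$ SET DEVICE $1$DGA6101: /SWITCH /PATH=PGB0.5000-1FE1-0011-B15C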

As another tuning opportunity: if you are using host-based shadow sets and there are a lot of sequential reads, e.g. backing up a database, then raising the read cost on one of the members can make a noticeable difference.
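As a sketch (the member name is illustrative, and if I remember right a READ_COST of 0 puts the default calculation back):

$! Make reads "expensive" on one member so the other member
$! serves the sequential read stream.
$ SET DEVICE $1$DGA6105: /READ_COST=1000
$! Later, let shadowing recalculate the default cost:
$ SET DEVICE $1$DGA6105: /READ_COST=0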
Jack Trachtman
Super Advisor

Re: Analyzing EVA5K Disk Throughput

Here's the attachment that got dropped.

I presently do a manual round-robin fibre path assignment in SYSTARTUP, roughly like the sketch below. (I believe that HP will be adding this feature.)
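The device list and path strings here are illustrative, not my real config:

$! Alternate the shadow members across the two FC adapter paths.
$ PATH_A = "PGA0.5000-1FE1-0011-B15C"
$ PATH_B = "PGB0.5000-1FE1-0011-B15D"
$ DEVLIST = "$1$DGA6100,$1$DGA6101,$1$DGA6102,$1$DGA6103"
$ I = 0
$LOOP:
$ DEV = F$ELEMENT(I, ",", DEVLIST)
$ IF DEV .EQS. "," THEN GOTO DONE   ! F$ELEMENT returns "," past the end
$ IF (I .AND. 1) .EQ. 0
$ THEN
$     SET DEVICE 'DEV': /SWITCH /PATH='PATH_A'
$ ELSE
$     SET DEVICE 'DEV': /SWITCH /PATH='PATH_B'
$ ENDIF
$ I = I + 1
$ GOTO LOOP
$DONE:
$ EXIT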
Hein van den Heuvel
Honored Contributor

Re: Analyzing EVA5K Disk Throughput

Don't worry, be happy... your EVA is serving thousands of IOs per second at a decent response time.
As you suspect, the queue depth is largely irrelevant in this picture.

I would speculate that the problem zone (for a random read load) is where the sum of the queues to all the virtual disks in a disk group is larger than the number of physical disks in that group.

Or... there is a problem if the service time degrades seriously.

If you take your data and SORT it, you'll see that the queue depth is pretty much linear with the IO/sec, but that the response time is not correlated. I would read that as: at higher IO/sec there are higher odds that a disk will be busy (and show a queue), but it is still returning data at the normal rate. No bottleneck.
Just take your numbers: 100 IO/sec, each IO taking 6ms. If all IOs were issued one after the other, that's 600ms, so 60% of the time an IO would be in flight. Add one more random IO and the odds are 60% that it will show up as an extra count in the queue depth. It 'feels' statistically normal to see a queue of 2 for 200 IO/sec and 4 for 500 IO/sec. But they are all being worked on independently, as proven by the response time: goodness.
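Put formally, that back-of-the-envelope is just Little's law (my label for it; the numbers below are straight from the table): average queue depth N = throughput X times response time R.

  N = X * R
  $1$DGA6304 : 671.2 IO/s * 0.0069 s = 4.6   (reported queue: 4.5)
  DSA105     : 559.9 IO/s * 0.0068 s = 3.8   (reported queue: 3.7)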

+-------------------------------------------------------+
|            |  Pct |  Total | RspTim | SrvTim |  Queue |
| Disk Name  | Busy |   IO/s |   (ms) |   (ms) | Length |
|-------------------------------------------------------|
| $1$DGA6103 |  0.9 |    8.3 |    0.2 |    0.2 |    0.0 |
| $1$DGA5103 |  0.8 |    8.3 |    0.2 |    0.2 |    0.0 |
| $1$DGA5102 |  5.6 |   90.5 |    0.9 |    0.6 |    0.1 |
| $1$DGA6102 |  6.3 |   90.5 |    1.1 |    0.7 |    0.1 |
| DSA102     |  8.2 |   94.9 |    1.2 |    0.8 |    0.1 |
| $1$DGA6101 | 70.5 |  138.9 |    7.2 |    5.1 |    1.0 |
| $1$DGA5101 | 64.2 |  147.9 |    5.9 |    4.4 |    0.9 |
| $1$DGA6302 | 86.2 |  191.2 |    6.6 |    4.8 |    1.2 |
| $1$DGA6301 | 58.4 |  191.4 |    5.4 |    3.7 |    1.1 |
| $1$DGA6303 | 60.2 |  202.3 |    6.0 |    3.4 |    1.4 |
| $1$DGA6100 | 16.5 |  245.4 |    0.7 |    0.7 |    0.2 |
| $1$DGA5100 | 16.1 |  245.4 |    0.7 |    0.7 |    0.2 |
| DSA101     | 83.5 |  245.7 |    7.5 |    3.4 |    1.8 |
| DSA100     | 23.3 |  255.5 |    1.1 |    0.9 |    0.3 |
| $1$DGA6105 | 92.9 |  273.8 |    7.3 |    3.6 |    1.9 |
| $1$DGA5105 | 89.1 |  328.9 |    5.6 |    2.9 |    1.8 |
| DSA105     | 97.4 |  559.9 |    6.8 |    1.9 |    3.7 |
| $1$DGA6304 | 86.6 |  671.2 |    6.9 |    1.8 |    4.5 |
+-------------------------------------------------------+

Are you host-based shadowing to the same EVA?
Does that make sense? You can already survive adapter and cable failure, and you still cannot survive a full EVA failure. The EVA itself can protect against drive and shelf failure. So just use Vraid1 for the Virtual Disks and be happy?

Just thinking aloud...

hein.
Jack Trachtman
Super Advisor

Re: Analyzing EVA5K Disk Throughput

Hein,

Thanks for your thoughts.

- We use HBVS across 2 EVA5Ks

- I did notice that the Response Time figures were consistent across disks with various queue lengths. Maybe (as you suggest) this should be my major throughput criterion rather than queue length.

- You imply that, for the EVA, queue length should only be seen as a problem if the queue starts to exceed the number of spindles in the physical group. Is this a valid summary of what you were trying to tell me?

- As an additional question about ECP - any idea how the Pct Busy column is calculated? The disks with the "high" queue lengths also show a high Pct Busy (>80%), which has me worried.
Hein van den Heuvel
Honored Contributor
Solution

Re: Analyzing EVA5K Disk Throughput


- We use HBVS across 2 EVA5Ks

Good. Just checking :-). I guess the device names should have tipped me off.

- Maybe (as you suggest) this should be my major throughput criterion rather than queue length.

Yeah... I think so. And even that can be deceptive, or at least it needs a separation of read time and write time. Why? Applications actually have to wait for reads - they need the data and cannot make progress without it. But write time may be largely irrelevant (as long as the write eventually gets done).

- You imply that, for the EVA, queue length should only be seen as a problem if the queue starts to exceed the number of spindles in the physical group. Is this a valid summary of what you were trying to tell me?

Yeah, but it is oversimplified. At some point you have to worry about overloading a fibre link or a controller rather than the physical disks.

You may want to look around for a tool called 'evaperf'. It is PC based and cannot run on the appliance, but it is a great source of info. Witness for example this report:

http://h71019.www7.hp.com/ActiveAnswers/downloads/Exchange2003EVA5000PerformanceWhitePaper.doc

VMS Engineering is experimenting with a VMS-native version of this. It would not surprise me if this turned into a tool available to customers, possibly called "VEvaMon", and it might just become T4-aware. No time promise, no commitment, just a line of thinking.
You might want to get started on some hands-on T4 experience now.


That's all for now,
Cheers,
Hein.

Jack Trachtman
Super Advisor

Re: Analyzing EVA5K Disk Throughput

Thanks all