Re: EVA 8000 cache monitoring?

 
J Peak
Frequent Advisor

EVA 8000 cache monitoring?

Is there any direct way to monitor cache on an EVA 8000? I'm using EVAPerf to monitor latency, throughput, etc., but I can't find any way to see how much of our cache is being swapped in/out, and at what rate.

Our issue is that we sustain about 80-100 MB/s (not Mb) during the day from our primary database. When someone writes a large sequential file or creates new database chunks on a new server, it apparently burns through our write cache, leaving us on spindles. The result is that the EVA takes several minutes to catch up after the write is complete, but at that point our database is in a checkpoint and we are effectively down.

EVA8000
Command View 6
Primary volume group is 142 x 76GB FC drives.
~125MB/s of throughput chokes our system.

Thanks for any help you can offer.
10 REPLIES
IBaltay
Honored Contributor

Re: EVA 8000 cache monitoring?

Hi,
a) What type of host load balancing do you use?
(The aim is to always hit the managing controller, to reduce mirror-port communication between the managing and proxy controllers.)

b) What are the host queue depth values?
(The aim is a good ratio of parallel I/O spread across all of the paths.)
c) EVAPerf details for a) and b) can be found here:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1224267365002+28353475&threadId=1239472

d) Maximum mirror-port throughputs are here:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1272857
the pain is one part of the reality
J Peak
Frequent Advisor

Re: EVA 8000 cache monitoring?

We are using round-robin in our multipath setup, with each path weighted equally. On the EVA side, the LUN is set to no controller preference.

We have a single large 500 GB LUN that we have not yet been able to break up, due to system availability demands.

We're running Red Hat Linux as our host OS. The server has four HBAs, two on each of our core SAN switches, giving us a total of 16 paths to the LUN.

Is there a better method for balancing a single LUN than round-robin?

We see roughly a 20:1 read/write ratio on this database LUN.
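For anyone comparing setups: on RHEL 5 with device-mapper-multipath, a device stanza for the EVA 8000 (product string HSV210) typically looks something like the sketch below. The exact values are illustrative only; verify the product string, callout, and timeouts against your own `multipath -ll` output and HP's connectivity documentation.

```
# /etc/multipath.conf -- sketch for an EVA 8000 (HSV210) on RHEL 5.
# Values are illustrative; verify against `multipath -ll` and HP's docs.
devices {
    device {
        vendor               "HP"
        product              "HSV210.*"
        path_grouping_policy group_by_prio       # group paths by ALUA priority
        prio_callout         "/sbin/mpath_prio_alua /dev/%n"
        path_checker         tur                 # TEST UNIT READY probe
        failback             immediate
        no_path_retry        18
        rr_min_io            100                 # I/Os per path before switching
    }
}
```

With `group_by_prio`, I/O is round-robined only across the paths in the highest-priority group (the owning controller's ports), which avoids hammering the proxy controller's mirror port.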
IBaltay
Honored Contributor

Re: EVA 8000 cache monitoring?

J Peak
Frequent Advisor

Re: EVA 8000 cache monitoring?

We're using Emulex cards on all of our hosts. The default multipath software that ships with Red Hat handles our load balancing, using the round-robin policy.

J Peak
Frequent Advisor

Re: EVA 8000 cache monitoring?

Also, to answer your previous question: our host queue depths are typically 0-1. It is when we run out of write cache that we have problems and the queue depth climbs.
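For what it's worth, checking and tuning the per-LUN queue depth on Emulex (lpfc) hosts usually goes roughly like this on RHEL; the parameter name is the lpfc driver's, but the value 30 below is only an example, not a recommendation:

```
# Check the current queue depth for each SCSI device via sysfs (RHEL 5):
#   grep . /sys/bus/scsi/devices/*/queue_depth

# Raise the lpfc per-LUN queue depth at module load time (example value).
# In /etc/modprobe.conf (RHEL 5):
options lpfc lpfc_lun_queue_depth=30
# Rebuild the initrd and reboot for the change to take effect.
```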
IBaltay
Honored Contributor
J Peak
Frequent Advisor

Re: EVA 8000 cache monitoring?

That is an extremely useful link. The new drivers, and support for them, should improve our round-robin behavior. We're testing them now on our staging EVA 8100 with Red Hat 5.2 64-bit.

I still think we'll have a problem with write cache, though. The new drivers should let us write faster, but once we burn through the cache, everyone's writes land directly on the spindles.
Tom O'Toole
Respected Contributor

Re: EVA 8000 cache monitoring?

Depending on the database, maybe this is one of those cases where you should create a separate disk group for the redo logs (or whatever the equivalent is for your app).

The implication from the docs, although detail is sorely lacking, is that cache management is done on a per-disk-group basis, and sequential write access can conflict with random read access going on in the same group.

I don't know if this is practical in your case, but it's worth a try.
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
J Peak
Frequent Advisor

Re: EVA 8000 cache monitoring?

We're running Informix 10.x for our database. Currently, when there are enough transactions that the database must flush memory to disk, the database does a "checkpoint" that blocks transactions. What we've observed is that when it writes this data to disk, if it is a large amount of data, our write and read latencies both hit triple digits. Our database reads and writes the same database files, which sit on a single LUN in a disk group with 144 disks. But LUNs across all of our disk groups within the EVA experience this wait time. Basically, anyone who wants to read or write has to wait.

We've been running full EVAPerf captures for months. Is there a specific set of output that would be helpful in determining the bottleneck on the EVA?
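On the original cache question: EVAPerf does not expose a direct "cache occupancy" counter as far as I know, but per-vdisk flush/mirror activity and latency during a checkpoint are a usable proxy. Assuming object names like `vd` (virtual disk statistics) and `hps` (host port statistics) and flags like `-cont`/`-dur`/`-csv` from the EVAPerf documentation -- check `evaperf -?` on your management server, since objects and options vary by version -- a capture during an Informix checkpoint might look like:

```
rem Run on the Command View management server during a checkpoint window.
rem Object names (vd, hps) and flags (-cont, -dur, -csv) are assumptions;
rem verify them against "evaperf -?" for your EVAPerf version.

rem Virtual disk stats (latency, MB/s, mirror/flush activity), every 5 s for 10 min:
evaperf vd -cont 5 -dur 600 -csv > vd_checkpoint.csv

rem Host port stats over the same window, to correlate front-end load:
evaperf hps -cont 5 -dur 600 -csv > hps_checkpoint.csv
```

Lining the vdisk write latency spike up against the host port MB/s should show whether the slowdown coincides with the write burst saturating the cache flush path.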