Operating System - HP-UX

CAS_2
Valued Contributor

Just a big EVA LUN or many smaller EVA LUNs ?

Hi

My entire database (nearly 60 GB) sits on a single EVA LUN (70 GB). The whole DB resides on VxFS file systems.

Are there DB performance advantages to splitting that 70 GB LUN into many smaller LUNs (for instance, 7 x 10 GB) on the same EVA array? I'm aware of the performance gains from spreading data files across different spindles when the disks are attached directly to the box, but in my case they are virtual disks.

Since every I/O operation actually goes to the same device file (for example /dev/dsk/c8t1d2), is there any I/O constraint in HP-UX (such as internal I/O queues, beyond the VxFS buffer cache and the Oracle buffer cache) that might affect DB performance?

Thanx in advance.
Steven E. Protter
Exalted Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

Shalom,

The biggest factors in DB performance are:

1) Structure of the LUN
2) How the database works.

If it's a write-intensive database (item 2), item 1 becomes more important.

If the LUN is RAID 5 and the database is write intensive, every write has to be done in a lot more places and I/O will back up. If the database is primarily a data mine or inquiry-oriented, a RAID 5 LUN structure will be fine.

From a database performance standpoint, you are better off with a smaller OS buffer cache (dbc_max_pct low and close to dbc_min_pct) and a larger SGA than vice versa.

The ninode and vx_ninode tunables (the HFS and VxFS inode table sizes) can also have an impact on performance.
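
If you want to look at those tunables, something along these lines should work; the values shown are placeholders only, not recommendations (kctune is 11i v2 and later, use kmtune plus a kernel rebuild on older releases):

kctune dbc_min_pct            # current minimum buffer-cache percentage
kctune dbc_max_pct            # current maximum buffer-cache percentage
kctune dbc_max_pct=10         # example value only: cap the OS buffer cache at 10%
kctune ninode                 # HFS inode table size
kctune vx_ninode              # VxFS inode cache size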

As far as one big LUN versus lots of little LUNs goes, the performance advantages are not that huge.

Large LUNs make management easier since everything sits together and the filesystem can be expanded easily. Smaller LUNs make dealing with performance problems much easier, because converting to RAID 10 can be done on the disk array without taking down the database.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
TwoProc
Honored Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

I like the "Lotsa Lun" approach rather than the "BIG LUN" approach.

The limitation is in the SCSI request area. Each LUN gets one SCSI request queue (it looks like a single device to the host), and that is a problem for heavy I/O because one queue has to handle all of the I/O for all of the disks and controllers behind it. You also run into a concurrency issue here. For that reason, I put up a fair number of smaller LUNs (7 GB, 11 GB or 14 GB) and use distributed striping across them. I also like to rotate the primary and alternate paths across the LUNs, via PVGs in my volume groups, so the load is spread across fibre cards, SAN ports, storage-system ports, controllers and disks; I take all of that into account when I decide where my LUNs come from and how they are built into volume groups. It's a lot of extra work, but a well-balanced storage frame is generally a fast storage frame.
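
To make the PVG idea concrete, a rough sketch of that kind of layout with LVM physical volume groups might look like this (the volume group name, device files and sizes are invented examples, and the /dev/vgora group file is assumed to exist already):

vgcreate -g PVG0 /dev/vgora /dev/dsk/c8t0d1 /dev/dsk/c8t0d2      # LUNs reached via controller/path A
vgextend -g PVG1 /dev/vgora /dev/dsk/c10t0d1 /dev/dsk/c10t0d2    # LUNs reached via controller/path B
# Distributed allocation (-D y) lays successive extents on different PVs,
# and PVG-strict (-s g) ties the allocation to the PVG definitions above
lvcreate -D y -s g -L 8192 -n lvol_data /dev/vgora
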
We are the people our parents warned us about --Jimmy Buffett
Steve Lewis
Honored Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

I am in between the two approaches above.

I like to use all controllers all of the time.

I like to separate my logs from my data, from my metadata and from my unload areas. I like to separate various database tables that are frequently joined together. Striping helps, but using different disks is always going to be quicker.

I like to have different fibre strands going to the tape drive (or backup array).

Managing small and large databases over the past 12 years has taught me that as your data grows, your storage always gets more complicated, so management and documentation of the storage get much more complicated, DR tests get more complicated, and adherence to your naming standards becomes vital, as does up-to-date documentation.

There are also technical limitations to consider in advance.

So you trade off manageability against better performance.

At the end of the day, you should know your application better than we do; it depends on the application/database profile more than anything else.


TwoProc
Honored Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

Steve, you and I are in agreement on the separation topic. I separate and divide, then aggregate and balance. It takes a lot of hardware to do it this way, but I always like what I end up with.

I also agree about how much management is involved. It's not small, but it's what we're here for.
We are the people our parents warned us about --Jimmy Buffett
A. Clay Stephenson
Acclaimed Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

One of the chief disadvantages of large LUNs is that they APPEAR to be a bottleneck to host-based performance tools like Glance. Glance sees this thing as a single physical disk (no matter that it may actually be made up of 20 physical disks), and all it knows (or can know) is that a heaping helping of I/O is going to and from that "disk". In most cases, breaking a large LUN into smaller LUNs makes things APPEAR better, but the actual I/O throughput changes very little.

However, there is one strategy that applies to essentially all disk arrays: spread the I/O over as many SCSI paths as you have. For example, let's say that your current 70 GiB LUN has a primary path (SCSI path A) and an alternate (SCSI path B) for redundancy, so that a cable or HBA failure is tolerated. This is a perfectly good and valid practice; however, notice that SCSI path B goes unused most of the time.

A better approach would be to split your 70 GiB into as many smaller LUNs as you have SCSI paths from the host to the array. In this example that would be two, so you would create two 35 GiB LUNs to replace the single LUN. Next, you make the primary path for LUN0 SCSI path A (alternate B), and you make the primary path for LUN1 SCSI path B (alternate A). You then put these PVs into a volume group and stripe each LVOL over both PVs, typically with 64 KiB-256 KiB stripes. This way you efficiently spread the I/O over both SCSI paths and still wind up with few LUNs from a manageability standpoint. Spreading the I/O over SCSI paths like this may or may not make a huge difference in performance -- depending on things like whether a multipathing product such as PowerPath is already installed -- but it will never do harm.
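
For illustration only, the LVM side of that two-LUN layout might look roughly like this; the device files, minor number and sizes are invented, and with LVM the first link added for a physical volume becomes its primary path:

# group file for the new volume group
mkdir /dev/vgdb
mknod /dev/vgdb/group c 64 0x010000
pvcreate /dev/rdsk/c8t0d1                    # LUN0 (seen here through path A)
pvcreate /dev/rdsk/c8t0d2                    # LUN1 (also seen through path A)
# LUN0: primary = path A, alternate = path B
vgcreate /dev/vgdb /dev/dsk/c8t0d1
vgextend /dev/vgdb /dev/dsk/c10t0d1
# LUN1: primary = path B, alternate = path A (add the path-B link first)
vgextend /dev/vgdb /dev/dsk/c10t0d2
vgextend /dev/vgdb /dev/dsk/c8t0d2
# stripe each LVOL across both PVs, 256 KB stripe size
lvcreate -i 2 -I 256 -L 35840 -n lvol_ora /dev/vgdb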

If it ain't broke, I can fix that.
Volker Borowski
Honored Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

Additional thought,

Check the storage docs on cache configuration. If you are the only host on this box it should not matter at all, but if you have to share it with other servers, some cache parameters can depend on the number of LUNs. So using more LUNs may give you a better overall quota of SAN cache memory and/or separate cache areas with their own LRU lists.

For example, we switched the Oracle log filesystems from one LUN to two LUNs to double their cache size. I do not know exactly what the SAN guys configured, but the result was pretty impressive.

Volker
TwoProc
Honored Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

Well, I expected that one from A. Clay (and boy, I hate to disagree with him more than anybody) - but the fact is, our disk systems have always APPEARED heavily loaded in Glance since we went to a storage network, and ditto for the network since we went to 1 Gb fibre - and they still do, and that's not what our decisions were based on (I know, A. Clay didn't *say* that's what *we* did, but I wanted to clear that up). I just wanted to point out that we don't naively take what the tools say without thinking about what they represent. We actually use Glance, PerfView, and Mercury LoadRunner (plus clocks and timers) to evaluate total system throughput, along with all of the other factors, information, and applied logic.

But, aside from all of that, think about what it means to have one BIG LUN representing a large space from the storage server: that LUN is addressed and handled by the OS as a *SINGLE* SCSI queueing structure through which all I/O requests pass in linear fashion.
That sounds like a concurrency issue to me.
So my approach is more LUNs to get more concurrency; the reasoning is based simply on that premise. I can certainly imagine arguments that the OS can somehow handle a single list of I/O requests faster than multiple queues of them (maybe analogous to a RISC-vs-CISC argument), but in general I feel that the recommendation for more LUNs is the safest and, as A. Clay says, it certainly won't hurt.

As far as spreading the I/O across multiple paths goes, I'd add one dimension that is often overlooked. If you're using a SAN and you've got, for example, 2 lines coming from the server into the SAN switch and 2 lines going to a single side/portion/segment of a storage server, then you've really got 4 paths to balance across (not just two). I'd recommend you use them all, in a PVG inside your VG (and thus your lvol), to balance across as much of your available hardware as possible. Our experts may argue that this too is not strictly necessary, but it's something I'd still recommend, in an effort to wring the best possible performance out of your system.

To be fair, I DON'T ALWAYS seek performance over manageability, and what you choose to do may have more to do with your style and your appreciation for what is involved in system management. For example, I very purposely avoid using raw partitions for my databases, even though I know it is inconsistent to go through all of the trouble mentioned above to wring every bit of I/O out of the server and then not do the same when it comes to raw partitions (for tablespaces). Following my own recommendations above, I should logically use raw partitions as well, just to make sure I'm getting every last bit of performance from the server. So what I'm finally getting at is this: do the things to improve your system that you feel are worth it for your needs and your organization. Feel free to use the recommendations, as that is all they are - recommendations, not scientific proof.

Lastly, I humbly believe that if you did nothing other than blindly follow and learn exactly what A. Clay advises for all decisions and policies on systems administration, in the end you'd probably end up being one of the best sysadmins out there - certainly better than I.
We are the people our parents warned us about --Jimmy Buffett
V.Manoharan
Valued Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

Hi CAS,
I don't think you will see disk I/O performance improvement from making smaller LUNs. Normally in virtual arrays, whether it is a VA or an EVA, the read/write and RAID-management overhead is handled by the controller.

The EVA controllers will take care of the writing; we need not worry.

regards
Mano
A. Clay Stephenson
Acclaimed Contributor

Re: Just a big EVA LUN or many smaller EVA LUNs ?

There is certainly no "correct" answer to this question other than to measure for yourself, using your hardware and your applications. Generally, I've done that already on a sandbox, so I know what reasonable settings for my environment are.
It's also generally desirable to strike a reasonable balance between performance and manageability --- and typically there is some contention between those two issues.

The same can be said for choosing raw I/O over cooked filesystems, and for choosing to bypass the UNIX buffer cache for Oracle I/O even on fully cooked files.
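
If you want to experiment with the cooked-files-without-buffer-cache approach, the usual knobs are the OnlineJFS mount options; the logical volume and mount point below are just examples:

# VxFS (OnlineJFS) direct I/O for an Oracle filesystem - bypasses the UNIX
# buffer cache so the SGA does the caching
mount -F vxfs -o delaylog,mincache=direct,convosync=direct /dev/vgdb/lvol_ora /oradata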

You should also experiment with scsi_max_qdepth, but in any event, the only way to really know on your platform is to measure.
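
A rough sketch of how you might inspect and adjust that queue depth (the values are examples, not recommendations; use kmtune instead of kctune on pre-11i v2 releases):

kctune scsi_max_qdepth                         # system-wide default queue depth per LUN
kctune scsi_max_qdepth=16                      # example value; measure before and after
scsictl -a /dev/rdsk/c8t0d1                    # show a single LUN's current attributes
scsictl -m queue_depth=32 /dev/rdsk/c8t0d1     # per-device override, example value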

Finally (just so I can attract additional ire), take much of what you learned in Oracle classes with a grain of salt. I will admit that it's been several years since I've attended one, but even then, when disk arrays and VxFS filesystems were already common, they were still talking about "spindles" and the importance of matching the filesystem block size to your database block size --- and had no idea what an extent-based filesystem was.

So the real answer to your question is: always measure.
If it ain't broke, I can fix that.