Operating System - HP-UX

How to choose stripe size in large Data warehouse system

 
Berson Mark
Occasional Contributor

I am about to set up a 60TB Oracle database.
The storage is a DMX4 with a Meta size of 240GB (8x30GB Hypers)
The recommendation was to define RAID10 striping on host side using VxVM 5.0
The Oracle block size will be 16KB.
The db files will be 10GB each
The O.S. is HP-UX 11i v3

Can anyone recommend the optimal stripe size in VxVM for such a system?
3 REPLIES
Steven E. Protter
Exalted Contributor

Re: How to choose stripe size in large Data warehouse system

Shalom,

Couple of good documents.

http://docs.hp.com/en/B7961-90025/ch12s04.html

http://docs.hp.com/en/5187-1369/ch01s06.html

There is no optimal stripe size. You have to figure out, possibly through testing, what size works for you.

I've personally always favored handling that on the storage side of the house. You can actually change things on a running system if you handle it on your storage array.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Hein van den Heuvel
Honored Contributor

Re: How to choose stripe size in large Data warehouse system

>> How to choose stripe size in large Data warehouse system

You don't. Not anymore.

There is very little reason, or none, to worry about this for too long. Just do extent-based striping at the PE level and be happy.

The original reason for striping was to keep multiple disks busy, to avoid bottlenecks. Well, the DMX will take care of that.

The next reason was to avoid contention at the HBA/fibre/controller level. Well, we are talking gb/sec there, so there is little point in f*&ting around with kilobytes; chunks of several MB will be just fine for that.

And surely for a data warehouse, you'll be doing partitioning and executing queries with degrees of parallelism, so that will keep all disk paths and controllers busy at the same time right there.
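
To make the kilobytes-versus-megabytes point concrete, here is a rough sketch of how many stripe columns a single contiguous read touches in a plain round-robin stripe. The 8-column layout and the 1 MB read size (for example 64 blocks of 16KB) are assumptions for illustration only, not values from this thread, and the function name is made up for the example.

```python
KB = 1024
MB = 1024 * KB

def columns_touched(offset, length, stripe_unit, ncol):
    """Set of stripe columns hit by one contiguous I/O in a simple
    round-robin stripe (stripe unit k lives on column k % ncol)."""
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    return {unit % ncol for unit in range(first_unit, last_unit + 1)}

# Assumed values: 8 columns, one 1 MB multiblock read starting at offset 0.
NCOL = 8
READ = 1 * MB

for su in (64 * KB, 256 * KB, 1 * MB, 4 * MB):
    n = len(columns_touched(0, READ, su, NCOL))
    print(f"stripe unit {su // KB:>5} KB -> {n} column(s) per 1 MB read")
```

With a small stripe unit a single large read fans out over every column, while with a stripe unit of several MB each read stays on one column and the spreading comes from many parallel query slaves hitting different parts of the volume.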

>> I am about to set up a 60TB Oracle database.

>> The recommendation was to define RAID10 striping on host side using VxVM 5.0

Mirroring on the server side may well be desirable, but are the Metas not already constructed with full redundancy?

>> The Oracle block size will be 16KB.

For everything? That's probably good for a warehouse application of the indicated size. But keep an open mind about some tables/tablespaces perhaps being better at a different size, notably 8KB. Oracle does allow you to mix 'n match.

>> The db files will be 10GB each

That seems small, and will leave you and Oracle to manage 6000 files.
What's good about 10GB?
Isn't 20GB twice as good, and 40 GB better still?
Go in with gusto!
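
For reference, the file-count arithmetic behind that remark can be sketched as below. Whether "60TB" means decimal or binary units is not stated in the thread, so both are shown; the function name is just for illustration.

```python
DB_SIZE_TB = 60  # database size from the original question

def datafile_count(file_size_gb, gb_per_tb=1024):
    """Number of datafiles needed to hold DB_SIZE_TB, rounded up.
    gb_per_tb=1024 assumes binary units; pass 1000 for decimal."""
    total_gb = DB_SIZE_TB * gb_per_tb
    return -(-total_gb // file_size_gb)  # ceiling division

for size_gb in (10, 20, 40):
    print(f"{size_gb:>2} GB files: {datafile_count(size_gb, 1000)} (decimal) "
          f"or {datafile_count(size_gb)} (binary)")
```

Doubling the file size halves the number of files to create, name, back up and monitor.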

As you define this, I believe you need to analyze and document why Oracle ASM should, or should not, be chosen.
I'm not saying you should, but you MUST evaluate it now, before deployment, not next year. It would be inexcusable not to... IMHO of course.

>> The O.S. is HP-UX 11i v3

Good.

Hope this helps some,
Hein van den Heuvel
HvdH performance consulting.
Yaniv Valik
Occasional Advisor

Re: How to choose stripe size in large Data warehouse system

Hello Mark,

Here are a few recommendations that may help you decide:

- The stripe size should be a power-of-two multiple of the track size configured on the DMX (32 KB on a DMX-2, 64 KB on a DMX-3), and should also line up with the database block size and the host I/O size. Alignment of database blocks, Symmetrix tracks, host I/O size, and the stripe size can have considerable impact on database performance. Typical stripe sizes are 64 KB to 256 KB, although the stripe size can be as high as 512 KB or even 1 MB.

- Multiples of 4 physical devices for the stripe width are generally recommended, although this may be increased to 8 or 16 as required by LUN presentation or SAN configuration restrictions. Care should be taken with RAID 5 metavolumes to ensure that members do not end up on the same physical spindles.

- When configuring in an SRDF environment, smaller stripe sizes (32 KB for example), particularly for the redo logs, are recommended. This is to enhance performance in synchronous SRDF environments due to the limit of having only one outstanding I/O per hypervolume on the link.

- Data alignment (along block boundaries) can impact performance. Refer to operating-system-specific documentation to learn how to align data blocks from the host along Symmetrix DMX track boundaries.

- Ensure that volumes used in the same stripe set are located on different physical spindles.

The above is based on EMC's "Oracle Databases on EMC Symmetrix Storage Systems" document.
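
As a rough illustration of the first and fourth points, the sketch below lists candidate stripe sizes that are power-of-two multiples of the track size and whole multiples of the Oracle block size, and checks whether a given byte offset sits on a track boundary. The 64 KB track size is the DMX-3 figure quoted above; assuming the same value for a DMX-4 is my assumption and should be checked against EMC's documentation. The function names are made up for the example.

```python
KB = 1024

def candidate_stripe_sizes(track_kb, db_block_kb, max_kb=1024):
    """Power-of-two multiples of the track size (in KB), up to max_kb,
    that are also whole multiples of the database block size."""
    sizes = []
    size = track_kb
    while size <= max_kb:
        if size % db_block_kb == 0:
            sizes.append(size)
        size *= 2
    return sizes

def is_track_aligned(offset_bytes, track_kb):
    """True if a byte offset falls exactly on a track boundary."""
    return offset_bytes % (track_kb * KB) == 0

# Assumed: 64 KB track (DMX-3 value), 16 KB Oracle block from the question.
print(candidate_stripe_sizes(track_kb=64, db_block_kb=16))
print(is_track_aligned(128 * KB, track_kb=64))  # offset on a boundary
print(is_track_aligned(16 * KB, track_kb=64))   # offset inside a track
```

This only narrows the list; picking among the candidates still comes down to testing with a realistic workload.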

Last, if you use RAID1 or RAID5 devices, you may want to skip mirroring at the LVM level and just do striping.


Yaniv Valik
Continuity Software