tablespace/datafile layout

Brian Gebhard_1 · ‎01-14-2004

Hello all,

Myself and the other DBA here are trying to determine which is the best way to layout our new schemas/tbsp/datafiles. Hardware hp9000 series with a SAN. I am useto laying out 1 tablespace per schema that contains multiple datafiles evenly spaced so that they are 50% filled to start. Preverably in an increment of the number of CPU's of the machine. He currently has multiple tablespaces per schema with 1 to 2 datafiles per tablespace. So basically there is no distinction between a datafile and tablespace. This causes 1 datafile/tablespace to be 25 gig and another is 1 gig. I know there's a performance hit because of this. I want to export and import back into one tablespace with multiple datafiles. I'm basically looking for suggestions or online documentation to help sway our decision one way or another.

thanks in advance.

Sean OB_1 · ‎01-14-2004

What kind of SAN are you on?

Brian Gebhard_1 · ‎01-14-2004

EMC hardware. What specifics are you looking for? I can ask the os admins for more specifics. I'm still reading the EMC manual.

Brian Gebhard_1 · ‎01-14-2004

Mainly I'm interested in the oracle side at the moment. Is it better to have 1 tablespace per schema or multiple ones. His argument is that if you have multiple tablespaces then the application can still function while you restore a tablespace in case of failure. My point is what application do you want to work when some of the tables are missing or corrupted.

Brian Crabtree · ‎01-14-2004

Brian,

I wouldn't bother. Datafiles can be as large as you want, since Oracle keeps the blocks in memory. Multiple tablespaces aren't going to give you any extra performance on the system either, although you shouldn't have a big performance hit for it. Probably the biggest hit you will see is if you get a huge number of datafiles, then the checkpoint will take a while to run, and you could get timeouts on it.

Let me know if you have any questions.

Thanks,

Brian

Hein van den Heuvel · ‎01-14-2004

The traditional reason for multiple datafiles is to spread the IO activity.
In a San, or a striped LVM setup, this is of little or no value as SAN appliance is likely to spread IO over disks for you. Your mission is to verify that your EMC setup behaves that way and does not simply fill one disk, then the next, then the next.

In Oracle I like to have multiple dataspaces to get easy visibiltiy on performance. Oracle will report IO activity per tablespace. Yes it also reports per file under a tabelspace, but you have no control over placement below the tablespace level.

>1 datafile/tablespace to be 25 gig and another is 1 gig.
> I know there's a performance hit because of this

I don't know that. Please explain. Why would the size play any role? It is only the usage that matters. Is that related to the size in your application? I challenge you to show that a single 24 Gb file performs any worse that 4 files each 6 Gb ... living on the same mountpoint/disk. Now if you can spread them out over more physical spindles....

Beside the (statspack/utlXstat) visibility I also like multiple tablespace because they allow me to take selected parts of a schema offline, and with recent Oracle versions maybe transport them, maybe change the block size and so on.

I would only worry about many (hundreds of) tablespaces from a managebility point of view. To many system objects to keep track of! I would not expect a measureable performanc impact.

I would worry about a single tablespace as it remove any sense of control I might have had. Not all access patterns are alike. Some tables / indexes may very well warrant their own (set of) disks. You can not control that under a single tablespace.

If the current setup works, then I would just leave well enough alone. If and when you detect a real quantifiable problem with the current setup, then start thinking about alternatives. And the beauty of that approach will be that at that change point you will have a concrete argument for your change versus the 'this feels better' or 'that's how I always do it' argument you are stuck with now.

hth,
Hein.

Yogeeraj_1 · ‎01-14-2004

hi,

You would rather set up locally managed tablespaces by object size -- setting up extent sizes for "small", "med" and "large". You would want all segments to be placed in a tablespace such that they would never have more then say 1,000 extents.

So, maybe for you if you have some really small tables:

64k extent size -- good for 64k to ~6meg, could goto ~60meg
512k extent size -- good for 512k to ~50meg, could goto ~500meg
1m extent size -- good for 1m to ~100meg, could goto ~1gig
5m extent size -- good for 5m to ~500meg, could goto ~5gig

For your own convenience, if you many schema, you may wish to define several "small", "med" and "large" tablespace "groups".

hope this helps too!

regards
Yogeeraj

No person was ever honoured for what he received. Honour has been the reward for what he gave (clavin coolidge)

Jean-Luc Oudart · ‎01-14-2004

Regarding datafile/tablespace layout we use LMT (Locally managed tablespaces) with different size for extent :
SMALL, MEDIUM, LARGE and XLARGE. We have tablespaces for data and tablespaces for indexes.

The file systems layout : we use stripping over RAID (1+0) on SAN XP128
cf. thread
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=290174

The approach is when possibel is to benchmark the application, analyze results and keep best settings !
In reality, you may not have the opportunity to benchmark. In any case keep track of your performance (server, database, application). You will have a baseline for future reference (when system grows, more users, ...)
For this use the usual suspects (Glamce/mesureware, statspack for Oracle).

Regards,
Jean-Luc

fiat lux

Brian Gebhard_1 · ‎01-15-2004

Ok, lots of points to discuss. First the performance hit I'm referring to:
Here they have a 20 gig datafile. One table inside this datafile is 15 gig. This table is the main table of the application. A file can only be read/written one at a time. You can't spread the object over multiple tablespaces, so it'll be stuck in one datafile. I'm not an sys admin so i'm not positive, but I didn't think you could set up a mt pt/file system that spanned across multiple arrays. Maybe it's a lot different with a SAN.
2. When they do os backups here, one file system takes 50 minutes longer because of these larger datafiles, downtime for cold backups are being affected.
3. As far as being able to take one tablespace offline while the rest of them are online. Maybe if the app was written and planned well for that possibility then I would be fine with taking it offline. But I've never worked at a company yet that planned the application that well. So therefore taking some of the objects offline would cause the application to break.
4. As far as being able to look at I/O based on the object lvl so you can move an object from one tablespace to another, I don't see it being helpful. Maybe is cause the applications I've worked with have only had 100Gig schemas, larger ones might work different, but so far the majority of the I/O for these schemas were based on a handful of tables. Usually 5 or less, so unless all these or most of these tables are in one datafile moving the table won't help. The main I/O for the app goes with the main tables. You could add two datafiles which might help but to me you are still limiting the read/writes. If you go higher then you should just have 1 tablespace with multiple datafiles.
5. I'll have to read up on SAN some more to understand how that hardware is set up and writes files.
##
I can see the benefit in limiting I/O hog tables to certain datafiles. You can do that easier with more then one tablespace. But there's more work to maintain it. I've never had any problems with not being able to maintain an even I/O rate across the file systems with just moving datafiles. As far as being able to take a few tablespaces offline under the application, to me there's no benefit to that. I don't want part of the application to work. If part of the application is standalone then it should be in a different schema.

Thank you all for you suggestions. I'll read up on the link and see what that says. I have to say I'm still not convinced multiple tablespaces for one schema is better then one. As far as a separate index tablespace goes, that I think depends on the type of database you have. How often it gets updated mainly.

Dave Johnson_1 · ‎01-15-2004

We are using Oracle Applications 11i with Oracle DB 8i. This configuration creates 2 tablespaces for each schema, one for data and one for indexes. We have decided to put a limit on datafile size of 2GB so that all UNIX tools will work with the datafiles, cp, rcp, ftp, tar, and such. Our largest tablespace is using several datafiles for data and a few more for the indexes.

We also have all of this on an XP512 disk array which is similar to EMC. Our Production environment makes use of BC, Business Copy volumes, for backups. At midnight we shutdown the environment, resync the BC, split the BC, restart the apps, mount the BC, start a backup of the BC volumes. Our total down time is <20 minutes for the application and the backup can take several hours if needed without affecting PROD. If the backup to tape fails, I can restart it without having to shutdown PROD. Using this I get a cold backup every night. I even have one on disk for use the next day in case I want to clone my environment.

With the XP and EMC, you need to be carefull about getting carried away with "which disks the data is stored on". When the admin carves out LUNs for you, they can all come from the same array group. An array group in an XP is 4 disks working together to provide RAID-1 or RAID-5. If all your "disks/mount points" are on the same array group, that would be bad. Better is to select LUNs from as many array groups as you can when setting up high I/O stuff and a few when setting up low I/O stuff. Keeping PROD and development on different array groups would be ideal, but it cannot always be done.

I have yet in our environment to need to take one application/schema/tablespace off line for any reason in production. That does not mean it will never happen. I have been a UNIX Admin for >10 years but a backup DBA for only 2 years.

Hope this helps
-Dave

Volker Borowski · ‎01-16-2004

Hi,

In the old days a SAP application was installed with seperate TS for
SYSTEM / TEMP / ROLLBACK.
This is the still the same for new releases. (and it still makes a good sense)

Concerning the data, the old layout seperated
masterdata and transactiondata and had seperate indexes for each as well.
The idea was, that an application usually reads masterdata to access corresponding transactiondata afterwards so the goal was to minimize disk-head movements. This was the time of xxxMB or xxxxMB disks.

Typically we now have 70GB or 140GB disks in raid or stripe configuration. So the IO is done in parallel anyway and we have lots of diskcache on each disk in addition. So do not care about it any more.

The new layout for SAP seperates only data, that might need to be frequently deleted to avoid fragmentation. I.E. the release dependend data is in a single tablespace and when you upgrade the application you get a new tablespace and drop the old one.

It is worth to seperate queue-like tables/indexes, that get a lot of deletes and inserts for the same reason. -> These might require reorgs for fragmentation reasons. Keeping them seperate avoids to drill holes in your datafiles.

As far as distribution of storage is required, a tablespace does not really have to do with the location on disk any more.
You can manage it on datafile level like
ALTER TABLE ALLOCATE EXTENT IN DATAFILE ...
but who likes to do so, if the datafiles are striped ?

I agree with your point of the backup access for big datafiles. This is a reason to have a few more of them to balance the backup sessions. I also agree, that a lot of datafiles is costy for a checkpoint. (there is always one death to die).

Concerning the point where data of a table is located, keep in mind that due to the use of partitions a table does not need to reside only in a single tablespace any more.
This is still valid for non-partitioned tables, but if a table is partitioned it can very well spread about several tablespaces. http://www.bikle.com/TThier/DBAtips/partitionDemo1.html

Summary:
1) I'd vote for a reasonable number of datafiles (for backup and handling reasons).
In fact I still stay with 2000M datafiles, because I faced some 2GB bugs in the early days of this boundary being broken.
2) I'd never care in which tablespace they are beside they might be reorg-candidates.

No real hard data to decide...
Make your choice :-)
Volker

Categories

Company

Local Language

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Discussions

Forums

Forums

Discussions

Forums

Discussions

Forums

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Community

Resources

Other HPE Sites

Discussions

Forums

Blogs

tablespace/datafile layout

tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout

Re: tablespace/datafile layout