Re: Dealing with a badly fragmented disk

John A. Beard · ‎01-16-2008

Hi,

We are experiencing poor performance associated with a particular SAN disk (DSA11) that appears to be extremely badly fragmented. This disk is part of an OpenVMS environment. The biggest concern relates to the amount of time it is taking to perform an incremental backup of this disk.

The device was created with 314572800 blocks and a /CLUSTER value of 100. There are a total of 486 directories, 15881 files and 237167981 blocks currently in use. 220 files (225126315 blocks) are recreated on a daily basis via BACKUP commands from other disks to DSA11.

We do not have a defragmentation tool running on this server, so everything is left to RMS to sort out. We were getting reports of significant window turns and split i/o's, so we set acp_window to 255. This has not changed the status of the window turns or split i/o's on our daily reports.

I have a couple of basic questions:

[1] Based on the information below (a cross section using DFU fragments > 1000), just how badly, in your opinion, do these figures reflect the unhealthy state of the disk?

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1C]CRD_INDEX_PROD.DBS_BACKUP_12;1 7168000/7168000 26/1681

DSA11:[ORACLE_DATABASE.EXPORTS]PROD1_SYSTEM.EXP;119 19227768/19227800 1125/76557

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1]TDSA_EXPORT.DMP;225 4134136/4134200 509/34540

DSA11:[ORACLE_DATABASE.EXPORTS]AM0589P_SYSTEM.EXP;225 4132544/4132600 504/34198

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1]TDSA_ITEM1_DATA1.DBS_BACKUP_35;1 8282112/8282200 63/4233

[2]

As regards what happens when a new file is being created on this disk, i.e. a 4 GB database copy, how exactly does the physical placement of this file get determined? Not using something like COPY /CONTIG, does RMS simply place this new file all over the shop, even if it means creating 1000s of fragments?

[3] We are contemplating purchasing DFO, but in the interim, does DFU offer the same base functionality for defragging as its big brother? I have tried a single pass using DFU, but it doesn't really seem to have helped. We noticed that a number of files could not be moved because of lack of space, but do you think DFO might at least get us over the biggest part of the problem?

Glacann fear críonna comhairle.

Robert Gezelter · ‎01-16-2008

John,

While these files may be the most fragmented files on the disk, they may not be the ones causing all of the window turns and split I/Os (note that all of these files are exports and dumps).

Dumps and exports frequently grow as they are created, so it is not uncommon for them to expand in many steps. A first question that I have is: What are your RMS parameters for the jobs that create these files (SHOW RMS) and what programs are being used to create these files.

Additionally, I would suggest tracking down exactly what is happening BEFORE setting up DFO for what may turn out to be a different problem altogether (e.g., if files which are almost never read are fragmented, it is not that much of an issue).

- Bob Gezelter, http://www.rlgsc.com

Hoff · ‎01-16-2008

0: There are storage controllers which have performance issues around unaligned transfers. This is why a senior storage engineer within OpenVMS engineering has recommended a multiple of 16 for the disk cluster factor.

1: dump file and listing file and archive fragmentation is not centrally relevant; fragmentation matters with performance critical files, and rather less so with infrequently-accessed and non-critical files.

2: that depends greatly on what creates it, and how it is extended. Databases can play all manner of games with their files and file placement.

A new file created through typical RMS means -- and databases can use completely different interfaces -- would follow the application and process and disk and system defaults, and would create and extend the file accordingly. If the creation or particularly the extension is in tiny hunks (and a disk this big should be set for big extents), then the file will be fragmented. Incremental extension tends to be worst case, as other activity can grab fragments.

3: I don't know that DFU and DFO use different schemes; I'd expect both use the move file primitive within OpenVMS.

4: As Bob G writes in his reply, you definitely do need to identify the source of the performance troubles.

5: I'd here tend to look at the volume extent size, and at the settings of the processes that create the files. And at the volume contention. And I'd fix the cluster factor at your next opportunity.

6: Disk fragmentation can be your friend, though too much of a good thing can certainly lead to performance problems.

7: I'd not normally expect to see accesses within archival files, save for sequential writes and extensions. And extensions. If you know the size of the file to be created, pre-size and pre-allocate it. If you don't, then pick a reasonable guess at the size and pick an extent of 500 to 1000 or such; find the knee in the performance curve. Bigger extents may or may not provide a payback.

8: Consider splitting up the disks and disk structures differently, if this volume is being targeted by multiple nodes, and targeted for heavy I/O in parallel. Consider dedicating a volume per node, for instance. This to avoid lock contention.

DFU or DFO might well clean up the existing on-disk allocation, but if the creation and extension settings are stuffed, then the fragmentation will return or will continue.

(And why the snark is ITRC complaining about embedded tags and XSS in this posting when there are no embedded tags in this posting?)

Stephen Hoffman
HoffmanLabs LLC

Steven Schweda · ‎01-16-2008

> (And why the snark is ITRC complaining
> about embedded tags and XSS in this posting
> when there are no embedded tags in this
> posting?)

When this happened to me most recently, a
re-Submit worked. The ITRC forum software
appears to be approximately garbage.
Sometimes "connection refused" is more of a
blessing than a curse.

John A. Beard · ‎01-16-2008

I am not up on what you are referencing by snark, etc., but I don't believe it was anything intentional on my part. No sooner had I posted this request than I was unable to get back in to view your replies... please don't shoot the messenger, as the responses are sometimes confusing enough.

Before I delve into your suggestions, I was just curious about the CLUSTER size issue. If I follow the VMS Help suggestion for ODS-2 disks, then the result is 301 for the size of the volume, and I can't see how that relates to a multiple of 16....

Glacann fear críonna comhairle.

John A. Beard · ‎01-16-2008

Hi Bob,

This whole thread may have to be put on hold for a week. The Application expert is away until then, and I will need to get him to answer some of your questions. I need to find out from him what exactly is present on this disk, what is being accessed on a continuous basis, and what program(s) are being used to create all these files originating from other disks.

I have attached the output for RMS_DEFAULT which I believe are the default values -

I may well be repeating myself here, and I fully accept that there may well be other problems lurking in the background, but the focus of our efforts has up to now been trying to figure out why the total elapsed time for backing up this particular disk is now so much longer than before. We are backing up all this data to a TSM server, using ABC on the client side. We have now got to the point where it is taking more than 24 hours to complete a successful backup. I don't believe this is due to additional amounts of data being backed up, as it appears fairly consistent from one week to the next. Once again, everything seems to point to this particular disk as being the bottleneck. Things get even worse if the following day's backup kicks in before the previous one has completed. I have already written something to put things on hold if that situation should occur.

Glacann fear críonna comhairle.

Robert Gezelter · ‎01-16-2008

John,

Having debugged many of these problems over the years, I find that certain generalities repeat on a regular basis. Often, when this type of problem is looked at in its totality, all (or many) of the symptoms are interconnected.

The RMS parameters appear to be the defaults. In particular, there is a distinct possibility that the files are being extended in small extensions on an ongoing basis. This can cause extremely long delays in elapsed time when processing these tasks.

- Bob Gezelter, http://www.rlgsc.com

Jur van der Burg · ‎01-16-2008

>DSA11:[ORACLE_DATABASE.EXPORTS]PROD1_SYSTEM.EXP;119 19227768/19227800 1125/76557

So this means 1125 file headers and 76557 fragments which is indeed a VERY bad fragmentation. In general this means that the default extension of the file is way too low. I've seen this when the default of 3 blocks is used, and the file is growing 3 blocks at a time. Combine that with multiple files that get extended and there's your problem. The proper way is to preallocate the expected storage for the files, and be generous with the extend size.
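
To put rough numbers on the point about small extend sizes, here is a minimal sketch. It assumes a simplified model in which every extend request is rounded up to a whole number of clusters and no two requests ever land next to each other; the function name is made up for illustration and is not part of any VMS tool.

```python
import math

def extension_requests(file_blocks, extend_blocks, cluster_blocks):
    """Count the extend requests needed to grow an empty file to file_blocks.

    Simplified model (an assumption): each request is rounded up to a
    whole number of clusters before it is granted.
    """
    per_request = math.ceil(extend_blocks / cluster_blocks) * cluster_blocks
    return math.ceil(file_blocks / per_request)

FILE_BLOCKS = 19227800   # allocated size of PROD1_SYSTEM.EXP quoted above
CLUSTER = 100            # cluster factor of DSA11

for extend in (3, 2000, 65535):
    count = extension_requests(FILE_BLOCKS, extend, CLUSTER)
    print(f"extend={extend:>5}: {count} extend requests")
```

In this model every request is a chance for another writer to grab the adjacent blocks, so the request count is a rough upper bound on the fragments one file can pick up while it grows.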

I've seen this in the past when a customer was complaining about performance, and they were working with defraggers to get around it. It was much simpler: just create the file the proper way and give it a big extend size. Add some global buffers as well (this was just plain RMS) and the number of disk I/Os in that case dropped from 70 to 4 per second. And that was about 20 years ago, when the hardware was way slower than today.

Jur.

Willem Grooters · ‎01-16-2008

[1] I wonder what the extent size of the files is. If this is relatively small, you may well run into large numbers of extents. If you know the size of the files, and have some idea of the growth rate, it would be an idea to create the files with enough space in one go. Increase the extent size anyway.
As stated by Hoff, defining an appropriate cluster size is not just "pick any number". If you used 100 on this disk, 96 or 112 would have been better...
[2] AFAIK, backup will try to re-use clusters that have been marked 'free', and if these are scattered over the disk, you may find a lot of fragments. Over time, this gets worse. A regular image backup to an empty disk will no doubt help, but that has to fit into your schedule, which is probably a problem.

Do you use indexed-sequential files in your application, and are these updated very frequently (new records added, records deleted, records updated)? If so, take a look at the internal fragmentation of these files. These files need to be converted on a regular basis, and the sizing of key and data areas needs to be monitored and adjusted once in a while. I've seen a significant increase in application performance after a convert alone, and even more after resizing the areas.

[3] If DFU has a problem, DFO might as well. It may well be that there is enough space altogether, but too little contiguous space. The only solution in that case is creating an image backup of this disk to a re-initialised disk with a cluster size that is a multiple of 16 (112 would be my choice in this case).
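
For reference, the 96 and 112 figures are simply the multiples of 16 on either side of the current cluster size of 100. A tiny sketch (the helper name is hypothetical):

```python
def neighbouring_multiples(value, base=16):
    """Return the multiples of base just at or below, and at or above, value."""
    lower = (value // base) * base
    upper = lower if lower == value else lower + base
    return lower, upper

print(neighbouring_multiples(100))
```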

Willem Grooters
OpenVMS Developer & System Manager

Peter Barkas · ‎01-17-2008

DFO is going to do a better job than DFU at least because, as far as I am aware, DFO consolidates free space and DFU does not. This can be a critical issue on badly fragmented disks.

Hein van den Heuvel · ‎01-17-2008

I would suggest that more than 10 headers (or 20, as your fragments>1000 implies) is painful performance-wise IF the file is accessed at all. But I have seen OpenVMS more or less happily deal with actively used files with thousands of headers. (Andy G was impressed/surprised by his own work!)

Even if a heavily fragmented file is not accessed, all those fragments are likely to cause other actively used files to be more fragmented.
So I would manually take your worst files and roll them out to another disk /cont. Roll them back only if/when you have to.

John>> If I follow the VMS Help suggestion for ODS-2 disks, then the result is 301 for the size of the volume and I can't see how that relates to a multiple of 16....

They do not relate.

The 301 comes from the allocation bit map being the traditional 1 million bits = 256 blocks.

The multiple of 16 is driven mostly by storage characteristics, and a little by the way RMS and the XQP work.
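
As a sketch of where the 301 comes from: the divisor of 255 * 4096 below is my recollection of the formula in the HELP text for INITIALIZE /CLUSTER_SIZE, so treat it as an assumption. With the rounder "256 blocks" figure the same division gives exactly 300.

```python
VOLUME_BLOCKS = 314572800   # size of DSA11 from the original post
BITS_PER_BLOCK = 512 * 8    # 4096 bits in one 512-byte disk block

def default_cluster(volume_blocks, bitmap_blocks=255):
    """Smallest-cluster estimate that keeps the bitmap within bitmap_blocks.

    bitmap_blocks=255 is an assumed value recalled from the HELP text.
    """
    return volume_blocks // (bitmap_blocks * BITS_PER_BLOCK)

cluster = default_cluster(VOLUME_BLOCKS)
print(cluster, "multiple of 16" if cluster % 16 == 0 else "not a multiple of 16")
print(default_cluster(VOLUME_BLOCKS, bitmap_blocks=256))
```

Either way the number is driven purely by the bitmap size, which is why it has no connection to the multiple-of-16 advice.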

The cluster size should be as big as possible within the acceptable 'waste' of space constraints.
You indicate 16,000 files.
Let's round up generously to 50,000 files.
Each of those might waste up to a cluster minus 1 blocks. What is your max waste?
5% of the volume? Then your max clustersize would be 314M/(20*50000) = 314 blocks.
So I would pick 256 or 512, probably 512.

That way you can count on about 5x fewer fragments due to casual, careless allocation or free space fragmentation.
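
The waste-budget arithmetic above can be sketched as follows. The function name is made up for illustration, and the worst case assumed is that every file wastes nearly one full cluster.

```python
def max_cluster_for_waste(volume_blocks, file_count, waste_percent):
    """Largest cluster size whose worst-case slack stays within the budget.

    Worst case assumed: each file wastes almost one whole cluster.
    Integer arithmetic is used to avoid floating-point surprises.
    """
    return (volume_blocks * waste_percent) // (100 * file_count)

limit = max_cluster_for_waste(314572800, 50000, 5)
print(limit)

# Powers of two on either side of the limit, as candidate cluster sizes.
below = 1 << (limit.bit_length() - 1)
print(below, below * 2)
```

With the generous 50,000-file estimate the limit lands between 256 and 512, which is why either is a defensible pick.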

John>> I have attached the output for RMS_DEFAULT which I believe are the default values -

Correct. Free performance to be found.
Change the default extent to 2000 (or 5000)
SET RMS/SYS/SEQ/BUF=4/BLO=64
SET RMS/SYS/IND/BUF=20
SET RMS/SYS/EXTEN=2000

Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

Robert Gezelter · ‎01-17-2008

John,

I agree with Jur, and have demonstrated the phenomenon to many clients over the years.

The more important question here is whether this is a cumulative situation that has gradually been getting worse, or a situation that is being created on a daily basis.

Running a defragmenter is a cure for the side effect, not a cure for the underlying condition.

- Bob Gezelter, http://www.rlgsc.com

Jon Pinkley · ‎01-19-2008

RE: "The device was created with 314572800 blocks and a /CLUSTER value of 100. There are a total of 486 directories, 15881 files and 237167981 blocks currently in use. 220 files (225126315 blocks) are recreated on a daily basis via BACKUP commands from other disks to DSA11"

From your description it sounds like you are using a scratch disk for files that have long lifetimes and for a temporary holding place for a disk-to-disk-to-tape operation.

95% of the used disk space is being recreated every day (225126315/237167981), and that's in 220 large files (average size just over 1 million blocks each). The remaining 5% used is in "small" files with an average size of about 758 blocks ((237167981-225126315)/15881).
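
For anyone who wants to check the arithmetic, here is a short sketch using only the figures from the original post. Note that dividing by all 15881 files slightly understates the small-file average, since 220 of those files are the big ones; both variants are computed.

```python
USED_BLOCKS = 237167981   # blocks in use on DSA11
BIG_BLOCKS = 225126315    # blocks in the 220 files recreated daily
BIG_FILES = 220
TOTAL_FILES = 15881

share = BIG_BLOCKS / USED_BLOCKS
avg_big = BIG_BLOCKS / BIG_FILES
small_blocks = USED_BLOCKS - BIG_BLOCKS
avg_small = small_blocks / TOTAL_FILES
avg_small_excluding_big = small_blocks / (TOTAL_FILES - BIG_FILES)

print(f"share recreated daily : {share:.1%}")
print(f"average big file      : {avg_big:,.0f} blocks")
print(f"blocks in small files : {small_blocks:,}")
print(f"average small file    : {avg_small:.0f} to {avg_small_excluding_big:.0f} blocks")
```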

There are things that can minimize the fragmentation. Increasing the extent size as suggested by others will help, but you probably don't want it to be the same for the backup files and the small files.

One thing that leads to fragmentation of free space is having multiple things writing to the disk and extending files: when it is time to extend a file, the blocks adjacent to the end of the file have already been used by another file. To eliminate that, you can dedicate a device to each writer. To reduce it without dedicated devices, the best option is to have whatever creates the file ask for big pieces.

At a minimum, I would segregate the D2D2T files from the rest, and set the extend size on the volume to something like the largest multiple of your cluster size < 65536. For example, if the cluster size was 16, use 65520.
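
The "largest multiple of the cluster size below 65536" rule is easy to tabulate for a few candidate cluster sizes. A minimal sketch, with a hypothetical helper name:

```python
def largest_extend(cluster_blocks, limit=65536):
    """Largest multiple of cluster_blocks that is strictly below limit."""
    return ((limit - 1) // cluster_blocks) * cluster_blocks

for cluster in (16, 100, 256, 512):
    print(f"cluster={cluster:>3}: extend={largest_extend(cluster)}")
```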

I would recommend using LDDRIVER to split the DSA disk into "partitions" that are used for different types of files. Alternatively if you have something like a EVA, just create another device of the appropriate size for the small files.

This is what I would recommend, assuming you can't create SAN devices of any size you want.

Stop activity to DSA11 (dismount and mount privately)

Backup the 220 D2D2T (big) files. After they are backed up, delete them.

Make an image backup of the disk (remaining smaller longer lived files) to a save set.

Dismount DSA11. Initialize at least one member. Set your cluster to some appropriate value. For example:

$ init $1$DGA11: D2D2T /cluster=16 /extend=65520 /index=begin /head=500 /limit /system /own=[1,1] ! You could certainly consider a larger cluster like 256 or even 512, since you won't have many files on the disk.

$ mount /cluster DSA200: /shadow=($1$DGA11:) ! if this is only for disk backups, consider /nocache

Using LDDRIVER (I recommend V9, but at least 8.3, especially if you have any volume sets), create a container file for the small files. Size as needed (you report the current size used as 12041666); I will use 25000000 in this example.

$ create /directory dsa200:[000dsk]
$ ld create dsa200:[000dsk]old_dsa11.dsk /contig/size=25000000/nobackup ! name so you will know what it is. Backups will be done of the ld device, but see more about this later.
$ mc sysman set env/cluster
SYSMAN> do ld connect dsa200:[000dsk]old_dsa11.dsk lda11 /share
SYSMAN> exit
$ init lda11: label_for_small_files /cluster=16/extend=256/header=20000/index=begin/limit
$ mount/cluster lda11: label...

Change your startup/shutdown procedure to mount/dismount lda11

Change the location that the backups are being written to, and set the label of the lda11 device to what the old DSA11 label was.

No need to do incremental backup of the D2D2T, since you just want to back everything on it up.

For the LDA11 device, you will need to back it up. You can back that up just like you would any other disk device. However, if you can dismount the lda11 disk while you back it up, you can use /ignore=nobackup on your backup of the DSA200 device and it will do the equivalent of a backup/physical of the LDA11 device, since it is making a copy of the disk file that is acting like a disk. It has the same advantages/disadvantages as a physical backup: it is fast, but to restore you must restore the whole thing. And if you can't dismount the LDA11 device, what you will get is much less likely to work than a backup/ignore=interlock. It is equivalent to a physical backup of a disk mounted for shared write access.

After you have done this reorg, if there is a time after you have made your tape backup of the save sets on the DSA200 disk, and before you start to recreate new save sets, you can delete all the 220 big files, and you will essentially start with a clean slate from the defragmentation point of view. You shouldn't need to do this too frequently as long as you have a large extent size on the volume. Perhaps once a month if you want to keep the free space from getting too fragmented.

Jon

it depends

Jon Pinkley · ‎01-20-2008

Oh, I see I never said anything about restoring all the files that were saved in the image backup. Once you have created the small disk (either a new EVA vdisk or the LDA device) and initialized it, you will want to do an image restore, specifying /noinit:

$ mou/for lda11:
$ mou/for tape:/nowrite ! if the saveset was written to tape
$ backup/image/noinit /save lda11:/truncate ! the truncate is important
$ dism lda11:
$ mou/ov=id lda11:
$ set volume/limit lda11: ! unless using backup from VMS 8.3
$ analyze disk/repair/record lda11:
$ dism lda11:
$ mou/cluster/noassis lda11:

The fragmentation of the filesystem residing on the LDA11 device will not affect the device its container file resides on.

Good Luck,

Jon

P.S. Evidently this disk was the reason for your previous thread about defragmenters.

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1175284

From what you have shown us, your disk would have been a good test.

Can you share what you found about the different products you tried?

I really don't think a defragmenter will be needed if you segregate the disk-to-disk-to-tape files from your other files, use a large file extent on the volume, and can periodically delete all the backup savesets after they have been saved to tape.

it depends

John A. Beard · ‎01-21-2008

My thanks to all who have replied. Due to other commitments, I will not be in a position to look at all the responses for a few days yet. I will follow up on all your suggestions and assign points later... thanks again.

Glacann fear críonna comhairle.

John A. Beard · ‎01-21-2008

I had just attempted to submit a lengthy reply, when I lost connection to the forum. I'll get the energy to retype it later, but in the mean time this piece of information from the Application team just came my way. It might help you to understand what type of data we are working with here.

ADSM = Tivoli network storage environment

the DSA11 (Oracle$disk) contains the following

1) Oracle Base code - 380 directories, 11544 files, 5446551 blocks. This code is used to create and operate the Oracle databases. Executables in the code tree are accessed by the Database engine and user apps. It also contains logs and alert files for all the databases on the system.

2) Oracle Archived Log files. These are the database transactional log needed
to recover the database. We store them on this disk as they are created.
Hourly they are copied to ADSM and deleted.
Normally there are a few files actually, unless we have high activity in the
database during nightly batch processing and the ADSM is unreachable.
In which case we can have a few giga-bytes of data.

3) Oracle Hot backups. These are the daily full copies of the databases from the oradata1 and oradata2 volumes. They are copied here in a "backup mode" so you can copy them off to ADSM.
This is the bulk of the data on the disk (100GB).

4) Oracle Exports - these are created nightly and are used to recover individual tables for applications, or to port data to the test systems.

5) user directories for some Hilltown associates. These are all very small in
size and should not contain any application operational data.

In relation to making image backups of this disk or moving "static" files to another home, we only have a very limited window in which we can perform such activities, i.e. twice a year for approximately 4-6 hours max.

I had also copied (contig) one of the badly fragmented files (2.5 GB) to another disk, deleted the original and then tried to copy/contig it back again. It would not allow a contiguous operation, so I had to revert to a regular copy.

As for Extent sizing, I was wondering if setting the value on the volume itself as opposed to RMS might be a better short term answer. I have to take into account that there are 14 other volumes that go to make up this node.

Glacann fear críonna comhairle.

Jon Pinkley · ‎01-22-2008

RE:"I had just attempted to submit a lengthy reply, when I lost connection to the forum."

I had a similar experience when posting my first response to this thread. I intended to write a short note, but it evolved into a lengthy reply, and unfortunately I was using the web entry form from a windows PC. But my fingers are trained from years of using the VMS line editing commands, and I wanted to insert something, type ^A followed quickly by another character... no recovery possible. Control-A is windows "select everything" short cut key and typing something after that replaces the highlighted text with the new text, which definitely wasn't my intention.

I highly recommend using something outside of the web form for entry, even if it is MS Word. However, that has its own problems, specifically its use of characters that the forum software doesn't display correctly. To solve that I use metapad; see thread http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1155331 and my replies dated Aug 24, 2007 03:56:23 GMT and Aug 24, 2007 07:27:37 GMT.

Back to your problem:

Straight to your last statement: "As for Extent sizing, I was wondering if setting the value on the volume itself as opposed to RMS might be a better short term answer."

I just did some testing with process RMS default and volume extent ($ set volume/extend or $ init /ext) settings (I didn't change the system default). In this case it appears to me that the MAX(process extend size,volume extend size) is what is used. The only way to override that value with something smaller is to explicitly specify the extend size to RMS (I tested this in FORTRAN using the EXTENDSIZE qualifier in open). The point is I don't think you want to make the volume extend size something really large if you have small files being created on the volume, especially short-lived files.
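
What I observed in that test can be summarised as a tiny model. This is only a sketch of the behaviour I saw, not a statement of how RMS and the XQP are documented to work; the system-wide default is not modelled, and the function name is made up.

```python
def effective_extend(process_default, volume_extend, explicit=None):
    """Model of the observed behaviour (an assumption, not documented fact).

    An explicit extend size given by the application wins, even if it is
    smaller; otherwise the larger of the process and volume values applies.
    """
    if explicit is not None:
        return explicit
    return max(process_default, volume_extend)

print(effective_extend(process_default=0, volume_extend=2000))
print(effective_extend(process_default=65535, volume_extend=2000))
print(effective_extend(process_default=65535, volume_extend=2000, explicit=10))
```

The practical consequence in this model is that a big volume extend cannot be undone by a smaller process default, which is why a huge volume-wide value is risky on a disk with many small files.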

If specific usernames create the large files, you could have the login.com for those usernames set their RMS default /extend_quantity to 65535. Then if those applications aren't specifically asking for a smaller extent, then when the files are extended, they will be extended in large increments. That does not imply that the extensions will be contiguous.

In your original post, you included the following. Some interesting info can be extracted from this. If we take the allocated space divided by the number of extents, we get the average extent size for the file.

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1C]CRD_INDEX_PROD.DBS_BACKUP_12;1 7168000/7168000 26/1681 Ave ext 7168000/1681 = 4264

DSA11:[ORACLE_DATABASE.EXPORTS]PROD1_SYSTEM.EXP;119 19227768/19227800 1125/76557 Ave ext 19227800/76557 = 251

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1]TDSA_EXPORT.DMP;225 4134136/4134200 509/34540 Ave ext 4134200/34540 = 120

DSA11:[ORACLE_DATABASE.EXPORTS]AM0589P_SYSTEM.EXP;225 4132544/4132600 504/34198 Ave ext 4132600/34198 = 121

DSA11:[ORACLE_DATABASE.BACKUPS.PROD1]TDSA_ITEM1_DATA1.DBS_BACKUP_35;1 8282112/8282200 63/4233 Ave ext 8282200/4233= 1957
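
The same averages can be reproduced with a few lines of script. The figures are the allocated blocks and fragment counts from the DFU output quoted above; the script itself is just an illustrative sketch.

```python
# (file name, allocated blocks, fragments) taken from the DFU figures above
FILES = [
    ("CRD_INDEX_PROD.DBS_BACKUP_12;1", 7168000, 1681),
    ("PROD1_SYSTEM.EXP;119", 19227800, 76557),
    ("TDSA_EXPORT.DMP;225", 4134200, 34540),
    ("AM0589P_SYSTEM.EXP;225", 4132600, 34198),
    ("TDSA_ITEM1_DATA1.DBS_BACKUP_35;1", 8282200, 4233),
]

def average_extent(allocated_blocks, fragments):
    """Average extent size in blocks, rounded to the nearest block."""
    return round(allocated_blocks / fragments)

averages = {name: average_extent(alloc, frags) for name, alloc, frags in FILES}
for name, avg in averages.items():
    print(f"{name:<34} {avg:>5} blocks/extent")
```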

Notice that the DBS_BACKUP files have larger average extents than the other files do. This implies to me several possibilities.

The application creating the files is asking for larger extensions than the default. Another possibility is that there isn't anything else extending files on the volume when these files were created, so the blocks adjacent to the previous extension are available the next time a request is made, and the extent is combined with the previous one.

My guess is that what creates the .DBS_BACKUP files is explicitly asking for larger extents. If that is the case, then your average extent size on the disk is probably somewhere between 2000 and 4000 blocks (20 to 40 clusters of 100 blocks each).

And I would guess that what creates the EXP and DMP files is using the system/volume defaults, which are probably less than the cluster size of the disk, so they are getting small chunks; in two of the five cases listed the average extent is 120 blocks, meaning that at least 4/5 of your extents have only a single cluster (if none of the extents were more than 2 clusters, then 4/5 of the extents would be 1 cluster, and 1/5 would be 2 clusters. If some extents have more than 2 clusters, then more than 4/5 have to have only 1 cluster to average to 1.2 clusters/extent.) In other words, the TDSA_EXPORT.DMP file is nearly perfectly fragmented.
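
The 4/5 bound follows from a one-line inequality: if a fraction p of the extents has exactly one cluster and the rest have at least two, the average is at least p + 2(1 - p) = 2 - p clusters, so p must be at least 2 minus the average. A small sketch (the function name is made up for illustration):

```python
def min_single_cluster_fraction(avg_extent_blocks, cluster_blocks):
    """Lower bound on the fraction of extents that are exactly one cluster.

    Every extent holds at least one cluster; if a fraction p holds exactly
    one and the rest hold two or more, the average is >= 2 - p clusters.
    """
    avg_clusters = avg_extent_blocks / cluster_blocks
    return max(0.0, 2.0 - avg_clusters)

print(min_single_cluster_fraction(120, 100))   # the EXP/DMP files
print(min_single_cluster_fraction(251, 100))   # average above 2 clusters: no bound
```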

Can you do the following and share the output?

$ define DFU$NOSMG T
$ dfu report dsa11:

The section about free space has what the largest free piece is, and the average size of the free pieces.

You can install DFO and use the reporting feature even if there is no PAK. You won't be able to defrag, but it is still worth installing (in my opinion). It can give you more information about free space fragmentation.

If I were in your position I would set the volume extent to something bigger than the default. The question is how big. The problem is that it is a volume-wide setting, and any file that gets extended will grab large chunks for the file. So if someone starts an editor on a small file, the file and the journal file will be extended in large chunks. Normally these files will be truncated when they are closed. Be aware that doing that on volumes with high water marking can cause delays when files are extended, on non-sequential files. The point is that you will probably want to use something much smaller than 65535 on a disk that has many active small files. Also be aware that a large extent doesn't ensure contiguous allocation even if contiguous space is available. So if a disk is already fragmented, it won't help as much; it will just grab a bunch of extents in one go. If it turns out that only one cluster was needed and it has allocated 100 extents to get the 65000 blocks specified by the extent value, you have just created a multi-header file for no reason.

The worst case is an application that repeatedly opens a file for append, appends a record and then closes it. Since your cluster size is 100, that won't hurt much. But consider what happens when the disk cluster is 1, the extend size is 65000, and each record is 1 block in size. Every time the file is opened for append and one record is written, the file gets extended by 65000 blocks, 1 block is used, and then the file is truncated when it is closed (that is the default in many languages). In that case the default volume extend size of 5 blocks seems pretty reasonable. If you have an application that does something similar, it is best to explicitly ask for a 1 block extension if you know you are just going to write a single record and then close. Better is to ask for a larger extent, and not request truncate on close, as this is the only way to avoid getting a badly fragmented file if any other things use that disk.

You could start with something like a volume extend size of 400 or 800, perhaps even 1600, depending on the frequency of small file creations. If you have small temporary files being created and then deleted, I would be more inclined toward the smaller end. I still think you have much more control if you know what is creating the large files, and can set the process defaults for those processes to something large.

On the other hand, if your disk has all its free space in a single piece, the disk will tend to fragment less quickly if the extent size is large. And if the average size of the free space extents is large, it won't take a lot of extents to satisfy even a request for a large extent size.

RE: "As regards what happens when a new file is being created on this disk, ie a 4GIG database copy, how exactly does the physical placement of this file get determined. Not using something like COPY /CONTIG, does RMS simply place this new file all over the shop, even if it means creating 1000s of fragments."

Strictly speaking, it isn't RMS that determines the placement, it is the XQP, of which RMS is a consumer. As far as I know, there is no "look aside list" categorizing the extents into size ranges. I think there is just the extent cache, and then extents are pulled from free space based on what is not marked as in use in BITMAP.SYS, continuing from its last position, with no regard for the size, unless the request was for contiguous or contiguous best try.

Which brings us to your failure to be able to create the 2.5 GB file contiguously. You don't have a free extent that is 2.5 GB in size. And unfortunately there is no copy/cbt. My guess is that the file that was copied back was still less fragmented than the original, as the size is known at the time the file is created, the whole amount was probably grabbed at once. Do you know how fragmented it was before (do you have output from DFU

Your original question was how to reduce the amount of time to do incremental backups on the DSA11 device. Unless you are doing something to avoid the hot backup file from being backed up, they are going to go to your incremental backups. If that's what you expect and want those files included in your incremental backups before they go to the ADSM system, then the following won't help. If you don't want to include those 100 GB in your incremental backups, you can reduce your incremental backup time by avoiding them being backed up. One way is to have a job that runs immediately before your incremental backup, and have it set the backup date to 1 second after the modification timestamp on the big files. Then your

$ backup DSA11:[000000...]*.*;*/modified/since=backup/fast tape:saveset ...

will avoid backing up those files. However, if nothing is backing up the ADSM, then you may need to be backing those files up.

If your evaluation version of PerfectDisk hasn't expired, I would use it on the volume to try to consolidate the free space. But I would try to do it immediately after copying the Oracle Hot backups to ADSM and deleting them. It would help if you could prevent new hot backup files from being created while you are defragging. If you were lucky enough to get enough free space, you could create a really large container file for the ORACLE backup files, and use LDDRIVER to present that as a different disk. But I think creating a small LD container for the small files is a "better" way. Those are probably "hotter" files, so depending on your SAN storage controller, there may be a performance advantage to these being "close" to each other. If you have an EVA, it really doesn't make much difference, and I would just use another vdisk instead of using LDDRIVER to partition your storage.

Good luck,

Jon

it depends

John A. Beard · ‎01-31-2008

My apologies to all for the lack of an update. I was away for the last week, and haven't had the chance to respond to your many fine suggestions. With the server and disk in question being accessed 7x24, I will not have an opportunity to perform an image backup, re-initialization and restore of the volume until March 15th.

I just wanted to say thank you for the many responses to my original question. They have proved to be very beneficial in giving me a better understanding of what should be in place.

Glacann fear críonna comhairle.

Jan van den Ende · ‎01-31-2008

John,

from your Forum Profile:

I have assigned points to 26 of 52 responses to my questions.

Maybe you can find some time to do some assigning?

http://forums1.itrc.hp.com/service/forums/helptips.do?#33

Mind, I do NOT say you necessarily need to give lots of points. It is fully up to _YOU_ to decide how many. If you consider an answer is not deserving any points, you can also assign 0 ( = zero ) points, and then that answer will no longer be counted as unassigned.
Consider, that every poster took at least the trouble of posting for you!

To easily find your streams with unassigned points, click your own name somewhere.
This will bring up your profile.
Near the bottom of that page, under the caption "My Question(s)", you will find "questions or topics with unassigned points". Clicking that will give all, and only, your questions that still have unassigned postings.
If you have closed some of those streams, you must "Reopen" them to "Submit points". (After which you can "Close" again)

Do not forget to explicitly activate "Submit points", or your effort gets lost again!!

Thanks on behalf of your Forum colleagues.

PS. - nothing personal in this. I try to post it to everyone with this kind of assignment ratio in this forum. If you have received a posting like this before - please do not take offence - none is intended!

PPS. - Zero points for this.

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Robert Gezelter · ‎01-31-2008

John,

Also consider that a full image backup/restore of the device, and the resulting interruption of operations, may not be necessary.

I do not have enough information, but working with the worst offenders (and the procedures that create them) may be sufficient to resolve the issue without a service interruption.

- Bob Gezelter, http://www.rlgsc.com

John A. Beard · ‎01-31-2008

Hi Robert,

Maybe you could clear up one thing in relation to not having to backup/init/restore the disk. I may be off the beaten track altogether with this, but I know from a few tests I did that I could not copy /contig a 2 gig file back onto this disk, and had to revert to allowing it to be created wherever the system deemed fit. How can I make the free space contiguous if it is already fragmented (or does it matter in the slightest)? The 100-odd gig of files that are created on this disk are deleted after the network backup, and shortly thereafter a replacement set is created. Are not all of these new files going to have the same contiguous space allocation problems if things remain unchecked?

Glacann fear críonna comhairle.

Robert Gezelter · ‎01-31-2008

John,

Working without visibility is a challenge. However, moving these files will often free up space.

In the case of backup files, they are effectively never in use, so the fragmentation is not necessarily a performance issue.

What I am recommending is exhausting the possibilities BEFORE going through the step of image backup/restore, as that will not inherently solve more than the instant problem.

- Bob Gezelter, http://www.rlgsc.com

John Gillings · ‎01-31-2008

John,

If you do resort to rebuilding the disk...

(and I'd STRONGLY recommend that, instead of BACKUP/INIT/RESTORE via tape to the same physical drive, you get a new drive,

$ BACKUP/IMAGE

then swap the physical drives. That way you always have an immediate recovery path, you're never at risk of losing your data to a failed tape or tape drive, and it's MUCH faster)

anyway, back to my original point. If this disk contains very large files, consider setting the cluster size to a very large value when you reinit (and remember to BACKUP/IMAGE/NOINIT to preserve the new cluster size). The larger the cluster size, the less fragmentation is possible, and the less it matters. Yes, you can "waste" up to a whole cluster at the end of every file, but if you have a small number of large files, that's a negligible overhead.

Remember that a file header can hold between 55 and 75 extents, so choose a cluster size that is at least 1/50th of the size of your average file (so, for example, if the files are around 1GB, perhaps choose a cluster size of 65536 blocks). That way it's impossible for those files to overflow their headers.
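The rule of thumb above is easy to check numerically. Here is a minimal Python sketch, assuming 512-byte blocks and taking 50 extents per header as the conservative planning figure. The function names are made up for illustration, and the code only does the arithmetic; whether INITIALIZE accepts a given /CLUSTER_SIZE is a separate question.

```python
import math

BLOCK_BYTES = 512  # assumed disk block size in bytes


def blocks_for(size_bytes: int) -> int:
    """Number of blocks needed to hold size_bytes."""
    return math.ceil(size_bytes / BLOCK_BYTES)


def min_cluster_for_one_header(file_blocks: int, extents_per_header: int = 50) -> int:
    """Smallest cluster size (in blocks) for which a file made entirely of
    single-cluster extents still needs no more than extents_per_header extents."""
    return math.ceil(file_blocks / extents_per_header)


def worst_case_extents(file_blocks: int, cluster: int) -> int:
    """Upper bound on fragments: every cluster is its own extent."""
    return math.ceil(file_blocks / cluster)


def slack_blocks(file_blocks: int, cluster: int) -> int:
    """Blocks allocated but unused at the end of the file's last cluster."""
    return (-file_blocks) % cluster


one_gb = blocks_for(2**30)
print(one_gb)                               # blocks in a 1 GB file
print(min_cluster_for_one_header(one_gb))   # the 1/50th rule
print(worst_case_extents(one_gb, 100))      # worst case at the current /CLUSTER of 100
print(slack_blocks(250, 100))               # wasted tail of a 250-block file
```

With a cluster size of 100 blocks, a 1 GB file can in the worst case be split into tens of thousands of extents, which is the kind of figure the DFU report shows. The slack calculation shows why a large cluster size costs little when the disk holds a small number of large files.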

If the disk is shared with small files, consider segregating your data according to size. With appropriate use of search lists, this can often be made transparent to the application.

A crucible of informative mistakes

Jon Pinkley · ‎01-31-2008

Re: "if the files are around 1GB, perhaps choose a cluster size of 65536 blocks)."

If the help is correct, that isn't possible.

V7.3-2

INITIALIZE

/CLUSTER_SIZE

/CLUSTER_SIZE=number-of-blocks

Defines, for disk volumes, the minimum allocation unit in blocks.
The maximum size you can specify for a volume is 16382 blocks, or
1/50th the volume size, whichever is smaller.

V8.3

INITIALIZE

/CLUSTER_SIZE

/CLUSTER_SIZE=number-of-blocks

Defines, for disk volumes, the minimum allocation unit in blocks.
The maximum size you can specify for a volume is 16380 blocks, or
1/50th the volume size, whichever is smaller.

----

I am not sure which value is correct, 16380 or 16382, or whether either of them is.

The HM2$W_CLUSTER is a 16 bit field, and since 0 isn't a valid value, it could have been interpreted as 65536. However, according to "VMS File System Internals", (Kirby McCoy 1990), HM2$W_CLUSTER must have a non-zero value for the homeblock to be considered valid. So that limits the upper bound to (2^16)-1 (65535). I am not sure why the value is limited to (2^14)-4 (16380 according to V8.3 help).
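For reference, the numbers in question work out as follows. This Python sketch only restates the arithmetic from the help text and the field width. `max_cluster_per_help` is a hypothetical helper, and reading "1/50th the volume size" as an integer division is my assumption.

```python
FIELD_BITS = 16  # width of HM2$W_CLUSTER as described above

# Largest non-zero value a 16-bit field can hold.
max_nonzero_16bit = 2**FIELD_BITS - 1

# The two limits quoted in the help text, expressed relative to 2^14.
HELP_V732_LIMIT = 16382  # V7.3-2 help
HELP_V83_LIMIT = 16380   # V8.3 help
assert HELP_V732_LIMIT == 2**14 - 2
assert HELP_V83_LIMIT == 2**14 - 4


def max_cluster_per_help(volume_blocks: int, documented_max: int = HELP_V83_LIMIT) -> int:
    """Hypothetical helper: the smaller of the documented maximum and
    1/50th of the volume size (integer division assumed)."""
    return min(documented_max, volume_blocks // 50)


print(max_nonzero_16bit)                 # upper bound from the field width alone
print(max_cluster_per_help(314572800))   # the DSA11 volume from the original post
print(max_cluster_per_help(100000))      # a small volume, where 1/50th is the limit
```

For a volume the size of DSA11, the 1/50th term is far larger than either documented maximum, so the documented maximum is what would apply if the help text is right.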

it depends

Marty Kuhrt · ‎02-21-2008

A few things to consider about fragmentation.

If your file has more than 70ish fragments it creates a new file header to contain them. IIRC it was 72 for the first header, and 76 after that, but it's been a while since I've dealt with this. (I did Diskeeper support for a long time.)

Even if you consolidate the fragments for a specific header, you'll need to do a header consolidation to fix up all the fragments in the file. Once you get a specific header down to a couple of fragments you can then start consolidating headers. If you have, say, ten headers, with one fragment apiece, it will still look like a file with ten frags.

I haven't looked at DFU for a while. If it has a header consolidation function you may want to use that _after_ you do the defrag of the specific headers.
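To get a feel for how many headers a badly fragmented file ends up with, here is a rough Python estimate. It assumes the figures recalled above (72 fragments in the first header, 76 in each extension header) and that every header is filled completely; both are approximations, and the real capacity depends on the header layout. `headers_needed` is a made-up name.

```python
import math


def headers_needed(fragments: int, first: int = 72, later: int = 76) -> int:
    """Estimated number of file headers for a file with this many fragments,
    assuming `first` fragments fit in the primary header and `later` in each
    extension header (recalled figures, not verified)."""
    if fragments <= first:
        return 1
    return 1 + math.ceil((fragments - first) / later)


for frags in (10, 72, 73, 148, 149, 1681):
    print(frags, headers_needed(frags))
```

The point of the estimate is the order of magnitude: a file with thousands of fragments carries dozens of extension headers at least, and those only go away once the fragments within each header have been consolidated first.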

Can't keep it up? You need VMS!
