
copying tons of data

 
SOLVED
Jim Smith
Advisor

copying tons of data

Hi,

We're copying gigabytes of data from one system to another periodically by disconnecting our disks and reconnecting them on a second system and importing the volume groups. Then we copy all the data. What's the fastest way to copy all the data over?
6 REPLIES
Stefan Farrelly
Honored Contributor
Solution

Re: copying tons of data


You're already doing well; the fastest way to copy tons of data like this is to reattach the disks on the remote server. Use as many SCSI connections as you can. If the VGs and lvols are the same size then nothing is faster than dd if=/dev/vg00/rlvol of=<target raw lvol> bs=1024k. If you have say 10 lvols then spawn off 10 dd's all at the same time (using &), as sketched below. This will be the fastest way to copy (if you don't mind your server being tied up during the copy).
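A minimal sketch of the parallel approach; the volume group and lvol names here are assumptions, so substitute your own source and target raw devices:

# run one dd per lvol in the background, then wait for them all to finish
dd if=/dev/vg01/rlvol1 of=/dev/vg02/rlvol1 bs=1024k &
dd if=/dev/vg01/rlvol2 of=/dev/vg02/rlvol2 bs=1024k &
dd if=/dev/vg01/rlvol3 of=/dev/vg02/rlvol3 bs=1024k &
wait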

If you have different-sized lvols then you're using some copy command, in which case, if you've got JFS filesystems, mount them all with the nolog option; this will speed up the copy. Once again, spawn off as many copy commands at once as you can (say one for each lvol being copied). I've seen different cases where cpio, find | cpio, or cp is faster; you will need to test these in your case to see which is quicker.
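For example, a JFS filesystem could be mounted for the copy something like this (the lvol and mount point are assumptions; remount with your normal options afterwards):

mount -F vxfs -o nolog /dev/vg01/lvol1 /data1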
I'm from Palmerston North, New Zealand, but somehow ended up in London...
CHRIS ANORUO
Honored Contributor

Re: copying tons of data

Hi Jim,

Try using this cpio command to do the copy over the network and spare your systems the disk swapping. Run it from the directory you are copying from; servername and /target_directory are placeholders:

find . -depth | cpio -ocB | remsh servername "cd /target_directory; cpio -imdvcB"


Cheers!
When We Seek To Discover The Best In Others, We Somehow Bring Out The Best In Ourselves.
Bill Hassell
Honored Contributor

Re: copying tons of data

cpio is a good filesystem copy utility, but since the disk is directly attached, the cpio -p (pass-through) option is the fastest method to copy files that are 2 GB or less. The cpio incantation would be:

find . | cpio -pudlmv /target

(the "puddle-move" option) Leave off the v option if you don't want to see the individual files listed as they are copied. I always use the u option even if the target is empty, as I once got burned by stopping a copy to verify operation and then restarting it. All the existing files were skipped as expected, but the last file copied on the first pass was only partially copied, and cpio does not check file sizes, just the existence of the file. The u option ensures that every file on the source will be copied to the destination.
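For example, run from the top of the source filesystem (the paths here are assumptions):

cd /source_filesystem
find . | cpio -pudlmv /target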

For filesystems with large files, a dd copy will work although it will copy unused space as it has no knowledge of individual files. You could also use fbackup piped to frecover as a file-copy method.
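A rough sketch of the fbackup/frecover pipe; the directories are assumptions, and the exact options are worth checking against fbackup(1M) and frecover(1M) on your release:

fbackup -i /source_dir -f - | (cd /target_dir; frecover -Xrf -)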


Bill Hassell, sysadmin
Stefan Farrelly
Honored Contributor

Re: copying tons of data


Bill is, as usual, correct. If your total amount of used data is less than half the total allocated space then it wouldn't be quicker to dd, as dd copies all blocks whether they are used or not. Normally a raw dd is twice as fast as a normal copy command, but not if you aren't using more than half of your available space.

e.g. if you have a 9 GB disk but are only using lvols on it totalling say 4 GB of used data, it would be faster to use find | cpio to copy it; if you are using say 7 GB of it, it would be faster to use raw dd.
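For example, bdf on each mounted lvol shows used versus allocated space, which tells you which side of that line you are on (the mount point is an assumption):

bdf /data1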
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Tim Malnati
Honored Contributor

Re: copying tons of data

This sounds like a situation where time is money. There may be an opportunity to drastically improve data availability with multi-homed mirroring from a large disk array like an EMC or an XP256. My guess is that you need the original data back on the first machine asap. With a multi-homed array it never leaves: you split off a volume and import it on the other machine without any loss of availability on the original machine, and with no physical hazard associated with continually moving connections from one machine to another.

With the data mounted on both systems, there would be no particular hurry to copy the data either, since you can use the data directly from the mirror for as long as you like. You may be able to set up some sort of round-robin arrangement with multiple mirrors where copying the data may not be required at all.
Rick Garland
Honored Contributor

Re: copying tons of data

Another option would be the use of dump or vxdump.

/usr/sbin/vxdump 0f - /dev/vgXX/lvXX | remsh servername '(cd /; /usr/sbin/vxrestore -rf -)'

This will dump the data from one system to the other (servername is the remote host) without having to disconnect/swap disks between systems.

If using HFS filesystems, use the command 'dump' and 'restore' instead of 'vxdump' and 'vxrestore'.
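A sketch of the HFS variant, using the same servername placeholder as above:

/usr/sbin/dump 0f - /dev/vgXX/lvXX | remsh servername '(cd /; /usr/sbin/restore -rf -)'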

NOTE: With 'f -' the dump is written to standard output and piped to the remote system rather than to the default tape device; the lvol path given is the source being read.