
Fastest Copy Method

 
Geoff Wild
Honored Contributor

Re: Fastest Copy Method

Your main issue isn't any of the methods - but the Clariion itself - it uses parity RAID - so large data transfers suck - as it has to keep recalculating the parity....

vxdump, dd, etc are great for mirrored (or RAID 10) type disks...

I'm afraid with the CX600 - you will need a long outage....

My 2 cents...

Rgds...Geoff
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
David Poe_2
Advisor

Re: Fastest Copy Method

Geoff, we are using RAID 10 on our Clariion. The issue is that cp was outpacing every other method by a wide margin, which surprised me given my research on fast copy methods in the forum. Thanks to everyone's suggestions, I found that with some tweaking these other methods were much faster, but still shy of cp. The benefits of the other methods, from what I can see, are data integrity checks, which add to the processing time. Very soon we will be migrating to a DMX box, but having said that, the I/O on the server is my current constraint and not the SAN or disk array.
Zinky
Honored Contributor

Re: Fastest Copy Method

David,

Having just migrated approximately 32 TB of data from 2 EVA5000s to an XP12K over the weekend, I would say dd was on average 15-20% faster than vxdump/vxrestore and is the fastest. I used a block size of 4096K (4 MB). I think in your dd syntax you were merely using 4 KB instead of 4 MB - that is why it was slow. On servers where I have 2 FC-HBAs, I run 4 simultaneous dd copies, whilst on servers that have 4 FC-HBAs, I run 8 concurrent. I literally approached the limits of my FC-HBAs during my copies.

dd if=/dev/sourcevg/rsrclvol of=/dev/vx/rdsk/targetdg/tgtvol bs=4096K

Oh, btw.. we also converted to VxVM from LVM.. that's why the above syntax.
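
The concurrent-copy approach described above can be sketched as a small POSIX shell function. This is only a minimal sketch, not the script actually used in the migration: the `copy_all` name, the `DD` dry-run variable, and the numbered volume paths are hypothetical, and by default it only echoes the dd commands instead of running them.

```shell
#!/bin/sh
# DD is the command that gets run for each stream. The default only echoes
# the dd command line (dry run); set DD=dd to perform real copies.
DD=${DD:-echo dd}

# copy_all BS SRC1 DST1 [SRC2 DST2 ...]
# Start one dd per source/destination pair in the background, then wait.
copy_all() {
    bs=$1
    shift
    while [ "$#" -ge 2 ]; do
        $DD if="$1" of="$2" bs="$bs" &
        shift 2
    done
    wait    # returns once every background stream has finished
}

# Hypothetical volume names, two streams with a 4 MB block size.
copy_all 4096k \
    /dev/sourcevg/rsrclvol1 /dev/vx/rdsk/targetdg/tgtvol1 \
    /dev/sourcevg/rsrclvol2 /dev/vx/rdsk/targetdg/tgtvol2
```

The sketch does not collect per-stream exit statuses, so the output of each dd would still need to be checked for errors after a real run.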

Hakuna Matata

Favourite Toy:
AMD Athlon II X6 1090T 6-core, 16GB RAM, 12TB ZFS RAIDZ-2 Storage. Linux Centos 5.6 running KVM Hypervisor. Virtual Machines: Ubuntu, Mint, Solaris 10, Windows 7 Professional, Windows XP Pro, Windows Server 2008R2, DOS 6.22, OpenFiler
David Poe_2
Advisor

Re: Fastest Copy Method

Here are some numbers from my tweaked settings (thanks to your suggestions). dd took about 32.75 minutes (32:45) for a 75 GB LUN, and fbackup took about 5.75 minutes for 20 GB of files transferred. These numbers are much better, but still slower than cp when used on a small number of large files. I expect dd to be much faster on our fuller file systems with (unfortunately) several million files scattered throughout a few hundred directories.

Thank you all for your input! I'm going to keep this thread open for a couple more days in case anyone else wants to post other suggestions.

steg002:/# timex dd bs=2048k if=/dev/santest1/rlvtest of=/dev/santest2/rlvtest &
37500+0 records in
37500+0 records out

real 32:45.16
user 0.09
sys 13.60

steg002:/santest1# timex fbackup -0i /santest1 -f - -c /home/dpoe/fbackup_disk.conf | (cd /santest2; frecover -Xrf -)

fbackup(1004): session begins on Mon Apr 4 21:53:39 2005
fbackup(3024): writing volume 1 to the output file -
fbackup(3055): total file blocks read for backup: 39062518
fbackup(3056): total blocks written to output file -: 39062576

real 5:43.47
user 0.91
sys 24.51
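
For comparison, the two timex results above can be turned into throughput figures. This is a rough sketch: the `mib_per_sec` helper is hypothetical, and the fbackup figure assumes 512-byte blocks, which is an assumption that happens to match the 20 GB quoted in the post.

```shell
#!/bin/sh
# mib_per_sec MIB ELAPSED
# MIB is the amount copied in MiB; ELAPSED is a timex "real" value (MM:SS.ss).
mib_per_sec() {
    echo "$2" | awk -F: -v mib="$1" '{ printf "%.1f\n", mib / ($1 * 60 + $2) }'
}

# dd: 37500 records of 2048k = 75000 MiB in 32:45.16
mib_per_sec "$((37500 * 2))" 32:45.16

# fbackup: 39062518 blocks read, assumed 512 bytes each (2048 blocks per MiB)
mib_per_sec "$((39062518 / 2048))" 5:43.47
```

This works out to roughly 38 MiB/s for dd and 55 MiB/s for the fbackup pipeline. Note that dd moved the whole 75 GB LUN while fbackup moved only the 20 GB of files on it.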
David Poe_2
Advisor

Re: Fastest Copy Method

I just thought of another question that I didn't think to ask. Is there someplace I can look to verify the data integrity check for each of these commands? There doesn't seem to be any place obvious within the man pages. I would just like the warm fuzzy of seeing it formally in writing somewhere when I propose these solutions to my boss. TIA!
Zinky
Honored Contributor

Re: Fastest Copy Method

David,

If you use vxdump/vxrestore - then be assured that your copies are fine as long as there are no errors during the process. VxFS (JFS) still is the best mainstream filesystem out there and does all integrity checks.

If you use dd - then make sure neither your destination nor your source raw lvol/vol device has its filesystem mounted, if it contains one. After the dd copy, your check on the destination will be via a mount of the overlying filesystem - which, if it is VxFS, will alert you to any inconsistencies. It is very rare to have inconsistencies, especially if you unmount your source filesystem as well. Never mount the filesystem on the target raw lvol/vol while your dd copy is in progress.
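
Beyond mounting the target, one further check that the thread does not mention is to checksum both raw devices after the copy and compare the results. A minimal sketch, assuming both filesystems stay unmounted and that the hypothetical `verify_copy` helper is given the number of 1 MB blocks that were copied:

```shell
#!/bin/sh
# verify_copy SRC DST BLOCKS
# Succeeds when the first BLOCKS 1 MB blocks of SRC and DST have the same
# cksum output (CRC and byte count). Reads both devices in full, so it
# costs roughly another read pass over each side.
verify_copy() {
    src_sum=$(dd if="$1" bs=1024k count="$3" 2>/dev/null | cksum)
    dst_sum=$(dd if="$2" bs=1024k count="$3" 2>/dev/null | cksum)
    [ "$src_sum" = "$dst_sum" ]
}

# Example call with the device names used elsewhere in this thread:
#   verify_copy /dev/santest1/rlvtest /dev/santest2/rlvtest 75000 \
#       && echo "copy verified"
```

Limiting the read with `count` matters when the target volume is larger than the source, since the extra blocks would otherwise change the target's checksum.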



Geoff Wild
Honored Contributor

Re: Fastest Copy Method

Interesting - didn't know the CX600 did RAID 10 - not bad at all...

How many paths to the LUNs do you have?

I did a fairly recent migration from old Symmetrix to DMX1000s - for one of the DBs we used vxdump - it was 1.5 TB - and took almost 8 hours...that was on an HP-UX 11.0 PA-RISC server - 4 GB RAM, 4 CPUs, 2 HBAs....

Course - the system was quiet at the time - db down...etc....no other I/O whatsoever...

Rgds...Geoff
David Poe_2
Advisor

Re: Fastest Copy Method

We are using PowerPath, and have 4 zoned paths (dual HBAs) to the Clariion. Speedwise, we aren't outpacing the Clariion yet. We are just starting our migration to the Itanium servers as our older PA-RISC servers just aren't cutting it.

Everything we have is on RAID 10 as I can't get anyone to sign off on moving some of the less critical data to RAID 5. At least it isn't my money. :)
Hein van den Heuvel
Honored Contributor

Re: Fastest Copy Method


Just some observations / reinforcements.

cp and fbackup have the advantage of knowing how much real data there is. dd has no choice but to copy all blocks, used or not. In your case dd has to do almost twice as much I/O. With that it might even get to nooks and crannies that have lain dormant before. Specifically, if that had been an EVA then the output may only have been 'promised', not actively allocated. So dd may cause that one-time additional activity. Dunno about a Clariion.

Copying millions of small files is really not comparable with copying a few large files. File-based copiers will slow down as the number of files increases. dd will be oblivious to that.

dd is constrained by being synchronous: read a block, write a block, repeat until done. Little or no CPU time, but lots of waiting for I/O to be done. dd becomes very useful if (and only if?) you can launch multiple concurrent streams. This is notably great if you have multiple LUNs. Depending on the I/O subsystem characteristics, I would even consider slicing a LUN with iseek/oseek to copy your rvol in 3 - 10 parallel chunks. Of course this requires the LUN not to have been carved from a single disk, but from a large group (a la EVA).

fwiw,

Hein.
David Poe_2
Advisor

Re: Fastest Copy Method

I found another advantage to dd over cp, btw. I noticed that cp seemed to take nearly 100% of disk I/O (via glance). dd only took about 15%, meaning that there is plenty of bandwidth to move multiple streams if you are copying multiple rLVs. Which may be what all you guys were trying to tell me, but I only just got it. :)

I did notice in the man pages that you can use iseek and oseek to start your copy (both infile and outfile) at different points throughout the rLV. Does that mean that I could potentially break up a single rLV copy into say, 3 dd streams? This would be for fast copies of single, large rLV's. Something like the following.

There are 75000 blocks of 1024K in my file system.

1) dd bs=1024k if=/dev/santest1/rlvtest of=/dev/santest2/rlvtest iseek=0 oseek=0 count=24999

2) dd bs=1024k if=/dev/santest1/rlvtest of=/dev/santest2/rlvtest iseek=25000 oseek=25000 count=24999

3) dd bs=1024k if=/dev/santest1/rlvtest of=/dev/santest2/rlvtest iseek=50000 oseek=50000

count intentionally left blank for #3 as I hope this means it will copy the rest.
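
The chunk arithmetic for such a split can be sketched as below. This is a hypothetical helper (`print_chunks`), not a tested migration script: it only prints the dd commands, using the iseek/oseek operands from the post. Each stream's count has to equal the distance to the next stream's starting offset (25000 here); a count of 24999 would leave the last block of each chunk uncopied.

```shell
#!/bin/sh
# print_chunks SRC DST TOTAL_BLOCKS STREAMS BS
# Print one dd command per stream so that the chunks cover blocks
# 0 .. TOTAL_BLOCKS-1 with no gaps and no overlap.
print_chunks() {
    src=$1 dst=$2 total=$3 streams=$4 bs=$5
    size=$(( (total + streams - 1) / streams ))   # blocks per stream, rounded up
    start=0
    while [ "$start" -lt "$total" ]; do
        count=$(( total - start ))                # blocks still left to copy
        if [ "$count" -gt "$size" ]; then
            count=$size
        fi
        echo "dd bs=$bs if=$src of=$dst iseek=$start oseek=$start count=$count"
        start=$(( start + size ))
    done
}

# The 75000-block rLV from the post, split into 3 streams.
print_chunks /dev/santest1/rlvtest /dev/santest2/rlvtest 75000 3 1024k
```

Giving the last stream an explicit count, rather than leaving it blank, keeps all streams symmetrical and makes the total easy to check against the volume size.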