Re: Verify data move

 
Larry Basford
Regular Advisor

Verify data move

I'm moving 16 GB across our local network (100 Mbit/s, full duplex)
with this command:
tar -cf - HUNDX|remsh hpdxha "(cd /data2/concordx;tar -xf -)"

What is the best way to verify it is complete and not corrupted?
Disaster recovery? Right!
10 REPLIES
Dennis Handly
Acclaimed Contributor

Re: Verify data move

I'm not sure if there is a "best" way. You could use find and cksum on both the source and target and then compare:
$ find . -type f -exec cksum {} + | sort -k3,3
268918048 189 ./ABC/fffff
...

You might want to use "remsh hpdxha -n ..." above.
Peter Godron
Honored Contributor

Re: Verify data move

Larry,
If you are not confident in the error reporting, you could NFS mount and then copy.
After the copy, you could also run dircmp or diff to compare the directories.
Wouter Jagers
Honored Contributor

Re: Verify data move

- You can count the number of files using "find . -type f -print | wc -l"

- You can check the total size of the files using "du -sk ."

This will put some load on your machine for a bit. But if that's acceptable, you'd know that all files were copied and that the used disk space is the same, which might ease your mind.
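
The two checks above can be wrapped in one small function and run on both hosts. This is only a sketch: the name tree_summary is made up for illustration, and it assumes no filenames contain newlines. Because du reports allocated blocks, its figure can differ slightly between filesystems even when every file is identical, so the file count is the firmer signal.

```shell
#!/bin/sh
# Print a one-line summary for a directory tree: the number of regular
# files and the disk usage in KB as reported by du.
# tree_summary is a hypothetical helper name, not a standard command.
tree_summary() {
    # $1 = top of the tree to summarise
    files=$(find "$1" -type f -print | wc -l | tr -d ' ')
    kbytes=$(du -sk "$1" | awk '{ print $1 }')
    echo "files=$files kbytes=$kbytes"
}

# Example: run on each host and compare the two lines by eye.
#   tree_summary HUNDX
```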

Also consider checking out rsync for such things, it can be quite useful:
http://samba.anu.edu.au/rsync/

Cheers
an engineer's aim in a discussion is not to persuade, but to clarify.
Yogeeraj_1
Honored Contributor

Re: Verify data move

Hi,
If you are concerned about performance, you could also compress the tar archive before transferring it to the other location.

hope this helps too!
kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Larry Basford
Regular Advisor

Re: Verify data move

TO system
# ll -R HUNDX|wc -l
15646
# du -sk HUNDX
16404730 HUNDX

FROM system

# ll -R HUNDX|wc -l
15646
# du -sk HUNDX
16404930 HUNDX

How does this make you feel?
cksum HUNDX
174277909 67584 HUNDX
cksum HUNDX
2081901881 65536 HUNDX

I tried just copying a directory to a different location and the cksum is different. It must be a directory inode difference or empty directories.
Disaster recovery? Right!
A. Clay Stephenson
Acclaimed Contributor

Re: Verify data move

Trying to do a cksum on directories, as you did most recently, is doomed to failure (or, if it did happen to "work", it would work purely by accident) because of the differences in directory sizes and inode assignments. The only meaningful and trustworthy method, though it is resource- and time-intensive, is to do a find on the regular files, -exec a cksum of each file, and sort the output on both the target and source machines. Compare those lists using sdiff and you can have extremely high confidence that all is well. You should also be aware of the file size limits imposed by classic commands like tar and cpio, so you might consider replacing your tar|tar pipeline with an fbackup|frecover pipeline, which would also perform marginally better.
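
A minimal sketch of that find/cksum/sort comparison, assuming a POSIX shell on both machines. The function names and the /tmp paths are illustrative only. Sorting under LC_ALL=C keeps the ordering identical even if the two hosts use different locales, and diff stands in for sdiff here only so the exit status can be tested.

```shell
#!/bin/sh
# Build a checksum manifest for every regular file under a directory.
# Each line is "<crc> <bytes> <path relative to the directory>",
# sorted by path so that two manifests can be compared line by line.
make_manifest() {
    # $1 = top of the tree to scan
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort -k3 )
}

# Compare two manifests: prints the differing lines and returns
# non-zero if the manifests are not identical.
compare_manifests() {
    diff "$1" "$2"
}

# Example (manifest file names are hypothetical):
#   make_manifest HUNDX                 > /tmp/src.cksum   # on the source
#   make_manifest /data2/concordx/HUNDX > /tmp/dst.cksum   # on the target
#   compare_manifests /tmp/src.cksum /tmp/dst.cksum && echo "copies match"
```

Because the paths are recorded relative to the top of each tree, the two manifests stay comparable even though the trees live in different places on the two hosts.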
If it ain't broke, I can fix that.
James R. Ferguson
Acclaimed Contributor

Re: Verify data move

Hi Larry:

I agree with Dennis and Clay --- 'cksum' is the most reliable way to confirm success.

The value of 'fbackup/frecover' too is that not only does it handle largefiles should they exist, but it can also keep sparse files sparse.

Do not bother to evaluate the size of directories on the source and target points. Once a directory has been populated with files and space therein consumed, the deletion of a file does *not* return the now free directory space. Hence you will often find that the target directory after the copy of its files is smaller in size than the source directory from which the files came.

Regards!

...JRF...
Stefan Schulz
Honored Contributor

Re: Verify data move

Hi Larry,

You might also want to check rdist. This tool is intended to maintain identical copies on several hosts and can be used for a file transfer. It also has some reporting and comparison options. Do a man rdist for more information.

For a second look at the copied files, I would also do a cksum on all files and compare the sorted output from source and target.

Kind regards

Stefan

BTW: Happy New Year
No Mouse found. System halted. Press Mousebutton to continue.
Eric Raeburn
Trusted Contributor

Re: Verify data move

I agree cksum(1) is the only really reliable method. But to save time and network bandwidth, I'd do it like this:

1. On the source system, do:

find . -type f -exec cksum {} + | sort -k3 > /tmp/cksum.out

2. tar and gzip the data, copy to the target system, gunzip and untar.

3. Do the find(1) command in step 1 on the target system.

4. Finally, do a cksum of the two /tmp/cksum.out files on the two systems. They should be identical. If they are not, move one to the other system (with a new name) and do a diff to see which files got corrupted.

-Eric
Florian Heigl (new acc)
Honored Contributor

Re: Verify data move

Please just start using rsync.
It does the checksumming on the fly, which provides far better performance, especially when handling large files or file sets.

In your case I'd do:
rsync -arHlS HUNDX hpdxha:/data2/concordx/

rsync is available in the HP-UX Internet Express.
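
rsync can also be used purely as a checker after a copy made by other means. The sketch below assumes a reasonably recent rsync on both hosts, and rsync_verify is a made-up wrapper name. It performs a dry run with full checksums and lists anything rsync would still want to update.

```shell
#!/bin/sh
# Dry-run comparison of a source tree against an existing copy.
#   -a  archive mode (recurse, keep links, permissions, times, ...)
#   -n  dry run: nothing is transferred or modified
#   -c  decide by full-file checksum instead of size and mtime
#   -i  print one itemized line per entry that differs
# rsync_verify is a hypothetical wrapper name.
rsync_verify() {
    # $1 = source, $2 = destination (local path or host:path)
    rsync -anci "$1" "$2"
}

# Example with the host and paths from this thread:
#   rsync_verify HUNDX hpdxha:/data2/concordx/
```

Empty output means rsync found nothing to update. Note that with -a, differences in timestamps, ownership, or permissions are itemized too, so a tar copy that did not preserve those may produce output even when file contents match.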
yesterday I stood at the edge. Today I'm one step ahead.