Time to do a basic cp

 
James Clay_2
Occasional Advisor

Pardon my ignorance, but I experienced some strange issues last night when copying some files on my system, and was hoping someone could explain what was going on.

Filesystem A had 22GB worth of data contained in 42 files; it took ~45min to copy to a new filesystem.
Filesystem B had 7GB worth of data in over 42000 files; it took over 2 hours to copy to a new filesystem.

Both the filesystems had the same mount options etc. and did not differ from what was there originally. I did one cp after the other, and did not experience major disk contention etc.

Any ideas as to why this would happen?

Cheers


"Computer games don't affect kids, I mean if pac man affected us as kids, we'd all be running around in darkened rooms, munching pills and listening to repetitive music."
8 REPLIES
Robert-Jan Goossens
Honored Contributor

Re: Time to do a basic cp

Hi,

Large files do copy faster than a lot of small files. Large files can use the LAN interface more efficiently. The cp command copies one file after another, so if you have a lot of small files, this will take a long time.

Create a tar archive before you copy the files; this will save a lot of time.

Hope this helps,
Robert-Jan
Keith Bevan_1
Trusted Contributor

Re: Time to do a basic cp

James,

You may want to provide a little more information for the forum members, so they can evaluate more precisely!

Are filesystems A and B on the same physical server?

Are the filesystems on the same logical volume, volume group, or physical disks, or are you running NFS mounts, for example?

What was the system usage at the time of the cp (i.e. disk I/O, memory, process priority)?

The more precise and detailed information you can provide the more likely you are to receive beneficial advice.

Keith
You are either part of the solution or part of the problem
Bruno Ganino
Honored Contributor

Re: Time to do a basic cp

This is normal!
The system has to read the attributes of every one of those files and copy the files one by one.
HTH
Bruno
Torino (Turin) +2H
Kent Ostby
Honored Contributor

Re: Time to do a basic cp

Also, for each file, the system has to go through the overhead work to find a new inode to link it into and to set aside the space where it's going to write the data.

With the large files, it only has to do these I/Os 42 times and then it can get on with the business of writing out the data.
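
The per-file cost described above can be observed with a small experiment. What follows is only a rough sketch, not a benchmark: it writes the same number of bytes once as a single file and once as many small files in a temporary directory, and times both. The helper name `write_files` and the sizes are made up for illustration, and the results will depend heavily on the filesystem, the buffer cache, and the hardware.

```python
import os
import tempfile
import time


def write_files(directory, file_count, total_bytes):
    """Write total_bytes split evenly across file_count files.

    Returns (elapsed_seconds, bytes_written). Hypothetical helper,
    for illustration only.
    """
    chunk = b"x" * (total_bytes // file_count)
    start = time.perf_counter()
    written = 0
    for i in range(file_count):
        path = os.path.join(directory, f"file_{i:05d}.dat")
        # Creating each new name costs an inode allocation and a
        # directory update on top of writing the data itself.
        with open(path, "wb") as f:
            written += f.write(chunk)
    elapsed = time.perf_counter() - start
    return elapsed, written


if __name__ == "__main__":
    total = 8 * 1024 * 1024  # 8 MiB, an arbitrary size for the demo
    for count in (1, 2048):
        with tempfile.TemporaryDirectory() as tmp:
            elapsed, written = write_files(tmp, count, total)
            print(f"{count:5d} file(s), {written} bytes: {elapsed:.3f} s")
```

Both runs write the same amount of data, so any difference in elapsed time comes from the per-file work rather than from the data volume.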

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
doug mielke
Respected Contributor

Re: Time to do a basic cp

Also, with large directories, the creation of the file list alone can be time-consuming.

A dd copy would ignore most of this structure and, even with many files, would have copied the 7 GB faster.
A. Clay Stephenson
Acclaimed Contributor

Re: Time to do a basic cp

The biggest overhead in copying the larger number of small files is the fact that the directory has to be rewritten each time a file is added. The system also has to use locking mechanisms to ensure that only one process is actually writing to the directory at any moment. This overhead completely swamps the fact that the total data volume was ~3X larger in the run with a small number of individual files.
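
To put rough numbers on that, here is a back-of-the-envelope calculation using only the figures quoted in the original question. Those figures are approximate ("~45min", "over 2 hours", "over 42000 files"), 1 GB is assumed to be 1024 MB, and the names `copies` and `summarize` are made up for this sketch.

```python
# Figures quoted in the original post; all are approximate, so the
# results are rough estimates only.
MB_PER_GB = 1024  # assumption: 1 GB = 1024 MB

copies = {
    "A": {"size_mb": 22 * MB_PER_GB, "files": 42, "seconds": 45 * 60},
    "B": {"size_mb": 7 * MB_PER_GB, "files": 42000, "seconds": 2 * 60 * 60},
}


def summarize(copy):
    """Return (MB per second, seconds per file, average MB per file)."""
    throughput = copy["size_mb"] / copy["seconds"]
    per_file = copy["seconds"] / copy["files"]
    avg_size = copy["size_mb"] / copy["files"]
    return throughput, per_file, avg_size


for name, copy in copies.items():
    throughput, per_file, avg_size = summarize(copy)
    print(f"{name}: {throughput:.2f} MB/s, {per_file:.2f} s/file, "
          f"{avg_size:.2f} MB/file average")
```

On these numbers, filesystem A copied at roughly 8.3 MB/s while filesystem B managed about 1 MB/s, which works out to around 0.17 seconds for each file averaging well under 1 MB.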
If it ain't broke, I can fix that.
James Clay_2
Occasional Advisor

Re: Time to do a basic cp

Thanks for that.

For the record I was copying data within the system onto new LV's in different VG's. I think next time I'll tar the whole lot up :-)
"Computer games don't affect kids, I mean if pac man affected us as kids, we'd all be running around in darkened rooms, munching pills and listening to repetitive music."
Mark Grant
Honored Contributor

Re: Time to do a basic cp

Actually, I'm not sure tar is going to help you too much when moving all the stuff to new LV's. Even with one big tar archive, you'll still need to extract that tar archive on the other side and therefore still need to muck about with all those directories that A.Clay mentions.
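
A minimal sketch of that point, using Python's standard `tarfile` module rather than the tar command itself: it archives a directory of small files into one archive and extracts it elsewhere, then counts the files on each side. The helper names `count_files` and `tar_round_trip` and the file names are invented for the example.

```python
import os
import tarfile
import tempfile


def count_files(root):
    """Count regular files under root, recursively."""
    return sum(len(files) for _, _, files in os.walk(root))


def tar_round_trip(file_count):
    """Create file_count small files, archive them, extract elsewhere.

    Returns (files_in_source, files_after_extract). Hypothetical
    helper, for illustration only.
    """
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "src")
        dst = os.path.join(work, "dst")
        os.makedirs(src)
        os.makedirs(dst)
        for i in range(file_count):
            with open(os.path.join(src, f"f{i}.txt"), "w") as f:
                f.write("data\n")

        archive = os.path.join(work, "bundle.tar")
        # A single archive file is written, however many members it holds.
        with tarfile.open(archive, "w") as tar:
            tar.add(src, arcname="files")

        # Extraction creates every member again as a separate file,
        # so the destination directory is updated once per file.
        with tarfile.open(archive, "r") as tar:
            tar.extractall(dst)

        return count_files(src), count_files(dst)


if __name__ == "__main__":
    print(tar_round_trip(100))
```

The destination ends up with as many files as the source, so the per-file creation work on the target filesystem is not avoided; the archive only changes how the data travels in between.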
Never precede any demonstration with anything more predictive than "watch this"