
cat buffering

 
venkata chunduri
New Member

cat buffering

We have a multi-threaded process that splits an original file into smaller files for processing (validations, etc.). Once the validations are complete, the smaller files are put back together to recreate the original file.
The cat command is used to combine all the small files into one file. This is a sequential process: only one file from the group is selected at a time.

We have noticed that cat takes longer as it progresses through the files.

For example, cat starts with the first file and concatenates it to the output file in about 10 minutes. When the second file is picked up, it takes about 15 minutes to concatenate, and so on. All the files are approximately the same size. The total volume is about 10 million records.

Does the cat command's buffering have anything to do with the process being slow, or does some other OS setting need to be changed? Any information would be of great assistance.
3 REPLIES
Steve Steel
Honored Contributor

Re: cat buffering

Hi


Each time cat starts, the amount of data that must be rewritten is bigger, so each run is slower than the last.


Use the format

cat file1 file2 > file3

This concatenates file1 and file2 and places the result in file3.

It will be a lot quicker.
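
As a minimal sketch of the same idea with more than two inputs: every piece can be passed to a single cat invocation, so each piece is read once and the output is written once. The temporary directory, the part_NN names, and the sample contents below are made up for illustration and are not from the original setup.

```shell
#!/bin/sh
# Hypothetical working directory and piece names, for illustration only.
workdir=$(mktemp -d)
printf 'alpha\n'   > "$workdir/part_01"
printf 'bravo\n'   > "$workdir/part_02"
printf 'charlie\n' > "$workdir/part_03"

# One cat invocation: each piece is read once, the output is written once.
# The shell expands the glob in sorted order, so zero-padded names
# (part_01, part_02, ...) keep their intended sequence.
# The output name must not match the glob, or cat would read its own output.
cat "$workdir"/part_* > "$workdir/combined"

# Show how many lines ended up in the combined file.
wc -l < "$workdir/combined"
```

If the number of pieces is very large, the expanded command line may exceed the system's argument-length limit, in which case the pieces would have to be processed in batches.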


Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
A. Clay Stephenson
Acclaimed Contributor

Re: cat buffering

It sounds as though you are writing a new output file each time and then feeding that output file + 1 additional file to cat each time to produce a newer output file.
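
A rough sketch of why that pattern would slow down as it goes, assuming the pieces are equal in size (the count of 10 is made up for illustration): if step i rewrites everything gathered so far plus one new piece, it writes i pieces, so the total work grows with the square of the number of pieces.

```shell
#!/bin/sh
# n is a hypothetical number of equal-sized pieces, for illustration only.
n=10

# Rewrite-everything pattern: step i writes i pieces
# (the previous output plus one new piece).
recopy=0
i=1
while [ "$i" -le "$n" ]; do
    recopy=$((recopy + i))
    i=$((i + 1))
done

# Append pattern: each piece is written exactly once.
append=$n

echo "rewrite-everything: $recopy piece-writes; append: $append piece-writes"
```

With 10 pieces this gives 55 piece-writes against 10, and the gap widens quickly as the number of pieces grows.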

What you should be doing instead is

cat file1 > ofile
cat file2 >> ofile
cat file3 >> ofile
...
...
cat file99 >> ofile

The append does an lseek() to EOF and then starts writing so that the only data that need be read is the current input file.
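
The same sequence can be sketched as a loop. This is only an illustration: the temporary directory, the three sample input files, and their contents are invented here, and a real run would loop over the actual pieces in the required order.

```shell
#!/bin/sh
# Hypothetical sample inputs, for illustration only.
workdir=$(mktemp -d)
for n in 1 2 3; do
    printf 'records from piece %s\n' "$n" > "$workdir/file$n"
done

ofile="$workdir/ofile"
: > "$ofile"    # create (or truncate) the output file before appending

for n in 1 2 3; do
    # >> opens ofile for appending, so only file$n is read on each pass;
    # the data already in ofile is not copied again.
    cat "$workdir/file$n" >> "$ofile"
done

# Show how many lines ended up in the output file.
wc -l < "$ofile"
```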
If it ain't broke, I can fix that.
venkata chunduri
New Member

Re: cat buffering

With the approach "cat file1 file2 > file3", we have seen about a 6% reduction in time.

Is there any other way to improve the speed of concatenating files? It does not necessarily have to be the cat command; any variation, or some other command with the same functionality, would do.

We are interested in making the concatenation faster.