Operating System - HP-UX

Re: Zip files 1.7 Million files

 
Michael Steele_2
Honored Contributor

And I have provided the procedure that I would use if I had the task to perform.
Dennis Handly
Acclaimed Contributor

>I have a filesystem which has got 1.7 million files. ... All files are residing on a single directory.

I assume people have told you this is not a good idea?

>The files can be identified by their time stamp

Encoded in their name, or in the ll(1) output?
I have a case where they are encoded in their name.

>there are no separate directories for each year.

If the names include the year, the first thing to do is to create a subdirectory and move all of a year into it.
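For the name-encoded case, here is a minimal sketch of the move. The data_YYYYMMDD.log naming is only an assumption for illustration (the OP has not shown real names), and the demo runs in a scratch directory so nothing real is touched:

```shell
#!/bin/sh
# Sketch only: assumes names like data_YYYYMMDD.log (a hypothetical pattern).
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
touch data_20050312.log data_20051130.log data_20060101.log notes.txt

for f in data_*; do
    [ -f "$f" ] || continue          # skip directories and an unmatched pattern
    year=${f#data_}                  # strip the prefix: 20050312.log
    year=${year%"${year#????}"}      # keep the first four characters: 2005
    case $year in
        200[5-8]) ;;                 # only these years get a subdirectory
        *) continue ;;
    esac
    mkdir -p "$year"
    mv "$f" "$year"/
done
```

This runs one mv per file, which is slow for 1.7 million files but never builds an oversized command line.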

If they don't include the year, you can make a simple script to do that:
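last_year=""
# ll -trog: oldest first, owner and group omitted, so F6 is the year and F7 the name
ll -trog | while read F1 F2 F3 F4 F5 F6 F7; do
    case $F6 in
        200[5-8]) ;;      # keep only these years
        *) continue;;     # skips the "total" line and recent files that show a time instead of a year
    esac
    if [ "$F6" != "$last_year" ]; then
        mkdir -p "$F6"
        last_year=$F6
    fi
    echo mv "$F7" "$last_year"   # dry run: drop the echo to move for real
done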

Once they are in a separate directory, you can use the tar-gzip approach that Steven suggested.

If you don't want to include the directory name in the tarball, you can use -C:
tar cf - -C 2005 . |
gzip > year_2005.tar.gz
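
Extending that to several years, a rough sketch might look like this. The directory and file names are made up for the demo, and the list-back step is my own addition, so you can check the tarball before removing anything:

```shell
#!/bin/sh
# Sketch only: builds a scratch tree with hypothetical year directories.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir 2005 2006
touch 2005/a.dat 2005/b.dat 2006/c.dat

for year in 2005 2006; do
    # -C changes into the directory first, so members are stored as ./a.dat
    tar cf - -C "$year" . | gzip > "year_$year.tar.gz"
    # list the archive back before deleting any originals
    gzip -dc "year_$year.tar.gz" | tar tf - > "year_$year.list"
done
```

Comparing each .list file against the directory contents is a cheap sanity check before any rm.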
OldSchool
Honored Contributor

perhaps the biggest problem here is that questions to the OP get a restatement of the original question, without additional information, and specific questions go unanswered.

there are a variety of answers posted, some of which may be more appropriate than others, depending on the exact goal, which isn't clear here.

I will add that if ssheri has any control over the creation of these files, going forward I'd encode the creation date in the filename somehow, as anything relying on the "timestamps" is not going to be a reliable method for determining when a file was created.
ssheri
Advisor

Thanks a lot..

Your suggestions match my requirement.
The files are not modified after they arrive on the filesystem. They are written there by a scheduled job, and no user modifies them afterwards.

ssheri
Advisor

Hi,

I have tested the options that OldSchool provided.
=======================================
1. Created the reference files

2. ran find . -xdev -type f -newer $HOME/first.ref -a !-newer $HOME/last.ref -exec cp -p {} /yourname/2005/. \+

======================================

I used cp instead of mv for testing. The test was only for one month of data, but I get an error when I execute it:

cp:./filename: not a directory, where filename is the last file that is supposed to be copied according to the reference files.

For example, if I use
touch -a -m -t 200510010000.00 $HOME/first.ref
touch -a -m -t 200511302359.59 $HOME/last.ref

the above error comes up for the last file dated 20051130. I tried changing the dates given to touch and I get the same error, always for the last file created on the date set in last.ref.



OldSchool
Honored Contributor

what happens with

find . -xdev -type f -newer $HOME/first.ref -a !-newer $HOME/last.ref -exec ls -l {} \+
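
To see which files the two reference files actually select, a small scratch-directory sketch may help. The file names below are invented for the demo; only the two touch timestamps come from the post above:

```shell
#!/bin/sh
# Sketch only: hypothetical file names in a scratch directory.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir src
touch -t 200509301200 src/sep.dat
touch -t 200510150900 src/oct.dat
touch -t 200511302359.59 src/nov_last.dat
touch -t 200512010000 src/dec.dat

touch -t 200510010000.00 first.ref
touch -t 200511302359.59 last.ref

# -newer is strict (modified later than first.ref); "! -newer last.ref"
# keeps files stamped the same as last.ref or earlier.
find src -type f -newer first.ref ! -newer last.ref -print | sort > window.list
```

Note the space between "!" and "-newer"; the two are separate arguments to find.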
ssheri
Advisor

It works fine. I only get the error, on the last file of a particular month as per the last.ref file, when I do a cp instead of ll.
Dennis Handly
Acclaimed Contributor

>2. find . -xdev -type f -newer $HOME/first.ref -a !-newer $HOME/last.ref -exec cp -p {} /yourname/2005/. \+

(I would forget about cp instead of mv for a million files.)

>-exec cp -p {} /yourname/2005/. \+

You can't do this. You can only put the {} last. (Unless GNU find does it?)
(Also leave off the stinkin' \ before the "+".)

So you'll need to write a script:
-exec cp_stuff.sh /yourname/2005 {} +

Inside cp_stuff.sh you have:
#!/usr/bin/sh
# swap first to last to make "find -exec +" happy.
target=$1
shift
cp -p "$@" "$target"

(The quotes are to make Steven happy with his stinkin' spaces. :-)
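
If you would rather not keep a separate script file, the same swap can probably be done inline with sh -c. This is only a sketch of that variant, not something tested on HP-UX; the scratch paths and file names are invented for the demo:

```shell
#!/bin/sh
# Sketch only: scratch directories stand in for the real source and /yourname/2005.
work=$(mktemp -d) || exit 1
mkdir "$work/src" "$work/2005"
touch "$work/src/plain.dat" "$work/src/with space.dat"

# Inside sh -c: $0 is "sh", $1 is the target directory, and the remaining
# arguments are the batch of names that find substitutes for {}.
find "$work/src" -type f -exec sh -c '
    target=$1
    shift
    cp -p "$@" "$target"
' sh "$work/2005" {} +
```

The {} stays immediately before the +, and the target travels as the first argument, exactly as in cp_stuff.sh.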

>OldSchool: what happens with

Why fiddle with the find syntax when it is the -exec that is the problem?
Also this is what tusc is for.
OldSchool
Honored Contributor

"Why fiddle with the find syntax when it is the -exec that is the problem?
Also this is what tusc is for."

because it lost enough in translation that I couldn't tell that (from what the OP posted)...

Plus, based on the question, I doubt that the OP has the ability to interpret the output of tusc....