- Zip files 1.7 Million files
11-02-2009 08:23 AM
I have a filesystem which has 1.7 million files, containing files dating back to Jan 2005. I need to gzip all files up to Dec 2008, on a yearly basis, i.e. one zip file for 2005, one zip file for 2006, and so on for the rest of the years.
Can anyone help me with the commands for this task?
Your help is much appreciated.
Solved!
11-02-2009 09:51 AM
First make a tar file, then compress it with the compress or gzip utility:
tar -cvf 2005.tar /directory_name
gzip 2005.tar
Suraj
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
тАО11-02-2009 09:55 AM
тАО11-02-2009 09:55 AM
Re: Zip files 1.7 Million files
find . -atime 360 -exec ll {} \;
Verify your selection by listing everything captured by find. When ready:
find . -atime 360 -exec gzip {} \;
This is for one year. 720 for two years, etc.
I'd also suggest using 'tar' after you gzip, or else you'll run out of space fast. Real, real fast. In fact, having another dir to work with would be good.
find . -name '*.gz' | xargs tar -cvf backup.tar
11-02-2009 10:57 AM
"tar -cvf 2005.tar /directory_name
gzip 2005.tar"
that assumes the OP had the files already segregated into directories by year, which may or may not be the case.
"find . -atime 360 -exec gzip {} \;"
is probably closer to what the OP wants, but will result in one gzip file for each original file found... which may be what they're after.
or you could take the above "find" and "mv" the file to a separate directory, then gzip each, then tar the results....or mv the file, tar the directory and gzip *that*.
ssheri needs to remember that there is no "create date" stored in Unix filesystems. M. Steele is going after the "access time", which may be a good bet. See the "man" page for "find", in particular the "-atime", "-mtime" and "-ctime" options, to see which best fits.
Another option would be to create two reference files with appropriate dates, and use the "-newer" option (and its negation, "! -newer") to sort out what you want.
All of the above is why I originally asked if the date was somehow "buried" in the filename.
Some additional information about the original data layout, and the desired results, might help in providing more appropriate responses.
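The reference-file idea floated above can be sketched as follows. This is a minimal demonstration, not the thread's actual data: all filenames, paths, and timestamps here are illustrative.

```shell
# Sketch: select files last modified during 2005 by bracketing them
# between two reference files. All names and dates are illustrative.
demo=$(mktemp -d)
cd "$demo"
touch -t 200506150000 file_2005.dat   # pretend this arrived in 2005
touch -t 200706150000 file_2007.dat   # and this one in 2007
touch -t 200501010000 first.ref       # lower bound: 01/01/2005 00:00
touch -t 200512312359 last.ref        # upper bound: 12/31/2005 23:59
# newer than first.ref AND NOT newer than last.ref => within 2005
find . -type f -name '*.dat' -newer first.ref ! -newer last.ref
```

Only file_2005.dat should be listed; the 2007 file fails the "! -newer last.ref" test.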
11-02-2009 11:43 AM
> reference files [...]

This seems like a better scheme than any of the "-atime"-style options if you're not running the job at 00:00 on 1 January. "-atime" would seem to be the least likely to get the desired result (unless no one ever looks at these files).

> or you could take the above "find" and "mv" the file [...]

I'd vote for moving them to year-specific directories that way, and then doing something like:
tar cf - year_2005_dir | \
gzip -c > year_2005_dir.tar.gz
Creating an actual "tar" archive file, and _then_ hitting it with gzip tends to require more disk space, at least temporarily.

> find . -atime 360 -exec gzip {} \;
> This is for one year. 720 for two years, etc.

Around here, years are longer than 360 days. Which calendar do you use? (And which does "find" use?)
11-03-2009 05:25 AM
# find . -exec ll {} + | awk '$8 == "2007"' | tee list_2007
This lists the files exactly from year 2007 (1st Jan -> 31st Dec), and also dives into subdirs. After that you can feed the list to gzip/tar or whatever you want...
Unix operates with beer.
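The one-liner above relies on the year landing in column 8 of a long listing, which `ls -l` (the HP-UX `ll` alias) only does for files more than about six months old. A sketch of the idea, with hypothetical filenames, and with the caveat that the field position depends on the `ls` implementation and locale:

```shell
# Sketch: filter a long listing by its year column. For files older
# than ~6 months, 'ls -l' prints "Mon DD YYYY", so the year is $8.
# Filenames here are illustrative.
demo=$(mktemp -d)
cd "$demo"
touch -t 200703010000 from_2007.txt
touch -t 200803010000 from_2008.txt
ls -l | awk '$8 == "2007" {print $NF}'
```

Only from_2007.txt should be printed; the "total" header line has no eighth field and falls through.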
11-03-2009 08:16 AM
From what was originally stated, it could well be that the OP wants a single gzip file for a given year that contains all the files for that year (as opposed to gzipping a tar of those files).
If so, I don't think that option has been covered yet, and it might be a pain to implement.
11-03-2009 08:18 AM
Thanks for your quick responses. I hope I can explain my requirement in more detail.
=======================================
I have a filesystem which contains 1.7 million files. Files have accumulated there from 2005 until today. My requirement is to tar and zip the files for each year separately, i.e. one tar/zip file each for 2005, 2006, 2007 and 2008. The files can be identified by their timestamps; there are no separate directories for each year. All files reside in a single directory.
======================================
11-03-2009 10:37 AM
Solution

OK, this could get ugly. Making the assumption that the files will be removed after archiving, something like the following can be modified to work:
First, you need to realize that UNIX doesn't track a file timestamp for "creation time". It knows the following:
atime (File Access Time)
Access time shows the last time the data from a file was accessed - read by one of the Unix processes directly or through commands and scripts.
ctime (File Change Time)
ctime changes when you change a file's ownership or access permissions. It also changes when the file's contents are updated.
mtime (File Modify Time)
Last modification time shows time of the last change to file's contents. It does not change with owner or permission changes, and is therefore used for tracking the actual changes to data of the file itself.
So which one you look at depends on what you want. IF you can guarantee that the contents of the files, once written, were never modified, then the mtime option of find should be OK. Access time is useless for this if the file has ever been read after writing. ctime *might* work.
If none of the above apply, then you're toast, as you've no way to locate files written in 2005.
Let us say that mtime is workable in your case, and you are going to find those files in year 2005. I'd create two reference files representing the upper and lower limits of the times you wish to locate:
touch -a -m -t 200501010000.00 $HOME/first.ref
touch -a -m -t 200512312359.59 $HOME/last.ref
should get you everything between 01/01/2005 at 00:00:00 and 12/31/2005 at 23:59:59.
Then use find to locate the relevant files and move them to a directory by themselves:
mkdir /yourname/2005
cd /where_files_are
find . -xdev -type f -newer $HOME/first.ref -a ! -newer $HOME/last.ref -exec mv {} /yourname/2005/. \+
at that point, you should be able to tar the newly created directory and pipe that to zip as noted in one of the posts above.
Note that the above has not been tested; you might want to substitute something harmless, like ls, for the mv until you get it sorted out.
repeat the above, after adjusting timestamps on the ref files, and creating the required directories.
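The final step described above, archiving the per-year directory and compressing it in one pass, might look like this. Directory and file names are illustrative:

```shell
# Sketch: once the 2005 files sit in their own directory, stream tar
# through gzip so the uncompressed archive never touches the disk.
# All names here are illustrative.
demo=$(mktemp -d)
mkdir "$demo/2005"
touch "$demo/2005/jan.dat" "$demo/2005/feb.dat"
cd "$demo"
tar cf - 2005 | gzip > 2005.tar.gz
# list the archive contents to verify
gzip -dc 2005.tar.gz | tar tf -
```

Streaming through the pipe avoids ever holding both the uncompressed tar and the .gz on disk at once.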
11-03-2009 05:14 PM
I assume people have told you this is not a good idea?
>The files can be identified by their time stamp
Encoded in their name, or in the ll(1) output?
I have a case where they are encoded in their name.
>there are no separate directories for each year.
If the names include the year, the first thing to do is to create a subdirectory and move all of a year into it.
If they don't include the year, you can make a simple script to do that:
last_year=""
ll -trog | while read F1 F2 F3 F4 F5 F6 F7; do
    case $F6 in
    200[5-8]) ;;
    *) continue;;
    esac
    if [ "$F6" != "$last_year" ]; then
        mkdir -p $F6
        last_year=$F6
    fi
    echo mv "$F7" $last_year
done
Once they are in separate directories, you can use the tar/gzip approach Steven suggested.
If you don't want to include the directory name in the tarball, you can use -C:
tar cf - -C 2005 . |
gzip > year_2005.tar.gz
11-04-2009 06:36 AM
There are a variety of answers posted, some of which may be more appropriate than others, depending on the exact goal, which isn't clear here.
I will add that if ssheri has any control over the creation of these files, I'd encode the creation date in the filename somehow, as anything relying on the "timestamps" is not going to be a reliable method for determining when a file was created.
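If the date were encoded in the names, per-year selection reduces to a shell glob, with no dependence on fragile timestamps. A sketch with hypothetical filenames:

```shell
# Sketch: with dates embedded in filenames (e.g. report_YYYYMMDD.log),
# a year's worth of files is just a glob; timestamps are not needed.
# All names here are illustrative.
demo=$(mktemp -d)
cd "$demo"
touch report_20050117.log report_20051203.log report_20060704.log
ls report_2005*.log
```

The glob matches the two 2005 files and ignores the 2006 one, however often the files have been read or touched since.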
11-05-2009 04:12 AM
Your suggestions match my requirement.
The files are not modified after their arrival in the filesystem. They are saved there by a scheduled job, and are not modified later by any user.
11-16-2009 10:28 AM
I have checked using the options which "oldschool" provided.
=======================================
1. created reference files
2. ran find . -xdev -type f -newer $HOME/first.ref -a ! -newer $HOME/last.ref -exec cp -p {} /yourname/2005/. \+
======================================
I used cp instead of mv for testing. The test was only for one month of data, but I am getting an error when I execute it:
cp: ./filename: not a directory (where filename is the last file which was supposed to be copied as per the reference files).
For example, if I use
touch -a -m -t 200510010000.00 $HOME/first.ref
touch -a -m -t 200511302359.59 $HOME/last.ref
the above error comes up with the last file dated 20051130. I tried changing the dates for touch, and I get the same error for the last file created on the date given in last.ref.
11-16-2009 01:25 PM
find . -xdev -type f -newer $HOME/first.ref -a ! -newer $HOME/last.ref -exec ls -l {} \+
11-17-2009 02:37 AM
(I would forget about cp instead of mv for a million files.)
>-exec cp -p {} /yourname/2005/. \+
You can't do this. You can only put the {} last. (Unless GNU find does it?)
(Also leave off the stinkin' \ before the "+".)
So you'll need to write a script:
-exec cp_stuff.sh /yourname/2005 {} +
Inside cp_stuff.sh you have:
#!/usr/bin/sh
# swap first to last to make "find -exec +" happy.
target=$1
shift
cp -p "$@" "$target"
(The quotes are to make Steven happy with his stinkin' spaces. :-)
>OldSchool: what happens with
Why fiddle with the find syntax when it is the -exec that is the problem?
Also this is what tusc is for.
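The wrapper idea above can be exercised end to end like this. Directory and file names are illustrative; the wrapper body is the one given in the post:

```shell
# Sketch: find hands the wrapper the target directory first, then the
# batched filenames; the wrapper shifts the target out and calls cp.
# All paths here are illustrative.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
touch "$demo/src/a.txt" "$demo/src/b.txt"
cat > "$demo/cp_stuff.sh" <<'EOF'
#!/bin/sh
# swap first to last to make "find -exec +" happy.
target=$1
shift
cp -p "$@" "$target"
EOF
chmod +x "$demo/cp_stuff.sh"
cd "$demo/src"
find . -type f -exec "$demo/cp_stuff.sh" "$demo/dest" {} +
ls "$demo/dest"
```

Note that {} sits immediately before the + as POSIX find requires, which is exactly the constraint that broke the original -exec cp form.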
11-17-2009 07:11 AM
"Also this is what tusc is for."
Because it lost enough in translation that I couldn't tell that (from what the OP posted)...
Plus, based on the question, I doubt that the OP has the ability to interpret the output of tusc....