- How to delete a large number of small files?
11-27-2006 07:01 PM
I have a directory containing about 300k small files (around 200 GB in total). I wonder if there is a way to delete them quickly. Deleting the directory directly (rm -rf directory) took me more than 3 hours.
OS: HP-UX B.11.11.
Server: Superdome.
Storage: XP1024.
Regards,
Yongye
11-27-2006 07:18 PM
Re: How to delete a large number of small files?
#rm z*
#rm y*
#rm x*
...
#rm 1*
#rm 0*
regards,
ivan
11-27-2006 07:22 PM
Re: How to delete a large number of small files?
If they are all contained in one lvol, newfs the lvol and restore from backup the files you do need.
Is 3 hours for 200 GB that bad?
11-27-2006 07:45 PM
Re: How to delete a large number of small files?
#rm z*
I'm not sure whether working in reverse would help, but instead of doing the directory search (z*) 26*2+10 times, you may want to use ls, then sort -r, then xargs to remove a batch of files at a time:
ls | sort -r | xargs -n40 rm -rf
I'm not sure this would be any faster than rm -rf, though.
11-27-2006 08:33 PM
Re: How to delete a large number of small files?
1.) Using shell wildcards directly is bad, because it causes the filenames to be expanded by the shell. The expansion will make the command line extremely long, perhaps longer than the shell can handle. Furthermore, the expanded list will be sorted alphabetically, which may take non-trivial time and memory when the number of files is huge.
2.) Using the 'ls' command is almost as bad: it will also try to sort the filenames before displaying the output, which makes it a waste of resources. In a huge directory, the system needs to skip back and forth in the directory structure, which takes time unless the directory structure fits entirely in the caches. If this is a production server that already has a significant workload, you may have problems.
When dealing with huge directories like this, the tool of choice is 'find'. It walks the directory structure in the order files are stored on disk, *without sorting the list in any way*. Proceeding in disk order (rather than alphabetical order) allows the removal to be done in one pass through the directory structure.
Theoretically, "rm -rf directory" without any wildcards should be as fast as this kind of recursive removal operation can possibly be.
Unfortunately, the slowness may be caused by the directory structure being fragmented into multiple parts, so most of the time is spent seeking back and forth across the disk. The VxFS filesystem normally handles fragmentation quite well, but heavy use with lots of small files being created and deleted may cause fragmentation. If you want to defragment a VxFS filesystem, see "man fsadm_vxfs".
If you need to completely clean huge directories like this frequently, consider making the respective directories into separate filesystems: unmounting the filesystem, running mkfs on it, and then mounting the now-empty filesystem may well be much faster than 'rm -rf' when the number of files in one directory is large. This also eliminates any fragmentation.
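As a minimal sketch of the one-pass find approach described above (exercised here on a throwaway scratch directory rather than the real data; it assumes filenames contain no embedded whitespace, since classic HP-UX find and xargs lack -print0/-0):

```shell
# Create a scratch directory with a few small files to stand in for
# the real 300k-file directory.
dir=$(mktemp -d)
for i in 1 2 3 4 5; do : > "$dir/f$i"; done

# One unsorted pass through the directory; xargs batches many names
# into each rm invocation instead of forking rm once per file.
find "$dir" -type f | xargs rm -f

left=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "files left: $left"
rmdir "$dir"
```

The same pipeline applied to the real directory would simply substitute its path for the scratch directory.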
11-27-2006 09:43 PM
My solution:
ls -1 > list
while read -r filename
do
    rm -f "$filename"
done < list
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
11-27-2006 10:05 PM
Re: How to delete a large number of small files?
My two preferred ways are:
1. Create a new filesystem and move the files you want to keep to it (adjusting the mount points accordingly).
2. Perform the delete using parallel commands.
Hope this helps!
kind regards
yogeeraj
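A hedged sketch of option 2 (parallel deletes): partition the files by name prefix and run one background removal stream per partition. The prefixes and scratch directory below are purely illustrative:

```shell
# Scratch directory standing in for the real one; files fall into
# three illustrative prefix groups (a*, b*, c*).
dir=$(mktemp -d)
for f in a1 a2 b1 b2 c1 c2; do : > "$dir/$f"; done

# One background stream per prefix; each stream batches its removals
# through xargs. Assumes filenames contain no whitespace.
for prefix in a b c; do
  ( find "$dir" -type f -name "${prefix}*" | xargs rm -f ) &
done
wait   # block until every stream has finished

left=$(ls "$dir" | wc -l | tr -d ' ')
echo "$left"
rmdir "$dir"
```

Whether this actually beats a single stream depends on whether the bottleneck is CPU or disk seeks; on a seek-bound directory, parallel streams may even compete with each other.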
11-27-2006 11:55 PM
If you are removing *everything*, then, as noted, use 'mkfs' to destroy and recreate the entire filesystem. Do this even if you have to follow up by recreating a list of empty directories.
I suspect, however, that you are probably more selective in your deletion. Thus, my choice would be a small Perl script that makes one pass through the filesystem directory, removing candidates that meet the criteria "in place".
# perl -MFile::Find -le 'find(sub{unlink if -f && -M >= 30},"/mypath")'
This simple script searches the directory (or directories) of your choice, looking for *files* that have not been modified in 30 or more days, and unlink()s (removes) them.
Modify the age and path according to your needs.
Regards!
...JRF...
11-28-2006 01:15 AM
Re: How to delete a large number of small files?
I tried your solution. It took just over an hour to delete them.
228GB/293885 files
real 1:04:01.4
user 8:33.8
sys 53:40.6
Can you explain why it took less time than rm -rf directory? Thanks.
Hi others,
I want to thank you for providing solutions. I have assigned the points. I will try your solutions next time and report the results to you.
Yongye
11-28-2006 02:28 AM
Re: How to delete a large number of small files?
Steven's approach takes the work the shell would have to do to build that list and instead tells rm directly which files to remove.
I know that sometimes, when you have a lot of files and you run rm *, you get "arg list too long".
So you know it is building a list. My guess is that even without a wildcard it still builds the list.
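For context on the "arg list too long" error mentioned above: the exec(2) family enforces a kernel limit on the combined size of the argument list and environment, and a glob that expands past it makes the command fail before rm even starts. getconf reports the limit:

```shell
# Query the kernel's limit on argv+environ size (in bytes).
# A shell wildcard expanding past this is what produces
# "arg list too long".
argmax=$(getconf ARG_MAX)
echo "ARG_MAX is $argmax bytes"
```

This is why find | xargs scales where rm * does not: xargs sizes each batch to stay under the limit.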
11-28-2006 04:28 AM
Re: How to delete a large number of small files?
To empty that kind of directory, containing too many small files, I use a simple find:
find . -size +100c -exec rm {} \;
You can increase or decrease the size number.
Anyway, I assume there is nothing faster than deleting the entire folder.
11-28-2006 04:52 AM
Re: How to delete a large number of small files?
As a matter of interest, I've experimented a bit. I would agree with Clay that most approaches probably yield similar times.
I simply wrote a small shell script to create 5,000 files in '/tmp' on a test server. I then compared (using 'timex') the execution times of the Perl script I suggested, Dennis's divide-and-conquer solution, and a simple 'rm -rf /path'.
Statistically the execution times are the same, with 'rm -rf' perhaps being slightly faster.
I believe that the search time (in this case) was negligible compared to the time required to lock the directory for updates.
Note that the *slowest* approach one can take in situations like this is to use 'find' with '-exec'. This spawns a new process for *every* file to be removed. Piping to 'xargs' eliminates this. After all, birthing isn't called "labor" without reason.
Regards!
...JRF...
11-28-2006 05:27 AM
Re: How to delete a large number of small files?
Not if you use '-exec ... +'; it is like xargs.
Here is a silly idea: what if you /etc/unlink the directory? Would using fsck to recover the lost disk space be faster, since no locks are taken?
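To illustrate the '+' terminator mentioned above (shown here on a scratch directory): it batches many pathnames into each rm invocation, the way xargs does, instead of forking rm once per file as '\;' would.

```shell
# Scratch directory with a few files to demonstrate the batched form.
dir=$(mktemp -d)
for i in 1 2 3; do : > "$dir/f$i"; done

# '+' gathers as many pathnames as fit into one rm argument list,
# so rm runs once per batch rather than once per file.
find "$dir" -type f -exec rm -f {} +

left=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "$left"
rmdir "$dir"
```

Note that the '+' form of -exec is specified by POSIX, though very old find implementations may not support it.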
11-28-2006 05:51 AM
Re: How to delete a large number of small files?
> ..if you use -exec ... +, it is like xargs.
Ah, yes, now I recall that is the benefit of using "+" as the '-exec cmd' terminator. Thanks for that correction. It's documented in the 'find' manpages too. Old habits sometimes die hard.
As for using '/usr/sbin/unlink' --- that's an interesting concept as you noted. Perhaps if it were exercised for *files* and not directories, or at least files *first* and then empty directories, then some speed could be gained. The manpages for 'link(1M)' note the appropriate caution.
Regards!
...JRF...
11-28-2006 07:05 AM
Re: How to delete a large number of small files?
There is no "appropriate caution", since we are trying to fool it and create orphans.
Removing the directory would probably just put the files in lost+found; we would have to tell fsck not to do that. And for files, unlink would just remove them and act like rm(1).
But if someone is going to try the newfs solution, they could try this first.
11-28-2006 08:03 AM
Re: How to delete a large number of small files?
My thinking was that if we "...exercised [unlink] for *files* and not directories, or at least files *first* and then empty directories", then your suggestion might gain speed.
I had not thought of trying to leave orphaned files. That was my reference to "appropriate caution".
In my mind, still central to this whole thread's discussion is whether the author has a directory in which only *some* of the files (but not subdirectories?) are to be removed, or whether the entire directory can be removed recursively en masse.
Thanks for the thought-provoking discussion points!
Regards!
...JRF...
11-28-2006 04:48 PM
Re: How to delete a large number of small files?
This is indeed a very interesting discussion topic. Maybe we can create a new thread and post our test results and findings so that they benefit others in the future.
We all come across such issues in our administration tasks at some point.
Just some thoughts!
kind regards
yogeeraj