Operating System - HP-UX

How to find directory with many small files?

 
James L Shirley (4075)
Occasional Advisor

I'm trying to come up with a way to find directories with many small files or individual logs. My customers aren't good at keeping their logs and error messages under control. Any ideas???

-Jim
No Indian prince has to his palace ...
4 REPLIES
Rita C Workman
Honored Contributor

Re: How to find directory with many small files?

You can do a find based on the size of a file, like:
find / -size -100000000c -print
... this would find files smaller than 100000000 bytes in size. You define how small a file you want to find.
Or, if your users keep their files in *.log, you could search:
find / -name '*.log' -print
... this finds based on *.log. Quote the pattern so the shell does not expand it before find sees it.
You could redirect the output to a file instead of printing it to the screen. Or if you know your users keep log files in their home directory (or wherever) you could do what I sometimes do... find based on -size (or -name) and -mtime.
find /home -size -10000000c -a -mtime +30 -exec rm {} \;
.... this would find files in the /home directory that are smaller than 10000000 bytes and have not been modified in 30+ days, and remove them.
..BUT A WORD OF CAUTION...
Remember it is removing based on file size, so you need to be careful that you do not remove things that are needed. Your description of size and name is not really specific enough to find on... Also, -size with a minus sign finds everything smaller than that figure, but a file of exactly that size would be omitted.
I might recommend running the find, sending the output to a file, and reviewing it before you remove anything...
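
To make the review-before-remove idea concrete, here is a rough sketch that splits the job into two steps. The function names and the list-file argument are made up for illustration, and it assumes no file name contains a newline.

```shell
#!/bin/sh
# Sketch only: list candidates first, delete in a separate, deliberate step.

# list_candidates DIR LISTFILE
# Writes the names of files under DIR that are smaller than 10000000 bytes
# and were last modified more than 30 days ago to LISTFILE.
list_candidates() {
    find "$1" -type f -size -10000000c -mtime +30 -print > "$2"
}

# remove_listed LISTFILE
# Deletes the files named in a list that has already been reviewed.
# Assumes one path per line.
remove_listed() {
    while IFS= read -r f
    do
        rm -- "$f"
    done < "$1"
}
```

You would run something like list_candidates /home /tmp/candidates.txt, read through (and edit) the list, and only then run remove_listed /tmp/candidates.txt.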

Just a thought,
/rcw
Jerry Jordak
Advisor
Solution

Re: How to find directory with many small files?

Here's a script I whipped up real quick that might help, or at least give you an idea of where to start:

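#!/bin/ksh
# Write the directory of every small file to a temp file, one line per file
find . -type f -size -1000 -print | xargs -n 1 dirname > /tmp/xxx.out
# Build the list of unique directory names
FILES=$(sort -u /tmp/xxx.out)
for X in $FILES
do
# Count whole-line, fixed-string matches so ./a does not also count ./a/b
echo $X $(grep -Fxc "$X" /tmp/xxx.out)
done | sort -k 2,2nr
# Clean up the temp file
rm /tmp/xxx.out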

What this does is find all files less than 1000 bytes (you can change the size if you want), take the name of the directory where each file is located, and write those names to a file. Then it goes through the file, creates a list of the unique directory names, and puts the list in the variable $FILES. Lastly, it goes through the list of unique directories and counts the number of occurrences of each one in the temp file, which gives you the number of small files in that directory, and prints the count next to the directory name. The sort at the end puts the lines in reverse numerical order by number of files found.

So when you run this, you should see results like the following:

./stm/data/tools/verify 7
./stm/config/tools/verify 4
./stm/config/tools/monitor 2
./spool/pwgr 1
./opt/ignite/local 1

This will tell you the number of small files (as defined in the find command) in each directory. Hope this helps.
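
If the temp file and the repeated grep get slow on a big tree, the same counting can probably be done in one pass with awk. This is only a sketch of that variation, not a tested replacement on HP-UX: the function name count_small_files and its two arguments are invented here, the size is given in bytes (note the c suffix), and it assumes path names contain no whitespace or newlines.

```shell
#!/bin/sh
# count_small_files DIR BYTES
# Prints "directory count" for files under DIR smaller than BYTES bytes,
# with the largest count first.
count_small_files() {
    find "$1" -type f -size "-${2}c" -print |
    awk '{
        dir = $0
        sub("/[^/]*$", "", dir)    # strip the final path component
        count[dir]++               # one more small file in this directory
    }
    END { for (d in count) print d, count[d] }' |
    sort -k 2,2nr
}
```

For example, count_small_files . 1000 should give the same kind of listing as above.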

-JWJ
James L Shirley (4075)
Occasional Advisor

Re: How to find directory with many small files?

Thanx Rita & Jerry! I was working along the lines of creating a list of all directories, then counting the files, then check size of everything under the directory and then check the size to number of files ratio. I knew that was a very long way around to get the answer. Both of your suggestions are far more direct and efficient. Thanx again.

-Jim
No Indian prince has to his palace ...
Bruce Regittko_1
Esteemed Contributor

Re: How to find directory with many small files?

Hi,

Jerry's script is good and will work but make sure that you specify

-size -1000c

in your find command. Without the 'c', -size counts 512-byte blocks, not bytes (characters), so -size -1000 matches files of up to roughly 500 KB.
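
A quick way to see the difference for yourself, as a small sketch (the helper name matches_size is made up for this example):

```shell
#!/bin/sh
# matches_size FILE SIZE_EXPR
# Prints "yes" if find's -size SIZE_EXPR test matches FILE, otherwise "no".
matches_size() {
    if [ -n "$(find "$1" -type f -size "$2" -print)" ]
    then
        echo yes
    else
        echo no
    fi
}
```

With a 2000-byte file, matches_size file -1000c prints no (2000 bytes is not under 1000 bytes), while matches_size file -1000 prints yes (the file is only 4 blocks, well under 1000 blocks).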

--Bruce
www.stratech.com/training