Operating System - Linux

Perl/ksh script to find files used by users...

 
jmckinzie
Super Advisor

Perl/ksh script to find files used by users...

Ok,

We have directories like /home11 that store users' directories. I need a perl/ksh script that searches the directories, lists the files from biggest to smallest, and determines whether or not any of these directories or files is an orphan.

-TIA
Jonathan Fife
Honored Contributor

Re: Perl/ksh script to find files used by users...

What do you mean by "an orphan"? The owning user no longer exists?
Decay is inherent in all compounded things. Strive on with diligence
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

Orphan file = not owned by anyone listed in the /etc/passwd file.
A. Clay Stephenson
Acclaimed Contributor

Re: Perl/ksh script to find files used by users...

This will get you started:

find . -type f -nouser -print
If it ain't broke, I can fix that.
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

Find and sort:

# perl -MFile::Find -le'find(sub{-f and push@{$f{-s$_}},$File::Find::name},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%9d %s\n",$s,$_ for sort@{$f{$s}}}' /home11


Enjoy, Have FUN! H.Merijn
James R. Ferguson
Acclaimed Contributor

Re: Perl/ksh script to find files used by users...

Hi Jody:

# find /home -xdev -nouser
# find /home -xdev -nogroup

...will find your unowned files (owner not in the passwd database, or group not in the group database).

For listing the contents from largest to smallest:

# du -xk /home|sort -k1nr
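
Note that du reports totals per directory. For a per-file listing, something along these lines might do. It is only a sketch and assumes GNU find (for -printf), which is not in the original posts; the function name list_files_by_size is made up for illustration. File names containing newlines will confuse it.

```shell
# list_files_by_size DIR...
# Print "size<TAB>path" for every regular file below the given directories,
# largest first; files of equal size are listed in path order.
# Assumption: GNU find, because -printf is not available in every find.
list_files_by_size() {
    find "$@" -xdev -type f -printf '%s\t%p\n' |
        LC_ALL=C sort -k1,1nr -k2
}
```

Called as: list_files_by_size /home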

Regards!

...JRF...
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

And with an ORPH indicator added:

# perl -MFile::Find -le'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%4s %9d %s\n",($_->[1]eq getpwuid$_->[1]?"ORPH":""),$s,$_->[0] for sort@{$f{$s}}}' /home11

Enjoy, Have FUN! H.Merijn
Jonathan Fife
Honored Contributor

Re: Perl/ksh script to find files used by users...

What I came up with...

for file in $(find . -type f);
do
filesize=$(ls -l "$file" | awk '{print $5}')
fileowner=$(ls -l "$file" | awk '{print $3}')
orphan=""
# ls -l shows the numeric uid when the owner has no passwd entry
if [[ $(grep -c "^$fileowner:" /etc/passwd) -eq 0 ]];
then orphan="ORPHANED FILE!"
fi
echo "$filesize $file $orphan"
done | sort -k1,1nr
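
The loop above runs ls twice per file and only looks in /etc/passwd. A possible single-pass variant is sketched below. It assumes GNU find (for -printf), and the function name report_orphans is invented here. The -nouser test is the same one used earlier in the thread.

```shell
# report_orphans DIR...
# One line per regular file: flag<TAB>size<TAB>path, largest file first.
# The flag is ORPH when find's -nouser test matches (the numeric owner has
# no entry in the user database) and "-" otherwise.
# Assumption: GNU find, because of -printf.
report_orphans() {
    find "$@" -xdev -type f \
        \( -nouser -printf 'ORPH\t%s\t%p\n' -o -printf '-\t%s\t%p\n' \) |
        LC_ALL=C sort -t "$(printf '\t')" -k2,2nr -k3
}
```

Called as: report_orphans /home11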



HTH,
Jon
Decay is inherent in all compounded things. Strive on with diligence
Jonathan Fife
Honored Contributor

Re: Perl/ksh script to find files used by users...

OK, I really need to learn more perl :)
Decay is inherent in all compounded things. Strive on with diligence
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

Procura,

Is there a way to combine the two perl statements?

Sorry but, I am still trying to learn perl.

Thanks,
James R. Ferguson
Acclaimed Contributor

Re: Perl/ksh script to find files used by users...

Hi (again) Jody:

Merijn's very elegant solution *does* combine the listing of file sizes and whether or not they are unowned. There is one small bug (!) in his last post. Try this instead:

# perl -MFile::Find -le'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%4s %9d %s\n",(!defined getpwuid$_->[1]?"ORPH":""),$s,$_->[0] for sort@{$f{$s}}}' /home11

Regards!

...JRF...
H.Merijn Brand (procura
Honored Contributor
Solution

Re: Perl/ksh script to find files used by users...

JRF, you're right. Thanks. We indeed should use
defined getpwuid

But then I would use positive logic in the ternary

# perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%4s %9d %s\n",defined getpwuid$_->[1]?"":"ORPH",$s,$_->[0] for sort@{$f{$s}}}' /home11

A small breakdown:

> # perl

the command invocation

> -MFile::Find

load the module File::Find. This is equivalent to

use File::Find;

in a script

> -e

command line option.
-e will use the next arg as a chunk of perl code

> find (

find is a function from the File::Find module. It has several invocation methods, which basically boil down to

find (function, directories)

As the function can be a (reference to a) named sub or an anonymous sub, I always prefer to inline it in the call using sub { }

The sub defines the actions being taken on all entries found

> sub {
> -f and push @{$f{-s$_}},[$File::Find::name,(stat$_)[4]]

-f is a file test operator, defaulting to $_. If you want to do that explicitly, add it:

# only scan *files*, not special files, symlinks, directories and fifos
-f $_ or return;
# Note that you have to use 'return', and not 'next' as you are leaving a sub, not skipping to the next file in a for or while loop. A common pitfall for find ()
# Now push the file info in a hash of lists
# The hash is `indexed' by the filesize (-s)
# For each entry, I push an anonymous list with two entries:
# 1. The full file name ($File::Find::name)
# 2. The owner of the file (stat $_)[4]
# see perldoc -f stat
push @{$f{-s$_}}, [ $File::Find::name, (stat$_)[4] ];

I chose to use the file size as base index for the hash, so descending sorting on the file size is now dead easy

> }, @ARGV);

} is end of sub, @ARGV is the directories. @ARGV is the rest of the command line arguments

So, now that I have gathered all the info I need in a hash, I must show it

> for $s (sort {$b <=> $a} keys %f) {

$s iterates over the numerically descending sorted (see perldoc -f sort) keys of the hash. As the keys are the file sizes, this meets your request

> printf "%4s %9d %s\n", defined getpwuid $_->[1] ? "" : "ORPH" , $s, $_->[0] for sort @{$f{$s}}}

OK, that is a lot on one line. Let's break that down some more into something more extendable

# Get me the (alphabetically sorted) list of files that have this size
my @files = sort @{$f{$s}};
# Now iterate over it
foreach my $entry (@files) {
# Each entry had the name and the uid
# in an anonymous list
my ($file, $uid) = @$entry;
if (defined getpwuid $uid) {
# This file has a known user
printf "%9d %s\n", $s, $file;
}
else {
# ALARM: Orphaned file
printf "ORPHANED: %9d %s\n", $s, $file;
}
}

>' /home11

As I used @ARGV for the directories to be searched, I need to pass these as extra arguments. I could also have chosen to hardcode them in the find command, but this way it is easier both to move it to a script and to change it with command line editing

Hope that helped.
One more note. Files without a named user in /etc/passwd are not necessarily orphaned because the user was removed from the system. The uid can also come from a file retrieved from a cpio or tar (sw depot) archive that was created on another system with uids that do not match yours
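
To check a single file by hand, a sketch like the following may help. It assumes GNU stat (for -c) and getent, neither of which appears in the posts above, and the three function names are made up. getent consults whatever sources nsswitch.conf lists, so accounts from NIS or LDAP count as known too.

```shell
# owner_uid FILE: print the numeric uid that owns FILE (GNU stat assumed).
owner_uid() {
    stat -c '%u' -- "$1"
}

# uid_known UID: succeed when the uid resolves to an account.
uid_known() {
    getent passwd "$1" >/dev/null 2>&1
}

# describe_owner FILE: say whether the owning uid has an account.
describe_owner() {
    uid=$(owner_uid "$1") || return 1
    if uid_known "$uid"; then
        printf '%s: uid %s has an account\n' "$1" "$uid"
    else
        printf '%s: uid %s has no account (removed user, or uid from another system)\n' "$1" "$uid"
    fi
}
```

Called as: describe_owner /home11/somefile (the path is only a placeholder).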

Enjoy, Have FUN! H.Merijn
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

Is there any way to pull the information for the directory at the same time?

IE:

ORPHDIR /home11/%username%

ORPH size file
ORPH size file

DIR /home11/%username%

Owner ID Size file


I guess what I am trying to say is: can we make it show the username who owns the directory instead of ORPH if it is owned?

I would like this to do the same for all the directories in /home11.

-TIA....again sorry...still learning perl.
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

lt09:/tmp/home11 123 > ll -R
.:
total 2
627705 drwxrwxrwx 5 merijn users 120 2006-07-21 16:07 .
4700 drwxrwxrwt 46 root root 2312 2006-07-21 16:02 ..
627676 drwxrwxrwx 2 merijn users 168 2006-07-21 16:03 home1
959931 drwxrwxrwx 2 lp users 168 2006-07-21 16:03 home2
960241 drwxrwxrwx 2 12345 root 168 2006-07-21 16:07 home3

./home1:
total 0
627676 drwxrwxrwx 2 merijn users 168 2006-07-21 16:03 .
627705 drwxrwxrwx 5 merijn users 120 2006-07-21 16:07 ..
424038 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx1
960214 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx2
960245 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx3
960287 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx4
960290 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx5

./home2:
total 0
959931 drwxrwxrwx 2 lp users 168 2006-07-21 16:03 .
627705 drwxrwxrwx 5 merijn users 120 2006-07-21 16:07 ..
960291 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx1
960292 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx2
960295 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx3
960296 -rw-rw-rw- 1 12345 users 0 2006-07-21 16:03 xx4
960299 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx5

./home3:
total 0
960241 drwxrwxrwx 2 12345 root 168 2006-07-21 16:07 .
627705 drwxrwxrwx 5 merijn users 120 2006-07-21 16:07 ..
960301 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx1
960303 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx2
960304 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx3
960305 -rw-rw-rw- 1 12345 users 0 2006-07-21 16:03 xx4
960306 -rw-rw-rw- 1 merijn users 0 2006-07-21 16:03 xx5
lt09:/tmp/home11 124 > perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4],(stat".")[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%-8s %9d %s\n",defined getpwuid$_->[1]?"":getpwuid($_->[2])||$_->[2],$s,$_->[0] for sort@{$f{$s}}}' /tmp/home11
                 0 /tmp/home11/home3/xx1
                 0 /tmp/home11/home3/xx2
                 0 /tmp/home11/home2/xx1
                 0 /tmp/home11/home2/xx2
                 0 /tmp/home11/home1/xx1
                 0 /tmp/home11/home1/xx2
                 0 /tmp/home11/home1/xx3
                 0 /tmp/home11/home1/xx4
                 0 /tmp/home11/home1/xx5
                 0 /tmp/home11/home2/xx3
lp               0 /tmp/home11/home2/xx4
                 0 /tmp/home11/home2/xx5
                 0 /tmp/home11/home3/xx3
12345            0 /tmp/home11/home3/xx4
                 0 /tmp/home11/home3/xx5
lt09:/tmp/home11 125 >

Enjoy, Have FUN! H.Merijn [ still awaiting points ]
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

Do I add this to the end of the previous command?
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

No, it's a replacement. Just copy-n-paste the command line

# perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4],(stat".")[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%-8s %9d %s\n",defined getpwuid$_->[1]?"":getpwuid($_->[2])||$_->[2],$s,$_->[0] for sort@{$f{$s}}}' /home11

Enjoy, Have FUN! H.Merijn
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

I copied and pasted as requested and i get output that looks like this...

ORPH 297950 /home11/lacsnxb.precob/latinmpruat/new_prods.bcp
294912 /home11/sybase/ase125/shared-1_0/jre1.2.2/lib/PA_RISC/libmlib_image.sl
ORPH 293093 /home11/credit10/creditdv/unit.rep
ORPH 293093 /home11/credit09/creditdv/unit.rep
ORPH 293093 /home11/credit06/creditdv/unit.rep
ORPH 292263 /home11/workgroup/ASPPHGFFC/gfmis/TRASH/cocpl.out
ORPH 292263 /home11/workgroup/ASPPHGFFC/gfmis/TRASH/ebipl.out

Would like it to look like your example above...

Sorry, I will learn this soon but I have a requirement due now and don't know anything about this.
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

That cannot be a paste from my last post. You've still got the "ORPH" text in.

Enjoy, Have FUN! H.Merijn
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

Sorry copied the wrong one....

What do i do with this:
lt09:/tmp/home11 123 > ll -R


This is the command:

perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4],(stat".")[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%-8s %9d %s\n",defined getpwuid$_->[1]?"":getpwuid($_->[2])||$_->[2],$s,$_->[0] for sort@{$f{$s}}}' /home11

output:

6189 14218 /home11/lacsnxb.precob/.tmp/0429145524.out
6189 14218 /home11/lacsnxb.precob/.tmp/0502141739.out
6189 14217 /home11/lacsnxb/lampr12uat/emex_hist.out
6189 14217 /home11/lacsnxb/lampr12uat/hist.out
6189 14217 /home11/lacsnxb/lampr12uat/mpr_hist_dec12.out
6189 14217 /home11/lacsnxb/lampr12uat/mprhist.out
6189 14217 /home11/lacsnxb/latinmpruat/emex_hist.out
6189 14217 /home11/lacsnxb/latinmpruat/hist.out
6189 14217 /home11/lacsnxb/latinmpruat/mpr_hist_dec12.out
6189 14217 /home11/lacsnxb/latinmpruat/mprhist.out

Sorry about my total lack of knowledge.
H.Merijn Brand (procura
Honored Contributor

Re: Perl/ksh script to find files used by users...

ll is an alias for ls -l and ls -lR lists a folder recursively.
I entered it for you to show how it works

# perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4],(stat".")[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%-8s %9d %s\n",defined getpwuid$_->[1]?"":getpwuid($_->[2])||$_->[2],$s,$_->[0] for sort@{$f{$s}}}' /home11

output:

6189 14218 /home11/lacsnxb.precob/.tmp/0429145524.out
6189 14218 /home11/lacsnxb.precob/.tmp/0502141739.out
6189 14217 /home11/lacsnxb/lampr12uat/emex_hist.out
6189 14217 /home11/lacsnxb/lampr12uat/hist.out
6189 14217 /home11/lacsnxb/lampr12uat/mpr_hist_dec12.out
6189 14217 /home11/lacsnxb/lampr12uat/mprhist.out
6189 14217 /home11/lacsnxb/latinmpruat/emex_hist.out
6189 14217 /home11/lacsnxb/latinmpruat/hist.out
6189 14217 /home11/lacsnxb/latinmpruat/mpr_hist_dec12.out
6189 14217 /home11/lacsnxb/latinmpruat/mprhist.out

Correct. In this list, the 6189 at the start is the owner of the folder where the orphaned file is in, which in this case is an unknown user too :)
the 14217 is the file size
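
The first column can be cross-checked for any one file from the shell. This is only a sketch: it assumes GNU stat and getent, and the function name dir_owner is invented. Like the one-liner, it prints the user name of the containing directory's owner and falls back to the numeric uid when no name is known.

```shell
# dir_owner FILE
# Print the owner of the directory that contains FILE: the user name when
# the uid resolves, otherwise the bare uid (as in the first output column).
# Assumptions: GNU stat (-c) and getent are available.
dir_owner() {
    dir=$(dirname -- "$1")
    uid=$(stat -c '%u' -- "$dir") || return 1
    name=$(getent passwd "$uid" | cut -d: -f1)
    printf '%s\n' "${name:-$uid}"
}
```

For the listing above, dir_owner /home11/lacsnxb/lampr12uat/hist.out should print 6189 as long as that uid still has no account.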

Enjoy, Have FUN! H.Merijn
jmckinzie
Super Advisor

Re: Perl/ksh script to find files used by users...

used:

perl -MFile::Find -e'find(sub{-f and push@{$f{-s$_}},[$File::Find::name,(stat$_)[4],(stat".")[4]]},@ARGV);for$s(sort{$b<=>$a}keys%f){printf"%-8s %9d %s\n",defined getpwuid$_->[1]?"":getpwuid($_->[2])||$_->[2],$s,$_->[0] for sort@{$f{$s}}}' /home11

Thanks a bunch.