Operating System - HP-UX

whats wrong with this "find" !!

 
Whitehorse_1
Frequent Advisor

whats wrong with this "find" !!

Hi Scriptians,

I have a script automated in cron to clean up some application logs older than 10 days. But at times, some files that are older than 10 days are left undeleted. Is anything wrong with my syntax, or can someone fine-tune this, please?

find /var/abc -mtime +10 -type f -name "*.log" -exec rm {} \;

--Adv thxs, WH
Reading is a good course medicine for deep sleep !!
15 REPLIES
V. Nyga
Honored Contributor

Re: whats wrong with this "find" !!

Hi,

the script seems to be right, but have you checked whether these log files older than 10 days are currently in use, so that cron can't delete them?
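
A quick way to check which processes (if any) still have the old logs open (a sketch, assuming the standard fuser(1) utility and the same path as in your find command):

# fuser /var/abc/*.log

Any PID printed next to a file name means some process is still using that log; fuser(1) explains the code letters that follow each PID.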

Volkmar
*** Say 'Thanks' with Kudos ***
James R. Ferguson
Acclaimed Contributor

Re: whats wrong with this "find" !!

Hi:

Ten (10) days means 10 * 24 hours (that is, the resolution is 864,000 seconds).

By the way, you will get better performance with:

# find /var/abc -mtime +10 -type f -name "*.log" -exec rm {} \+

That is, using the "+" terminator bundles multiple arguments and invokes 'rm' with the bundle instead of invoking an 'rm' process for every file found!
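
A harmless way to see the bundling for yourself (a sketch; it only echoes what would be run, using the same path as your command):

# find /var/abc -mtime +10 -type f -name "*.log" -exec echo rm {} +
# find /var/abc -mtime +10 -type f -name "*.log" -exec echo rm {} \;

The first form prints one long 'rm' line with many file names bundled onto it; the second prints a separate 'rm' line per file, which for real deletions means one extra process per file.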

Regards!

...JRF...
Whitehorse_1
Frequent Advisor

Re: whats wrong with this "find" !!

Hi JRF,

I tried your syntax with the "+" terminator, substituting 'll' for 'rm', but I didn't get any output. In fact, I do get files listed with the ";" terminator. WH
Reading is a good course medicine for deep sleep !!
James R. Ferguson
Acclaimed Contributor

Re: whats wrong with this "find" !!

Hi (again):

First, was it clear that '-mtime +10' means ten 24-hour days, i.e. exactly 864,000 seconds? That is the answer to your question of why some files *apparently* older than 10 days are left undeleted.

Second, what release are you running? The "+" terminator works on 11.11 and later (at least). You need a space after the {} and before the terminator.

If you like, you can use 'xargs' to achieve the same performance gain. My point is: do *not* exec one process for every file you have to handle:

# find /var/abc -mtime +10 -type f -name "*.log" | xargs rm
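
If you want to cap how many names each 'rm' receives, xargs -n does that (a sketch; the 100 is an arbitrary batch size):

# find /var/abc -mtime +10 -type f -name "*.log" | xargs -n 100 rm -f

One caveat with the pipe form: file names containing spaces or newlines get split by xargs, so it is safest where log file names are plain.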

Regards!

...JRF...
Whitehorse_1
Frequent Advisor

Re: whats wrong with this "find" !!

Hi JRF,

** I think it's in multiples of 24 hrs (man page):

-mtime n True if the file modification time subtracted
from the initialization time is n-1 to n
multiples of 24 h. The initialization time
shall be a time between the invocation of the
find utility and the first access by that
invocation of the find utility to any file
specified in its path operands.


** I have the 11.00 OS. WH
Reading is a good course medicine for deep sleep !!
James R. Ferguson
Acclaimed Contributor

Re: whats wrong with this "find" !!

Hi (again):

> ** I think its in multiples of 24 hrs (man page)

Yes, that's what I said when I wrote, "ten days as in ten, 24-hour days or exactly 864,000 seconds" :-)

As for using the "+" terminator, this should work at 11.0 too:

http://docs.hp.com/en/B2355-90680/find.1.html

Regards!

...JRF...
Shannon Petry
Honored Contributor

Re: whats wrong with this "find" !!

Actually, I see little difference in performance between "find (args) -exec /bin/rm {} \;" and "find (args) | xargs rm". Both commands spawn a child remove process, so they should perform pretty evenly.

One thing of note is that you do not have full paths in your command. Even though cron's default search path should pick up the commands, explicit paths are always faster than searching for binaries (and much safer, too).

/usr/bin/find /var/abc -mtime +10 -type f -name "*.log" -exec /usr/bin/rm {} \;
Microsoft. When do you want a virus today?
Peter Nikitka
Honored Contributor

Re: whats wrong with this "find" !!

Hi,

@Shannon:
The only case where you're right in
>>
... Both commands spawn a child remove process, so should perform pretty evenly.
<<
is when the find yields exactly one result.
N results will lead to N exec() calls in the first case, and to roughly INT(total argument length / ARG_MAX) + 1 calls in the xargs case.
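
For illustration with made-up numbers: 10,000 matching files with ~30-character path names is roughly 300 KB of argument text. With an ARG_MAX on the order of, say, 2 MB, that fits into a single rm invocation via '+' or xargs, against 10,000 separate exec()s with the ';' form.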

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
Shannon Petry
Honored Contributor

Re: whats wrong with this "find" !!

Because of the tiny overhead of "xargs", there is very little difference. The overhead is in the find and rm commands, as is plainly visible in this quick test:

# Reported by truss on a Solaris box
############################################
#command find /tmp -name -print -exec /bin/ls -l {} \; 2>>/dev/null
# syscall seconds calls errors
sys totals: .001 123 7
usr time: .001
elapsed: .050
###########################################
# find /tmp -name -print | xargs ls -l 2>>/dev/null
# syscall seconds calls errors
sys totals: .002 127 9
usr time: .001
elapsed: .050

NOTE: this is a pretty busy server, but notice that there is only a difference of 4 system calls, in a file system of about 16 MB.


I agree that xargs can be a poor option in most cases, but not really here.
Microsoft. When do you want a virus today?
A. Clay Stephenson
Acclaimed Contributor

Re: whats wrong with this "find" !!

This is clearly a problem of what you mean in English (probably: anything whose mtime is a date 10 days ago or earlier) vs. what you are really saying in UNIX, which is more than 10 * 24 hours before now. Of course, you also have to define "older" (in English, time since file creation), but UNIX has no notion of a file's creation time. You can use the ! -newer syntax and touch a reference file with a timestamp of midnight on the date 10 days ago to capture your probable English meaning.

Something close to this should work:
------------------------------------
#!/usr/bin/sh

TDIR=${TMPDIR:-/var/tmp}
REFFNAME=${TDIR}/R${$}.ref

trap 'eval rm -f ${REFFNAME}' 0 1 2 3 15

TIMESTAMP="$(caljd.sh -y -s $(caljd.sh -p 10))0000.00"
touch -m -t ${TIMESTAMP} ${REFFNAME}
find /var/abc -type f -name '*.log' ! -newer ${REFFNAME} -exec rm {} \;
exit 0
------------------------------------

You should be able to find a copy of caljd.sh with a search. Version 2.3 is the latest.
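
If caljd.sh is not to hand, the same reference-file trick can be sketched with a small Perl date calculation instead (an illustration, assuming a system Perl is installed; it is not part of the script above):
------------------------------------
#!/usr/bin/sh

TDIR=${TMPDIR:-/var/tmp}
REFFNAME=${TDIR}/R${$}.ref

trap 'rm -f ${REFFNAME}' 0 1 2 3 15

# Midnight ten days ago, in the CCYYMMDDhhmm form that 'touch -t' expects
TIMESTAMP=$(perl -e '@t = localtime(time - 10 * 86400);
                     printf "%04d%02d%02d0000", $t[5] + 1900, $t[4] + 1, $t[3];')

touch -m -t ${TIMESTAMP} ${REFFNAME}
find /var/abc -type f -name '*.log' ! -newer ${REFFNAME} -exec rm {} \;
exit 0
------------------------------------
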
If it ain't broke, I can fix that.
Peter Nikitka
Honored Contributor

Re: whats wrong with this "find" !!

Hi,

Sorry for being slightly off-topic, but:
@Shannon - I couldn't believe your measurements, so I ran the same test on my Solaris box.
uname -a
SunOS forth 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Blade-1500

I get completely different values:
cat /tmp/skr1
find /tmp -exec ls -ld {} \; >/dev/null
cat /tmp/skr2
find /tmp | xargs ls -ld >/dev/null
find /tmp | wc -l
209

Measurement:
truss -o ~/tmp/find-stat1 -cf sh /tmp/skr1
truss -o ~/tmp/find-stat2 -cf sh /tmp/skr2

Results:
cat ~/tmp/find-stat1
syscall seconds calls errors
...
execve .242 426 212
...
-------- ------ ----
sys totals: .599 21766 1067
usr time: .387
elapsed: 6.740

cat ~/tmp/find-stat2
syscall seconds calls errors
...
execve .005 9 3
...
-------- ------ ----
sys totals: .029 2084 26
usr time: .018
elapsed: .390

Quite a difference from your measurements...

mfG Peter
Complete logfiles in the attachment!
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
Shannon Petry
Honored Contributor

Re: whats wrong with this "find" !!

Agreed, and apologies for the topic hijacking.

Not sure why your measurements are that much different. I ran the commands several times to ensure correctness. I have now switched to a file system on a local disk with actual rm commands, and the discrepancy is greater than before, but still not as drastic as yours.


user@nodename: /export/home/petrys/RPMS/tmp
# ls
etc xargs.sh embedded.sh

user@nodename: /export/home/petrys/RPMS/tmp
# find . -depth -print | wc -l
54
user@nodename: /export/home/petrys/RPMS/tmp
# du -sk etc
17464 etc

user@nodename: /export/home/petrys/RPMS/tmp
# cat xargs.sh
find /export/home/petrys/RPMS/tmp/etc | xargs /bin/rm -rf

user@nodename: /export/home/petrys/RPMS/tmp
# cat embedded.sh
find /export/home/petrys/RPMS/tmp/etc -exec /bin/rm -rf {} \;

user@nodename: /export/home/petrys/RPMS/tmp
# truss -o embedded.log -cf ./embedded.sh

user@nodename: /export/home/petrys/RPMS/tmp
# cat embedded.log

syscall seconds calls errors
_exit .000 3
read .000 2
open .000 11 3
close .000 20 1
unlink .005 43
chdir .000 21
time .000 1
brk .000 9
stat .000 11
getpid .000 7
getuid .000 3
access .000 2
getsid .000 1
getpgid .000 1
getgid .000 2
ioctl .000 6 3
execve .002 3
fcntl .000 11
openat .000 9
rmdir .000 8
sigaltstack .000 1
sigaction .000 53
getcontext .000 3
setustack .000 3
waitid .000 3
mmap .001 34
munmap .000 8
fchdir .000 2
getrlimit .000 3
memcntl .000 5
sysconfig .000 2
fork1 .000 2
lwp_self .000 2
lwp_sigmask .000 4
schedctl .000 2
resolvepath .001 65
getdents64 .000 17
stat64 .000 17
lstat64 .001 58
fstat64 .000 9
getrlimit64 .000 1
open64 .000 2
getcwd .034 51
-------- ------ ----
sys totals: .054 521 7
usr time: .007
elapsed: .160

user@nodename: /export/home/petrys/RPMS/tmp
# truss -o xargs.log -cf ./xargs.sh

user@nodename: /export/home/petrys/RPMS/tmp
# cat xargs.log
syscall seconds calls errors
_exit .000 5
read .000 5
write .000 1
open .000 17 5
close .000 37 1
unlink .005 43
chdir .000 22
time .000 1
brk .000 18
stat .000 17
lseek .000 1 1
getpid .000 12
getuid .000 3
access .000 3
getsid .000 1
getpgid .000 1
pipe .000 1
getgid .000 2
ioctl .000 7 4
execve .005 5
fcntl .000 22
openat .000 16
rmdir .000 8
sigaltstack .000 1
sigaction .000 53
getcontext .000 5
setustack .000 5
waitid .000 5
mmap .001 51
munmap .000 12
fchdir .000 17
getrlimit .000 5
memcntl .001 7
sysconfig .000 2
fork1 .001 4
lwp_self .000 4
lwp_sigmask .000 8
schedctl .000 3
resolvepath .001 73
getdents64 .000 32
stat64 .000 18
lstat64 .004 158 50
fstat64 .000 26
getrlimit64 .000 2
open64 .000 2
getcwd .034 52
-------- ------ ----
sys totals: .065 793 61
usr time: .013
elapsed: .250
Microsoft. When do you want a virus today?

Re: whats wrong with this "find" !!

Whitehorse,

Having done a bit of calculating after reading that -mtime +10 works out to roughly 240 hours:

If you run your clearout script at, say, 02:00 on 20 January, you might expect every file dated on or before 10 January to be gone.
In fact, -mtime +10 only matches files that are already more than 240 hours old at the moment find starts, i.e. modified before 02:00 on 10 January. Files dated later that day survive until a subsequent run pushes them past the 240-hour mark, which is why you can still find "10-day-old" files after the script has run.
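
A small test to convince yourself (a hypothetical demo, not from this thread; it uses Perl's utime to back-date two scratch files):

# mkdir /tmp/mtime_demo && cd /tmp/mtime_demo
# touch nine_half eleven_half
# perl -e '$t = time - int( 9.5 * 86400); utime $t, $t, "nine_half";'
# perl -e '$t = time - int(11.5 * 86400); utime $t, $t, "eleven_half";'
# find . -type f -mtime +10

Only eleven_half is printed; nine_half is too young under any interpretation. A file between 10 and 11 days old sits right on the boundary: HP-UX's "more than n multiples of 24 h" wording matches it, while implementations that round the age down to whole days do not.
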
Whitehorse_1
Frequent Advisor

Re: whats wrong with this "find" !!

Let me switch "ON" my calculator first..

1 day = 24 hrs = 1440 mins = 86400 secs

** 10 days = 10 * 86400 secs = 864000 secs

** 10 days = 10 * 1440 mins = 14400 mins

** 10 days = 10 * 24 hrs = 240 hrs

My calculator shows: 10 days = 864000 secs = 14400 mins = 240 hrs

Is this not applicable to shell scripts? Or is there some other way of calculating days/hrs/mins/secs? -- WH
Reading is a good course medicine for deep sleep !!
Dennis Handly
Acclaimed Contributor

Re: whats wrong with this "find" !!

>My calculator shows: 10 days = 864000 secs = 14400 mins = 240 hrs

Yes.

>Is this not applicable to shell scripts?

What are you asking? 10 days is 10 days. Were you confused by what Matthew said?