Operating System - HP-UX

Re: Really ugly file deletion problem

 
Robert Fisher_1
Frequent Advisor

Really ugly file deletion problem

Hello experts,

Please help because I am totally lost. Our document management system creates huge log files that need to be deleted when they are more than 60 days old. Now here's the catch: the last access and modification times mean nothing, so I can't do a find -mtime. The only thing that matters is the date string embedded in the filename. It always takes the form "03March2002" or "13September2001", but it sits in the middle of the filename. Here's a small sample.

/u01/log1/h02071212July2002170041.Log.Z
/u01/log1/h02071313July2002170033.LOG
/u01/log1/h02071515July2002170017.Log.Z
/u01/log1/h02071818July2002170105.LOG
/u02/log16/B02081919August2003170041.Log.Z
/u01/log1/h02082020August2003170033.LOG
/u01/log1/X02082222August2003170017.Log.Z

I can't figure any way to make this work. I've attached a larger list of files. Anybody willing to help me fight this bear?

Thanks for any and all help,
Bob

P.S. I promise I'll award points for this. I wish I could award negative points to the designer of this mess.

13 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: Really ugly file deletion problem

Hi Bob:

You're right; this is ugly. A few days ago there was a similar (though simpler) problem. Please see my awk solution.

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x9e2631ec5e34d711abdc0090277a778c,00.html

Yours is more difficult because the regular expressions for the match are trickier. I also note that a few of the filenames in your data file indicate dates later than the current date. Are these okay?

Good luck (and you are going to need it), Clay

If it ain't broke, I can fix that.
Robert Fisher_1
Frequent Advisor

Re: Really ugly file deletion problem

Thanks Clay for your quick response. You're right about the dates that are newer than today. I'll need to look into that. It looks like I'll need to wait until we figure out what to do with those files as well.

Bob
Sridhar Bhaskarla
Honored Contributor

Re: Really ugly file deletion problem

Hi,

I can give you half a solution, which extracts the date from the filename. Then use Clay's perl script to determine the difference and delete the files that are more than a month old.

Create a file called months in the current directory and add the entries January, February, etc. on separate lines.

DIR=/u01
for i in $(ls $DIR)
do
FILE=$(echo $i|awk -F. '{print $1}')

while read month
do
echo $FILE |grep $month > /dev/null 2>&1
if [ $? = 0 ]
then
MONTH=$month
break
fi
done < months

NEWFILE=$(echo $FILE|sed 's/'$MONTH'//g')
LEN=$(echo $NEWFILE|wc -c)
(( A = $LEN - 10 ))
(( B = $LEN - 7 ))
YEAR=$(echo $NEWFILE|cut -c ${A}-${B})
(( A = $LEN - 12 ))
(( B = $LEN - 11 ))
DAY=$(echo $NEWFILE|cut -c ${A}-${B})

#You can use DAY,MONTH and YEAR variables now to be used with Clay's caljd script


echo $DAY $MONTH $YEAR
done


-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Robin Wakefield
Honored Contributor

Re: Really ugly file deletion problem

Hi Bob,

Here's a bit of perl that does the job. I've put the list of files in /tmp/files.

It looks for the month string, then the day and year either side, calculates the difference from now, and prints the filename if it's old enough:

============================
#!/usr/bin/perl

use Time::Local;

%months=qw(Jan 0 Feb 1 Mar 2 Apr 3 May 4 Jun 5 Jul 6 Aug 7 Sep 8 Oct 9 Nov 10 Dec 11);
$now=time;
$SECS_PER_DAY=60*60*24;
$diff=60*$SECS_PER_DAY; # how far back

open FH,"/tmp/files"; # list of files
@files=<FH>;
close FH;

foreach $file (@files){
foreach $month (keys %months){
if ($file=~/(..)($month)\D*(....)/){
$d=$1;
$m=$2;
$y=$3;
$time=timelocal(0,0,0,$d,$months{$m},$y-1900);
print "$file" if ( ($now - $time) > $diff );
last;
}
}
}

============================

rgds, Robin
Christian Gebhardt
Honored Contributor

Re: Really ugly file deletion problem

Hi

another solution:

for dat in `ls | grep -i log`
do
actdat=`ls $dat| awk -F . '{printf("%s.%s.%s\n",substr($1,8,2),substr($1,10,length($1)-19),substr($1,length($1)-9,4)) }'`
if [ "`./diffday $actdat`" -gt 60 ]
then
ls $dat
fi
done


diffday is a little C program:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int
main (int argc, char *argv[])
{
time_t t1;
char *error=0;
struct tm endtime;
int diffday;

/* wrong number of arguments */
if (argc!=2) {
printf("\nusage: diffday DD.MONTH.YYYY\n");
exit (1);
}

time (&t1); /* current date in seconds */
endtime = *localtime (&t1); /* seconds to struct tm */

/* correct format ??? */
error=strptime(argv[1], "%d.%h.%Y", &endtime);
if (error==NULL) {
printf("wrong format of argument (DD.MONTH.YYYY)\n");
exit (1);
}

/* this is the difference */
t1=mktime (&endtime);
/* output */
printf( "%d\n", (int) difftime(time(NULL), t1)/86400);
return 0;
}

I have attached the compiled binary for diffday


Chris
Gavin Clarke
Trusted Contributor

Re: Really ugly file deletion problem

I've got a one (twelve) line solution which gets rid of a month's worth. I guess you could put in a few crontab entries, which would keep it down to 60 - 90 files. I don't know if you can afford to do this or not.

I know it's not very clever:

/usr/bin/rm `ls /u01/log/ | grep July` to be run for the cron month 10 (October?) of the year.

This assumes that there are no other files with July in /u01/log that you want to keep.

I guess if I had time I would script it a bit better.

PS: Check what I have written, I'm not ultra confident about it. All I do know is that it cuts out a lot of scripting.
Gavin Clarke
Trusted Contributor

Re: Really ugly file deletion problem

Ah, have just looked at the attachment, they're not all in the same log directory.

Yuk.

I guess it's more like

rm `find /u01 | grep log | grep July`
Then.

DEFINITELY check this with:

find /u01 | grep log | grep July

Besides it's still not the best solution, it's just the best I can manage in the time I've got.
Ralph Grothe
Honored Contributor

Re: Really ugly file deletion problem

Hi Robert,

attached another Perl version very similar to Robin's.
The only difference is that it avoids a couple of foreach loops by preparing a regex to match in advance.
Where it says "print unlink" it actually has to be a real unlink in order to get rid of the file.
I apologize for not using your attached filelist like Robin.
Instead I tested it with only the little stuff listed beyond the __DATA__ token.
Today the critical split should occur on 14 Dec 2002 which is 60 days prior to today.
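
The attachment is not reproduced here. As a rough illustration of the "prepare one regex in advance" idea (written in shell and awk rather than Perl, and not the attached script), the sketch below builds a single alternation of English month names and pulls day, month and year out of a path in one match. The names MONTHS_RE and extract_date are made up for this example.

```shell
#!/bin/sh
# Sketch only: MONTHS_RE and extract_date are hypothetical names for this example.
# One alternation covers all twelve months, so no per-month loop is needed.
MONTHS_RE='January|February|March|April|May|June|July|August|September|October|November|December'

# Print "DD Month YYYY" for a path whose file name embeds a date like 12July2002.
# Prints nothing if no such date is found.
extract_date() {
    printf '%s\n' "$1" | awk -v re="[0-9][0-9]($MONTHS_RE)[0-9][0-9][0-9][0-9]" '
    {
        name = $0
        sub(/.*\//, "", name)                  # drop the directory part
        if (match(name, re)) {
            s = substr(name, RSTART, RLENGTH)  # e.g. 12July2002
            print substr(s, 1, 2), substr(s, 3, length(s) - 6), substr(s, length(s) - 3)
        }
    }'
}
```

For example, extract_date /u01/log1/h02071212July2002170041.Log.Z prints "12 July 2002". The two digits directly in front of the month name are taken as the day, which matches the sample names in this thread.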

HTH
Madness, thy name is system administration
Carlos Fernandez Riera
Honored Contributor

Re: Really ugly file deletion problem

For reference and FUN (based on sed and sort)

sed -e ' s/July/ Jul /
s/August/ Aug /' p1 | sort +2.0 -2.4 -n +1.0 -1.3 -M | sed -e 's/ Jul /July/
s/ Aug /August/'

unsupported
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Really ugly file deletion problem

Okay Bob,

Here's my cut at this:

I use a here document to create an awk script that does the actual removes. The system(rm xxx) call is commented out, so remove the comment when you feel brave. The month names are extracted using the locale LC_TIME command, so this should even work with non-English month names.

The script does assume that caljd.sh is in your path.
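
The attached script is not reproduced in this thread. The following is only a sketch of the same general idea, under several assumptions that differ from the description above: the month names are hard-coded in English instead of being read from the locale, the day difference is computed with an inline Julian day function instead of calling caljd.sh, and the function name list_old_logs and its two optional arguments are invented for the example. It only prints candidates and removes nothing.

```shell
#!/bin/sh
# Sketch only: list_old_logs, its arguments and the inline jdn() are assumptions
# for this example; it is not the caljd.sh-based script mentioned above.
# Usage: list_old_logs [TODAY_YYYYMMDD] [MAX_DAYS] < list-of-paths
list_old_logs() {
    awk -v today="${1:-$(date +%Y%m%d)}" -v maxdays="${2:-60}" '
    # Julian day number of a Gregorian date, integer arithmetic only.
    function jdn(y, m, d,    a, yy, mm) {
        a  = int((14 - m) / 12)
        yy = y + 4800 - a
        mm = m + 12 * a - 3
        return d + int((153 * mm + 2) / 5) + 365 * yy \
               + int(yy / 4) - int(yy / 100) + int(yy / 400) - 32045
    }
    BEGIN {
        n = split("January February March April May June July August " \
                  "September October November December", name, " ")
        for (i = 1; i <= n; i++) {
            num[name[i]] = i                   # month name -> month number
            sep = (i > 1) ? "|" : ""
            re = re sep name[i]
        }
        re = "[0-9][0-9](" re ")[0-9][0-9][0-9][0-9]"
        now = jdn(substr(today, 1, 4) + 0, substr(today, 5, 2) + 0, substr(today, 7, 2) + 0)
    }
    {
        base = $0
        sub(/.*\//, "", base)                  # look at the file name only
        if (!match(base, re)) next             # no embedded date: leave it alone
        s = substr(base, RSTART, RLENGTH)
        d = substr(s, 1, 2) + 0
        m = num[substr(s, 3, length(s) - 6)]
        y = substr(s, length(s) - 3) + 0
        if (now - jdn(y, m, d) > maxdays) print   # candidate for removal
    }'
}
```

Something like find /u01 /u02 -type f | list_old_logs would produce the candidate list. Once that list has been checked by eye, it could be fed to rm. Files dated in the future give a negative age and are never listed.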

Regards, Clay
If it ain't broke, I can fix that.
A. Clay Stephenson
Acclaimed Contributor

Re: Really ugly file deletion problem

To keep you from loading an earlier version, use this version of caljd.sh (2.1).
If it ain't broke, I can fix that.
Robert Fisher_1
Frequent Advisor

Re: Really ugly file deletion problem

Thanks everybody. I actually used A. Clay's awk solution only because I am more comfortable with awk than Perl. My next task is to rename all the files after 09February2003 by appending ".BAD" to the filename. An operator set the application date to 02/09/2004 this weekend.

Thanks to all who responded,
Bob
A. Clay Stephenson
Acclaimed Contributor

Re: Really ugly file deletion problem

Okay Bob,

That was a 2 minute change. Again, test and remove the comment before the system(mv xxx yyy) command. It too expects to read stdin.
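
The attachment is not part of this thread. As a hedged sketch of what such a rename pass could look like (not the attached script), the example below reads paths on stdin and prints an mv command for every file whose embedded date is later than a cutoff. The function name print_bad_renames and the YYYYMMDD cutoff argument are assumptions, month names are hard-coded in English, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch only: print_bad_renames and its cutoff argument are hypothetical.
# Usage: print_bad_renames [CUTOFF_YYYYMMDD] < list-of-paths
# Prints mv commands; nothing is renamed until the output is run through sh.
print_bad_renames() {
    awk -v cutoff="${1:-20030209}" '
    BEGIN {
        n = split("January February March April May June July August " \
                  "September October November December", name, " ")
        for (i = 1; i <= n; i++) {
            num[name[i]] = i
            sep = (i > 1) ? "|" : ""
            re = re sep name[i]
        }
        re = "[0-9][0-9](" re ")[0-9][0-9][0-9][0-9]"
    }
    {
        base = $0
        sub(/.*\//, "", base)
        if (!match(base, re)) next
        s = substr(base, RSTART, RLENGTH)
        # Build a sortable YYYYMMDD key from the embedded date.
        key = sprintf("%04d%02d%02d", substr(s, length(s) - 3),
                      num[substr(s, 3, length(s) - 6)], substr(s, 1, 2))
        if (key + 0 > cutoff + 0) printf "mv %s %s.BAD\n", $0, $0
    }'
}
```

Reviewing the printed commands first and then piping them to sh keeps the dry-run habit used elsewhere in this thread.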

Regards, Clay
If it ain't broke, I can fix that.