
Re: Check file script

ust3
Regular Advisor

Check file script

# vi file_for_compare
file1.txt;200712031200
file2.txt;200712041457
file3.txt;200712041451
file4.txt;200712051512

I have the file above; each line is a file name followed by a date and time. I would like to compare it with the files in a directory, as below:

#ls /tmp/directory_for_compare
-rw-r--r-- 1 user edp 4324 Dec 04 14:57 file1
-rw-r--r-- 1 user edp 4324 Dec 04 14:57 file3
-rw-r--r-- 1 user edp 4324 Dec 05 18:57 file12

I would like to find out which files are missing each day, that is, the file names that file_for_compare has but directory_for_compare does not, and vice versa. For example, assume today is 04-Dec, so only files whose creation date is 04-Dec are compared. In file_for_compare, file2 and file3 need to be compared; in directory_for_compare, file1 and file3 need to be compared (because only these files have a creation date of 04-Dec). So I would like output like the below:


file_for_compare have but directory_for_compare do not have
"file2"

file_for_compare do not have but directory_for_compare have
"file1"


If today is 5-Dec , then only compare 5-Dec, if today is 6-Dec then only 6-Dec etc , no date input is required .

can advise how to write the script ? thx in advance.
24 REPLIES
Dennis Handly
Acclaimed Contributor
Solution

Re: Check file script

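You need to convert today's date into two separate search patterns, one for each data source. TODAY1 would have problems for files over 6 months old (ll then prints the year in place of the time, and the pattern carries no year).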

TMP=/var/tmp/dc.$$

TODAY1=$(date +"%b %d")
TODAY2=$(date +"%Y%m%d")

echo $TODAY1 # debugging
echo $TODAY2 # debugging

awk -F";" -v PAT=$TODAY2 '
$2 ~ PAT { print $1 }' file_for_compare | sort > $TMP.fff

ll /tmp/directory_for_compare | awk -v PAT="$TODAY1" '
$0 ~ PAT { print $NF }' | sort > $TMP.dff

echo "Files not in directory:"
comm -23 $TMP.fff $TMP.dff
echo "Files not in control file:"
comm -13 $TMP.fff $TMP.dff

rm -f $TMP.fff $TMP.dff
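
For readers unfamiliar with comm(1), the sketch below shows what the two calls at the end of the script do. The function name, the file names and the temporary directory are made up for illustration; both inputs must already be sorted.

```shell
#!/bin/sh
# Illustration only: names and the temp directory are hypothetical.
# show_diff CONTROL_LIST DIR_LIST : both files hold one name per line, sorted.
show_diff() {
    echo "Files not in directory:"
    comm -23 "$1" "$2"     # lines found only in the first file
    echo "Files not in control file:"
    comm -13 "$1" "$2"     # lines found only in the second file
}

WORK=${TMPDIR:-/tmp}/comm_demo.$$
mkdir -p "$WORK"
printf '%s\n' file2 file3 | sort > "$WORK/fff"   # names from the control file
printf '%s\n' file1 file3 | sort > "$WORK/dff"   # names from the directory
RESULT=$(show_diff "$WORK/fff" "$WORK/dff")
echo "$RESULT"
rm -rf "$WORK"
```

With these two lists, file3 is common to both and is suppressed, which is the behaviour the original question asks for.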
ust3
Regular Advisor

Re: Check file script

thx reply,

I tried the above script , but there is no output , I would like to ask what is PAT mean ? do I still need to modify before use this script ?


thx
Dennis Handly
Acclaimed Contributor

Re: Check file script

>I tried the above script but there is no output

The script won't give output if "today" isn't in Dec 3 through Dec 5. You said:
If today is 5-Dec, then only compare 5-Dec, if today is 6-Dec then only 6-Dec etc, no date input is required.

I only check the files for today, not every file and entry in file_for_compare. Also because of that, I don't bother checking hours and minutes. (Running the script at midnight would have problems.)-:

>would like to ask what is PAT mean? do I still need to modify before use this script?

PAT is the awk string variable that holds "today"; the ~ operator selects the lines that match it. There are two date "styles" for the two data sources.
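
To make the role of PAT concrete: -v copies a shell value into an awk variable before the program runs, and ~ is a regular-expression match, so the date only has to appear somewhere in the field. A small sketch using the sample control file from the first post; the function name and the ^ anchor are additions for illustration, not part of the script above.

```shell
#!/bin/sh
# names_for_date YYYYMMDD : read "name;YYYYMMDDhhmm" lines on stdin and
# print the names whose timestamp starts with the given date.
names_for_date() {
    awk -F";" -v PAT="^$1" '$2 ~ PAT { print $1 }'
}

SAMPLE='file1.txt;200712031200
file2.txt;200712041457
file3.txt;200712041451
file4.txt;200712051512'

MATCHED=$(printf '%s\n' "$SAMPLE" | names_for_date 20071204)
echo "$MATCHED"
```

Anchoring with ^ is a design choice: without it, a date string could in principle also match inside the hour and minute digits.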
ust3
Regular Advisor

Re: Check file script

thx reply ,

I found there is single ' sign in the below statement , is it correct ? thx


awk -F";" -v PAT=$TODAY2 '
$2 ~ PAT { print $1 }' file_for_compare | sort > $TMP.fff

Dennis Handly
Acclaimed Contributor

Re: Check file script

>I found there is single ' sign in the below statement, is it correct?
awk -F";" -v PAT=$TODAY2 '
$2 ~ PAT { print $1 }' file_for_compare | sort > $TMP.fff

Yes, one on the first line to start the awk script and the other on the last line to end it. I usually break up complex awk scripts like that but since it is only one line, you can put them together.
ust3
Regular Advisor

Re: Check file script

thx reply ,

I would like to modify something , if I NOT ONLY compare today's file , I would like to compare ALL file name , no matter what date it is , as below file , there are different date ( 03Dec , 04Dec & 05Dec ) , I would like to compare all these files , can advise how to do it ? thx

file1.txt;200712031200
file2.txt;200712041457
file3.txt;200712041451
file4.txt;200712051512


ps. the file name is unique .

Dennis Handly
Acclaimed Contributor

Re: Check file script

>I would like to compare ALL file name

(If you want to ignore the date parts completely, ignore lots of this.)

Well, first you have to convert from one format to the other so you can just use comm(1) to compare the files. I would suggest you convert file1.txt;200712031200 to:
file1.txt Dec 03 12:00
(This won't work for files older than 6 months, without more work. It will also fail if you aren't in the American Nerd locale (C).)

Let us assume you have file_for_compare and the output of that "ll /tmp/directory_for_compare" in a file, directory_for_compare.

Should I assume that file1.txt is the same as file1?
Did you want to compare by just the name or including the date?
TMP=/var/tmp/dc.$$
awk -F";" '
BEGIN {
mon[1] = "Jan"; mon[2] = "Feb"; mon[3] = "Mar"; mon[4] = "Apr"
mon[5] = "May"; mon[6] = "Jun"; mon[7] = "Jul"; mon[8] = "Aug"
mon[9] = "Sep"; mon[10] = "Oct"; mon[11] = "Nov"; mon[12] = "Dec"
}
{
name = $1
i = index(name, ".txt")
if (i != 0)
name = substr(name, 1, i - 1)
year = substr($2, 1, 4)
Mon = substr($2, 5, 2) + 0 # numeric, so "03" selects mon[3]
day = substr($2, 7, 2)
#hr = substr($2, 9, 2)
#min = substr($2, 11, 2)
print name, mon[Mon], day #, hr ":" min
} ' file_for_compare | sort > $TMP.fff

#-rw-r--r-- 1 user edp 4324 Dec 04 14:57 file1
awk '
{
print $9, $6, $7 #, $8
} ' directory_for_compare | sort > $TMP.dff

echo "Files not in directory:"
comm -23 $TMP.fff $TMP.dff
echo "Files not in control file:"
comm -13 $TMP.fff $TMP.dff

rm -f $TMP.fff $TMP.dff
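
The conversion step on its own, as a sketch. Two things here are assumptions rather than part of the post: the month table is built with split() instead of twelve assignments, and + 0 is applied to the month so that a value such as "03" is used as the number 3 when indexing the table. The function name is hypothetical.

```shell
#!/bin/sh
# control_to_name_date : read "name.txt;YYYYMMDDhhmm" on stdin and
# print "name Mon DD", for example "file1 Dec 03".
control_to_name_date() {
    awk -F";" '
    BEGIN { split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", mon, " ") }
    {
        name = $1
        sub(/\.txt$/, "", name)          # drop a trailing .txt if present
        m   = substr($2, 5, 2) + 0       # numeric month: "03" becomes 3
        day = substr($2, 7, 2)           # day keeps its leading zero
        print name, mon[m], day
    }'
}

OUT=$(printf '%s\n' 'file1.txt;200712031200' 'file9.txt;200803051200' |
      control_to_name_date)
echo "$OUT"
```

The second input line is invented to exercise a month with a leading zero.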
ust3
Regular Advisor

Re: Check file script

thx reply,

sorry , I have a last requirement .

I sure that the date in the control file MUST be the same date , for example , like the below file , all files are the SAME date (3-Dec) . But it may be not today's date (today is 17-Dec , but the date in the control file is 3-Dec) .

file1.txt;200712031200
file2.txt;200712031457
file3.txt;200712031451
file4.txt;200712031512
" ;200712031551
" ;200712031541
" ;200712031554

so I would like to compare the date ( in the above example , it is 3-Dec ) in the control file with the files that are the same date ( so it is also 3-Dec) in the directory .

ps. the new script should let me compare not only today's files, but also do the comparison on any day after I have received the control file .

Thx again.
Dennis Handly
Acclaimed Contributor

Re: Check file script

>I have a last requirement. Insure that the date in the control file MUST be the same date

My last script does that. If you want to compare the time you need to remove the "#".
#hr = substr($2, 9, 2)
#min = substr($2, 11, 2)
print name, mon[Mon], day #, hr ":" min
ust3
Regular Advisor

Re: Check file script

thx reply ,

but when I run the script , it pop the error,

awk: cmd. line:5: fatal: file `directory_for_compare' is a directory .

can advise what is wrong ? thx
Dennis Handly
Acclaimed Contributor

Re: Check file script

>but when I run the script, it pop the error,
awk: cmd. line:5: fatal: file `directory_for_compare' is a directory .

In my script, directory_for_compare was a file that contained the ll(1) output, not the directory itself. To read the directory directly, just change those lines to:
ll directory_for_compare | awk '
{
print $9, $6, $7 #, $8
} ' | sort > $TMP.dff
ust3
Regular Advisor

Re: Check file script


Hi Dennis,

thx your reply and reply again.

I tried your script, and we have two problems.

1. As in my request above, all file dates in the control file are 3-Dec, so I only want to check the files with the SAME date (also 3-Dec) in the directory (because there are files with many different dates in the directory, and I don't want to list all of them); in the same way, if the file date in the control file is 4-Dec, it should compare the files in the directory that are 4-Dec;
2. There are many files of various types in the directory; if I only want to compare .txt, can you advise what I can do?

thx
Dennis Handly
Acclaimed Contributor

Re: Check file script

>1. I only want to check the files with the SAME date (3-Dec also) in the directory,

That's what the script does. Compares both the name and the date. And then does this for ALL entries in the control file and directory.

Otherwise I need a specific example of what you want.

>2. if I only want to compare .txt, can advise what can i do?

It would help if you listed a new control file and directory. But if you only want to compare .txt files, you use "ll *.txt".
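
If the listing has already been captured, the same restriction can also be applied inside awk rather than in the ll argument. A sketch, assuming the file name is the last field and contains no spaces; the function name and the sample listing are hypothetical.

```shell
#!/bin/sh
# txt_only : read "ls -l" style lines on stdin and print name, month and day
# for entries whose name ends in .txt.
txt_only() {
    awk '$NF ~ /\.txt$/ { print $NF, $6, $7 }'
}

LISTING='-rw-r--r-- 1 user edp 4324 Dec 03 14:57 file1.txt
-rw-r--r-- 1 user edp 4324 Dec 03 14:57 notes.log
-rw-r--r-- 1 user edp 4324 Dec 03 18:57 file12.txt'

TXT=$(printf '%s\n' "$LISTING" | txt_only)
echo "$TXT"
```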
ust3
Regular Advisor

Re: Check file script

thx Dennis and sorry to ask again,

Maybe I did not state my requirement clearly. Your script works fine for using the control file to compare the files in the directory; as in the above example, where the date is 3-Dec, it can list the files that are in the control file but not in the directory. Now my requirement is that, besides this function, I also want to list the files of the same date (so 3-Dec) that are in the directory but do not exist in the control file.

Below is the part of the script that lists ALL files in the directory, not only those from 3-Dec. Can you advise how to change it? thx

ll directory_for_compare | awk '
{
print $9, $6, $7 #, $8
} ' | sort > $TMP.dff


ps. why use 3-Dec , it is because the date in the control file is 3-Dec.
ust3
Regular Advisor

Re: Check file script

hi Dennis,

I think the first script provided is perfect , what need to change is TODAY1 and TODAY2 is not today's date , it should be 3-Dec , can advise how to change it ? thx



TMP=/var/tmp/dc.$$

TODAY1=$(date +"%b %d")
TODAY2=$(date +"%Y%m%d")

echo $TODAY1 # debugging
echo $TODAY2 # debugging

awk -F";" -v PAT=$TODAY2 '
$2 ~ PAT { print $1 }' file_for_compare | sort > $TMP.fff

ll /tmp/directory_for_compare | awk -v PAT="$TODAY1" '
$0 ~ PAT { print $NF }' | sort > $TMP.dff

echo "Files not in directory:"
comm -23 $TMP.fff $TMP.dff
echo "Files not in control file:"
comm -13 $TMP.fff $TMP.dff

rm -f $TMP.fff $TMP.dff
Dennis Handly
Acclaimed Contributor

Re: Check file script

>I want to compare the same date (so it is 3-Dec) that the files in the directory but it is not exist in the control file.

So instead of picking "today" or ALL dates, you want to give a specific date?

If so, that would have to be a parm to the script.

>it list ALL files in the directory but not only 3-Dec, can advise how to change it?

You would need to go back to the first script.

>I think the first script provided is perfect, what need to change is TODAY1 and TODAY2 is not today's date, it should be 3-Dec, can advise how to change it?

Well, you can work hard and provide both formats to your script:
$ script 20071203 "Dec 03"

Then the script would have:
TODAY2=$1
TODAY1=$2

Or you could take the YYYYMMDD format and generate a Dec 03 format:
TODAY2=$1 # Script arg
#YYYY=${TODAY2%????}
DD=${TODAY2#??????}
MMDD=${TODAY2#????}
MM=${MMDD%??}
set -A MON Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
TODAY1="${MON[${MM#0}-1]} $DD" # leading zero dropped so 08 and 09 are not taken as octal by shells that do so
echo $TODAY1
echo $TODAY2
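
set -A is a ksh feature. Where only a plain POSIX shell is available, the same derivation can be sketched with a case statement. The function name is made up for illustration, and the day keeps its leading zero as in the snippet above.

```shell
#!/bin/sh
# ymd_to_mon_day YYYYMMDD : print "Mon DD", for example "Dec 03".
ymd_to_mon_day() {
    mmdd=${1#????}        # strip the four-digit year
    mm=${mmdd%??}         # first two characters: month
    dd=${mmdd#??}         # last two characters: day
    case $mm in
        01) mon=Jan ;; 02) mon=Feb ;; 03) mon=Mar ;; 04) mon=Apr ;;
        05) mon=May ;; 06) mon=Jun ;; 07) mon=Jul ;; 08) mon=Aug ;;
        09) mon=Sep ;; 10) mon=Oct ;; 11) mon=Nov ;; 12) mon=Dec ;;
        *)  echo "bad month in $1" >&2; return 1 ;;
    esac
    echo "$mon $dd"
}

TODAY2=20071203                       # would come from $1 in the real script
TODAY1=$(ymd_to_mon_day "$TODAY2")
echo "$TODAY1"
echo "$TODAY2"
```

Matching the month as a string avoids any arithmetic on values such as 08 and 09.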
ust3
Regular Advisor

Re: Check file script

thx dennis ,

back to your first script

====================================================================
TMP=/var/tmp/dc.$$

TODAY1=$(date +"%b %d")
TODAY2=$(date +"%Y%m%d")

echo $TODAY1 # debugging
echo $TODAY2 # debugging

awk -F";" -v PAT=$TODAY2 '
$2 ~ PAT { print $1 }' file_for_compare | sort > $TMP.fff

ll /tmp/directory_for_compare | awk -v PAT="$TODAY1" '
$0 ~ PAT { print $NF }' | sort > $TMP.dff

echo "Files not in directory:"
comm -23 $TMP.fff $TMP.dff
echo "Files not in control file:"
comm -13 $TMP.fff $TMP.dff

rm -f $TMP.fff $TMP.dff

========================================================================

it works fine when you post it , I have tried it and get the perfect result , but it is strange that I can't get the result from the part below,

ll /tmp/directory_for_compare | awk -v PAT="$TODAY1" '
$0 ~ PAT { print $NF }' | sort > $TMP.dff

nothing is generated to $TMP.dff , is it because the year is changed to 2008 now ? thx for your advise.
Dennis Handly
Acclaimed Contributor

Re: Check file script

>it works fine when you post it

It only works for files with today's date. You have to look at the other script versions if you want other dates.

>I can't get the result from the part below,
ll /tmp/directory_for_compare | awk -v PAT="$TODAY1" '...
>nothing is generated to $TMP.dff, is it because the year is changed to 2008 now?

You need to show some example filenames, (ll of that directory). 2008 shouldn't matter unless the file is over 6 months old.

It would help if you can clearly state what you now want to do and provide some example data.
ust3
Regular Advisor

Re: Check file script

thx Dennis ,

>You need to show some example filenames, (ll of that directory). 2008 shouldn't matter unless the file is over 6 months old.
>It would help if you can clearly state what you now want to do and provide some example data.

What I want now is that the requirement of my first question ( please ignore my replies except the first post) , I have tried your script when you post it , it works fine , but now when I run it , the .dff is empty , I sure in the directory "directory_for_compare" , there is file which is today's date ,
I found that the output of date +"%b %d" is --> Jan 08
the output of ll directory_for_compare is --> Jan 8 , there is difference in output , is it the reason of the problem ? thx


-rw-r--r-- 1 user edp 4324 Dec 4 14:57 file1
-rw-r--r-- 1 user edp 4324 Dec 4 14:57 file3
-rw-r--r-- 1 user edp 4324 Dec 5 18:57 file12
-rw-r--r-- 1 user edp 4324 Dec 8 07:42 file12
Dennis Handly
Acclaimed Contributor

Re: Check file script

>I found that the output of date +"%b %d" is --> Jan 08
the output of ll directory_for_compare is --> Jan 8, there is difference in output, is it the reason of the problem?

Thanks for the details. That's exactly the problem. When I was looking to select the day of the month I used %d instead of %e, because your example file dates had a leading "0".

So change it to:
TODAY1=$(date +"%b %e")

Hmm, ls(1) is controlled by /usr/lib/nls/msg/*/ls.cat:
11 %2d %b %Y
12 %2d %b %H:%M

And %2d seems to be %e, or printf %2d.
And %d seems to be printf %02d.
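
The difference between the two formats can be checked without touching the directory. A sketch with a hypothetical listing line; printf stands in here for what date and ls produce, and the function name is made up.

```shell
#!/bin/sh
PADDED_ZERO=$(printf '%s %02d' Jan 8)    # same shape as date +"%b %d"
PADDED_SPACE=$(printf '%s %2d' Jan 8)    # same shape as date +"%b %e"

# One listing line with a space-padded day, as ls prints it.
LINE='-rw-r--r-- 1 user edp 4324 Jan  8 07:42 file12'

# count_matches PATTERN : number of listing lines that match PATTERN
count_matches() {
    printf '%s\n' "$LINE" |
        awk -v PAT="$1" '$0 ~ PAT { n++ } END { print n + 0 }'
}

echo "zero-padded pattern matches:  $(count_matches "$PADDED_ZERO")"
echo "space-padded pattern matches: $(count_matches "$PADDED_SPACE")"
```

Note that the variables are quoted wherever they are used; an unquoted expansion would collapse the two spaces into one and hide the difference.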
ust3
Regular Advisor

Re: Check file script

thx your reply,

date +"%b %e" works fine in my case.

Sorry to have one more requirement. Based on this script, I want to compare the files from one day before: if today is 08-Jan-2008, I want to compare the files that are 07-Jan-2008; if today is 09-Jan-2008, then compare the files that are 08-Jan-2008, and so on.

can advise what can i do ? thx



Dennis Handly
Acclaimed Contributor

Re: Check file script

>if I want to compare the file that is one day before ... can advise what can i do?

This is going to be a lot harder. It requires being able to do date arithmetic. This would be easy in C or perl but hard to do in a script. You would need to know how many days in each month and when a leap year occurs.

JRF has a perl solution:
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1173649
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1165476
Clay's script:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1040167
My script for days in a month:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1163079
Clay's script:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1158441
Yesterday:
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1158025
Using cal:
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1067933
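
For completeness, the arithmetic can also be done in the shell itself. The following is a sketch only and is not any of the linked scripts: it assumes a POSIX shell with $(( )) arithmetic and a four-digit-year date in YYYYMMDD form, and both function names are made up.

```shell
#!/bin/sh
# days_in_month YEAR MONTH : print the number of days (Gregorian leap rule).
days_in_month() {
    case $2 in
        4|6|9|11) echo 30 ;;
        2)  if [ $(( $1 % 4 )) -eq 0 ] &&
               { [ $(( $1 % 100 )) -ne 0 ] || [ $(( $1 % 400 )) -eq 0 ]; }
            then echo 29
            else echo 28
            fi ;;
        *)  echo 31 ;;
    esac
}

# yesterday YYYYMMDD : print the previous calendar day as YYYYMMDD.
yesterday() {
    y=${1%????}               # first four characters: year
    md=${1#????}              # remaining MMDD
    m=${md%??}; d=${md#??}
    m=${m#0}; d=${d#0}        # drop leading zeros before doing arithmetic
    d=$(( $d - 1 ))
    if [ "$d" -eq 0 ]; then   # stepped back past the 1st of the month
        m=$(( $m - 1 ))
        if [ "$m" -eq 0 ]; then
            m=12
            y=$(( $y - 1 ))
        fi
        d=$(days_in_month "$y" "$m")
    fi
    printf '%04d%02d%02d\n' "$y" "$m" "$d"
}

yesterday "$(date +%Y%m%d)"
```

The result has the same YYYYMMDD shape as TODAY2, so it could be fed to the control-file pattern in place of today's date.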
ust3
Regular Advisor

Re: Check file script

thx reply,

OH~~ , I feel headache to understand the C language , anyone could help ?? very thanks.
Dennis Handly
Acclaimed Contributor

Re: Check file script

>I feel headache to understand the C language, anyone could help??

With C? that's trivial. See attached C source.
Otherwise you could look at that "yesterday" link above.