Operating System - Linux

Re: awk substitution help

lawrenzo_1
Super Advisor

awk substitution help

hi,

I am trying to change a field within a file using awk and am having some difficulty ...

the file dates are displayed

file month year
a Sept 2006
b Oct 23:30
b1 Nov 16:00
c Dec 12:20
d Jan 02:13
e Feb 14:23
f Mar 22:01
g Apr 04:34

I want to change the time stamp so that Oct, Nov and Dec display 2006 and Jan, Feb, Mar and Apr display 2007.


here is the syntax I have worked out so far:

awk '{filename=$1; month=$2; year=$3}; {if (substr(year,1,3) !=200) {year="unknown"}} {print filename,month,year}' datesort.lst

This will change the field to unknown.

I can pipe this into another file and run something similar several times to change the month; however, I am sure there is a much more efficient way.

any ideas?

Thanks

Chris
hello
13 REPLIES 13
Sandman!
Honored Contributor

Re: awk substitution help

Not sure I understand... If you want to replace the time stamp with the year, why is the script replacing it with "unknown"? A sample of the desired output would help.

~thanks
Hein van den Heuvel
Honored Contributor

Re: awk substitution help

Hey Chris,

"what is the real problem you are trying to solve?"

We caught a glimpse of that in your other posting.
From there I would encourage you NOT to use the ls -l output with the date problem presented here. Or, if you must use the listing, go to the file and stat it to get the date rather than parse it.

Still, if you want to move forward in this direction, consider the following awk example.
Adapt to your detail needs.

awk '$3 ~ /:/ {$3=2007} {print}' datesort.lst


Hein.
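
Hein's suggestion to ask the file for its date rather than parse the listing could look something like the sketch below. It assumes GNU coreutils, where "date -r FILE" prints the file's last modification time; the helper name list_with_year is made up for illustration and is not from the thread.

```shell
# Hypothetical helper (not from the thread): print "name month year" per file.
# Assumes GNU date, where -r FILE means "use FILE's modification time".
list_with_year() {
    for f in "$@"; do
        # %b = abbreviated month name (locale dependent), %Y = four-digit year
        printf '%s %s\n' "$f" "$(date -r "$f" '+%b %Y')"
    done
}
```

Called as "list_with_year AST0000016 AST0000017", it always gets the real year from the filesystem, so the six-month display rule of ls never comes into play.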
James R. Ferguson
Acclaimed Contributor
Solution

Re: awk substitution help

Hi Chris:

# perl -pe 's{(oct|nov|dec)\s+(\d\d:\d\d)}{$1 2006}i;s{(jan|feb|mar|apr)\s+(\d\d:\d\d)}{$1 2007}i' datesort.lst

...and if you want to update, in-place do:

# perl -pi.old -e 's{(oct|nov|dec)\s+(\d\d:\d\d)}{$1 2006}i;s{(jan|feb|mar|apr)\s+(\d\d:\d\d)}{$1 2007}i' datesort.lst

...which preserves the original file with a suffix of ".old".

Regards!

...JRF...
lawrenzo_1
Super Advisor

Re: awk substitution help

OK, I posted the incorrect string ...

first I came up with this:

awk '{filename=$9; month=$7; year=$8}; {if (substr(year,1,3) !=200) {year="unknown"}} {print filename,month,year}' toberemoved.lst

this gave me the output

dailyReturns03022007.unl Feb unknown
dailyReturns03032006.unl Mar 2006
dailyReturns03042006.unl Apr 2006
dailyReturns03052006.unl May 2006
dailyReturns03062006.unl Jun 2006
dailyReturns03072006.unl Jul 2006
dailyReturns03082006.unl Aug 2006
dailyReturns03092006.unl Sep 2006
dailyReturns03102006.unl Oct 2006
dailyReturns03112005.unl Nov 2005
dailyReturns03112006.unl Nov unknown
dailyReturns03122006.unl Dec unknown
dailyReturns04012006.unl Jan 2006
dailyReturns04012007.unl Jan unknown
dailyReturns04022006.unl Feb 2006
dailyReturns04022007.unl Feb unknown
dailyReturns04032006.unl Mar 2006

the original toberemoved file looked like this

-rw-r--r-- 1 2865 staff 38 25 Oct 2005 AST0000002.csv
-rw-r--r-- 1 2865 staff 37 04 Nov 2005 AST0000003.csv
-rw-r--r-- 1 2865 staff 37 07 Nov 2005 AST0000004.csv
-rw-r--r-- 1 950 staff 43 21 Sep 2006 AST0000011
-rw-r--r-- 1 950 staff 43 28 Sep 2006 AST0000012
-rw-r--r-- 1 950 staff 43 29 Sep 2006 AST0000013
-rw-r--r-- 1 950 staff 43 05 Oct 2006 AST0000014
-rw-r--r-- 1 950 staff 43 05 Oct 2006 AST0000015
-rw-r--r-- 1 950 staff 43 17 Oct 18:42 AST0000016
-rw-r--r-- 1 950 staff 43 17 Oct 18:42 AST0000017
-rw-r--r-- 1 950 staff 43 18 Oct 08:41 AST0000018
-rw-r--r-- 1 950 staff 43 18 Oct 08:41 AST0000019
-rw-r--r-- 1 950 staff 43 18 Oct 14:42 AST0000020
etc
etc

I thought it might be easier to pipe the output with unknown to another file, then work on unknown to display either 2006 for last year's months or 2007 for this year's, but I am sure you can do this in one hit.

so

I have come up with:

awk '{filename=$1; month=$2; year=$3}; {if (substr(month,1,3) !=^[OND]) {year="2007"}} {print filename,month,year}' datesort.lst.unkown

which changes the 3rd field to 2007 for all files.

:(

I am working through an awk book by Alfred V. Aho, which is pretty slow going, especially when I am working in support, so any help / pointers will be appreciated.

Thanks
hello
lawrenzo_1
Super Advisor

Re: awk substitution help

The issue I have with the systems here is that there has been no file archiving on some audit and transaction files, so across 5 filesystems there are over 40 million files, none larger than 1 MB. I have been given the thankless task of sorting it out.

All the tips and tricks you guys have provided will be implemented once I have a policy. The main issue is that the files are dumped all over the place and most of the data cannot be deleted!
hello
Hein van den Heuvel
Honored Contributor

Re: awk substitution help

Here is something really close to what you did. Rather than looking for 'known months', would it not be safer to look for something that looks like a timestamp instead of a year?

$ awk '{filename=$9; month=$7; year=$8}; year ~ /:/ { year=2007} {print filename,month,year}' orig


That's really 3 code blocks, with 1 conditional in the middle looking for the match.


It can also be written as:

$ awk '{filename=$9; month=$7; year = /:/? 2007 : $8; print filename,month,year}' x

Just 1 codeblock with a conditional assignment:

year = /:/? 2007 : $8;

Read as:

If there is a ":" anywhere on the line then year becomes 2007 (should be 'current year') else year becomes field 8.

Hein.
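
One thing worth checking about the bare /:/ form: it tests the whole line, not just field 8. The sketch below compares it with a field-only test, using a made-up listing line (not from the thread) whose file name happens to contain a colon.

```shell
# Hypothetical sample line: a file whose *name* contains a colon, dated 2005.
line='-rw-r--r-- 1 950 staff 43 07 Nov 2005 backup:old.csv'

# Bare /:/ tests the whole record, so the colon in the name triggers it.
whole=$(printf '%s\n' "$line" | awk '{ year = /:/ ? 2007 : $8; print $9, $7, year }')

# $8 ~ /:/ tests only the year-or-time field, so the 2005 is kept.
field=$(printf '%s\n' "$line" | awk '{ year = ($8 ~ /:/) ? 2007 : $8; print $9, $7, year }')

printf '%s\n%s\n' "$whole" "$field"
```

The first form reports 2007 for this line and the second keeps 2005, so testing $8 directly is probably the safer habit when file names are not under your control.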


lawrenzo_1
Super Advisor

Re: awk substitution help

Thanks all for the help,

I may go with the perl solution and be done with it; however, I would prefer to understand the syntax. Hein, this works, but there are months from 2006 with the time stamp (oct, nov, dec).
hello
Sandman!
Honored Contributor

Re: awk substitution help

>If there is a ":" anywhere on the line then year becomes 2007 (should be 'current year') else year becomes field 8.

Not necessarily; see ls(1) for details. Any past date within six months of the current date will be printed in the "month timestamp" format instead of the "month year" format.
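
Putting the six-month point together with Hein's timestamp test, a single awk pass might look like the sketch below. It assumes the ls -l layout shown earlier in the thread (month in $7, year or time in $8, name in $9) and hard-codes the two years Chris asked for; fix_years is a made-up name, and the third sample line is invented for illustration.

```shell
# Hypothetical helper (not from the thread).
# Assumes: month=$7, year-or-time=$8, name=$9, and a listing taken in early 2007.
fix_years() {
    awk '
        $8 ~ /:/ {                        # field 8 is a time, so the year is implied
            m = tolower(substr($7, 1, 3))
            # Oct/Nov/Dec must be last year; any other timestamped month is this year
            $8 = (m == "oct" || m == "nov" || m == "dec") ? 2006 : 2007
        }
        { print $9, $7, $8 }
    ' "$@"
}

printf '%s\n' \
    '-rw-r--r-- 1 950 staff 43 05 Oct 2006 AST0000015' \
    '-rw-r--r-- 1 950 staff 43 17 Oct 18:42 AST0000016' \
    '-rw-r--r-- 1 950 staff 43 03 Feb 09:15 AST0000099' | fix_years
```

Lines that already carry a year pass through unchanged; only timestamped lines get a year, chosen by month. The month list would need adjusting if the listing were taken at a different time of year.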
James R. Ferguson
Acclaimed Contributor

Re: awk substitution help

Hi (again) Chris:

The Perl snippet is largely regular expressions as Perl does them best:

# perl -pe 's{(oct|nov|dec)\s+(\d\d:\d\d)}{$1 2006}i;s{(jan|feb|mar|apr)\s+(\d\d:\d\d)}{$1 2007}i' datesort.lst

The '-p' switch creates a read-loop for the file specified as an argument to the script.

The 's' says "substitute" and the trailing "i" says to find matches to substitute case-insensitively.

The (oct|nov|dec) notation says match any one of these three-character strings. The '\s+' says that one or more whitespace characters must follow. The '\d' stands for a digit. By enclosing things in parentheses we can remember the pieces of what we match. The first group is referenced as $1; the second as $2, etc. Then, we can back-reference the parts of what we match to create the substitution.

Regards!

...JRF...
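
For anyone without Perl to hand, the same capture-and-back-reference idea can be sketched with sed. This assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do) and month names capitalised as in the sample file; unlike the Perl version it is not case-insensitive. The name stamp_years is made up for illustration.

```shell
# Hypothetical helper (not from the thread).
# (Oct|Nov|Dec) is captured as group 1 and re-emitted as \1 in the replacement.
# [[:space:]]+ plays the role of \s+, and [0-9]{2}:[0-9]{2} the role of \d\d:\d\d.
stamp_years() {
    sed -E \
        -e 's/(Oct|Nov|Dec)[[:space:]]+[0-9]{2}:[0-9]{2}/\1 2006/' \
        -e 's/(Jan|Feb|Mar|Apr)[[:space:]]+[0-9]{2}:[0-9]{2}/\1 2007/' \
        "$@"
}

printf '%s\n' 'b Oct 23:30' 'd Jan 02:13' 'a Sept 2006' | stamp_years
```

The "a Sept 2006" line passes through untouched because it has no hh:mm time for the pattern to match.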