Re: Script enhancement

Ferdie Castro · ‎11-17-2003

Hi All,

I have a problem right now. I have a list of files: filea, fileb, filec, ... (file*; there can be more). Each file contains comma-delimited (",") data like the sample below:
2003-11-09 04:13:07,baddete,30000,5,35,2003-11-08 23:59:29
2003-11-09 04:12:43,gerry,30000,35,85,2003-11-08 23:59:14
2003-11-09 04:13:32,lance,30000,35,35,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,178,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,605,2003-11-08 23:59:49

I want to make a script that prints the name of the person, $2 (for example gerry, lance, baddete), and the number of occurrences where $1 (the date), $4 (5 or 35 in the sample above) and $3 (for example 30000) are the same. It should also print $1, $3 and $4.
Sample output file:

lance 2 2003-11-09 04:13:32 30000 35
$2 (twice) $1 $3 $4 (which means that this event appears twice in all files*)
Can you help me here? Thanks so much.

Henrik BOYE · ‎11-17-2003

Hi,
Use awk:
awk -v FS="," '{print $1 " " $4 " " $3}' filelist
or
for i in file*
do
    awk -v FS="," '{print $1 " " $4 " " $3}' "$i"
done

Graham Cameron_1 · ‎11-17-2003

Ferdie

That's a tall order.

I can get you started with

for f in file?
do
awk -F, '{printf "%s 1 %s %s %s\n", $2, $1, $3, $4}' $f >> intermediate_file
done

Then you'd have to do some sorting on intermediate_file to find the duplicates and sum them.
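
The sort-and-sum step could be sketched with sort and uniq -c. This is only a minimal sketch under assumptions: the temporary directory, the split of the sample rows into filea and fileb, and the variable name dupes are made up for illustration, and the constant 1 column from the printf above is dropped because uniq -c supplies the count.

```shell
#!/bin/sh
# Hypothetical test data: the sample rows from the first post, split over two files.
tmpdir=$(mktemp -d)
cat > "$tmpdir/filea" <<'EOF'
2003-11-09 04:13:07,baddete,30000,5,35,2003-11-08 23:59:29
2003-11-09 04:12:43,gerry,30000,35,85,2003-11-08 23:59:14
EOF
cat > "$tmpdir/fileb" <<'EOF'
2003-11-09 04:13:32,lance,30000,35,35,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,178,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,605,2003-11-08 23:59:49
EOF

# Reduce each row to "name date time $3 $4", count identical lines with
# sort + uniq -c, then keep only the groups that occur more than once.
dupes=$(cat "$tmpdir"/file? |
    awk -F, '{ printf "%s %s %s %s\n", $2, $1, $3, $4 }' |
    sort | uniq -c |
    awk '$1 > 1 { print $2, $1, $3, $4, $5, $6 }')

echo "$dupes"
rm -rf "$tmpdir"
```

With these five rows the three lance lines share $1, $3 and $4, so the sketch reports a count of 3 for lance and nothing for the other two names.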

-- Graham

Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.

TSaliba · ‎11-17-2003

hi

cd dir
cat file* > file_all
while read -r line
do
    # split the current line on commas and pick out the fields
    DATE=`echo "$line" | awk -F, '{ print $1 }'`
    NAME=`echo "$line" | awk -F, '{ print $2 }'`
    VAL1=`echo "$line" | awk -F, '{ print $3 }'`
    VAL2=`echo "$line" | awk -F, '{ print $4 }'`

    # count the lines that contain the same date and both values
    COUNT=`grep "$DATE" file_all | grep "$VAL1" | grep -c "$VAL2"`
    echo "$NAME $COUNT $DATE $VAL1 $VAL2"
done < file_all

NB: NOT TESTED

TS

jj

Ferdie Castro · ‎11-17-2003

To simplify everything, I ran
cat file* > masterfile
Now I need the number of occurrences for $2
where $1, $3 and $4 are the same.
The output file can be
$2 occurrences= x $1 $3 $4

lance occurrences= 2 2003-11-09 04:13:32 30000 35

The problem is how to print x.
PS: it should print only entries with more than 1 occurrence.
If you can help me do this with awk, it would be much faster.

Thanks.
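
One way to get x in a single awk pass is an associative array keyed on the fields that must match. The following is a rough sketch, not a tested HP-UX solution. It assumes that the name ($2) is part of the grouping key along with $1, $3 and $4; the temporary directory and the variable name result are hypothetical.

```shell
#!/bin/sh
# Sample rows copied from the first post, stored as masterfile in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/masterfile" <<'EOF'
2003-11-09 04:13:07,baddete,30000,5,35,2003-11-08 23:59:29
2003-11-09 04:12:43,gerry,30000,35,85,2003-11-08 23:59:14
2003-11-09 04:13:32,lance,30000,35,35,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,178,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,605,2003-11-08 23:59:49
EOF

# Count rows per (name, date, $3, $4); print only groups seen more than once.
result=$(awk -F, '
    { key = $2 SUBSEP $1 SUBSEP $3 SUBSEP $4; count[key]++ }
    END {
        for (key in count) {
            if (count[key] > 1) {
                split(key, f, SUBSEP)
                printf "%s occurrences= %d %s %s %s\n", f[1], count[key], f[2], f[3], f[4]
            }
        }
    }' "$tmpdir/masterfile")

echo "$result"
rm -rf "$tmpdir"
```

No sort is needed here, but the order in which awk walks the array in the END block is unspecified, so pipe the result through sort if the order matters. On the five sample rows the three lance lines share $1, $3 and $4, so x comes out as 3.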

Elmar P. Kolkman · ‎11-17-2003

I think you could do it like this:

sort -d, -k 1,3,4,2 masterfile | awk '
prev == $1 $3 $4 { count++; }
prev != $1 $3 $4 {
    printf "%s occurences = %d %",
        prevlab,count,prev;
    prev=$1 $3 $4;
    prevlab=$2;
    count=0
}
END {
    printf "%s occurences = %d %",
        prevlab,count,prev;
}'

Depending on what you find most important, you could change the sort order from 1,3,4,2 to 2,1,3,4, meaning that you get output per column 2 instead of per date.

Every problem has at least one solution. Only some solutions are harder to find.

Ferdie Castro · ‎11-17-2003

Hi Elmar,
An error occurred; it may be in the usage.
sort: illegal option -- ,
Usage: sort [-AbcdfiMmnru] [-T Directory] [-tCharacter] [-y kilobytes] [-o File]
[-k Keydefinition].. [[+Position1][-Position2]].. [-z recsz] [File]..
occurences = 0 %
root#

TSaliba · ‎11-17-2003

hi
In my reply, the variable COUNT holds x,
so to print only when x > 1, add the following:
if [ "$COUNT" -gt 1 ]
then
    echo ....
else
    :
fi

jj

Henrik BOYE · ‎11-17-2003

Make an awk program, ttt.awk:
# begin ttt.awk

BEGIN {
    LAST = ""
    COUNT = 0
    FS = "|"
}
NR == 1 {
    LAST = $0
    LAST1 = $1
    LAST2 = $2
    COUNT = 1
}
NR > 1 {
    if (LAST == $0) {
        COUNT += 1
    } else {
        print LAST1 " " LAST2 " Count : " COUNT
        COUNT = 1
        LAST = $0
        LAST1 = $1
        LAST2 = $2
    }
}
# print the final group, which the rule above never flushes
END {
    if (NR > 0) print LAST1 " " LAST2 " Count : " COUNT
}

# cut here

Example, with files tt1 and tt2:

cat tt? | awk -v FS="," '{print $2 "|"$1" "$4" "$3 }' | sort | awk -f ttt.awk

Elmar P. Kolkman · ‎11-17-2003

Sorry, I found the problem too. My mistake was with using 'cut' arguments to sort. It should be:
sort -t "," -k 1,1 -k 3,3 -k 4,4 -k 2,2 | .....

(-t instead of -d)

Every problem has at least one solution. Only some solutions are harder to find.

Elmar P. Kolkman · ‎11-17-2003

And some 's'-es are gone from the printf statements, I see. It should be a %s\n on the end of the strings:
printf "%s occurrence = %d %s\n",prevlab,count,prev

Sorry.

Every problem has at least one solution. Only some solutions are harder to find.
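
Putting the corrections in this thread together, the sort-then-compare idea might look like the sketch below. It is an illustration under assumptions rather than the exact script posted above: it gives sort one -k option per key field, passes -F, to awk so the fields are split on commas, starts each group's count at 1, and prints only groups with more than one occurrence. The helper name report_group, the variable name report and the temporary directory are hypothetical.

```shell
#!/bin/sh
# Sample rows copied from the first post, stored as masterfile in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/masterfile" <<'EOF'
2003-11-09 04:13:07,baddete,30000,5,35,2003-11-08 23:59:29
2003-11-09 04:12:43,gerry,30000,35,85,2003-11-08 23:59:14
2003-11-09 04:13:32,lance,30000,35,35,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,178,2003-11-08 23:59:49
2003-11-09 04:13:32,lance,30000,35,605,2003-11-08 23:59:49
EOF

# Sort on the date ($1), then $3, $4 and finally the name ($2), so that
# rows belonging to the same group end up on adjacent lines.
report=$(sort -t, -k1,1 -k3,3 -k4,4 -k2,2 "$tmpdir/masterfile" | awk -F, '
    # Print the finished group if it occurred more than once.
    function report_group() {
        if (count > 1)
            printf "%s occurrences = %d %s\n", prevlab, count, prev
    }
    {
        key = $1 " " $3 " " $4
        # A change in the key or in the name closes the previous group.
        if (NR > 1 && (key != prev || $2 != prevlab)) { report_group(); count = 0 }
        prev = key; prevlab = $2; count++
    }
    END { report_group() }')

echo "$report"
rm -rf "$tmpdir"
```

To group per name first instead of per date, the key order can be changed to -k2,2 -k1,1 -k3,3 -k4,4; the awk part stays the same because it only compares adjacent lines.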
