Aggregation script needed - awk

Raynald Boucher
Super Advisor

Hello all,

 

I have a file containing transaction IDs and times, for example:

 

aaaa1 30 ms

aaaa1 40 ms

aaaa1 50 ms

bbbb1 40 ms

bbbb1 60 ms

bbbb1 200 ms

etc

 

I want to produce a report like:

aaaa1 3 (count) 120 (tot) 40 (avg) 30 (min) 50 (max)

bbbb1 3 (count) 300 (tot) 100 (avg) 40 (min) 200 (max)

etc

 

Is there an easy way to do this with awk?

I searched but could not find a way to group the lines so I can use the sum, avg, min and max functions.

 

I'd also like to know where I can find a nice tutorial for awk.

 

Thanks

 

Rayb

4 REPLIES
Hein van den Heuvel
Honored Contributor

Re: Aggregation script needed - awk

 

You can use awk's associative arrays to easily do the grouping.

 

At the end, in the END block, use "for (<var> in <array>)" to loop over the keys.

(edited to add comments)

 

{
  if (0 == count[$1]++) {   # First time we see this key: count was zero.
    sum[$1] = $2;
    min[$1] = $2;
    max[$1] = $2;
  } else {                  # Key seen before: update the running values.
    sum[$1] += $2;
    if ($2 > max[$1]) max[$1] = $2;
    if ($2 < min[$1]) min[$1] = $2;
  }
}
END {
  printf ("%10s %5s %8s %5s %5s %5s\n", "Gizmo", "Count", "tot", "avg", "min", "max");
  printf ("%10s %5s %8s %5s %5s %5s\n", "-----", "-----", "---", "---", "---", "---");

  for (x in count) {        # Main reporting loop, driven by the keys of the array.
    printf ("%10s %5d %8d %5d %5d %5d\n",
            x, count[x], sum[x], sum[x]/count[x], min[x], max[x]);
  }
}

 

Usage example:

 

$ cat tmp.txt
aaaa1 30 ms
aaaa1 40 ms
aaaa1 50 ms
bbbb1 40 ms
bbbb1 60 ms
bbbb1 200 ms

Administrator@DX2200-HEIN /cygdrive/c/temp
$ awk -f tmp.awk tmp.txt
     Gizmo Count      tot   avg   min   max
     ----- -----      ---   ---   ---   ---
     aaaa1     3      120    40    30    50
     bbbb1     3      300   100    40   200
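
One caveat: awk does not specify the order in which "for (x in count)" visits the keys, so the rows may come out in a different order on another awk implementation. A minimal sketch of one way around that, assuming a POSIX shell with sort available: print the data rows without the header and pipe them through sort. The function name aggregate is only an illustrative wrapper, not part of the script above.

```shell
# Sketch: same per-key aggregation, with rows piped through sort
# so the output order does not depend on the awk implementation.
# "aggregate" is a hypothetical helper name used for illustration.
aggregate() {
  awk '
    {
      if (0 == count[$1]++) {              # first time this key is seen
        sum[$1] = min[$1] = max[$1] = $2 + 0
      } else {                             # key seen before: update values
        sum[$1] += $2
        if ($2 > max[$1]) max[$1] = $2
        if ($2 < min[$1]) min[$1] = $2
      }
    }
    END {
      # One row per key: key count total average min max
      for (x in count)
        printf "%s %d %d %d %d %d\n", x, count[x], sum[x], sum[x]/count[x], min[x], max[x]
    }' "$@" | sort
}
```

Called as "aggregate tmp.txt", it reads the same input file as above. Like the original, it prints the average with %d, which drops any fractional part.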

Good luck,

Hein

 

Hein van den Heuvel
Honored Contributor

Re: Aggregation script needed - awk

>> I'd also like to know where I can find a nice tutorial for awk. 

 

I'm pleased to see you ask, but... has Google stopped working for you?

If I feed that whole line, or just "awk tutorial", into Google, it comes back with many good suggestions.

 

There are also many AWK books.

I would want (and have) the original book, written by Messrs. A, W and K:

"The AWK Programming Language" by Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger.

Addison-Wesley, 1988. ISBN 0-201-07981-X.

It is oddly expensive though, and surely there are better books for less money 20+ years later.

 

Cheers,

Hein

Raynald Boucher
Super Advisor

Re: Aggregation script needed - awk

Thanks much for the solution.

 

I had thought of that procedure but was wondering if there wasn't an even quicker way... it doesn't always pay to be lazy.

 

As for the documentation, yes, I did google awk. But, as you said, a great many hits are returned, and too many of them are unrelated to my query. I got fed up looking through them and was asking for a shortcut. I had saved a nice one years ago but lost it when my machine went TU a few months ago.

 

Thanks again.

 

RayB

 

PS: How do we assign points in this new forum? Or is that a thing of the past?

RB

Hein van den Heuvel
Honored Contributor

Re: Aggregation script needed - awk

Points are out, 'kudos' are in.

 

You may also want to mark the topic, or a specific reply, as 'solved' to help future visitors.

 

Hein.