Are there awk versions that provide syntax for computing aggregations?

From time to time I find myself writing awk scripts that compute some simple statistics, for example a histogram, the average of a value, the standard deviation, or even the variance ...

Doing that again and again with helper arrays/variables and for-loops in the END clause etc. feels a little bit tedious and error-prone.

DTrace has a quite awesome syntax for such tasks, which it calls aggregations. It is similar to the concept/API of Accumulators in the Boost C++ library.

Thus my question: are there awk variants which provide similar concepts/syntax that allow for convenient and iterative computation of such statistics?

An imaginary example of such syntax:

    $ someawk '{ @time[$1] = avg($2) }' measurements.log
    prog1    150
    prog2    200
    ....

(where the 1st column contains the program name, the 2nd the runtime of one measurement, measurements.log contains multiple measurements for each program and the aggregate function avg computes the average)
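
For comparison, the helper-array version that I keep rewriting looks roughly like the sketch below. It uses only standard awk; the sample values written to measurements.log and the function name `avg_by_key` are made up for illustration.

```shell
# Hypothetical sample data: program name in column 1, runtime in column 2.
printf '%s\n' 'prog1 100' 'prog2 150' 'prog1 200' 'prog2 250' > measurements.log

# avg_by_key is an illustrative helper name, not part of any awk distribution.
avg_by_key() {
    # Keep a running sum and a count per key, then divide in the END clause.
    # The for-in iteration order is unspecified in awk, so sort the result.
    awk '{ sum[$1] += $2; n[$1]++ }
         END { for (k in sum) printf "%s\t%g\n", k, sum[k] / n[k] }' "$@" | sort
}

avg_by_key measurements.log
```

Every additional statistic (variance, histogram buckets, ...) needs another array plus more code in the END clause, which is exactly the repetition a built-in aggregation syntax would remove.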

Asked by maxschlepzig (59512 rep)

Dec 25, 2012, 07:00 PM
Last activity: Dec 25, 2012, 11:38 PM
