
Are there awk versions that provide syntax for computing aggregations?

4 votes
1 answer
715 views
From time to time I find myself writing awk scripts that compute simple statistics, for example a histogram, the average of a value, the standard deviation or the variance. Doing that again and again with helper arrays/variables and for-loops in the END clause feels a bit tedious and error-prone.

DTrace has a quite awesome syntax for such tasks, which it calls aggregations. It is similar to the concept/API of Accumulators in the Boost C++ library.

Thus my question: are there awk variants that provide similar concepts/syntax, allowing convenient and iterative computation of such statistics?

An imaginary example of such syntax:

    $ someawk '{ @time[$1] = avg($2) }' measurements.log
    prog1 150
    prog2 200
    ....

(Here the 1st column contains the program name and the 2nd the runtime of one measurement; measurements.log contains multiple measurements for each program, and the aggregate function avg computes the average.)
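
For comparison, here is a minimal sketch of the manual helper-array approach that the question describes as tedious, written in plain POSIX awk. The sample measurements are made up for illustration, and the array names `sum` and `n` are arbitrary; this is just one way to do it, not a claim about how any awk variant implements aggregations.

```shell
# Hypothetical sample input: program name, runtime of one measurement.
# Keep a running sum and count per program, then divide in the END clause.
result=$(printf '%s\n' 'prog1 100' 'prog2 250' 'prog1 200' 'prog2 150' |
    awk '
        { sum[$1] += $2; n[$1]++ }
        END { for (k in sum) print k, sum[k] / n[k] }
    ' | sort)

echo "$result"
```

The `sort` is there because the iteration order of `for (k in sum)` is unspecified in awk. Every additional statistic (variance, histogram buckets) needs its own helper array and END logic, which is the repetition an aggregation syntax would remove.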
Asked by maxschlepzig (59512 rep)
Dec 25, 2012, 07:00 PM
Last activity: Dec 25, 2012, 11:38 PM