regular awk - easily sort array indexes to output them in the chosen order
10 votes, 2 answers, 5016 views
[edit: clarified that I need an *in awk* solution, and corrected that I need to sort 'indexes' (or rather, output them in a sorted way) instead of the ambiguous 'values']
In awk, I often count things, or store a set of values, inside an array, using the values as indices (taking advantage of awk's indexes_are_hashes mechanism).
For example, if I want to know how many different values of $2 I encountered, and how often each value was seen:
awk '
... several different treatments ...
{ count[$2]++ }
... other treatments ...
END { for(str in count) {
print "counted: " str " : " count[str] " times."
... and other lines underneath, with additional infos ...
}
}
'
The problem is that regular awk and regular nawk (as opposed to GNU awk or other nicer versions):
- [A] don't output the different values in the order in which they were 'encountered',
- [B] nor provide an easy way to go through the indexes in either numerical or alphabetical order
For [A]: not too difficult to do; just keep another array that indexes the "newly seen" entries.
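The second-array idea for [A] can be sketched as follows. This is only a minimal illustration, assuming a POSIX-style awk; the function name `count_first_seen` and the array name `order` are hypothetical, not taken from the original script.

```shell
# Hypothetical helper: count input lines and report them in first-seen order.
count_first_seen() {
  awk '
    # The first time a value shows up, store it in the next integer slot.
    !($0 in count) { order[++n] = $0 }
    { count[$0]++ }
    END {
      # Walk the integer slots instead of using for (str in count).
      for (i = 1; i <= n; i++)
        printf("%s:%d ", order[i], count[order[i]])
      print ""
    }
  '
}

printf 'b\na\nb\nc\na\nb\n' | count_first_seen
```

Because the loop runs over 1..n rather than over the hash, the output order no longer depends on awk's internal traversal order; here it prints `b:3 a:2 c:1`.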
The QUESTION is for [B]: **How can I make a simple call to sort to reorder the display of the different indexes?**
(Note: I am aware that GNU awk has an "easy" way for [B]: https://www.gnu.org/software/gawk/manual/html_node/Controlling-Array-Traversal.html ... but I want a way to do something similar in regular awk/nawk!)
(i.e., I need to loop over the different indexes seen, sort them, and re-read them [in an old awk...] into "something" (e.g., another array ordered_seen?), then use that something to display the seen[s] in the chosen order. And this needs to be *inside awk*, as under each index I often need to output a paragraph of additional info. A "sort" outside of awk would reorder everything.)
So far, I have found no "axiomatic" one-liner (or n-liner?) way to do that.
I end up with a kludge that takes several lines: it outputs each value to a file through sort, then re-reads that sorted file and inserts each line in order into sorted_countindexes[n++], and then for(i=0;ifileA
printf 's f g r e d f g e z s d v f e z a d d g r f e a\ns d f e r\n'>fileB
# and the awk loop: It outputs in 'whatever order', I want in 'alphabetical order'
for f in file? ; do printf 'for file: %s: ' "$f"
tr ' ' '\n' < "$f" | awk '
{ count[$0]++ }
END { for(str in count){
printf("%s:%d ",str,count[str])
}; print ""
} '
done
#this outputs:
for file: fileA: d:3 e:5 f:3 g:1 r:4 s:6 z:2 a:5 b:1 c:3
for file: fileB: d:5 e:5 f:5 g:3 r:3 s:3 v:1 z:2 a:2
# I'd like to have the letters outputted in alphabetical order instead!
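For reference, the sort-and-re-read kludge described earlier could look roughly like the sketch below. It is a hedged illustration only: the function name `count_sorted`, the array name `sorted`, and the use of `mktemp` for the scratch file are assumptions, and it will misbehave if an index contains a newline.

```shell
# Hypothetical helper: count input lines, then print them in sorted index order.
count_sorted() {
  tmp=$(mktemp) || return 1          # scratch file (mktemp assumed available)
  awk -v tmp="$tmp" '
    { count[$0]++ }
    END {
      cmd = "sort > \"" tmp "\""
      for (str in count) print str | cmd   # feed every index to sort
      close(cmd)                           # wait until sort has written the file
      n = 0
      while ((getline line < tmp) > 0)     # re-read the indexes, now in order
        sorted[++n] = line
      close(tmp)
      for (i = 1; i <= n; i++)
        printf("%s:%d ", sorted[i], count[sorted[i]])
      print ""
    }
  '
  rm -f "$tmp"
}

# Same data as fileB above, fed inline so the example is self-contained.
printf 's f g r e d f g e z s d v f e z a d d g r f e a\ns d f e r\n' |
  tr ' ' '\n' | count_sorted
```

Everything stays inside the END block, so extra lines can still be printed under each index; swapping `sort` for `sort -n` in `cmd` would give numerical order instead.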
Asked by Olivier Dulac
(6580 rep)
Sep 17, 2020, 01:15 PM
Last activity: Sep 18, 2020, 01:50 PM