regular awk - easily sort array indexes to output them in the chosen order

10 votes
2 answers
5016 views
[edit: clarified that I need an *in awk* solution, and corrected that I need to sort 'indexes' (or rather, output them in a sorted way) instead of the ambiguous 'values']

In awk, I often count things, or store a set of values, inside an array, using the values as indexes (taking advantage of awk's indexes_are_hashes mechanism).

For example, if I want to know how many different values of `$2` I encountered, and how often each value was seen:

    awk '
      ... several different treatments ...
      { count[$2]++ }
      ... other treatments ...
      END { for(str in count) {
              print "counted: " str " : " count[str] " times."
              ... and other lines underneath, with additional infos ...
            }
      }
    '

The problem is that regular awk and regular nawk (as opposed to GNU awk or other nicer versions):

- [A] don't output the different values in the order in which they were 'encountered',
- [B] nor provide an easy way to go through the indexes in either numerical or alphabetical order.

For [A]: not too difficult to do, just keep another array that indexes the "newly seen" entries.

The QUESTION is for [B]: **How can I do a simple call to sort to reorder the display of the different indexes?**

(Note: I am aware that GNU awk has an "easy" way for [B]: https://www.gnu.org/software/gawk/manual/html_node/Controlling-Array-Traversal.html ... But I want a way to do something similar in regular awk/nawk!)

(I.e., I need to loop over the different indexes seen, sort them, re-read them [in an old awk...] into "something" (e.g. another array `ordered_seen`?) and use that something to display the `seen[s]` in the chosen order. And this needs to be *inside awk*, as under each index I often need to output a paragraph of additional info. A "sort" outside of awk would reorder everything.)

So far, I have found no "axiomatic" one-liner (or n-liner?) way to do that.
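The bookkeeping mentioned for [A] can be sketched as follows. This is only a minimal sketch, not a definitive implementation: the function name `count_in_seen_order` and the array name `seen_order` are made up for illustration, and it assumes whitespace-separated input with the value of interest in `$2`.

```shell
# Hypothetical helper: count the values of field 2 and report them
# in the order in which each value was first encountered.
count_in_seen_order() {
  awk '
    # First time a value shows up: remember its rank in seen_order[].
    # The "in" test does not create the array element.
    !($2 in count) { seen_order[++n] = $2 }

    { count[$2]++ }

    END {
      # Integer indexes 1..n give a deterministic traversal order.
      for (i = 1; i <= n; i++)
        print "counted: " seen_order[i] " : " count[seen_order[i]] " times."
    }
  '
}
```

Called as `count_in_seen_order < somefile`, it prints one "counted:" line per distinct `$2`, first-seen first. It does nothing for [B]: the order is arrival order, not sorted order.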
For [B], I end up with a kludge that takes several lines: it outputs each value to a file through sort, re-reads that sorted file, inserts each line in order into `sorted_countindexes[n++]`, and then `for(i=0;i` […] `>fileA`

    printf 's f g r e d f g e z s d v f e z a d d g r f e a\ns d f e r\n'>fileB

    # and the awk loop: it outputs in 'whatever order', I want 'alphabetical order'
    for f in file? ; do
      printf 'for file: %s: ' "$f"
      tr ' ' '\n' < "$f" | awk '
        { count[$0]++ }
        END { for(str in count){ printf("%s:%d ",str,count[str]) }; print "" }
      '
    done

This outputs:

    for file: fileA: d:3 e:5 f:3 g:1 r:4 s:6 z:2 a:5 b:1 c:3
    for file: fileB: d:5 e:5 f:5 g:3 r:3 s:3 v:1 z:2 a:2

I'd like to have the letters output in alphabetical order instead!
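The kludge described above might look roughly like this. It is only a sketch under stated assumptions: the helper name `count_sorted` and the array name `sorted` are hypothetical, `mktemp` and an external `sort` are assumed to be available, the awk is assumed to support `getline var < file` (nawk and POSIX awk do, the very oldest awk does not), and the indexes are assumed not to contain newlines.

```shell
# Hypothetical helper: count each input line, then report the counts
# with the indexes in sorted order, using sort(1) and a temporary file.
count_sorted() {
  tmp=$(mktemp) || return 1          # assumption: mktemp is available
  awk -v tmp="$tmp" '
    { count[$0]++ }

    END {
      # Pipe every index through sort; the shell writes the result to tmp.
      cmd = "sort > \"" tmp "\""
      for (str in count) print str | cmd
      close(cmd)                     # wait until sort has finished writing

      # Re-read the sorted indexes into an integer-indexed array.
      n = 0
      while ((getline line < tmp) > 0) sorted[++n] = line
      close(tmp)

      # Walk the array in order; a per-index paragraph could be printed here.
      for (i = 1; i <= n; i++) printf("%s:%d ", sorted[i], count[sorted[i]])
      print ""
    }
  '
  rm -f "$tmp"
}
```

With the fileB sample from above, `tr ' ' '\n' < fileB | count_sorted` should print `a:2 d:5 e:5 f:5 g:3 r:3 s:3 v:1 z:2`. The ordering is whatever `sort` produces in the current locale; passing options such as `-n` inside `cmd` would give numerical order instead.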
Asked by Olivier Dulac (6580 rep)
Sep 17, 2020, 01:15 PM
Last activity: Sep 18, 2020, 01:50 PM