regular awk - easily sort array indexes to output them in the chosen order

10 votes
2 answers
5016 views
[Edit: clarified that I need an *in awk* solution, and corrected that I need to sort 'indexes' (or rather, output them in a sorted way) instead of the ambiguous 'values'.]

In awk, I often count things, or store a set of values, inside an array, using the values as indexes (taking advantage of awk's indexes_are_hashes mechanism).

For example, if I want to know how many different values of $2 I encountered, and how often each value was seen:

    awk '
       ... several different treatments ...
       { count[$2]++ } 
       ... other treatments ...
       END { for(str in count) { 
               print "counted: " str " : " count[str] " times." 
               ... and other lines underneath, with additional infos ...
              }
           }
     '

The problem is that regular awk and regular nawk (as opposed to GNU awk or other, nicer versions):
 - [A] don't output the different values in the order in which they were 'encountered',
 - [B] don't provide an easy way to go through the indexes in either numerical or alphabetical order.

For [A]: not too difficult to do; just keep another array that records the "newly seen" entries in order.
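
A minimal sketch of [A], assuming an awk that supports the `in` operator as an expression (nawk and POSIX awk do). The names `order` and `n` and the sample input are only illustrative:

```shell
# Sketch: keep a second, numerically indexed array that records first-seen order.
result=$(printf 'b\na\nb\nc\na\nb\n' | awk '
  !($0 in count) { order[++n] = $0 }   # first time this value shows up: remember its rank
  { count[$0]++ }
  END {
    # walk the numeric array, which preserves the order of first appearance
    for (i = 1; i <= n; i++)
      printf "%s%s:%d", (i > 1 ? " " : ""), order[i], count[order[i]]
    print ""
  }')
printf '%s\n' "$result"
```

The test must come before the increment, because `count[$0]++` creates the element.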

The QUESTION is about [B]: **How can I do a simple call to sort to reorder the display of the different indexes?**

(Note: I am aware that GNU awk has an "easy" way for [B]: https://www.gnu.org/software/gawk/manual/html_node/Controlling-Array-Traversal.html ... but I want a way to do something similar in regular awk/nawk!)

(I.e., I need to loop over the different indexes seen to output them, sort them, re-read them [in an old awk...] into "something" (e.g. another array, ordered_seen?) and use that something to display the seen[s] in the chosen order. And this needs to happen *inside awk*, because under each index I often need to output a paragraph of additional info. A "sort" outside of awk would reorder everything.)
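
To make that round trip concrete, here is a rough sketch, assuming `mktemp` and `sort` are available and that the awk supports `close()` and `getline var < file` (nawk and POSIX awk do). The names `tmp`, `cmd` and `sorted` and the sample input are illustrative:

```shell
tmp=$(mktemp)   # scratch file that receives the sorted indexes
result=$(printf 's d f e r s d\n' | tr ' ' '\n' | awk -v tmp="$tmp" '
  { count[$0]++ }
  END {
    cmd = "LC_ALL=C sort > " tmp          # add sort options here (e.g. -n) for numeric order
    for (str in count) print str | cmd    # send the indexes, in whatever order, to sort
    close(cmd)                            # wait until sort has finished writing the file
    while ((getline str < tmp) > 0)       # re-read the indexes, now sorted
      sorted[++n] = str
    close(tmp)
    for (i = 1; i <= n; i++) {
      print "counted: " sorted[i] " : " count[sorted[i]] " times."
      # ... additional lines for this index would go here ...
    }
  }')
rm -f "$tmp"
printf '%s\n' "$result"
```

Only the indexes go through sort, so each paragraph printed in the final loop stays attached to its index.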

So far, I have found no "axiomatic" one-liner (or n-liner?) way to do that.

I end up with a kludge that takes several lines: it outputs each value to a file through sort, then re-reads that sorted file and inserts each line in order into sorted_countindexes[n++], and then for(i=0;ifileA
    printf 's f g r e d f g e z s d v f e z a d d g r f e a\ns d f e r\n'>fileB
    # and the awk loop: it outputs in 'whatever order'; I want 'alphabetical order'
    for f in file? ; do printf 'for file: %s: ' "$f"
      tr ' ' '\n' < "$f" | awk ' 
           { count[$0]++ } 
       END { for(str in count){ 
               printf("%s:%d ",str,count[str]) 
              }; print "" 
           } '
    done
    #this outputs:
    for file: fileA: d:3 e:5 f:3 g:1 r:4 s:6 z:2 a:5 b:1 c:3
    for file: fileB: d:5 e:5 f:5 g:3 r:3 s:3 v:1 z:2 a:2
    # I'd like to have the letters outputted in alphabetical order instead!
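
For comparison, a sketch that avoids the external sort and the temporary file altogether, which is a different technique from the "call to sort" asked about. It copies the indexes into a numbered array and insertion-sorts that array inside awk. The array name `keys` is hypothetical, and the input is the fileB line from above, fed directly through a pipe:

```shell
result=$(printf 's f g r e d f g e z s d v f e z a d d g r f e a\ns d f e r\n' |
  tr ' ' '\n' | awk '
  { count[$0]++ }
  END {
    n = 0
    for (str in count) keys[++n] = str        # copy the indexes into a numbered array
    for (i = 2; i <= n; i++) {                # plain insertion sort on that array
      k = keys[i]
      # appending "" forces a string comparison, even for numeric-looking indexes
      for (j = i - 1; j >= 1 && (keys[j] "") > (k ""); j--)
        keys[j + 1] = keys[j]
      keys[j + 1] = k
    }
    for (i = 1; i <= n; i++)
      printf "%s%s:%d", (i > 1 ? " " : ""), keys[i], count[keys[i]]
    print ""
  }')
printf '%s\n' "$result"
```

Insertion sort is quadratic, so this is only reasonable for a modest number of distinct indexes. For numeric order, the comparison would be `keys[j] + 0 > k + 0` instead.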
                        
Asked by Olivier Dulac (6580 rep)
Sep 17, 2020, 01:15 PM
Last activity: Sep 18, 2020, 01:50 PM