How do I find duplicate lines in multiple files within folders
1 vote, 1 answer, 2054 views
When I want to find duplicate lines between two files, I use this command:
comm -12 <(sort file1.txt) <(sort file2.txt)
or
sort file1.txt file2.txt | awk 'dup[$0]++ == 1'
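For example, with two throwaway files, both commands print only the lines the files share (assuming bash, which the `<(...)` process substitution requires):

```shell
# Throwaway demo data (hypothetical contents, just for illustration)
printf 'apple\nbanana\ncherry\n' > file1.txt
printf 'banana\ndate\napple\n'  > file2.txt

# comm -12 keeps only lines common to both sorted inputs
comm -12 <(sort file1.txt) <(sort file2.txt)
# apple
# banana

# The awk variant prints a line the second time it is seen
sort file1.txt file2.txt | awk 'dup[$0]++ == 1'
# apple
# banana
```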
But how do I find duplicate lines across multiple files within folders? Example:
mainfolder
  folder1
    file1-1.txt
    file1-2.txt
    etc.
  folder2
    file2-1.txt
    file2-2.txt
    etc.
The result should be displayed in the terminal grouped by file (that is, show the lines repeated across the files and specify which file contains each one), so I can trace the origin of the problem.
P.S.: I tried this command and it didn't work for me:
file_expr="*.txt"
sort $file_expr |
  sed 's/^\s*//; s/\s*$//; /^\s*$/d' |
  uniq -d |
  while read dup_line; do
      grep -Hn "^\s*$dup_line\s*$" $file_expr
  done |
  sort -t: -k3 -k1,2 |
  awk -F: '{
      file=$1; line=$2; $1=$2="";
      gsub(/(^[ \t]+)|([ \t]+$)/, "", $0);
      if (prev != "" && prev != $0) printf ("\n");
      printf ("\033[0;33m%s (line %s)\033[0m: %s\n", file, line, $0);
      prev=$0;
  }'
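To illustrate the per-file output I am after, here is a sketch of one possible approach: a single awk pass over every `*.txt` file under `mainfolder`, remembering where each line occurs and reporting the repeated ones with their `file:line` origins (assumes POSIX find and awk; the file contents below are made up to mirror the layout above, and with very many files `-exec ... +` may split the list across several awk runs, splitting the counts too):

```shell
# Recreate the question's layout with throwaway data (hypothetical contents)
mkdir -p mainfolder/folder1 mainfolder/folder2
printf 'alpha\nbeta\n'  > mainfolder/folder1/file1-1.txt
printf 'alpha\ngamma\n' > mainfolder/folder2/file2-1.txt

# Count each line across all files and remember its file:line origins,
# then print every line seen more than once together with those origins.
find mainfolder -type f -name '*.txt' -exec awk '
    {
        cnt[$0]++                                    # occurrences so far
        where[$0] = where[$0] FILENAME ":" FNR "\n"  # origin of this one
    }
    END {
        for (line in cnt)
            if (cnt[line] > 1)
                printf "%s appears in:\n%s", line, where[line]
    }
' {} +
# prints "alpha appears in:" followed by the two file:line locations
```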
Asked by acgbox
(1010 rep)
Jun 16, 2021, 04:26 PM
Last activity: Jun 16, 2021, 05:44 PM