How do I find duplicate lines in multiple files within folders
1 vote, 1 answer, 2054 views
When I want to find duplicate lines between two files, I use this command:
comm -12 <(sort file1.txt) <(sort file2.txt)
or
sort file1.txt file2.txt | awk 'dup[$0]++ == 1'
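For example, with two throwaway files, both commands print only the lines the files share (assuming bash, which the `<(...)` process substitution requires):

```shell
# Throwaway demo data (hypothetical contents, just for illustration)
printf 'apple\nbanana\ncherry\n' > file1.txt
printf 'banana\ndate\napple\n'  > file2.txt

# comm -12 keeps only lines common to both sorted inputs
comm -12 <(sort file1.txt) <(sort file2.txt)
# apple
# banana

# The awk variant prints a line the second time it is seen
sort file1.txt file2.txt | awk 'dup[$0]++ == 1'
# apple
# banana
```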
But how do I find duplicate lines across multiple files within folders? Example:
mainfolder
  folder1
    file1-1.txt
    file1-2.txt
    etc.
  folder2
    file2-1.txt
    file2-2.txt
    etc.
The result should be displayed in the terminal grouped by file (that is, show the lines repeated across the files and specify which file contains each one), so I can trace the origin of the problem.
P.S.: I tried this command and it didn't work for me:
file_expr="*.txt"
sort $file_expr |
  sed 's/^\s*//; s/\s*$//; /^\s*$/d' |
  uniq -d |
  while read dup_line; do
      grep -Hn "^\s*$dup_line\s*$" $file_expr
  done |
  sort -t: -k3 -k1,2 |
  awk -F: '{
      file=$1; line=$2; $1=$2="";
      gsub(/(^[ \t]+)|([ \t]+$)/, "", $0);
      if (prev != "" && prev != $0) printf ("\n");
      printf ("\033[0;33m%s (line %s)\033[0m: %s\n", file, line, $0);
      prev=$0;
  }'
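To illustrate the per-file output I am after, here is a sketch of one possible approach: a single awk pass over every `*.txt` file under `mainfolder`, remembering where each line occurs and reporting the repeated ones with their `file:line` origins (assumes POSIX find and awk; the file contents below are made up to mirror the layout above, and with very many files `-exec ... +` may split the list across several awk runs, splitting the counts too):

```shell
# Recreate the question's layout with throwaway data (hypothetical contents)
mkdir -p mainfolder/folder1 mainfolder/folder2
printf 'alpha\nbeta\n'  > mainfolder/folder1/file1-1.txt
printf 'alpha\ngamma\n' > mainfolder/folder2/file2-1.txt

# Count each line across all files and remember its file:line origins,
# then print every line seen more than once together with those origins.
find mainfolder -type f -name '*.txt' -exec awk '
    {
        cnt[$0]++                                    # occurrences so far
        where[$0] = where[$0] FILENAME ":" FNR "\n"  # origin of this one
    }
    END {
        for (line in cnt)
            if (cnt[line] > 1)
                printf "%s appears in:\n%s", line, where[line]
    }
' {} +
# prints "alpha appears in:" followed by the two file:line locations
```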
Asked by acgbox
(1010 rep)
Jun 16, 2021, 04:26 PM
Last activity: Jun 16, 2021, 05:44 PM