
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

17 votes
2 answers
23476 views
comm: file is not in sorted order
I used `comm` to compare two sorted files. Each line in these files is a positive integer. But the results show:

    comm: file 1 is not in sorted order
    comm: file 2 is not in sorted order

Why do I get this error even though both files are sorted?
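A hedged sketch of the most common cause (an assumption, since the files aren't shown): the files were sorted numerically with `sort -n`, while `comm` expects the lexical byte order that `LC_ALL=C sort` produces, so `10` coming before `2` looks unsorted to `comm`.

```shell
# Hypothetical reproduction: numerically sorted files trip comm's order check.
printf '%s\n' 1 2 10 > f1        # sort -n order: 1, 2, 10
printf '%s\n' 1 2 10 20 > f2     # also numeric order

# comm f1 f2                     # would warn: "file 1 is not in sorted order"

# Re-sort both inputs with the byte order comm expects:
LC_ALL=C sort f1 > f1.lex        # 1, 10, 2
LC_ALL=C sort f2 > f2.lex        # 1, 10, 2, 20
LC_ALL=C comm -12 f1.lex f2.lex  # common lines: 1, 10, 2
```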
wenzi (423 rep)
Nov 16, 2012, 11:25 AM • Last activity: Jul 27, 2024, 01:02 PM
1 vote
2 answers
4231 views
Can the Linux `comm` command handle UTF-8 encoded text files?
I want to compare two UTF-8 encoded text files. Can the Linux commands `diff` and `comm` handle this encoding?
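A sketch of what generally matters here (a hedged reading, not a definitive answer): both tools compare byte sequences, so UTF-8 passes through unchanged; for `comm` the crucial point is that `sort` and `comm` run under the same collation.

```shell
# Two small UTF-8 files (ä is 0xC3 0xA4, ö is 0xC3 0xB6 in UTF-8):
printf 'ä\nb\n' > f1
printf 'b\nö\n' > f2

# Sort and compare under one fixed collation so both tools agree on order:
LC_ALL=C sort f1 -o f1
LC_ALL=C sort f2 -o f2
LC_ALL=C comm -12 f1 f2    # the common line: b
```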
user41451 (113 rep)
Feb 4, 2017, 11:42 AM • Last activity: Jun 28, 2024, 10:39 AM
1 vote
0 answers
101 views
When should I use comm over diff?
It seems `diff` can do anything `comm` can do. When should I use `comm` rather than `diff`, apart from the difference in output format? I guess `comm` is faster?
tom10271 (111 rep)
Feb 4, 2024, 10:18 AM
0 votes
1 answer
82 views
comm issues displaying desired results
I learned here how to use comm to compare 2 already sorted (using sort) files and display lines/records present in either one of the files or in both of them.
Something is not working as expected, so I wanted to get some help. Let's say I have a couple of sorted files with about ~200k records each, and some of those lines are common, i.e. appear in both (previously sorted) files. I then execute this command:

    comm -i -23 file1 file2 > test_01

Very simple; the expectation being that test_01 contains the records/lines that *only* show up in file1. But the output, test_01, contains lines/records that are present in both files. Both files contain just plain email addresses, one column each, previously sorted (as mentioned above) with the sort utility; each file has a different number of records. I did confirm (using grep) that the test_01 output file contains records present in both file1 and file2. Based on the above process description, is there something I'm doing wrong?
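Two causes worth ruling out (a hedged diagnostic sketch, not a confirmed diagnosis): the files were sorted under a different locale than the one `comm` runs in, or one file has CRLF line endings, so visually identical addresses differ by a trailing carriage return.

```shell
# Reproduce the symptom with a CRLF file (hypothetical data):
printf 'a@x.com\nb@x.com\n' > file1
printf 'a@x.com\r\nb@x.com\r\n' > file2        # same addresses, DOS endings

comm -23 file1 file2          # prints both lines: nothing "matches"

# Normalize endings and collation before comparing:
tr -d '\r' < file2 | LC_ALL=C sort > file2.clean
LC_ALL=C sort file1 -o file1
comm -23 file1 file2.clean    # now prints nothing
```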
darwingoat (1 rep)
Dec 22, 2023, 03:24 PM • Last activity: Dec 23, 2023, 11:01 AM
0 votes
1 answer
55 views
diff and comm are not finding the difference between two env files
I have this env file, 1.env, with contents:

    BARF_BAG=1

then another env file, 2.env, with contents:

    BARF_BAG=2

I run comm and diff on the files to see the difference:

    #!/usr/bin/env bash
    (
        set -e;
        first_env_file="$1"
        second_env_file="$2"
        if ! test -f "$first_env_file"; then echo 'first arg must be an env file'; exit 1; fi
        if ! test -f "$second_env_file"; then echo 'second arg must be an env file'; exit 1; fi
        echo -e '\n'
        echo -e 'displaying results from diff tool:'
        diff <(. "$first_env_file"; env | sort) <(. "$second_env_file"; env | sort) || true
        echo -e '\n'
        echo 'displaying results from comm tool:'
        comm -3 <(. "$first_env_file"; env | sort) <(. "$second_env_file"; env | sort) || true
        echo 'finished diffing env files.'
    )

and I get nothing, just:

    displaying results from diff tool:
    displaying results from comm tool:
    finished diffing env files.

What gives?
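One plausible explanation (an assumption about the script above, not a confirmed diagnosis): `. file` sets plain shell variables, and unexported variables never appear in `env`, so both substitutions produce the same environment. Sourcing under `set -a` marks every assignment for export.

```shell
# Hypothetical 1.env / 2.env as in the question:
printf 'BARF_BAG=1\n' > 1.env
printf 'BARF_BAG=2\n' > 2.env

# Without set -a, BARF_BAG never reaches env; with it, the difference appears:
diff <(set -a; . ./1.env; env | sort) <(set -a; . ./2.env; env | sort) || true
```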
Alexander Mills (10734 rep)
Oct 5, 2023, 05:14 PM • Last activity: Oct 5, 2023, 08:02 PM
0 votes
1 answer
58 views
Group results using comm
Using `comm` I get results that look weird from this:

    comm -3 <(. "$first_env_file"; env) <(. "$second_env_file"; env)

I get something like:
AUTH_LP_ACCOUNT_ID=xxx1
        AUTH_LP_ACCOUNT_ID=xxx2
        AWS_IMAGE_DOMAIN_NAME=abc
AWS_IMAGE_DOMAIN_NAME=zyx
NODE_ENV=local
        NODE_ENV=staging
        NODE_PORT=3000
NODE_PORT=4000
REDIS_HOST=localhost
        REDIS_HOST=redis
(And yes, the prepended tabs/spaces in front are really there.) What I would rather have is something like this:

    --begin--
    AUTH_LP_ACCOUNT_ID=xx1
    AUTH_LP_ACCOUNT_ID=xx2
    ---------
    AWS_IMAGE_DOMAIN_NAME=abc
    AWS_IMAGE_DOMAIN_NAME=zyx
    ---------
    NODE_ENV=local
    NODE_ENV=staging
    ---------
    NODE_PORT=3000
    NODE_PORT=4000
    ---------
    REDIS_HOST=localhost
    REDIS_HOST=redis
    ---end---

Is there a way to accomplish this?

1. To remove the prepended whitespace we can pipe through `sed 's/^[ \t]*//'`.
2. Putting --begin-- and ---end--- at the beginning/end is an easy matter.
3. But how to group the results easily?

My only guess at 3 is to loop over each line (skipping the first) and, if the next result has a different `xxx=` key than the previous one, print a `---------`, but I am not in love with that.
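A sketch for the grouping step (assuming the keys never contain `=` themselves): strip the leading whitespace, then let awk print a separator whenever the part before `=` changes.

```shell
# Hypothetical sample of comm -3 output (second column indented with a tab):
printf 'NODE_ENV=local\n\tNODE_ENV=staging\n\tNODE_PORT=3000\nNODE_PORT=4000\n' |
sed 's/^[[:space:]]*//' |
awk -F= '
    NR == 1 { print "--begin--" }
    NR > 1 && $1 != prev { print "---------" }   # key changed: start a new group
    { print; prev = $1 }
    END { print "---end---" }
'
```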
Alexander Mills (10734 rep)
Oct 5, 2023, 04:09 PM • Last activity: Oct 5, 2023, 04:33 PM
0 votes
2 answers
2275 views
comm not working to find words unique to file1 from two files
I have two text files, and I have to use the `comm` command to extract all words unique to file 1, i.e. just those that are not in file 2. I was asked to use `comm` (not `diff` nor `join`). I have tried a lot of things, such as `comm -32 file1 file2`, but this returns all the words in file 1.
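comm compares whole lines, not words, so each file first has to become a sorted list of one word per line; a sketch assuming whitespace-separated words (and hypothetical contents, since the real files aren't shown):

```shell
# Hypothetical input files:
printf 'apple banana\ncherry\n' > file1
printf 'banana date\n' > file2

# One word per line, sorted and de-duplicated, then the set difference:
tr -s '[:space:]' '\n' < file1 | LC_ALL=C sort -u > words1
tr -s '[:space:]' '\n' < file2 | LC_ALL=C sort -u > words2
comm -23 words1 words2     # words only in file1: apple, cherry
```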
hdb004 (1 rep)
Oct 29, 2013, 09:04 AM • Last activity: Aug 20, 2023, 01:40 PM
52 votes
2 answers
116169 views
Common lines between two files
I have the following code that I run on my Terminal:

    LC_ALL=C && grep -F -f genename2.txt hg38.hgnc.bed > hg38.hgnc.goi.bed

This doesn't give me the common lines between the two files. What am I missing there?
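Two things stand out (a hedged reading of the command above): `LC_ALL=C && grep …` runs `LC_ALL=C` as a separate no-op command instead of setting grep's environment, and without `-x` grep matches substrings rather than whole lines. A sketch of the whole-line intersection, with hypothetical stand-ins for the real files:

```shell
# Hypothetical stand-ins for genename2.txt and hg38.hgnc.bed:
printf 'BRCA1\nTP53\n' > patterns.txt
printf 'BRCA1\nBRCA2\nTP53\n' > data.txt

# Prefix assignment (no &&) so it applies to grep; -x forces whole-line match:
LC_ALL=C grep -F -x -f patterns.txt data.txt    # BRCA1, TP53
```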
Marwah Soliman (713 rep)
Oct 14, 2017, 06:46 PM • Last activity: May 16, 2023, 07:00 AM
-1 votes
1 answer
463 views
I need to compare/sort two text files
This is the scenario: I have File1 and File2, and I'd like to have the outcome in File3. I'm kind of new to Linux, but so far I've tried to use sort, diff, and comm, with no luck.

    File1.txt File2.txt > File3.txt

File1.txt:

    RB0009 8,89
    RB0010 5,67
    RB0015 4,32
    RB0027 6,56

File2.txt:

    RB0009 8,89
    RB0010 5,67
    RB0015 4,32
    RB0027 6,56
    RB0033 9,78

File3.txt:

    RB0009 700111i 8,89
    RB0010 700092i 5,67
    RB0015 700148i 4,32
    RB0027 700123i 6,56

Help would be much appreciated.
conny (1 rep)
Feb 14, 2023, 12:46 PM • Last activity: Feb 14, 2023, 10:11 PM
1 votes
3 answers
1405 views
Recursively list path of files only
# Why

I have two folders that should contain the exact same files; however, when I look at the number of files, they are different. I would like to know which files/folders are present in one but not the other. My thinking is I will make a list of all the files and then use comm to find the differences between the two folders.

# Question

How do I recursively make a list of files and folders in the format /path/to/dir and /path/to/dir/file?

# Important notes

OS: Windows 11, subsystem Ubuntu 20.04.4 LTS
Location of folders: one network drive, one local
Size of folders: ~2 TB each
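A sketch of the listing step plus the comparison (assuming relative paths should match between the two trees; the directory names here are hypothetical):

```shell
# Hypothetical demo trees standing in for the two ~2 TB folders:
mkdir -p dirA/sub dirB/sub
touch dirA/sub/common.txt dirA/only_in_A.txt dirB/sub/common.txt

# List every file and directory as ./relative/path, sorted bytewise:
(cd dirA && find . | LC_ALL=C sort) > listA.txt
(cd dirB && find . | LC_ALL=C sort) > listB.txt

# Column 1: only in dirA; column 2 (tab-indented): only in dirB:
LC_ALL=C comm -3 listA.txt listB.txt
```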
Oll (113 rep)
Jun 21, 2022, 08:49 AM • Last activity: Jun 21, 2022, 04:26 PM
0 votes
1 answer
287 views
Comparing Sets with 'comm'
Trying to get a list of available IP addresses based on all usable IPs in a range compared against a device's ARP table. Basing what I'm doing with `comm` on this discussion: https://unix.stackexchange.com/questions/104837/intersection-of-two-arrays-in-bash

Creating ranges of IPs to compare against, e.g. 192.168.20.0/23:
RANGE1=(192.168.20.{2..255})
RANGE2=(192.168.21.{0..254})
RANGE=("${RANGE1[@]}" "${RANGE2[@]}")
printf '%s\n' "${RANGE[@]}" | LC_ALL=C sort > "${IPSETS_DIR}/_set.txt"
$1 is an IP of a network device. OID is basically a device's ARP table. GREP_SEARCH example: "192.168.20|192.168.30|192.168.55"
$(which snmpbulkwalk) -v2c -c  "${1}" .1.3.6.1.2.1.4.35.1.4 > "${RESULTS_FILE}"
STRIPPED_RESULTS=( $(cut -d\" -f2 "${RESULTS_FILE}" | egrep -w "(^|\s)${GREP_SEARCH}") )
printf "%s\n" "${STRIPPED_RESULTS[@]}" | LC_ALL=C sort > "${STRIPPED_FILE}"
The walk returns results such as: IP-MIB::ipNetToPhysicalPhysAddress.118161416.ipv4."X.X.X.X" = STRING: XX:XX:XX:XX:XX:XX I then compare using the below. $1 is city-alias.
$(which comm) -13 "${STRIPPED_FILE}" "${IPSETS_DIR}/${1}_set.txt" > "${DIR}/${1}_stored_results.txt"
This MOSTLY works, but I'm still getting IPs that are in use. Not sure what I'm missing.
rannday (11 rep)
Jun 15, 2022, 04:49 PM • Last activity: Jun 15, 2022, 07:54 PM
1 vote
3 answers
2318 views
How to get the difference between files
I've found other links on the Stack Overflow communities that were similar, but they didn't answer my question exactly. I have 2 files with a different number of lines, BUT I have them both sorted. My original files are hundreds of lines long, but for troubleshooting purposes I made file1 have 12 lines and file2 have 5 lines. File2 is a subset of file1. What I want to do is run a command that outputs all the lines that are in file1 but not in file2. I tried using the Unix commands diff and comm, but they both list the full contents of file1, which is not what I want. A quick example of this would be:

    File1   File2
    A       B
    B       E
    C       I
    E       N
    G       O
    I
    L
    M
    N
    O
    X

So here we can see that everything that's in file2 is in file1. For some reason, diff and comm both showed the full contents of file1. I assume it's because they do a line-by-line comparison and don't search through the whole file. Is there another Unix command I can run that will output what I am expecting?

EDIT: The commands I used to attempt to get what I needed were:

a) `diff file1 file2` — this basically listed everything from file1 with a `<` in front of it. Definitely not what I needed.

b) `comm -23 file1 file2` — this showed the whole content of file1 again and not the diff like I was expecting.

c) `comm -3 file1 file2` — the help page for comm said this would print lines in file 1 but not in file 2 and vice versa, but this also didn't show what I wanted, because in my example B appears in both files but on different lines. However, the output thinks it's in one but not the other and therefore prints it out. So the output looked like this:

    A
    B
    B
    C
    E
    E
    etc.

And it wasn't what I was expecting. I was expecting:

    A
    C
    G
    L
    M
    X
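Since the goal is "every line of file1 that equals no line of file2" regardless of line positions, one alternative to comm is grep with fixed, whole-line, inverted matching; a sketch with data modeled on the example above:

```shell
printf '%s\n' A B B C E G I L M N O X > file1
printf '%s\n' B E I N O > file2

# -F fixed strings, -x whole line, -v invert, -f take patterns from file2:
grep -Fxvf file2 file1    # A, C, G, L, M, X
```

Note that, unlike `comm -23`, this drops every copy of a duplicated line (both Bs here) as long as it appears anywhere in file2.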
Classified (529 rep)
Nov 2, 2021, 10:36 PM • Last activity: Apr 18, 2022, 07:57 AM
3 votes
2 answers
109 views
comm for n files
I am looking for comm's functionality for n (i.e., more than two) files. `man comm` reads:
COMM(1)

NAME
       comm - compare two sorted files line by line

SYNOPSIS
       comm [OPTION]... FILE1 FILE2

DESCRIPTION
       Compare sorted files FILE1 and FILE2 line by line.

       With no options, produce three-column output.
       Column one contains lines unique to FILE1,
       column two contains lines unique to FILE2,
       and column three contains lines common to both files.
A first non-optimized and differently formatted approach in bash to illustrate the idea:
user@host MINGW64 dir
$ ls
abc  ac  ad  bca  bcd

user@host MINGW64 dir
$ tail -n +1 *
==> abc <==
[...]

$ cat otherdir/ncomm.sh
[...]
    >&2 echo -en "${entry}\t"
    for file in "$@"; do
        foundentry=$(grep "$entry" "$file")
        echo -en "${foundentry}\t"
    done
    echo -en "\n"
done

user@host MINGW64 dir
$ time otherdir/ncomm.sh *
all     abc     ac      ad      bca     bcd
a       a       a       a       a
b       b                       b       b
c       c       c               c       c
d                       d               d

real    0m12.921s
user    0m0.579s
sys     0m4.586s

user@host MINGW64 dir
$
This displays column headers (to stderr), a first column "all" with all entries found in any of the files, sorted, and then one column per file from the parameter list with its entries in the respective rows. As grep is invoked once for each cell outside of the first column and first row, this is really slow. As with comm, this output is only suitable for short lines/entries like IDs. A more concise version could output an x or similar for each found entry in columns 2+. This should work on Git for Windows' MSYS2 and on RHEL. **How can this be achieved in a more performant manner?**
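One way to make this faster (a sketch, not a drop-in replacement: entries are matched as whole lines, unlike the substring grep above) is to read every file once in a single awk pass, remember which file each entry was seen in, and emit the table at the end:

```shell
# Hypothetical single-pass variant of ncomm.sh (POSIX awk):
ncomm() {
    printf 'all'; printf '\t%s' "$@"; printf '\n'
    awk '
        { seen[$0 SUBSEP FILENAME] = 1; entries[$0] = 1 }
        END {
            for (e in entries) {
                line = e
                for (i = 1; i < ARGC; i++)
                    line = line "\t" ((e SUBSEP ARGV[i]) in seen ? e : "")
                print line
            }
        }
    ' "$@" | LC_ALL=C sort
}
```

Each cell now costs one hash lookup instead of one grep process, so the runtime is dominated by a single read of each file.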
Julia (31 rep)
Jul 29, 2021, 04:17 AM • Last activity: Jan 31, 2022, 02:48 PM
1 vote
2 answers
37 views
How to get well-formed table from comm?
I want to use the output of comm in other table parsers. However it seems like it produces inconsistently delimited rows. For example:
$ comm <(echo "1\n2") <(echo "2\n3") | bat -A --style=plain
1␊
├──┤├──┤2␊
├──┤3␊
Because it's not padding with remaining tabs, I can't convert it to a CSV:
$ comm <(echo "1\n2") <(echo "2\n3") | tr \\t ,
1
,,2
,3
And can't ingest it as tab-delimited either:
$ comm <(echo "1\n2") <(echo "2\n3") | xsv input -d \\t
1
CSV error: record 1 (line: 2, byte: 2): found record with 3 fields, but the previous record has 1 fields
Is there a way to make comm produce a properly formatted table? The options I can see seem like more work than they should be:

* Replace with a regex
* Print each column separately
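A sketch of explicit padding (hedged: this assumes GNU comm's default tab delimiter and no tabs inside the data): split on tabs and always print three fields, so every row has the same width:

```shell
printf '1\n2\n' > left
printf '2\n3\n' > right

# Re-emit comm's ragged rows as fixed three-column CSV:
comm left right | awk -F'\t' '{ printf "%s,%s,%s\n", $1, $2, $3 }'
```

GNU comm also accepts `--output-delimiter`, but that only changes the delimiter without padding the missing trailing fields, so the rows stay ragged.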
Haterind (189 rep)
Sep 20, 2021, 06:52 PM • Last activity: Sep 20, 2021, 08:09 PM
0 votes
2 answers
162 views
get only the unmatched list as an output
I want to know on which ports of firewalls from a particular customer the MAC Address Filtering is **not** active. So I have created 2 files:

* `all.txt` contains a list of all firewalls of a customer and looks like this:
    abc123
    ahg578
    dfh879
    ert258
    fgh123
    huz546
    jki486
    lop784
    mnh323
    xsd451
    wqa512
    zas423
* `active.txt` contains a list of firewalls of the same customer in which the MAC Address filtering is active, and looks like this:
    abc123: set macaddr 00:00:00:00:00:00
    ahg578: set macaddr 00:00:00:00:00:00
    dfh879: set macaddr 00:00:00:00:00:00
    ert258: set macaddr 00:00:00:00:00:00
    fgh123: set macaddr 00:00:00:00:00:00
    huz546: set macaddr 00:00:00:00:00:00
    mnh323: set macaddr 00:00:00:00:00:00
    xsd451: set macaddr 00:00:00:00:00:00
    wqa512: set macaddr 00:00:00:00:00:00
    zas423: set macaddr 00:00:00:00:00:00
I have compared the two lists using
comm -3 ~/active.txt ~/all.txt
and get this result (screenshot: result-list). How can I get _only the unmatched list_ as an output? So I want the output to be only:
jki486
lop784
I have tried using sdiff, grep -rL, and grep -vxFf, but none of them works. FYI, I'm using GNU/Linux, kernel version 3.2.0-6-amd64, gcc version 4.9.2. I would really appreciate your help! Thank you! :)
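comm needs the two inputs in the same shape before it can subtract them, so one approach (a sketch assuming every line of active.txt starts with the firewall name followed by a colon) is to cut active.txt down to bare names first:

```shell
# Hypothetical miniature versions of the two files:
printf 'abc123\njki486\nlop784\nzas423\n' > all.txt
printf 'abc123: set macaddr 00:00:00:00:00:00\nzas423: set macaddr 00:00:00:00:00:00\n' > active.txt

# Reduce active.txt to names, sort both, keep lines unique to all.txt:
sed 's/:.*//' active.txt | LC_ALL=C sort > active.names
LC_ALL=C sort all.txt > all.sorted
comm -13 active.names all.sorted    # jki486, lop784
```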
Ella Widya (19 rep)
Aug 25, 2021, 08:36 AM • Last activity: Aug 26, 2021, 11:48 AM
1 vote
1 answer
2054 views
How do I find duplicate lines in multiple files within folders
When I want to find duplicate lines between two files, I use this command:

    comm -12 <(sort file1.txt) <(sort file2.txt)

or

    sort file1.txt file2.txt | awk 'dup[$0]++ == 1'

But how do I find duplicate lines in multiple files within folders? Example:

    mainfolder
        folder1
            file1-1.txt
            file1-2.txt
            etc
        folder2
            file2-1.txt
            file2-2.txt
            etc

I'd like the result in the terminal to be displayed by file (that is, the lines repeated across the files, but specifying which file contains each one) to know the origin of the problem.

PS: I tried this command and it didn't work for me:

    file_expr="*.txt"; sort $file_expr | sed 's/^\s*//; s/\s*$//; /^\s*$/d' | uniq -d | while read dup_line; do grep -Hn "^\s*$dup_line\s*$" $file_expr; done | sort -t: -k3 -k1,2 | awk -F: '{ file=$1; line=$2; $1=$2=""; gsub(/(^[ \t]+)|([ \t]+$)/,"",$0); if (prev != "" && prev != $0) printf ("\n"); printf ("\033[0;33m%s (line %s)\033[0m: %s\n", file, line, $0); prev=$0; }'
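A sketch in one awk pass (assuming "duplicate" means a line appearing in more than one file, and that lines contain no tabs): remember every file each line occurs in, then report lines seen in several files together with their origins:

```shell
mkdir -p mainfolder/folder1 mainfolder/folder2        # hypothetical layout
printf 'alpha\nshared\n' > mainfolder/folder1/file1-1.txt
printf 'shared\nomega\n' > mainfolder/folder2/file2-1.txt

awk '
    !(($0, FILENAME) in seen) {     # count each file at most once per line
        seen[$0, FILENAME] = 1
        files[$0] = files[$0] (files[$0] ? ", " : "") FILENAME
        nfiles[$0]++
    }
    END {
        for (l in nfiles)
            if (nfiles[l] > 1) printf "%s\t%s\n", l, files[l]
    }
' mainfolder/*/*.txt
```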
acgbox (1010 rep)
Jun 16, 2021, 04:26 PM • Last activity: Jun 16, 2021, 05:44 PM
1 vote
2 answers
2681 views
Compare two files based on first column
I have two files, and I would like to get a new file with only the lines that are in the first file but not in the second one. Example:

file1:

    ID firstname lastname
    1 John Wilkens
    2 Andrea Smith
    3 Matthew Freberg
    4 Brenda Something

file2:

    ID firstname lastname
    1 John Wilkens
    2 Andrea Willems
    3 Jay Freberg
    5 Mike Hart

Output:

    ID firstname lastname
    4 Brenda Something

I tried using comm, but that also gives the rows where something was changed, for example IDs 2 and 3. Can you please help me with this?
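Since the comparison should use only the first column, `join` fits better than `comm`; a sketch (assuming space-separated columns and the header shown in both files):

```shell
printf 'ID firstname lastname\n1 John Wilkens\n2 Andrea Smith\n3 Matthew Freberg\n4 Brenda Something\n' > file1
printf 'ID firstname lastname\n1 John Wilkens\n2 Andrea Willems\n3 Jay Freberg\n5 Mike Hart\n' > file2

# Keep the header, then anti-join on field 1 (both bodies sorted on that field):
head -n 1 file1
join -v 1 <(tail -n +2 file1 | LC_ALL=C sort -k1,1) \
          <(tail -n +2 file2 | LC_ALL=C sort -k1,1)
```

`join -v 1` prints the lines of the first input whose join field pairs with nothing in the second, so changed rows (IDs 2 and 3) no longer count as differences.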
jazonpanczel (17 rep)
May 25, 2021, 10:42 AM • Last activity: May 26, 2021, 03:55 AM
3 votes
1 answer
4268 views
bash remove common lines from two files
I have two files (no blank lines/spaces/tabs).

/tmp/all:

    aa
    bb
    cc
    hello
    SearchText.json
    xyz.txt

/tmp/required:

    SearchText.json

The end output I want is all uncommon lines from /tmp/all:

    aa
    bb
    cc
    hello
    xyz.txt

I have tried the commands below:

    # comm -23 /tmp/required /tmp/all
    SearchText.json
    # comm -23 /tmp/all /tmp/required
    aa
    bb
    cc
    hello
    SearchText.json
    xyz.txt
    # comm -13 /tmp/all /tmp/required
    SearchText.json
    # comm -13 /tmp/required /tmp/all
    aa
    bb
    cc
    hello
    SearchText.json
    xyz.txt
    # grep -vf /tmp/all /tmp/required
    # grep -vf /tmp/required /tmp/all
    aa
    bb
    cc
    hello
    SearchText.json
    xyz.txt
    # comm -23 <(sort /tmp/all) <(sort /tmp/required)
    aa
    bb
    cc
    hello
    SearchText.json
    xyz.txt
Girish (133 rep)
Mar 18, 2019, 01:04 PM • Last activity: Apr 12, 2021, 05:35 AM
1 vote
0 answers
107 views
Comm requires --nocheck-order on ~10.000 lines file?
I want to use comm to compare two log files. The files are around 1 MB, containing ~10,000 lines. While testing with a small portion of a log, `comm -1 -3 a.log b.log > diff.log` worked as expected. However, when testing with the full file I got the following message: `comm: file 2 is not in sorted order`. Adding --nocheck-order to the command seems to work, but why is this necessary? The man page doesn't really offer any insight. I just want to be sure my script outputs the correct data. I don't care in what order the logs are processed, as long as it only outputs the lines that exist only in the second file.
sjaak (644 rep)
Feb 22, 2021, 06:40 AM
-1 votes
4 answers
119 views
Compare two files and generate another on matching condition
I have two files, a.txt and b.txt, where a.txt contains lines starting with "zn" (e.g., zn12c5b or zn4i8l), while b.txt contains lines ending with a "/number" pattern (e.g., t17v11/112 or 12c5b/450). My aim is to write to final.txt the strings in a.txt (but without "zn") which do not match any string in b.txt (without the trailing "/number" pattern). For example:

a.txt:

    zn12c5b
    zn4i8l

b.txt:

    t17v11/112
    12c5b/450
    4i8ls/681

I should obtain the following output in final.txt:

    4i8l

Note: 4i8l from a.txt (without the "zn" prefix) does not equal 4i8ls from b.txt (without the "/681" suffix). I am using an Ubuntu system.
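A sketch (assuming the "zn" prefix and "/number" suffix are the only decorations): strip both, sort each stream, and take the set difference with comm -23:

```shell
printf 'zn12c5b\nzn4i8l\n' > a.txt
printf 't17v11/112\n12c5b/450\n4i8ls/681\n' > b.txt

# Drop the zn prefix / the /number suffix, sort, then keep lines only in a.txt:
sed 's/^zn//' a.txt | LC_ALL=C sort > a.stripped
sed 's|/[0-9]*$||' b.txt | LC_ALL=C sort > b.stripped
comm -23 a.stripped b.stripped > final.txt
cat final.txt    # 4i8l
```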
lina32 (47 rep)
Jul 11, 2020, 03:40 AM • Last activity: Jul 13, 2020, 06:18 PM
Showing page 1 of 20 total questions