
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

-1 vote
1 answer
1066 views
How to get unique occurrence of words from a very large file?
I have been asked to write a word frequency analysis program using Unix shell scripts, with the following requirements:
- Input is a text file with one word per line
- Input words are drawn from the Compact Oxford English Dictionary, New Edition
- Character encoding is UTF-8
- Input file is 1 pebibyte (PiB) in length
- Output is of the format "Word occurred N times"
I am aware of one way to begin, as below:
cat filename | xargs -n1 | sort | uniq -c > newfilename
What would be the optimal way to do that, considering performance as well?
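A sketch of one more scalable starting point, assuming GNU sort (its --buffer-size, --parallel and --compress-program flags; the values shown are illustrative, not tuned). Since the input already has one word per line, the xargs -n1 step is unnecessary, and LC_ALL=C makes the comparison bytewise, which is enough for grouping identical UTF-8 words and much faster than locale-aware collation:
# external merge sort with compressed temporary files, then count runs
LC_ALL=C sort --buffer-size=16G --parallel=8 --compress-program=gzip filename \
  | uniq -c \
  | awk '{ printf "%s occurred %d times\n", $2, $1 }'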
Pratik Barjatiya (23 rep)
Dec 29, 2017, 06:27 AM • Last activity: Apr 15, 2025, 03:40 PM
2 votes
4 answers
9322 views
finding all non-unique lines in a file
I'm trying to use uniq to find all non-unique lines in a file. By non-unique, I mean any line that I have already seen on the previous line. I thought that the -D option would do this ("-D   print all duplicate lines"), but instead of printing just the duplicate lines, it prints *all* the lines whenever there is more than one copy. I want to print only the second and subsequent copies of a line. How can I do this?
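A minimal awk sketch of that behavior (assuming "non-unique" means equal to the immediately preceding line, as described): a line is printed only when it matches the previous one, so the first copy of each run is skipped:
awk '$0 == prev { print } { prev = $0 }' file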
Michael (544 rep)
Nov 7, 2019, 10:18 PM • Last activity: Apr 8, 2025, 06:45 PM
0 votes
2 answers
57 views
pipe to uniq from a variable not showing the desired output
I have a pipeline using array jobs and need to change the number of inputs for some steps. I thought about testing uniq, since the only part that changes in my folders is the last four characters (the *hap* part in the example). So, all my paths look something like:
/mnt/nvme/user/something1/hap1
/mnt/nvme/user/something1/hap2
/mnt/nvme/user/something2/hap1
/mnt/nvme/user/something2/hap2
and what I'm doing is the following:
DIR=( "/mnt/nvme/ungaro/something1/hap1" "/mnt/nvme/ungaro/something1/hap2" "/mnt/nvme/ungaro/something2/hap1" "/mnt/nvme/ungaro/something2/hap2" )

for dir in "${DIR[@]}"; do echo $dir | sed 's#/hap[0-9]##' | uniq; done
But the resulting output always displays all the elements of the variable, without collapsing the duplicate rows after removing the *hap* part of each one. I'm probably missing something; could it be that the for loop forces all lines to be printed anyway? If so, is there a way to attain the desired result in a single-line command?
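A sketch that feeds all the array elements to one pipeline, so uniq sees every line at once instead of a single line per loop iteration (it relies on the duplicates being adjacent, which they are here):
printf '%s\n' "${DIR[@]}" | sed 's#/hap[0-9]##' | uniq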
Matteo (209 rep)
Feb 10, 2025, 04:25 PM • Last activity: Feb 10, 2025, 05:16 PM
121 votes
15 answers
57558 views
How can I remove duplicates in my .bash_history, preserving order?
I really enjoy using control+r to reverse-search my command history. I've found a few good options I like to use with it:
# ignore duplicate commands, ignore commands starting with a space
export HISTCONTROL=erasedups:ignorespace
# keep the last 5000 entries
export HISTSIZE=5000
# append to the history instead of overwriting (good for multiple connections)
shopt -s histappend
The only problem for me is that erasedups only erases sequential duplicates - so with this string of commands: ls; cd ~; ls - the ls command will actually be recorded twice. I've thought about periodically running with cron:
cat .bash_history | sort | uniq > temp.txt
mv temp.txt .bash_history
This would remove the duplicates, but unfortunately the order would not be preserved, and I don't believe uniq can work properly if I don't sort the file first. How can I remove duplicates in my .bash_history, preserving order?
Extra credit: Are there any problems with overwriting the .bash_history file via a script? For example, if you remove an Apache log file, I think you need to send a hangup/reset signal with kill to have it flush its connection to the file. If that is the case with the .bash_history file, perhaps I could somehow use ps to check and make sure there are no connected sessions before the filtering script is run?
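The standard order-preserving sketch uses awk's associative arrays: a line is printed only the first time it is seen, so the earliest occurrence of each command survives (the temp file path here is an assumption; adjust to taste):
awk '!seen[$0]++' ~/.bash_history > /tmp/bash_history.dedup &&
  mv /tmp/bash_history.dedup ~/.bash_history
To keep the most recent occurrence of each command instead, reverse the file with tac before and after the awk filter.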
cwd (46887 rep)
Sep 20, 2012, 02:55 PM • Last activity: Feb 3, 2025, 01:47 PM
1 vote
2 answers
825 views
How to sort or uniq a live feed
I'm looking to sort and isolate IPs from a tcpdump live feed.
tcpdump -n -i tun0 "tcp[tcpflags] & (tcp-syn) != 0" | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}"
works just fine, but when I try to add the uniq program it fails:
tcpdump -n -i tun0 "tcp[tcpflags] & (tcp-syn) != 0" | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" | uniq -u
returns nothing. Same with sort -u. Any idea how to fix this?
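A sketch of the likely causes and one fix: sort -u cannot emit anything until its input ends, which never happens on a live feed; grep block-buffers its output when writing into a pipe, so uniq sits waiting; and uniq -u additionally suppresses every line that does have an adjacent duplicate. Line-buffering both producers (tcpdump -l, GNU grep's --line-buffered) and using plain uniq collapses repeats in near real time:
tcpdump -l -n -i tun0 "tcp[tcpflags] & (tcp-syn) != 0" \
  | grep --line-buffered -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" \
  | uniq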
ChiseledAbs (2301 rep)
Jul 8, 2016, 10:29 AM • Last activity: Jul 26, 2024, 05:54 AM
1 vote
1 answer
98 views
How can I find duplicate lines among files?
I have a software module which contains some files with lines matching the same pattern:
private static final long serialVersionUID = \dL;
How can I find the files that share the same value?
$ grep -R serialVersionUID
./path/to/Some.java:    private static final long serialVersionUID = 111L;
./path/to/Other.java:        private static final long serialVersionUID = 222L;
./path/to/Another.java:        private static final long serialVersionUID = 111L;
Note the differing indentation preceding the match in each file. Now I want to find the files with the same value in the second column (the private static final ... part):
$ grep -R serialVersionUID | .....
./path/to/Some.java:    private static final long serialVersionUID = 111L;
./path/to/Another.java:        private static final long serialVersionUID = 111L;
Thanks. This is all I could find, so far...
$ grep -R serialVersionUID | sed 's/[ ][ ]*/ /g' | sort -k 2
I have an improvement, yet it prints the second column only.
$ grep -R serialVersionUID | sed 's/[ ][ ]*/ /g' | sort -k 2 | uniq -f 2 -d
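A sketch that does the grouping entirely in awk, keyed on the last whitespace-separated field (the 111L; value), printing only the groups that contain more than one line:
grep -R serialVersionUID . | awk '
  { count[$NF]++; lines[$NF] = lines[$NF] $0 ORS }      # group lines by value
  END { for (v in count) if (count[v] > 1) printf "%s", lines[v] }'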
Jin Kwon (564 rep)
Jul 25, 2024, 06:44 AM • Last activity: Jul 26, 2024, 02:17 AM
1 vote
1 answer
177 views
Why is sorted uniq -c command showing duplicates
I am trying to count how many times I use a certain version of a library on my computer. For some reason, uniq -c is outputting duplicates, despite sorting the input, and despite the sort order looking correct. Any ideas or feedback? Thanks for your time.
With uniq -c, the input:
rg --no-line-number --no-filename -g '*.csproj' "GitVersion.MsBuild" | sed -E '/GitVersion\.MsBuild" Version/!d;s/^\s\+//g;//\1   \2/g' | sort -n | uniq -c
Output:
      3 GitVersion.MsBuild      5.10.1
      1 GitVersion.MsBuild      5.10.1
      3 GitVersion.MsBuild      5.10.3
     11 GitVersion.MsBuild      5.11.1
      5 GitVersion.MsBuild      5.11.1
     25 GitVersion.MsBuild      5.12.0
      2 GitVersion.MsBuild      5.12.0
      1 GitVersion.MsBuild      5.6.11
      2 GitVersion.MsBuild      5.7.0
      4 GitVersion.MsBuild      5.8.1
Without uniq -c, the input:
rg --no-line-number --no-filename -g '*.csproj' "GitVersion.MsBuild" | sed -E '/GitVersion\.MsBuild" Version/!d;s/^\s\+//g;//\1   \2/g' | sort -n
Output:
GitVersion.MsBuild      5.10.1
GitVersion.MsBuild      5.10.1
GitVersion.MsBuild      5.10.1
GitVersion.MsBuild      5.10.1
GitVersion.MsBuild      5.10.3
GitVersion.MsBuild      5.10.3
GitVersion.MsBuild      5.10.3
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.11.1
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.12.0
GitVersion.MsBuild      5.6.11
GitVersion.MsBuild      5.7.0
GitVersion.MsBuild      5.7.0
GitVersion.MsBuild      5.8.1
GitVersion.MsBuild      5.8.1
GitVersion.MsBuild      5.8.1
GitVersion.MsBuild      5.8.1
I've updated my command to pipe to xxd, as per @kos's suggestion. That helped with comparing.
rg --no-line-number --no-filename -g '*.csproj' "GitVersion.MsBuild" | sed -E '/GitVersion\.MsBuild" Version/!d;s/^\s\+//g;//\1     \2/g' | sort -n | uniq -c | xxd
That yielded the following (sorry for the screenshot, but it helps having the colors): [screenshot of the xxd dump]. I then revised the regex slightly (sorry all, I didn't take all the suggestions on board, since one tiny tweak made it work, but I have to say I learnt a lot from this, including using xxd). I simply added .* after the >:
rg --no-line-number --no-filename -g '*.csproj' "GitVersion.MsBuild" | sed -E '/GitVersion\.MsBuild" Version/!d;s/^\s\+//g;/.*$/\1  \2/g' | sort | uniq -c
And it now yields the correct (or satisfactory anyway) output:
      4 GitVersion.MsBuild      5.10.1
      3 GitVersion.MsBuild      5.10.3
     16 GitVersion.MsBuild      5.11.1
     27 GitVersion.MsBuild      5.12.0
      1 GitVersion.MsBuild      5.6.11
      2 GitVersion.MsBuild      5.7.0
      4 GitVersion.MsBuild      5.8.1
Thanks team!
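For reference, a self-contained sketch of the root cause: lines that look identical can differ in invisible whitespace (a tab vs. a run of spaces), and uniq -c compares bytes, so it counts them separately. Squeezing blanks to a single space before sorting merges them:
# two visually identical lines, one with a tab and one with spaces
printf 'GitVersion.MsBuild\t5.10.1\nGitVersion.MsBuild  5.10.1\n' | sort | uniq -c
# squeeze runs of blanks first and the two lines collapse into one count
printf 'GitVersion.MsBuild\t5.10.1\nGitVersion.MsBuild  5.10.1\n' \
  | tr -s '[:blank:]' ' ' | sort | uniq -c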
Albert (171 rep)
May 16, 2024, 03:53 AM • Last activity: May 17, 2024, 01:47 AM
-1 vote
2 answers
160 views
How to de-duplicate blocks (timestamp + command) from bash history?
I'm working with a bash_history file containing blocks of the following format: #unixtimestamp\ncommand\n. Here's a sample of the bash_history file:
#1713308636
cat > ./initramfs/init  ./initramfs/init << "EOF"
#!/bin/sh
/bin/sh
EOF
#1713308642
file initramfs/init
#1713308686
cpio -v -t -F init.cpio
#1713308690
ls
As a workaround, I added the delete functionality to this program, but I'm still open to other approaches that use existing commands.
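A hedged awk sketch of such an approach: remember each #timestamp line, and emit the timestamp/command pair only the first time a command is seen. It assumes single-line commands; multi-line entries such as the heredoc above would need to be joined into one record first:
awk '
  /^#[0-9]+$/ { ts = $0; next }        # stash the timestamp line
  !seen[$0]++ { print ts; print }      # print the block only for new commands
' ~/.bash_history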
ReYuki (33 rep)
May 13, 2024, 06:43 AM • Last activity: May 15, 2024, 04:42 PM
62 votes
5 answers
78521 views
How to get only the unique results without having to sort data?
$ cat data.txt
aaaaaa
aaaaaa
cccccc
aaaaaa
aaaaaa
bbbbbb
$ cat data.txt | uniq
aaaaaa
cccccc
aaaaaa
bbbbbb
$ cat data.txt | sort | uniq
aaaaaa
bbbbbb
cccccc
$
The result that I need is to **display all the lines from the original file removing all the duplicates (not just the consecutive ones), while maintaining the original order of statements in the file**. Here, in this example, the result that I was actually looking for was
aaaaaa
cccccc
bbbbbb
How can I perform this generalized uniq operation?
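The classic sketch for this is a one-line awk filter: the associative array seen records every line, and !seen[$0]++ is true only on a line's first occurrence, so duplicates are dropped while the original order is kept:
awk '!seen[$0]++' data.txt
For the sample above this prints aaaaaa, cccccc, bbbbbb.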
Lazer (36085 rep)
Apr 24, 2011, 08:23 PM • Last activity: Jan 28, 2024, 07:06 AM
0 votes
3 answers
68 views
Does a command exist that lists all the directories where a word appears in a file or directory name?
When I don't remember where a file or a folder is, I sometimes use the locate command (which finds more occurrences, allows more candidates, than a find, to my mind; but maybe I'm mistaken). But then there are a lot of responses, of course:
locate clang
/data/sauvegardes/dev/Java/Experimentations/Angular4/bikes/node_modules/blocking-proxy/.clang-format
/data/sauvegardes/dev/Java/Experimentations/Angular4/bikes/node_modules/node-gyp/gyp/tools/Xcode/Specifications/gyp.xclangspec
/data/sauvegardes/dev/Java/Experimentations/Angular6/ng6-proj/node_modules/blocking-proxy/.clang-format
/data/sauvegardes/dev/Java/Experimentations/Angular6/ng6-proj/node_modules/node-gyp/gyp/tools/Xcode/Specifications/gyp.xclangspec
/data/sauvegardes/dev/Java/Experimentations/blog-demo/node/node_modules/npm/node_modules/node-gyp/gyp/tools/Xcode/Specifications/gyp.xclangspec
/data/sauvegardes/dev/Java/Experimentations/blog-demo/node_modules/blocking-proxy/.clang-format
/data/sauvegardes/dev/Java/Experimentations/blog-demo/node_modules/node-gyp/gyp/tools/Xcode/Specifications/gyp.xclangspec
/data/sauvegardes/dev/Java/Experimentations/ol-ext-angular/.metadata/.plugins/ts.eclipse.ide.server.nodejs.embed.win32.win32.x86_64/node-v6.9.4-win-x64/node_modules/npm/node_modules/node-gyp/gyp/tools/Xcode/Specifications/gyp.xclangspec

(201 responses)
I piped this command through dirname, sort and uniq to list only the directories that have the word in their name, or that contain one or more files having it.
locate clang | xargs -L1 dirname | sort | uniq
it works...
/home/lebihan/dev/Java/comptes-france/metier-et-gestion/AdapterInboundWebEtude/etude/node_modules/node-gyp/gyp/tools/Xcode/Specifications
/home/lebihan/dev/Java/comptes-france/metier-et-gestion/AdapterInboundWebEtude/etude/node/node_modules/npm/node_modules/node-gyp/gyp/tools/Xcode/Specifications
/usr/include/boost/align/detail
/usr/include/boost/config/compiler
/usr/include/boost/predef/compiler
/usr/lib/linux-kbuild-6.1/scripts
/usr/lib/llvm-14/lib
/usr/lib/postgresql/14/lib/bitcode/postgres/commands
/usr/lib/x86_64-linux-gnu
/usr/local/go/misc/ios
/usr/local/go/src/debug/dwarf/testdata
/usr/local/go/src/debug/elf/testdata
/usr/local/go/src/debug/macho/testdata
/usr/share/doc
/usr/share/doc/libclang1-14
/usr/share/doc/libclang-cpp14

(108 responses)
But does _Linux_ have a command doing the same, more easily?
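Not a single built-in command, but a sketch that shortens the pipeline: stripping the last path component with sed avoids spawning one dirname process per line, and sort -u replaces the sort | uniq pair:
locate clang | sed 's|/[^/]*$||' | sort -u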
Marc Le Bihan (2353 rep)
Oct 31, 2023, 07:34 AM • Last activity: Oct 31, 2023, 10:20 AM
1 vote
5 answers
485 views
Find and delete partially duplicate lines
https://www.domain.com/files/G5SPNDOF/AAA-1080p.mp4.html
https://www.domain2.com/dl/G5SPNDOF/JHCGTS/AAA-1080p.mp4.html
https://www.domain.com/files/ZQWL80BG/AAA-1080p.mp4.html
https://www.domain.com/files/SVSRS0AD/BBB-1080p.mp4.html
https://www.domain.com/files/UCIONEMA/BBB-1080p.mp4.html
Given a fi...
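The question text is cut off here, but assuming the intent is to keep only the first URL for each distinct final filename (the AAA-1080p.mp4.html part), a hedged awk sketch keyed on the last /-separated field would be:
awk -F/ '!seen[$NF]++' urls.txt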
Bogdan Nicolae Stoian (27 rep)
Oct 11, 2022, 07:27 AM • Last activity: Oct 9, 2023, 03:17 AM
11 votes
1 answer
1030 views
Use uniq to filter adjacent lines in pipeline
I'm trying to monitor theme changes using this command:
dbus-monitor --session "interface='org.freedesktop.portal.Settings', member=SettingChanged" | grep -o "uint32 ."
Output right now looks like this:
uint32 0
uint32 0
uint32 1
uint32 1
uint32 0
uint32 0
uint32 1
uint32 1
This output comes from theme toggling; the theme notification shows up twice for some reason. Now I want to pipe it to uniq so that I'm left with a single entry, like so:
uint32 0
uint32 1
uint32 0
uint32 1
However, appending uniq at the end no longer produces any output.
dbus-monitor --session "interface='org.freedesktop.portal.Settings', member=SettingChanged" | grep -o "uint32 ." | uniq
From man uniq: > Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output). uniq needs to buffer at least the last output line to be able to detect adjacent lines; I don't see any reason why it could not buffer that and pass each line along the pipeline. I've tried tweaking line buffering as suggested here (https://unix.stackexchange.com/questions/295814/uniq-is-not-realtime-when-piped), but the results are still the same for me.
dbus-monitor --session "interface='org.freedesktop.portal.Settings', member=SettingChanged" | grep -o "uint32 ." | stdbuf -oL -i0 uniq
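A sketch of the likely missing piece: it is grep's stdout, not uniq's stdin, that is block-buffered when writing into a pipe, so uniq receives nothing until grep's buffer fills. GNU grep's --line-buffered flag (or stdbuf -oL applied to grep rather than uniq) flushes each match as it is produced:
dbus-monitor --session "interface='org.freedesktop.portal.Settings', member=SettingChanged" \
  | grep --line-buffered -o "uint32 ." \
  | uniq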
Pavel Skipenes (235 rep)
Jun 18, 2023, 11:31 AM • Last activity: Jun 20, 2023, 10:57 PM
11 votes
2 answers
43675 views
sort and uniq in awk
I know there are sort and uniq out there; however, today's question is about how to utilise AWK to do that kind of job. Say I have a list of anything really (IPs, names, or numbers) and I want to sort them. Here is an example where I am taking the IP numbers from a mail log:
awk 'match($0,/\[[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\]/) { if ( NF == 8 && $6 == "connect" ) {print substr($0, RSTART+1,RLENGTH-2)} }' maillog
Is it possible to sort the IPs "on the go", within the same awk command? I do not require a complete answer to my question, just some hints on where to start. Cheers!
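A sketch of one direction: tally the IPs inside awk so duplicates are collapsed on the go, and leave only the final ordering to an external sort (sorting entirely inside awk is possible with gawk's asorti, but that is a GNU extension):
awk 'match($0, /\[[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\]/) {
       if (NF == 8 && $6 == "connect")
         count[substr($0, RSTART+1, RLENGTH-2)]++      # one bucket per IP
     }
     END { for (ip in count) print count[ip], ip }' maillog | sort -rn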
Peter (309 rep)
Mar 30, 2015, 08:10 AM • Last activity: May 22, 2023, 12:01 PM
52 votes
2 answers
116169 views
Common lines between two files
I have the following code that I run in my terminal:
LC_ALL=C && grep -F -f genename2.txt hg38.hgnc.bed > hg38.hgnc.goi.bed
This doesn't give me the common lines between the two files. What am I missing there?
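Two sketched observations: first, LC_ALL=C && grep ... sets a shell variable and then runs grep as a separate command, so grep may never see the variable; LC_ALL=C grep ... (no &&) applies it to grep directly. Second, for whole common lines, comm is often the more direct tool (assuming both files can be sorted; -12 suppresses the lines unique to each file):
comm -12 <(sort genename2.txt) <(sort hg38.hgnc.bed)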
Marwah Soliman (713 rep)
Oct 14, 2017, 06:46 PM • Last activity: May 16, 2023, 07:00 AM
0 votes
0 answers
26 views
numeric sort with unique option does not show 0!
I have a file with many redundant numbers in each row. Imagine something like the below:
echo "10
9
5
6
4
cell
3
2
0
7
0
1" > test
When I use sort -un test I get the following output:
cell
1
2
3
4
5
6
7
9
10
while I expect the below (I mean 0 as a first row of the output):
0
1
2
3
4
5
6
7
9
10
Applying sort -n and then piping to uniq doesn't make such a mess; however, it shows the non-numeric lines. Is there any way to use sort with -nu and get 0 on the first line instead of an alphanumeric token?
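A sketch of what happens: under -n, the line cell and both 0 lines all evaluate to the numeric value 0, so -u considers them equal and keeps only the first one it meets. If the non-numeric tokens can be dropped (as in the expected output above), filtering to purely numeric lines first avoids the collision:
grep -xE '[0-9]+' test | sort -nu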
javadr (131 rep)
Aug 29, 2022, 12:14 PM • Last activity: Aug 29, 2022, 12:21 PM
21 votes
3 answers
25738 views
Uniq won't remove duplicates
I was using the following command:
curl -silent http://api.openstreetmap.org/api/0.6/relation/2919627 http://api.openstreetmap.org/api/0.6/relation/2919628 | grep node | awk '{print $3}' | uniq
when I wondered why uniq wouldn't remove the duplicates. Any idea why?
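A sketch of the likely cause: the two URLs yield two documents one after the other, so the repeated node lines are not adjacent, and uniq only removes adjacent duplicates. Sorting first (or using sort -u) fixes that:
curl -s http://api.openstreetmap.org/api/0.6/relation/2919627 \
        http://api.openstreetmap.org/api/0.6/relation/2919628 \
  | grep node | awk '{print $3}' | sort -u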
Matthieu Riegler (535 rep)
Feb 8, 2014, 02:41 AM • Last activity: Aug 1, 2022, 07:29 PM
0 votes
2 answers
307 views
tar processing files multiple times with find -newer
I'm trying to use tar(1) to create an archive of files newer than a specific file (fileA). However, when I use find(1) to obtain the list of files to pass to tar, some files are listed multiple times:
$ touch fileA
$ mkdir test
$ touch test/{fileB,fileC}
$ tar -c -v $(find test -newer fileA) > test.tar
test/
test/fileC
test/fileB
test/fileC
test/fileB
Using xargs(1) to pass the list of files to tar results in similar behavior:
$ find test -newer fileA | xargs tar -c -v > test.tar
test/
test/fileC
test/fileB
test/fileC
test/fileB
Using sort(1) and uniq(1) to remove duplicates doesn't work either:
$ find test -newer fileA | sort | uniq | xargs tar -c -v > test.tar
test/
test/fileC
test/fileB
test/fileB
test/fileC
Is there a way for tar to include each file newer than fileA only once? **Edit:** I'm specifically looking for a solution that doesn't involve GNU extensions to tar (for example, one which would work with suckless tar).
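A sketch of the cause and a POSIX-find fix: find emits both the directory test/ and the files inside it, so tar archives the directory recursively and then the explicitly named files again. Excluding directories from find's output lists each file once (this assumes file names without whitespace, since xargs splits on it):
find test ! -type d -newer fileA | xargs tar -c -v > test.tar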
Vilinkameni (1639 rep)
Jul 6, 2022, 02:19 PM • Last activity: Jul 6, 2022, 03:00 PM
179 votes
5 answers
293243 views
What is the difference between "sort -u" and "sort | uniq"?
Everywhere I see someone needing to get a sorted, unique list, they always pipe to sort | uniq. I've never seen any examples where someone uses sort -u instead. Why not? What's the difference, and why is it better to use uniq than the unique flag to sort?
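A small sketch of where they coincide and where they differ: for plain de-duplication the two are equivalent, but uniq's extra options (counting, printing only duplicates, etc.) have no sort counterpart:
printf 'b\na\nb\n' | sort -u           # a b
printf 'b\na\nb\n' | sort | uniq       # a b  (same result)
printf 'b\na\nb\n' | sort | uniq -c    # per-line counts; sort -u cannot do this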
Benubird (6082 rep)
May 16, 2013, 11:22 AM • Last activity: May 31, 2022, 04:15 AM
4 votes
1 answer
10296 views
Difference between sort -u and uniq -u
I have always used sort -u to get rid of duplicates until now, but I am having real doubts about a list generated by a software tool. The question is: is the output of sort -u | wc the same as uniq -u | wc? Because they don't yield the same results. The manual for uniq specifies: > -u, --unique only print unique lines My output consists of 1110 words, of which sort -u keeps 1020 lines and uniq -u 1110 lines, the correct amount. The issue is that I cannot visually spot any duplicates in the list (which is generated by using > at the end of the command line), and that there IS an issue with the total of cracked passwords (in the context of customizing John the Ripper).
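A tiny sketch of the difference: sort -u keeps one copy of every line, while uniq -u drops every line that has a duplicate, keeping only lines that occur exactly once; on unsorted input uniq -u also sees no adjacent duplicates at all, which would explain it reporting all 1110 lines:
printf 'a\na\nb\n' | sort -u           # a b   (one copy of each line)
printf 'a\na\nb\n' | sort | uniq -u    # b     (only the never-repeated line)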
Yvain (248 rep)
May 30, 2022, 07:03 PM • Last activity: May 30, 2022, 07:28 PM
0 votes
3 answers
77 views
de-duplicate list but group parts of it
I am compiling some access rules from failed logins and after some piping I arrived at this:
cat <<INPUT | sort -k 3,3 --unique
Deny from 13.42.98.142 # demo
Deny from 13.42.98.142 # test
Deny from 13.42.98.142 # user
Deny from 133.142.200.152 # admin
INPUT
Just out of interest, I would like to keep the tried login names (the last field). My test code would output:
Deny from 13.42.98.142 # demo
Deny from 133.142.200.152 # admin
I am looking for an output like:
Deny from 13.42.98.142 # demo, test, user
Deny from 133.142.200.152 # admin
or even better (because it would be valid .htaccess syntax):
# demo, test, user
Deny from 13.42.98.142
# admin
Deny from 133.142.200.152
**Note**: The input is just how I made it now - I am not stubborn about it and can change it if it suits an elegant solution better. I'll also accept general answers on how grouping in lists can be achieved in shell.
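A hedged awk sketch of the grouping, assuming the lines look exactly like Deny from IP # name (IP in field 3, name in field 5); the output order of the for (k in names) loop is unspecified, so pipe through sort if order matters (rules.txt is a hypothetical input file):
awk '{ key = $1 " " $2 " " $3                  # "Deny from <ip>"
       names[key] = (key in names) ? names[key] ", " $5 : $5 }
     END { for (k in names) print "# " names[k] ORS k }' rules.txt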
Jonas Eberle (513 rep)
May 1, 2022, 11:10 AM • Last activity: May 2, 2022, 06:01 AM