
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

2 votes
1 answer
1989 views
ZSH - PATH Duplication : Directory added at end of PATH keeps duplicating when re-opening Terminal Session
I have recently installed pipx on a Mac running Big Sur with the zsh shell. During the install it prompted for the following to be added to the .zshrc file:

# Created by pipx on 2021-03-20 14:22:23
export PATH="$PATH:/Users/xxxx/.local/bin"
eval "$(register-python-argcomplete pipx)"

Running echo $PATH showed /Users/xxxx/.local/bin added to the end of my PATH variable. However, when I close the terminal and open a new session, echo $PATH shows the location duplicated at the end of PATH: :/Users/xxxx/.local/bin:/Users/xxxx/.local/bin. Opening and closing further terminal sessions doesn't create any more additions to PATH; it just remains at these two entries. I have run typeset -U PATH path to remove the duplicate, but each time I open a new terminal session it duplicates again. Does anybody know how I can stop this from happening? I would really like to keep my PATH variable as clean as possible.
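One possible way to avoid this kind of duplication (a sketch of the general technique, not necessarily the accepted answer) is to make the PATH entry in .zshrc idempotent and let zsh keep the variable unique; the directory is the one from the question:

# In ~/.zshrc
typeset -U PATH path                              # zsh: keep PATH entries unique
if [[ ":$PATH:" != *":$HOME/.local/bin:"* ]]; then
    export PATH="$PATH:$HOME/.local/bin"          # append only if not already present
fi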
KB_cov (29 rep)
Mar 21, 2021, 03:27 PM • Last activity: Apr 25, 2025, 01:03 AM
0 votes
1 answer
400 views
iptables duplicate port traffic
I want to clone/duplicate all UDP traffic incoming on port 8500 to port 8600. It is important that the source address is not modified. Both ports must also be accessible by applications (the packets must still arrive on the original port). This solution (https://unix.stackexchange.com/questions/704887/nftables-duplicate-udp-packets-for-specific-destination-ipport-to-a-second-d) works on a newer system; unfortunately, the machine in question is running kernel 3.10 on RHEL 7 and I am not allowed to update it.
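For reference, the classic packet-duplication primitive in iptables (available as the xt_TEE module on kernels of that era) looks like the line below; it clones packets toward a gateway address (192.0.2.10 is a placeholder), and redirecting only the clone to a different local port is the part this question is really about:

# Clone every incoming UDP packet for port 8500 and route the copy to another host;
# the original packet continues to port 8500 with its source address untouched.
iptables -t mangle -A PREROUTING -p udp --dport 8500 -j TEE --gateway 192.0.2.10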
mirokai (43 rep)
Apr 17, 2024, 10:34 AM • Last activity: Apr 17, 2024, 04:34 PM
3 votes
2 answers
2147 views
Duplicate stdin to stdout and stderr, but in a synchronized way
I need to duplicate the stdout of a producer and feed it to two consumers in a **synchronized** fashion:

                           consumer 1
producer  |  duplicator  |
                           consumer 2

This can easily be accomplished, for example, via tee:
((cat f.txt | tee /dev/stderr | ./cons1.py >&3) 2>&1 | ./cons2.py) 3>&1
or via named pipes:
mkfifo fifo1 fifo2
cat f.txt | tee fifo1 fifo2 >/dev/null &

or with a small C duplicator, dup.c, that copies each line of stdin to both stdout and stderr:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = NULL;
    size_t size = 0;
    /* copy every line of stdin to both stdout and stderr */
    while (getline(&line, &size, stdin) != -1) {
        fprintf(stdout, "%s", line);
        fprintf(stderr, "%s", line);
    }
    free(line);
    return 0;
}
and then:
((cat f.txt | ./dup | ./cons1.py >&3) 2>&1 | ./cons2.py) 3>&1
However, **if consumer 1 is faster than consumer 2 we have a problem**. E.g., consumer 1 is already at line 50,000 while consumer 2 is at line 17,000. For my system **I need both consumers to be at the same line, hence the faster consumer needs to be restricted**. I know that this might be impossible with standard Linux tools. However, at least with the dup.c approach it should somehow be possible. Any suggestions on how to accomplish this? Thanks!
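One way to reduce the skew (a sketch under the assumption that the consumers read from named pipes; the fifo names are illustrative) is to feed both consumers line by line from a single loop, so the producer can never run further ahead than one pipe buffer per consumer:

mkfifo fifo1 fifo2
./cons1.py <fifo1 &
./cons2.py <fifo2 &
exec 3>fifo1 4>fifo2           # open each fifo exactly once
while IFS= read -r line; do
    printf '%s\n' "$line" >&3  # blocks when consumer 1's pipe buffer is full
    printf '%s\n' "$line" >&4  # blocks when consumer 2's pipe buffer is full
done <f.txt
exec 3>&- 4>&-

This bounds the difference to roughly one pipe buffer (about 64 KiB) per consumer rather than a single line; true line-level lockstep would additionally require an acknowledgment from each consumer before the next write.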
m33x (33 rep)
Sep 6, 2015, 12:06 PM • Last activity: Dec 14, 2023, 06:31 PM
0 votes
0 answers
25 views
Linux home dir has a "duplicate" under /home/music but when anything is deleted from one it disappears from both -- how can I delete 2nd home dir?
I don't know where the 2nd "phantom" home dir came from. The folder's Properties dialog shows it's 4K. The "real" home directory has 128,000+ files and 208GB. User10489 had the answer: it was a symlinked directory. In fact, using the ls -l he suggested, I found a second directory that was also a symlink. I used rm to remove both. Thank you.
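For anyone hitting the same thing, a minimal sketch of the check and the fix (/home/music is the path from the question title; the link target shown is illustrative):

ls -ld /home/music      # a symlink shows up as: lrwxrwxrwx ... /home/music -> /home/realuser
rm /home/music          # removes only the symlink itself (no -r, no trailing slash)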
R E Brinson (1 rep)
Dec 2, 2023, 03:06 AM • Last activity: Dec 2, 2023, 09:59 PM
8 votes
7 answers
19353 views
How to find and delete duplicate files within the same directory?
I want to find duplicate files within a directory and then delete all but one to reclaim space. How do I achieve this using a shell script? For example:

pwd
folder

Files in it are:

log.bkp
log
extract.bkp
extract

I need to compare log.bkp with all the other files, and if a duplicate file is found (by its content), I need to delete it. Similarly, the file 'log' has to be checked against all the other files that follow, and so on. So far I have written this, but it's not giving the desired result:

#!/usr/bin/env ksh
count=`ls -ltrh /folder | grep '^-' | wc -l`
for i in /folder/*
do
    for (( j=i+1; j<=count; j++ ))
    do
        echo "Current two files are $i and $j"
        sdiff -s $i $j
        if [ echo $? -eq 0 ]
        then
            echo "Contents of $i and $j are same"
        fi
    done
done
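A common alternative approach (a sketch, not the asker's script) is to checksum every file and delete later files whose checksum has already been seen; it assumes filenames without spaces or newlines:

cd /folder || exit 1
md5sum ./* 2>/dev/null | sort | awk 'seen[$1]++ { print $2 }' | xargs -r rm --

Here awk keeps the first file for each checksum and prints the paths of the rest, which xargs removes.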
Su_scriptingbee (319 rep)
May 28, 2017, 02:41 PM • Last activity: Oct 24, 2023, 04:15 PM
0 votes
3 answers
34 views
Filter duplicates based on values of another column
I have the following example of a data frame, where elements of the 3rd column can be duplicated. I want to keep the entry that has the highest value in column 5. This means that for **AGCCCGGGG** I want to keep the second entry, whose 5th column has the value 49.

A00643:620:HFM7YDSX5:1:1124:7120:12352 ATCAGCCCGGGGCTTGGGCTAGGAC GGGTGTGTG 548476 0 Corynebacterium
A00643:620:HFM7YDSX5:1:1150:15953:12524 CCTATCGTCGCTGGAATTCCCCGGG AGCCCGGGG 1458266 1 Bordetella
A00643:620:HFM7YDSX5:1:1150:15628:12743 CCTATCGTCGCTGGAATTCCCCGGG AGCCCGGGG 1458266 49 Bordetella
A00643:620:HFM7YDSX5:1:1450:4001:4507 GGCGATCGAAATGTCAAGCCCGGGG TCTTGTGGT 585529 0 Corynebacterium
A00643:620:HFM7YDSX5:1:2124:8865:2472 ATCAGCCCGGGGCTTGGGCTAGGAC GGGTGTGTG 548476 0 Corynebacterium
A00643:620:HFM7YDSX5:1:2476:4001:29496 ATTCACCCTATAGGAGCCCGGGGCA TGCCCCGGG 1458266 0 Bordetella
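A sketch of one way to do this with standard tools ("file" is a placeholder for the whitespace-separated data file): sort by column 3 and then by column 5 in descending numeric order, and keep only the first line seen for each column-3 value:

sort -k3,3 -k5,5nr file | awk '!seen[$3]++'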
Anna Antonatou -Pappaioannou (1 rep)
May 15, 2023, 10:32 AM • Last activity: Oct 20, 2023, 09:45 AM
0 votes
3 answers
106 views
Need assistance with awk/sed to identify/mark duplicate IP addresses
Good day. I have a text file which contains pod/node names and associated IPv6 addresses of which two pods have the same IP address, first pod **k8-worker0001c-cif-9d86d6dd4-vf9b9** and last pod **k8-worker0001c-ctdc-5bc95b699f-xnmrn**, the IP address being **2001:1890:e00f:3900::6**
$ cat /tmp/dup_ip.txt
k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global 

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global 

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global 

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global 

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global 

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global 

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global 

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global
All I need is a one-liner to identify the duplicate IP address while retaining everything else, including the pod names. I have tried awk's **!seen** idiom, but that deletes the duplicate, which I don't want. Therefore I'd like something like this:
$ cat /tmp/dup_ip.txt
k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global          DUPLICATE!

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global 

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global 

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global 

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global 

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global 

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global 

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global         DUPLICATE!
Thanks in advance, Bjoern
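A sketch of one way to do this with awk over two passes of the file (it strips the prefix length before comparing and tags every occurrence of an address that appears more than once):

awk 'NR==FNR { if ($1 ~ /\/[0-9]+$/) { ip=$1; sub(/\/.*/,"",ip); n[ip]++ }; next }
     $1 ~ /\/[0-9]+$/ { ip=$1; sub(/\/.*/,"",ip); if (n[ip] > 1) $0 = $0 "\tDUPLICATE!" } 1' /tmp/dup_ip.txt /tmp/dup_ip.txt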
Bjoern (59 rep)
May 8, 2023, 07:16 PM • Last activity: May 9, 2023, 07:54 PM
0 votes
2 answers
146 views
awk is automatically duplicating some lines. Can someone explain?
My data looks like:

A 4 G 1 G 1
C 4 C 2 C 2
T 6 T 5 T 5
A 6 T 2 T 2
C 6 T 2 T 2 T 6 G 2 G 2

I am trying the command:

awk -F " " '$1==$3 {$7=$6; print $0;} $1==$5 {$7=$4; print $0;} ($1 != $3 && $1 != $5) {$7=$2; print $0}' test.txt

While the data has only 5 lines, the output has 7 lines and certain lines are randomly duplicated. Somehow it happens only with this dataset and not the other datasets that I have. Can someone please help? I don't understand what is happening.
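The duplication happens because the three pattern-action pairs are evaluated independently, so a line that satisfies two of the conditions (for example one where $1, $3 and $5 are all equal) is printed once per matching block. A sketch of one way to make the branches mutually exclusive:

awk '{ if ($1 == $3)      $7 = $6
       else if ($1 == $5) $7 = $4
       else               $7 = $2
       print }' test.txt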
user563991 (9 rep)
Mar 3, 2023, 03:28 PM • Last activity: Mar 3, 2023, 03:50 PM
1 vote
1 answer
1171 views
Find duplicate IPs for different MACs
Using arp-scan to get a list of returned duplicate IP addresses. However, arp-scan will list a duplicate IP with the same MAC address. I get a sorted output in asx.txt (shortened for brevity):

~~~
arp-scan 172.16.0.0/16 > as.txt
sort as.txt > as2.txt
cat as2.txt | uniq -D -w 36 > asx.txt

kye-mgmt02:/data # cat asx.txt
172.16.150.68 d8:cb:8a:b0:6a:12 Micro-Star INTL CO., LTD.
172.16.150.68 d8:cb:8a:b0:6a:12 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.69 00:23:24:9e:3d:32 G-PRO COMPUTER
172.16.150.69 00:23:24:9e:3d:32 G-PRO COMPUTER (DUP: 2)
172.16.150.70 00:23:24:9e:3d:82 G-PRO COMPUTER
172.16.150.70 00:23:24:9e:3d:82 G-PRO COMPUTER (DUP: 2)
172.16.150.71 d8:cb:8a:86:2f:56 Micro-Star INTL CO., LTD.
172.16.150.71 d8:cb:8a:86:2f:56 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.72 d8:cb:8a:cf:f1:e8 Micro-Star INTL CO., LTD.
172.16.150.72 d8:cb:8a:cf:f1:e8 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.73 d8:cb:8a:cf:f1:5d Micro-Star INTL CO., LTD.
172.16.150.73 d8:cb:8a:cf:f1:5d Micro-Star INTL CO., LTD. (DUP: 2)
~~~

So as you can see, these IPs are not really duplicated, because each IP has the same MAC address. To really find a duplicate IP with a different MAC, I edited the file and changed the MAC of the last IP:

~~~
kye-mgmt02:/data # cat asx.txt
172.16.150.68 d8:cb:8a:b0:6a:12 Micro-Star INTL CO., LTD.
172.16.150.68 d8:cb:8a:b0:6a:12 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.69 00:23:24:9e:3d:32 G-PRO COMPUTER
172.16.150.69 00:23:24:9e:3d:32 G-PRO COMPUTER (DUP: 2)
172.16.150.70 00:23:24:9e:3d:82 G-PRO COMPUTER
172.16.150.70 00:23:24:9e:3d:82 G-PRO COMPUTER (DUP: 2)
172.16.150.71 d8:cb:8a:86:2f:56 Micro-Star INTL CO., LTD.
172.16.150.71 d8:cb:8a:86:2f:56 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.72 d8:cb:8a:cf:f1:e8 Micro-Star INTL CO., LTD.
172.16.150.72 d8:cb:8a:cf:f1:e8 Micro-Star INTL CO., LTD. (DUP: 2)
172.16.150.73 d8:cb:8a:cf:f1:5d Micro-Star INTL CO., LTD.
172.16.150.73 d8:cb:8a:cf:f1:55 Micro-Star INTL CO., LTD. (DUP: 2)
~~~

I am looking for how to output the duplicate IPs with different MACs. Expected output:

~~~
172.16.150.73 d8:cb:8a:cf:f1:5d Micro-Star INTL CO., LTD.
172.16.150.73 d8:cb:8a:cf:f1:55 Micro-Star INTL CO., LTD. (DUP: 2)
~~~

I can't seem to find the right options to output the duplicate IPs with different MACs. Help please.

---

I tried:

cat asx.txt | uniq -D -s 15 -w 33
cat asx.txt | uniq -D -s 15 -w 17-33
cat asx.txt | uniq -D -f1 -w 33
cat asx.txt | uniq -D -f1 -w 32
cat asx.txt | uniq -D -f1 -w 31
cat asx.txt | uniq -D -f1 -w 30
cat asx.txt | uniq -D -f1
cat asx.txt | uniq -D -s 15

But none gives the desired output.
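A sketch of one way to get that output with awk, reading the file twice (the first pass counts the distinct MACs per IP, the second pass prints every line whose IP has more than one):

awk 'NR==FNR { if (!seen[$1, $2]++) n[$1]++; next } n[$1] > 1' asx.txt asx.txt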
user552826
Dec 12, 2022, 03:58 PM • Last activity: Dec 12, 2022, 06:18 PM
0 votes
0 answers
189 views
Removing duplicate values based on two columns
I have a file in which I would like to filter duplicate values based on columns 1 and 6:

ID,sample,NAME,reference,app_name,appession_id,workflow,execution_status,status,date_created
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022

The final output should look like:

ID,sample,NAME,reference,app_name,appession_id,workflow,execution_status,status,date_created
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022

So far this is what I have tried:

awk '!a[$1 $6]++ { print ;}' input.csv > output.csv

I end up with:

ID,sample,NAME,reference,app_name,appession_id,workflow,execution_status,status,date_created
1,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022
2,ABC,XYZ,DOP,2022-08-18 13:31:09Z,28997974,same,Complete,PASS,18/08/2022

Any suggestion would be helpful. Thank you.
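A sketch of the usual fix: the data is comma-separated, so the field separator has to be set, and joining the two key fields with the separator avoids accidental key collisions:

awk -F, '!seen[$1 FS $6]++' input.csv > output.csv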
nbn (113 rep)
Oct 14, 2022, 03:59 PM • Last activity: Oct 17, 2022, 07:58 AM
1 vote
1 answer
347 views
How to sort a list of strings that contain a combination of letters and numbers
I want to sort the following strings by their number and remove duplicates in a file:
cat311
celine434
celine434
celine5
jimmy12
john44
john41
to be
celine5
jimmy12
john41
john44
cat311
celine434
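A sketch of one way to do this with standard tools ("file" is a placeholder): split the trailing number into its own field, de-duplicate, sort numerically on that field, then rejoin:

sed -E 's/([0-9]+)$/ \1/' file | sort -u | sort -k2,2n | awk '{ print $1 $2 }'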
user8090410 (11 rep)
Sep 22, 2022, 06:28 PM • Last activity: Sep 22, 2022, 06:46 PM
2 votes
2 answers
1210 views
Find duplicate files based on first few characters of filename
I am looking for a way, in a Linux shell and preferably bash, to find duplicates of files based on the first few letters of the filenames.

**Where this would be useful:** I build mod packs for Minecraft. As of 1.14.4, Forge no longer errors if there are duplicate mods of different versions in a pack. It simply stops the oldest versions from running. A script to help find these duplicates would be very advantageous.

Example listing:
minecolonies-0.13.312-beta-universal.jar   
minecolonies-0.13.386-alpha-universal.jar
By quickly being able to identify the dupes I can keep the client pack small.

**More information as requested:** There is no specific format. However, as you can see, there are at least 2 prevailing formats. Further, there is no standard in the community about what kind of characters to use or not use. Some use spaces (ick), some use [] (also ick), some use _'s (more ick), some use -'s (preferred, but what can you do).

https://gist.github.com/be3cc9a77150194476b2000cb8ee16e5 has a sample mods list of filenames. It has been cleaned, so there are no dupes in it.

https://gist.github.com/b0ac1e03145e893e880da45cf08ebd7a contains a sample where I deliberately made duplicates. It is an exaggeration of what happens from time to time.

**Deeper Explanation:** I realize this might be resource heavy to do. I would like to arbitrarily specify a slice range, start to finish, of all filenames to sample, find duplicates based on that slice, and then highlight the duplicates. I don't need the script to actually delete them.

**Extra Credit:** The script would present a menu for files that it suspects match the duplication criterion, allowing for easy deleting or renaming.
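A rough sketch of one heuristic (not the asker's slice-based criterion): group jar names by everything before the first digit and flag groups with more than one member; it assumes filenames without newlines:

ls *.jar | awk '{ key = tolower($0); sub(/[0-9].*/, "", key); files[key] = files[key] $0 "\n"; n[key]++ }
                END { for (k in files) if (n[k] > 1) printf "Possible duplicates:\n%s\n", files[k] }'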
Kreezxil (75 rep)
Oct 29, 2020, 04:43 PM • Last activity: Aug 17, 2022, 09:45 AM
17 votes
12 answers
41079 views
Remove all duplicate word from string using shell script
I have a string like "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc" and I want to remove the duplicate words from it, so the output would be "aaa,bbb,ccc". I tried this code (Source):

$ echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs

It works fine with that sample value, but when I give it my variable's value it still shows all the duplicate words. How can I remove the duplicate values?

**UPDATE**

My question is about adding all corresponding values into a single string if the user is the same. I have data like this:

user name | colour
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green

In my code I fetch all distinct users, then I concatenate the colour string successfully. For that I am using this code:

while read the records
    if [ "$c" == "" ]; then    # $c is defined globally
        c="$colour1"
    else
        c="$c,$colour1"
    fi

When I print this $c variable I get the output (for user AAA):

"red,black,blue,red,green,red,black,blue,red,green,"

I want to remove the duplicate colours, so the desired output should be:

"red,black,blue,green"

For this desired output I used the code above (echo ... | xargs -n1 | sort -u | xargs), but it displays the output with the duplicate values, like:

"red,black,blue,red,green,red,black,blue,red,green,"

Thanks
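The sort -u pipeline fails here because the values are comma-separated rather than whitespace-separated (and sort also loses the original order). A sketch of one way to de-duplicate a comma-separated list while preserving the order of first appearance:

c="red,black,blue,red,green,red,black,blue,red,green,"
printf '%s\n' "$c" | tr ',' '\n' | awk 'NF && !seen[$0]++' | paste -sd, -

This prints red,black,blue,green.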
Urvashi (343 rep)
Mar 23, 2017, 12:41 PM • Last activity: Aug 3, 2022, 11:27 PM
10 votes
6 answers
7548 views
How to delete all duplicate hardlinks to a file?
I've got a directory tree created by rsnapshot, which contains multiple snapshots of the same directory structure with all identical files replaced by hardlinks. I would like to delete all those hardlink duplicates and keep only a single copy of every file (so I can later move all files into a sorted archive without having to touch identical files twice). Is there a tool that does that? So far I've only found tools that find duplicates and *create* hardlinks to replace them… I guess I could list all files and their inode numbers and implement the deduplicating and deleting myself, but I don't want to reinvent the wheel here.
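A sketch of the inode-based approach mentioned in the question (GNU find/xargs assumed; "snapshots/" is a placeholder for the rsnapshot root): list every regular file with its inode number, keep the first path seen per inode, and delete the remaining hard links:

find snapshots/ -type f -printf '%i %p\n' |
    awk 'seen[$1]++ { sub(/^[0-9]+ /, ""); print }' |
    xargs -r -d '\n' rm --

Paths containing newlines would break this; spaces are fine because xargs splits on newlines only.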
n.st (8378 rep)
May 31, 2016, 02:21 PM • Last activity: May 19, 2022, 09:25 AM
5 votes
3 answers
6385 views
Find and list duplicate directories
I have a directory that has a number of sub-directories and would like to find any duplicates. The folder structure looks something like this:

└── Top_Dir
    └── Level_1_Dir
        ├── standard_cat
        │   ├── files.txt
        ├── standard_dog
        │   └── files.txt
        └── standard_snake
            └── files.txt
    └── Level_2_Dir
        ├── standard_moon
        │   ├── files.txt
        ├── standard_sun
        │   └── files.txt
        └── standard_cat
            └── files.txt
    └── Level_3_Dir
        ├── standard_man
        │   ├── files.txt
        ├── standard_woman
        │   └── files.txt
        └── standard_moon
            └── files.txt

With the above example I would like to see an output of:

/top_dir/Level_1_Dir/standard_cat
/top_dir/Level_2_Dir/standard_cat
/top_dir/Level_2_Dir/standard_moon
/top_dir/Level_3_Dir/standard_moon

I have been doing some searching on how to get this done via bash and have come up with nothing. Does anyone know a way to do this?
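A sketch comparing only directory names, which is what the sample output suggests (GNU find assumed; comparing directory contents instead would need checksums): list each directory with its basename, sort by basename, and print every group whose name occurs more than once:

find Top_Dir -type d -printf '%f\t%p\n' | sort |
    awk -F'\t' '$1 == prev { if (!printed) print prevpath; print $2; printed = 1; next }
                { prev = $1; prevpath = $2; printed = 0 }'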
dino (51 rep)
Jun 9, 2016, 03:21 AM • Last activity: Apr 21, 2022, 12:04 AM
0 votes
0 answers
64 views
Delete duplicated content from files
I have many backups of the same file. Is there a way to transform them into an incremental backup? The files aren't exactly the same (timestamps sometimes differ, and new data is appended here and there). I can't just search for duplicate files, and I can't simply delete the old files in favour of the new one, because sometimes an old file holds data that no longer exists in the newer ones. I want a way to delete duplicated content from the files, so that the data across all the files is unique. Ideally that would be a merge, because if I just delete chunks of data the file may become unopenable (sometimes the duplicated parts are formatting data). The problem is that I don't know whether new data arrives strictly as whole lines or sometimes within an existing line, so it isn't just a matter of duplicate lines; sometimes only part of a line is duplicated. Do you have any ideas?
aac (145 rep)
Mar 3, 2022, 08:31 AM
-1 votes
1 answer
124 views
Skip line from console if equal to the line before, adding count (in real time)
Using uniq it is possible to filter out sequential duplicate lines:

while (true) do echo 1; echo 2; echo 2; echo 1; sleep 1; done | uniq

becomes:

1
2
1

Is there a way to have duplicated sequential lines removed while adding the number of repetitions? E.g., in the example above:

1
2 (2)
1

And if a new "1" line arrives, the above should become:

1
2 (2)
1 (2)

This is not for a file but for a stream (such as tail -f), where new lines are being added in real time.
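uniq -c prints repetition counts, but only once a run has ended and with the count as a prefix. A sketch of an awk approximation for a live stream (logfile is a placeholder; each run is printed when the next distinct line arrives, since updating the count of the current line in place would need terminal control):

tail -f logfile | awk '
    $0 != prev { if (NR > 1) { print prev (n > 1 ? " (" n ")" : ""); fflush() }
                 prev = $0; n = 0 }
    { n++ }'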
Jose G&#243;mez (101 rep)
Jan 25, 2022, 01:49 PM • Last activity: Jan 25, 2022, 06:33 PM
-1 votes
1 answer
1912 views
remove duplicate lines across multiple txt files
I have 12 text files, all in one folder, each with about 5 million lines. No file has duplicate lines on its own, but there are duplicates across multiple files. I want to remove the duplicate lines from each file but still save the files separately. I have tried many Linux sort commands, but they keep merging the files together. I have Windows, Linux, and Mac. Is there any code or application that can do this?
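A sketch of one way to do this with awk (run from the folder; it keeps the first occurrence of every line across the files in the order they are given, and writes each filtered file next to the original with a .dedup suffix, a name chosen here for illustration):

awk '!seen[$0]++ { print > (FILENAME ".dedup") }' *.txt

Note that the seen array holds every distinct line in memory, which may be significant for 12 files of 5 million lines each.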
Surprise Awofemi (41 rep)
Jan 14, 2022, 05:52 PM • Last activity: Jan 15, 2022, 06:24 AM
1 vote
2 answers
981 views
Remove duplicates of specific line keeping only the first appearance of each without touching other unspecified duplicates
I'm trying to edit a text file containing several duplicates. The goal is to keep only the first match of a string and remove the remaining duplicate lines of the same string. In the example file
* Title 1
** Subtitle 01
#+begin_src
  Line 001
  Line 002
#+end_src

* Title 1
** Subtitle 02
#+begin_src
  Line 001
  Line 002
#+end_src

* Title 2
** Subtitle 01
#+begin_src
  Line 001
  Line 002
#+end_src

* Title 2
** Subtitle 02
#+begin_src
  Line 001
  Line 002
#+end_src
I'd like to keep one of each
* Title N
and *keep all other unrelated/unspecified duplicate lines* in the file. So the result would be:
* Title 1
** Subtitle 01
#+begin_src
  Line 001
  Line 002
#+end_src

** Subtitle 02
#+begin_src
  Line 001
  Line 002
#+end_src

* Title 2
** Subtitle 01
#+begin_src
  Line 001
  Line 002
#+end_src

** Subtitle 02
#+begin_src
  Line 001
  Line 002
#+end_src
The traditional solutions for removing duplicates like
uniq file.txt
[Useful AWK One-Liners to Keep Handy](https://linoxide.com/useful-awk-one-liners-to-keep-handy/) :
awk '!a[$0]++' contents.txt
[shell - How to delete duplicate lines in a file without sorting it in Unix - Stack Overflow](https://stackoverflow.com/questions/1444406/how-to-delete-duplicate-lines-in-a-file-without-sorting-it-in-unix/32513573#32513573)
perl -ne 'print if ! $x{$_}++' file
delete every duplicate indiscriminately. I tried using variations of these solutions, and also GNU grep and sed
in a loop format like
duplicateLines=$(grep -E "^\* .*" file.org | uniq)
printf '%s\n' "$duplicateLines" | while read -r line; do
    sed "s/$line//g2" file.org
done
with no success. Absolute performance is not a concern, so doing multiple iterations, like calling sed
inside a loop to remove one specified string at a time, is no problem. Any insight would be very much appreciated. It would be nice to be able to do this inside a shell script, but I'm open to alternative solutions like Python, C, Java, etc.; just tell me what the function/library name is and I'll search for it. Thanks.
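A sketch of an awk approach that restricts the de-duplication test to the heading lines only (assuming that all the lines to de-duplicate, and only those, start with "* "):

awk '!/^\* / || !seen[$0]++' file.org

Lines that don't start with "* " are always printed; a "* Title N" line is printed only the first time it appears.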
yeyin33455 (13 rep)
Dec 30, 2021, 01:40 AM • Last activity: Jan 2, 2022, 12:44 AM
0 votes
0 answers
179 views
I have a `raspivid` stream, which I'm piping to `ffmpeg`; now I'd like to also stream a raw version of it to a socket?
I have a process outputting an MJPEG video stream, which I pipe into ffmpeg to reduce the framerate and then to a socket.
raspivid -t 999999 -cd MJPEG -w 1920 -h 1080 -o - | ffmpeg -i - -f mjpeg -r 2 - | nc -l 9010
Now I also need to split the original raw stream to another socket. I've tried the tee command, including with named fifos, but I can't seem to make it work.
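A sketch using bash process substitution (the second port, 9011, is a hypothetical listener for the raw copy):

raspivid -t 999999 -cd MJPEG -w 1920 -h 1080 -o - \
  | tee >(nc -l 9011) \
  | ffmpeg -i - -f mjpeg -r 2 - \
  | nc -l 9010

Here tee forwards the raw stream to ffmpeg as before while writing a copy into the process substitution, where a second nc serves it.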
Ivan Koshelev (131 rep)
Feb 7, 2021, 12:06 AM