Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0
votes
1
answers
1197
views
Selecting n random files from one directory and copying them to another folder + other files with same name, but different filetype
I have two directories, let's call them **X** and **Y**.
Within them I have 100k+ files: .jpg files in **X** and .txt files in **Y**.
I want to randomly select **N** files from **X** and copy them to folder **Z**.
This should be manageable using find + shuf.
I then want to find all of the files in **Y** that have the same names as the files copied to **Z**, but with the .txt extension, and copy them to directory **W**.
To visualize:
N files from **X** >> **Z**
Same N files from **Y** >> **W**
How would I go about doing that?
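A minimal sketch of one way to do this, assuming Bash and GNU find, shuf and cp. The function name `sample_pairs` and its argument order are made up for illustration; the directories X, Y, Z, W and the count N are passed in as arguments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample_pairs X Y Z W N
# Copies N random .jpg files from X to Z and the same-named .txt files from Y to W.
sample_pairs() {
    local x=$1 y=$2 z=$3 w=$4 n=$5 img base
    mkdir -p -- "$z" "$w"
    # NUL-delimited names survive spaces and newlines in file names.
    find "$x" -maxdepth 1 -type f -name '*.jpg' -print0 |
        shuf -z -n "$n" |
        while IFS= read -r -d '' img; do
            base=$(basename -- "$img" .jpg)
            cp -- "$img" "$z/"
            # Copy the matching .txt only if it actually exists in Y.
            if [ -f "$y/$base.txt" ]; then
                cp -- "$y/$base.txt" "$w/"
            fi
        done
}
```

Called as `sample_pairs X Y Z W 1000`, this picks the sample once and reuses each picked name for the .txt lookup, so the two target folders stay in step.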
Martin Pedersen
(3 rep)
Aug 2, 2022, 06:54 AM
• Last activity: Aug 2, 2022, 08:55 AM
4
votes
2
answers
4307
views
How can I add a new line after the output of a command?
I'm working in Bash using this nested shuf command to get an alphanumeric string of variable length:
shuf -erz -n $(shuf -e -n 1 {0..5}) {A..Z} {a..z} {0..9}
**Current output:**
This prints the alphanumeric string with no trailing newline, so the shell prompt follows it on the same line.
MyPrompt:$ shuf -erz -n $(shuf -e -n 1 {0..5}) {A..Z} {a..z} {0..9}
OcxjrMyPrompt:$
**Desired output:**
I'd like to have the output on its own line.
MyPrompt:$ shuf -erz -n $(shuf -e -n 1 {0..5}) {A..Z} {a..z} {0..9}
Ocxjr
MyPrompt:$
Is there a way to alter the shuf command or append another standard Bash command after the shuf command to get the desired output? Other approaches are welcome too, but I'd like to make the change easy to add/remove.
I tried searching for previous answers, but I'm pretty new to Bash and I can't seem to identify the right terms to lead to helpful results. Likewise, I've experimented with my own approaches (mostly using echo and piping), but haven't had success there either.
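One possible approach, sketched under the assumption of Bash and GNU shuf: with `-z`, shuf separates items with NUL bytes rather than newlines, so the sketch strips those and prints a single newline afterwards. The function name `random_string` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the nested shuf command from the question.
random_string() {
    local len
    len=$(shuf -e -n 1 {0..5})                 # random length between 0 and 5
    # -z makes shuf emit NUL bytes between items; drop them.
    shuf -erz -n "$len" {A..Z} {a..z} {0..9} | tr -d '\0'
    printf '\n'                                # finish the line
}
```

Appending `; echo` to the original command should also add the newline, and is easy to remove again, though the NUL bytes from `-z` then stay in the output.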
LoopedLine
(43 rep)
Feb 14, 2022, 04:25 PM
• Last activity: Feb 14, 2022, 08:18 PM
0
votes
5
answers
200
views
Can a shell script find all groups of consecutive lines matching the same regex and shuffle them?
I'm writing quizzes for my students in a markdown language. One of the quizzes
might look like this:
% QUESTION
Who played drums for The Beatles?
(X) Ringo
( ) John
( ) Paul
( ) George
% QUESTION
What is the first line of MOBY DICK?
(X) Call me Ishmael.
( ) foo
( ) bar
( ) spam
( ) eggs
I'd like to randomize all of these multiple choice options. So, I think I need a
shell script that:
1) Finds all blocks of consecutive lines that start with (X) or ( ).
2) Shuffles each of these blocks of lines.
Is this possible? I know that `shuf` and `sort -R` will randomize the lines of any text, but I'm not sure how to go about isolating these blocks of options.
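A sketch of one way to isolate and shuffle the option blocks, using awk instead of shuf (a swapped-in technique, not something the question prescribes). It buffers consecutive lines starting with `(X)` or `( )` and prints them in Fisher-Yates shuffled order when the block ends. The function name `shuffle_options` and the optional seed argument are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads a quiz on stdin, writes it with each option block shuffled.
# Optional first argument: a seed for awk's random number generator.
shuffle_options() {
    awk -v seed="${1:-$RANDOM}" '
        function emit(   i, j, tmp) {
            # Fisher-Yates shuffle of the buffered option lines.
            for (i = n; i > 1; i--) {
                j = int(rand() * i) + 1
                tmp = buf[i]; buf[i] = buf[j]; buf[j] = tmp
            }
            for (i = 1; i <= n; i++) print buf[i]
            n = 0
        }
        BEGIN { srand(seed) }
        /^\((X| )\)/ { buf[++n] = $0; next }   # option line: buffer it
        { emit(); print }                      # any other line ends the block
        END { emit() }                         # flush a block at end of file
    '
}
```

Usage would be something like `shuffle_options < quiz.md > shuffled.md`, where the file names are placeholders.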
Brian Fitzpatrick
(2907 rep)
Jan 6, 2021, 08:24 AM
• Last activity: Jan 7, 2021, 03:46 AM
0
votes
2
answers
416
views
How to sample without replacement from a script that randomly extracts 200 characters using shuf?
I have this script that extracts 200 random characters from a set:
#!/usr/bin/bash
n=$(stat -c "%s" newfile.txt)
r=$(shuf -i1-"$((n-200+1))" -n1)
<newfile.txt tail -c+"$r" | head -c200 >output.txt
I know `shuf` is very powerful, but I want to sample without replacement. That is, each 200-character extraction should have only one chance of being selected when sampling.
Output should look like this:
>1
GAACTCTACCAAAAGGTATGTTGCTTTCACAAAAAGCTGCATTCGATCATGTGTATAATCTAGCAAAACTAGTAGGAGGAGCAAAATACCCCGAAATTGTTGCTGCTCAGGCAATGCACGAATCAAACTACCTAGATCCTAGG
ACTAATAGTGTTTATAATGCCACAAATAGAACTAATGCTTTCGGTCAAACTGGTGAC
>2
GCCTACCGCATAAAACAGCATCACCGCCACGGCTTCAGGGTATTCTCCAATGGCAAAGGCTCCCATGGTCGCGATGGACATTAAGAGAAATTCAGTAAAGAAATCTCCATTTAGAATACTTTTGAATCCTTCTTTTATCACCG
GAAAACCAACTGGGAGATAGGCCACAATGTACCAACCTACTCGCACCCAATCTGTAA
>3
GCACGTGTCACCGTCAGCATCGCGGCAGCGGAACGGGTCACCCGGATTGCTGTCGGGACCATCGTTTACGCCGTCATTGTCGTTATCGGGATCGCCCGGATTACAAATGCCGTCGCCATCGACGTCGTTACCGTCGTTCGCGG
CATCGGGGAAGCCGGCACCGGCGGCACAGTCATCGCAACCGTCGCCATCGGCATCGA
>4
GCGTTCGAAGCAATTGCACGAGACCCAAACAACGAATTGCTGGTTGTTGAACTGGAAAACTCTCTAGTCGGAATGCTTCAAATTACTTATATTCCCTACCTGACACATATTGGCAGTTGGCGTTGTCTTATAGAAGGTGTTCG
AATCCATAGTGACTATCGTGGACGAGGTTTTGGTGAGCAAATGTTCGCACATGCGAT
>5
GTTTAAGACTAACAGCAATCTGTAAGGACATAGGTGCTGGAGTTGAGGTTAGTCTGGAAGATATGATCTGGGCAGAGAAATTGTCCAAAGCAAACACCGCAGCAAGAGGTATGCTAAACACAGCAAGAAGAATAAGTAATGAT
CCTACTGATTCTTTTCTGAATGAGTTGAATATAGGAGACCCCGACTCAACTCATCAT
The input file is a ~8G file that looks like this:
CCAAGATCGCTGGTTGGCGAATCAATTTCATAAACGCCTACGCTTTCAAGGAACGTGTTAAGAATGTTCT
GGCCGAGTTCCTTATGAGACGTTTCGCGTCCCTTAAATCGAATAACGACACGAACCTTGTCGCCGTCATT
AAGAAAACCCTTTGCCTTCTTGGCCTTAATCTGAATATCACGGGTGTCCGTTACAGGTCGCAACTGGATT
TCCTTGACTTCAGAAACAGACTTACGTGAATTCTTCTTGATTTCTTTCTGACGCTTTTCATTTTCATACT
GGAACTTGCCGTAATCAATGATCTTACAAACAGGAATATCACCCTTATCAGAGATCAATACCAAATCAAG
TTCGGCATCAAAAGCGCGATCAAGTGCGTCTTCAATGTCGAGGACCGTTGTTTCTTCACCGTCAACCAAA
CGAATTGTGGAGGACTTGATGTCGTCTCGGGTACTAATTTTATTCACGTATATGTTACTCCTTATGTTGT
Any help would be appreciated. Thanks in advance.
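A sketch of one reading of "without replacement", assuming GNU coreutils: the newline-free file is treated as consecutive, non-overlapping 200-byte blocks, and `shuf -n` without `-r` picks distinct block numbers, so no block can be drawn twice. The function name `sample_blocks` and its parameters are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample_blocks FILE K [LEN]
# Prints K distinct LEN-byte blocks of FILE as numbered FASTA-style records.
sample_blocks() {
    local file=$1 k=$2 len=${3:-200} size blocks b i=0
    size=$(stat -c '%s' "$file")
    blocks=$(( size / len ))          # number of whole blocks in the file
    # Without -r, shuf never repeats a value, so block numbers are unique.
    shuf -i 1-"$blocks" -n "$k" |
        while read -r b; do
            i=$(( i + 1 ))
            printf '>%d\n' "$i"
            tail -c +"$(( (b - 1) * len + 1 ))" "$file" | head -c "$len"
            printf '\n'
        done
}
```

For example, `sample_blocks newfile.txt 5 > output.txt` would write five records. If overlapping windows are acceptable, drawing distinct start offsets from `1-"$((n-200+1))"` with a single `shuf -n K` call follows the same pattern.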
GSQ
(37 rep)
Dec 19, 2020, 03:23 PM
• Last activity: Dec 25, 2020, 11:12 AM
0
votes
1
answers
337
views
How to replace Shuf with rand (from C++) with a seed of time in order to make my script more random
I have this script that extracts 200 random characters from a set:
tail -n+2 file.fasta | tr -d '\n' > newfile
n=$(stat -c "%s" newfile)
r=$(shuf -i1-"$((n-200+1))" -n1)
<newfile tail -c+"$r" | head -c200
Does anyone know if it is possible to replace shuf with rand() seeded with the time (srand(time(0)))? I tried to change my script without any success.
Any suggestions? Thanks in advance.
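A shell-only sketch of the same idea, assuming a POSIX awk: calling `srand()` with no argument seeds awk's generator from the current time, which is close in spirit to `srand(time(0))` in C++. The function name `random_offset` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: random_offset N [LEN]
# Prints a start offset between 1 and N-LEN+1 using awk's time-seeded rand().
random_offset() {
    awk -v n="$1" -v len="${2:-200}" 'BEGIN {
        srand()                                  # no argument: seed from the time of day
        print int(rand() * (n - len + 1)) + 1
    }'
}
```

It could stand in for the shuf line as `r=$(random_offset "$n")`. Note that in most awk implementations two calls within the same second get the same seed and so the same offset, and the resolution of `rand()` may be too coarse to reach every offset in a multi-gigabyte file, so this is not obviously "more random" than shuf.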
GSQ
(37 rep)
Dec 10, 2020, 04:02 PM
• Last activity: Dec 10, 2020, 06:48 PM
2
votes
2
answers
1525
views
Does the size of the random_source file matter?
Some GNU coreutils utilities like `sort` and `shuf` use a file as what effectively serves as a seed. Does the size of the file matter?
The recommended way, https://www.gnu.org/software/coreutils/manual/html_node/Random-sources.html, uses an openssl-based method that takes a rather long time.
What if I just used a 6-letter word as below? Does this affect the ability of said utilities to create pseudo-randomness?
shuf -i1-10 --random-source=<(echo durian)
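A small sketch for experimenting with this, assuming Bash and GNU coreutils. As far as I understand, shuf reads only as many bytes from `--random-source` as it needs and aborts with an end-of-file error if the source runs dry, so a six-byte source may be enough for a tiny range and fail for a larger one. The hypothetical helper `seeded_shuf` stretches a short word into a longer byte stream by repeating it; this gives reproducible output, but repeated text is a low-quality source of randomness.

```shell
#!/usr/bin/env bash
# Hypothetical helper: seeded_shuf SEED [shuf arguments...]
# Feeds shuf 64 KiB of the seed word repeated, so the source does not run out
# for small inputs. Reproducible, but NOT a good source of randomness.
seeded_shuf() {
    local seed=$1
    shift
    shuf --random-source=<(yes "$seed" | head -c 65536) "$@"
}
```

Running `seeded_shuf durian -i1-10` twice should print the same permutation both times, because shuf sees identical bytes.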
flow2k
(651 rep)
Jan 26, 2019, 12:58 AM
• Last activity: Dec 5, 2020, 12:36 PM
0
votes
1
answers
121
views
Write all numbers between 0 and large number (both inclusive) to a file in random order
For a simulation I am trying to run, I want a text file with the numbers from 0 to 2^33, which is a huge range. I have used this command:
seq 0 Number >> OUTPUT FILE
But this is very slow. The file is nearly 94 GB, so we can't use `shuf`. Then I used [terashuf](https://github.com/alexandres/terashuf) by Alexandres, which also takes quite a lot of time. Even though I have done what I wanted to do, I would like to know whether there is a faster way to do this in a single command, and whether there is any way to truly randomize the order of these numbers.
**NOTE:** Even though I have been using Linux for quite a long time, I have very limited knowledge of bash scripting. So please try to give answers which a beginner can understand.
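For smaller ranges, a single-command sketch assuming GNU shuf: its `-i LO-HI` option generates the range itself, so neither `seq` nor an intermediate file is needed. The wrapper name `random_range` is hypothetical. shuf still has to hold the whole permutation in memory, so at 2^33 entries memory is likely to be the limiting factor and an external-memory tool may remain necessary.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: random_range MAX
# Prints every integer from 0 to MAX (inclusive) exactly once, in random order.
random_range() {
    shuf -i 0-"$1"
}
```

For example, `random_range 1000000 > numbers.txt` writes about a million lines, where numbers.txt is a placeholder name.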
Uday
(101 rep)
May 2, 2020, 01:26 PM
• Last activity: May 2, 2020, 01:48 PM
2
votes
0
answers
266
views
shuffle two parallel text files not giving same lines even if same random source
This is similar to https://unix.stackexchange.com/questions/220390/shuffle-two-parallel-text-files
I have:
- two large csv files with parallel lines. (they represent 'before' and 'after' states for particular items). The fields are sometimes strings, sometimes numbers.
- a sufficiently long random data file to use with `shuf`
When I want to get a matching random sample, I thought of:
shuf -n10 --random-source="random.csv" "file1"
shuf -n10 --random-source="random.csv" "file2"
but these files no longer match.
However, if I put line-numbers in front, it solves the problem:
shuf -n10 --random-source="random.csv" <(cat -n "file1")
shuf -n10 --random-source="random.csv" <(cat -n "file2")
Can someone explain why?
Here is a sample of random.csv:
0.293076138
0.446732207
0.552989654
0.16141527
0.099383023
...
Here is a snippet from the two files:
VA,DEFAULT,72.8027,11.9534.....
VA,DEFAULT,61.8356,11.9342....
VA,DEFAULT,61.8356,....
Note that the first two fields are identical in most of the rows in both files. Maybe this is the issue? I don't know `shuf` well enough.
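A sketch of an alternative that sidesteps the question of how shuf consumes its random source: glue the two files together line by line, shuffle the combined lines once, then split them apart again. It assumes the data contains no tab characters (paste and cut use tab as the separator here), and the function name `sample_parallel` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample_parallel FILE1 FILE2 N OUT1 OUT2
# Writes N matching random lines of FILE1 and FILE2 to OUT1 and OUT2.
sample_parallel() {
    local tmp
    tmp=$(mktemp) || return
    # Each combined line is "<line of FILE1><TAB><line of FILE2>".
    paste -- "$1" "$2" | shuf -n "$3" > "$tmp"
    cut -f1 -- "$tmp" > "$4"
    cut -f2 -- "$tmp" > "$5"
    rm -f -- "$tmp"
}
```

Because each pair travels through shuf as one line, the two outputs cannot fall out of step.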
Tim
(237 rep)
Dec 4, 2019, 11:56 PM
• Last activity: Dec 5, 2019, 09:43 PM
4
votes
4
answers
15193
views
How to pick a random file from a folder without repetition using bash?
I can select a random file using this command
find ./ -type f | shuf -n 1
But it sometimes shows the same file.
Is it possible to stop picking duplicate files?
Is there any other utility for this task?
I have around 50k txt files in a folder, which may have nested subfolders. I want to pick a random file to view and never see it again. New files are also added to the folder every day.
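A sketch of one way to avoid repeats, assuming GNU find, grep and shuf: keep a log of files already shown and filter those out before picking. The function name `pick_unseen` and the log file are hypothetical, file names are assumed not to contain newlines, and the log should live outside the searched folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick_unseen DIR SEEN_LOG
# Prints one random file under DIR that is not yet listed in SEEN_LOG, then records it.
pick_unseen() {
    local dir=$1 seen=$2 pick
    touch -- "$seen"
    # -F fixed strings, -x whole-line match, -v invert: drop paths already logged.
    pick=$(find "$dir" -type f | grep -vxFf "$seen" | shuf -n 1)
    [ -n "$pick" ] || return 1        # nothing left to show
    printf '%s\n' "$pick" | tee -a -- "$seen"
}
```

Since the candidate list is rebuilt on every call, files added later are picked up automatically. The paths in the log only match if the function is always called with the same DIR spelling.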
Akhil
(1370 rep)
Nov 22, 2019, 07:24 PM
• Last activity: Nov 25, 2019, 06:18 AM
0
votes
2
answers
618
views
Piping Output of Shuf Command
Fairly new to Linux. I have a nixie clock that runs off of a Raspberry Pi. I would like to send a random sequence of six digits to it every so often to help prolong the life of the Nixie tubes.
There is a CLITool program I found on GitHub that lets me display any six digits using the command `CLITool xxxxxx`, where x is any digit 0-9. So I tried creating a bash file with the line `shuf -zer -n6 {0..9}|CLITool`.
The `shuf` command produces a random six-digit string, but it does not seem to get piped to CLITool. Like I mentioned, I'm fairly new to Linux, so it could be something basic I am missing.
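A sketch of what might be going on, assuming CLITool reads the digits from its command-line argument (as `CLITool xxxxxx` suggests) rather than from standard input: a pipe feeds stdin, so the digits need to go through command substitution instead. The function name `six_digits` is hypothetical, and the CLITool call is left commented out because its behaviour is only known from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints six random digits on one line, repeats allowed.
six_digits() {
    # -r allows repeated digits; tr joins the six output lines into one string.
    shuf -r -n 6 -i 0-9 | tr -d '\n'
}

# Pass the digits as an argument instead of piping them:
# CLITool "$(six_digits)"
```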
Joey29
(9 rep)
Nov 2, 2019, 04:44 PM
• Last activity: Nov 2, 2019, 05:08 PM
3
votes
1
answers
536
views
What does shuf -e mean in bash
I have a string saved as
test="test1 test2
test3 test4
test5 test6"
and
echo $(shuf -e $test)
It gives me the same output as $test. Why? I expected the original string in a different order.
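A sketch of one possible explanation, assuming Bash with the default IFS: `shuf -e` shuffles its arguments, so what matters is how many arguments it receives. If the variable ends up quoted, the whole string is a single argument and there is nothing to reorder.

```shell
#!/usr/bin/env bash
test="test1 test2
test3 test4
test5 test6"

# Quoted: shuf receives ONE argument, so the output equals the input.
shuf -e "$test"

# Unquoted: word splitting yields SIX arguments, printed one per line in random order.
shuf -e $test
```

With six items, the unquoted form can also come out in the original order purely by chance (1 in 720), and wrapping it in `echo $( ... )` joins the shuffled words back onto one line.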
Tiger
(367 rep)
Sep 11, 2019, 05:37 PM
• Last activity: Sep 11, 2019, 06:15 PM
0
votes
0
answers
178
views
Is this segmentation fault raised when running `shuf`?
I have a script, where there is a line:
eval for i in \{"$1".."$2"\}\; do [ ! -e "$3"/\$i.\* ] \&\& echo \"\$i\" \; done \| shuf \| mycommand "$3"
which means: first create a sequence of numbers for which no files named after the numbers exist, pipe them to `shuf`, and then pipe them to `mycommand`, which is an ELF executable.
Most of the time the script runs fine, but sometimes it fails with a segmentation fault, and the fault is not reproducible.
$ myscript 0001 734 XMJ
/home/tim/bin/myscript: line 25: 10170 Exit 1 for i in {0001..734};
do
[ ! -e XMJ/$i.* ] && echo "$i";
done
10171 Done | shuf
10172 Segmentation fault (core dumped) | mycommand XMJ
Does that mean that the segmentation fault is raised when running `shuf`?
What can we deduce from the error message, and how might the problem be corrected?
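In the message above, each pipeline stage is listed with its own status: the loop shows `Exit 1`, `shuf` shows `Done`, and `Segmentation fault (core dumped)` sits on the `mycommand XMJ` line. A sketch of how to check this programmatically, assuming Bash: the `PIPESTATUS` array holds one exit status per stage, and a stage killed by SIGSEGV normally reports 139 (128 + 11) on Linux. The last stage below is a stand-in that kills itself to mimic a crashing consumer; it is not the real `mycommand`.

```shell
#!/usr/bin/env bash
# Stand-in pipeline: the final stage reads its input, then dies from SIGSEGV.
# "ulimit -c 0" only suppresses a core file for this demonstration.
printf '%s\n' 1 2 3 | shuf | sh -c 'ulimit -c 0; cat >/dev/null; kill -SEGV $$'

# Copy PIPESTATUS immediately; the next command overwrites it.
statuses=("${PIPESTATUS[@]}")
printf 'producer=%s shuf=%s consumer=%s\n' "${statuses[@]}"
```

Here the middle status belongs to shuf, so a zero there alongside 139 for the last stage points at the consumer rather than at shuf.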
Thanks.
Tim
(106440 rep)
Oct 29, 2018, 09:36 PM
• Last activity: Oct 29, 2018, 10:36 PM
3
votes
1
answers
2670
views
Moving random files using shuf and mv - Argument list too long
I have a directory containing nearly 250K files, and I want to move x random files to another directory.
I searched and found the solution of using the `shuf` and `mv` commands from here and here, so basically I am using this command:
$ shuf -n 5533 -e trainB/* | xargs -i mv {} testB/
But I'm receiving this error:
bash: /usr/bin/shuf: Argument list too long
I believe this is because of the large number of files, which makes the argument list too long. Is there another way to do this?
I'm running on SLES12 SP2.
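A sketch of one way around the limit, assuming GNU find, shuf, xargs and mv: instead of expanding `trainB/*` on shuf's command line, the file names are streamed through a pipe, and xargs splits them into batches that fit. The function name `move_random` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move_random SRC DST N
# Moves N random regular files from SRC (non-recursive) into DST.
move_random() {
    local src=$1 dst=$2 n=$3
    mkdir -p -- "$dst"
    # Names travel NUL-delimited through the pipe, so no argument list is built
    # until xargs batches them for mv.
    find "$src" -maxdepth 1 -type f -print0 |
        shuf -z -n "$n" |
        xargs -0 -r mv -t "$dst" --
}
```

For the case in the question this would be called as `move_random trainB testB 5533`.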
Mostafa Hussein
(273 rep)
Sep 3, 2018, 04:07 PM
• Last activity: Sep 3, 2018, 04:37 PM
0
votes
1
answers
945
views
How to select random sample of n lines from each file in a directory
I have a directory with many files. From each of these files I want a random sample and copy to a new directory with file names same as from which random sample was drawn.
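A minimal sketch, assuming GNU shuf and that "sample" means n random lines per file: loop over the files and write each sample to a file of the same name in the new directory. The function name `sample_each` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample_each SRC_DIR DST_DIR N
# For every regular file in SRC_DIR, writes N random lines to DST_DIR/<same name>.
sample_each() {
    local src=$1 dst=$2 n=$3 f
    mkdir -p -- "$dst"
    for f in "$src"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and unmatched globs
        shuf -n "$n" -- "$f" > "$dst/$(basename -- "$f")"
    done
}
```

Files with fewer than N lines are simply copied in shuffled order, since `shuf -n` outputs at most N lines.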
Abhishek Gupta
(1 rep)
Apr 17, 2018, 04:11 PM
• Last activity: Apr 17, 2018, 04:46 PM
2
votes
2
answers
464
views
Randomly select a line in every block of N lines
I'd like to randomly select a line after a given number of lines. For example here's my input:
8 blue
8 red
8 yellow
8 orange
3 pink
3 white
3 cyan
3 purple
1 magenta
1 black
1 green
1 brown
and with one line randomly selected from every four lines, my output would be:
8 orange
3 pink
1 green
The best I've come up with is:
awk '!(NR%4){a=NR+4};NR<=a|"shuf -n 1"'
but it doesn't work.
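A sketch that does the selection inside awk rather than piping to shuf (a swapped-in technique): within each block, the k-th line replaces the current pick with probability 1/k, which leaves every line of the block equally likely. The function name `pick_per_block` and the optional seed argument are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical filter: pick_per_block [BLOCK_SIZE] [SEED]
# Reads lines on stdin and prints one randomly chosen line per block.
pick_per_block() {
    awk -v size="${1:-4}" -v seed="${2:-$RANDOM}" '
        BEGIN { srand(seed) }
        {
            k = (NR - 1) % size + 1                 # position within the current block
            if (int(rand() * k) == 0) pick = $0     # keep this line with probability 1/k
            if (k == size) print pick               # block complete: emit the pick
        }
        END { if (NR % size) print pick }           # trailing partial block, if any
    '
}
```

With the sample input, `pick_per_block 4 < input.txt` prints three lines, one from each group of four; input.txt is a placeholder name.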
mtherk16
(21 rep)
Nov 17, 2017, 12:32 PM
• Last activity: Nov 17, 2017, 03:37 PM
Showing page 1 of 15 total questions