Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
32
votes
8
answers
14362
views
Simultaneously calculate multiple digests (md5, sha256)?
Under the assumption that disk I/O and free RAM are the bottleneck (while CPU time is not the limitation), does a tool exist that can calculate multiple message digests at once?
I am particularly interested in calculating the MD-5 and SHA-256 digests of large files (size in gigabytes), preferably in parallel. I have tried `openssl dgst -sha256 -md5`, but it only calculates the hash using one algorithm.
Pseudo-code for the expected behavior:
for each block:
    for each algorithm:
        hash_state[algorithm].update(block)
for each algorithm:
    print algorithm, hash_state[algorithm].final_hash()
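For reference, a minimal bash sketch of this pattern using `tee` and process substitution (not a dedicated tool; `bigfile.iso` is a placeholder name): the file is read once and the stream is duplicated into both digest programs, which run in parallel.

```
# Read the data once; tee copies the stream into each digest process.
tee < bigfile.iso >(md5sum) >(sha256sum) > /dev/null
```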
Lekensteyn
(21600 rep)
Oct 23, 2014, 10:00 AM
• Last activity: Aug 4, 2025, 06:51 AM
2
votes
3
answers
2809
views
Parallelize recursive deletion with find
I want to recursively delete all files that end with `.in`. This is taking a long time, and I have many cores available, so I would like to parallelize this process. From this thread, it looks like it's possible to use `xargs` or `make` to parallelize `find`. Is this application of `find` possible to parallelize?
Here is my current serial command:
find . -name "*.in" -type f -delete
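For comparison, a hedged sketch of a parallel variant using `xargs -P` (deletion is often I/O-bound, so this may or may not help in practice):

```
# Feed NUL-separated names to up to $(nproc) concurrent rm processes,
# 100 files per invocation.
find . -name '*.in' -type f -print0 | xargs -0 -P "$(nproc)" -n 100 rm -f
```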
kilojoules
(169 rep)
Mar 6, 2017, 04:57 PM
• Last activity: Jul 26, 2025, 03:16 PM
18
votes
6
answers
27026
views
Is there parallel wget? Something like fping but only for downloading?
I've found only puf (Parallel URL fetcher), but I couldn't get it to read URLs from a file; something like `puf < urls.txt` does not work either.
The operating system installed on the server is Ubuntu.
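A sketch of one common workaround with stock tools, assuming `urls.txt` holds one URL per line:

```
# Run up to 8 wget processes at a time, one URL each.
xargs -a urls.txt -P 8 -n 1 wget -nv
```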
Moonwalker
(333 rep)
Apr 7, 2012, 04:18 PM
• Last activity: Jul 24, 2025, 12:51 PM
0
votes
3
answers
1773
views
One-liner to run commands from a file in parallel using xargs
I have a script like this:
#!/bin/csh
command 1 \
-f \"input1\" \
-l input2 -other_swithes1
command 2 \
-f \"input1\" \
-m input2 \
-l input3 -other_swithes1
command 3 \
-f \"input1\" \
-l input2 -other_swithes2
So, any ideas for a one-liner with `xargs` to run these commands in parallel? I tried various variants but all failed. I do not really want to write a script; I think it should be possible with the `-d` and `-c` switches, though I'm not sure.
To simplify and extend the problem further, what I have is
cat file | grep -v "#.*" | sed -z 's/[\]\n/ /g' | xargs -I {} -n1 -P10 sh -c '{}'
and while this does the job, there is a particular problem: the `\"` get removed. So, any clue how to solve that?
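A hedged sketch of one way around the quote stripping: GNU xargs performs quote and backslash processing by default, but `-d '\n'` turns that off, so the escaped quotes survive (here `script.csh` stands in for your file):

```
# Drop comment lines, join backslash-continued lines, then run each
# remaining line as its own command, 10 at a time. With -d '\n' xargs
# takes every character literally, so \" is passed through untouched.
grep -v '^#' script.csh | sed -z 's/\\\n/ /g' | xargs -d '\n' -P 10 -I{} sh -c '{}'
```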
inman
(9 rep)
Aug 23, 2021, 12:31 PM
• Last activity: Jun 30, 2025, 01:48 PM
1
votes
1
answers
432
views
`pzstd` = parallelized Zstandard, how to watch progress in 4TB large file/disk?
I am brand new to `zstd`/`pzstd`, trying out its features, compression, benchmarking it, and so on. (I run Linux Mint 22 Cinnamon.) This computer has 32 GB RAM.
The basic command appears to be working, but I found out it's not fully multi-threaded / parallelized:
# zstd --ultra --adapt=min=5,max=22 --long --auto-threads=logical --progress --keep --force --verbose /dev/nvme0n1 -o /path/to/disk/image/file.zst
As you can see, I am trying to compress my **NVMe 4TB drive**, which holds only Timeshift snapshots on its ext4 filesystem. If you can recommend any tweaks to my `zstd` command, I would welcome them.
But the real question here is: how do I make it multi-threaded / parallelized?
***
I am trying this:
# pzstd --ultra -22 --processes 8 --keep --force --verbose /dev/nvme0n1 -o /path/to/disk/image/file.zst
However, this parallel version of zstd apparently does not have the `--progress` option, so I need to find another way to watch the progress. 4 TB will take some time and I don't intend to be totally blind.
My attempts with `pv` did not work properly. Please help; I'd appreciate it. Thanks.
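A hedged sketch of one alternative: let `pv` meter the read side and let `zstd` itself use every core with `-T0`, which sidesteps pzstd's missing `--progress` entirely (`blockdev` supplies the device size so pv can show a percentage and ETA):

```
# Progress bar from pv, multi-threaded compression from zstd -T0.
pv -s "$(blockdev --getsize64 /dev/nvme0n1)" /dev/nvme0n1 \
  | zstd -T0 --long -19 --force -o /path/to/disk/image/file.zst
```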
Vlastimil Burián
(30505 rep)
Oct 10, 2024, 09:22 AM
• Last activity: Jun 18, 2025, 06:54 AM
7
votes
8
answers
18898
views
Remove numbers from the start of filenames
I have a problem modifying the file names in my `Music/` directory.
I have a list of names like these:
$ ls
01 American Idiot.mp3
01 Articolo 31 - Domani Smetto.mp3
01 Bohemian rapsody.mp3
01 Eye of the Tiger.mp3
04 Halo.mp3
04 Indietro.mp3
04 You Can't Hurry Love.mp3
05 Beautiful girls.mp3
16 Apologize.mp3
16 Christmas Is All Around.mp3
Adam's song.mp3
A far l'amore comincia tu.mp3
All By My Self.MP3
Always.mp3
Angel.mp3
And similar; I would like to cut all the numbers at the front of the filenames (but not the 3 in the extension).
I first tried to `grep` only the files starting with a number, using `find -exec` or `xargs`, but even at this first step I had no success. After getting the `grep` right, I'd like to do the actual renaming.
This is what I have tried so far:
ls > try-expression
grep -E '^[0-9]+' try-expression
and with the above I got the right result. Then I tried the next step:
ls | xargs -0 grep -E '^[0-9]+'
ls | xargs -d '\n' grep -E '^[0-9]+'
find . -name '[0-9]+' -exec grep -E '^[0-9]+' {} \;
ls | parallel bash -c "grep -E '^[0-9]+'" - {}
And similar, but I got errors like 'File name too long' or no output at all. I guess the problem is the way I'm using `xargs` or `find`, since the expressions work well as separate commands.
Thank you for your help.
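A hedged sketch of the rename step itself, once the matching is sorted out; a plain glob plus `sed` avoids the `ls`/`xargs` quoting trouble entirely:

```
# For every file that starts with digits, strip the leading digits and spaces.
for f in [0-9]*; do
  mv -i -- "$f" "$(printf '%s\n' "$f" | sed -E 's/^[0-9]+ *//')"
done
```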
Luigi Tiburzi
(887 rep)
May 29, 2012, 08:07 AM
• Last activity: May 14, 2025, 11:19 AM
3
votes
1
answers
870
views
How do you fetch a large file over http in parallel?
**Question:**
Since HTTP supports resuming at an offset, are there any tools (or existing options for commands like wget or curl) that will launch multiple threads to fetch the file in parallel with multiple requests at different file offsets? This could help with performance if each socket is throttled separately.
I could write a program to do this, but I'm wondering if the tooling already exists.
**Background:**
Recently I wanted to download a large iso, **but!** ... Somewhere between the server and my internet provider the transfer rate was limited to 100 kilobit! However, I noticed that the first 5 to 10 seconds had great throughput, hundreds of megabits. So I wrote a small bash script to restart after a few seconds:
while ! timeout 8 wget -c http://example.com/bigfile.iso ; do true; done
(I hope it was not my provider . . . But maybe it was. Someone please bring back net neutrality!)
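For what it's worth, a sketch with `aria2c`, which does exactly this segmented, multi-connection downloading (assuming it is installed; the URL is the placeholder from above):

```
# 8 connections to the server, the file split into 8 ranges fetched in parallel;
# -c resumes a partial download.
aria2c -x 8 -s 8 -c http://example.com/bigfile.iso
```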
KJ7LNW
(525 rep)
Feb 3, 2023, 01:28 AM
• Last activity: May 9, 2025, 08:03 PM
1
votes
2
answers
2996
views
Curl Parallel requests using links source file
I have this script to go through a list of URLs and check the return codes using curl.
Links file goes like this:
https://link1/ ...
https://link2/ ...
https://link200/ ...
(...)
The script:
INDEX=0
DIR="$(grep [WorkingDir file] | cut -d \" -f 2)"
WORKDIR="${DIR}/base"
ARQLINK="navbase.txt"
for URL in $(cat $WORKDIR/$ARQLINK); do
INDEX=$((INDEX + 1))
HTTP_CODE=$(curl -m 5 -k -o /dev/null --silent --head --write-out '%{http_code}\n' $URL)
if [ $HTTP_CODE -eq 200 ]; then
printf "\n%.3d => OK! - $URL" $INDEX;
else
printf "\n\n%.3d => FAIL! - $URL\n" $INDEX;
fi
done
It takes a little while to run through every URL, so I was wondering how to speed up those curl requests.
Maybe I could use some parallel curl requests, but using `xargs` inside a `for` loop while also printing a message doesn't seem the way to go.
I was able to use `xargs` outside the script and it sort of works, although it doesn't show the correct HTTP code.
cat navbase.txt | xargs -I % -P 10 curl -m 5 -k -o /dev/null --silent --head --write-out '%{http_code}\n' %
I couldn't find a way to insert that into the script.
Any tips?
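A hedged sketch of a likely fix: the `%` chosen as the `-I` placeholder collides with curl's `%{http_code}` write-out variable, so xargs rewrites the format string before curl ever sees it. Using `{}` as the placeholder and printing the URL next to the code avoids that (same links file as above, one URL per line):

```
# 10 parallel probes; each prints "<http_code> <url>".
xargs -P 10 -n 1 -I{} \
  curl -m 5 -k -o /dev/null --silent --head --write-out '%{http_code} {}\n' {} \
  < "$WORKDIR/$ARQLINK"
```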
markfree
(425 rep)
Jan 4, 2022, 02:05 PM
• Last activity: Mar 19, 2025, 11:10 AM
86
votes
6
answers
114673
views
How to determine the maximum number to pass to make -j option?
I want to compile as fast as possible. Go figure. And I would like to automate the choice of the number following the `-j` option. How can I programmatically choose that value, e.g. in a shell script?
Is the output of `nproc` equivalent to the number of threads I have available to compile with?
make -j1
make -j16
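A minimal sketch of the usual scripted form: `nproc` reports the number of online logical CPUs, which is what make can keep busy, and `-l` adds a load cap so the machine stays responsive:

```
# One job per logical CPU, backing off if the load average exceeds that count.
make -j"$(nproc)" -l"$(nproc)"
```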
tarabyte
(4506 rep)
Jun 9, 2015, 09:36 PM
• Last activity: Mar 8, 2025, 10:07 AM
0
votes
1
answers
1926
views
slurm: srun & sbatch different performance with the same settings
On a Slurm system, when I use the **srun** command to run the program, it runs very slowly and it seems like only one processor works.
srun --pty -A free -J test -N 1 -n 1 -c 1 mpirun -np 16 $FEAPHOME8_3/parfeap/feap -log_summary lu.log
But if I write an **sbatch** script, it runs very quickly and it looks like all the processors work.
#!/bin/sh -l
#SBATCH --job-name=test
#SBATCH --account=free
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=1
#SBATCH --exclusive
#SBATCH --time=6:00:00
echo ' '
echo ' ****** START OF MAIN-JOB ******'
date
srun -n 16 echo y | mpirun -np 16 $FEAPHOME8_3/parfeap/feap -log_summary lu.log
echo ' ****** END OF MAIN-JOB ******'
#End of script
Could anybody please tell me what's going on?
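A hedged guess at the cause, not a confirmed diagnosis: the interactive srun asks for one task with one CPU (`-n 1 -c 1`), so mpirun has only a single core in its allocation. A sketch of the interactive form with the allocation sized to match the MPI job:

```
# Request 16 tasks up front so mpirun actually has 16 CPUs to use.
srun --pty -A free -J test -N 1 -n 16 -c 1 \
    mpirun -np 16 $FEAPHOME8_3/parfeap/feap -log_summary lu.log
```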
Rilin Shen
(33 rep)
Sep 18, 2017, 01:29 AM
• Last activity: Mar 3, 2025, 05:06 PM
2
votes
1
answers
2788
views
Job Control: How to save output of background job in a variable
Using Bash in OSX.
My script has these 2 lines:
nfiles=$(rsync -auvh --stats --delete --progress --log-file="$SourceRoot/""CopyLog1.txt" "$SourceTx" "$Dest1Tx" | tee /dev/stderr | awk '/files transferred/{print $NF}') &
nfiles2=$(rsync -auvh --stats --delete --progress --log-file="$SourceRoot/""CopyLog2.txt" "$SourceTx" "$Dest2Tx" | tee /dev/stderr | awk '/files transferred/{print $NF}')
When I use the `&` after the first line (to run the two rsync commands in parallel), my later call to `$nfiles` returns nothing.
Code:
osascript -e 'display notification "'$nfiles' files transferred to MASTER," & return & "'$nfiles2' transferred to BACKUP," & return & "Log Files Created" with title "Copy Complete"'
Can't figure out what's going on. I need the 2 rsyncs to run simultaneously.
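A hedged sketch of one workaround: a backgrounded command substitution runs in a subshell, so the assignment never reaches the parent shell. Writing each count to a temporary file and reading both after `wait` sidesteps that (rsync options trimmed for brevity):

```
tmp1=$(mktemp "${TMPDIR:-/tmp}/nfiles1.XXXXXX")
tmp2=$(mktemp "${TMPDIR:-/tmp}/nfiles2.XXXXXX")
rsync -auvh --stats --delete "$SourceTx" "$Dest1Tx" \
  | awk '/files transferred/{print $NF}' > "$tmp1" &
rsync -auvh --stats --delete "$SourceTx" "$Dest2Tx" \
  | awk '/files transferred/{print $NF}' > "$tmp2" &
wait                               # block until both rsyncs finish
nfiles=$(<"$tmp1"); nfiles2=$(<"$tmp2")
rm -f "$tmp1" "$tmp2"
```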
user192259
(51 rep)
Sep 30, 2016, 12:48 AM
• Last activity: Mar 1, 2025, 02:01 PM
3
votes
2
answers
4923
views
Run GNU Octave script on multiple cores
I am computing Monte-Carlo simulations using GNU Octave 4.0.0 on my 4-core PC. The simulation takes almost 4 hours to compute the script for 50,000 times (specific to my problem), which is a lot of time spent for computation. I was wondering if there is a way to run Octave on multiple cores simultaneously to reduce the time of computations.
Thanks in advance.
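A hedged sketch of the usual embarrassingly-parallel approach with GNU parallel, assuming the simulation can be wrapped in a hypothetical Octave function `mc_chunk(n)` that runs 12,500 of the 50,000 iterations and saves its own results file:

```
# Four Octave processes, one chunk each; merge the saved results afterwards.
parallel -j 4 'octave -q --eval "mc_chunk({})"' ::: 1 2 3 4
```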
North
(31 rep)
May 18, 2016, 01:53 PM
• Last activity: Feb 26, 2025, 08:48 AM
3
votes
3
answers
7088
views
Concurrency and Parallelism on Bash
I'm a bit confused about concurrency and parallelism in the Bash shell. As I understand it, when we run commands in more than one subshell at the same time, these commands run in parallel on individual processor cores.
**For example:**
cmd1 & cmd2 & cmd3 &
Here, the ampersand runs each command in the background (i.e., in a subshell) at the same time. Subshells can be created in other ways as well (such as writing commands in parentheses or using a pipe).
With that in mind, these are the questions I would like answered:
- Bash provides parallelism through subshells; is concurrency, on the other hand, also achieved through some other method in Bash? As far as I know, concurrency means a single CPU executes operations intermittently. Do I need to implement a method externally to achieve this, or does Bash already work this way (concurrently) anyway?
- If I occupy all CPU cores using parallelism, will the system crash, or is there a protection mechanism against this situation? (See the throttling sketch after this list.)
- What is the difference between the parallelism I get with subshells and the GNU Parallel tool? If the GNU Parallel tool works better, how does it achieve this?
- Which works more efficiently: "parallel" or "concurrent" operation?
- What kind of losses are incurred by "parallel" or "concurrent" operation, compared with normal (sequential) execution of commands?
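On the point about occupying every core, a minimal bash sketch of the usual self-throttling pattern (needs bash 4.3+ for `wait -n`; `heavy_task` is a hypothetical command):

```
max=$(nproc)
for i in $(seq 1 100); do
  while (( $(jobs -rp | wc -l) >= max )); do
    wait -n                 # pause until one background job finishes
  done
  heavy_task "$i" &         # hypothetical workload
done
wait                        # wait for the remaining jobs
```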
testter
(1510 rep)
Oct 22, 2020, 04:21 PM
• Last activity: Jul 2, 2024, 02:47 PM
5
votes
2
answers
1976
views
Pipeline as parallel command
Normally, pipelines in Unix are used to connect two commands, using the output of the first command as the input of the second. However, I recently came up with the idea (which may not be new, but I didn't find much by Googling) of using a pipeline to run several commands in parallel, like this:
command1 | command2
This will invoke `command1` and `command2` in parallel **even if `command2` does not read from standard input and `command1` does not write to standard output**. A minimal example to illustrate this is (please run it in an interactive shell):
ls . -R 1>&2|ls . -R
My question is: are there any downsides to using a pipeline to parallelize the execution of two commands in this way? Is there anything that I have missed in this idea?
Thank you very much in advance.
Weijun Zhou
(3548 rep)
Dec 8, 2017, 09:31 PM
• Last activity: Jun 4, 2024, 02:50 AM
9
votes
2
answers
8474
views
How can I run two commands in parallel and terminate them if ONE of them terminates with exit code 0?
I have 2 commands which are to be run simultaneously, and I want the script to terminate if one of them exits with code 0 or 1. How can I achieve this in Linux (Ubuntu)?
cmd1 &
cmd2 &
wait
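A hedged bash sketch (bash 4.3+ provides `wait -n`, which returns as soon as the first background job exits):

```
cmd1 & pid1=$!
cmd2 & pid2=$!
wait -n                            # returns when either command exits
kill "$pid1" "$pid2" 2>/dev/null   # stop whichever one is still running
```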
Sam
(91 rep)
Jan 10, 2017, 08:33 PM
• Last activity: Feb 16, 2024, 08:05 PM
1
votes
1
answers
2650
views
How to run one command on several cores
I have a command which I want to run on all free cores to speed up the execution time. Specifically I am running the Pitman-Yor Adaptor-Grammar Sampler software I downloaded from here
./py-cfg/py-cfg-mp -r 0 -d 10 -x 10 -D -E -e 1 -f 1 -g 10 -h 0.1 -w 1 -T 1 -m 0 -n 500 -G x.tgt y.tgt < z.tgt
I tried adding `parallel -j "$(nproc)"` before the command, as specified in this answer, but it is generating the following error:
Error in ./py-cfg/py-cfg-mp, argc = 29, optind = 27
M.A.G
(271 rep)
Feb 3, 2022, 09:32 AM
• Last activity: Feb 15, 2024, 03:08 AM
0
votes
1
answers
172
views
How to verify size of pigz (parallel gzip) archive contents?
I created some `pigz` (parallel `gzip`; see its home page) compressed archives of my SSD disk drives (compiled version 2.8).
I called one of them **4TB-SATA-disk--Windows10--2024-Jan-21.img.gz**, which names the size of the drive and the OS installed on it. The size is of course smaller in TiB, which is what `fdisk` and tools like it comfortably show (`Disk /dev/sda: 3.64 TiB`).
I knew from the past that listing the compressed file cannot show me the real size of the contents. It will show 2.2 GB or similar nonsense, even in Archive Manager for GNOME. It likely has something to do with `gzip` structure limitations.
However, I had some doubts about the real size of the contents, so my question is: how may I verify it?
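A sketch of the direct check: the size field in the gzip trailer is only 32 bits, so listing tools report the size modulo 4 GiB; streaming a decompression through `wc` is slow but gives the exact byte count:

```
# Decompress to stdout purely to count bytes; nothing is written to disk.
pigz -dc 4TB-SATA-disk--Windows10--2024-Jan-21.img.gz | wc -c
```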
Vlastimil Burián
(30505 rep)
Jan 27, 2024, 02:57 PM
1
votes
1
answers
100
views
Curious lscpu Output Related to CPU Calculation
I have been trying to understand `lscpu`'s output and came across several threads dedicated to the concepts of CPUs, physical cores, and threads. Based on those threads, to get the total number of CPUs (logical units), you would do the following: `Thread(s) per core` x `Core(s) per socket` x `Socket(s)`.
Given the partial output from my machine shown below, I would expect `CPU(s)` to be 28, but it is listed as 20.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core(TM) i9-12900H
CPU family: 6
Model: 154
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
Can someone help me understand the apparent disconnect between my expectation for the CPU count and what `lscpu` actually produced in this instance? I can't seem to find similar cases.
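One way to see where the numbers come from on a hybrid part like this (the i9-12900H mixes hyper-threaded performance cores with single-threaded efficiency cores, so "Thread(s) per core: 2" does not hold for every core):

```
# One row per logical CPU with its core ID; count how many CPUs share each core.
lscpu --all --extended
```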
Jason
(13 rep)
Jan 26, 2024, 05:47 PM
• Last activity: Jan 26, 2024, 07:24 PM
0
votes
1
answers
67
views
Starting another OS to run parallel with another system
So at work we have a few systems, each running on its own hardware, and the hardware is identical across all of the systems. If one piece of hardware fails, one of the running systems will take over and start the system that failed, doing the work of both. It is all Linux and Intel CPU based.
How is it possible for one OS to start another OS and run them together? It is not just running a program; it is running the entire other system as well.
I have heard of things like Windows letting another OS use one of the cores on the CPU for parallel computing. But I am curious how a system can immediately take over for another system. It was super cool, and unfortunately my co-workers don't understand how it is done either.
Travis Hunt
(1 rep)
Jan 7, 2024, 04:04 AM
• Last activity: Jan 7, 2024, 05:30 AM
0
votes
3
answers
123
views
Problem in writing a parallel version of my Bash code
I am trying to parallelise my sample Bash script and have tried commands like `&` and `wait`. Please let me know what an effective way to make it parallel would be.
My current code works fine for a limited number of entries in the reg2 variable, but on real data reg2 has millions of entries, so I want to make my outermost loop parallel while still getting the same output, i.e. 0,1,2,:,3,4,:,5,6, after parallelizing the code.
#!/bin/bash
# array1=$1
# array2=($2)
# reg2=($3)
array1=('bam1' 'bam2' 'bam3' 'bam4' 'bam5' 'bam6' 'bam7')
array2=('cell1' 'cell1' 'cell1' 'cell2' 'cell2' 'cell3' 'cell3')
reg2=('chr1:10484-10572' 'chr1:10589-10632' 'chr1:10636-10661' 'chr1:10665-10690' 'chr1:10694-10719')
start=$(date +%s.%N)
l=${#reg2[@]} # number of regions is 30 million on real data
reg_cov=()
j=0
for r in ${reg2[@]}; do
    (cov_array=()
    old_array2_element=${array2}
    for i in ${!array1[*]}; do
        new_array2_element=${array2[$i]}
        if [[ "$new_array2_element" != "$old_array2_element" ]]; then
            cov_array+=(":")
            old_array2_element=$new_array2_element
        fi
        cov_array+=($i) # in the actual code this step takes 4-5 seconds to process
        sleep 2
    done
    reg_cov+=($(IFS=, ; echo "${cov_array[*]}")) )
    wait
    ((j++))
    echo "$j/$l"
done
#echo ${reg_cov[@]}
cov=()
cov+=(${reg_cov[@]})
echo $cov
end=$(date +%s.%N)
runtime=$( echo "$end - $start" | bc -l ); runtime=${runtime%.*}; hours=$((runtime / 3600)); minutes=$(( (runtime % 3600) / 60 )); seconds=$(( (runtime % 3600) % 60 ))
echo "==> completed Runtime: $hours:$minutes:$seconds (hh:mm:ss)"
user96368
(11 rep)
Dec 21, 2023, 09:53 AM
• Last activity: Jan 3, 2024, 08:04 AM
Showing page 1 of 20 total questions