Problem in writing a parallel version of my Bash code
0 votes, 3 answers, 123 views
I am trying to parallelise my sample Bash script and have tried commands like & and wait. Please let me know an effective way to make it parallel.
My current code works fine for a limited number of entries in the reg2 variable, but my real data has millions of entries, so I want to make the outermost loop parallel. After parallelizing, the code should still produce the same output, i.e. 0,1,2,:,3,4,:,5,6
#!/bin/bash
# array1=$1
# array2=($2)
# reg2=($3)
array1=('bam1' 'bam2' 'bam3' 'bam4' 'bam5' 'bam6' 'bam7')
array2=('cell1' 'cell1' 'cell1' 'cell2' 'cell2' 'cell3' 'cell3')
reg2=('chr1:10484-10572' 'chr1:10589-10632' 'chr1:10636-10661' 'chr1:10665-10690' 'chr1:10694-10719')
start=$(date +%s.%N)
l=${#reg2[@]} # number of regions is 30 million on real data
reg_cov=()
j=0
for r in "${reg2[@]}"; do
(cov_array=()
old_array2_element=${array2[0]}
for i in ${!array1[*]}; do
new_array2_element=${array2[$i]}
if [[ "$new_array2_element" != "$old_array2_element" ]]; then
cov_array+=(":")
old_array2_element=$new_array2_element
fi
cov_array+=($i) # in actual code this step takes 4-5 seconds to process
sleep 2
done
reg_cov+=($(IFS=, ; echo "${cov_array[*]}")) )
wait
((j++))
echo "$j/$l"
done
#echo ${reg_cov[@]}
cov=()
cov+=(${reg_cov[@]})
echo $cov
end=$(date +%s.%N)
runtime=$( echo "$end - $start" | bc -l )
runtime=${runtime%.*}
hours=$((runtime / 3600))
minutes=$(( (runtime % 3600) / 60 ))
seconds=$(( (runtime % 3600) % 60 ))
echo "==> completed Runtime: $hours:$minutes:$seconds (hh:mm:ss)"
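For reference, here is a minimal sketch of one way the outer loop might be parallelised while keeping the results in region order. It is not a tested solution for the real data. It rests on several assumptions: the names region_cov, max_jobs, running and tmpdir are hypothetical, the real 4-5 second step is replaced by a placeholder, and wait -n requires Bash 4.3 or newer. Each region runs as a background job that writes its result to a file named after the region index, because a background job runs in a subshell and cannot append to the parent's reg_cov array directly.

```shell
#!/bin/bash
# Sample data copied from the question.
array1=('bam1' 'bam2' 'bam3' 'bam4' 'bam5' 'bam6' 'bam7')
array2=('cell1' 'cell1' 'cell1' 'cell2' 'cell2' 'cell3' 'cell3')
reg2=('chr1:10484-10572' 'chr1:10589-10632' 'chr1:10636-10661' 'chr1:10665-10690' 'chr1:10694-10719')

max_jobs=4              # hypothetical cap on concurrent jobs; tune to the CPU count
tmpdir=$(mktemp -d)     # one result file per region index

# Hypothetical helper: prints the comma-joined coverage string for one region.
region_cov() {
    local region=$1     # unused in this sample; the real step would use it
    local cov_array=() old=${array2[0]} new i
    for i in "${!array1[@]}"; do
        new=${array2[$i]}
        if [[ "$new" != "$old" ]]; then
            cov_array+=(":")    # separator whenever the cell changes
            old=$new
        fi
        cov_array+=("$i")       # placeholder for the slow per-sample step
    done
    local IFS=,
    echo "${cov_array[*]}"
}

running=0
for idx in "${!reg2[@]}"; do
    # Once max_jobs are running, wait for any one of them to finish.
    if (( running >= max_jobs )); then
        wait -n
        running=$((running - 1))
    fi
    region_cov "${reg2[$idx]}" > "$tmpdir/$idx" &
    running=$((running + 1))
done
wait    # wait for the remaining background jobs

# Collect the results in the original region order.
reg_cov=()
for idx in "${!reg2[@]}"; do
    reg_cov+=("$(<"$tmpdir/$idx")")
done
rm -rf "$tmpdir"

printf '%s\n' "${reg_cov[@]}"
```

With 30 million regions, one background job and one temporary file per region is likely too much overhead, so handing each job a batch of regions may work better. GNU parallel with its --keep-order option is another route if it is installed.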
Asked by user96368
(11 rep)
Dec 21, 2023, 09:53 AM
Last activity: Jan 3, 2024, 08:04 AM