Using tee and paste results in a deadlock

2 votes
1 answer
210 views
I am trying to split the stdout of a command into two "branches" with tee so that each branch can be processed separately, and then merge the results of both branches with paste. I came up with the following code for the producer:
```bash
mkfifo a.fifo b.fifo
python -c 'print(("0\t"+"1"*100+"\n")*10000)' > sample.txt
cat sample.txt | tee >(cut -f 1 > a.fifo) >(cut -f 2 > b.fifo) | awk '{printf "\r%lu", NR}'
# outputs ~200 lines instantly
# and then ~200 more once I read from pipes
```
Then, in a separate terminal, I start the consumer:
```bash
paste a.fifo b.fifo | awk '{printf "\r%lu", NR}'
# outputs ~200 once producer is stopped with ctrl-C
```
The problem is that it hangs. The behaviour seems to depend on the input length:

1. If the input lines are shorter (e.g. the second column contains 30 characters instead of 100), it works fine.
2. If a.fifo and b.fifo are fed with the same (or similar-length) input, it also seems to work fine.

The problem seemingly arises when I feed short chunks into, say, a.fifo and long ones into b.fifo. The behaviour does not depend on the order in which I pass the FIFOs to paste.

I am not very familiar with Linux and its piping logic, but it looks like a deadlock. Can this be implemented reliably, and if so, how? Are there other ways to do it without tee and paste?
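For reference, here is a minimal sketch of one direction that might avoid the hang. It rests on an assumption that is not verified here: that the stall comes from cut holding the short column in its block-sized output buffer while the long column fills the pipes. The sketch keeps the same tee/paste topology but runs both cut processes under `stdbuf -oL` (GNU coreutils) so that they flush after every line. The scratch directory and the names `workdir`, `merged.txt`, `merged_lines` and `roundtrip` are made up for illustration, and the consumer runs in the background only so that the example fits in one script.

```bash
#!/usr/bin/env bash
# Sketch only: assumes bash (process substitution) and GNU coreutils (stdbuf).
set -eu

workdir=$(mktemp -d)                     # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT
mkfifo "$workdir/a.fifo" "$workdir/b.fifo"

# Same shape as sample.txt: a short first column and a 100-character second one.
awk 'BEGIN {
    s = sprintf("%100s", ""); gsub(/ /, "1", s)
    for (i = 0; i < 10000; i++) print "0\t" s
}' > "$workdir/sample.txt"

# Consumer: started first, in the background, so both FIFOs get a reader.
paste "$workdir/a.fifo" "$workdir/b.fifo" > "$workdir/merged.txt" &
consumer=$!

# Producer: stdbuf -oL asks each cut to line-buffer its stdout, so the short
# column reaches a.fifo line by line instead of waiting for a full buffer.
tee >(stdbuf -oL cut -f 1 > "$workdir/a.fifo") \
    >(stdbuf -oL cut -f 2 > "$workdir/b.fifo") \
    < "$workdir/sample.txt" > /dev/null

wait "$consumer"

# Check that the merged output matches the input.
merged_lines=$(wc -l < "$workdir/merged.txt")
if cmp -s "$workdir/sample.txt" "$workdir/merged.txt"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "merged lines: $merged_lines, round trip: $roundtrip"
```

Line buffering costs some throughput, and `stdbuf` only affects programs that leave the default stdio buffering in place, so this should be read as something to try rather than a guaranteed fix for arbitrary branch commands.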
Asked by HollyJolf (21 rep)
May 6, 2020, 11:57 PM
Last activity: May 9, 2020, 02:34 AM