`mpirun -n 2 ./a.x`: both processes are stuck in epoll_wait. Why?
I run an MPI program with `mpirun -n 2 ./a.x`. However, the two processes get stuck. This happens almost every time; the run has completed only once.
I collected the information below with `strace` and `lsof`. As far as I can tell, both processes are waiting to read or write(?) the same file, but it never becomes ready. How can I find out which file it is, and why it is never ready to be accessed?
If you have any thoughts or need any more information, please tell me. Thank you!
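For the "which file" part, one approach that might help on Linux is to read `/proc/<pid>/fdinfo/<epfd>`: for an epoll descriptor, the kernel lists each watched descriptor on a `tfd:` line, and `/proc/<pid>/fd/<tfd>` can then be resolved to see what it points at. The following is only a sketch under that assumption; the helper names `parse_epoll_fdinfo` and `describe_epoll` are made up for illustration and are not part of any MPI or system tool.

```python
import os
import re

# Matches fdinfo lines such as "tfd:        5 events:       19 data: ..."
# (the event mask is printed in hexadecimal by the kernel).
TFD_LINE = re.compile(r"^tfd:\s*(\d+)\s+events:\s*([0-9a-fA-F]+)")


def parse_epoll_fdinfo(text):
    """Return (target_fd, event_mask) pairs found in epoll fdinfo text."""
    watched = []
    for line in text.splitlines():
        m = TFD_LINE.match(line.strip())
        if m:
            watched.append((int(m.group(1)), int(m.group(2), 16)))
    return watched


def describe_epoll(pid, epfd):
    """Hypothetical helper: resolve every fd watched by epoll fd `epfd` of `pid`."""
    with open(f"/proc/{pid}/fdinfo/{epfd}") as f:
        watched = parse_epoll_fdinfo(f.read())
    result = []
    for fd, mask in watched:
        try:
            # Typically a path, "socket:[inode]", "pipe:[inode]" or "anon_inode:..."
            target = os.readlink(f"/proc/{pid}/fd/{fd}")
        except OSError:
            target = "?"  # fd was closed meanwhile, or permission was denied
        result.append((fd, mask, target))
    return result
```

With the PIDs and descriptors from the output below, this would be called as `describe_epoll(31352, 18)` and `describe_epoll(31351, 19)`, run as the same user or as root. Any `socket:[inode]` entries could then be matched against the `lsof` output to see which connection each process is polling.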
```
//use strace -p 31352
epoll_wait(18, [], 100, 0) = 0
epoll_wait(18, [], 100, 0) = 0
epoll_wait(18, [], 100, 0) = 0
//use strace -p 31351
epoll_wait(19, [], 100, 0) = 0
epoll_wait(19, [], 100, 0) = 0
epoll_wait(19, [], 100, 0) = 0
//use lsof -p 31352
pfci.x 31352 jslo 18u a_inode 0,13 0 11815 [eventpoll]
//use lsof -p 31351
pfci.x 31351 jslo 19u a_inode 0,13 0 11815 [eventpoll]
```
Asked by Runfeng Jin
(1 rep)
Jul 9, 2021, 01:01 PM