About mem and vmem

6 votes
1 answer
11266 views
I am working on a cluster running Linux. I have a shell script that uses `mpirun` to submit my jobs to the cluster, and in that same script I can choose the number of nodes assigned to each job. So far, so good.

The problem appears under load: when I submit a few jobs, everything works, but when I fill the nodes to capacity, some of the submitted jobs never complete. I therefore suspect that the cluster does not have enough memory to run all of my jobs at the same time.

To check this, I want to monitor the memory usage of each job over time. I use the `qstat -f` command, but it prints a lot of fields, most of which I do not understand.

**So here is my question:** The sample output of `qstat -f` below shows two kinds of memory, `mem` and `vmem`. What is the difference between the two, and which one reflects the real amount of memory used?

    resources_used.cput = 00:21:04
    resources_used.mem = 2099860kb
    resources_used.vmem = 40505676kb
    resources_used.walltime = 00:21:08

Additionally, I would appreciate any reference that documents the output of this command in detail. I tried `man qstat`, but it does not explain each returned line.
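For following these numbers over time, the `resources_used` lines can be extracted from the `qstat -f` text and converted to a more readable unit. The following is only a minimal Python sketch working on the sample output above. The helper names `parse_resources_used` and `to_gib` are hypothetical, and it assumes that the `kb` suffix means 1024 bytes, which should be checked against the scheduler's documentation. It does not settle which of the two figures is the "real" usage.

```python
import re

# Sample text copied from the qstat -f output quoted in the question.
SAMPLE = """\
    resources_used.cput = 00:21:04
    resources_used.mem = 2099860kb
    resources_used.vmem = 40505676kb
    resources_used.walltime = 00:21:08
"""

# Assumption: size suffixes are binary multiples (kb = 1024 bytes).
UNIT_TO_BYTES = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

LINE_RE = re.compile(
    r"^[ \t]*resources_used\.(\w+)[ \t]*=[ \t]*(\S+)[ \t]*$", re.MULTILINE
)
SIZE_RE = re.compile(r"(\d+)(b|kb|mb|gb|tb)")


def parse_resources_used(text):
    """Return {field: raw value string} for every resources_used.* line."""
    return dict(LINE_RE.findall(text))


def to_gib(size):
    """Convert a size string such as '2099860kb' to GiB as a float."""
    match = SIZE_RE.fullmatch(size.lower())
    if match is None:
        raise ValueError(f"unrecognized size: {size!r}")
    count, unit = match.groups()
    return int(count) * UNIT_TO_BYTES[unit] / 1024**3


if __name__ == "__main__":
    used = parse_resources_used(SAMPLE)
    for field in ("mem", "vmem"):
        print(f"{field}: {to_gib(used[field]):.2f} GiB")
```

On a real system the text would come from running `qstat -f` with a job ID instead of the `SAMPLE` string, for example through `subprocess.run`, and the script could be called periodically to log how `mem` and `vmem` change while the job runs.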
Asked by Mary (61 rep)
Nov 14, 2014, 03:18 AM
Last activity: Nov 2, 2023, 04:16 PM