Possible effects of slurmstepd: error: Exceeded step memory limit at some point?
4 votes, 1 answer, 2331 views
I have a question for those of you familiar with the Slurm scheduler. Sometimes I get the following error message: `slurmstepd: error: Exceeded step memory limit at some point`.
I know it means the memory allocated to my job step wasn't enough. Nonetheless, the process isn't killed by the scheduler, and it often seems innocuous: the program runs to completion and the output files look fine.
Should I **always** assume that the output is faulty and rerun the program if I get that error message? And why can the allocated memory sometimes be exceeded without the program being killed?
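For context, one way to see how far over the limit a step went is to compare the `ReqMem` and `MaxRSS` fields reported by `sacct` (for example `sacct -j <jobid> --format=JobID,ReqMem,MaxRSS,State`). Below is a minimal sketch of that comparison in Python. The helper names `parse_mem` and `exceeded` and the sample values are hypothetical, and the sketch assumes the usual `K`/`M`/`G`/`T` suffixes, optionally followed by the `n` (per node) or `c` (per CPU) marker that older Slurm versions append to `ReqMem`. It treats a per-CPU request as if it were a total, so a real check would need to multiply by the CPU count.

```python
import re

# Multipliers for the unit suffixes seen in Slurm accounting output (powers of 1024).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_mem(value: str) -> int:
    """Convert a string such as '4000M', '4Gn' or '123456K' to bytes.

    A trailing 'n' or 'c' (per-node / per-CPU marker) is accepted and ignored.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)[nc]?", value.strip())
    if match is None:
        raise ValueError(f"unrecognised memory value: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def exceeded(req_mem: str, max_rss: str) -> bool:
    """Return True if the recorded peak RSS is larger than the requested memory."""
    return parse_mem(max_rss) > parse_mem(req_mem)


# Hypothetical values, not taken from a real job.
print(exceeded("4000M", "3900000K"))
print(exceeded("4Gn", "4500000K"))
```

Keep in mind that `MaxRSS` is sampled periodically, so a short spike above the limit may not show up in it even when the message was printed.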
Asked by j91
(161 rep)
Apr 26, 2017, 12:58 PM
Last activity: Apr 21, 2025, 06:04 AM