how to configure or tune infiniband redhat 8
0 votes · 1 answer · 62 views
I asked a similar question here: https://unix.stackexchange.com/questions/788297/nfs-v4-2-tuning
With fewer than 50 servers on a closed InfiniBand (Mellanox) HDR switch, all running RHEL 8.10:
1. Is there anything other than
systemctl start opensm
that needs to be run on any one server to make this network run **optimally** in terms of speed?
2. Can someone either reply with specific, concise instructions here, or link to a website describing tests one can run to validate a properly configured InfiniBand network?
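Not an answer, but for context, here is a sketch of the sanity checks commonly suggested for this kind of validation, assuming the `rdma-core`, `infiniband-diags`, and `perftest` packages are installed; the hostnames are placeholders:

```shell
# Link state, rate, and physical state of the local HCA.
# For HDR, expect "State: Active" and "Rate: 200".
ibstat

# Fabric-wide link survey from any one node
# (requires a running subnet manager, e.g. opensm).
iblinkinfo

# Raw RDMA bandwidth between two nodes, bypassing TCP entirely.
# On the server node:
ib_write_bw
# On the client node (replace server-ib with the server's IB address):
ib_write_bw server-ib
# HDR line rate is 200 Gb/s, so a healthy link should report roughly
# 24 GB/s here; results near 10 GbE speeds suggest traffic is not
# actually using the RDMA path.
```

These tools exercise the verbs/RDMA layer directly, so they separate "is the fabric fast?" from "is my application using it?".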
*I have read posts (NVIDIA forums, Reddit, etc.) where people complain that their InfiniBand is no better than their 10 GbE network. How can that be?*
*Personally, if I scp a single tar file between two servers over InfiniBand, I see no better performance than over 10 GbE; scp is convenient because it displays the transfer speed. Referring back to my original NFS tuning question: is there any other protocol or network mechanism besides NFS that matters on an InfiniBand network? How much could NFS (v4.2) be the factor in the bad performance?*
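One caveat worth noting on the scp comparison: scp pushes a single encrypted TCP stream through ssh, so it is typically cipher- and single-stream bound well below HDR line rate regardless of the fabric, which makes it a poor InfiniBand benchmark. A sketch of measuring the IPoIB path itself with parallel unencrypted streams instead (hostname is a placeholder):

```shell
# On the server node: start an iperf3 listener.
iperf3 -s

# On the client node: 8 parallel TCP streams, targeting the server's
# IPoIB address rather than its Ethernet address.
iperf3 -c server-ib -P 8
```

If iperf3 over IPoIB is fast but scp is not, the bottleneck is scp/ssh, not the network.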
The use case is CFD and other commercial software written to run on clusters over MPI, for which Intel oneAPI is installed with no errors or warnings; the same ~12-hour job took 2 hours longer over InfiniBand. I use scp as a simple way to get numbers. Things get worse over NFS. When I need to move data in the 20+ TB range, I specifically tar things up and scp the archive, because that is faster than cp over NFS.
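On the NFS side, one thing sometimes suggested for this situation is serving NFS over the RDMA transport rather than TCP, so NFS traffic uses the InfiniBand fabric natively. A rough sketch, assuming rdma-core is installed and with placeholder export and host names (verify the exact option names against the Red Hat documentation and nfs.conf(5)):

```shell
# Server side: enable the RDMA transport for nfsd in /etc/nfs.conf:
#   [nfsd]
#   rdma=y
#   rdma-port=20049
# then restart the server:
systemctl restart nfs-server

# Client side: mount with the RDMA transport on its conventional port.
mount -o vers=4.2,proto=rdma,port=20049 server-ib:/export /mnt/export

# Confirm which transport the mount actually negotiated.
mount | grep proto=rdma
```

If the mount falls back to `proto=tcp`, NFS is riding IPoIB (or Ethernet) and the fabric's RDMA capability is unused, which could account for much of the gap described above.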
Asked by ron
(8647 rep)
Mar 10, 2025, 01:44 PM
Last activity: Mar 10, 2025, 04:44 PM