
Infiniband HCA, Physical State is stuck Disabled

1 vote
1 answer
4696 views
While setting up a point-to-point infiniband connection between two servers, I ran the command ibportstate -G [my port GUID] disable. Now when I try to get the port polling, or do anything with the device at all, I get the following error.

    [user@server1 ~]$ perfquery -vvv -ddd
    ibwarn: umad_init: umad_init
    ibwarn: umad_open_port: ca (null) port 0
    ibwarn: umad_get_cas_names: max 32
    ibwarn: umad_get_cas_names: return 1 cas
    ibwarn: resolve_ca_name: checking ca 'qib0'
    ibwarn: resolve_ca_port: checking ca 'qib0'
    ibwarn: umad_get_ca: ca_name qib0
    ibwarn: umad_get_ca: opened qib0
    ibwarn: resolve_ca_port: checking port 0
    ibwarn: resolve_ca_port: checking port 1
    ibwarn: resolve_ca_port: checking port 0
    ibwarn: resolve_ca_port: checking port 1
    ibwarn: resolve_ca_name: phys found -1 on (null) port 0
    ibwarn: umad_open_port: opening mthca0 port 1
    ibwarn: mad_rpc_open_port: can't open UMAD port ((null):0)
    perfquery: iberror: [pid 16059] main: failed: Failed to open '(null)' port '0'

Any command that interacts with the infiniband device responds with the exact same output, no exceptions. The physical state of the port is just stuck.

    [user@server1 ~]$ cat /sys/class/infiniband/qib0/ports/1/phys_state
    3: Disabled

And here's the state on the other server so I know it's at least trying.

    [user@server0 ~]$ cat /sys/class/infiniband/qib0/ports/1/phys_state
    2: Polling

I've rebooted, restarted opensm, and even pulled and replaced the card. The second machine in the pair is hosting services that I can't take offline any time soon, so I can't switch the HCAs. I've read a few other threads on various websites describing a similar issue, but none of them were resolved in the thread.

QLogic IBA7322
CentOS 7, Kernel 3.10.0-514.26.2.el7.x86_64
infiniband-diags 1.6.5
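For anyone comparing both servers, the per-port sysfs check used above can be wrapped in a small helper that prints the physical state of every port it finds. This is only a minimal sketch: the function name list_phys_states and its optional root-directory argument are made up for illustration, and it assumes the /sys/class/infiniband/<ca>/ports/<n>/phys_state layout shown in the question. It reads state only and does not attempt to re-enable the port.

```shell
#!/bin/sh
# Hypothetical helper (not part of infiniband-diags): print the physical
# state of every InfiniBand port found under a sysfs-style root directory.
# The optional first argument overrides the root; the default is the
# path used in the question.
list_phys_states() {
    root="${1:-/sys/class/infiniband}"
    for f in "$root"/*/ports/*/phys_state; do
        # Skip the unexpanded glob pattern when no device is present.
        [ -r "$f" ] || continue
        port_dir=${f%/phys_state}      # .../<ca>/ports/<n>
        port=${port_dir##*/}           # <n>
        ca_dir=${port_dir%/ports/*}    # .../<ca>
        ca=${ca_dir##*/}               # <ca>
        printf '%s port %s: %s\n' "$ca" "$port" "$(cat "$f")"
    done
}
```

Calling list_phys_states with no argument on each server reads the live sysfs tree and prints one line per port, which makes it easy to see which side is Disabled and which is Polling.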
Asked by PSpacer (163 rep)
Jul 11, 2017, 03:52 PM
Last activity: Aug 28, 2017, 06:57 PM