I/O wait/failure timeout on iSCSI device with multipath enabled

1 vote
0 answers
68 views
- I'm accessing a remote iSCSI-based SAN using multipath.
- The network on the server side has known intermittent issues that cause session failures, path failures, and I/O failures. I'm not trying to solve that problem here, as a fix is already in progress.
- The issue I have is this: when I format or partition the device from a process/service, the parted/mkfs command hangs, which causes a kernel panic. The timeout that triggers the panic is set to 240 seconds.
- What I want is to avoid the kernel panic: the parted/mkfs command should fail and return rather than cause a panic.
- I have searched and tried changing various parameters (iscsid, sysfs, multipath) to no avail.

This is my iscsid config:
iscsid.startup = /bin/systemctl start iscsid.socket iscsiuio.socket
node.startup = automatic
node.leading_login = No
node.session.timeo.replacement_timeout = 30
node.conn.timeo.login_timeout = 30
node.conn.timeo.logout_timeout = 15
node.conn.timeo.noop_out_interval = 5
node.conn.timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 2
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 262144
node.conn.iscsi.MaxRecvDataSegmentLength = 262144
node.conn.iscsi.MaxXmitDataSegmentLength = 262144
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn.iscsi.HeaderDigest = CRC32C
node.conn.iscsi.DataDigest = CRC32C
node.session.nr_sessions = 1
node.session.reopen_max = 0
node.session.iscsi.FastAbort = Yes
node.session.scan = auto
This is my multipath config:
defaults {
        path_checker none
        user_friendly_names yes          # To create 'mpathn' names for multipath devices
        path_grouping_policy multibus    # To place all the paths in one priority group
        path_selector "round-robin 0"    # To use round robin algorithm to determine path for next I/O operation
        failback immediate               # For immediate failback to highest priority path group with active paths
        no_path_retry 1                  # To disable I/O queueing after retrying once when all paths are down
}
I've also set the sysfs timeout values of all slave paths to 30 seconds. But parted/mkfs still never fails and returns when there's a (simulated) network issue. What am I missing?

My multipath version is a tad old, but I can't upgrade because this is the supported version on Rocky 8:
multipath-tools v0.8.4 (05/04, 2020)
iscsid version 6.2.1.4-1
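For anyone trying to reproduce this, the sysfs timeout change mentioned above can be sketched roughly as follows. This is only an illustration, not the exact commands that were run: the function name `set_slave_timeouts`, the `sysfs_root` parameter, and the example device name `dm-0` are hypothetical, and it assumes each slave path exposes its SCSI command timeout at `device/timeout` under its block device entry.

```python
from pathlib import Path


def set_slave_timeouts(dm_name, seconds, sysfs_root="/sys"):
    """Write `seconds` to the SCSI command timeout of every slave path
    of a device-mapper (multipath) device.

    dm_name    -- kernel name of the multipath map, e.g. "dm-0" (hypothetical)
    seconds    -- timeout value to write, e.g. 30
    sysfs_root -- sysfs mount point; a parameter only so the function can be
                  exercised against a fake directory tree

    Returns the sorted list of slave names whose timeout was updated.
    """
    slaves_dir = Path(sysfs_root) / "block" / dm_name / "slaves"
    updated = []
    for slave in sorted(slaves_dir.iterdir()):
        timeout_file = slave / "device" / "timeout"
        # Skip slaves that do not expose a timeout attribute.
        if timeout_file.exists():
            timeout_file.write_text(f"{seconds}\n")
            updated.append(slave.name)
    return updated
```

Run as root, a call such as `set_slave_timeouts("dm-0", 30)` would apply 30 seconds to every path under that map. Values written this way are not persistent and would need to be reapplied after a reboot or after the paths are re-created.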
Asked by Neetz (111 rep)
Jan 21, 2025, 09:38 PM