
RHV: Losing iSCSI connection with MSA1040

 
wodel_youchi
Occasional Contributor


Hi,

We have a virtualization platform with 8 hypervisors running RHV 4.3, which uses RHEL 7 as its base system.

Our primary storage is an MSA1040 iSCSI 10Gb/s array. The platform is connected through two stacked 10Gb/s switches. The switches are stacked using "Front Plane Stacking"; in other words, we're using 2x10Gb/s links to stack the two switches.

We are experiencing a lot of problems with the storage. When the I/O load increases, some of our hypervisors start to lose their connection to the storage array, and we start getting errors like these:

Dec 20 08:27:17 hyperv02 kernel: connection7:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5797877693, last ping 5797882696, now 5797887713
Dec 20 08:27:17 hyperv02 kernel: connection7:0: detected conn error (1022)
Dec 20 08:27:17 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 7:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:3: [sdan] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:3: [sdan] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:4: [sdap] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=2s
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:4: [sdap] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:5: [sdaq] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=2s
Dec 20 08:27:17 hyperv02 kernel: sd 8:0:6:5: [sdaq] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 08:27:17 hyperv02 sanlock[1333]: 2020-12-20 08:27:17 1503220 [31461]: s16 delta_renew read timeout 10 sec offset 0 /dev/0d97092e-030c-4aef-9d8a-7d3ca18d676d/ids
Dec 20 08:27:17 hyperv02 sanlock[1333]: 2020-12-20 08:27:17 1503220 [31461]: s16 renewal error -202 delta_length 10 last_success 1503190
Dec 20 08:27:21 hyperv02 kernel: connection7:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000
Dec 20 08:27:21 hyperv02 iscsid: iscsid: connection7:0 is operational after recovery (1 attempts)
Dec 20 08:28:56 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5797977372, last ping 5797982373, now 5797987392
Dec 20 08:28:56 hyperv02 kernel: connection3:0: detected conn error (1022)
Dec 20 08:28:56 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 3:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 08:29:00 hyperv02 kernel: connection3:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000

 

But sometimes the situation gets worse: all the links (multipath) are lost, the hypervisor crashes, and the VMs are stopped, so we have to restart it.

Dec 20 09:47:03 hyperv02 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802663910, last ping 5802668911, now 5802673920
Dec 20 09:47:03 hyperv02 kernel: connection1:0: detected conn error (1022)
Dec 20 09:47:03 hyperv02 kernel: scsi_io_completion: 12 callbacks suppressed
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:4: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:4: [sdf] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:1: [sdc] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:1: [sdc] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:5: [sdg] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:5: [sdg] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:0: [sdb] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:0: [sdb] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:2: [sdd] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:2: [sdd] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:3: [sde] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=3s
Dec 20 09:47:03 hyperv02 kernel: sd 8:0:0:3: [sde] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:03 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 1:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 09:47:03 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802663910, last ping 5802668928, now 5802673936
Dec 20 09:47:03 hyperv02 kernel: connection3:0: detected conn error (1022)
Dec 20 09:47:03 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 3:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 09:47:07 hyperv02 iscsid: iscsid: connection1:0 is operational after recovery (1 attempts)
Dec 20 09:47:07 hyperv02 kernel: connection1:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000
Dec 20 09:47:07 hyperv02 sanlock[1333]: 2020-12-20 09:47:07 1508011 [31455]: s15 delta_renew read timeout 10 sec offset 0 /dev/248554b6-da6c-41aa-8e6b-a18266e04bf6/ids
Dec 20 09:47:07 hyperv02 sanlock[1333]: 2020-12-20 09:47:07 1508011 [31455]: s15 renewal error -202 delta_length 10 last_success 1507980
Dec 20 09:47:08 hyperv02 kernel: session3: session recovery timed out after 5 secs
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:0: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:0: [sdh] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:0: [sdh] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=14s
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 8:112.
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:0: [sdh] CDB: Write(16) 8a 00 00 00 00 01 65 af 81 70 00 00 00 10 00 00
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: 11 callbacks suppressed
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdh, sector 6000968048
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:1: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:1: [sdj] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:2: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:2: [sdl] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:2: [sdl] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:2: [sdl] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:3: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=14s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=13s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Read(16) 88 00 00 00 00 00 7b 32 d1 30 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Write(16) 8a 00 00 00 00 00 65 1e 29 d0 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=14s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 2066927920
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 1696475600
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Read(16) 88 00 00 00 00 00 7b 32 f1 a8 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 2066936232
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] killing request
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:5: rejecting I/O to offline device
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:5: [sdr] killing request
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 8:240.
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=15s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=13s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Write(16) 8a 00 00 00 00 00 44 4c 06 20 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Write(16) 8a 00 00 00 00 00 65 1e 24 e0 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 1145832992
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 1696474336
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=15s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Write(16) 8a 00 00 00 00 00 44 4c 0a 88 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 1145834120
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=13s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Write(16) 8a 00 00 00 00 00 65 1e 1f f0 00 00 00 78 00 00
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 1696473072
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=14s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Read(16) 88 00 00 00 00 00 7b 32 cf 20 00 00 00 30 00 00
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 2066927392
Dec 20 09:47:08 hyperv02 kernel: blk_update_request: I/O error, dev sdp, sector 2066926144
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:1: [sdj] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=13s
Dec 20 09:47:08 hyperv02 kernel: sd 7:0:2:1: [sdj] CDB: Read(10) 28 00 00 84 08 f0 00 00 78 00
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 8:144.
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 8:208.
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 65:16.
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Failing path 8:176.
Dec 20 09:47:08 hyperv02 multipathd: sdh: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af96aff5801000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 multipathd: sdj: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a6145725a01000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 multipathd: sdn: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a597f715a01000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 multipathd: sdr: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af97a685b01000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 multipathd: sdp: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af87a685b01000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 multipathd: sdl: mark as failed
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a74ce575a01000000: remaining active paths: 19
Dec 20 09:47:08 hyperv02 iscsid: iscsid: connection3:0 is operational after recovery (1 attempts)
Dec 20 09:47:08 hyperv02 kernel: connection3:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af96aff5801000000: sdh - tur checker reports path is up
Dec 20 09:47:08 hyperv02 multipathd: 8:112: reinstated
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af96aff5801000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 8:112.
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a6145725a01000000: sdj - tur checker reports path is up
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 8:144.
Dec 20 09:47:08 hyperv02 multipathd: 8:144: reinstated
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a6145725a01000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a74ce575a01000000: sdl - tur checker reports path is up
Dec 20 09:47:08 hyperv02 multipathd: 8:176: reinstated
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 8:176.
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a74ce575a01000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a597f715a01000000: sdn - tur checker reports path is up
Dec 20 09:47:08 hyperv02 multipathd: 8:208: reinstated
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 8:208.
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4a597f715a01000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af87a685b01000000: sdp - tur checker reports path is up
Dec 20 09:47:08 hyperv02 multipathd: 8:240: reinstated
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af87a685b01000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 8:240.
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af97a685b01000000: sdr - tur checker reports path is up
Dec 20 09:47:08 hyperv02 multipathd: 65:16: reinstated
Dec 20 09:47:08 hyperv02 multipathd: 3600c0ff00026da4af97a685b01000000: remaining active paths: 20
Dec 20 09:47:08 hyperv02 kernel: device-mapper: multipath: Reinstating path 65:16.
Dec 20 09:47:13 hyperv02 systemd: Created slice User Slice of root.
Dec 20 09:47:13 hyperv02 systemd: Started Session c62357 of user root.
Dec 20 09:47:19 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802680333, last ping 5802685336, now 5802690352
Dec 20 09:47:19 hyperv02 kernel: connection3:0: detected conn error (1022)
Dec 20 09:47:19 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 3:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 09:47:19 hyperv02 kernel: scsi_io_completion: 6 callbacks suppressed
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:5: [sdr] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:5: [sdr] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:3: [sdn] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:3: [sdn] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:2: [sdl] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:2: [sdl] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:0: [sdh] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:0: [sdh] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:4: [sdp] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:4: [sdp] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:1: [sdj] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=6s
Dec 20 09:47:19 hyperv02 kernel: sd 7:0:2:1: [sdj] CDB: Test Unit Ready 00 00 00 00 00 00
Dec 20 09:47:23 hyperv02 kernel: connection3:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000
Dec 20 09:47:23 hyperv02 iscsid: iscsid: connection3:0 is operational after recovery (1 attempts)
Dec 20 09:47:23 hyperv02 systemd: Removed slice User Slice of root.
Dec 20 09:47:23 hyperv02 systemd: Created slice User Slice of root.
Dec 20 09:47:23 hyperv02 systemd: Started Session c62358 of user root.
Dec 20 09:47:24 hyperv02 systemd: Removed slice User Slice of root.
Dec 20 09:49:21 hyperv02 systemd: Created slice User Slice of root.
Dec 20 09:49:21 hyperv02 systemd: Started Session c62359 of user root.
Dec 20 09:49:30 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802810820, last ping 5802815824, now 5802820832
Dec 20 09:49:30 hyperv02 kernel: connection3:0: detected conn error (1022)
Dec 20 09:49:30 hyperv02 iscsid: iscsid: Kernel reported iSCSI connection 3:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
Dec 20 09:49:34 hyperv02 kernel: connection3:0: bnx2i: conn update - MBL 0x200000 FBL 0x40000MRDSL_I 0x40000 MRDSL_T 0x40000
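
To see which connections time out most often, the "ping timeout" lines above can be tallied per host and connection. This is only a minimal sketch, assuming syslog lines in exactly the format shown above; the function name `count_nop_timeouts` and the `SAMPLE` list are hypothetical helpers for illustration, not part of any tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "Dec 20 08:27:17 hyperv02 kernel: connection7:0: ping timeout of 5 secs expired, ..."
# Assumption: the classic syslog prefix "Mon DD HH:MM:SS host kernel:".
NOP_TIMEOUT = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<conn>connection\d+:\d+): ping timeout of \d+ secs expired"
)


def count_nop_timeouts(lines):
    """Return a Counter mapping (host, connection) to the number of ping timeouts."""
    counts = Counter()
    for line in lines:
        match = NOP_TIMEOUT.match(line)
        if match:
            counts[(match.group("host"), match.group("conn"))] += 1
    return counts


# A few lines copied from the logs above, used as sample input.
SAMPLE = [
    "Dec 20 08:27:17 hyperv02 kernel: connection7:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5797877693, last ping 5797882696, now 5797887713",
    "Dec 20 08:27:17 hyperv02 kernel: connection7:0: detected conn error (1022)",
    "Dec 20 08:28:56 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5797977372, last ping 5797982373, now 5797987392",
    "Dec 20 09:47:03 hyperv02 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802663910, last ping 5802668911, now 5802673920",
    "Dec 20 09:47:03 hyperv02 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 5802663910, last ping 5802668928, now 5802673936",
]

if __name__ == "__main__":
    # Print connections ordered from most to fewest timeouts.
    for (host, conn), n in count_nop_timeouts(SAMPLE).most_common():
        print(host, conn, n)
```

Running the same function over the full log from each hypervisor could show whether the timeouts concentrate on particular connections (and therefore particular ports or paths) or are spread evenly across all of them.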

Our volumes on the disk array are either RAID5 or RAID6. We are using a read-cache SSD to boost performance, yet we are still facing these problems.

From what I have read about the I/O supported by the MSA1040 with RAID5 on SATA disks, the array can deliver up to 28000 IOPS.

But when I looked at the statistics window in the web UI of the MSA1040 for the time of the last incident, the peak was 2000 IOPS, so as I understand it the controller was not stressed enough to be the cause of the problem.

Could you give advice on how to find the culprit? Is it the disk array, the switches, or the hypervisors?

Regards.