
Re: P4000 not presenting Luns to vSphere

 
silus
Advisor

P4000 not presenting Luns to vSphere

Hi,

I am trying to troubleshoot a strange issue with our P4300 cluster and vSphere 5.0 U1.

 

On the storage side we have all the ESXi servers set up as a cluster, and this cluster is assigned read/write access to LUNs.

 

On the VMware side, the VIP address is entered into the dynamic discovery option for the software iSCSI initiator as normal. Each host is set up with iSCSI port binding: two VMkernel ports, each with one NIC set as active and the other set as unused.

 

We know this setup is working, as we have several LUNs configured and online. What we are finding is that if we create a new LUN, set the appropriate permissions on it, rescan the initiator, and then try to add storage, nothing shows up for quite a while.


Often we can wait a while, try again, and the LUN shows up without any changes having been made. It seems that for a period after rescanning, the host simply does not see any newly presented LUNs, and then after a while it does.


The cluster is not under particularly high IO load.


Can anything be recommended to help troubleshoot this, or has anyone come across this odd behaviour when adding iSCSI LUNs to ESXi?

 

thanks.

6 REPLIES
silus
Advisor

Re: P4000 not presenting Luns to vSphere

Just to add, the error we are seeing is: 'iSCSI discovery to x.x.x.x on vmhba36 failed. The iSCSI initiator could not establish a network connection to the discovery address'

 

However, this was working before, and the LUNs already presented are working fine. It is just adding new ones that seems to fail.

 

The security is definitely configured correctly on the P4000 side, and I am able to SSH to the ESX node and vmkping the iSCSI VIP.

Bryan McMullan
Trusted Contributor

Re: P4000 not presenting Luns to vSphere

Things are usually instant on my vSphere cluster when connecting to my P4000 groups, though the rescan does sometimes take a little while.

Are there any network configuration issues? It sounds like the discovery is not making it to (or perhaps from) the VIP of the SAN cluster. Have you taken a tcpdump capture of the traffic? Is there any routing involved, or are there any firewalls in the path?

Alternatively, what happens if you try from another system? Do the new volumes show up instantly?
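
As a rough cross-check on the network path, the sketch below repeatedly opens a TCP connection to the discovery address and records whether each attempt succeeds, is refused, or times out. It is only an illustration: `probe` and its parameters are hypothetical names, port 3260 is assumed because it is the standard iSCSI port, and the script would have to run from a machine with Python on the iSCSI network. It does not reproduce the ESXi software initiator or its port binding, so a clean result here would not rule out a host-side problem.

```python
import socket
import time


def probe(host, port=3260, attempts=10, delay=1.0, timeout=3.0):
    """Open a TCP connection `attempts` times and return one outcome per try.

    Each outcome is "ok", "refused", "timeout", or "error: <details>".
    """
    results = []
    for i in range(attempts):
        try:
            # create_connection performs the full TCP handshake, which is
            # what an iSCSI discovery needs before it can log in.
            with socket.create_connection((host, port), timeout=timeout):
                results.append("ok")
        except ConnectionRefusedError:
            # The peer answered but rejected the connection.
            results.append("refused")
        except socket.timeout:
            # No answer at all within `timeout` seconds.
            results.append("timeout")
        except OSError as exc:
            # Anything else, e.g. no route to host.
            results.append("error: %s" % exc)
        if i < attempts - 1:
            time.sleep(delay)
    return results
```

For example, calling `probe("x.x.x.x", attempts=30)` with the cluster VIP substituted in returns a list of 30 outcomes. A mix of "ok" and "refused" would suggest the portal is intermittently rejecting connections, whereas timeouts would point more towards the path in between.
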
silus
Advisor

Re: P4000 not presenting Luns to vSphere

Working with VMware support, I have extracted some logs which indicate there are iSCSI login errors. For example:

 

2012-09-17T15:36:11Z iscsid: Notice: Assigned (H37 T7 C1 session=78, target=8/8)
2012-09-17T15:36:11Z iscsid: DISCOVERY: transport_name=iscsi_vmk Pending=2 Failed=0
2012-09-17T15:36:11Z iscsid: connect failed (111,Connection refused)
2012-09-17T15:36:11Z iscsid: Login Failed: iqn.2003-10.com.lefthandnetworks:hvtiscsi:2506:testlun

 

The strange thing is that the logins sometimes work. If I keep trying, eventually the host will find the disks presented to it. And the existing ones are working OK.

 

I am struggling to understand why logins would fail sometimes and not others. Surely they would never work at all if something were misconfigured?

 

Perhaps time to contact HP support...
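
To get a feel for how often the logins fail and against which targets, the iscsid messages can be tallied with a short script. This is a minimal sketch that assumes the log lines have exactly the format shown in the excerpt above; `summarize` and the regular expressions are illustrative names, not part of any VMware tool.

```python
import re
from collections import Counter

# Sample lines copied from the excerpt above. In practice the text would
# be read from the log file extracted from the host.
SAMPLE_LOG = """\
2012-09-17T15:36:11Z iscsid: Notice: Assigned (H37 T7 C1 session=78, target=8/8)
2012-09-17T15:36:11Z iscsid: DISCOVERY: transport_name=iscsi_vmk Pending=2 Failed=0
2012-09-17T15:36:11Z iscsid: connect failed (111,Connection refused)
2012-09-17T15:36:11Z iscsid: Login Failed: iqn.2003-10.com.lefthandnetworks:hvtiscsi:2506:testlun
"""

# Patterns are guesses based only on the four sample lines.
CONNECT_FAILED = re.compile(r"iscsid: connect failed \((\d+),([^)]*)\)")
LOGIN_FAILED = re.compile(r"iscsid: Login Failed: (\S+)")


def summarize(log_text):
    """Return (connect_errors, login_failures) as two Counters.

    connect_errors is keyed by (error number, error text);
    login_failures is keyed by target IQN.
    """
    connect_errors = Counter()
    login_failures = Counter()
    for line in log_text.splitlines():
        match = CONNECT_FAILED.search(line)
        if match:
            connect_errors[(int(match.group(1)), match.group(2))] += 1
            continue
        match = LOGIN_FAILED.search(line)
        if match:
            login_failures[match.group(1)] += 1
    return connect_errors, login_failures
```

Running `summarize` over a longer log and comparing the failure timestamps with the times of the rescans may show whether the refusals cluster around particular periods. A "Connection refused" at this level typically means something at the discovery address answered and rejected the TCP connection, rather than the request simply timing out.
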

Bryan McMullan
Trusted Contributor

Re: P4000 not presenting Luns to vSphere

It would be a smart move to contact HP support. Do you have any special characters in the IQN of the VMware node? Does the issue happen with all nodes in your VMware cluster?

 

Sometimes the "connection refused" error happens when a volume is disconnected irregularly, and you need to wait for the hung sessions to time out. Other times, a restart of the iSCSI service (or a reboot) on the VMware node may help.

 

I thought I read something about hung sessions in a patch note. Do you have Patch Set 3 installed on your P4000s?

silus
Advisor

Re: P4000 not presenting Luns to vSphere

P4000 is fully patched.  After some more investigation one line in the log says this:

 

if=iscsi_vmk@vmk2 addr=x.x.x.x:3260 (TPGT:1 ISID:0x2) Reason: 00040000 (Initiator Connection Failure)

 

The 00040000 error seems to indicate this is a network layer issue. I am struggling to think what would cause this though.
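
Because each host uses two bound VMkernel ports, it can help to see whether the failures are tied to one interface. The sketch below pulls the interface, portal address, and reason out of lines in the format quoted above and counts failures per interface. The function names and the regular expression are assumptions based on that single sample line, so other log variants may not match.

```python
import re
from collections import Counter

# The line quoted above, with the address still masked as in the post.
SAMPLE = ("if=iscsi_vmk@vmk2 addr=x.x.x.x:3260 (TPGT:1 ISID:0x2) "
          "Reason: 00040000 (Initiator Connection Failure)")

# Pattern derived from the one sample line only.
FAILURE = re.compile(
    r"if=(?P<transport>[^@\s]+)@(?P<vmk>\S+)\s+"
    r"addr=(?P<addr>\S+):(?P<port>\d+)\s+"
    r"\(TPGT:(?P<tpgt>\d+)\s+ISID:(?P<isid>0x[0-9A-Fa-f]+)\)\s+"
    r"Reason:\s+(?P<code>[0-9A-Fa-f]+)\s+\((?P<text>[^)]*)\)"
)


def parse_failure(line):
    """Return the fields of a failure line as a dict, or None if no match."""
    match = FAILURE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["port"] = int(record["port"])
    return record


def failures_by_interface(lines):
    """Count matching failure lines per VMkernel interface (e.g. vmk2)."""
    counts = Counter()
    for line in lines:
        record = parse_failure(line)
        if record is not None:
            counts[record["vmk"]] += 1
    return counts
```

If every failure turns out to be on the same interface, that would point at one physical path (NIC, cable, or switch port) rather than at the array; failures spread evenly across both interfaces would make a path-specific fault less likely.
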

 

The only misconfiguration I can find is that the P4000 nodes have flow control enabled for receive by default. I think the ESX nodes have flow control enabled on all interfaces by default.

 

The switches that are connected to the P4000 do not have flow control enabled on the ports. 

 

We will investigate enabling this, but I can't see how not having flow control enabled would cause this issue. I think it is a bit of a red herring.


Also, should Delayed Ack be disabled?

 

thanks

silus
Advisor

Re: P4000 not presenting Luns to vSphere

Just in case anyone else runs into this thread for the same issue, it looks like we have fixed this by disabling Delayed Ack on all of our ESX nodes and rebooting them.

 

We can now provision further LUNs and they show up on the ESX hosts after one rescan, and then propagate to all nodes as expected once a VMFS datastore is created.

 

It seems that on high IO clusters the P4000 nodes do not play nicely with Delayed Ack, which is enabled by default.