BladeSystem - General

Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Ludwig Penny
Occasional Advisor


Current Configuration

Device Information
Blade Type Server Blade
Manufacturer HP
Product Name ProLiant BL460c G6
Part Number [Unknown]
System Board Spare Part Number 531221-001
Serial Number CZJ01007WD
UUID 38373035-3436-5A43-4A30-313030375744
BIOS Asset Tag
Server Name emdcesx04.jse.co.za
ROM Version I24 10/01/2009

Server NIC Information
Ethernet FlexNIC LOM:1-a 00:17:A4:77:28:40
Ethernet FlexNIC LOM:2-a 00:17:A4:77:28:42
Port: iLO D8:D3:85:67:0C:B2

Mezzanine Card Information

QLogic HBA
bios 212
F code 2.03
EFI version 2.05
Flash FW 4.04

Mezzanine Slot Mezzanine Device Mezzanine Device Port Device ID
1 QLogic QMH2462 4Gb FC HBA for HP c-Class BladeSystem
Port 1 50:06:0b:00:00:c2:8a:20
Port 2 50:06:0b:00:00:c2:8a:22
2 NC326m Dual Port 1Gb NIC for c-Class BladeSystem
Port 1 00:17:a4:77:28:44
Port 2 00:17:a4:77:28:46

CPU and Memory Information
CPU 1 Quad-Core Intel Xeon, 2667 MHz
CPU 2 Quad-Core Intel Xeon, 2667 MHz
Memory 32768 MB

C7000 Enc
Virtual Connect
Interconnect Bay Information - Bay 1 (HP 1/10Gb VC-Enet Module)
Part Number: 399593-B22
Product Name: HP 1/10Gb VC-Enet Module
Serial Number: TW280700B2
Dip Switch Setting: 0x0
Spare Part Number: 399725-001
Manufacturer: HP
Firmware Rev. 2.30 2009-09-29T01:44:36Z

Interconnect Bay Information - Bay 3 (HP 4Gb VC-FC Module)
Part Number: 409513-B21
Product Name: HP 4Gb VC-FC Module
Serial Number: MY58510346
Spare Part Number: 410152-001
Manufacturer: HP


Software
ESX 4.0 update 1

Storage
EMC Symmetrix DMX4

Scenario:

Experiencing intermittent disconnects from all ESX servers to all LUNs.
Error Description :

Lost access to volume 4bcc91bd-37fe4b68-71b8-002481aa9b18 (DMX3238_LUN00AC1_SQL02_XISHIS04) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
info
2010/08/24 01:23:54 AM

9 REPLIES
Jan Soska
Honored Contributor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Hello, have you contacted HP support? Your problem is quite serious.

What to check:
1) Do you have other servers/OSes connected the same way (same SAN) to the same EMC array? If yes, are they working?
2) If it is only an ESX issue, raise a support call with VMware as well.
3) Consider upgrading your HBA. There is newer firmware/BIOS (1.89 - 2.15 BIOS / 2.20 EFI), released 26 May 2010.
4) Definitely upgrade your VC-FC firmware. There is a much newer version at http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=3201247&prodTypeId=3709945&prodSeriesId=3201246&swLang=8&taskId=135&swEnvOID=4040 and your version is almost 1 year old.

Jan
cjb_1
Trusted Contributor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

What's the OA firmware revision? I think the minimum is 2.60 for the BL460c G6.
Do you only have modules in bays 1 and 3?
Ludwig Penny
Occasional Advisor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Hi, to answer your questions:

Yes, we have contacted HP and our third party for support. The HP engineer recommended posting a thread here for support. This problem is critical, as we are running our live production systems on these servers.

We had 12 BL460c G6 servers running three Windows cluster servers in two separate enclosures across two sites with no issues. The ESX servers replaced the cluster servers in the same enclosures.

We have logged a call with VMware and EMC.

The reply from EMC is this:

SUPPORT COMMUNICATION - CUSTOMER NOTICE
Document ID: c01519875

Version: 2

Notice: (Revision) VMware ESX Server - The HP Fibre Channel Agent cmafcad Should Not Be Enabled on Unsupported Storage Subsystems and Third-Party Fibre Storage Enclosures Running VMware ESX Server
NOTICE: The information in this document, including products and software versions, is current as of the Release Date. This document is subject to change without notice.
Release Date: 2010-08-11

Last Updated: 2010-08-11

To answer the question about OA Firmware we are running the following.

Device Name BladeSystem c7000 OA
Firmware Version 2.60 Aug 31 2009
Hardware Version B1

We are running

4 x HP 1/10Gb VC-Enet Module with Firmware Version 2.30 (Slot 1,2,4,5)

2 x HP 4Gb VC-FC Module with Firmware Version 1.40 (Slot 3 & 4)
Ludwig Penny
Occasional Advisor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Sorry, correction: we are running

4 x HP 1/10Gb VC-Enet Module with Firmware Version 2.30 (Slot 1,2,5,6)

2 x HP 4Gb VC-FC Module with Firmware Version 1.40 (Slot 3 & 4)
cjb_1
Trusted Contributor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

That's good. Shouldn't be any need to upgrade enclosure FW unless HP can give you good reason.

Good luck.
cjb_1
Trusted Contributor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Just a thought, can you post the VC log for one of these events?
Ludwig Penny
Occasional Advisor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

2010-08-07T03:03:21-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showDomainStatus ([UNKNOWN]@[LOCAL])
2010-08-07T03:03:21-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-07T12:59:54-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-07T12:59:54-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-07T13:01:59-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-07T13:01:59-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-17T16:02:14-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.3.206
2010-08-17T18:21:25-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1011:Info] VCM user logout : exacct@172.17.3.206
2010-08-18T16:06:26-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.50.34
2010-08-18T19:10:01-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1011:Info] VCM user logout : exacct@172.17.3.206
2010-08-19T15:52:19-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.3.206
2010-08-19T17:17:32-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1011:Info] VCM user logout : exacct@172.17.3.206
2010-08-20T11:36:34-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-20T11:36:34-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-20T11:38:13-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-20T11:38:13-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-21T03:03:27-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showDomainStatus ([UNKNOWN]@[LOCAL])
2010-08-21T03:03:27-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-23T22:53:24-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-23T22:53:24-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-23T22:56:18-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-23T22:56:18-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-23T22:59:57-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-23T22:59:57-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-23T23:02:19-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showManagedObjects ([UNKNOWN]@[LOCAL])
2010-08-23T23:02:19-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1031:Warning] VCM remote request has no security header : hpvcm:retrieveStateChangeCounters ([UNKNOWN]@[LOCAL])
2010-08-24T17:21:52-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.3.206
2010-08-24T18:16:37-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1011:Info] VCM user logout : exacct@172.17.3.206
2010-08-25T15:35:21-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1012:Warning] VCM user authentication failure : exacct@172.17.3.206
2010-08-25T15:35:45-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.3.206
cjb_1
Trusted Contributor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Can't see any physical interruptions in this log at all. If the VC FC was losing connection to the storage I'd expect degradation messages for the assigned profiles or something. I reckon you'll need to pursue the ESX people for this one.
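For anyone who wants to repeat this check on a longer log, the idea above (look for anything other than session and login events) can be sketched in Python. This is only a rough sketch, assuming every line follows the same layout as the excerpt posted in this thread; the names `parse_line` and `summarize` are made up for illustration, and the three sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Layout assumed from the excerpt in this thread:
# <timestamp> <module> vcmd: [VCD:<domain>:<code>:<severity>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<module>\S+)\s+vcmd:\s+"
    r"\[VCD:(?P<domain>[^:\]]+):(?P<code>\d+):(?P<severity>\w+)\]\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count events by (code, severity, short label)."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip anything that is not in the assumed layout
        # Use the text before the first " : " as a short event label.
        label = entry["message"].split(" : ")[0]
        counts[(entry["code"], entry["severity"], label)] += 1
    return counts


# Sample lines copied from the log excerpt in this thread.
SAMPLE = [
    "2010-08-07T03:03:21-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1032:Warning] VCM remote session is invalid or has expired : hpvcd:showDomainStatus ([UNKNOWN]@[LOCAL])",
    "2010-08-17T16:02:14-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1010:Info] VCM user login : exacct@172.17.3.206",
    "2010-08-25T15:35:21-02:00 VCETW280700B2 vcmd: [VCD:Prod-C-ENC4-Mirror-Site_vc_domain:1012:Warning] VCM user authentication failure : exacct@172.17.3.206",
]

for (code, severity, label), count in sorted(summarize(SAMPLE).items()):
    print(code, severity, label, count)
```

If the summary of the full log contains only the session, login, logout and authentication events seen in the excerpt, that is consistent with the reading above that VC Manager did not log a physical interruption.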

Tijn
Advisor

Re: Virtual Connect : experiencing intermittent disconnects from all ESX servers to all LUNs

Hi Ludwig,

I'm experiencing the same problem at 2 customer sites.

I also noticed on a couple of hosts that the WWNN on one port of the FC HBA had changed.
On 2 hosts I even found exactly the same wrong WWNN on port 1 of their HBAs.
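In case it helps anyone check for the same thing, here is a rough Python sketch for spotting a WWNN that shows up on more than one host. It assumes you have already collected (host, port, WWNN) values by hand or from your own tooling; the host names and WWNN values below are hypothetical placeholders, not values from my customers, and `normalize_wwn` and `find_shared_wwnns` are names I made up.

```python
from collections import defaultdict


def normalize_wwn(wwn):
    """Return a WWN as lowercase, colon-separated hex (16 hex digits)."""
    hex_digits = wwn.replace(":", "").replace("-", "").lower()
    if len(hex_digits) != 16 or any(c not in "0123456789abcdef" for c in hex_digits):
        raise ValueError("not a 64-bit WWN: %r" % wwn)
    return ":".join(hex_digits[i:i + 2] for i in range(0, 16, 2))


def find_shared_wwnns(inventory):
    """Map each WWNN seen on more than one host to the sorted list of those hosts."""
    hosts_by_wwnn = defaultdict(set)
    for host, _port, wwnn in inventory:
        hosts_by_wwnn[normalize_wwn(wwnn)].add(host)
    return {
        wwnn: sorted(hosts)
        for wwnn, hosts in hosts_by_wwnn.items()
        if len(hosts) > 1
    }


# Hypothetical inventory: (host, HBA port, WWNN). Placeholder values only.
INVENTORY = [
    ("esx-a", 1, "50:00:00:00:00:00:00:01"),
    ("esx-a", 2, "50:00:00:00:00:00:00:0A"),
    ("esx-b", 1, "5000000000000001"),
    ("esx-b", 2, "50:00:00:00:00:00:00:0b"),
]

for wwnn, hosts in find_shared_wwnns(INVENTORY).items():
    print(wwnn, "is reported by", ", ".join(hosts))
```

The sketch groups by host rather than by port, since two ports of the same dual-port HBA reporting one node name is not the symptom described here; the symptom is the same WWNN turning up on different hosts.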

Do you have an update on this problem?

regards,

Martijn