
Issues with VSS and VDS on Win 2008 R2 SP1 cluster MSA2324i

NickDorak
Collector


We are having an issue using "Storage Manager for SANs", where the controller never shows up no matter which host we try from (I believe this uses the VDS driver).

The other issue is with VSS-based backups: only one host appears to be able to create hardware VSS snapshots, and all other hosts fall back to software VSS snapshots.

 

These issues are occurring on the hosts in a cluster that uses the VDS and VSS services for our MSA2324i storage array. iSCSI is working, and we are able to access the drives in Disk Management and Windows Explorer.  All of the servers are running Windows 2008 R2 SP1 with the latest Windows updates and HP drivers.  The HP SAN presents 2 volumes to all hosts and is set up for MPIO.

 

I have a feeling this stems from the CAPI driver, but am unsure how to troubleshoot this issue further.

 

I have tried to pick out what I feel are the important parts of the logs to save space; if more are needed, let me know.

 

From C:\Program Files (x86)\Hewlett-Packard\HardwareProviders\MSA2000\CAPI\ssproxy.log…

HOST2

Jul 24, 2012 14:23:42.058 [3048]: Finding Controllers...

Jul 24, 2012 14:23:42.058 [3048]: SCSI32LMX_FindNextController: first time, table has 2 entries

Jul 24, 2012 14:23:42.058 [3048]: ######## Found 2 controller LUN(s). ########

Jul 24, 2012 14:23:42.061 [3048]: myCallBack: at top, errorCode is not 0

Jul 24, 2012 14:23:42.061 [3048]: Response: ERROR; code = 1(dec) or x1

Jul 24, 2012 14:23:42.061 [3048]: Error: CAPI_ERROR_COMMAND_FAILED.

Jul 24, 2012 14:23:42.061 [3048]: Response: REPLY; code = 300(dec) or x12C

Jul 24, 2012 14:23:42.061 [3048]: Reply: CAPI_REPLY_CAPI_VERSION_MISMATCH

Jul 24, 2012 14:23:42.061 [3048]:  [co xB1BBFC] [ar xFFFFFFFF] [ch xFFFFFFFF] [dr xFFFFFFFF] [p1 0] [p2 0]

Jul 24, 2012 14:23:42.061 [3048]: !!! CAPI_REPLY_CAPI_VERSION_MISMATCH !!!

Jul 24, 2012 14:23:42.061 [3048]:     External controller is using v0.0

Jul 24, 2012 14:23:42.061 [3048]:     We're using v3.10

 

HOST3

Jul 24, 2012 14:26:35.682 [0820]: Finding Controllers...

Jul 24, 2012 14:26:35.682 [0820]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:26:35.682 [0820]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:26:35.682 [0820]: Freed g_NumControllers=0 controller=0 isPartner=0

Jul 24, 2012 14:26:35.682 [0820]: Freed 18457940, 18530048, 00000000...

Jul 24, 2012 14:26:35.682 [0820]: Couldn't find any controllers

Jul 24, 2012 14:26:35.682 [0820]:

Jul 24, 2012 14:26:35.682 [0820]: Initialization Complete.

Jul 24, 2012 14:26:35.682 [0820]: g_NumControllers 0 allowDuplicateControllers 0

 

HOST4 - only host able to do Hardware VSS snapshots

Jul 24, 2012 12:35:36.824 [3296]: Finding Controllers...

Jul 24, 2012 12:35:36.824 [3296]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 12:35:36.824 [3296]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 12:35:36.824 [3296]: Freed g_NumControllers=2 controller=2 isPartner=0

Jul 24, 2012 12:35:36.824 [3296]: Freed 189B0048, 189C1CA8, 00000000...

Jul 24, 2012 12:35:36.824 [3296]: ######## Found 2 controller LUN(s). ########

 

HOST7

Jul 24, 2012 14:31:05.511 [2012]: Finding Controllers...

Jul 24, 2012 14:31:05.512 [2012]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:31:05.512 [2012]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:31:05.512 [2012]: Freed g_NumControllers=0 controller=0 isPartner=0

Jul 24, 2012 14:31:05.512 [2012]: Freed 18960048, 1735FFC8, 00000000...

Jul 24, 2012 14:31:05.512 [2012]: Couldn't find any controllers

Jul 24, 2012 14:31:05.512 [2012]:

Jul 24, 2012 14:31:05.512 [2012]: Initialization Complete.

Jul 24, 2012 14:31:05.512 [2012]: g_NumControllers 0 allowDuplicateControllers 0

 

HOST8

Jul 24, 2012 14:31:27.723 [0368]: Finding Controllers...

Jul 24, 2012 14:31:27.723 [0368]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:31:27.723 [0368]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:31:27.724 [0368]: Freed g_NumControllers=0 controller=0 isPartner=0

Jul 24, 2012 14:31:27.724 [0368]: Freed 16FDF120, 16FFF180, 00000000...

Jul 24, 2012 14:31:27.724 [0368]: Couldn't find any controllers

Jul 24, 2012 14:31:27.724 [0368]:

Jul 24, 2012 14:31:27.724 [0368]: Initialization Complete.

Jul 24, 2012 14:31:27.725 [0368]: g_NumControllers 0 allowDuplicateControllers 0

 

HOST9

Jul 24, 2012 14:41:16.672 [2844]: Finding Controllers...

Jul 24, 2012 14:41:16.672 [2844]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:41:16.672 [2844]: SCSI32LMX_FindNextController: first time, table is EMPTY (return now)

Jul 24, 2012 14:41:16.672 [2844]: Freed g_NumControllers=0 controller=0 isPartner=0

Jul 24, 2012 14:41:16.672 [2844]: Freed 1714AFC8, 171CB368, 00000000...

Jul 24, 2012 14:41:16.672 [2844]: Couldn't find any controllers

Jul 24, 2012 14:41:16.672 [2844]:

Jul 24, 2012 14:41:16.672 [2844]: Initialization Complete.

Jul 24, 2012 14:41:16.672 [2844]: g_NumControllers 0 allowDuplicateControllers 0
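
To compare the hosts at a glance, the ssproxy.log excerpts above can be reduced to one status per host. The following is only a minimal sketch: the function name `classify_ssproxy` and the three status labels are made up for illustration, and the matching relies solely on message strings that appear in the excerpts above, not on any documented log format.

```python
import re

# Matches the "Found N controller LUN(s)" message seen in the excerpts.
FOUND_RE = re.compile(r"Found (\d+) controller LUN\(s\)")


def classify_ssproxy(lines):
    """Return a rough status for one host's ssproxy.log excerpt.

    The labels are hypothetical:
      "version-mismatch" - a CAPI version mismatch reply was logged
      "no-controllers"   - no "Found N controller LUN(s)" with N > 0
      "ok"               - controllers found and no mismatch logged
    """
    found = 0
    mismatch = False
    for line in lines:
        match = FOUND_RE.search(line)
        if match:
            found = int(match.group(1))
        if "CAPI_REPLY_CAPI_VERSION_MISMATCH" in line:
            mismatch = True
    if mismatch:
        return "version-mismatch"
    if found == 0:
        return "no-controllers"
    return "ok"


# Shortened excerpts copied from the logs above.
excerpts = {
    "HOST2": [
        "Jul 24, 2012 14:23:42.058 [3048]: ######## Found 2 controller LUN(s). ########",
        "Jul 24, 2012 14:23:42.061 [3048]: Reply: CAPI_REPLY_CAPI_VERSION_MISMATCH",
    ],
    "HOST3": [
        "Jul 24, 2012 14:26:35.682 [0820]: Couldn't find any controllers",
        "Jul 24, 2012 14:26:35.682 [0820]: g_NumControllers 0 allowDuplicateControllers 0",
    ],
    "HOST4": [
        "Jul 24, 2012 12:35:36.824 [3296]: Freed g_NumControllers=2 controller=2 isPartner=0",
        "Jul 24, 2012 12:35:36.824 [3296]: ######## Found 2 controller LUN(s). ########",
    ],
}

for host, lines in excerpts.items():
    print(host, classify_ssproxy(lines))
```

Read this way, the excerpts show three distinct behaviours rather than two: HOST4 finds both controllers, HOST2 finds controller LUNs but then gets a version mismatch, and the remaining hosts find nothing at all.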

 

 

From C:\Program Files\Hewlett-Packard\HardwareProviders\MSA2000\VSS\vss.log…

All non-working Hosts

Jul 24, 2012 14:00:54.661: discoverRaidControllers() returning 0x00000000

Jul 24, 2012 14:00:54.661: capiGetHbaWwpnList(): returning 0x00000000

Jul 24, 2012 14:00:54.768: StartingLun = 1

Jul 24, 2012 14:00:54.768: StandardVolumesAreSupported = 2

Jul 24, 2012 14:00:54.768: m_deviceIdDescriptor.m_cIdentifiers=2, m_deviceIdDescriptor.m_rgIdentifiers=0x008dd180

Jul 24, 2012 14:00:54.768:     VendorID='HP', ProductID='MSA2324i', ProductRevision='M113', SerialNumber='00c0ff1045ba0000a992684e01000000', BusType = 9.

Jul 24, 2012 14:00:54.768:     ID#0=(CodeSet=0x01, Type=0x00000003, Length=16, Id=0x600C0FF0001045BAA992684E01000000)

Jul 24, 2012 14:00:54.770:     ID#1=(CodeSet=0x01, Type=0x00000000, Length=32, Id=0x116461326331620000C0FF1045BA00000A0A0A4100C0FF101F3400000A0A0A40)

Jul 24, 2012 14:00:54.770:     m_cInterconnects=0, m_rgInterconnects=0x00000000

Jul 24, 2012 14:00:54.770: Entering: DVSS_HardwareProvider::getPartitionSerialNumber

Jul 24, 2012 14:00:54.770: DVSS_StringToSerialNumber(): SerialNumberString='00c0ff1045ba0000a992684e01000000', SerialNumberOut=000000000263EAE0, SerialNumberBufferLength=32.

Jul 24, 2012 14:00:54.770: returning 0x00000000

Jul 24, 2012 14:00:54.770:    PartitionSerialNumber=00C0FF1045BA0000A992684E01000000, device name=\\.\PHYSICALDRIVE3.

Jul 24, 2012 14:00:54.770: findPartitionArrayAndController() returning 0x80042308

Jul 24, 2012 14:00:54.770: findPartitionAndController() returning 0x80042308

Jul 24, 2012 14:00:54.789: nSubSystem = 0

Jul 24, 2012 14:00:54.789: Couldn't find any controllers

Jul 24, 2012 14:00:54.790: Logging Error Event "No RAID Controllers found.".

Jul 24, 2012 14:00:54.790: discoverRaidControllers() returning 0x00000000

Jul 24, 2012 14:00:54.790: findPartitionAndController() returning 0x80042308

Jul 24, 2012 14:00:54.791: Logging Error Event "AreLunsSupported(): called with nonexistent partition for LUN ('00c0ff1045ba0000a992684e01000000'), device name=\\.\PHYSICALDRIVE3".

Jul 24, 2012 14:00:54.791: returning 0x00000000, supported=FALSE

Jul 24, 2012 14:00:54.791: Exiting: DVSS_HardwareProvider::AreLunsSupported
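
The return codes in the vss.log excerpt above can be picked out mechanically. The sketch below assumes the "name() returning 0x..." line shape seen in this log; the helper name `failed_calls` is hypothetical. The lookup table is deliberately tiny: 0x80042308 is the VSS error VSS_E_OBJECT_NOT_FOUND, and any other failure code is printed as raw hex rather than guessed.

```python
import re

# Only codes that actually occur in the excerpt are named here.
KNOWN_HRESULTS = {
    0x00000000: "S_OK",
    0x80042308: "VSS_E_OBJECT_NOT_FOUND",
}

# Matches e.g. "findPartitionAndController() returning 0x80042308"
# and "capiGetHbaWwpnList(): returning 0x00000000".
RETURN_RE = re.compile(r"(\w+)\(\):? returning (0x[0-9A-Fa-f]{8})")


def failed_calls(lines):
    """Return (function, code name) pairs for calls that returned a failure HRESULT."""
    failures = []
    for line in lines:
        match = RETURN_RE.search(line)
        if not match:
            continue
        code = int(match.group(2), 16)
        if code & 0x80000000:  # the high bit marks a failure HRESULT
            failures.append((match.group(1), KNOWN_HRESULTS.get(code, hex(code))))
    return failures


sample = [
    "Jul 24, 2012 14:00:54.661: discoverRaidControllers() returning 0x00000000",
    "Jul 24, 2012 14:00:54.661: capiGetHbaWwpnList(): returning 0x00000000",
    "Jul 24, 2012 14:00:54.770: findPartitionArrayAndController() returning 0x80042308",
    "Jul 24, 2012 14:00:54.770: findPartitionAndController() returning 0x80042308",
]

for name, code in failed_calls(sample):
    print(name, code)
```

Note that discoverRaidControllers() returns 0x00000000 in this excerpt even though "Couldn't find any controllers" is logged, so the return code of that call alone does not show whether discovery succeeded.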

 

HOST4

Jul 24, 2012 09:21:28.584: discoverRaidControllers(): Found controller #1:

Jul 24, 2012 09:21:28.584:     discoverRaidControllers(): manufacturer=HP StorageWorks, model=MSA2324i, fw rev=M113R11, failover ID=1,

Jul 24, 2012 09:21:28.584:     discoverRaidControllers(): SN=00C0FF101F34, WWNN=208000C0FFDA2C1B,

Jul 24, 2012 09:21:28.584:     discoverRaidControllers(): baselevel Rev=M113R11, board Rev=53, CPLD Rev 8,F300R22, loader rev=19.009, CAPI rev 3.11, capabilities=EBA1553B,7AE435C1,80200FF4,

Jul 24, 2012 09:21:28.585: StartingLun = 1

Jul 24, 2012 09:21:28.585: StandardVolumesAreSupported = 2

Jul 24, 2012 09:21:28.585:     discoverRaidControllers(): supports licensing, supports snapshots,

Jul 24, 2012 09:21:28.585:     discoverRaidControllers(): maxSnapshots=8, maxSnapshotsPerControllerPair=8, baseMaxSnapshots=0, platformMaxSnapshots=256,

Jul 24, 2012 09:21:28.585: maxSnapshotsPerVolume=2148005664, baseMaxMasterVolumes=32, maxMasterVolumes=32, baseMaxBackingStores=32, maxBackingStores=32,

Jul 24, 2012 09:21:28.585:     discoverRaidControllers(): other controller, SN=00C0FF1045BA, WWNN=0000000000000000, fw rev=.

Jul 24, 2012 09:21:28.585:     discoverRaidControllers(): supports snapshot, so we will use this controller.

Jul 24, 2012 09:21:28.587: discoverRaidControllers(): CtrlHandle 0x009ECBD5

Jul 24, 2012 09:21:28.587: Entering: DVSS_CapiRaidController::capiUpdateController

Jul 24, 2012 09:21:28.748: Exiting: DVSS_CapiRaidController::capiUpdateController

Jul 24, 2012 09:21:28.748: discoverRaidControllers(): Found controller #2:

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): manufacturer=HP StorageWorks, model=MSA2324i, fw rev=M113R11, failover ID=0,

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): SN=00C0FF1045BA, WWNN=208000C0FFDA2C1B,

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): baselevel Rev=M113R11, board Rev=53, CPLD Rev 8,F300R22, loader rev=19.009, CAPI rev 3.11, capabilities=EBA1553B,7AE435C1,80200FF4,

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): supports licensing, supports snapshots,

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): maxSnapshots=8, maxSnapshotsPerControllerPair=8, baseMaxSnapshots=0, platformMaxSnapshots=256,

Jul 24, 2012 09:21:28.748: maxSnapshotsPerVolume=2148005664, baseMaxMasterVolumes=32, maxMasterVolumes=32, baseMaxBackingStores=32, maxBackingStores=32,

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): other controller, SN=00C0FF101F34, WWNN=0000000000000000, fw rev=.

Jul 24, 2012 09:21:28.748:     discoverRaidControllers(): supports snapshot, so we will use this controller.

Jul 24, 2012 09:21:28.748: Found 2 controller LUN(s).

 

From C:\Program Files\Hewlett-Packard\HardwareProviders\MSA2000\VDS\iSCSI\vds.log…

All Hosts

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProperties()

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProperties() out

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProperties()

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProperties() out

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProviderType

Jul 24, 2012 15:33:16: [4328/0628] CProvider::GetProviderType out

Jul 24, 2012 15:33:16: [4328/0628] CProvider::QuerySubSystems() enter. m_mapSubSystems.size()=0

Jul 24, 2012 15:33:16: [4328/0628] Refreshable() : enter !

Jul 24, 2012 15:33:16: [4328/0628] CProvider::IsNeedRefresh() returns FALSE

Jul 24, 2012 15:33:16: [4328/0628] Refreshable() : OUT!

Jul 24, 2012 15:33:16: [4328/0628] CProvider::QueryObjects()

Jul 24, 2012 15:33:16: [4328/0628] CProvider::QueryObjects() out

Jul 24, 2012 15:33:16: [4328/0628] CProvider::QuerySubSystems() exit. m_mapSubSystems.size()=0

 

Thanks for any assistance,

 

Nick Dorak

2 REPLIES
Manfri
Frequent Advisor

Re: Issues with VSS and VDS on Win 2008 R2 SP1 cluster MSA2324i

I don't have a solution, but merging the info from http://h30499.www3.hp.com/t5/Disk-Array/Issues-with-VSS-and-VDS-on-Win-2008-R2-SP1-cluster-MSA2324i/m-p/5737973#.UA-pR7Q0OSp and http://h30499.www3.hp.com/t5/Disk-Array/CAPI-VDS-VSS-problems/td-p/5695987 I concluded the following:

 

CAPI (which is used by VSS and VDS to communicate with the SAN) gets its information about the SAN's capabilities from the first LUN.

If that LUN is owned by the node, it works; if it is reserved by another node, it fails.

 

What pointed me to this was the combination of this excerpt from the first thread

 

Jul 24, 2012 14:00:54.661: discoverRaidControllers() returning 0x00000000

Jul 24, 2012 14:00:54.661: capiGetHbaWwpnList(): returning 0x00000000

Jul 24, 2012 14:00:54.768: StartingLun = 1

Jul 24, 2012 14:00:54.768: StandardVolumesAreSupported = 2

Jul 24, 2012 14:00:54.768: m_deviceIdDescriptor.m_cIdentifiers=2

 

and the second thread quoted above.

 

In the second thread quoted above, the quorum is the first LUN mapped, and I suspect that for the second thread the first LUN was owned by the node that is able to do hardware snapshots.

 

I had the same problem, and no one was able to fix it (not even HP support), but I think the combination of these two threads, a new P2000 install and a sleepless night sheds new light on the problem.

 

The first test in my environment (moving the cluster resource to the second node) made Storage Manager for SANs show the LUN info on the second node, but I'm sure that is not a real solution.

In my next test I think I will map a dedicated (non-clustered) LUN as the first LUN on every node; I believe that workaround will more or less fix the problem.
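
The hypothesis and the planned workaround can be written down as a small model. This is only a sketch of the guess described above, not of how CAPI is known to work: the `Lun` class, the `capi_can_discover` function, the host names and the LUN numbers are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Lun:
    number: int
    # Cluster node currently holding the reservation; None = not clustered.
    reserved_by: Optional[str] = None


def capi_can_discover(host: str, mapped_luns: List[Lun]) -> bool:
    """Hypothesis: discovery works only if the first mapped LUN is usable by this host."""
    first = min(mapped_luns, key=lambda lun: lun.number)
    return first.reserved_by is None or first.reserved_by == host


hosts = ["HOST2", "HOST3", "HOST4"]

# Current layout: the first LUN is a cluster disk reserved by one node.
shared = [Lun(1, reserved_by="HOST4"), Lun(2, reserved_by="HOST4")]
before = {h: capi_can_discover(h, shared) for h in hosts}

# Workaround: each node also gets its own non-clustered LUN mapped first.
with_dedicated = [Lun(0)] + shared
after = {h: capi_can_discover(h, with_dedicated) for h in hosts}

print(before)
print(after)
```

Under this model only the node that owns the first clustered LUN succeeds, which matches the single working host reported in the first post, and every node succeeds once an unreserved LUN is mapped ahead of the clustered ones. Whether the real provider behaves this way is exactly what the next test would have to show.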

 

I understand that it's not really a solution, especially if you have lots of nodes on an iSCSI MSA2000 (please read this, even if it may be resolved in newer firmware: http://www.hyper-v.nu/archives/hvredevoort/2011/01/array-firmware-as-a-limiting-factor-in-r2-clusters/). But if this workaround works, maybe we can contact HP support with a clearly defined problem and something we can claim is a bug to be fixed.

 

may the force be with us

Manfri
Frequent Advisor

Re: Issues with VSS and VDS on Win 2008 R2 SP1 cluster MSA2324i

Update: from a quick test (VDS capability only, but previously that test was conclusive), I've seen that the problem is fixed.

Tested with firmware T250R16-01 with capi proxy/vds/vss 2.8.4.39 

 

Also, from the release notes:

 


Version: 2.8.3.27 (20 Jun 2012)

Fixes

Upgrade Requirement:
Recommended - HP recommends users update to this version at their earliest convenience.

Version: 2.8.3.27

Addressed a buffer overrun on systems with a large number of master volumes.

Enhancements

The following enhancements and fixes were incorporated in 2.8.2.27:

Added support for HP P2000 G3 MSA Controller TS240 firmware.
Added support for up to 32 arrays and 32 HBA ports on a single system.
Improved device discovery and debug logging.
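
To check whether a host's installed CAPI proxy/VDS/VSS package is at least the build tested above (2.8.4.39), the dotted version strings need to be compared number by number rather than as text. A minimal sketch, with hypothetical helper names; how the installed version is read from a host is not covered here.

```python
def parse_version(text):
    """Turn a dotted version such as '2.8.4.39' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def at_least(installed, required):
    """True if the installed version is the same as or newer than the required one."""
    return parse_version(installed) >= parse_version(required)


TESTED_BUILD = "2.8.4.39"  # the build reported as working in the update above

for candidate in ("2.8.3.27", "2.8.4.39", "2.8.10.1"):
    print(candidate, at_least(candidate, TESTED_BUILD))
```

Comparing tuples of integers avoids the trap where "2.8.10.1" sorts before "2.8.4.39" as plain text.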