HPE StoreVirtual Storage / LeftHand

LeftHand upgrade 8.5 > 9.0 results in Windows 2008 R2 cluster node crashes

SOLVED
Go to solution
ictgvbsb
Frequent Advisor

LeftHand upgrade 8.5 > 9.0 results in Windows 2008 R2 cluster node crashes

I have done an upgrade of LeftHand SAN/iQ from 8.5 to 9.0.

During this upgrade, several Windows 2008 R2 cluster nodes randomly crashed.

After booting, the HP DSM favorite targets' IP addresses had been changed to other NIC addresses (LAN and/or cluster NIC).

 

Has anyone had the same problem?

 

I have followed HP's best practice guide for configuring HP DSM on my servers. Are there additional Windows settings to configure?

How can I prevent the mix-up of iSCSI favorites, and prevent reboots/crashes of Windows 2008 cluster nodes, during future upgrades?

 

 

2 REPLIES
ictgvbsb
Frequent Advisor

Re: LeftHand upgrade 8.5 > 9.0 results in Windows 2008 R2 cluster node crashes

I have opened a support case for this strange issue with HP Storage support.

They have already escalated the case...

 

To be continued.

ictgvbsb
Frequent Advisor
Solution

Re: LeftHand upgrade 8.5 > 9.0 results in Windows 2008 R2 cluster node crashes

Hi there,

 

HP advises bringing the 12-node cluster back to 10 nodes and setting the

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\

MaxRequestHoldTime value to 600 seconds (i.e. 10 minutes).
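For reference, on a standard Windows host the Microsoft iSCSI initiator's MaxRequestHoldTime lives under the SCSI-adapter class key; the instance number (0000, 0001, ...) varies per system, so the path below is an assumption to verify first with `reg query`. A minimal sketch of setting it to 600 seconds from an elevated command prompt:

```
:: Find which adapter instance carries the iSCSI initiator timeout values
reg query "HKLM\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}" /s /f MaxRequestHoldTime

:: Set the hold time to 600 seconds (replace 0000 with the instance found above)
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}\0000\Parameters" /v MaxRequestHoldTime /t REG_DWORD /d 600 /f
```

The change takes effect after a reboot (or a restart of the iSCSI initiator service).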

 

See also the timeout document from HP; it is an old document but still applicable to Windows 2008 R2.


 

This is the response from HP L2 support:

 

Below is the analysis from the L-2 engineer:

  1. No errors were reported during the upgrade that point to any hardware failures.
  2. I see volumes going unavailable at the time of the upgrade. This is important, as it can cause node crashes. I suspect the initiator and disk timeout settings on the hosts may not be appropriate; the customer should confirm this.
  3. The gap between node reboots is too short (8-10 minutes), so I suspect there were occasions where more than one node was out of sync, which led to this. The recommendation is to have fewer nodes in the cluster: 12 nodes are present in the cluster where the issue was reported, and I would recommend fewer than 10 nodes if feasible. NOTE: The customer did not report the issue on other clusters with fewer nodes.
  4. The best practices guide clearly says to bring nodes offline during the upgrade, or at least to have a minimal workload. I suspect the load would not have been very low, as it was a Thursday morning.
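Point 2 above refers to host-side timeout settings. On a standard Windows host, the disk I/O timeout is the well-known Disk\TimeOutValue registry entry (a DWORD in seconds). The sketch below shows how it could be checked and set; the value of 60 seconds is a common recommendation for iSCSI SANs, not something confirmed in this thread, so verify it against HP's DSM best practices guide:

```
:: Check the current disk timeout (in seconds)
reg query "HKLM\SYSTEM\CurrentControlSet\Services\Disk" /v TimeOutValue

:: Example only: set it to 60 seconds (verify against vendor guidance first)
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Disk" /v TimeOutValue /t REG_DWORD /d 60 /f
```

A reboot is typically required for the new timeout to take effect.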