Server Clustering

DL380 MSA500 packaged Cluster power on order

SOLVED
Butault
Occasional Visitor

hi,

I have a question about automatic power-on after a general power failure we had on a DL380 packaged cluster with an MSA500.

We had to restart both nodes (Win2003) again for them to see the MSA500, because everything (servers and MSA500) started automatically at the same time. The problem is that we have to start the cluster manually; it cannot fully restart by itself.

any solutions?
10 REPLIES
Doug de Werd
Honored Contributor
Solution

Re: DL380 MSA500 packaged Cluster power on order

the only way to really solve this problem is with an intelligent UPS that can stagger the power up times. The MSA500 needs to come on first, and then the servers.

Thanks,
Doug
Expert in ProLiant Clusters
MC1903
Regular Advisor

Re: DL380 MSA500 packaged Cluster power on order

Hi,

In order to stagger the OS startup for each server, edit the TIMEOUT value in the C:\boot.ini file on each node.

I suggest setting node-A to 300 seconds (5 mins) and node-B to 420 seconds (7 mins).

This way the MSA 500 should have sufficient time to complete its start-up processes before node-A loads. By the time node-B starts, the cluster should be up and running on node-A.
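
A minimal sketch of the edit described above, in Python. It only rewrites the `timeout` line inside the `[boot loader]` section of a boot.ini text. The function name `set_boot_timeout` and the sample ARC path are illustrative assumptions, not taken from a real system. Note that this delays the OS load only, not the server's POST.

```python
import re

def set_boot_timeout(text: str, seconds: int) -> str:
    """Return boot.ini text with the [boot loader] timeout set to `seconds`."""
    out = []
    in_boot_loader = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track which section we are in; only [boot loader] holds timeout.
            in_boot_loader = stripped.lower() == "[boot loader]"
        elif in_boot_loader and re.match(r"timeout\s*=", stripped, re.IGNORECASE):
            line = f"timeout={seconds}"
        out.append(line)
    return "\n".join(out)

# Hypothetical boot.ini content (the ARC path is an assumption for illustration).
SAMPLE = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect
"""

node_a = set_boot_timeout(SAMPLE, 300)  # 5 minutes, as suggested for node-A
node_b = set_boot_timeout(SAMPLE, 420)  # 7 minutes, as suggested for node-B
```

On a real node the file is normally hidden and read-only, so its attributes would need to be cleared before saving the change.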

I have used this many times in the past on NT4 EE & W2K A/S clusters and I don't see why it would not work on W2K3 clusters.

Hope this helps.

Martin

Butault
Occasional Visitor

Re: DL380 MSA500 packaged Cluster power on order

Thanks Martin, but my problem comes from the MSA500.

The MSA500 is SCSI shared storage; the LUNs are discovered via the internal SA5i+. For this to work, I would need to be able to put a delay before the servers' BIOS so that they can see the LUNs when everything starts at the same time.

but anyway thanks for your answer.

Louis-Marie
MC1903
Regular Advisor

Re: DL380 MSA500 packaged Cluster power on order

Mmmm - Fatal flaw in my plan then!

I completely overlooked the fact that it was a SCSI array. Up to now I have been using fibre channel (RA4x00, MSA1000, etc.) arrays. This problem does not occur with FC arrays, as they are not scanned during the server's POST.

I have just built an MSA500 windows cluster and I have now encountered this problem.

If anyone from HP ProLiant/ISS Engineering is monitoring this forum - Is there any way you could give us a delayed/timed start option?

Perhaps as the server begins POST (somewhere before the PCI bus/card initialisation) start a countdown delay, before resuming the rest of the normal POST.

This way it would allow the MSA500 to complete its start-up, before the SMART Array controllers scan for logical drives.

If this is feasible, could someone let us know the likelihood of it being added in to a future enhancement release?

Cheers,

Martin
Doug de Werd
Honored Contributor

Re: DL380 MSA500 packaged Cluster power on order

As I mentioned in my previous post, the only way to really do this is with an intelligent UPS that delays the bootup of the servers.

As you noticed, you can't put this into the boot.ini file because you have to have the MSA500 complete its startup routine entirely before the servers even begin their POST process.

The only other possibility, which does not exist today, would be to have a setting in NVRAM to suspend POST at a point prior to the Smart 5i initialization. But that would also delay it on every reboot, since the BIOS can't tell whether or not the bootup is a standard one or one after a failure.

So you could do this with an HP R3000 UPS. You need to use the HP Rack and Power Manager utility and have the UPS connected to one of the servers via a serial cable. The UPS power outlets can be configured in "banks", and you can specify a wait period before power is applied to each bank. So you would connect the cluster servers to one bank and the storage to another.

If power goes down, the first thing that happens is that the UPS recognizes this. It can be configured for several types of action, but assuming the power is down for a long period (a couple of hours), after about 10-15 minutes the UPS would initiate a graceful server shutdown (via the UPS service running on the servers, not just a power off). At that point the UPS would cut off all power.

Once power has been restored, the UPS checks its configuration and sees that the bank with the storage on it should be powered up first. Then, after the predetermined wait period (which you have to specify manually when you configure the UPS, say 3 or 4 minutes), the UPS will apply power to the second bank, which has the servers attached.
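
The bank sequencing can be sanity-checked with a small Python sketch. This is a model only, not the configuration format of any HP utility. The `Bank` class, the function names, and the 180-second storage startup time are assumptions for illustration; the storage startup time in particular would have to be measured on the real MSA500.

```python
from dataclasses import dataclass

@dataclass
class Bank:
    name: str
    power_on_delay_s: int  # seconds after utility power returns

def power_on_order(banks):
    """Names of the banks in the order the UPS would energize them."""
    return [b.name for b in sorted(banks, key=lambda b: b.power_on_delay_s)]

def servers_wait_long_enough(storage, servers, storage_startup_s):
    """True if the server bank is powered only after the storage has finished starting."""
    return servers.power_on_delay_s >= storage.power_on_delay_s + storage_startup_s

storage = Bank("storage (MSA500)", power_on_delay_s=0)
servers = Bank("cluster nodes", power_on_delay_s=240)  # 4-minute wait period

STORAGE_STARTUP_S = 180  # assumed value; measure on the actual hardware

order = power_on_order([servers, storage])
ok = servers_wait_long_enough(storage, servers, STORAGE_STARTUP_S)
```

If `ok` comes out false for the measured startup time, the wait period on the server bank needs to be longer.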

You can get more info on this here:

http://h18004.www1.hp.com/products/servers/proliantstorage/power-protection/rackups/index.html

http://h18004.www1.hp.com/products/servers/proliantstorage/power-protection/software/index.html

Thanks,
Doug
Expert in ProLiant Clusters
Uwe Zessin
Honored Contributor

Re: DL380 MSA500 packaged Cluster power on order

Why not fix the device drivers and have them issue a re-scan of the bus when the operating system initializes?
Doug de Werd
Honored Contributor

Re: DL380 MSA500 packaged Cluster power on order

You could probably do something like that, but it also would not be without problems.

You would have to link the rescan behavior to the cluster service, because if the cluster service tries to start and doesn't see any disks, it will fail. Also, how would you put in a timer into the driver to know WHEN to rescan the bus? 2 minutes? 30 seconds? 5 minutes? Would you have to create a utility to manage the driver? Would you have it rescan the bus every time it boots? If so, what about the times when you do a manual startup - now you have your application running and a couple of minutes later you have a disruptive rescan. What if it is not a cluster and you are booting from the external storage? Then the driver rescan wouldn't work anyways because the driver is on the storage!

There are probably several ways to solve this problem, but applying engineering resources to it (and ongoing maintenance, testing, and support) probably would not be worth it in the long term, especially since there is a simple solution already available.

Plus, if you are really concerned about high availability (isn't that why you got a cluster in the first place?), why would you not already have a UPS?

So yes, it could probably be done technically, but it would probably be more work than it is worth.

Thanks,
Doug
Expert in ProLiant Clusters
Uwe Zessin
Honored Contributor

Re: DL380 MSA500 packaged Cluster power on order

OK, now I've a better understanding - thanks!
Corbett Enders_1
Occasional Visitor

Re: DL380 MSA500 packaged Cluster power on order

While the suggestion of using the UPS is valid, in a larger datacenter this is likely not possible.

A solution that is viable for everyone is a "switched PDU", available from a vendor such as APC. A switched PDU gives you the ability to set a "power on delay" on each outlet. With that, the shared storage can power on immediately and the nodes can stay off for several minutes. Beyond that, it would give something like a SQL server time to complete its startup before a utility server boots to access the database.
Zdenek Sychra
Occasional Visitor

Re: DL380 MSA500 packaged Cluster power on order

We have tried the APC AP7921 PDU, but when we unplugged the PDU and then plugged it back into 230V (to simulate servers and storage starting after a general power failure), the servers started without any delay.

Is this standard behaviour for PDUs? Any ideas?