Online Recovery

Ayman Altounji
Valued Contributor

We have several pairs of redundant servers installed at a few sites. They are currently configured to use the Standby recovery option: the primary uses the shared RAID, and the secondary server waits after POST for the primary to fail, then assumes control of the RAID and boots normally. This works, but the changeover takes up to 4 minutes.

We have attempted to use the Online recovery option, where each server is online with its own OS drive and the primary is in charge of the RAID until it dies, at which point the secondary server takes it over.

The time taken to switch over in this configuration is still excessive, because the services (including SQL Server) need to be restarted and a version of CHKDSK is run when the RAID is switched over. This still leaves us with about a 4 minute changeover, which is not acceptable to the customer (understandably).

Does anyone know where the latest drivers for this option are, and whether we can cut down the time taken to perform the switchover? For example, the heartbeat monitor recognizes the loss of the primary server in under 10 seconds, but the other agents appear to take another 3 1/2 minutes to do anything about it...

Many thanks,

Nigel Quinnin
Mark Cloutier
Respected Contributor

Re: Online Recovery

Yes, unfortunately that is just how it goes. Given the kind of setup you seem to be leaning towards, though, you might want to look into some sort of clustering solution for your servers.
We are here for a good time, not a long time!