
Re: Server Replacement

 
Rich Beadles
Occasional Contributor

Server Replacement

We are investigating replacement of our two K9000 HPUX servers. We are very concerned with server redundancy and disaster recovery, which we don't really have now. We have looked into MC ServiceGuard, but could not afford this option. Another option that occurred to us was purchasing three servers to replace our existing two servers. One of the servers would be located at another site connected to our primary site via fiber optic cable. All three servers would be connected to our Compaq SAN via fibre channel cards, where all of the database information would be stored. The third server would act as a spare (and we could use it to test) in case of a failure or planned outage of one of the other two servers. While automatic failover would be fairly complicated, a manual failover would require us only to make a simple change on the SAN using its management software and to change the name and IP address of the third Unix server to those of the failed Unix server. What would be the implications of doing this using HPUX 11? How difficult would it be to do? How long would it take to complete the changes? Thanks in advance for your help!
harry d brown jr
Honored Contributor

Re: Server Replacement

Do a "man set_parms".

live free or die
harry
harry d brown jr
Honored Contributor

Re: Server Replacement

Rich,

Another thing, after you change the IP of the "spare" server, and "rezone" the disks to be available to the "spare" server, you will probably have to do at least some of these things:

o ioscan the devices

o "insf -e" the devices

o import the VGs

o fsck the filesystems

o mount the LVs

o do some kind of data validation - database check

o start the applications

o check to make sure your routers know how to get to the "spare" server - you might need to flush the arp table in the routers.
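
A minimal sketch of the disk-related steps above as a script. It assumes a single volume group vg01 holding one VxFS filesystem, and a map file saved earlier on the original server with "vgexport -p -s -m". The VG name, map file path, minor number, logical volume and mount point are all placeholders, not details from this thread. It only prints the commands unless DRY_RUN=0 is set, so the sequence can be reviewed first. Note that fsck runs before the mount, since the filesystem should be checked while it is unmounted.

```shell
#!/bin/sh
# Manual failover sketch for the "spare" server.
# Dry-run by default: prints each command instead of running it.
# All values below are ASSUMED placeholders; adjust them for the real system.
DRY_RUN=${DRY_RUN:-1}
VG=${VG:-vg01}                              # volume group to import
MAPFILE=${MAPFILE:-/etc/lvmconf/vg01.map}   # made with: vgexport -p -s -m
MINOR=${MINOR:-0x010000}                    # must be unique per VG on this host
LVOL=${LVOL:-lvol1}                         # logical volume holding the data
MNT=${MNT:-/data}                           # mount point (must already exist)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

failover() {
    run ioscan -fnC disk                      # rescan for the rezoned disks
    run insf -e                               # (re)create device special files
    run mkdir -p /dev/$VG                     # VG directory and group file
    run mknod /dev/$VG/group c 64 $MINOR
    run vgimport -s -m "$MAPFILE" /dev/$VG    # import the VG using the map file
    run vgchange -a y /dev/$VG                # activate it
    run fsck -F vxfs /dev/$VG/r$LVOL          # check the raw device first
    run mount /dev/$VG/$LVOL "$MNT"           # then mount
    # database check and application start are site-specific and left out
}

failover
```

Run it once as-is to read the plan, then with DRY_RUN=0 as root on the spare after the SAN has been rezoned.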

You also need to consider OS related issues, like PASSWORDS and ACCOUNTS. Somehow you need to make sure these stay "in-sync" (no, not the "performers").



live free or die
harry
Wodisch
Honored Contributor

Re: Server Replacement

Hi,

since you're thinking about a "manual" failover: if your three servers will be *hardware-look-alikes* (i.e. same interface-cards in the same slots, etc.), then you could simply restore "make_tape_recovery" tapes onto the third one.
To automate that, you could install "Ignite/UX server" on the two "productive" systems, and schedule (via crontab-jobs) "make_net_recovery" to the opposite station, and then in case of disaster restore the third server from the *surviving* Ignite/UX server on the network.
You could even create a periodic job on the third system to check the *others* and start the restore from the surviving Ignite/UX server automatically...
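
As a rough illustration of the scheduled make_net_recovery idea, here is a small helper that prints a root crontab line for a weekly recovery image sent to the opposite Ignite/UX server. The host name "serverB", the Sunday 02:00 schedule and the log file path are assumptions made up for the example. The -s (Ignite server) and -A (include the whole volume groups holding the selected files) options should be checked against the make_net_recovery man page for your Ignite/UX version.

```shell
#!/bin/sh
# Print a root crontab entry that archives this system to a peer Ignite/UX server.
# "$1" is the peer's host name; schedule and log path are ASSUMED placeholders.
recovery_cron_line() {
    peer=$1
    echo "0 2 * * 0 /opt/ignite/bin/make_net_recovery -s $peer -A >/var/tmp/make_net_recovery.log 2>&1"
}

# On server A, archive to server B (and the reverse on server B).
recovery_cron_line serverB
```

The printed line can be reviewed and then added with "crontab -e" as root.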

But all that will not really come close to MC/ServiceGuard :-(
And your *third* machine will be "wasted" as a warm backup, not being able to do anything productive in between... so MC/SG might be even cheaper than a third server!

Just my $0.02,
Wodisch
Thomas Schler_1
Trusted Contributor

Re: Server Replacement

Rich:

MC ServiceGuard is the software you need. It is designed for server redundancy. For disaster recovery, you should use Ignite UX (a tool for creating system images and bootable tapes).

From your post, security and high availability have high priority in your company, so you really should think about replacing two servers with three. The advantage is that at least one server is located somewhere else; best would be a different building. But then you should also think about archiving your database backups at a completely separate location, so that a fire or other disaster at one site won't stop your company's activity, since the second site (one server with recently restored database files) is still working.

You should not use the third server just for testing or as a spare. The chances are high that you "forget" something, so that everything works fine on your test machine but not on the others, which are much more critical (just think of forgetting to create necessary mount points, or other things). If the two other servers are down for any reason, you need a fully functioning server doing all the jobs that have to be done, without loss of performance. You really don't know *today* why both of the other servers could be down. But there are hundreds of reasons why two servers could be down at the same time.

The best choice is to keep all three servers at the same level (same hardware and same software configuration at all times). Then you should introduce a kind of circular usage of the servers. E.g. servers 1 and 2 are running applications, while server 3 stands by. Then server 3 can be used for testing or installing patches. The next time you have to reboot your systems (e.g. after patch installations), servers 2 and 3 should run applications, and server 1 should stand by or be used for testing. In the last step, servers 1 and 3 are running applications, while server 2 stands by or can be used for testing.

Using the servers this way ensures that all of them are always known to run well. And in case of a server failure, you really know that switching the application from one server to another works, even at night when you are asleep in bed (and need not be woken). This works quite well using MC ServiceGuard.

If your SAN management software can be configured for two servers, why not configure it so that all three servers connect to the SAN? If this is not possible for any reason, you have to use set_parms as Harry wrote. (But in that case you should throw away your SAN management software as soon as possible.)

set_parms:

> What would be the implications of doing this using HPUX 11?

Execute '/sbin/set_parms hostname' and '/sbin/set_parms ip_address'. Reboot your system.

> How difficult would it be to do?

Very easy.

> How long would it take to complete the changes?

Not longer than you need for changing two parameters and rebooting your system.
no users -- no problems
Dave Wherry
Esteemed Contributor

Re: Server Replacement

I've tried to go this route myself. I call it Poor Man's ServiceGuard. With either an EMC frame or an XP256/512 I've proposed this and felt confident it would work. I just never got the project funded. As the others have filled in some of the blanks, you would be OK if this is the level of DR protection you want.

The one problem with the scenario is in guarding against a physical disaster at your primary site. While you would have a standby server at another location, all of your data is at your primary site. A fire, an electrical outage, or anything else that could take out a primary server could also take out your data. You might also want to look at adding data replication to your alternate site. EMC has SRDF. The XPs have Continuous Access. Compaq also has this functionality, I just can't remember the name of it.

Honestly, the new servers are dependable enough that I'm not that worried about a server being down for an extended time where I would want to cut over to a standby. I'm more worried about a larger scale physical disaster.