What are the methods available for High Availability for a Unix server?

Daniel Tan_1
Occasional Visitor


Hi,

I'm currently looking at different ways to ensure high availability for a
business application, starting with the Unix box that holds our
database (Oracle). After a quick chat with the DBA, he suggested the following options.

Standby server
============
- A complete back-up solution for Unix and database aspects.

Pros:
- Minimal impact to users during outages
- Outages are transparent to users
- High availability of database at all times

Cons:
- Expensive initial start-up and maintenance.
- Minimal usage when on standby.

Questions/Notes:
------------------
- Any other pros and cons that you can add to the list?


MC/Service Guard with Oracle High Availability Service (HP)
==========================================
- Activates another production server with a similar hardware configuration
upon outage. Prior to the outage, this production server performs its own jobs
and services, and only acts as a backup server when outages happen. Used in
Agilent for some servers.

Pros:
- Minimal impact during outages (similar to standby server)
- Outages are transparent to users (similar to standby server)
- High availability of database at all times
- Cheaper than standby server
- Better utilization of resources for both servers.
- Currently used in Agilent.

Cons:
- Requires sourcing of servers of similar hardware configuration which might
not be available.
- Hardware upgrading required for both servers to support full load of both
servers when either server goes down

Questions/Notes:
-----------------
1) Does it require a lot of testing and configuration on both servers?
2) Any other pros and cons that you can add to the list?
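
To make the capacity concern above concrete, here is a minimal Python sketch of the check: each node needs headroom for its own load plus the load it adopts from the failed peer. The node names, capacities and load figures are made-up numbers in arbitrary units, not measurements from any real server.

```python
def can_absorb_failover(capacity, own_load, adopted_load):
    """Return True if a node can carry its own load plus the failed peer's load."""
    return own_load + adopted_load <= capacity


# Hypothetical figures in arbitrary "load units" (assumptions for illustration).
nodes = {
    "node_a": {"capacity": 100, "load": 45},
    "node_b": {"capacity": 100, "load": 60},
}

for name, node in nodes.items():
    # The peer is whichever node is not the one being checked.
    peer = next(other for other_name, other in nodes.items() if other_name != name)
    ok = can_absorb_failover(node["capacity"], node["load"], peer["load"])
    print(f"{name} can absorb its peer's load: {ok}")
```

With these example numbers neither node has enough headroom (45 + 60 > 100), which is the situation where a hardware upgrade on both servers would be needed.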


Standby database
=============
- A back-up database residing on another server that activates only when the
production database is down.
- Switching to the standby database is not automated and requires a
certain period of down time.

Pros:
- Most economical of all 3 methods

Cons:
- Requires sourcing of production servers that are not heavily utilized
- Hardware upgrading required for this back-up server.
- Users will be affected as down time is required.

Questions/Notes:
-----------------
- Any other pros and cons that you can add to the list?

Are there any other methods or suggestions that I can employ to ensure
high availability for a Unix server?

Appreciate any comments or suggestions.


10 REPLIES
Animesh Chakraborty
Honored Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi,
It all depends on how critical your system is
and how much downtime you can afford.
Configuring MC/ServiceGuard is not a big issue. What is the current configuration of your server and OS version?

Thanks
Animesh
Did you take a backup?
James R. Ferguson
Acclaimed Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Daniel:

In my opinion, the MC/ServiceGuard solution is the most desirable.

Consider that in High Availability (HA) clusters, "active-active" describes a configuration where several nodes (servers) are running critical applications while serving as backup nodes for one another.

"Active-passive" ("active-standby") configurations are those where several nodes are running critical services and others are idle awaiting a failover event. The idle nodes will take over in the event of a primary node's failure. In some cases, the standby node may be a test system which ceases to serve as a test environment in the event it needs to assume the critical applications of its failed primary.

You have the option of configuring a standby node as a smaller server than the primary when cost is at issue. The idea is to provide service during a failover, albeit not necessarily at optimum levels.

You also have the option of shedding packages from the standby server during a failover to reduce its overall load, leaving more of its resources available to the package adopted from the primary (failed) server.
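
As a rough illustration of the shedding idea, the sketch below picks which packages a standby node would drop, lowest priority first, until the adopted package fits. This is only a planning aid under assumed numbers; the package names, loads, priorities and the greedy rule are hypothetical and are not how MC/ServiceGuard itself decides anything.

```python
def packages_to_shed(capacity, running, adopted_load):
    """Pick packages to halt so that the adopted package fits on this node.

    running is a list of (name, load, priority) tuples; a higher priority
    number means a more important package. Lowest priority is shed first.
    """
    total = sum(load for _, load, _ in running) + adopted_load
    shed = []
    for name, load, _ in sorted(running, key=lambda pkg: pkg[2]):
        if total <= capacity:
            break
        shed.append(name)
        total -= load
    if total > capacity:
        raise ValueError("adopted package does not fit even after shedding everything")
    return shed


# Hypothetical standby node: capacity 100 units, adopting a 50-unit package.
standby_packages = [("test_env", 30, 1), ("reporting", 20, 2), ("billing", 40, 3)]
print(packages_to_shed(100, standby_packages, adopted_load=50))
```

In this example the test environment and the reporting package are dropped, and the billing package keeps running alongside the adopted one.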

The setup of a functional cluster and the certification of its viability are not trivial. However, there is a wealth of documents on MC/ServiceGuard at www.docs.hp.com -- specifically here:

http://docs.hp.com/hpux/ha/index.html

and in the Technical Knowledge Base (TKB). You can find MC/ServiceGuard documentation in the TKB by searching with the keyword "UXSG*". One document you will find immediately useful if you set up a cluster is: "How do I test MC/ServiceGuard?", document #UXSGKBAN00000471.

Regards!

...JRF...
Daniel Tan_1
Occasional Visitor

Re: What are the methods available for High Availability for a Unix server?

Here's the Unix server specs.

Server Type : UNIX
System Model : N4000
Number of Processor Card: 4
MEMORY : 4 GB

DISK:
Application Disk Space : 88 GB
Uses RAID 5 Protection & MIRROR Protection
OS Type : HP-UX
OS Version : 11.0
Roman_5
Occasional Visitor

Re: What are the methods available for High Availability for a Unix server?

just some thoughts:
When running Oracle Parallel Server under MC/ServiceGuard, each node runs a separate Oracle instance, but they are all connected to the same database and share the same set of hard drives.
So Oracle Parallel Server needs shared data disks which can be connected to both nodes. OPS also needs raw partitions instead of filesystems for the database.
If you plan to use the backup server to perform some services while your application is still running on the primary server, you need to make sure that I/O throughput is sufficient to accommodate both your usual production load and these services (jobs) concurrently.
Also, I'd try to avoid running jobs on the backup server that touch the data being accessed by the production application.
When the same data is accessed by two Oracle instances, Oracle has to ensure that each of them sees the changes made by the other. As a result, instead of being served from the memory cache, the data will often be read from disk or sent over the cluster interconnect, which is slower and may hurt performance.

To make failover completely transparent to users, the application still needs to be able to handle some exceptional conditions, whether Oracle connection-time failover or Oracle transparent application failover is used.
Transparent application failover "transparently" moves the application connection to the second instance, but there are so many restrictions on it that it usually does not happen that smoothly, and the application needs to be able to handle loss of the package state (if any) and some transaction-related errors.
Connection-time failover means that the application will need to reconnect when database failover happens, with apparent loss of database session state.
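
Here is a minimal sketch of what "the application needs to reconnect" can look like on the client side. It deliberately uses no real Oracle driver API: `connect`, `work` and `ConnectionLost` are hypothetical placeholders for whatever your database library provides, and the retry count and delay are arbitrary. Because session state is lost on failover, `work` is assumed to be safe to run again from the start.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error standing in for the driver's 'session lost' exception."""


def run_with_reconnect(connect, work, retries=5, delay=1.0, sleep=time.sleep):
    """Run work(session), opening a fresh session again after each connection loss.

    connect() returns a new session; work(session) does the database work.
    Both are placeholders supplied by the caller.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            session = connect()    # new session, previous session state is gone
            return work(session)   # work must be safe to repeat from the start
        except ConnectionLost as exc:
            last_error = exc
            sleep(delay)           # give the surviving node time to take over
    raise last_error
```

The delay matters in practice: retrying immediately tends to hit the database while the package is still starting on the other node.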

BTW, a standby database can be opened periodically for reporting in Oracle 8i, so it is not a complete waste of resources.

A. Clay Stephenson
Acclaimed Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Daniel,

I suppose that the best answer I can offer is that many of us on HP platforms face exactly that issue and the majority choose MC/ServiceGuard. I find that people who are purely DBAs (or nearly so) tend to favor the Standby Database; those who are more of the SysAdmin ilk or those who are comfortable in both roles tend to favor MC/ServiceGuard.

Having said that, I should warn you that setting up a High Availability system involves much more than the standby database or MC/SG. You need to think of backup power (e.g. a generator to supplement your UPS); redundant network switches and routers; redundant HVAC systems; and any other single points of failure that you can think of. If you are already comfortable in Oracle, I suggest that you attend the HP class on MC/ServiceGuard. You will then be in a better position to make an informed choice; if you are not fairly fluent in LVM, it's probably a good idea to brush up on it before attending. The class starts with the overall analysis of failures, and in one sense learning ServiceGuard is almost a by-product. You will also learn the steps needed to connect SCSI devices to multiple hosts and about things like in-line terminators so that you can replace SCSI controllers without clobbering the bus.

I would say that the strongest argument in favor of MC/SG is that it can be used to protect so many applications beyond databases.

My 3 cents, Clay
If it ain't broke, I can fix that.
Sanjay_6
Honored Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Mike,

You can think about having MC/SG with OPS. Under OPS, Oracle runs on multiple nodes that share the same set of disks/disk pool. I think this provides less downtime than the usual MC/SG environment. I'm not using OPS myself, though we do have an installation of OPS at our site; personally I'm using MC/SG without OPS.

Hope this helps.

thanks
Sanjay_6
Honored Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Daniel,

Looked at your specs after I replied earlier. Our OPS installation is also on N-Class: 2 * N-Class servers with a shared HP E-Array. Don't know the detailed specs though.

Thanks
Daniel Tan_1
Occasional Visitor

Re: What are the methods available for High Availability for a Unix server?

One question: will outages be transparent when an outage occurs on the server? Will the back-up node kick in immediately, before the client experiences a response timeout due to the outage?
Sanjay_6
Honored Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Daniel,

The outages are not transparent. Users will have to log in again. External applications, if any, linked to the package need to be restarted. What happens is that when there is a failure, the package moves from the system where the failure occurs to the alternate node where it is configured to move. The package goes down on one node and gets started on the other node, so there is an outage of a few minutes at the most, depending on your configuration. This is in the normal MC/SG configuration.
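
The package move described above can be pictured with a small toy model. This is only a sketch of the idea (halt on the failed node, start on the first configured alternate node that is up); the dictionary layout, function names and node names are invented for illustration and say nothing about how MC/SG is actually implemented or configured.

```python
def pick_adoptive_node(node_order, up_nodes, failed_node):
    """Return the first configured node that is up and is not the failed node."""
    for node in node_order:
        if node != failed_node and node in up_nodes:
            return node
    return None  # no node available: the package stays down


def fail_over(package, up_nodes):
    """Move a package off its current node and return the steps that were taken."""
    failed = package["running_on"]
    target = pick_adoptive_node(package["node_order"], up_nodes, failed)
    steps = [f"halt {package['name']} on {failed}"]
    if target is None:
        steps.append("no adoptive node available")
        package["running_on"] = None
    else:
        steps.append(f"start {package['name']} on {target}")
        package["running_on"] = target
    return steps


# Hypothetical package and node names.
pkg = {"name": "oracle_pkg", "node_order": ["node_a", "node_b"], "running_on": "node_a"}
print(fail_over(pkg, up_nodes={"node_b"}))
```

The gap between the "halt" step and the end of the "start" step is the outage users see, which is why they have to log in again.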

In an OPS configuration I'm not exactly sure, but I believe there is very little outage as the database gets restarted on the other node.

Hope this helps.

thanks
Magdi KAMAL
Respected Contributor

Re: What are the methods available for High Availability for a Unix server?

Hi Daniel,

Sure, MC/SG is what you should look at.
Configuring packages (virtual nodes) which can switch to alternate nodes as soon as a major failure occurs is what can solve your problem.

MC/SG can also handle the standby LAN interface: automatic switching to the next available LAN as soon as the first one goes down.

MC/SG with external disks (if possible HP AutoRAID disks) could be a very good solution for a mission-critical application.

Your evaluation of AutoRAID disks should respect the threshold of 50% used to maintain the AutoRAID in RAID 0+1.
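
The 50% rule of thumb is easy to check with a little arithmetic. The sketch below is only an illustration: it takes the 88 GB application disk space quoted earlier in the thread as the used figure, while the array capacity is a hypothetical number, and whether the 50% should be measured against raw or usable capacity is an assumption you should confirm in the AutoRAID documentation.

```python
def stays_within_threshold(used_gb, capacity_gb, threshold=0.5):
    """Return True if the used fraction of the array is at or below the threshold."""
    return used_gb / capacity_gb <= threshold


def min_capacity_gb(used_gb, threshold=0.5):
    """Smallest array capacity that keeps used_gb at or below the threshold."""
    return used_gb / threshold


used = 88          # GB, application disk space from the server specs in this thread
capacity = 150     # GB, hypothetical array size for illustration

print(stays_within_threshold(used, capacity))
print(min_capacity_gb(used))
```

With 88 GB of data, a 150 GB array would be over the 50% mark; by this rule you would need at least 176 GB, plus room for growth.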

Magdi