trucluster failover

kirit_2
Frequent Advisor


Hi,

I need some concrete information about how TruCluster V5.1B failover works and how we can set it up.

Here is the scenario:

Systems: One DS20E and one DS10
Storage: MA8000
OS: Tru64 UNIX V5.1B
Cluster: TruCluster V5.1B
Database: Oracle 9i single instance server.
Front-end application: D2K (Developer 2000)


Questions:

1] If my primary server fails, will Oracle start up automatically on the second server? If yes, how can we achieve this?

2] Do my users (who use front-end applications like Developer 2000) need to reconnect once failover happens?

3] Do my telnet users need to reconnect when a failure happens?

5 REPLIES
Venkatesh BL
Honored Contributor

Re: trucluster failover

The manpage of 'caa' contains some useful info...
Paul Henderson_2
Frequent Advisor

Re: trucluster failover

re:
1) you have to create an Oracle 'resource' and then put it under control of Cluster Application Availability (CAA). There are example scripts for this in the distribution. See the caa(8) man page for more info.
2) you need to make D2K a 'dependent' resource to the Oracle resource, so that if Oracle fails, D2K will exit and be re-established when the Oracle resource is re-established.
3) the telnet users should be connecting to the cluster alias so no reconnect is necessary.
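
To make point 1 a little more concrete, here is a minimal sketch of the shape of a CAA action script. It relies on CAA calling the script with a single argument (start, stop or check) and treating an exit status of 0 as success. Real action scripts are normally shell scripts; Python is used here only to show the structure. `FakeOracle`, `make_handlers` and `run_action` are hypothetical names for illustration, and the handlers stand in for whatever commands you would really use to start, stop and probe the instance.

```python
from typing import Callable, Dict, List

Handler = Callable[[], bool]


def run_action(args: List[str], handlers: Dict[str, Handler]) -> int:
    """Dispatch one entry point and return an exit status (0 = success)."""
    if len(args) != 1 or args[0] not in handlers:
        return 2  # usage error: missing or unknown entry point
    try:
        ok = handlers[args[0]]()
    except Exception:
        return 1  # any failure in the handler is reported as non-zero
    return 0 if ok else 1


class FakeOracle:
    """Stand-in for the real instance so the sketch runs anywhere."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> bool:
        self.running = True
        return True

    def stop(self) -> bool:
        self.running = False
        return True

    def check(self) -> bool:
        return self.running


def make_handlers(db: FakeOracle) -> Dict[str, Handler]:
    # Map the three entry points onto the (hypothetical) handlers.
    return {"start": db.start, "stop": db.stop, "check": db.check}


db = FakeOracle()
handlers = make_handlers(db)
print(run_action(["check"], handlers))  # non-zero: instance not started yet
print(run_action(["start"], handlers))
print(run_action(["check"], handlers))  # zero once the instance is "up"
```

The check entry point is what lets the cluster notice that the application has died even though the node is still up, so it should test something meaningful rather than just return success.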
kirit_2
Frequent Advisor

Re: trucluster failover

Dear Paul

Point no 1 is clear.

Point no 2 - how can I make D2K a dependent resource of Oracle? What about activity committed by the user?

What exactly might a user need to do when the Oracle instance moves to the second server?

Can I have a cookbook for configuring this?
Gary Hansford
Frequent Advisor

Re: trucluster failover

Kirit,

From my experience with Ingres, here is my two penn'orth:

1] Using CAA you can script a set of commands to stop, start and monitor a program suite [in effect a bit like an NT service - but better ;-)]. You can also create dependencies on the service, so for instance it requires being able to see the network. You can then say the service can run on one of 'n' nodes in the cluster (which takes the dependencies into account). You also create a cluster alias IP address for your "Oracle Service" that your users connect to; that way they end up talking to the box running Oracle. CAA will monitor the service and, on loss of a node or dependency, move it using the commands you script. There should be examples available for CAA/Oracle (even Oracle themselves must surely provide a generic CAA script?).

2] My guess (from knowing Ingres) is YES, you will have to reconnect. Imagine you only have one server and you switch it off and back on again - what would you have to do with Oracle if this outage takes a minute (which is what the scripts may take to recognize the server is gone and stop/restart on another server)?

I don't know whether Oracle Parallel Server handles this through stateless connections and memory sharing between the boxes. This would be a cost/benefit/hype exercise you would have to undertake.

3] Again I would say "yes". The box they were connected to was dropped / crashed. Anything they were running went the way of the dodo on that machine...
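
The "one of 'n' nodes, taking dependencies into account" idea from point 1] can be sketched as a toy model. This is only an illustration of the concept, not CAA's actual placement logic; `Member`, `Resource` and `pick_member` are made-up names, and the member names simply mirror the two boxes in the question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Member:
    name: str
    up: bool = True
    # What this member can currently offer, e.g. a working network.
    provides: Set[str] = field(default_factory=set)


@dataclass
class Resource:
    name: str
    hosts: List[str]  # candidate members, in order of preference
    requires: Set[str] = field(default_factory=set)


def pick_member(res: Resource, members: Dict[str, Member]) -> Optional[str]:
    """Return the first preferred member that is up and meets all dependencies."""
    for host in res.hosts:
        m = members.get(host)
        if m is not None and m.up and res.requires <= m.provides:
            return host
    return None  # nowhere to run: the service stays offline


members = {
    "ds20e": Member("ds20e", provides={"network"}),
    "ds10": Member("ds10", provides={"network"}),
}
oracle = Resource("oracle", hosts=["ds20e", "ds10"], requires={"network"})

print(pick_member(oracle, members))  # ds20e
members["ds20e"].up = False          # primary server fails
print(pick_member(oracle, members))  # ds10
```

Moving the service in this model is just "stop where it was (if possible), start on the newly picked member", which is exactly where the scripted stop/start commands come in, and why sessions on the failed box do not survive.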

------------------------------------------

So of course you will now be questioning - where's the benefit of a cluster ?? !!

Shared storage/printing, administration and failover handling...

If you have stateless applications (e.g. HTTP servers such as Apache), you can run the app on all boxes in your cluster. The user connects to an alias (which you register to all boxes in the cluster). For these, failover will be invisible to users - very nice :-)

Hope this gives you the answers you need.

Gary
kirit_2
Frequent Advisor

Re: trucluster failover

Hi Gary

It's a great answer - let me put it this way to check that I understood it clearly.

1] CAA only takes care of automatic failover; there is no guarantee of user connections.

2] D2K users must reconnect to the cluster alias IP address.

3] If the primary server fails, Oracle will always start in recovery mode, and in that case the time taken will be in minutes, so users' connections will drop.
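
If the summary above is right, the client side boils down to "keep retrying the cluster alias until the database is back". A minimal sketch of that loop follows, assuming a client whose connect step you control (which may not be the case for D2K itself). `connect_with_retry` and `make_flaky_connect` are hypothetical names, and the flaky connect function only simulates a database that is still recovering.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts run out."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)  # wait before trying the alias again
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error


def make_flaky_connect(failures: int) -> Callable[[], str]:
    """Simulated connect: fails `failures` times, then succeeds."""
    state = {"left": failures}

    def connect() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("database not up yet")
        return "session on cluster alias"

    return connect


# Simulate a database that needs two retries before it accepts connections.
session = connect_with_retry(make_flaky_connect(2), attempts=5, delay=0.0)
print(session)
```

Note that this only gets the user a new session; whatever they had not committed before the failure is gone and has to be re-entered.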

- I would like to hear from some Tru64 UNIX and Oracle guys about the exact scenario.

regards

KH