Operating System - HP-UX

anthony natale
Advisor

MC/SG and Oracle Hot Backup Problem (Urgent)

I am having a serious problem with running hot backups of Oracle under MC/SG. Hopefully someone can help. First some system info:
- MC/SG version A11.12
- ORACLE Version 7.3.4.5.0
- 2 node Cluster.
- Omniback 3.5
- OS HPUX 11.0

So basically what happens is that every time we try to run a hot backup and our backup script tries to put the tablespaces in backup mode, they get locked up and the script times out. Oracle is suggesting it's the su to oracle command in the script, but we ran all the commands from the script successfully outside of ServiceGuard.

The DBAs tested everything in the script, and the only time we have problems is when ServiceGuard is running. If we just mount the filesystems and start the Oracle applications manually, there are no problems with hot backups, but this is not a workaround. Perhaps there's some patch out there, or maybe we're not seeing something that MC/SG is doing.

Please advise, thanks ahead of time for any suggestions or fixes to this frustrating problem.

anthony.
15 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

I assume that the su commands are within OmniBack pre/post exec scripts. The use of su within pre/post execs is not supported, although it will often work. I've sometimes had to create setuid C programs to do what you are doing, along with very careful control of stdin, stdout, and stderr in the Oracle scripts.
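As a rough illustration of what "careful control of stdin, stdout, and stderr" can mean for a script launched by a backup tool, here is a minimal Python sketch. It is not Clay's setuid C program and not anything OmniBack provides; the function name `run_detached_stdio` and the log path are hypothetical. The idea is only that the child never inherits the caller's terminal or pipes: stdin comes from /dev/null and both output streams go to a log file.

```python
import subprocess
import sys


def run_detached_stdio(cmd, log_path, timeout=60):
    """Run cmd with stdin from /dev/null and stdout+stderr appended to log_path.

    Hypothetical helper for illustration only; returns the child's exit code.
    """
    with open(log_path, "ab") as log:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # child cannot block waiting on our stdin
            stdout=log,                 # output goes to the log, not a pipe
            stderr=subprocess.STDOUT,   # stderr is merged into the same log
            timeout=timeout,            # give up instead of hanging forever
        )
    return result.returncode
```

A pre-exec script could call something like `run_detached_stdio(["/path/to/begin_backup.sh"], "/tmp/begin_backup.log")`, where both paths are placeholders; the timeout turns a hang into an error that can be logged.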
If it ain't broke, I can fix that.
melvyn burnard
Honored Contributor
Solution

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

It is vital you have the latest patch for 11.12 of ServiceGuard as there is a fix for a known issue which seems to match your problem.
Download it from this web site and install on both nodes as per special instructions.

PHSS_22541

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
anthony natale
Advisor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

I wanted to thank you for the responses so far. I also checked into the patch you were talking about, Melvyn, but I don't see where my problem is referenced. Could you tell me a little bit more about what you think may be going on?
Thanks again.

anthony.
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Anthony, when you say you ran everything from the script outside of MC/SG, does that mean you did run it from OB2? Or does it mean that you ran the ORACLE script from the shell? This is an important distinction. Also did you run the script on the same node that the package is running on? I'm just looking for a few more clues.

... Clay
If it ain't broke, I can fix that.
anthony natale
Advisor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Hi Clay,

OK, under normal conditions (i.e. MC/SG running and all packages up), OB2 runs the script (pre-exec), and it then times out because the script gets hung on the redo log tables in Oracle. If we then halt the package and run the same backup tablespace commands on Oracle, nothing hangs. We have only run command-line tests to see if the tables hang in Oracle. The problem is consistent, though: hangs only occur while the package is up under MC/SG, and the same commands work fine if Oracle and our application are started manually.
All of this is being run on the same node.

Thanks again for asking.
anthony.
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Anthony, I too did not find a clear reference to your problem in Melvyn's patch text. It is still not clear to me that this is an MC/SG problem. I run Oracle backup scripts under MC/SG packages daily w/o problems. As bad as this is, I think you are going to have to know for sure whether this is an SG problem or a su pre/post exec OB2 problem. I assume this is a production box, which makes the problem more pressing. My best cut at this is to move all the packages except the one in question to alternate node(s) to keep them running. Then halt the package in question and then halt the node. I would then manually activate the volume groups (probably requiring a vgchange -c n first), mount the filesystems, and do all the steps that the package script normally does. Invoke the backup and then see if all is well. If so, then you have an MC/SG problem; otherwise it's OB2.

I don't suppose we are dealing with something as trivial as needing to increase the 'ScriptOutputTimeout' setting in the options/global file; that would be too easy.

I am still leaning towards the su in an OB2 exec script problem. I have coded a C program (more or less sudo) which I use in my OB2 exec scripts and it never fails under MC/SG; if you determine that the problem is OB2, I can send you the source.

... Later, Clay
If it ain't broke, I can fix that.
anthony natale
Advisor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Thanks for the quick response Clay,

When the script times out, it does so as a result of getting hung on a couple of tablespaces in the database, namely "rollbackone" and "rollbacktwo". So it looks like the script is able to run, but it only locks up when ServiceGuard is running the package. If I manually mount the package filesystems and start its services, the same backup tablespace commands will work without hesitation and not hang. What do you think? Thanks again,

ant.
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

OK Anthony, just so we are both on the same page: when you say that you manually execute the backup scripts, does that mean from within OB2? I think that you need to do an OB2 Preview (making sure that the options/global value 'ExecScriptsOnPreview' is set to 1; you will need to bounce OmniBack if you have to set it). If it hangs now, SG has nothing to do with the problem.
If it ain't broke, I can fix that.
anthony natale
Advisor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Thanks again Clay,

Our DBAs are now using manual commands within Oracle to put the tablespaces in backup mode (basically the same way our script does it). We are not actually running our script anymore, since the same commands won't even work at the command line (at the Server Manager prompt within Oracle). The Server Manager commands that put the tablespaces in backup mode work if ServiceGuard is not running. The same backup command, when ServiceGuard is up, gets hung when it gets to the "rollback" tablespaces. Hopefully this info helps you. So I am basically taking OB2 out of the equation; we are now dealing with ServiceGuard and Oracle's ability to put tablespaces in backup mode while under MC/SG control. Thanks again,

ant.
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Anthony, this is really strange. I ran a script under Oracle 7.3.4 (although on 10.20) under SG which made the identical calls on rollback segments and never saw this behavior. I used the equivalent calls for over a year before switching to snapshot backups.
Have you looked at the mount options for each of these filesystems? Are they in any way different? I'm grasping at this point.

..Good luck; you're going to need it. I'm surprised that ORACLE didn't give you the standard 7.3.X is no longer supported line.
If it ain't broke, I can fix that.
A. Clay Stephenson
Acclaimed Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Anthony, I've had one more screwball idea. Is it possible that the mountpoint directory for your redo logs has restricted permissions like 700 and is owned by root, and that Oracle is having difficulty even though the mounted directory permissions look fine?
-- Just a thought
If it ain't broke, I can fix that.
melvyn burnard
Honored Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Please read symptom number 11 in the patch text file.
We have seen Sybase, SAP, NFS, and a host of other products have seemingly unrelated problems caused by the fact that a SIGALRM block set by SG gets inherited by child processes and never cleared, so they never respond to SIGALRM.
There are many other crucial fixes in the patch as well.
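The inheritance mechanism described above can be seen with a small Python sketch. This is not ServiceGuard code and does not reproduce the patched bug; it only shows the general POSIX behavior that a blocked-signal mask survives fork and exec. The function name `child_sees_sigalrm_blocked` is hypothetical, and the sketch assumes a Unix-like system with a Python 3 interpreter.

```python
import signal
import subprocess
import sys

# Child program: report whether SIGALRM is blocked in the mask it inherited.
CHILD_CODE = (
    "import signal\n"
    "mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])\n"  # empty set: just read the mask
    "print('blocked' if signal.SIGALRM in mask else 'clear')\n"
)


def child_sees_sigalrm_blocked(block_in_parent):
    """Spawn a child process and return True if it starts with SIGALRM blocked."""
    # Remember the caller's mask so it can be restored afterwards.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
    try:
        how = signal.SIG_BLOCK if block_in_parent else signal.SIG_UNBLOCK
        signal.pthread_sigmask(how, [signal.SIGALRM])
        # The signal mask is carried across fork and exec into the child.
        proc = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        return proc.stdout.strip() == "blocked"
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

A child that starts this way and relies on alarm-based timeouts will never see them fire unless it unblocks SIGALRM itself, which is consistent with processes that hang only when started from under the cluster software.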
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
anthony natale
Advisor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

Looks like the problem has been resolved.

I want to formally thank both Melvyn Burnard and A. Clay Stephenson for your efforts and rapid response to my problem.

Melvyn, the patch worked and everything seems to be running as expected. Thanks again.

anthony.
Eric Leonard_1
New Member

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

OK, look. I'm not too sure how you're kicking off the pre/post execs. I personally don't put the commands in the pre/post lines; rather, I build shell scripts and simply set them off with the pre and post. So as far as the su issue goes, if it's in the line, run a script and open a fresh shell.

Now, the Oracle issue can get much nastier than the Unix/OB2 side. I would hope that your DBA is on the horn with Oracle all day asking about archive mode, RMAN, or anything else that will cause rollbacks (the undo logs and archive junk) to hang, probably due to writes. Watch a PerfView console for a while and check your usage, traffic, and possible contention. I suggest getting more info from the Oracle side. What version, 8 or 8i? Are your rollbacks on separate disks? This could cause issues under load balancing, due to mounting lag and disk traffic from data_ts and such. Separate spindles (disks) for each of your tablespaces, as well as for logging, will dramatically improve performance and help with process hangs.

Oh, you could also try to make that process LESS NICE by modifying its nice value. Time slice! This could hurt! Look at your time slice thresholds; if the system can't think about your desire long enough, it may never get started on it. ;) Well, I wish you luck and advise contacting Oracle support!
Eric
Kurt Beyers.
Honored Contributor

Re: MC/SG and Oracle Hot Backup Problem (Urgent)

I've had the same problem in the past. The hot backup was fine when Oracle was running locally on the node (cluster down), but didn't work when Oracle was running in the cluster. It was solved at the time by installing patch PHSS_23373 (which is now superseded by PHSS_26270).

Kurt