Operating System - HP-UX

lastgreatone
Regular Advisor

db connectivity lost after backup

The Oracle DB is on HP-UX 11, across the firewall.
Oracle 9iAS is on Linux in the DMZ.
OB backups start on both systems at 5:00 a.m. and shut down the Oracle instances for approx. 35 minutes.
The Oracle instances start up again, but for some reason a specific Oracle process fails to resume after the backups, while all other HTTP requests resume without problem.
In fact, running the process interactively on Linux returns data.
The SGA has been tweaked on both servers.
Memory on both servers appears adequate.
Shutting Oracle down with the start_stop script on Linux does not clear up the problem.
But once I reboot the Linux server, the process runs OK again.
It almost looks like a caching issue, either at the firewall, on HP-UX, or on Linux.
Any clues would be appreciated.
19 REPLIES
Eric Antunes
Honored Contributor

Re: db connectivity lost after backup

I think you are talking about a DB link? If that is the case, verify that the DB where the DB link is created starts after the other one.

If the problem persists, it may be related to the firewall...

Regards,

Eric Antunes
Each and every day is a good day to learn.
Rick Garland
Honored Contributor

Re: db connectivity lost after backup

Is the listener running, or is it shut down as part of the backup process? Maybe it is not being started up after the backup is completed.

Check the output of `lsnrctl stat` after a backup has completed, when you have trouble getting to the DB.

lastgreatone
Regular Advisor

Re: db connectivity lost after backup

No db link issue.
No listener issue.
I wonder if it's ndd or ICMP related on the HP-UX Oracle server. Or, as you suggested, possibly the firewall is configured to time out any persistent connection that is not being acknowledged on the inside by the HP-UX Oracle DB server (because the instances are down during the backup), and a clean reboot of the Linux server in the DMZ stops and restarts the Ethernet connection, thereby clearing the port.
I'm probably way out in left field on this but I'm running out of ideas.
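
One way to probe the idle-timeout theory from the client side is to keep the TCP connection "warm" with keepalive probes, so the firewall never sees it as idle. The following is only a sketch under assumptions: the function name and the default timings are made up for illustration, and nothing in this thread confirms that the firewall actually drops idle sessions.

```python
import socket


def make_keepalive_socket(idle_s=300, interval_s=60, probes=5):
    """Create a TCP socket with keepalive probes enabled.

    idle_s, interval_s and probes are illustrative defaults, not values
    taken from this thread.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to send keepalive probes on an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, so set only
    # the ones this Python build exposes.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

For Oracle connections specifically, the comparable knob is probably SQLNET.EXPIRE_TIME in the server's sqlnet.ora, which makes the server probe idle clients periodically; whether it helps here would need testing.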
Rick Garland
Honored Contributor

Re: db connectivity lost after backup

Could there be some Linux iptables rule that gets turned on after the backup completes?

Run `iptables -nvL` to see what is listed.
lastgreatone
Regular Advisor

Re: db connectivity lost after backup

iptables is disabled on Linux.
lsnrctl stat does not give me any relevant info; I believe that's because the HP-UX Oracle database server is part of a SAN configuration.
Fred Ruffet
Honored Contributor

Re: db connectivity lost after backup

What does netstat say on both sides? Are there still connections in ESTABLISHED or CLOSE_WAIT state after the backup?
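
To compare both sides quickly, the output of `netstat -an` can be summarised per TCP state for the listener port. This is a rough sketch: the function name, the default port 1521, and the sample lines are assumptions for illustration, not output captured from the servers in this thread.

```python
from collections import Counter


def count_states(netstat_output, port=1521):
    """Count TCP states for connections using the given port on either end."""
    counts = Counter()
    # Linux prints host:port, HP-UX prints host.port, so accept both.
    suffixes = (f":{port}", f".{port}")
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].lower().startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffixes) or remote.endswith(suffixes):
            counts[state] += 1
    return counts


# Made-up sample in Linux `netstat -an` layout.
SAMPLE = """\
tcp        0      0 10.0.0.5:41234     10.1.1.9:1521      ESTABLISHED
tcp        0      0 10.0.0.5:41250     10.1.1.9:1521      CLOSE_WAIT
tcp        0      0 10.0.0.5:22        10.0.0.7:50514     ESTABLISHED
"""

if __name__ == "__main__":
    print(dict(count_states(SAMPLE)))
```

A connection that shows ESTABLISHED on one host but is missing on the other would point at something in between (such as the firewall) having dropped the session silently.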

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Eric Antunes
Honored Contributor

Re: db connectivity lost after backup

Hi,

Can't you disable the firewall for one night and see what happens...? :-)

Regards,

Antunes
Each and every day is a good day to learn.
lastgreatone
Regular Advisor

Re: db connectivity lost after backup

1-netstat shows connections established from both sides
2-no I cannot disable firewall, out of my control and besides I'd disable the whole organization....bad idea :(
3-what's interesting, I think, is that from the HP-UX Oracle database server I can ping other servers in the DMZ but not the Linux server. Could that be a clue? Yet they appear to be using the same route. Any advice on this?
Fred Ruffet
Honored Contributor

Re: db connectivity lost after backup

When you say you can't ping, is that before the backups or after? Can you telnet or ssh to it? What I want to know is: is it only the FW blocking ICMP, or is the server really unreachable?

I asked for the netstat status on both servers. You say it shows established. Even after the backup? (when unable to connect)
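
The ICMP-versus-unreachable question can be separated with a plain TCP connect test, which does not depend on ping being allowed through the firewall. A minimal sketch follows; the function name is made up, and the host and port to test are whatever applies on your side.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If this returns True for the ssh or listener port while ping still fails, the server is reachable and only ICMP is being filtered.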

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Rita C Workman
Honored Contributor

Re: db connectivity lost after backup

We have always found it best to kill ALL Oracle processes left hanging around as the last part of our shutdown scripts.
If everything is properly shut down, including your listener, then it should start up again.

But leave anything....and you will see things such as this.

Just my 2cents,
Rita
lastgreatone
Regular Advisor

Re: db connectivity lost after backup

1-I cannot ping the Linux server either before or after the backups, but I can ssh from HP-UX to Linux both before and after the backups.
But if the FW is blocking ICMP, that means it's blocking it for the Linux server only. Why? I can ping an NT and an HP-UX server in the DMZ from the same HP-UX Oracle database server.

2-netstat shows connections established even after the backups

3-It happened again this a.m.; here's a c&p of the error (php script):

07-09-2004 07:47:38-0400


ORA-04031: unable to allocate %s bytes of shared memory ("%s","%s","%s","%s")



4-I know it seems to be an Oracle problem, but like I said before, the SGA has been modified on both the Oracle iAS and DB servers.

lastgreatone
Regular Advisor

Re: db connectivity lost after backup

Thanks Rita for your 2cents, that's my belief as well. But unfortunately the HP-UX Oracle DB server is on a SAN and managed by another group in IST. And it's a first for the organization. Interacting with these people proves to be difficult, and without knowing exactly what to test, I cannot provide them with any guidance. Hence my contacting the HP gurus (you and your colleagues).

But in case you're wondering, I do a clean shutdown of all Oracle processes, including the listener, on the Linux server in the DMZ, and yet I still have to reboot the server to get that specific process to connect again. Weird.

Whatever leads I can get from all of you is great.
TIA.
Eric Antunes
Honored Contributor

Re: db connectivity lost after backup

France,

Is your java_pool_size greater than 8 MB?
Each and every day is a good day to learn.
lastgreatone
Regular Advisor

Re: db connectivity lost after backup

Hi Antunes,

Yes, the Java pool on the HP-UX Oracle DB server is 117 MB, and on the Linux iAS it is 60 MB.
Fred Ruffet
Honored Contributor

Re: db connectivity lost after backup

ORA-04031 may have multiple causes:
. SGA too small
. Java pool too small
. shmmax too small

Since your query returns results over a standard connection, it may be the Java pool that is too small.

Have a look at the shmmax kernel parameter, and compare it to the result of an ipcs -mb.
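
The shmmax comparison can be done by pulling the segment sizes out of the `ipcs -mb` listing. This is only a sketch: it assumes the HP-UX layout in which shared memory rows start with `m` and the size (SEGSZ) is the last column, the function names are made up, and the sample listing is invented, not taken from the server in this thread.

```python
def segment_sizes(ipcs_output):
    """Return the SEGSZ value (bytes) of every shared memory row."""
    sizes = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Assumed layout: T ID KEY MODE OWNER GROUP SEGSZ, with T == "m".
        if len(fields) >= 7 and fields[0] == "m" and fields[-1].isdigit():
            sizes.append(int(fields[-1]))
    return sizes


def needs_multiple_segments(sga_bytes, shmmax_bytes):
    """True when the SGA cannot fit in one segment of at most shmmax bytes."""
    return sga_bytes > shmmax_bytes


# Invented sample listing for illustration only.
SAMPLE_IPCS = """\
IPC status from /dev/kmem as of Fri Jul  9 08:00:00 2004
T         ID     KEY        MODE        OWNER     GROUP      SEGSZ
Shared Memory:
m          0 0x411c0209 --rw-rw-rw-      root      root        348
m       1026 0x5e1003c9 --rw-r-----    oracle       dba  536870912
m       1027 0x00000000 --rw-r-----    oracle       dba  268435456
"""

if __name__ == "__main__":
    sizes = segment_sizes(SAMPLE_IPCS)
    print("largest segment:", max(sizes), "total:", sum(sizes))
```

If the oracle-owned segments add up to the SGA size but the largest one sits exactly at shmmax, the SGA was probably split across segments because shmmax is smaller than the SGA.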

Another point is that the 9.2.0.5 patch corrects many ORA-04031 issues. Maybe this patch would help.

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Eric Antunes
Honored Contributor

Re: db connectivity lost after backup

"Note that if using MTS and you have not configured the Large Pool, session memory is allocated from the Shared Pool, thus reducing the amount of available memory."

Are you using MTS?
Each and every day is a good day to learn.
Eric Antunes
Honored Contributor

Re: db connectivity lost after backup

Another question:

Please give more info about the DB concerned: RDBMS version, OS version, etc.

Finally, is this a physical standby configuration (on Linux)?

Regards,

Antunes
Each and every day is a good day to learn.
Volker Borowski
Honored Contributor

Re: db connectivity lost after backup

Hmmm,
sounds strange,....

I'd check whether the same problem occurs
if you do it cleanly without Omniback:
- Shut down application
- shut down database
- start up database
- start up application
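
The four steps above can be scripted so that they always run in the same order and stop at the first failure. This is a sketch only: the function name is made up, and the commands are harmless placeholders to be replaced with the real stop/start commands for the application and database.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (label, argv) pairs in order; stop at the first failing step.

    Returns the labels of the steps that completed successfully.
    """
    done = []
    for label, argv in steps:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"{label} failed (exit {result.returncode}): "
                  f"{result.stderr.strip()}")
            break
        done.append(label)
    return done


# Placeholder command that does nothing and exits 0; substitute the
# real commands (hypothetical here) for each step.
NOOP = [sys.executable, "-c", "pass"]

STEPS = [
    ("shut down application", NOOP),
    ("shut down database", NOOP),
    ("start up database", NOOP),
    ("start up application", NOOP),
]

if __name__ == "__main__":
    print(run_steps(STEPS))
```

Running exactly the same sequence by hand and from the backup script makes it easier to tell whether the backup tool or the restart itself triggers the problem.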

Since this works after a reboot, I'd try to figure out whether Omniback or the script used by Omniback causes the problem.
The Omniback script could do a "su - user", on which you will find many opinions in the Omniback forum. I have used this many times successfully for ease of use, but have also had difficulties with it (2 times out of ~50).
Maybe a manual start uses a different shell than the OB script, which may result in a different initialisation of the environment.

And if all else fails, drop the offline backup in favor of an online backup and stop worrying about the DB shutdown.

Let's hear how this one turns out....
Volker
lastgreatone
Regular Advisor

Re: db connectivity lost after backup

Fred, you are correct about ORA-04031 having multiple causes; I found a website dedicated to it, http://ora-04031.ora-code.com/.

Eric, you have a possible lead with the MTS and large pool setting, which is also mentioned in the discussion threads.

And Volker, I did try a process of elimination by shutting down OB and replicating the problem step by step.

Bottom line, this is a complex one and hopefully from the web site above I'll be able to come up with a solution soon.

Thanks all for your help.