Operating System - HP-UX

Calling all Oracle and NFS experts

 
Greg Stark_1
Frequent Advisor

Calling all Oracle and NFS experts

We have a number of HP-UX 11.00 servers with Oracle databases mounted via NFS to NetApp F810 filers.

We mount them with the following options:
hard,intr,vers=3,proto=udp
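For context, the corresponding /etc/fstab entries look roughly like this (the filer name and paths below are illustrative, not our real ones):

# /etc/fstab entry for one of the database volumes
filer01:/vol/oradata  /oracle/oradata  nfs  hard,intr,vers=3,proto=udp  0  0

# or mounted by hand:
# mount -F nfs -o hard,intr,vers=3,proto=udp filer01:/vol/oradata /oracle/oradata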

Since these apps are in a 24/7/365 manufacturing environment, scheduling downtime can be a challenge, to say the least. Occasionally we need to upgrade the proprietary OS of the filer, which requires a reboot and about 5 minutes of filer downtime. We recently had a filer crash and reboot, and as far as we can tell, Oracle/NFS handled the 5-10 minute outage without any corruption or alerts.

Now to my questions:
1. How does Oracle/NFS handle transactions and queries when it cannot communicate with the database files?

2. Should it be our policy that all Oracle apps be shut down if there is going to be any type of planned outage to the filer?

3. If a filer unexpectedly crashes and reboots, what steps should be taken to ensure data integrity?

Thanks again,
Greg
7 REPLIES
Michael Steele_2
Honored Contributor

Re: Calling all Oracle and NFS experts

I believe this is a stale NFS mount question and the resolution requires a client reboot.

This is certainly an NFS timeout question, which can be measured with:

# nfsstat -rc


If the timeout count is close to the retransmission count, then increase the timeout.

Refer to /etc/rc.config.d/nfsconf

The timeout should be greater than the total end-to-end recovery time, that is, the time to run fsck, mount file systems, and export file systems on the server. (With journalled file systems, this should be between one and two minutes.) Setting the timeout to a value greater than the recovery time allows clients to reconnect to the file system after it returns to the cluster on the new node.
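For example, something along these lines (the timeo/retrans values are only illustrative, not recommendations):

# nfsstat -rc
(compare the "retrans" and "timeout" counters on the client)

# mount -F nfs -o hard,intr,vers=3,proto=udp,timeo=600,retrans=5 filer01:/vol/oradata /oracle/oradata
(timeo is in tenths of a second; the filer name, path, and values above are examples only)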

Support Fatherhood - Stop Family Law
Steven E. Protter
Exalted Contributor
Solution

Re: Calling all Oracle and NFS experts

Commentary: Running Oracle on an NFS mount significantly inhibits performance. You would be much better off connecting the database directly to a disk array over fibre channel.

Answers:
1) For a limited period, Oracle will cache transactions in memory until it reaches certain limits; then it will crash. You may get data corruption. If it tries to cut an archive log while the mount is unavailable, it will crash. You should shut down the database before bringing down the NFS mount.
2) Yes.
3) Oracle has built-in functionality to help you with this. If you are generating archive logs, then even in a full-corruption situation you should be able to restore the last hot backup and roll forward to nearly the last transaction.
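As a rough sketch of that roll-forward, assuming the datafiles from the last hot backup have already been restored and all archived redo logs are available (the exact commands depend on your backup method and Oracle release):

# sqlplus /nolog
SQL> connect / as sysdba
SQL> startup mount
SQL> recover database using backup controlfile until cancel;
(apply the archived logs it prompts for, then CANCEL)
SQL> alter database open resetlogs;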

If you want real data integrity, you should move the database to a reliable fiber connected san/disk array.

At the very least, the database should be running against local disk with RAID 1 (MirrorDisk/UX) mirroring. That would help ensure data integrity and prevent crashes.
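For example, assuming MirrorDisk/UX is installed, a logical volume can be mirrored onto a second disk roughly like this (the volume group, logical volume, and disk names are placeholders):

# lvextend -m 1 /dev/vg01/lvoracle /dev/dsk/c2t1d0
# lvdisplay -v /dev/vg01/lvoracle
(the second command just verifies that the mirror copy is present and synced)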

In both of the above setups, we have run tests in which the system was rebooted suddenly (using the reboot command) while transactions were posting.

The databases recovered automatically in all tests and came online only a few minutes later than they would after a more controlled restart.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Jeff Schussele
Honored Contributor

Re: Calling all Oracle and NFS experts

Hi Greg,

I just wanted to let you know that you're a much braver soul than I. You do know that NFS stands for Not Friggin Stable, don't you?

If I were you, I'd at least require that the NFS traffic be contained to its own private network - Gig E if possible - if you just *have* to use it.

Good Luck &
Best Regards,
Jeff

Oh man, and to put real-time Oracle transactions across it. Man, that's real high-wire, trapeze-type stuff... makes me shudder...
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Graham Cameron_1
Honored Contributor

Re: Calling all Oracle and NFS experts

Just to reiterate.
You are brave/crazy to run Oracle database files on NFS mounts.
I have only ever done this once, temporarily, and in an absolute emergency.
And in a 24x7 manufacturing environment, you take your exposure to another level.
At the very least, to have any chance of recovery after server failure, you must ensure that your redo logs are on locally connected disk.
At least one control file as well.
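If it helps, where they currently live can be checked from SQL*Plus with the standard dictionary views:

SQL> select member from v$logfile;
SQL> select name from v$controlfile;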
-- Graham
Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.

Re: Calling all Oracle and NFS experts

Greg,

While I share some of the concerns of the other guys here about this, I would suggest that you get advice directly from NetApp.

I've been told by NetApp myself that this config works and is supported - so they should have some expertise in this. You no doubt have a support contract with them - make use of it and give them a grilling.

HTH

Duncan

I am an HPE Employee
Alzhy
Honored Contributor

Re: Calling all Oracle and NFS experts

It will always be a good idea to shut down an Oracle instance if the underlying storage (be it DAS, SAN, or NFS) will "disappear" for a while. Hmm, I never thought people were actually running Oracle RDBMSs off NFS filers -- a configuration supported by Oracle, BTW. I architected such a scheme once, but it got thrown out in favour of a SAN. Back then, I would have implemented a "cluster" of NetApp filers connected to the servers via an elaborate switched Gigabit infrastructure. Some of the larger servers would have had more than one Gigabit drop, either trunked as a "superpipe" or with an NFS mount on each Gigabit link (for failover).
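For a planned filer outage, a minimal sketch of that shutdown and restart (assuming an Oracle release where "/ as sysdba" works, and an example mount point):

Before the filer goes down:
# sqlplus -s "/ as sysdba"
SQL> shutdown immediate
# umount /oracle/oradata

Once the filer is back:
# mount /oracle/oradata
# sqlplus -s "/ as sysdba"
SQL> startup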
Boy you're lucky your DB instances suffered no corruption at all during your Filer downtime when Oracle was up...
Hakuna Matata.