
Joe Short
Super Advisor

NFS problem

I am building a 2 node cluster on HP-UX 11i, using MC/SG v11.15. I have a file system that is mounted on the primary node and NFS mounted on the alternate node. However, when the cluster fails over, it hangs: the alternate node cannot unmount the NFS file system. Is there a fix?
Sunil Sharma_1
Honored Contributor

Re: NFS problem

Hi

You may require a product called the MC/ServiceGuard NFS Toolkit (B5140BA).


sunil

*** Dream as if you'll live forever. Live as if you'll die today ***
Joe Short
Super Advisor

Re: NFS problem

Thanks, that's what I was afraid of, but before I go down that road I have at least one more trick up my sleeve. Besides, I don't want to ask my client to pony up any more money.
Pete Randall
Outstanding Contributor

Re: NFS problem

Joe,

Speaking as a non-SG NFS user, I can definitively say that unmounting an NFS-mounted file system after the exporting host disappears is all but impossible - at least I've never managed it, and I've never heard of anyone else who has either. If this NFS kit addresses this situation, it sounds like the only choice other than eliminating the dependence on NFS entirely.


Pete
Jeroen Peereboom
Honored Contributor

Re: NFS problem

L.S.

I have seen such a set-up: a 2-node MC/SG cluster with package A serving an NFS file system to package B.

In the control scripts, if A is stopped, B is stopped first, which I found really annoying. Some tricks were also done when starting the packages; I forget the details, but I think forcing a package down (killing it) was involved in starting it.

If you bring up package A on the other node, the NFS server is available again to the client (which is connecting to the NFS server via the package IP address).

My feelings on this:
- I didn't completely trust it.
- Later I heard of the MC/SG NFS toolkit.

JP.
Joe Short
Super Advisor

Re: NFS problem

Yeah, in the past, on older versions, I have been able to work around this issue using fuser, but it no longer works on NFS-mounted file systems. Very annoying, and difficult when working with a client that has a strict budget. What I am going to try is to reverse the situation: the alternate node will mount the file system, and the primary will NFS mount it. Since this is a very simple cluster - 2 nodes, 1 package - I will remove the stale mount situation during failover.
What I do find curious is that this was working before I added monitors and applications to the package. The only other change was to install a patch, PHSS_30087. I wonder if the patch may have changed something.
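For reference, the fuser-based cleanup Joe describes would look roughly like this on HP-UX (the mount point /stage is illustrative; as he notes, it no longer helps once an NFS mount has gone stale):

  fuser -cu /stage    # list the processes (and their owners) holding the mount point open
  fuser -ck /stage    # send SIGKILL to those processes
  umount /stage       # the unmount should now succeed for a local file system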
Pete Randall
Outstanding Contributor

Re: NFS problem

Joe,

The monitors and/or applications are probably holding the NFS mount point open, which is why you get into trouble with it.


Pete
Joe Short
Super Advisor

Re: NFS problem

Pete, could you give me a little more on that? I'm not fully sure I get your point.
Pete Randall
Outstanding Contributor

Re: NFS problem

Joe,

My thinking was that either the monitor or the application was referencing the NFS mount point, thus not allowing it to be released. I'm thinking in automount terms here, but the more I think about it, the less likely it seems that that has anything to do with this situation where the exporting server has disappeared. It doesn't matter how it was mounted, it's not going to release the mount until it can communicate with the other server.

Worthless rambling, I guess! Disregard!


Pete
Joe Short
Super Advisor

Re: NFS problem

Yup, that's about it. The exporting server goes dark, and leaves a stale mount on the other server. I suspect that if I switch the NFS roles of the servers, I will get around this. I'll mount the file system on the alternate node, and NFS mount it on the primary. That will remove the stale mount situation during failover.

Re: NFS problem

Joe,

I see no reason why this shouldn't work without the NFS toolkit - the toolkit is just a bunch of scripts anyway. Of course, if you have the toolkit and it still doesn't work, you have the advantage of being able to call the response centre and log a call.

As long as the NFS client mounts the file system via the virtual IP address rather than a host IP address, the mount point should go stale but come back again when the file system is re-mounted and (presumably) re-exported on the failover system.

The issue is likely to be that, as you are on 11i, NFS is defaulting to TCP rather than UDP and is therefore connection-oriented. Try mounting the file system with the 'proto=udp' option - that may solve your problem.
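In command form, that suggestion would look something like this (server name and mount point are illustrative):

  mount -F nfs -o proto=udp nfs_pkg_ip:/stage /stage

or as an /etc/fstab entry:

  nfs_pkg_ip:/stage  /stage  nfs  proto=udp  0  0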

HTH

Duncan

I am an HPE Employee
Joe Short
Super Advisor

Re: NFS problem

Duncan, I am using the same mount point name on both servers, so I still may end up with a problem. I won't be able to try my solution until Monday (it's Friday here), but I am sure it will work, and it does not matter which node mounts and which node NFS mounts the file system. I only have one package between 2 nodes, so I suspect I may be tripping over my own feet.
Joe Short
Super Advisor

Re: NFS problem

Well, I tried reversing where the file system mounts, and that didn't work. Then I tried Duncan's suggestion, and that didn't work either.
I find it hard to believe that using NFS in any small way in an MC/SG environment requires the NFS toolkit.
If anyone has any ideas, I'm open to suggestions.
I am under the gun here, and need to make this work in less than a week.
Sridhar Bhaskarla
Honored Contributor

Re: NFS problem

Hi Joe,

Your application should access *only* the NFS mount point on both boxes. For example, on Server A:

/dev/vg01/apps -> /apps

floating_IP:/apps -> /somewhere/apps

Server B:

(nothing mounted locally) -> /apps

floating_IP:/apps -> /somewhere/apps

Your application should always use the /somewhere/apps NFS mount rather than /apps. This way /apps is always process-free on Server B.

During the failover, on Server A you will need to first stop the NFS server, unmount the NFS mount on Server A, unexport the NFS entry, restart the NFS server subsystem, and then unmount the local filesystem. During this time Server B's /somewhere/apps would be in a hanging/stale state. Once the package fails over to Server B, it will mount the local filesystem at /apps, export the mount point, and do the NFS mount *only* if there is no entry for it in /etc/mnttab.
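A rough command-level sketch of that layout (volume and path names are illustrative, following Sri's example):

  # Server A, while it holds the package:
  mount -F vxfs /dev/vg01/apps /apps                   # local filesystem owned by the package
  exportfs -i /apps                                    # export it over NFS
  mount -F nfs nfs_floating_IP:/apps /somewhere/apps   # loopback NFS mount via the package IP

  # Server B (adoptive node), nothing local on /apps:
  mount -F nfs nfs_floating_IP:/apps /somewhere/apps

The application on both nodes then works only in /somewhere/apps.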

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Joe Short
Super Advisor

Re: NFS problem

Sridhar, I have only 2 servers in this environment. How can I not mount the file system directly on one of them? It needs to be mounted somewhere before it can be exported.
Sridhar Bhaskarla
Honored Contributor

Re: NFS problem

Hi Joe,

You will need to have two packages. I guess I messed up the order in my previous message.

Server A runs Package A (NFS Server functionality) primarily

Startup:
1) Mounts the local filesystems, including the to-be-NFS filesystem on a mount point, say /apps, and adds the nfs_floating IP through regular Serviceguard.
2) Exports /apps for NFS.
3) Mounts "nfs_floating_IP:/apps /somewhere/apps" if the entry is not there in /etc/mnttab.
4) Starts the application that uses /somewhere/apps

Shutdown:
1) Stops the application
2) Unexports /apps
3) Restarts NFS server
4) Unmounts the local filesystems including the NFS_filesystem


Package B runs on Server B Primarily

Startup:

1) Mounts the filesystems (nothing should be mounted on /apps) and adds the floating IPs.
2) Mounts "nfs_floating_IP:/apps /somewhere/apps"
3) Starts the application

Shutdown:
1) Stops the application
2) Unmounts the filesystems

In both of the above, /somewhere/apps, which is the NFS mount point, won't get unmounted. It will become active again when the floating IP is up and the export reappears.

With some refining you may be able to get it to work. The key is not to access the local filesystem directly.
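A sketch of how package A's control script might implement these steps, assuming the legacy Serviceguard control-script template (the customer_defined_* functions come from that template; paths, the IP name, and the application start/stop commands are illustrative):

  function customer_defined_run_cmds
  {
      # The local /apps filesystem and the floating IP are handled by the
      # standard FS and IP sections of the control script before this runs.
      exportfs -i /apps                                        # step 2: export /apps
      grep -q /somewhere/apps /etc/mnttab || \
          mount -F nfs nfs_floating_IP:/apps /somewhere/apps   # step 3: mount only if not in mnttab
      /somewhere/apps/bin/start_app                            # step 4: hypothetical application start
  }

  function customer_defined_halt_cmds
  {
      /somewhere/apps/bin/stop_app                             # step 1: hypothetical application stop
      exportfs -u /apps                                        # step 2: unexport /apps
      /sbin/init.d/nfs.server stop                             # step 3: restart the NFS server daemons
      /sbin/init.d/nfs.server start
      # Step 4: the local /apps filesystem is unmounted by the standard FS
      # section of the control script after this function returns.
  }

Package B's control script would only need the NFS mount of nfs_floating_IP:/apps on /somewhere/apps plus the application start/stop, since it never mounts anything on /apps itself.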


-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Joe Short
Super Advisor

Re: NFS problem

Solved it. I was stepping on my own toes. The problem was that I was exporting a file system called /stage. I was then NFS mounting it on the same mount point (mount SERVER1:/stage /stage). So, when the package failed over, the NFS mount went stale. The package would try to mount the /stage file system, but the mount point came back as busy because it was stale.
This occurs even when the package itself exported the file system and was the one referenced in the NFS mount.
The solution was so easy I'm embarrassed to talk about it. I simply NFS mounted it to a different mount point (mount SERVER1:/stage /stage2). This way the original mount point remains available during a failover, and once the original file system is remounted and the package exports it again, the NFS mount recovers from being stale.
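In command terms, the before/after looks roughly like this (the local device path is illustrative):

  # Before: the NFS mount reused the exported path, so a stale mount blocked the package mount after failover
  mount -F nfs SERVER1:/stage /stage

  # After: the NFS mount gets its own mount point, leaving /stage free for the package
  mount -F nfs SERVER1:/stage /stage2
  mount -F vxfs /dev/vgstage/lvol1 /stage   # the package can mount here cleanly during failover
  exportfs -i /stage                        # once re-exported, the stale /stage2 mount recovers on its own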