Ignite Server and NFS Hang

hpuxrox
Respected Contributor

Ignite Server and NFS Hang

I am testing an Ignite server in the lab to plan an Ignite rollout to production.

I was wondering if anyone has had problems with the NFS mounts hanging the client machine? I started a make_net_recovery from three machines overnight, and when I came in this morning all three machines had NFS hang problems (bdf hangs, etc.). I tested this again by disconnecting the LAN interface from the Ignite server while all three machines were running make_net_recovery, and it happened again. If the clients hang whenever they lose the connection to the Ignite server, there is no way this is going to production.
Steven E. Protter
Exalted Contributor

Re: Ignite Server and NFS Hang

This is a pretty common problem.

Ignite needs the networking configuration to be exactly right to work properly.

1) Make sure you boot off the private built-in LAN card. Ignite won't boot off add-in LAN cards.
2) instl_adm -d will show the Ignite networking settings
3) instl_adm -d > file to put the info in a file
4) edit the file to match the actual networking
5) instl_adm -f file to load the new networking settings

Try again.

It should work. The actual settings depend on the NIC card in use.

You need to boot off the built-in card, but you can use faster gigabit add-in NICs to actually push the golden image.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
hpuxrox
Respected Contributor

Re: Ignite Server and NFS Hang

This is during the backup, not a restore. The problem is that Ignite has the client mount the Ignite filesystem with hard mounts. Is there a way to change them to soft mounts?
Steven E. Protter
Exalted Contributor

Re: Ignite Server and NFS Hang

Whoops.

Few more things to check:

1) /etc/exports on the NFS server. It can grant or deny access based on hostname. You need to make sure those permissions are consistent. Run exportfs -av to re-export the entries and see them listed.
2) Check the switch configuration and make sure 1000BaseT NICs are set to autonegotiate, even in the switch config. 100BaseT NICs need to be set to manual in the switch config and on the server in /etc/rc.config.d/hpbtlanconf

Most of my issues with this stuff had to do with /etc/exports configuration.

SEP
hpuxrox
Respected Contributor

Re: Ignite Server and NFS Hang

The hard/soft choice is determined by mount_nfs on the client, not by the server's exports file. The problem is that "make_net_recovery -s igniteserver", run from the client, mounts these filesystems automagically, not from /etc/fstab. So my question is: how do I make Ignite use soft mounts instead of hard mounts?
Dave Olker
Neighborhood Moderator

Re: Ignite Server and NFS Hang

Hi,

I don't know how to configure Ignite to use soft NFS mounts, but my question is: why would you want to use soft mounts in this case?

If you suspect you're having some type of network problem that is causing the NFS mounts to stop responding, a soft mount won't buy you anything except the client may not be hung when you come in the next morning. It doesn't mean the backup will be successful, or that it will have made any better progress than the hard mount did.

Rather than worry about mount options, I'd first try to determine the reason the NFS mount stops responding.

Are these clients able to successfully NFS mount filesystems from this server under normal conditions? If so, what mount options do they use when they work properly? UDP/TCP? PV2/PV3? rsize/wsize? Do manual NFS mounts work in addition to automounted filesystems? Have you tried using a tool like iozone to test reading/writing large amounts of data across the NFS mounts, as Ignite does when it performs a backup or install, to see if the client stops responding during normal read/write operations?

Again, until you get to the bottom of why a normal hard mount stops responding, I wouldn't spend a lot of time trying to make it a soft mount.
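As a rough stand-in for the kind of large read/write test described above (not iozone itself), the following sketch writes a file of a chosen size into a directory and reads it back. The helper name nfs_rw_test and the temporary file name are hypothetical, and the directory is assumed to be an already-mounted NFS path. On a hard mount with an unreachable server this will block, which is exactly the behavior being investigated.

```shell
#!/bin/sh
# Sketch only: write and read back a large file in a directory to exercise
# sustained NFS I/O. nfs_rw_test is a hypothetical helper, not an Ignite tool.
# Usage: nfs_rw_test DIRECTORY SIZE_IN_MB
nfs_rw_test() {
    dir=$1
    mb=$2
    f="$dir/nfs_rw_test.$$"

    # Write SIZE_IN_MB blocks of 1 MB each.
    dd if=/dev/zero of="$f" bs=1048576 count="$mb" 2>/dev/null || return 1

    # Read the file back and confirm the byte count matches what was written.
    bytes=$(dd if="$f" bs=1048576 2>/dev/null | wc -c)
    rm -f "$f"
    [ "$bytes" -eq $(($mb * 1048576)) ] || return 1
    echo "OK: wrote and read back $mb MB in $dir"
}
```

For example, `nfs_rw_test /var/opt/ignite/recovery_client_mnt 500` would push roughly 500 MB through the mount. The read-back may be served from the client's cache, so the write is the more meaningful half of the test.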

Just my opinion...

Regards,

Dave



Steven E. Protter
Exalted Contributor

Re: Ignite Server and NFS Hang

Dave knows more about NFS than I'd ever hope to know. I will state for the record that I've done many successful make_net_recovery implementations.

It was done with hard mounts, and all the trial and error was resolved by modifying /etc/exports on the target server.

Here is the successful implementation I used for a three-server group, all writing make_net_recovery backups to a target server named tzfat. Here is the relevant portion of tzfat's /etc/exports file.

/images -access=moriah:hebron:jufdev:jufprod
/var/opt/ignite/recovery/archives/tzfat -anon=2,access=tzfat

All make_net_recovery archives go to /images, which I set up to hold them. Permissions allow access. The second line was added so the servers can access their configuration files when recovering. It may have been necessary to make the recoveries work; I recall some errors complaining that the servers could not access their config files.

Ignite version:
Sat Nov 13 18:10:13 2004:tzfat:/ root user
[5001 /root #] swlist -l product | grep -i ignite
Ignite-UX B.5.4.50 HP-UX System Installation Services

Which means I have the Ignite version you were seeking in your other thread. I'm starting the rolling upgrade process to V6 right now.

SEP
steven Burgess_2
Honored Contributor

Re: Ignite Server and NFS Hang

Hi Randall

Have you checked your syslog for any messages during the Ignite runs? We had huge issues here with make_net_recovery. The problem for us was traced to an issue in the SAN. We were getting a ton of PV timeouts, which in turn caused nfsd to become extremely overworked. This led to it not responding to I/O requests and, in the end, ignoring an rpcinfo from the cluster monitor daemon. That caused the nfspkg to fail over, which failed because there was an Ignite process that wouldn't let go of the filesystem.

nice.....

We are currently re-engineering another solution.

HTH

Steve
take your time and think things through
hpuxrox
Respected Contributor

Re: Ignite Server and NFS Hang

Well, there is no problem to research. I disconnected the LAN card from the Ignite server myself to test the effects on the clients.

We are planning a large rollout to hundreds of machines. These are all very critical machines and can't afford a single point of failure; Ignite backups are secondary.

So, system uptime is far more important than the Ignite backup. And if Ignite has any effect on production, it will never be implemented.

I do not want to come in at 1:30 AM and find that the Ignite server went down and 10 of my production machines are offline because of NFS hangs; this is simply unacceptable.
Ermin Borovac
Honored Contributor
Solution

Re: Ignite Server and NFS Hang

If you want to use soft NFS mounts with Ignite, you can try mounting the filesystem manually with the 'soft' option before running make_net_recovery. make_net_recovery won't try to remount the filesystem if it's already mounted.

# mount -o soft :/var/opt/ignite/clients /var/opt/ignite/recovery_client_mnt
# /opt/ignite/bin/make_net_recovery -s
# umount /var/opt/ignite/recovery_client_mnt
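
The three commands above could be wrapped so the unmount is always attempted, even when the backup fails. This is only a sketch under assumptions: the server name is left blank in the commands above, so IGNITE_SERVER is a placeholder (defaulting to the "igniteserver" name used earlier in the thread), and the run/soft_backup function names and the DRY_RUN switch are hypothetical additions for illustration.

```shell
#!/bin/sh
# Sketch only: pre-mount the Ignite client directory with the 'soft' option,
# run the backup, then always attempt the unmount.
# IGNITE_SERVER is a placeholder; set it to the real Ignite server name.
IGNITE_SERVER="${IGNITE_SERVER:-igniteserver}"
MNT=/var/opt/ignite/recovery_client_mnt

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

soft_backup() {
    # If the soft mount fails, stop here so make_net_recovery does not
    # fall back to mounting the filesystem itself.
    run mount -o soft "${IGNITE_SERVER}:/var/opt/ignite/clients" "$MNT" || return 1

    run /opt/ignite/bin/make_net_recovery -s "$IGNITE_SERVER"
    rc=$?

    # Attempt the unmount regardless of the backup result.
    run umount "$MNT" || echo "WARNING: $MNT still mounted" >&2
    return $rc
}
```

Setting DRY_RUN=1 before calling soft_backup prints the three commands without executing them, which may be useful for checking the server name before a real run.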
Steven E. Protter
Exalted Contributor

Re: Ignite Server and NFS Hang

I understand your ignite concerns.

Ignite imaging is used in my shop and extensively all over HP. Every time I went to class and messed up an education center machine, Ignite was used to fix it.

That being said, the issues you are experiencing with Ignite are networking issues. They may or may not have to do with your Ignite configuration.

It's possible that Ignite has uncovered an underlying networking issue.

You are busy and have a lot more machines to maintain than I do. It still might be worth carefully looking into this issue.

I think there is probably an inconsistency in your server or client instl_adm -d output. That means a network setting is wrong somewhere and is responsible for the entire NFS hang issue.

I had this exact same issue and the response center helped me fix it and avoid a nasty production problem that was just waiting to happen because of how someone had my switch port settings set.

Maybe the response center is the way to go.

If you post some specific data I'm betting you can resolve the issue right here without disrupting production.

Regards and good luck,

steve
Michael Roberts_3
Honored Contributor

Re: Ignite Server and NFS Hang

From mount_nfs(1M):

Hard vs. Soft
File systems that are mounted read-write or that contain
executable files should always be mounted with the hard option.
Applications using soft mounted file systems may incur unexpected
I/O errors.

Since the purpose of make_net_recovery is an "insurance" in case of future disaster, the most conservative path (hard mount) was taken with regard to writing the archives on NFS.
etouq ot hguone revelc ton m'i
hpuxrox
Respected Contributor

Re: Ignite Server and NFS Hang

After working with HP Support, they stated that the mounting of the file systems is hardcoded in the make_net_recovery command, so that cannot be changed. They also suggested running a script that verifies that the NFS mounts have been unmounted properly, which I will put in the make_net_recovery push script. I talked with my manager and suggested clustering the server due to its criticality, and he seems not to have much of an issue with it. So I guess my solution, or rather mitigation, is:

Script to verify mounts
MC/SG Clustering of the ignite server
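
A minimal sketch of what such a mount-verification check might look like, not the script HP Support suggested. It assumes the mount table is readable at /etc/mnttab with the mount point in the second field, and the function names are hypothetical. It reads the table file rather than calling bdf, since bdf is what was reported hanging earlier in the thread.

```shell
#!/bin/sh
# Sketch only: report whether a mount point is still listed in the mount table.
# Assumes /etc/mnttab lists the device in field 1 and the mount point in field 2.
# The optional second argument overrides the table path (useful for testing).
MNT=/var/opt/ignite/recovery_client_mnt

is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit found ? 0 : 1 }' "${2:-/etc/mnttab}"
}

check_unmounted() {
    if is_mounted "$1" "$2"; then
        echo "WARNING: $1 is still mounted" >&2
        return 1
    fi
    return 0
}
```

In a push script, `check_unmounted "$MNT"` could run after make_net_recovery finishes, with a non-zero return used to raise an alert. Note that a missing or unreadable table is treated as "not mounted" here, which a production script would probably want to handle separately.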

Thanks for all your help
hpuxrox
Respected Contributor

Re: Ignite Server and NFS Hang

Thanks