Operating System - HP-UX

Re: NFS lockd issue 11iv3 & redhat linux AS

 
wvsa
Regular Advisor

NFS lockd issue 11iv3 & redhat linux AS

Good afternoon all;

Having a strange problem with NFS. We have an rx6600 sharing two directories to a Red Hat Linux server. Here are the dfstab share entries:

share -F nfs -o sec=sys,anon=0 /nsp_mart
share -F nfs -o sec=sys,anon=0 /sas_bi

On the redhat linux box we mount the directories as follows:

romans02:/nsp_mart /nsp_mart nfs bg,soft 0 0

romans02:/sas_bi /sas_bi nfs bg,soft 0 0


With these options I can log on as a user and read and write a file under the /nsp_mart directory.

However, when running an application (SAS), the application hangs when it attempts to open or write a file in directories under /nsp_mart.

The SAS application will work if it is run in the following manner:

sas -filelocks none

I should mention that SAS is run from the Red Hat Linux server and is attempting to read and write files in the /nsp_mart directory. At any rate, when the -filelocks none option is used all is well: SAS writes to and reads from the /nsp_mart directory.


My question is this: is there a problem with lockd on the HP-UX 11i v3 server?

I have cleared the locks using the clear_lock command and have set kctune klm_log_level=9 to see if there are any errors in syslog, but no messages appear in syslog.

Not sure how to proceed. I see that lockd is on port 4045, but running lsof -i -P I have never seen this port being used.

Not sure what is causing this problem, and I would greatly appreciate any input.

Thank you!

Norm

Dave Olker
HPE Pro
Solution

Re: NFS lockd issue 11iv3 & redhat linux AS

Hi Norm,

So are you saying that you enabled debug KLM logging and reproduced the hang but never saw any messages related to KLM? That tells me this may not be a locking problem. Have you tried collecting a network trace while reproducing the problem? I'd suggest collecting a nettl trace or Wireshark trace on the rx6600 while reproducing the problem. Once the hang occurs, let it hang for a few seconds then stop the trace. The trace will hopefully show what over-the-wire requests were happening when the hang happened.

Regards,

Dave
I work for HPE

[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
wvsa
Regular Advisor

Re: NFS lockd issue 11iv3 & redhat linux AS

Hello Dave;

Thank you for responding. Once I figure out how to get a nettl trace and/or Wireshark configured and running, I will let you know what I find.


Norm
Dave Olker
HPE Pro

Re: NFS lockd issue 11iv3 & redhat linux AS

Here's what I'd suggest on the HP-UX NFS server:

1) Turn on nettl tracing:
# /usr/sbin/nettl -tn pduin pduout loopback -e ns_ls_ip -f

2) Reproduce the hanging application

3) Turn off nettl tracing:
# /usr/sbin/nettl -tf -e all

This will create one or two files using the name you supplied in step 1 and a suffix such as ".TRC000".

Once you have this file you can load it into Wireshark (yes, Wireshark can directly read HP-UX nettl format) and filter on the IP address of the client, or NFS packets, or KLM packets, etc.
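
As a rough sketch of that filtering step, the raw trace could also be read from the command line with tshark, the terminal version of Wireshark. The trace file name /tmp/nfs_hang.TRC000, the client address 192.0.2.10, and the helper build_filter are placeholders made up for illustration; nfs, nlm and portmap are Wireshark display-filter protocol names.

```shell
#!/bin/sh
# Build a Wireshark display filter that limits output to one host and
# to NFS, NLM (lock manager) and portmapper traffic.
# build_filter is a hypothetical helper, not part of Wireshark.
build_filter() {
    printf 'ip.addr == %s && (nfs || nlm || portmap)' "$1"
}

# Placeholder values: use the name given to "nettl -f" and the real
# IP address of the NFS client.
TRACE_FILE=${TRACE_FILE:-/tmp/nfs_hang.TRC000}
CLIENT_IP=${CLIENT_IP:-192.0.2.10}

# Only run tshark when it is installed and the trace file is readable.
if command -v tshark >/dev/null 2>&1 && [ -r "$TRACE_FILE" ]; then
    tshark -r "$TRACE_FILE" -Y "$(build_filter "$CLIENT_IP")"
fi
```

The same filter string can be pasted into the Wireshark filter bar if you prefer the GUI.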

Hope this helps,

Dave

wvsa
Regular Advisor

Re: NFS lockd issue 11iv3 & redhat linux AS

David;

Thank you for your responses. I will be running the trace this morning. Would you be willing to be another set of eyes and look at the trace? I will provide all the necessary background data.


Thanks again


Norm
Dave Olker
HPE Pro

Re: NFS lockd issue 11iv3 & redhat linux AS

Sure Norm.

I'd want the *raw* trace files, not any formatted stuff. I'd also want the IP addresses of the NFS client and server. You can either post the trace files to this thread, or if you're not comfortable with that (I wouldn't blame you) you can send the trace files to me directly: dave.olker@hp.com.

Regards,

Dave

Re: NFS lockd issue 11iv3 & redhat linux AS

Dave,

Once you have the emails from Norm, you might want to ask a forums admin (such as Melvyn) to edit your post and take out your email address before the dreaded spam-bots strike...

Duncan

I am an HPE Employee
Dave Olker
HPE Pro

Re: NFS lockd issue 11iv3 & redhat linux AS

Hi Duncan,

I'm a moderator of the forums so I can edit any post. I appreciate your concern, but I have no problem listing my HP email here. I encourage HP customers to contact me, and many do. :)

Dave
Dave Olker
HPE Pro

Re: NFS lockd issue 11iv3 & redhat linux AS

Hi Norm,

I looked at the first round of nettl traces you sent me and here's the concerning thing I see in the trace:


172.17.6.19 -> 172.17.6.29 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
172.17.6.29 -> 172.17.6.19 Portmap V2 GETPORT Reply (Call In 27711) PROGRAM_NOT_AVAILABLE
172.17.6.19 -> 172.17.6.29 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
172.17.6.29 -> 172.17.6.19 Portmap V2 GETPORT Reply (Call In 28515) PROGRAM_NOT_AVAILABLE
172.17.6.19 -> 172.17.6.29 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
172.17.6.29 -> 172.17.6.19 Portmap V2 GETPORT Reply (Call In 28695) PROGRAM_NOT_AVAILABLE
172.17.6.19 -> 172.17.6.29 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
172.17.6.29 -> 172.17.6.19 Portmap V2 GETPORT Reply (Call In 28720) PROGRAM_NOT_AVAILABLE
172.17.6.19 -> 172.17.6.29 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
172.17.6.29 -> 172.17.6.19 Portmap V2 GETPORT Reply (Call In 28812) PROGRAM_NOT_AVAILABLE

The Linux client is attempting repeatedly to retrieve the port number of the NLM (Network Lock Manager) daemon running on the HP-UX system. The HP-UX box should reply with port number 4045. The fact that it doesn't reply with a port number tells me that rpc.lockd and rpc.statd may not be running on your 11i v3 system.

Can you verify that you have the LOCKMGR variable set to 1 in your /etc/rc.config.d/nfsconf file on the HP-UX system? Can you also confirm that the following command run on the HP-UX server returns the expected results:

# rpcinfo -t localhost 100021
program 100021 version 1 ready and waiting
program 100021 version 2 ready and waiting
program 100021 version 3 ready and waiting
program 100021 version 4 ready and waiting

You should get a "ready and waiting" reply for all 4 versions of program 100021, which is the Network Lock Manager. If you don't then we need to figure out why the lock manager is not getting started on your HP-UX system.
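
If it helps, the check above could be wrapped in a small script along these lines. This is only a sketch: count_ready and check_lockd are hypothetical helper names, and it assumes rpcinfo prints one "ready and waiting" line per registered version, as in the expected output shown above.

```shell
#!/bin/sh
# Count the "ready and waiting" lines in rpcinfo output read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean. count_ready is a hypothetical helper.
count_ready() {
    grep -c 'ready and waiting' || true
}

# Probe the Network Lock Manager (RPC program 100021) over TCP on a host
# and report whether all 4 versions answered. check_lockd is hypothetical.
check_lockd() {
    host=${1:-localhost}
    ready=$(rpcinfo -t "$host" 100021 2>&1 | count_ready)
    if [ "$ready" -eq 4 ]; then
        echo "lock manager OK: all 4 versions registered on $host"
    else
        echo "lock manager problem: only $ready of 4 versions answered on $host"
    fi
}

# Example: check_lockd localhost
```

Running check_lockd on the HP-UX server tests the local registration; running it from the Linux client with the server's host name as the argument tests what the client actually sees.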

Regards,

Dave
wvsa
Regular Advisor

Re: NFS lockd issue 11iv3 & redhat linux AS

Hello Dave;

here is the info you requested:

#***************************************************************************
LOCKMGR=1
LOCKD_OPTIONS=""
STATD_OPTIONS=""

#***************************************************************************
# NFS client configuration variables
:q!
root@romans02:/etc/rc.config.d
# rpcinfo -t localhost 100021
rpcinfo: RPC: Program not registered
program 100021 is not available


The rpcinfo output does not look good, so what is the next step?

Norm