Servers and Operating Systems > Operating Systems > Operating System - HP-UX
Severe degradation of RPC communications
01-21-2004 10:26 PM
Hi network wizards,
I have a system that experiences considerable impediment of its NFS/RPC services.
Boundary conditions:
# uname -srv
HP-UX B.11.11 U
# echo "sc prod mem;info;wait;il"|cstm|grep -i total
Total Configured Memory : 8192 MB
Cell Total (MB): 4096
Cell Total (MB): 4096
System Total (MB): 8192
PDT Total Size: 100
Total Configured Memory : 8192 MB
Cell Total (MB): 4096
Cell Total (MB): 4096
System Total (MB): 8192
PDT Total Size: 100
# model
9000/800/rp7410
# swapinfo -ta
Kb Kb Kb PCT START/ Kb
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 8388608 1110752 7277856 13% 0 - 1 /dev/vg00/lvol2
dev 12288000 1105884 11182116 9% 0 - 1 /dev/vg00/lvol12
reserve - 6898764 -6898764
memory 6463356 1624092 4839264 25%
total 27139964 10739492 16400472 40% - 0 -
The system is the cluster leader in a two-node MC/ServiceGuard cluster and the primary node for the single and only package, which is an Oracle/SAP R3 application (hence the 20 GB of swap).
The system load is, and has been, negligible:
# uptime
11:56am up 46 days, 21:39, 4 users, load average: 0.15, 0.14, 0.15
The same goes for CPU utilization.
The system only seems to suffer from memory shortage.
E.g. the current summary from glance's memory report reads:
Total VM : 8.74gb Sys Mem : 1.48gb User Mem: 5.57gb Phys Mem: 7.98gb
Active VM: 2.69gb Buf Cache: 817.6mb Free Mem: 136.4mb
E.g. the current swap report shows:
Swap Available: 26504m Swap Used: 3751mb Swap Util (%): 40 Reserved: 10489m
Yesterday, when the RPC problem became apparent, I couldn't even start glance for a while, and thus ran "sar -w", which showed swapout activity.
Therefore I'm tempted to believe that our RPC problems are just a consequence of insufficient memory.
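For what it's worth, here is the kind of check I mean. The sar figures below are illustrative, not from this system; on the box itself you would run something like `sar -w 5 12` during a slow period and look at the same column:

```shell
# Parse sample "sar -w" output (embedded in a heredoc so the extraction is
# reproducible; replace the heredoc with a live `sar -w 5 12` on the system).
sar_out=$(cat <<'EOF'
12:00:05 swpin/s bswin/s swpot/s bswot/s pswch/s
12:00:10    0.00     0.0    4.20    33.6     412
EOF
)
# Any sustained nonzero swpot/s means whole processes are being swapped out,
# which stalls nfsd/biod along with everything else on the run queue.
swpot=$(echo "$sar_out" | awk 'NR==2 {print $4}')
echo "swpot/s = $swpot"
```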
NFS-wise the same cluster node is NFS server and NFS client.
This is due to SAP's quirk of making heavy use of the automounter, with lots of NFS exports and imports across a whole farm of other servers (e.g. for SAP transports).
This automount setup causes quite some grief in a cluster environment, but SAP requires it.
So sometimes RPC tools like showmount time out, while at other times they respond quickly.
I have quite a few entries about failed NFS communication in syslog.log (n.b. alster is the NFS server/exporter, lena (the cluster package's virtual hostname) the client; both are on the same machine, so I think I can leave LAN observations aside):
# grep -i nfs /var/adm/syslog/syslog.log|sed -n '/Jan 21/,$p'|tail
Jan 22 05:00:32 alster vmunix: NFS server (pid716@/usr/sap/trans) ok
Jan 22 05:00:32 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
Jan 22 05:04:02 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
Jan 22 05:04:02 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 05:30:46 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 05:31:01 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
Jan 22 06:50:28 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 06:51:02 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
Jan 22 07:40:00 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 07:40:00 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
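Out of curiosity I also measured how long each "not responding" episode lasts, by pairing each message with the following "ok". The snippet below works on an embedded excerpt of the log above; on the system you would feed it the grep output instead:

```shell
# Pair each "not responding" line with the next "ok" line and print the
# outage duration in seconds (excerpt embedded for reproducibility).
log=$(cat <<'EOF'
Jan 22 05:04:02 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 05:31:01 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
Jan 22 06:50:28 alster vmunix: NFS server (pid716@/sapmnt/Z01) not responding still trying
Jan 22 06:51:02 alster vmunix: NFS server (pid716@/sapmnt/Z01) ok
EOF
)
durations=$(echo "$log" | awk '
  # convert the HH:MM:SS field to seconds since midnight
  function secs(t,  a) { split(t, a, ":"); return a[1]*3600 + a[2]*60 + a[3] }
  /not responding/ { start = secs($3) }
  / ok$/ && start  { print "outage: " secs($3) - start " s"; start = 0 }')
echo "$durations"
```

The first episode above lasted about 27 minutes, the second about half a minute, which matches the impression that the problem comes and goes.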
The RPC stats for the server look like this
# nfsstat -sr
Server rpc:
Connection oriented:
calls badcalls nullrecv
57792 0 0
badlen xdrcall dupchecks
0 0 20800
dupreqs
0
Connectionless oriented:
calls badcalls nullrecv
13668160 0 0
badlen xdrcall dupchecks
0 0 4879982
dupreqs
188
which looks sound to me, while the client side probably exhibits too many timeouts and retransmissions:
# nfsstat -cr
Client rpc:
Connection oriented:
calls badcalls badxids
0 0 0
timeouts newcreds badverfs
0 0 0
timers cantconn nomem
0 0 0
interrupts
0
Connectionless oriented:
calls badcalls retrans
4011982 139 13067
badxids timeouts waits
13186 12027 0
newcreds badverfs timers
0 0 65907
toobig nomem cantsend
0 0 0
bufulocks
0
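To put those client-side numbers in proportion, I computed the retransmission rate from the connectionless counters above (the heredoc just embeds the figures quoted; on the live system you would parse `nfsstat -cr` directly):

```shell
# Connectionless client RPC counters from the nfsstat -cr output above.
stats=$(cat <<'EOF'
calls badcalls retrans
4011982 139 13067
badxids timeouts waits
13186 12027 0
EOF
)
calls=$(echo "$stats" | awk 'NR==2 {print $1}')
retrans=$(echo "$stats" | awk 'NR==2 {print $3}')
badxids=$(echo "$stats" | awk 'NR==4 {print $1}')
# Percentage of calls that had to be retransmitted; a rule of thumb often
# quoted is that more than a few percent is worth investigating.
retrans_pct=$(awk -v r="$retrans" -v c="$calls" 'BEGIN {printf "%.2f", r*100/c}')
echo "retrans: ${retrans_pct}% of ${calls} calls, badxids: ${badxids}"
```

So roughly 0.3% of calls are retransmitted, but badxids is about as large as timeouts, which is the pattern I'm asking about below.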
Because so many services and partners are involved in NFS here, I fear that a thorough NFS performance analysis is beyond my possibilities/horizon.
Therefore I would like to ask you network gurus for some advice on what to look for, and where improvements could be applied.
Rgds.
Ralph
Madness, thy name is system administration
Solved! Go to Solution.
3 REPLIES
01-22-2004 02:23 AM
Solution
From
http://secu.zzu.edu.cn/book/NetWork/NetworkingBookshelf_2ndEd/nfs/appb_02.htm
"badxids ~ timeout
RPC requests that have been retransmitted are being handled by the server, and the client is receiving duplicate replies. Increase the timeo parameter for this NFS mount to alleviate the request retransmission, or tune the server to reduce the average request service time."
You might also check your patches.
http://www2.itrc.hp.com/service/patch/search.do
shows 29 patches when you just search on nfs.
Ron
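(If one follows the timeo route, the change is made per mount. The values below are purely illustrative, not recommendations; check mount_nfs(1M) on the HP-UX 11.11 box for the exact option semantics:)

```shell
# /etc/fstab entry (illustrative values): timeo is in tenths of a second,
# so timeo=30 means a 3-second initial timeout before the first retransmit;
# retrans=5 allows five retransmissions before a major timeout is reported.
#
#   alster:/sapmnt/Z01  /sapmnt/Z01  nfs  rw,timeo=30,retrans=5  0  0
#
# Or, where the filesystem is quiet enough, adjust an existing mount:
#
#   mount -F nfs -o remount,timeo=30,retrans=5 alster:/sapmnt/Z01 /sapmnt/Z01
```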
01-22-2004 02:36 AM
Re: Severe degradation of RPC communications
I would say all your problems stem from lack of memory and the consequent massive swapping. Performance degrades so much during heavy swapping that everything can be affected: NFS daemons, the network, root processes, etc. I think someone once said performance degrades by around 100 times during swapping.
I would fix the memory problems first and see whether that resolves it.
If not, then the best source for looking in depth at NFS issues and fixing them is the book "Optimizing NFS Performance" by Olker (available on Amazon). Excellent; it covers everything (NFS, RPC, the network) and how to tune and debug.
Im from Palmerston North, New Zealand, but somehow ended up in London...
01-22-2004 07:39 PM
Re: Severe degradation of RPC communications
Ron,
thanks for your link to the ORA networking books.
Kudos to those who provide this service ;-)
Stefan,
you are probably right that the prevailing memory issue is to blame.
I will do further performance checks to single out the NFS trouble.
Madness, thy name is system administration