Last edited: 12-01-2014 07:21 PM
HP-UX 11i v3 - Fuser Hang or Very Slow ISSUE
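Resolving the HP-UX 11i v3 fuser hang or very slow issue:
Symptoms:
- During a failover drill on Saturday, unmounting a file system was very slow, typically taking about 3 minutes. Running #fuser -cu against the mount point to list the processes using it was just as slow, also taking about 3 minutes.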
HP-UX 11i v3 - Fuser Hang or Very Slow
Issue
fuser takes a very long time to complete on all files across all file systems. The end user reported that only a reboot resolved the issue, and that it eventually recurred. The end user also saw CPU utilization spike to 90% while the fuser command was running.
The end user originally said that fuser hung and had to be terminated. tusc was run on the command with the following syntax, and the output file was analyzed.
tusc -o /var/tmp/tusc.out -p -e -E -T "" fuser <file>
Solution
In the tusc file that the end user sent, the lines of interest are the pstat(PSTAT_PROC_VM...) calls, which is where the problem shows up.
First, here is what one of these lines looks like:
1292431568.288187 [24910] pstat(PSTAT_PROC_VM, 0x7fffba40, 104, 0, 3) ... = 1
Where: 104 is the size
0 is the PID
3 is the number of process regions.
Now one of the entries seems out of place:
1292431731.546178 [24910] pstat(PSTAT_PROC_VM, 0x7fffba40, 104, 4012, 65686)= 0
PID 4012 has 65686 process regions open.
The end user looked up this PID, which turned out to be an Enterprise Manager Agent for Oracle (process name = emagent). The DBA shut down the agent, and fuser then worked normally.
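As a rough sketch of the manual inspection described above, the shell function below scans tusc output for pstat(PSTAT_PROC_VM, ...) calls and prints the PID with the largest region count. The function name top_pregions is hypothetical, and the parsing assumes that the last two call arguments are the PID and the number of process regions, as in the trace lines quoted in this post.

```shell
# Hypothetical helper: print "PID COUNT" for the pstat(PSTAT_PROC_VM, ...)
# call with the largest region count found in tusc output.
# Reads the files named as arguments, or standard input if none are given.
# Assumption: the last two call arguments are the PID and the region count.
top_pregions() {
    awk '
        /pstat\(PSTAT_PROC_VM,/ {
            line = $0
            # Drop everything up to and including the first argument.
            sub(/^.*pstat\(PSTAT_PROC_VM,/, "", line)
            # Drop the closing parenthesis and the return value.
            sub(/\).*$/, "", line)
            n = split(line, args, /, */)
            pid = args[n - 1] + 0
            cnt = args[n] + 0
            if (cnt > max) { max = cnt; maxpid = pid }
        }
        END { if (max > 0) print maxpid, max }
    ' "$@"
}
```

It could be run as: top_pregions /var/tmp/tusc.out. Matching on the call text rather than on fixed field positions means the timestamp and PID prefixes that appear in the trace lines above do not shift the result.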
Keywords : fuser hung, slow oracle, server, enterprise, agent, tusc
Here is a sample script that can be used to find which process is using the most pregions:
- We phoned the RC for advice, and they provided a script that lists which of the current processes holds the most pregions:
#!/usr/bin/sh
# For use on 11iv3 only.
#
# This script uses tusc to determine how many pregions a process has.
# For example, in the following tusc trace:
# pstat(PSTAT_PROC_VM, 0x7fffb620, 104, 17649, 6000) ....... = 1
#                                         ^     ^
#                                         |     |
#                                    PID -      |
#                                               |
#                                   pregion ----
# The script sorts the output and displays it in an easy to read format.
# Keep in mind, if fuser takes 10 minutes to return, so will this script.
#
TUSC=/usr/local/bin/tusc
OUT=/var/tmp/tusc.fuser

${TUSC} -o ${OUT} -s pstat fuser /stand

PROC=$(awk '{ print $4 $5 }' ${OUT} | sort -nk 2,2 -t , | tail -1)
PID=$(echo $PROC | cut -d , -f 1)
NUM=$(echo $PROC | cut -d , -f 2 | sed 's/)//')

echo "PID with most pregions is ${PID}"
echo "PID ${PID} has ${NUM} pregions"
echo ""
ps -p ${PID}
Here is an example of the output:
root:clunk:/tmp> ./preg
/stand:
PID with most pregions is 2408
PID 2408 has 1146 pregions
PID TTY TIME COMMAND
2408 ? 135:54 mxdomainmgr
- Output on our system at the time:
EDOCDB01[/usr/contrib/bin/tools]#sh /tmp/fuser3.sh
/stand:
PID with most pregions is 10531
PID 10531 has 35026 pregions
PID TTY TIME COMMAND
10531 ? 3557:38 db2sysc
- The output shows that the db2sysc process (PID 10531) had the most pregions, 35026, which is what made fuser slow.
5. The RC finally sent over a patch that they said fixes this issue:
Symptoms:
PHCO_43253:
( QX:QXCR1001225006 )
fuser(1m) is very slow if one or more process has
thousands of memory mapped regions
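6. Patch analysis: According to the patch description, fuser can query process information in two ways. The first scans each process's mmapped regions, so fuser spends a long time walking the process memory maps, which makes both fuser and umount slow. The second uses the lighter interfaces, which do not scan the process memory maps, so fuser runs quickly even though those mapped regions still exist.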
Defect Description:
PHCO_43253:
( QX:QXCR1001225006 )
fuser(1m) uses pstat_getproc(2) and pstat_getprocvm(2)
interfaces to query the information related to
processes and their mmaped regions. pstat(2) also provides
a lighter versions - pstat(PSTAT_PROC_LITE,..) and
pstat(PSTAT_PROC_VM_LITE, ..) for performance benefits.
fuser(1m) is not using the lighter interfaces.
- Resolution:
Install patch PHCO_43253, which changes fuser from its default interfaces that scan the memory-map regions to the lighter interfaces, making its process queries more efficient.
Resolution:
fuser(1m) has been modified to use the lighter pstat(2)
interfaces for better performance.
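Case summary: By default, fuser uses the access method that scans mmapped regions. When umount is slow, rebooting the operating system does release the affected processes' mapped regions (in fact, restarting only the application that owns those processes is enough; in this case, the DB2 application). The best fix, however, is to change how fuser queries processes: once it uses the lighter interfaces and no longer scans the mapped regions, the problem is resolved.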