Re: rman processes wait state

Sergey Akifiev_1 · ‎12-10-2007

I'm not sure where this question belongs, so I'll post it in the database forum.

Well, I started a full database backup with rman onto an NFS-mounted file system. While monitoring its performance I noticed that the network card is pushing only 3-5MB/s. This is way too low, because a `cp' of a large file gets about 10MB/s throughput.
Digging further, I found that neither of the rman processes (the backup runs with 2 channels allocated) is waiting on anything from Oracle's point of view. The disk array holding the datafiles is also idle. So, IMHO, it boils down to the OS.
So I took the glance utility and checked the processes' wait states. It appears that both rman processes are in the `SYSTEM' wait state 90% of the time! Not `Disk IO', nor `NFS'.

So, please give me some points to check in HP-UX.
Thanks in advance.
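
One point that could be checked independently of rman is whether plain sequential writes to the mount also fall short of the `cp' figure. The following is a minimal sketch, not something from the thread: it times writes of a given block size to a test file and reports MB/s. The function name `write_throughput`, the test file name and the sizes are hypothetical, and whether rman writes in a comparable pattern is an assumption that would need verifying.

```python
import os
import tempfile
import time


def write_throughput(path, block_size, total_bytes, sync_each=False):
    """Write about total_bytes to path in block_size chunks; return MB/s.

    sync_each=True forces an fsync after every block, which approximates
    a writer that waits for each write to reach stable storage.
    """
    block = b"\0" * block_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.time()
    try:
        while written < total_bytes:
            written += os.write(fd, block)
            if sync_each:
                os.fsync(fd)
        os.fsync(fd)  # flush buffered data before stopping the clock
    finally:
        os.close(fd)
    elapsed = max(time.time() - start, 1e-9)
    return written / (1024.0 * 1024.0) / elapsed


# Demo against a local temporary directory. To test the NFS mount, point
# target_dir at a scratch directory under the mount instead (hypothetical).
target_dir = tempfile.mkdtemp()
test_file = os.path.join(target_dir, "throughput_test.bin")
for size in (8192, 32768):  # two block sizes to compare
    rate = write_throughput(test_file, size, 4 * 1024 * 1024)
    print("block %6d bytes: %.1f MB/s" % (size, rate))
os.remove(test_file)
os.rmdir(target_dir)
```

Running it once with `sync_each=False` and once with `sync_each=True` gives a rough idea of how much per-write synchronisation costs on that path.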

Steven E. Protter · ‎12-10-2007

Shalom,

It could still be an i/o problem.

On the NFS server.

Think about this:
The NFS server is busy, or the network is clogged with requests. The process attempting the backup is waiting for I/O, and it is put to sleep because there isn't any I/O it can do.

I'd check the NFS server utils, nfsstat and such for bottlenecks and pending requests.

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Sergey Akifiev_1 · ‎12-10-2007

Could you help me elaborate on this point?

On the client, the filesystem is mounted like this:
sergey@buster:~$ mount | grep bootes
/mnt/bootes on bootes:/opt/pub rsize=32768,wsize=32768,NFSv3 on Fri Nov 16 17:06:31 2007

And nfsstat on the client:
sergey@buster:~$ nfsstat

Server rpc:
Connection oriented:
calls         badcalls      nullrecv
0             0             0
badlen        xdrcall       dupchecks
0             0             0
dupreqs
0
Connectionless oriented:
calls         badcalls      nullrecv
0             0             0
badlen        xdrcall       dupchecks
0             0             0
dupreqs
0

Server nfs:
calls         badcalls
0             0
Version 2: (0 calls)
null          getattr       setattr
0 0%          0 0%          0 0%
root          lookup        readlink
0 0%          0 0%          0 0%
read          wrcache       write
0 0%          0 0%          0 0%
create        remove        rename
0 0%          0 0%          0 0%
link          symlink       mkdir
0 0%          0 0%          0 0%
rmdir         readdir       statfs
0 0%          0 0%          0 0%
Version 3: (0 calls)
null          getattr       setattr
0 0%          0 0%          0 0%
lookup        access        readlink
0 0%          0 0%          0 0%
read          write         create
0 0%          0 0%          0 0%
mkdir         symlink       mknod
0 0%          0 0%          0 0%
remove        rmdir         rename
0 0%          0 0%          0 0%
link          readdir       readdir+
0 0%          0 0%          0 0%
fsstat        fsinfo        pathconf
0 0%          0 0%          0 0%
commit
0 0%

Client rpc:
Connection oriented:
calls         badcalls      badxids
16177472      735           1
timeouts      newcreds      badverfs
0             0             0
timers        cantconn      nomem
0             734           0
interrupts
1
Connectionless oriented:
calls         badcalls      retrans
25            5             25
badxids       timeouts      waits
0             29            0
newcreds      badverfs      timers
0             0             0
toobig        nomem         cantsend
0             0             0
bufulocks
0

Client nfs:
calls         badcalls      clgets
16176751      2             16176749
cltoomany
96860
Version 2: (1 calls)
null          getattr       setattr
0 0%          1 100%        0 0%
root          lookup        readlink
0 0%          0 0%          0 0%
read          wrcache       write
0 0%          0 0%          0 0%
create        remove        rename
0 0%          0 0%          0 0%
link          symlink       mkdir
0 0%          0 0%          0 0%
rmdir         readdir       statfs
0 0%          0 0%          0 0%
Version 3: (16176750 calls)
null          getattr       setattr
0 0%          4081 0%       416 0%
lookup        access        readlink
812 0%        19033 0%      0 0%
read          write         create
436998 2%     15703075 97%  107 0%
mkdir         symlink       mknod
1 0%          0 0%          0 0%
remove        rmdir         rename
8 0%          0 0%          0 0%
link          readdir       readdir+
0 0%          0 0%          9645 0%
fsstat        fsinfo        pathconf
1906 0%       1 0%          96 0%
commit
571 0%

Next, the server:
[sergey@bootes /opt/pub/backup/oracle/buster]$ nfsstat
Client Info:
Rpc Counts:
Getattr   Setattr   Lookup    Readlink  Read      Write     Create    Remove
0         0         1323      0         1100      0         0         0
Rename    Link      Symlink   Mkdir     Rmdir     Readdir   RdirPlus  Access
0         0         0         0         0         19        0         1539
Mknod     Fsstat    Fsinfo    PathConf  Commit
0         87        1         0         0
Rpc Info:
TimedOut  Invalid   X Replies Retries   Requests
0         0         0         0         4069
Cache Info:
Attr Hits Misses    Lkup Hits Misses    BioR Hits Misses    BioW Hits Misses
21308     1402      4199      1323      14410     1090      0         0
BioRLHits Misses    BioD Hits Misses    DirE Hits Misses
0         0         151       19        83        0

Server Info:
Getattr   Setattr   Lookup    Readlink  Read      Write     Create    Remove
55196     104931    83033     0         538349    16748799  18431     18
Rename    Link      Symlink   Mkdir     Rmdir     Readdir   RdirPlus  Access
0         0         200       2212      0         0         26011     41575
Mknod     Fsstat    Fsinfo    PathConf  Commit
4         23222     1         96        30550
Server Ret-Failed
121873
Server Faults
0
Server Cache Stats:
Inprog    Idem      Non-idem  Misses
0         0         0         904916
Server Write Gathering:
WriteOps  WriteRPC  Opsaved
16748799  16748799  0

I see nothing suspicious here. The error counters are not increasing.

The client is HP-UX 11.11 and the server is FreeBSD 6.1-RELEASE.
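
The client-side counters above can be cross-checked with a little arithmetic. This is a minimal sketch that uses only numbers copied from the client nfsstat output and the mount options; the variable names are illustrative. The byte figure is an upper bound that assumes every WRITE call carried a full wsize of data, and the counters are cumulative, so they are not limited to this one backup.

```python
# Counters copied from the client-side nfsstat output quoted above.
client_rpc_calls = 16177472      # Client rpc, connection oriented: calls
client_rpc_badcalls = 735        # Client rpc, connection oriented: badcalls
client_rpc_cantconn = 734        # Client rpc, connection oriented: cantconn
v3_calls = 16176750              # Client nfs, Version 3 total calls
v3_write = 15703075              # Client nfs, Version 3 write calls
v3_commit = 571                  # Client nfs, Version 3 commit calls
wsize = 32768                    # from the mount options (wsize=32768)


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


write_share = pct(v3_write, v3_calls)
bad_share = pct(client_rpc_badcalls, client_rpc_calls)
writes_per_commit = v3_write / float(v3_commit)
# Upper bound only: assumes every WRITE carried a full wsize payload.
max_bytes_written = v3_write * wsize

print("WRITE share of NFSv3 calls: %.2f%%" % write_share)
print("bad RPC calls: %.4f%% of all calls" % bad_share)
print("bad calls that were cantconn: %d of %d"
      % (client_rpc_cantconn, client_rpc_badcalls))
print("WRITE calls per COMMIT: %.0f" % writes_per_commit)
print("upper bound on data written: %.1f GiB"
      % (max_bytes_written / float(2 ** 30)))
```

The WRITE share comes out at about 97%, matching nfsstat's own percentage, and bad calls are well below 0.01% of all RPC calls, nearly all of them `cantconn`. That is consistent with the error counters not pointing at an obvious RPC-level problem.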

Eric Antunes · ‎12-11-2007

Hi Sergey,

First of all, I'm not sure rman is NFS certified... You should do your rman backups to a tape or to local disks, if you have plenty of space.

Anyway, what do you get from the following:

SELECT p.SPID, sw.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT,
sw.STATE, s.CLIENT_INFO
FROM V$SESSION_WAIT sw, V$SESSION s, V$PROCESS p
WHERE (sw.EVENT LIKE 'sbt%' OR sw.EVENT LIKE '%IO%')
AND s.SID = sw.SID
AND s.PADDR = p.ADDR;

Best Regards

Eric Antunes

Each and every day is a good day to learn.

T G Manikandan · ‎12-18-2007

Are you using a NetApp as your NFS device, or is it just another server?

T G Manikandan · ‎12-18-2007

A couple of points I would check:

* Check whether you are using the latest version of NFS.

Suggestions:

* Try replacing your interface with a Gigabit one for more speed.

* You could probably reduce the number of RMAN channels to increase the speed.

Sergey Akifiev_1 · ‎12-18-2007

2Eric Antunes: according to v$session_wait, rman is working hard doing something. The last wait was, if I remember correctly, `file open' with state `-1' (i.e. it was waiting for it some time ago).
2T G Manikandan:
As I wrote before, the server is a FreeBSD 6.1-RELEASE box. Though it is not the latest version, I cannot recall any significant change in the NFS code there.
Next, unfortunately I don't have the opportunity to use a gigabit interface.
And experiments with reducing/increasing the number of RMAN channels didn't do any good either.

Eric Antunes · ‎12-19-2007

Hi Sergey,

Check this as root:

#ps -ef|grep rpc.lockd
#ps -ef|grep rpc.statd

Eric

Each and every day is a good day to learn.

Sergey Akifiev_1 · ‎12-19-2007

Here it is. On the client:
sergey@buster:~$ ps -ef|grep rpc.lockd
root 769 1 0 Dec 11 ? 0:00 /usr/sbin/rpc.lockd
sergey 16150 13341 1 15:46:47 pts/0 0:00 grep rpc.lockd
sergey@buster:~$ ps -ef|grep rpc.statd
root 763 1 0 Dec 11 ? 0:00 /usr/sbin/rpc.statd
sergey 16152 13341 0 15:46:53 pts/0 0:00 grep rpc.statd

And on the server:
[sergey@bootes ~]$ ps ax | egrep "rpc.(lock|stat)"
430 ?? Ss 0:00,91 /usr/sbin/rpc.statd
435 ?? Ss 0:00,88 /usr/sbin/rpc.lockd
442 ?? IW 0:00,00 /usr/sbin/rpc.lockd

PS: what's the point in running the ps command as root?

Eric Antunes · ‎12-19-2007

Hi Sergey,

No point at all, sorry...

What is your RDBMS version? Is it a RAC??

Eric

Each and every day is a good day to learn.
