
Massimo Bianchi
Honored Contributor

performance problem with basic on 11.11

Hi all,
one of my customers has a big performance problem.

They migrated their application, which is written in BASIC [Thoroughbred (http://www.tbred.com)], from an 11.0 server to a new 11.11 server.
The executable was compiled on a 10.30 system, and there is no source to recompile it on 11.11.

Unfortunately he did no stress test: the application worked, and he didn't investigate further.

Now, under full load, the system is stuck with very high SYS usage.

I traced one of their processes with tusc, and most of the time seems to be spent in the following system calls:
read
lseek64
fcntl

One of the filesystems has over 800,000 files, so I checked the kernel for proper parameters.

The buffer cache looked too large (default 50%), so I tuned it down to 15%.
vx_ninode was at its default (0), so, to avoid leaks, I changed it to 90% of ninode.

Still no luck, so I checked the vxfs parameters.
Most of the I/O looked to be 1024 bytes in size, while the vxfs default is a read-ahead of 64k, so I changed
read_pref_io to 8k and read_unit_io to 8k.

Backup performance is better and there is no relevant WIO, but SYS usage is still very high, around 65%, with IDLE at 0% and USER at 35%.


Any ideas ?

Thanks,
Massimo


Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

tusc final output:

Syscall         Seconds    Calls   Errors
exit               0.00        1
fork               0.11      136
read               2.75   103249
write              0.50    15540
open               0.02      932
close              0.02      934
wait               0.02      136
unlink             0.02       42
time               0.30    27590
brk                0.00       14
lseek              0.00       88
alarm              0.01      510
kill               0.00        1
stat               0.00       22
ioctl              0.02      273
umask              0.00      273
fcntl              1.06    89655       31
setitimer          0.00        2
getitimer          0.00        1
sigvec             0.03     2451
semop              0.00        2
sigprocmask        0.00        2
sigsuspend         0.00        1        1
sigaction          0.00        3
sigsetstatemask    0.16    19450
getdents           0.06      708
fstat64            0.00        4
lseek64            0.98   106467
stat64             0.44     9583     8670
-----             -----    -----    -----
Total              6.48   378070     8702
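
To put the summary above in proportion, here is a minimal Python sketch (not part of the original trace) that works out how much of the call volume comes from read, lseek64 and fcntl, and how often stat64 fails. The rows are copied from the tusc output; the names TUSC_SUMMARY and parse_summary are made up for this illustration.

```python
# Rows copied from the tusc summary above: syscall, seconds, calls, errors.
TUSC_SUMMARY = """\
read 2.75 103249
write 0.50 15540
time 0.30 27590
fcntl 1.06 89655 31
sigsetstatemask 0.16 19450
lseek64 0.98 106467
stat64 0.44 9583 8670
"""
TOTAL_CALLS = 378070  # "Total" row of the same output


def parse_summary(text):
    """Map syscall name -> (seconds, calls, errors); a missing errors column means 0."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        errors = int(fields[3]) if len(fields) > 3 else 0
        rows[fields[0]] = (float(fields[1]), int(fields[2]), errors)
    return rows


rows = parse_summary(TUSC_SUMMARY)

# Share of all system calls made by the three calls named in the first post.
hot_calls = sum(rows[name][1] for name in ("read", "lseek64", "fcntl"))
hot_share = hot_calls / TOTAL_CALLS

# Fraction of stat64 calls that returned an error.
stat64_fail = rows["stat64"][2] / rows["stat64"][1]

print(f"read+lseek64+fcntl: {hot_calls} calls, {hot_share:.1%} of all calls")
print(f"stat64 failure rate: {stat64_fail:.1%}")
```

Roughly four out of five calls are read, lseek64 or fcntl, and about nine out of ten stat64 calls fail. The trace alone does not say why the stat64 calls fail.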

Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

sar -w output.

There are many context switches. About 320 interactive users start the BASIC application; maybe that is the problem?

Steve Lewis
Honored Contributor

Re: performance problem with basic on 11.11

It is most likely to be a software issue, unfortunately. It sounds like the software is inefficiently re-reading files from memory buffers and locking/unlocking them repeatedly.

Since you said that wio is not relevant here, you will have to look at CPUs and memory. Maybe try writing a separate program which uses mpctl() to bind a process to a CPU. Other options are increasing memory or going for faster CPUs. If you cannot get hold of the source then your tuning options are limited.

You could try separating batch processes from interactive ones. Run the batch (background) processes at a much lower priority, maybe at a different time of day, since they will eat up all the resources they are given and slow things down for the interactive users.

You could also look into the sort of work the users are doing. A bad query from one user could ruin it for everyone else. You could set up a separate system for management information and heavy stuff.

A top list, vmstat -S, sar -b and sar -u would be more useful to us all than sar -w. A MeasureWare global report would be even better :-)

Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Hi,
some stats are in the ZIP...

I too think that it is a software issue, but there is no chance to modify it.

All I can do is at the OS level.

The db is proprietary, written in BASIC... the application is called "Databridge" something...

Thanks,
Massimo

Steve Lewis
Honored Contributor

Re: performance problem with basic on 11.11

Thanks for the stats - a very interesting system file. I am going to try some of these on my test system.

Remember that read calls apply to sockets as well as disk files. Have you checked the network activity and tuning?

sar -v and sar -c would show more info. The buffer activity is nice and high compared with the disk i/o.

The only other thing I can think of is NFS. But you would have checked that already, I think.



Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Hi,
some of the tuning parameters are derived from the "NFS tuning Guide".


This system also hosts:
DTC printing
SAMBA shares


I think I already did a pre-tune for the NFS parts. I see a throughput of about 300Kb/sec over a 100Mb link, so I think the network is not the bottleneck.
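
As a rough sanity check of that throughput figure, here is a small Python sketch. It assumes "300Kb/sec" means kilobytes per second (the more pessimistic reading) and "100Mb" means a 100 Mbit/s link; both readings are assumptions, not stated in the post.

```python
LINK_MBIT_PER_S = 100        # "100Mb" link, assumed to be 100 Mbit/s
OBSERVED_KBYTE_PER_S = 300   # assumption: "300Kb/sec" is kilobytes per second


def link_utilization(kbyte_per_s, link_mbit_per_s):
    """Fraction of the link's raw bandwidth used by the observed throughput."""
    bits_per_s = kbyte_per_s * 1000 * 8
    return bits_per_s / (link_mbit_per_s * 1_000_000)


utilization = link_utilization(OBSERVED_KBYTE_PER_S, LINK_MBIT_PER_S)
print(f"link utilization: {utilization:.1%}")
```

Even under the pessimistic reading the link is only a few percent busy. If the figure was in kilobits, the utilization is eight times lower still.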

sar -v shows nothing unusual: the cache is used and there are always enough free file identifiers.


sar -c .... I will check....


Massimo

Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Another set of stats, from today's workload....

Massimo

Steven E. Protter
Exalted Contributor

Re: performance problem with basic on 11.11

vxinodes are a frequent troublemaker on new systems. The last new 11i system I did had horrendous swagent/sd-ux performance until I installed all the patches for that.

Here is a doc from a performance guru. You may have seen it. If you go through it carefully you may find something. Every system I went through with this doc had something wrong with it.

http://www2.itrc.hp.com/service/cki/search.do?category=c0&docType=Security&docType=Patch&docType=EngineerNotes&docType=BugReports&docType=Hardware&docType=ReferenceMaterials&docType=ThirdParty&searchString=UPERFKBAN00000726&search.y=8&search.x=28&mode=id&admit=-1335382922+1068046301234+28353475&searchCrit=allwords

Since you are European, you may have to track this one down by document id.

Introduction to Performance Tuning
DocId: UPERFKBAN00000726 Updated: 20031008

If you can't find it, I guess I can cut and paste it in, but I'm always hesitant because I don't know which docs HP doesn't like cut and pasted.

There's always email. Let me know.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Thanks SEP,
but I already used that document.


Indeed, vx_ninode is a nasty parameter. I'm trying to reduce it, but I'm also looking for something else, something very, very big.

The users complain that the older system (11.0) was much faster.

The new one has faster disks, faster CPUs, a faster bus.... but runs slower!

In my mind there are two possibilities: either some kernel parameter is very wrong, or it is an application issue.

For the application I can do nothing.. but I cannot find anything evident in the system either!

Massimo


Steven E. Protter
Exalted Contributor

Re: performance problem with basic on 11.11

I'd compare the kmtune output against a system that is working right and is as similar as possible.

Then I'd make adjustments based on anything that jumps out at you.
dbc_max_pct 6
dbc_min_pct 2

The defaults are 5 (dbc_min_pct) and 50 (dbc_max_pct). Not good.

If they are too far apart, you will have some very expensive (in CPU cycles) slowdowns on your system while it's changing those figures.

That should have been in my first post, so zero point this. Please.

SEP
A. Clay Stephenson
Acclaimed Contributor

Re: performance problem with basic on 11.11

The high context switch rate could be caused by a low timeslice setting. Do a kmtune and post that output. If the "old" box was running faster then do a kmtune on it and compare them. Also setting dbc_max_pct to 15 is meaningless without knowing how much memory is in the box. How much memory do you have? Your buffer cache could still be too large.
If it ain't broke, I can fix that.
Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Memory is 4G.

Information from the previous system is difficult to obtain: it is now powered off, and I do not know whether it can be switched on again without some care. The customer is not very experienced, and I'm not on site.

Initially we left dbc_max_pct at 50 because this server is also a file server (SAMBA), and the NFS tuning docs suggested a value of 50 as a starting point.


Yesterday I monitored the system with glance, and the buffer cache was fully used, up to 2G.

Since I also considered that too much, I lowered it to 15% (600Mb), a trade-off between the kernel recommendation and a little extra for the NFS server.


Today I checked the "sar -b" output and the buffer read hit ratio is still good, around 99%, so I think it can be lowered again.
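
For reference, the buffer cache ceilings being discussed can be worked out with a few lines of Python. This is only a sketch: it assumes the 4G of memory mentioned in this post and treats dbc_max_pct as a plain percentage of physical memory. The helper name is made up, and the 6% value is the one suggested earlier in the thread.

```python
PHYSICAL_MEMORY_MB = 4 * 1024  # "Memory is 4G"


def buffer_cache_ceiling_mb(dbc_max_pct, memory_mb=PHYSICAL_MEMORY_MB):
    """Upper bound of the dynamic buffer cache, in MB, for a given dbc_max_pct."""
    return memory_mb * dbc_max_pct / 100


# 50 = default, 15 = current setting, 6 = value suggested earlier in the thread.
for pct in (50, 15, 6):
    print(f"dbc_max_pct={pct:>2}: up to {buffer_cache_ceiling_mb(pct):.0f} MB")
```

The 50% case matches the 2G seen in glance, and 15% comes out slightly above the 600Mb quoted above.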

The older system was an N, with 2 450MHz CPUs; the newer system is an rpXX with 2 750MHz CPUs. I'm not sure about these values: they come from a conversation with the customer and I didn't check them.

timeslice is at the default (10), and

attached is some more info from a peak moment.

Thanks guys :)

Massimo

Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Hi,
I can confirm the CPU speeds:

OLD SERVER: 2*450MHz

NEW SERVER: 2*650MHz


Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

In case anyone is interested: one of the issues looks related to the SCSI I/O subsystem, since at some peak moments the average service times are excessive.

I plan to install PHKL_29047 for the SCSI I/O subsystem. I also find the patch set PHKL_25995, PHKL_25994, PHKL_25993, PHKL_28983 very suitable, to enable fast file descriptors, as well as
PHKL_29110 for vxfs performance, since there is the FS with 800000 files.

Any feedback on these patches?



Massimo


Hercaud
New Member

Re: performance problem with basic on 11.11

Hello,

I see you used tusc for this...
I was looking for precompiled binaries of tusc7.5.
The sysadmin sites do have it, but they point to binaries for HP-UX 11.00 PA or 11i Itanium (11.22). There is no binary there for HP-UX 11.11 :-((

Thanks for any good hint on that.
steven pan
Occasional Advisor

Re: performance problem with basic on 11.11

Hi Steven,

I have some problems with ninode and vx_ninode. I think you can help me: could you find this document and mail it to me?

Introduction to Performance Tuning
DocId: UPERFKBAN00000726 Updated: 20031008

thanx a lot

panlm at vip.sina.com
StevenPan
Bill Hassell
Honored Contributor

Re: performance problem with basic on 11.11

The high system overhead is due to a huge number of system calls from the application(s). That seems to be confirmed with the tusc analysis. My guess is that it may be opening and closing files in a very inefficient manner. Confirm this with sar -a 1 50. The 3 values are:

iget/s - the number of times per second that the system had to go to the disk and get inode information.

namei/s - the number of name-to-inode translations per second. The directory contains a name and the inode number for the start of that file. Opening lots of existing files will require the name to inode translation which then points to the physical disk location.

dirbk/s - the number of directory blocks scanned per second. When looking for a file in a directory, the filesystem code (kernel) must look through all the entries to find a specific file. A massively large directory (flat) will generate huge overhead in scanning for the name.

On a quiet system:

HP-UX yoda B.11.11 U 9000/831 07/15/04

11:17:49 iget/s namei/s dirbk/s
11:17:50 0 142 0
11:17:51 0 5 0
11:17:52 0 2 0

Average 0 49 0

(these numbers won't line up because of the proportional font and extra-space stripping that occurs in the forums). Here is a busy system (running the command: du -s /):

HP-UX yoda B.11.11 U 9000/831 07/15/04

11:31:21 iget/s namei/s dirbk/s
11:31:22 1787 2264 0
11:31:23 2577 2873 0
11:31:24 1676 1933 0

Average 2013 2355 0

In this case, files are not being opened so no directory blocks are used, just inode information.
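
If you capture sar -a output to a file, the samples can be summarized with a short script. Here is a minimal Python sketch using the busy-system samples above; the function names are made up for this illustration, and a plain mean over the sample lines may differ slightly from sar's own Average row.

```python
# Sample lines copied from the busy-system sar -a output above.
SAR_A_BUSY = """\
11:31:21 iget/s namei/s dirbk/s
11:31:22 1787 2264 0
11:31:23 2577 2873 0
11:31:24 1676 1933 0
"""


def parse_sar_a(text):
    """Return (time, iget, namei, dirbk) tuples; header and other lines are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # A sample line is a timestamp followed by three integer columns.
        if len(fields) == 4 and all(f.isdigit() for f in fields[1:]):
            samples.append((fields[0], int(fields[1]), int(fields[2]), int(fields[3])))
    return samples


def column_means(samples):
    """Plain arithmetic mean of the iget/s, namei/s and dirbk/s columns."""
    count = len(samples)
    return tuple(sum(s[i] for s in samples) / count for i in (1, 2, 3))


samples = parse_sar_a(SAR_A_BUSY)
iget, namei, dirbk = column_means(samples)
print(f"iget/s {iget:.0f}  namei/s {namei:.0f}  dirbk/s {dirbk:.0f}")
```

Sustained high iget/s and namei/s with the application running, compared against a quiet period, is the pattern to look for.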

sar -a will probably confirm a poor application design. If the filesystem has 800,000 files in a single directory, this is probably the first place to start fixing things. You can locate massively large (flat) directory structures with:

find /whatever -type d -exec ll -d {} \;|sort -rnk5 | head -10

The apps will have to be rewritten to find the files in a hierarchical directory structure. Most sysadmins agree that massively large flat filesystems are a nightmare to administer, and are often an attempt to create a database without using a real database management product.
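
Where Python is available, a similar search can be sketched as below. Note the difference from the find command above: that one ranks directories by the size of the directory file itself, while this sketch counts entries directly. The function name is made up for this illustration.

```python
import os


def largest_directories(root, top=10):
    """Return the `top` directories under `root` with the most entries,
    as (entry_count, path) tuples, largest first."""
    counts = []
    # os.walk does not follow symbolic links by default.
    for dirpath, dirnames, filenames in os.walk(root):
        counts.append((len(dirnames) + len(filenames), dirpath))
    counts.sort(reverse=True)
    return counts[:top]
```

Calling largest_directories("/whatever") returns the ten fullest directories. On a filesystem with 800,000 files the walk itself takes a while, so it is better run off-peak.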


Bill Hassell, sysadmin
Massimo Bianchi
Honored Contributor

Re: performance problem with basic on 11.11

Closing....
Shaun Branigan
New Member

Re: performance problem with basic on 11.11

I have just come across your problem. There is a solution to problems caused by large numbers of files in one directory. Thoroughbred uses a configuration file called IPLINPUT, in which you can set up the directories for access. Each directory can be set up to have subdirectories named after the first 3 letters of the files contained within them. Thoroughbred then locates the required file using this rule. This speeds up file searches and makes them more efficient.
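
The lookup rule described above can be illustrated with a few lines of Python. This is only a sketch of the idea (first 3 letters of the file name select the subdirectory); it is not Thoroughbred's actual code or IPLINPUT syntax, and the function name and example paths are hypothetical.

```python
import posixpath


def bucketed_path(base_dir, filename, prefix_len=3):
    """Place `filename` in a subdirectory of `base_dir` named after its first letters."""
    prefix = filename[:prefix_len]
    return posixpath.join(base_dir, prefix, filename)


# Hypothetical example: a file named CUSTMAST lands in the CUS subdirectory.
print(bucketed_path("/data", "CUSTMAST"))
```

Each lookup then scans one small subdirectory instead of one huge flat directory.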