Operating System - HP-UX

Carlo Montanari
Advisor

bad i/o performance (application problem)

Hi all. I have an L2000 with HP-UX 11.0 which is running an absurd, home-made (_not by me_) application, written in shell script, using a pseudo-database built on many text files.
Every minute it runs from crontab and updates the - ehm - database, which means creating, deleting, sorting, compressing and uncompressing these thousands of little files, and of course forking like a rabbit. Strangely enough, it sometimes has performance problems, and the disk containing the "db" is 100% busy with a long I/O queue.
Since I cannot remove the application from the server, can anybody suggest a workaround I could apply at the system level, like kernel parameter tuning (I'm thinking of the DNLC, inode cache, etc.) or similar?
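For reference, this is roughly how I'm checking the current values (just a sketch - I believe ncsize is the DNLC size and ninode the inode cache, please correct me if the parameter names are wrong):

# query the kernel tunables I suspect are relevant
kmtune -q ncsize      # directory name lookup cache (DNLC) size
kmtune -q ninode      # inode cache size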

TIA, Carlo
Thierry Poels_1
Honored Contributor

Re: bad i/o performance (application problem)

hi,

you can try to increase the buffer cache (dbc_min_pct & dbc_max_pct), which might relieve the disks a bit.

Another possibility is that all memory is consumed and that the server is swapping like hell, so check swapinfo during peak time.
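Something like this during the peak minute should tell you quickly whether memory is the real problem (rough sketch):

# swap/memory usage in MB, with totals
swapinfo -tam
# current buffer cache limits, as a percentage of RAM
kmtune -q dbc_min_pct
kmtune -q dbc_max_pct
# paging activity over 20 seconds
vmstat 2 10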

good luck,
Thierry.
All unix flavours are exactly the same . . . . . . . . . . for end users anyway.
Alzhy
Honored Contributor

Re: bad i/o performance (application problem)

How did you ascertain it is an I/O problem? I've seen many such "applications" that use flat files, pseudo-ISAM files or UNIX db files - to great success. Simple but effective apps, I would call them.

The key with such applications is tuning your system, and you already thought of some of it. The filesystem buffer cache is one area that most of the time fixes the problem -- study your "sar -b" output over time.
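A rough way to watch it (the read and write cache hit ratios, %rcache and %wcache, are what you want high - roughly 90+ and 70+):

# buffer cache activity, 20 samples of 5 seconds
sar -b 5 20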
Hakuna Matata.
David Child_1
Honored Contributor

Re: bad i/o performance (application problem)

I've had a problem in the past with a similar setup. The application in my case would create thousands of small log files which would later be deleted. After a while it really started to run poorly.

Even a simple 'ls -l' in the directory was very slow.

I then defragmented the directory (fsadm -F vxfs -d ). After it finished, the 'ls -l' returned much more quickly and the application responded much better.
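Roughly what I ran, for reference (the mount point below is only a placeholder - run the report first to see whether the directories really are fragmented):

# report directory fragmentation on the filesystem holding the files
fsadm -F vxfs -D /placeholder_mountpoint
# reorganize (defragment) the directories
fsadm -F vxfs -d /placeholder_mountpoint
# -E / -e do the same report/reorganize for file extents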

David
RAC_1
Honored Contributor

Re: bad i/o performance (application problem)

Do you have glance installed?
Open it up: what is hitting high - CPU, memory, disk, network?

Identifying a bottleneck is a big exercise. Based on what you have given:
How much RAM is in this machine? What are the buffer cache settings?

Which FS is being accessed by this crontab entry? glance -i will give those details.

If this crontab entry is accessing a particular FS and that FS is hitting 100% I/O, then try distributing the files onto different filesystems and make the appropriate changes in your script.

Also, depending on what your buffer cache settings are, the buffer cache may be increased.
Please post sar -b 2 20 output.
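To see which device and filesystem are actually being hammered, something like this (sketch):

# per-disk utilization and queue length, 20 samples of 2 seconds
sar -d 2 20
# map the busy device back to a mounted filesystem
bdf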

Anil
There is no substitute to HARDWORK
Jeff Schussele
Honored Contributor

Re: bad i/o performance (application problem)

Hi Carlo,

I'd also check whether a one-minute cron interval is the proper timing.
Could it be that jobs are not completing within that minute and are "backing up", causing contention issues?
Watch the cron queue using /var/adm/cron/log to see just when they're starting and ending. Each entry pair will be denoted by a unique PID - just grep for it.
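Something along these lines (sketch - the PID below is only an example; the log shows a '>' line when a job starts and a '<' line when it ends):

# watch cron jobs start and finish in real time
tail -f /var/adm/cron/log
# afterwards, pull both entries for one job by its PID (example PID)
grep 12345 /var/adm/cron/log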

HTH,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Bill Hassell
Honored Contributor

Re: bad i/o performance (application problem)

A huge amount of the performance hit is caused by the directory operations (opening, extending and closing files). These tasks affect every other process on the system and create very high system overhead. If cost is no object, move the data files to a RAM disk appliance, or at least a SAN with lots of disk cache. Another option is to simply move the database and scripts to a dedicated machine. Otherwise, get a lot more memory and set dbc_max_pct so the buffer cache works out to roughly 800 to 1500 MB. There's not a lot of magic to 'fix' a badly designed process.
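As a rough illustration of the arithmetic (dbc_max_pct is a percentage of physical RAM, so the right number depends on how much memory is installed - the 4 GB below is just an example):

# with 4 GB of RAM, a ~1 GB buffer cache ceiling is about 25%
kmtune -s dbc_max_pct=25
kmtune -s dbc_min_pct=10
# a kernel rebuild (mk_kernel) and reboot are needed before this takes effect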


Bill Hassell, sysadmin
Carlo Montanari
Advisor

Re: bad i/o performance (application problem)

Thank you all.
I have Glance on the server, so I can provide any info about the problem. The system has plenty of free CPU and memory; it is mainly waiting for I/O on a single filesystem (the one holding the db), and I cannot spread it around since I have no free disks.
The "db" is made up of thousands of *very* little files, so the size of the buffer cache (1 GB, BTW) does not seem to me to be the key, since the total amount of data moved is quite small.
Looking at the system call rates in Glance, I see that the system is spending much of its time in open(), so what Bill says about directory operations is very likely.
Using "sar -a" I also see a high rate of iget, namei and dirbk.
Maybe I'll try reorganizing the directories as David suggests.
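(For reference, this is the sar invocation I'm looking at - iget/s, namei/s and dirbk/s are the file access rates:)

# file access system call activity, 20 samples of 2 seconds
sar -a 2 20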


Mike Stroyan
Honored Contributor

Re: bad i/o performance (application problem)

It does sound like most of the time is spent accessing the data files. You could take a very detailed look at that using the tusc system call tracer, with the -T option to record times.
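A sketch of what that could look like (tusc is a separate download, not part of the base OS, the script path is only a placeholder, and the exact -T format argument is described in the tusc man page):

# trace one run of the update script, following forked children,
# with timestamps so slow open()/unlink()/stat() calls stand out
tusc -f -T '' -o /tmp/update.tusc /path/to/update_script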
Looking in a different direction, you can reduce the startup time of short-lived programs by using the fastbind command. It records the shared libraries and symbol resolutions for a program so that dld.sl does not need to look them up every time the program is started. fastbind needs to be rerun whenever you update a shared library; otherwise it quietly falls back to the old way of looking up symbols.
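Basic usage is along these lines (the program path is only a placeholder, and it only helps executables linked against shared libraries):

# store the symbol resolution information in the executable itself
fastbind /opt/app/bin/some_program
# rerun it after any of the shared libraries it uses are updated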