Operating System - HP-UX

HP-UX 11i Disk I/O rate decreases over days

 
Michael Resnick
Advisor


We have an HP9000/800/S16K-A with a VA7100 storage array with 15 disks. (I don't know the complete config.) We have two controllers into the VA, but only one is active; the other is a fail-over.

Here's the problem... We get good disk IO throughput immediately after a system reboot. However, after a few days that rate drops. By five or six days the rate is nearly half. After two weeks it is horrible and we must reboot.

This problem is fairly new to the system. We've had the system for 3+ years. About 5 months ago it was moved (via commercial mover) to another data center. There were major issues bringing it back on-line, and several parts were replaced.

We've since replaced both controllers and upgraded the firmware. Still, we continue to have the problem.

Anyone have any ideas on what to look for?
5 REPLIES
OFC_EDM
Respected Contributor

Re: HP-UX 11i Disk I/O rate decreases over days

Did that commercial mover specialize in moving data centre equipment? I'm concerned about how it was moved. If it was treated poorly who knows what's happened to the hardware.

Have you had HP in to do a thorough check of the hardware? As well as do a cross-check of hardware/software compatibility for the new controllers? If you have a support contract you may have some of those support days they throw in for such things....may cost you nothing to have them come in and triple check it for you.

The Devil is in the detail.
Hein van den Heuvel
Honored Contributor

Re: HP-UX 11i Disk I/O rate decreases over days


>> Anyone have any ideas on what to look for?


Do NOT (start to) look at the storage (Hardware)!

The observed (how?) disk IO RATE decrease could be a cause, or an effect.

My money is on it being an effect.

A system reboot is rather unlikely to heal your storage (hardware), is it now?

Methinks the system is asking for fewer IOs, either
1) because there is less to do (caches filled)
or
2) because it burns too much time trying to figure out what IO to do.

So...

1) Is the application performance (response time? throughput?) getting worse over time, and better with reboot?
- Can you restart the application instead of the whole box, with a similar healing effect?

2) Are there other performance attributes changing?
- Higher (system) CPU usage?
- Network counters?
- Free Memory?
- Disk IO response time?

What is the application on the box?
What do the performance indicators of the application tell you?

Hope this helps some
Regards,
Hein van den Heuvel ( at gmail dot com )
HvdH Performance Consulting
Michael Resnick
Advisor

Re: HP-UX 11i Disk I/O rate decreases over days

Yes, the movers "specialized" in data center moves. This server was only one of several in the data center that were moved.

I did try stopping the application but did not notice a change.

We're using a crude test: a perl program creates a temp file on each of several mount points and measures the time it takes to create the file. If it creates a 468 MB file in 10 seconds, we use a 46 MB/sec rate.

Shortly after a reboot, we'd see values of 234 MB/sec and 156 MB/sec on a couple of mount points. Currently, I see 52 MB/sec and 66 MB/sec on those same mount points. There is no backup or other disk-intensive process running that I can see. The above rates will get worse over the next few days.

Is there some better way to measure what's going on? Glance shows currently 8% disk util.
Michael Resnick
Advisor

Re: HP-UX 11i Disk I/O rate decreases over days

I should also note that the system has a 3rd-party hardware support contract, not HP. :(

That vendor was responsible for parts replacement following the move and for firmware upgrades following a review of current devices.
Hein van den Heuvel
Honored Contributor

Re: HP-UX 11i Disk I/O rate decreases over days

So are you noticing a significant application slowdown as the days go by? Can that be quantified in application performance terms?

>> We're using a crude test: a perl program creates a temp file on each of several mount points and measures the time it takes to create the file. If it creates a 468 MB file in 10 seconds, we use a 46 MB/sec rate.
How big is dbc_max.. in megabytes?

You use a good sized file, but it will likely still fit in memory.
Are you using SYNC in the test, or on a command line (man 1M sync), and including that time?
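
To make the SYNC point concrete, here is a minimal sketch in Python (not the original perl program, which is not shown in this thread). It times a file write and, optionally, includes the time needed to push the data out of the buffer cache with `os.fsync`. The function name `timed_write` and its parameters are made up for illustration.

```python
import os
import tempfile
import time

def timed_write(directory, size_mb=16, use_fsync=True, block_kb=1024):
    """Write size_mb of zeros to a temp file in `directory`; return MB/sec.

    With use_fsync=False the timer may stop while much of the data is
    still sitting in the buffer cache, so the rate can look far better
    than what the disks actually delivered.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()                    # empty Python's own buffer
            if use_fsync:
                os.fsync(f.fileno())     # wait until the OS has written the data
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)                  # always clean up the test file
    return size_mb / max(elapsed, 1e-9)
```

Running it on the same mount point with `use_fsync=True` and then `use_fsync=False` gives a rough idea of how much of the measured rate is cache rather than disk. A file much smaller than the buffer cache will mostly measure memory in the second case.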

Does the CPU usage for the perl test change over time?

Could some free-space fragmentation play a role?

There are more robust disk performance tools. Google will gladly find some.

You might want to experiment with 'dd' instead of perl to focus more on IO, less on buffering, and get free timing data.

( Mind you, the perl IO is not a bad plan... if your application is written in perl and is all about creating 400 MB files. It is not, is it now?! :-)

So how about trying to isolate an application chunk and replay and measure the performance of that on the hour every hour?!
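
A rough sketch of what such an hourly measurement could look like, assuming Python is available on the box. The names `time_task`, `record_sample` and `read_samples` are hypothetical, and the replayed application chunk is whatever callable you pass in. Scheduling is left to cron.

```python
import csv
import time

def time_task(task):
    """Run the callable `task` once and return its elapsed wall time in seconds."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start

def record_sample(log_path, label, seconds, timestamp=None):
    """Append one measurement as a CSV row: epoch seconds, label, elapsed seconds."""
    if timestamp is None:
        timestamp = time.time()
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([int(timestamp), label, "%.3f" % seconds])

def read_samples(log_path):
    """Read the log back as a list of (timestamp, label, seconds) tuples."""
    with open(log_path, newline="") as f:
        return [(int(ts), label, float(sec)) for ts, label, sec in csv.reader(f)]
```

With a week or two of rows in the log, the elapsed times can be lined up against days since reboot, which turns "it feels slower" into numbers in application terms.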

Performance evaluations are NOT easy.
Far from it. GIGO.

Hope this helps some more!
Hein.