
avque values are at 65,000 on some devices

 
Bruce Kirk
New Member


We are running an HP-UX system with an EMC array for disk storage.

Starting Sunday we have been seeing the avque consistently showing 65,000 for some devices.
We found a hardware problem being logged to the syslog file. This has been addressed, but we are still seeing some of the devices with an avque of 32,000 - 65,000.

Currently no hardware errors are being reported to the syslog, and the system is completing its I/O loads in time frames similar to those before the problem occurred.

My guess is that the problem is somewhere in the EMC and perhaps after the cache.

Both EMC and HP have been looking into the problem, but neither has been able to find a cause.

Anyone have any suggestions on how we might try and troubleshoot this problem?

Note: c9 and c11 are the EMC devices. Not all of the EMC devices are reporting a high avque value. Those that are reporting the high avque value are reporting it consistently.
$ sar -d 3 1

HP-UX prd007 B.11.00 A 9000/800 03/29/02

05:26:44  device    %busy     avque  r+w/s  blks/s  avwait  avserv
05:26:47  c1t6d0    14.00      0.50     20     182    4.98    6.71
          c2t6d0     6.67      0.50     11      71    5.30    5.72
          c9t1d4     1.67      0.50     23    1195    5.03    3.61
          c9t5d0    16.00      0.50    235    3765    5.15    0.67
          c9t5d1     4.33      0.50     44    2299    4.65    2.69
          c9t2d7     6.33  32767.50     76    1221    5.40    0.72
          c11t1d4    3.00  32005.48     29    1307    4.99    3.00
          c11t5d0   21.67  32767.50    274    4379    5.12    0.65
          c11t5d1    5.00      0.50     52    2576    4.76    2.69
          c11t2d7    5.33  32767.50     81    1291    4.94    0.74
          c11t3d0    0.33  65531.50      2      27    5.50    0.69
          c9t1d7    29.33  65522.50    298    8331    4.85    1.45
          c9t2d2    17.67  65528.50    182    2907    4.97    0.80
          c9t4d7     3.33  65521.50     33     533    5.09    0.74
          c9t1d2     0.33      0.50      2      27    3.79    2.33
          c11t1d7   34.00  65513.50    352    9264    5.03    1.37
          c11t2d2   15.00  65528.50    227    3643    5.05    0.76
          c11t4d7    2.00  65521.50     30     475    4.92    0.90
          c11t1d0    0.67      0.50      1      11    5.06    6.87
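
For anyone who wants to track which devices are affected over time, here is a minimal sketch that picks the suspect rows out of saved `sar -d` text. The high values cluster just under 32768 and 65536, and 32767 = 2^15 - 1 and 65535 = 2^16 - 1, so a 16-bit counter going wrong is one possible explanation. That is only a guess and nothing in this thread confirms it. The function names, the embedded sample (a few rows copied from the output above) and the cutoff of 1000 are all made up for illustration.

```python
# Sketch: list devices whose avque in `sar -d` output is implausibly high.
# Assumption (unconfirmed): bogus values come from a 16-bit counter, which is
# why they sit just under 2**15 and 2**16. The cutoff below is hypothetical.

SAR_OUTPUT = """\
05:26:44 device %busy avque r+w/s blks/s avwait avserv
05:26:47 c1t6d0 14.00 0.50 20 182 4.98 6.71
c9t2d7 6.33 32767.50 76 1221 5.40 0.72
c11t1d4 3.00 32005.48 29 1307 4.99 3.00
c11t3d0 0.33 65531.50 2 27 5.50 0.69
c9t1d2 0.33 0.50 2 27 3.79 2.33
"""

SUSPECT_THRESHOLD = 1000.0  # hypothetical cutoff, far above the normal 0.50


def parse_sar_d(text):
    """Return a list of (device, avque) tuples from `sar -d` text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # sar prints a timestamp only on the header and the first data row.
        if fields and fields[0].count(":") == 2:
            fields = fields[1:]
        # Data rows: device, %busy, avque, r+w/s, blks/s, avwait, avserv
        if len(fields) != 7 or fields[0] == "device":
            continue
        try:
            rows.append((fields[0], float(fields[2])))
        except ValueError:
            continue  # not a data row after all
    return rows


def suspect_devices(rows, threshold=SUSPECT_THRESHOLD):
    """Keep only the rows whose avque is at or above the threshold."""
    return [(dev, q) for dev, q in rows if q >= threshold]


if __name__ == "__main__":
    for dev, q in suspect_devices(parse_sar_d(SAR_OUTPUT)):
        print(f"{dev}: avque {q:.2f}")
```

Running it against successive samples would show whether the same devices stay stuck at the high value or whether the set changes.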
Steve Lewis
Honored Contributor

Re: avque values are at 65,000 on some devices

I got the same recently, on a failed disk. Even though the disk was replaced, the queue figures stayed at the peak value for weeks.

I think a reboot is the only way you will get rid of those high queue figures.

Paula J Frazer-Campbell
Honored Contributor

Re: avque values are at 65,000 on some devices

Bruce

What was the hardware problem? It appears to be related.

Try:

sar -d 1 100

and watch for any reduction in avque. If there is none, then:

As Steve suggested, try a server reboot, as figures this high are not right.

Watch the console during boot for any errors.


Paula

If you can spell SysAdmin then you is one - anon
Paula J Frazer-Campbell
Honored Contributor

Re: avque values are at 65,000 on some devices

Bruce.

When the server is down, reseat all cables and disks.
Do not just wiggle them; physically disconnect and replug them several times, as this will remove any oxidation on the contacts.

As all disks on each path are involved, I would guess that a primary cable/connection is the best place to start.


HTH
Paula
If you can spell SysAdmin then you is one - anon
Sanjay_6
Honored Contributor

Re: avque values are at 65,000 on some devices

Hi Bruce,

Have you tried the sar cumulative patch?

The patch is PHCO_25174 for 11.0:

http://us-support3.external.hp.com/wpsl/bin/doc.pl/screen=wpslDisplayPatch/sid=2afc16f819edae52c5?PACH_NAM=PHCO_25174&HW=s800&OS=11.00

Hope this helps.

Regds
Bruce Kirk
New Member

Re: avque values are at 65,000 on some devices

Thanks for everyone's suggestions.

We will try the various options and I will update the post with the results when we get them.
Bruce Kirk
New Member

Re: avque values are at 65,000 on some devices

Per HP the problem is with SAR and Glance Plus not correctly reporting the information. The HP lab is currently working on a bug fix.

P.S. I do not quite buy it, so I will have to wait and see.

Thanks for everyone's help.
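
One rough way to test the reporting-bug explanation against the numbers already posted: if roughly 65,000 requests really were queued on a device, each new request would have to wait for all of them to be serviced first. The sketch below does that arithmetic for the c9t4d7 row. It assumes a simple first-in, first-out queue served one request at a time, which is a simplification and not a description of how sar computes avque, and the function name is made up for illustration.

```python
# Rough sanity check under a simplifying assumption: a FIFO queue served one
# request at a time, so a new request waits about (queue length * avserv).


def implied_wait_ms(avque, avserv_ms):
    """Wait (ms) a request would see behind `avque` already-queued requests."""
    return avque * avserv_ms


# Values copied from the c9t4d7 row of the sar output in the original post.
avque, avwait_ms, avserv_ms = 65521.50, 5.09, 0.74

implied = implied_wait_ms(avque, avserv_ms)

if __name__ == "__main__":
    print(f"implied wait: {implied / 1000:.1f} s")
    print(f"reported avwait: {avwait_ms} ms")
```

Under that assumption the implied wait is about 48 seconds, against a reported avwait of about 5 ms. The two cannot both be right, which fits the idea that the avque column is what is being misreported.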
Roger Baptiste
Honored Contributor
Solution

Re: avque values are at 65,000 on some devices

hi,

I faced a similar problem in the past. sar was showing ridiculous values for two controllers which were connected to an EMC Symm, but it showed normal values for the rest of the controllers. After we seemingly hit a dead end, the "problem" got solved when we installed patches in response to some other issue. Also check that the num_tachyon_adapters kernel parameter is set to 10 (or 20).

I don't remember the patches, but if you make sure your system is patched current, they should be included.

HTH and yes it would be encouraging if you could assign points.

-raj

Take it easy.