Monitor disk/item=que

Chaim Budnick · ‎11-28-2004

During prime time, several disks showed an average queue length of almost 2.

Is this indicative of any problem?

The users are complaining about performance slowdown during prime time, and we are wondering if this is any indication.

The main application is DSM.

Thanks,

Chaim

Jan van den Ende · ‎11-28-2004

Chaim,

most probably: YES.

It means that an IO request is placed in a queue and has to wait there before it is its turn to be serviced.
And especially if this is "only" an IO to get relation info about where to find the record with the data requested (or even where to find info about where to find the data!), then the response times tend to deteriorate exponentially.
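
To put rough numbers on this, here is a minimal Python sketch. It assumes a textbook M/M/1 queue (random arrivals, a single server), which is only an approximation of a real disk or RAID set, and it assumes the reported queue length counts the request being serviced as well. The function names and the 10 ms service time are hypothetical, chosen for illustration only.

```python
def utilization_from_queue_length(avg_in_system):
    """Invert L = rho / (1 - rho) for an M/M/1 queue: rho = L / (1 + L)."""
    if avg_in_system < 0:
        raise ValueError("queue length cannot be negative")
    return avg_in_system / (1.0 + avg_in_system)


def response_time_ms(service_time_ms, utilization):
    """Mean M/M/1 response time (wait + service): S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    service_ms = 10.0  # hypothetical per-IO service time, not a measurement
    for queue_len in (0.5, 1.0, 2.0, 4.0):
        rho = utilization_from_queue_length(queue_len)
        print(f"queue {queue_len:>3}: utilization {rho:.0%}, "
              f"response {response_time_ms(service_ms, rho):.0f} ms")
```

Under these assumptions, an average queue of 2 corresponds to roughly 67% utilization and a response time about three times the bare service time. If one user request needs several such IOs in sequence, each of them pays that penalty.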

fwiw,

Cheers.

Have one on me.

Jan

Don't rust yours pelled jacker to fine doll missed aches.

Ian Miller. · ‎11-28-2004

It's an indication that there is always I/O queued for those disks. Whether or not that's a problem depends. Are there key files on those disks? What you need to know is the response time on those disks and whether it could be contributing to the perceived problem. For a performance problem you need to look at the whole picture: all of the resources involved in performing the operation that the users perceive as slow. When did the complaints start, and what has changed? Look for errors also.

____________________
Purely Personal Opinion

Chaim Budnick · ‎11-28-2004

Is this a hardware or software problem?

What can I do to try and diagnose this a little more sharply?

Chaim

Ian Miller. · ‎11-28-2004

Are there key files on those disks?

Do you have PSDC or another tool to see which files are particularly busy?

If not, the information from
show mem/cach=(topq,vol=diskname)
may be helpful.

____________________
Purely Personal Opinion

Wim Van den Wyngaert · ‎11-28-2004

DSM flushes its cache every 30 seconds (by default). During this interval you might have a queue. But if you have a queue length of 2 all the time, you should investigate (which process is doing the IO and when).

Wim

Ian Miller. · ‎11-28-2004

In what tool did you see this average, and over what period was it taken? It may be the effect of the flush every 30 seconds if the averaging period is longer than 30 seconds. The key question is whether this I/O queue affects system performance, hence my questions on key files and caching statistics.
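
To illustrate the averaging effect, here is a small Python sketch with made-up numbers: a hypothetical flush that queues 20 IOs for 3 seconds out of every 30, sampled once per second. The burst depth and duration are assumptions for illustration, not measured DSM behaviour.

```python
def sample_queue(burst_depth, burst_seconds, period_seconds, total_seconds):
    """One sample per second: burst_depth while the flush runs, 0 otherwise."""
    return [
        burst_depth if (t % period_seconds) < burst_seconds else 0
        for t in range(total_seconds)
    ]


# Hypothetical workload: 20 IOs queued for 3 s in every 30 s, watched for 5 min.
samples = sample_queue(burst_depth=20, burst_seconds=3,
                       period_seconds=30, total_seconds=300)

mean_queue = sum(samples) / len(samples)
peak_queue = max(samples)
idle_share = samples.count(0) / len(samples)

if __name__ == "__main__":
    print(f"mean {mean_queue:.1f}, peak {peak_queue}, idle {idle_share:.0%}")
```

In this made-up case the long-term average is 2 even though the disk has nothing queued 90% of the time, so the average alone does not distinguish a short periodic burst from a constantly busy disk.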

____________________
Purely Personal Opinion

Wim Van den Wyngaert · ‎11-28-2004

I have a cluster running DSM. I checked and found that during end of day processing the queue length is about 2 - 2.5. We flush every second. But we don't have performance problems because of that.

Wim

Hein van den Heuvel · ‎11-29-2004

Like Ian replied, be sure to look at the full picture. For example, if the system is near 100% CPU during prime time, then the IO queue is not a significant problem (assuming no application component 'spins' waiting for IO :-).

Are what you call disks simple disks (there's your problem :-), or virtual units on a multi-disk stripe/raid/mirror set? Very simplistically stated, it would be ok to have a queue length of up to 1 per physical disk per unit. For a 5 member raid-5, it should be reasonable to have an average queue of 3 or so.
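
As a back-of-the-envelope version of that rule of thumb, here is a minimal Python sketch. The function names and the default limit of 1 queued IO per physical member are hypothetical, just a way of writing the simplistic rule down; it ignores RAID-5 parity writes and uneven load across members.

```python
def queue_per_member(avg_queue_length, physical_members):
    """Average queue length of the unit divided over its physical disks."""
    if physical_members < 1:
        raise ValueError("a unit needs at least one physical disk")
    return avg_queue_length / physical_members


def within_rule_of_thumb(avg_queue_length, physical_members,
                         limit_per_member=1.0):
    """True if the unit stays at or below the assumed per-disk limit."""
    return queue_per_member(avg_queue_length, physical_members) <= limit_per_member


if __name__ == "__main__":
    # Same average queue of 2, once on a single disk, once on a 5-member set.
    for members in (1, 5):
        share = queue_per_member(2.0, members)
        ok = within_rule_of_thumb(2.0, members)
        print(f"{members} member(s): {share:.2f} per disk, within rule: {ok}")
```

An average queue of 2 is well inside the rule for a 5-member set but over it for a single simple disk.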

As always, much depends on specific application usage.
For example, the performance of a single application which does a read, a little processing, a read, a little more processing in a tight loop will be 99% defined by the disk, but will never have more than 1 IO in the queue for just that task. Add another activity on that disk and it will seem really slow.
On the other end of the spectrum, take a task like VMS backup, or perhaps this DSM flush. What if it spends some processing to determine several IOs to be done, issues all of those IOs and then waits for all of them to be done? There you would always see a queue, no matter how fast the disk, but it would not be a real problem at all. (Note: I am making up the DSM part; I have no understanding of its IO engine.)

hope this helps some,

Hein.

DICTU OpenVMS · ‎11-29-2004

A little off topic I think, but it may come in handy some day.

In case the disks are on an HSG80, keep in mind that the controller can be overloaded if you send it too many IOs. That way you can hang one of the four ports. The only way to get that port back is to reboot that controller (top or bottom). A possible cause is too high a DIOLM on an account, for example the backup account. You can check whether the HSG gets too many IOs with ana /sys:

$ ana /sys
sda> fc stdt /all

Then check QFseen. Any value higher than 0 means that the HSG received too many IOs at some point...

Good luck with the performance.

Wim Van den Wyngaert · ‎11-29-2004

But /all is only available on 7.3-1 and later.

Wim

Chaim Budnick · ‎11-29-2004

The disks in question are 5-member RAID5 sets.

Chaim
