
SOLVED
John Russell_13
New Member

System slow

Our VMS guy is on leave. I'm a Windows guy and have been left with the task of watching over our Alpha for a few days. Today the system is crawling. We have 3 CPUs and 2 GB of RAM. I'm attaching a file with some screen scrapes. Can anyone tell me what's going on here? Any assistance would be greatly appreciated.

Thanks in advance.

John
20 REPLIES
Ian Miller.
Honored Contributor
Solution

Re: System slow

Your system appears a bit short of free memory, but as the paging rate is not excessive I expect that's not the problem. I wonder if you have a bottleneck on one disk. Could you post the result of
MONITOR DISK/ITEM=Q/INT=1/VIEW=10

I wonder if some disk will show an excessive queue length.

Can you post more detail on your disk subsystem: SCSI or FC, RAID or not, and so on?
____________________
Purely Personal Opinion
Robert Gezelter
Honored Contributor

Re: System slow

John,

Please also do the following commands and post the output:

$ SHOW SYSTEM
$ SHO QUEUE/BATCH/ALL

One possibility I would like to eliminate is that you have a batch queue running at an interactive priority, with a large, resource-hungry batch job competing for resources.

- Bob Gezelter, http://www.rlgsc.com
John Russell_13
New Member

Re: System slow

Hello,

Thanks for your replies. It's late in the day and everything is back to normal at this point. But here is the info you both asked for.

Two HSG80s running dual redundant fibre paths. We are letting VMS do the shadowing. No RAID whatsoever. I'm attaching a .txt file as well.

Thanks,

John
John Russell_13
New Member

Re: System slow

Also, Ian, what would be considered an excessive queue length? I will check this tomorrow during peak time, but I have no idea how this would be measured. I do know that one of our shadow sets is accessed quite frequently by many end users.
Travis Craig
Frequent Advisor

Re: System slow

John,

I haven't dealt with disk queue length much, but I ran a test on my machine that is pretty heavily loaded, with a disk I/O rate of 50-100, and the average queue length is well under 1. I suspect that an average of 2 or 3 would mean at least one process is quite disk bound, and other users of the same disk will run slowly. I don't know whether that effect would apply to the whole affected controller or not.

I notice you have that one job that has consumed most of a CPU and has done a lot of I/O's, but it has done them over a long period of time (12-18 days). That would be between 700000 and 1000000 per day. That seems like quite a few if they are all disk I/O's, so it might be hogging quite a lot of one disk's throughput. Whether that would slow down everyone would depend on whether they are using the same disk or controller. I guess the process's CPU use isn't a problem because you have 3 CPU's. I assume its priority is not a problem, for the same reason.
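To put those per-day figures on the same footing as the per-second rates that MONITOR displays, the conversion can be sketched as below. This is only an illustration: the function name is made up for the example, and it assumes the I/Os are spread evenly over the day, which a real workload almost certainly is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_ios_per_second(ios_per_day: float) -> float:
    """Average I/O rate, ASSUMING the I/Os are spread evenly over 24 hours."""
    return ios_per_day / SECONDS_PER_DAY


# The two per-day estimates quoted in the post above.
for ios_per_day in (700_000, 1_000_000):
    rate = average_ios_per_second(ios_per_day)
    print(f"{ios_per_day:>9} I/Os per day is about {rate:.1f} per second")
```

On that assumption the job averages roughly 8 to 12 I/Os per second. That long-run average looks modest next to the 50-100 rate mentioned above, so if the job is hurting one disk it would more likely be through bursts than through its average load.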

I don't see anything else in your outputs that stands out to me.

--Travis Craig
My head is cold.
Ian Miller.
Honored Contributor

Re: System slow

Shadow set DATA4 has an average queue length of 4 in the displays you posted. This is worth investigating. The other disks have a minimal queue length. Either a lot of I/O is directed to that disk and it is overloaded, or there is a problem and it is not responding as fast as the other disks.

$ MONITOR DISK/ITEM=OP

will tell you the operation rate to each disk.
If it is found that a lot of I/O is going to DSA4, then
$ SHOW DEVICES/FILES DSA4
will tell you which files are open on that disk. Talk to the application people about what the files are.
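One way to tell "overloaded" from "responding slowly" once the operation rate is known is Little's law (average queue length = operation rate x average response time). The sketch below is a rough illustration only: the function name and the two operation rates are hypothetical, not values from the posted displays, and it assumes the queue length counts the request being serviced as well as those waiting.

```python
def implied_response_time_ms(avg_queue_length: float, ops_per_second: float) -> float:
    """Little's law rearranged: response time = queue length / throughput."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    return avg_queue_length / ops_per_second * 1000.0


# Queue length of 4 as reported for the shadow set; the rates are made up.
for ops in (400.0, 40.0):  # hypothetical operation rates, in I/Os per second
    ms = implied_response_time_ms(4.0, ops)
    print(f"queue 4 at {ops:.0f} ops/s implies about {ms:.0f} ms per I/O")
```

A high rate with a short implied response time points toward sheer load on that disk, while a low rate with a long implied response time points toward the disk or its path responding slowly.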

Has the workload changed recently?
____________________
Purely Personal Opinion
Petr Spisek
Regular Advisor

Re: System slow

Hi,
A queue length that is permanently at 4 is not good for performance. Try to find the hot files and move them to separate disks.
What do the interrupts look like on your system? (MONITOR MODE)

Petr
Robert Gezelter
Honored Contributor

Re: System slow

John,

Ok, first things first. The queue length of 4 is a potential problem, if it persists for an extended period. If it is a momentary thing, it is not as much of a problem.

My curiosity is piqued by JOB0f UZ02DRV, however. In the 18 days since the system was last booted, it has accumulated 12 days of CPU time (translation: one of the CPUs has effectively been 66% occupied by this job since bootstrap -- presuming that the job started at boot time -- and if it was started later, it is more suspicious).
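The arithmetic behind that estimate can be sketched as follows. The function name is made up for the example, and it assumes, as stated above, that the job has been running since boot; the 3-CPU figure comes from the original post.

```python
def cpu_share(cpu_days: float, elapsed_days: float, cpu_count: int = 1) -> float:
    """Fraction of the available CPU time that a job has consumed."""
    return cpu_days / (elapsed_days * cpu_count)


one_cpu = cpu_share(12, 18)                   # share of a single CPU
whole_box = cpu_share(12, 18, cpu_count=3)    # share of all three CPUs
print(f"share of one CPU: {one_cpu:.0%}")
print(f"share of the whole 3-CPU system: {whole_box:.0%}")
```

That works out to about two thirds of one CPU, or a little over a fifth of the machine's total CPU capacity. If the job started later than boot time, the elapsed figure shrinks and both shares go up.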

Working from here, it is hard to diagnose, but I wonder what that job is doing, and would suggest checking whether the I/O is originating with that job.

I hope that the above is helpful.

- Bob Gezelter, http://www.rlgsc.com
Jeff Chisholm
Valued Contributor

Re: System slow

Is this the appropriate time to bring up our System Performance Analysis offer? In exchange for some minor $$'s we will locate and tell you just how to work around performance bottlenecks. See the details on the web page. Regards, Jeff

http://www.hp.com/hps/perevent/valupack/openvmssysadmin/vp002.html
le plus ca change...