
Re: System crawling

 
Nicky_5
Regular Advisor

System crawling

Hi,

One of our clients has an HP box which suddenly started crawling, and they could not perform any activity. They decided to reboot the machine.

Now they are asking me to investigate as to what could have been the reason for this.
I checked the system logs but it has only logs from after the reboot.

Can someone tell me how I can go about doing this?

I also want to monitor their machine to see which processes are using how much CPU, memory, paging space, etc. Can you tell me how I can do that?
The OS is HP-UX B.10.20.

On AIX I use commands like top, topas, lsps, etc.

I also checked the vmstat output, shown below. Can you tell me what is wrong?

vmstat 2 10
procs      memory         page                        faults           cpu
r    b  w    avm    free  re  at  pi  po  fr  de  sr   in    sy    cs  us  sy   id
0  144  0  46587  646119   4   1   0   0   0   0   0  767  1868   616  15   1   84
0  143  0  46216  646089   7   1   2   0   0   0   0  788  2943  1057   3   1   96
0  143  0  46216  646089   4   0   2   0   0   0   0  764  2306   782   3   0   97
1  142  0  46102  646087   2   0   1   0   0   0   0  781  3035  1018  27   1   72
1  142  0  46102  646087   0   0   1   0   0   0   0  808  3233  1034  12   3   85
1  142  0  46102  646087   0   0   0   0   0   0   0  776  2277   706  10   0   90
0  143  0  46022  646087   0   0   0   0   0   0   0  756  1879   562   5   1   94
0  143  0  46022  646086   0   0   0   0   0   0   0  781  2293   641  11   1   88
0  143  0  45167  646086   0   0   0   0   0   0   0  753  1672   449   0   0  100
0  143  0  45167  646070   0   0   0   0   0   0   0  740  1298   343  10   0   89
#
Pete Randall
Outstanding Contributor

Re: System crawling

At this point, about all you can do is check the old syslog file located at /var/adm/syslog/OLDsyslog.log. Performance issues like lockups are nearly impossible to troubleshoot after the fact.


Pete

Rick Garland
Honored Contributor

Re: System crawling

As to what may have been the issue, the reboot probably cleared it out so you may not find any clues. If you have PerfView you may find some historical data to look at.

As to tools to use, if you have Glance this can provide some realtime stats. Glance is a cost product.

PerfView is historical data collection and is also a cost product.

The tools available by default on the system include sar, vmstat, top, iostat, etc.
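Since those default tools only print to the terminal, one rough way to keep a record is to timestamp their output and append it to a log. This is only a sketch: the function name stamp_lines and the log path in the usage line are made up for illustration.

```shell
# Prefix every line read from stdin with the current date and time,
# so periodic vmstat/iostat samples can be matched to an incident later.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}
```

Usage might look like this (interval and path are arbitrary choices): vmstat 60 | stamp_lines >> /var/tmp/vmstat.log &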

There are some other tools available from the net that can provide monitoring - Big Brother, Nagios, etc. These tools require configuration to implement.
D Block 2
Respected Contributor

Re: System crawling

Nicky,

Run top and look at the resources, in particular the FreeMemory figure at the bottom of the screen.

I would also suggest running glance. Type glance as root and see whether you have access to this command.
Golf is a Good Walk Spoiled, Mark Twain.
Nicky_5
Regular Advisor

Re: System crawling

Hi,

I do not have glance on this machine.

The vmstat output shows that the system is not memory or CPU bound, but for some reason the number of processes in the wait queue (b) is unusually high...

Please advise.
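To put rough numbers on that, here is a small sketch that averages the r, b, us, sy and id columns of saved vmstat output. It assumes the 18-column layout shown in my first post (r and b in columns 1 and 2, us/sy/id in columns 16 to 18); the function name vmstat_summary is just something I made up.

```shell
# Average selected vmstat columns read from stdin.
# Assumes the 18-column layout posted earlier in this thread.
vmstat_summary() {
    awk '
        # Data rows start with a number and have 18 fields; headers are skipped.
        $1 ~ /^[0-9]+$/ && NF == 18 {
            n++; r += $1; b += $2; us += $16; sy += $17; id += $18
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d avg_r=%.1f avg_b=%.1f avg_us=%.1f avg_sy=%.1f avg_id=%.1f\n",
                   n, r/n, b/n, us/n, sy/n, id/n
        }'
}
```

Run as: vmstat 2 10 | vmstat_summary. On the ten samples posted earlier this works out to an average b of about 142.8 and an average idle of about 89.5.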
D Block 2
Respected Contributor

Re: System crawling

Nicky,

You can use "sar" to look at your run queue.

# sar -q 2 30

Is your runq-sz greater than 3? Greater than 16?
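If you would rather not eyeball 30 samples, a sketch like the following prints only the samples above a chosen limit. It assumes each data line of sar -q starts with an HH:MM:SS timestamp followed by runq-sz; check that against the output on your box. The function name runq_over is made up for this example.

```shell
# Print the time and runq-sz of every sar -q sample above a limit.
# Usage: sar -q 2 30 | runq_over 3
runq_over() {
    awk -v limit="${1:-3}" '
        # Keep only timestamped rows whose second field is numeric,
        # which skips the banner, the column header and the Average line.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9.]+$/ && $2 + 0 > limit + 0 {
            print $1, $2
        }'
}
```

The limit defaults to 3 when no argument is given; pass 16 to check the higher figure.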
Golf is a Good Walk Spoiled, Mark Twain.
Marvin Strong
Honored Contributor

Re: System crawling

The problem is that top and vmstat are really only useful while the problem is occurring. Once you have rebooted and the performance issue is gone, running these commands isn't going to help you figure out why there was a problem, unless it recurs.

sar can probably help you if you were collecting sar data. PerfView, if you have it and know how to use it, could also be useful.

Outside of that, as stated above, the old syslog might have some information in it, for example if you were hitting some kernel parameter limit.


Rick Garland
Honored Contributor

Re: System crawling

Set up sar collection on the machine. You will be able to collect stats on CPU, memory, swap, buffer, I/O, etc.

You can archive the data as well, so you will have the stats ready when the problem occurs again.
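As a rough sketch of what that setup could look like, the function below just prints two crontab entries for you to review. It assumes the sa1 and sa2 collection scripts live in /usr/lbin/sa, which is their usual HP-UX location; verify that on the box first. The function name sar_cron_entries and the schedule are arbitrary choices for illustration.

```shell
# Print crontab entries for periodic sar collection.
# Assumption: sa1/sa2 are in /usr/lbin/sa; pass another directory as $1 if not.
sar_cron_entries() {
    sa_dir=${1:-/usr/lbin/sa}
    cat <<EOF
# Sample system activity every 20 minutes, around the clock
0,20,40 * * * * $sa_dir/sa1
# Write a daily summary report shortly before midnight
55 23 * * * $sa_dir/sa2 -A
EOF
}
```

Run sar_cron_entries, check the lines, then add them to root's crontab with crontab -e. The data directory (normally /var/adm/sa) may need to be created first. Once data is being collected, a given day can be read back with sar -f, for example sar -q -f /var/adm/sa/sa15 for the 15th.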