
Hang-situations

 
Lianne Sorber
Occasional Advisor


Hi,

We have an Itanium HP-UX server with SAP (Oracle) running on it.
For the past couple of months we have regularly run into a hang situation: SAP hardly responds and it takes ages to log in to UNIX.

When we finally get logged in to UNIX, we run top and see that both processors are 95% idle.
But when we run sar -d 3 5, we see that the boot disk (c2t1d0 in vg00) is 100% busy. We think it is swapping.

Does anyone have any suggestions as to what might cause this behaviour? We do not think SAP is causing it: we had performance problems before that were caused by SAP, but then we had no trouble logging in to UNIX.

We are thinking the network interface card might be causing the problems, but we are not sure how to test this. I suppose we can try to log on at the console the next time we experience this "hanging".

I hope someone knows of a tool to find out what is wrong.
By the way, we checked the syslog but see no entries there at the time of the "hanging".

Best regards
Lianne
James R. Ferguson
Acclaimed Contributor

Re: Hang-situations

Hi Lianne:

You should examine the output of:

# swapinfo -tam

# vmstat

Page-outs in double-digits (>10) signify that paging (swapping) is becoming a problem.
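
As a rough sketch of that rule of thumb, the helper below reads vmstat output and prints only the samples whose po value is above a limit (10 by default). The function name flag_pageouts is made up for illustration; it finds the po column from the header line rather than assuming a fixed position, and it has not been tried on every HP-UX release.

```shell
# flag_pageouts (hypothetical helper): read vmstat output on stdin and print
# the samples whose "po" column exceeds a limit (default 10).
flag_pageouts() {
    awk -v limit="${1:-10}" '
        # Until the header with a "po" label is seen, look for that column.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "po") col = i
            next
        }
        # Data line: report it when the page-out rate is above the limit.
        # Repeated header lines are skipped because "po" is not numeric.
        $col ~ /^[0-9]+$/ && $col + 0 > limit + 0 { print "po=" $col ": " $0 }
    '
}
```

It could be used as: vmstat 5 60 | flag_pageouts 10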

Regards!

...JRF...
spex
Honored Contributor

Re: Hang-situations

Hi Lianne,

# vmstat 5 60
# sar -w 5 60
If po>0 or swpot/s>0, then you are swapping.

If you'd like to record system activity for later analysis, make use of the system activity report package. 'man 1m sa1' for more information.
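
If setting up the sa1 collector is more than you want, a simpler stopgap is to keep a timestamped vmstat log running so there is something to look at after the next incident. This is only a sketch: stamp_lines is a made-up helper name, the log path is a placeholder, and if vmstat buffers its output when writing to a pipe the timestamps will be approximate.

```shell
# stamp_lines (hypothetical helper): prefix every line read on stdin with
# the current date and time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example (the log path is a placeholder, pick a filesystem with free space):
#   vmstat 5 | stamp_lines >> /var/tmp/vmstat.log &
```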

Are console logins unresponsive when the problem occurs?

PCS
V. Nyga
Honored Contributor

Re: Hang-situations

Hi,

You should check for errors in your LAN statistics.
Sorry, but I only know how on 11.11:
landiag, then lan, then dis (display)
Then you see the errors.
Maybe NFS problems/errors?

Does any guru know how to do it on Itanium? (11.23, right?)

Volkmar
Bill Hassell
Honored Contributor

Re: Hang-situations

Very slow response time is quite different from a hang. In a hang condition, there is no response to any query, including at the console. Only a TC (transfer of control) reboot will clear it, and the resulting crash dump can identify the problem.

However, a severe slowdown is different and requires more tools than sar to determine the problem. It is certainly possible that a runaway process is consuming all memory and forcing a massive amount of page outs -- run vmstat to see the page out rate. A temporary failure of a DNS server will also cause massive delays although there won't be a lot of disk activity at all under this condition.
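
One way to look for such a runaway process, sketched here under the assumption that ps is run in XPG4 mode (UNIX95 set) so that the -o option is available: list every process with its virtual size and keep the largest ones. The helper name top_by_size is made up for illustration.

```shell
# top_by_size (hypothetical helper): sort "SIZE PID COMMAND" lines on stdin
# by SIZE, largest first, and keep the first N lines (default 10).
# A non-numeric header line sorts as 0 and so ends up at the bottom.
top_by_size() {
    sort -k1,1nr | head -n "${1:-10}"
}

# Example on HP-UX (setting UNIX95 for this one command enables ps -o):
#   UNIX95= ps -e -o vsz,pid,args | top_by_size 10
```

Running it both in a normal situation and during a slowdown gives a baseline to compare against.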


Bill Hassell, sysadmin
Lianne Sorber
Occasional Advisor

Re: Hang-situations

Thank you all for your quick replies.

We will have to wait until the next hang situation before we can try the commands you sent us (this might take weeks).
In the meantime we are looking at what the output is in a normal situation.

If we find out what the cause was, I will let you know.

Thank you.

Best regards
Lianne
Lianne Sorber
Occasional Advisor

Re: Hang-situations

It appears SAP was the cause after all. Whenever a user started a transaction with a large selection, a lot of memory was consumed by this user. This made SAP start to swap, making all other users wait for memory.

So we are now changing the SAP settings so that, hopefully, this cannot happen anymore.

Thank you for all your help.
Lianne Sorber
Occasional Advisor

Re: Hang-situations

The problem lies in SAP