Operating System - HP-UX

Strange Sar EMC STP issue

Vic S. Kelan
Regular Advisor

Strange Sar EMC STP issue

We have been noticing some storage performance issues on our server.

We started an investigation and enlisted an EMC tech. He decided to run STP on the Symmetrix while I ran sar at the same time, and we set both to run for 24 hours. My sar syntax was:

nohup sar -ud 300 288 > sarfilename &

It only ran for about two hours, from 14:20 to 16:10, in 5-minute increments.
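
As a sanity check on the numbers above, the planned and observed run lengths can be worked out from the sar arguments. This is only an arithmetic sketch; the variable names are made up for illustration.

```shell
interval=300    # seconds between samples (first sar argument)
count=288       # number of samples (second sar argument)

# Planned duration: interval * count
planned=$((interval * count))
echo "planned: ${planned}s = $((planned / 3600))h"

# Observed window from the post: 14:20 to 16:10
observed_min=$(( (16 * 60 + 10) - (14 * 60 + 20) ))
samples=$((observed_min * 60 / interval))
echo "observed: ${observed_min}min, about ${samples} samples"
```

So the command line itself asks for a full 24 hours; the early stop is not caused by a wrong interval or count.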

Stranger still, the EMC tech's tool also terminated after the same two-hour period, despite being set to run for 24 hours.

These boxes are separate physical machines, and we had no issues at all on the day the tools were run.

Any ideas???
Thanks!
5 REPLIES
Ninad_1
Honored Contributor

Re: Strange Sar EMC STP issue

Do you see any error messages in the nohup.out file?
Is sar still running?
ps -eaf | grep sar
Are there any messages in /var/adm/syslog/syslog.log or /var/adm/messages?

Is there any clue in the sar file as to CPU contention or high usage on the server?
Did sar terminate, or did it continue logging later?
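
One way to make the "is it still running?" question easy to answer later is to record the PID when the job is launched and probe it with kill -0. The following is a minimal sketch, not what was done in this thread: the PID file path is hypothetical, and a short sleep stands in for the real sar command.

```shell
pidfile=/tmp/sar_demo.$$.pid    # hypothetical location for the PID file

# Stand-in for the long-running collector; on the real host this would be
# the sar command from the original post.
sleep 5 > /dev/null 2>&1 &
echo $! > "$pidfile"

# Later: check whether the recorded process still exists.
# kill -0 sends no signal; it only tests that the PID can be signalled.
pid=$(cat "$pidfile")
if kill -0 "$pid" 2>/dev/null; then
    status="running"
else
    status="gone"
fi
echo "collector $pid is $status"

# Clean up the demo process and PID file.
kill "$pid" 2>/dev/null
rm -f "$pidfile"
```

If the status comes back as "gone" well before the planned end time, the timestamp of the last line in the sar output file shows roughly when the job died.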

Regards,
Ninad
Vic S. Kelan
Regular Advisor

Re: Strange Sar EMC STP issue

Thanks Ninad,
this is very strange. I don't see anything in the syslog, and the nohup file is empty since there were no error messages.

The job was no longer running when I checked the next day (as expected, since more than 24 hours had passed), but it had only run for 2 hours.

This happened on the two servers connected to the Symmetrix, and the diag on the Symmetrix also stopped at the same time sar terminated on the two separate boxes.
I am re-running the job and wondering whether it will terminate again after 2 hours.

Ninad_1
Honored Contributor

Re: Strange Sar EMC STP issue

It's really strange. I would have thought it a coincidence, but how can that be?
I can't think of any reason why sar should terminate, as it is running on your server and has nothing to do with anything running on the other server or with how your EMC array is doing. Just a minute: where are you logging the sar output? Is it on a filesystem/disk on the EMC array? If so, can you put the log file in, say, /tmp or somewhere else?

I have often seen commands stop logging for a while and then resume, but that was because the server was 100% loaded and the command was probably not getting CPU time; once the CPU was a bit idle again, the logs reappeared.

So in your case, the only reason I can imagine at the moment is that you are logging to a file on disks from the EMC array, and something happened to the array so that the file could not be written [same with nohup.out].
But in that case something would have appeared in syslog.log.

Let's see how this run goes.

Regards,
Ninad
Vic S. Kelan
Regular Advisor

Re: Strange Sar EMC STP issue

I found out what's happening.

When I exit the shell, it kills the job, despite my using nohup and &.

http://forums1.itrc.hp.com/service/forums/bizsupport/questionanswer.do?threadId=193539

It is an old thread, but perhaps the problem never got fixed:

"SK,

We have seen this problem as well. For an example, when I do a 'nohup sar 1 90 &' and exit my shell, the sar dies.

We opened an issue with the response center, and they escalated it to engineering. It is now a defect fix that they are working on. (works in some places, but not in others.)

We found that if you open a shell from your current shell, and run the program, and then exit that child shell, nohup works as expected... You might try this as a work around for now, they don't expect the patch to be ready for a couple months...

Hope it helps

John"
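
The child-shell workaround John describes could look roughly like the sketch below. It only illustrates the mechanics and does not reproduce the HP-UX defect: the job is a hypothetical stand-in (a short sleep that writes a marker file) rather than sar, and the output path is made up.

```shell
out=/tmp/nohup_child_demo.$$    # hypothetical marker file

# Hypothetical stand-in job; on the real host this would be: sar 1 90
job="sleep 1; echo finished > $out"

# Start a child shell, launch the job under nohup inside it, and let the
# child shell exit immediately. The job keeps running after the child is gone.
sh -c "nohup sh -c '$job' > /dev/null 2>&1 &"

# Give the stand-in job time to complete, then read its marker file.
sleep 3
result=$(cat "$out")
echo "job result: $result"
rm -f "$out"
```

Interactively, the same thing is done by typing sh, starting the nohup job in that shell, and then typing exit to return to the login shell.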

So I use the "at" command now.
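
For anyone wanting to do the same, submitting the job through at might look like the sketch below. The exact invocation used here was not posted, so this is an assumption; /tmp/sarfile is a hypothetical output path, and the user must be permitted to use at on the host.

```shell
# Command line from the original post, with stderr captured as well.
# /tmp/sarfile is a hypothetical output path.
job='sar -ud 300 288 > /tmp/sarfile 2>&1'

if command -v at > /dev/null 2>&1; then
    # at reads the job from stdin and queues it to run outside the
    # login session, so logging out does not affect it.
    echo "$job" | at now || echo "at refused the job (check at.allow/at.deny)"
else
    echo "at is not installed; would have submitted: $job"
fi
```

Because the job is started by the system scheduler rather than by the login shell, it is not tied to the terminal session that submitted it.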

Ninad_1
Honored Contributor
Solution

Re: Strange Sar EMC STP issue

Hi,

In fact, that's the same thing I was going to ask you today. Thinking about it on my way to work, I remembered facing a similar situation: when I logged out, even though I had used nohup, the parent process was terminated while the child processes kept running fine.
The workaround I came up with was to write a small script for whatever you want to execute and insert a sleep before the actual commands.
Thus the script will contain:
sleep 120
sar -ud 300 288 > sarfile

When I run the script, the sleep starts and I log out of the session immediately. The logout does not kill sar, because sar is only invoked later, after I have already logged out.
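
Put together as a file, the delayed-start script could be created as in the sketch below. The script location and the /tmp/sarfile output path are hypothetical, and the snippet only writes and syntax-checks the script; it does not start it.

```shell
script=/tmp/delayed_sar.sh    # hypothetical script location

cat > "$script" <<'EOF'
#!/bin/sh
# Wait long enough for the login session to be closed first.
sleep 120
# Same sar invocation as in the thread; stderr is captured too.
sar -ud 300 288 > /tmp/sarfile 2>&1
EOF
chmod +x "$script"

# Syntax-check the generated script without executing it.
sh -n "$script" && echo "script OK: $script"
```

It would then be started with something like nohup /tmp/delayed_sar.sh > /dev/null 2>&1 &, followed by logging out within the 120-second window.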

Try this.

Regards,
Ninad