Operating System - HP-UX

Larry Basford
Regular Advisor

Server halted for 30 min

We have a new test server, an RX2620, which has been up for about 3 months and running fine. We put a live account of about 30 users on it on Dec 13th.
It runs Universe DB with little load.

On Dec 26th, between 1:30 and 2:00, it was not accessible. Telnet connected, but the login prompt took about 15 minutes to show up; ssh behaved the same. I never got in until everything freed up. I can find no problems in any logs.
During the hang, the second disk of the mirrored pair had a steady light on, while the primary boot disk's light was flashing.
Now (normal) both flash.

The only indication in a log was the
/etc/opt/resmon/log/client.log
reporting loss of sockets.

Any ideas?
The system has been normal since and has not been rebooted.

-------------------Start Event--------------------
Event 132 occurred at Mon Dec 26 13:27:27.416904 2005
Process ID: 2476 (/etc/opt/resmon/lbin/p_client) Log Level: Error
do_version_handshake: Failed to receive a version reply
-------------------End Event----------------------

-------------------Start Event--------------------
Event 2505 occurred at Mon Dec 26 13:27:35.652186 2005
Process ID: 2476 (/etc/opt/resmon/lbin/p_client) Log Level: Error
update_monitors: socket open failed
-------------------End Event----------------------

And this process (PID 2476) is still running.
Desaster recovery? Right !
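For anyone comparing these EMS client events against other logs, here is a rough Python sketch that pulls the event number, timestamp, PID and message out of each Start Event/End Event block. The sample text is copied from the excerpt above; the helper name `parse_events` and the regular expression are my own assumptions about the layout, not anything shipped with EMS, so treat it as a starting point only.

```python
import re
from datetime import datetime

# Two events copied from the client.log excerpt in the post above.
SAMPLE = """\
-------------------Start Event--------------------
Event 132 occurred at Mon Dec 26 13:27:27.416904 2005
Process ID: 2476 (/etc/opt/resmon/lbin/p_client) Log Level: Error
do_version_handshake: Failed to receive a version reply
-------------------End Event----------------------

-------------------Start Event--------------------
Event 2505 occurred at Mon Dec 26 13:27:35.652186 2005
Process ID: 2476 (/etc/opt/resmon/lbin/p_client) Log Level: Error
update_monitors: socket open failed
-------------------End Event----------------------
"""

# Assumed layout: "Event N occurred at <time>", then a "Process ID" line,
# then the message text up to the "End Event" separator.
EVENT_RE = re.compile(
    r"Event (?P<event>\d+) occurred at (?P<when>[^\n]+)\n"
    r"Process ID: (?P<pid>\d+) \((?P<proc>[^)]+)\) Log Level: (?P<level>\w+)\n"
    r"(?P<message>.*?)\n-+End Event-+",
    re.DOTALL,
)


def parse_events(text):
    """Return one dict per event block (hypothetical helper)."""
    events = []
    for m in EVENT_RE.finditer(text):
        events.append({
            "event": int(m.group("event")),
            "when": datetime.strptime(m.group("when").strip(),
                                      "%a %b %d %H:%M:%S.%f %Y"),
            "pid": int(m.group("pid")),
            "process": m.group("proc"),
            "level": m.group("level"),
            "message": m.group("message").strip(),
        })
    return events


for ev in parse_events(SAMPLE):
    print(ev["when"], "event", ev["event"], "pid", ev["pid"], ev["message"])
```

Note that in both blocks the number after "Event" is the event number, while the PID of p_client is on the "Process ID" line.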
Denver Osborn
Honored Contributor

Re: Server halted for 30 min

So there was nothing in the /var/adm/syslog/syslog.log file at around the time of the "hang"?

Have you looked in /var/opt/resmon/log/event.log for anything?

From your description of the disk LED being lit solid during the hang, this sounds like it could be a disk issue. If it is a disk problem, something might be in the event log or syslog.log. You might even check the server logs from the MP.

-denver
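To narrow syslog.log down to the time of the hang, something like the Python sketch below may help. It assumes each line starts with the usual "Mon DD HH:MM:SS" syslog timestamp (which carries no year, so the year is taken from the time you search around). The function name `lines_near` and the sample lines are made up purely for illustration; they are not real entries from this server.

```python
from datetime import datetime, timedelta


def lines_near(lines, center, minutes=30):
    """Keep lines stamped within +/- `minutes` of `center` (hypothetical helper)."""
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        try:
            # Syslog timestamps omit the year, so borrow it from `center`.
            stamp = datetime.strptime(f"{center.year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # no leading timestamp on this line, skip it
        if abs(stamp - center) <= window:
            hits.append(line.rstrip("\n"))
    return hits


# Invented placeholder lines, NOT real log entries from the server.
SAMPLE_LINES = [
    "Dec 26 09:00:00 testhost placeholder: made-up entry well before the hang",
    "Dec 26 13:20:00 testhost placeholder: made-up entry close to the hang",
    "Dec 26 18:00:00 testhost placeholder: made-up entry well after the hang",
    "continuation line without a timestamp",
]

hang_time = datetime(2005, 12, 26, 13, 27)  # time of the first EMS error
for hit in lines_near(SAMPLE_LINES, hang_time):
    print(hit)
```

On a real system you would pass the lines of /var/adm/syslog/syslog.log instead of the sample list.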
Shameer.V.A
Respected Contributor
Solution

Re: Server halted for 30 min

Hi Larry,
Also check the dmesg output for any events at that particular time, and check the hard disk with STM.

Regards,

Shameer
.... See invisible, feel intangible and achieve impossible as everything is possible ....
RAC_1
Honored Contributor

Re: Server halted for 30 min

I have a very similar problem with exactly the same error message, and I have yet to resolve it.
Things to check:
Do you have the appropriate lines in /etc/inetd.conf and /etc/services for the EMS registrar?
The /etc/inetd.conf line should look as follows.

registrar stream tcp nowait root /etc/opt/resmon/lbin/registrar /etc/opt/resmon/lbin/registrar

Also, what does /etc/nsswitch.conf look like, particularly the services directive? Have you blocked any of the services in /var/adm/inetd.sec?

And last but not least, what does set_fixed -vL say? Do you have any monitors in the down state? Do you have those devices on the system, and are they in the CLAIMED state?
There is no substitute to HARDWORK
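If it helps, here is a small Python sketch for checking that an uncommented registrar entry is present in inetd.conf. It assumes the standard column order (service, socket type, protocol, wait/nowait, user, program, arguments); `find_inetd_entry` is a hypothetical helper, and the sample text reuses the line quoted above.

```python
INETD_FIELDS = ("service", "socket_type", "protocol", "wait", "user", "program")

# Sample config text; the registrar line is the one quoted in this post.
SAMPLE_CONF = """\
# comment lines and blank lines are ignored

registrar stream tcp nowait root /etc/opt/resmon/lbin/registrar /etc/opt/resmon/lbin/registrar
"""


def find_inetd_entry(text, service):
    """Return the first active entry for `service` as a dict, or None."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split()
        if parts[0] == service and len(parts) >= len(INETD_FIELDS):
            entry = dict(zip(INETD_FIELDS, parts))
            entry["args"] = parts[len(INETD_FIELDS):]
            return entry
    return None


entry = find_inetd_entry(SAMPLE_CONF, "registrar")
if entry is None:
    print("no active registrar entry found")
else:
    print("registrar runs", entry["program"], "as", entry["user"])
```

To check a live system, read /etc/inetd.conf into a string and pass it in place of SAMPLE_CONF. A line that has been commented out with # counts as missing.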
Larry Basford
Regular Advisor

Re: Server halted for 30 min

I have attached more information.

I'm still not finding anything unusual.

Should I be running an STM exercise on the mirrored HD of a production system that is in use?

Or unmirror the disk first?

Some very good points, but many logs are void of info at those times.
If it happens again, I'm pulling the disk!
Larry Basford
Regular Advisor

Re: Server halted for 30 min

The system locked up again.
I called HP support, and we determined that I had a bad boot disk.

CSTM reported 135 I/O errors on one device path:

Number of I/O Error entries: 203


Device paths for which entries exist:

(135) 0/1/1/0.1.0
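
For anyone who wants to tally a summary like this automatically, a minimal Python sketch follows. It assumes the layout shown above: a "Number of I/O Error entries" line, then one "(count) hardware-path" line per device. The regular expressions and the `summarize` helper are my own guesses at that layout, not part of CSTM.

```python
import re

# Text copied from the CSTM summary above.
SAMPLE = """\
Number of I/O Error entries: 203

Device paths for which entries exist:

(135) 0/1/1/0.1.0
"""

TOTAL_RE = re.compile(r"Number of I/O Error entries:\s*(\d+)")
PATH_RE = re.compile(r"^\((\d+)\)\s+(\S+)\s*$", re.MULTILINE)


def summarize(text):
    """Return (total entries or None, {hardware path: entry count})."""
    total_match = TOTAL_RE.search(text)
    total = int(total_match.group(1)) if total_match else None
    per_path = {path: int(count) for count, path in PATH_RE.findall(text)}
    return total, per_path


total, per_path = summarize(SAMPLE)
for path, count in sorted(per_path.items(), key=lambda kv: -kv[1]):
    print(f"{path}: {count} of {total} entries")
```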

HP is sending a new drive.
I pulled the disk, and the mirror has taken over until I can replace it tomorrow.
The two disks are on the same bus, which explains why the mirror acted this way. Support said the top disk is on a different bus, so I should probably mirror disks 0 and 2 or 1 and 2 to put the two halves on different buses.
Larry Basford
Regular Advisor

Re: Server halted for 30 min

The cause was a defective mirrored boot disk.
The replacement is in hand, and I will replace it tonight.