02-05-2010 01:25 PM
System unreachable
BTW, this happened last year on one of our rp3440 systems, and we have not found anything either. Please help!
02-05-2010 01:52 PM
Re: System unreachable
However, you should have a look at /var/adm/syslog/OLDsyslog.log. That will be the file that was active when the problem occurred.
During reboot, syslog.log is moved to OLDsyslog.log and recreated, and then syslogd starts.
When you connected to the console, were you able to ping other devices on the network? Did you check to see if the inetd daemon was running? Was the sshd daemon running? How busy was the system? Was there a high load at the time?
A ping indicates some modicum of network connectivity, but that's no guarantee that other things will work.
At this point, there may not be much to look at to determine the root cause of the problem.
02-05-2010 02:13 PM
Re: System unreachable
Jan 23 16:55:56 sapux083 SAPPRD_00[26285]: Unable to open trace file sapstartsrv.log. (Error 13 Permission denied) [ntservsserver.cpp 1909]
Jan 24 03:35:19 sapux083 xntpd[20125]: synchronized to 10.213.255.2, stratum=3
Jan 24 03:35:51 sapux083 xntpd[20125]: synchronisation lost
Jan 24 03:37:27 sapux083 xntpd[20125]: synchronized to 10.213.255.1, stratum=3
Jan 24 03:43:51 sapux083 xntpd[20125]: synchronized to 10.213.255.2, stratum=3
Jan 24 03:44:23 sapux083 xntpd[20125]: synchronisation lost
Jan 24 03:44:55 sapux083 xntpd[20125]: synchronized to 10.213.255.1, stratum=3
Jan 27 08:51:51 sapux083 sshd[12353]: Accepted keyboard-interactive/pam for glyons from 10.213.39.145 port 2156 ssh2
Feb 1 12:58:11 sapux083 syslog: gethostbyaddr: D1CNI12285PL.coxnewscop.int. != 169.137.106.89
Feb 2 14:47:55 sapux083 syslog: gethostbyaddr: d1cls12231pl.coxohio.com. != 169.137.104.13
Feb 2 15:12:25 sapux083 syslog: gethostbyaddr: D1ADV7899P.coxnewscop.int. != 169.137.105.87
Feb 2 15:12:26 sapux083 syslog: gethostbyaddr: d1mkt10645p.coxnewscop.int. != 169.137.105.123
When I went to the console, I could not ping or type anything. It wasn't until I rebooted that I got anything to write to the console window. The system was not being used heavily at the time.
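One way to narrow an excerpt like the one above to the period around the hang is to filter the lines by timestamp. The following is a minimal Python sketch, not something used in this thread: the function name `lines_in_window` and the chosen window are illustrative assumptions, and the year has to be supplied by the caller because classic syslog timestamps do not carry one.

```python
from datetime import datetime

def lines_in_window(lines, start, end, year):
    """Return syslog lines whose timestamp falls within [start, end].

    Lines that do not begin with a "Mon DD HH:MM:SS" timestamp
    (for example, wrapped continuation lines) are skipped.
    """
    selected = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            continue
        if start <= stamp <= end:
            selected.append(line)
    return selected

# Three lines taken from the excerpt above; the window is hypothetical.
sample = [
    "Jan 24 03:35:19 sapux083 xntpd[20125]: synchronized to 10.213.255.2, stratum=3",
    "Jan 24 03:35:51 sapux083 xntpd[20125]: synchronisation lost",
    "Jan 24 03:37:27 sapux083 xntpd[20125]: synchronized to 10.213.255.1, stratum=3",
]
window = lines_in_window(
    sample,
    datetime(2010, 1, 24, 3, 35, 0),
    datetime(2010, 1, 24, 3, 36, 0),
    2010,
)
for line in window:
    print(line)
```

With the real file, the lines would come from /var/adm/syslog/OLDsyslog.log and the window would be set around the time the system stopped responding.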
02-05-2010 02:18 PM
Re: System unreachable
What's in /etc/opt/resmon/logs?
02-05-2010 02:20 PM
Re: System unreachable
One cause of this could be RAM/swap related. Were/are you running tight on RAM on this system, and how much swap do you have configured? If you have the perf tools installed, you could possibly use the "extract" tool to look at the processes that were running at the time. Do a trawl from the / (root) directory using "find" for any core or trace files that may have been created at the time of the "crash". Is this an Oracle RDBMS server? Which version of 11i are you using?
Let us know how you get on with your investigation.
Keith
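The "find" trawl suggested above could also be sketched in Python, assuming an interpreter is available on the host (the thread does not say whether one is). The function name `find_recent_files`, the pattern, and the age limit are illustrative assumptions. Note that a pattern such as core* also matches unrelated names that merely begin with "core".

```python
import fnmatch
import os
import time

def find_recent_files(root, pattern, max_age_days):
    """Walk root and return paths whose basename matches pattern and
    whose modification time is within the last max_age_days days."""
    cutoff = time.time() - max_age_days * 86400
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatch(name, pattern):
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime >= cutoff:
                    matches.append(path)
            except OSError:
                # The file vanished or cannot be examined; skip it.
                continue
    return matches

# Example call (hypothetical arguments), run as root to cover the whole tree:
#   find_recent_files("/", "core*", 14)
```

Restricting the search by modification time keeps old, unrelated core files out of the result.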
02-05-2010 02:22 PM
Re: System unreachable
-rw-r--r-- 1 root root 60961 Feb 5 17:19 api.log
-rw-r--r-- 1 root root 500031 Feb 5 06:36 api.log.old
-rw-r--r-- 1 root root 2086 Oct 12 2008 client.log
-rw-r--r-- 1 root root 500831 Oct 12 2008 client.log.old
-rw-r--r-- 1 root sys 17977 Feb 5 13:56 emsagent.log
-rw------- 1 root sys 3598 Feb 5 13:56 emsha.log
-rw-r--r-- 1 root root 276784 Feb 5 14:59 registrar.log
-rw-r--r-- 1 root root 28078 Jun 18 2009 reslog.html
Is there any file you are interested in me checking?
02-05-2010 02:28 PM
Re: System unreachable
User event occurred at Fri Feb 5 06:41:32.197519 2010
Process ID: 7721 (/usr/sbin/stm/uut/bin/.../dm_chassis) Log Level: Error
The chassis code monitor (dm_chassis) cannot run on this machine. Either the machine does not generate chassis logs, or the machine is not supported by dm_chassis. Currently, the following set of machines are supported by dm_chassis:
superdome
S-class
-------------------End Event----------------------
-------------------Start Event--------------------
User event occurred at Fri Feb 5 06:46:32.682878 2010
Process ID: 7919 (/usr/sbin/stm/uut/bin/.../dm_chassis) Log Level: Error
The chassis code monitor (dm_chassis) cannot run on this machine. Either the machine does not generate chassis logs, or the machine is not supported by dm_chassis. Currently, the following set of machines are supported by dm_chassis:
superdome
S-class
I saw this in the api.log file. It was also present in the old api.log. I am still checking the others.
02-05-2010 02:49 PM
Re: System unreachable
Thanks for your reply. Here are some of the answers:
===============
No perf tools are installed, and no core files were found using the find / -name core* command. The system is used as an application server for one of the production SAP/Oracle systems.
swapinfo -t
               Kb        Kb         Kb   PCT  START/      Kb
TYPE        AVAIL      USED       FREE  USED   LIMIT RESERVE  PRI  NAME
dev       4194304         0    4194304    0%       0       -    1  /dev/vg00/lvol2
dev      26624000         0   26624000    0%       0       -    1  /dev/vg00/lvol10
reserve         -  18453120  -18453120
memory   26111764   1347344   24764420    5%
total    56930068  19800464   37129604   35%       -       0    -
Memory Information:
physical page size = 4096 bytes, logical page size = 4096 bytes
Physical: 33552384 Kbytes, lockable: 26060496 Kbytes, available: 29910084 Kbytes
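To see where the 35% in the total row comes from, the figures can be recomputed from the individual rows. This is only a small arithmetic sketch in Python using the numbers posted above; the variable names are illustrative, and the reserve row, which has no AVAIL value, is entered as 0.

```python
# Rows copied from the swapinfo -t output above: (type, avail_kb, used_kb).
# The "reserve" row shows "-" for AVAIL, so 0 is used for it here.
rows = [
    ("dev", 4194304, 0),
    ("dev", 26624000, 0),
    ("reserve", 0, 18453120),
    ("memory", 26111764, 1347344),
]

total_avail = sum(avail for _type, avail, _used in rows)
total_used = sum(used for _type, _avail, used in rows)
total_free = total_avail - total_used
pct_used = round(100 * total_used / total_avail)

print(total_avail, total_used, total_free, pct_used)
# prints: 56930068 19800464 37129604 35
```

The result matches the total row. Nearly all of the "used" figure is the reserve row; both device rows show 0 Kb used.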
02-05-2010 02:55 PM
Re: System unreachable
-------------------Start Event--------------------
Event 2960 occurred at Fri Feb 5 13:58:48.020756 2010
Process ID: 2713 (/etc/opt/resmon/lbin/registrar) Log Level: Error
process_time_event: Expired awaiting-reply object, socket=7
-------------------End Event----------------------
-------------------Start Event--------------------
Event 2937 occurred at Fri Feb 5 13:58:48.025938 2010
Process ID: 2713 (/etc/opt/resmon/lbin/registrar) Log Level: Error
abort_awaiting_reply_obj: socket=7: Connection aborted
-------------------End Event----------------------
-------------------Start Event--------------------
Event 2961 occurred at Fri Feb 5 13:58:48.026394 2010
Process ID: 2713 (/etc/opt/resmon/lbin/registrar) Log Level: Error
process_time_event: Expired contact object for monitor /usr/sbin/stm/uut/bin/tools/monitor/RemoteMonitor
-------------------End Event----------------------
-------------------Start Event--------------------
Event 2960 occurred at Fri Feb 5 13:59:04.015484 2010
Process ID: 2713 (/etc/opt/resmon/lbin/registrar) Log Level: Error
process_time_event: Expired awaiting-reply object, socket=7
----------------
Nothing else significant in the other files.
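For a longer registrar.log, a quick tally of event numbers can show which errors repeat. Below is a minimal Python sketch that assumes every event header has the "Event NNNN occurred at ..." form seen in the excerpt above; the function name `count_events` is illustrative.

```python
import re
from collections import Counter

# Matches header lines such as "Event 2960 occurred at Fri Feb 5 ...".
EVENT_RE = re.compile(r"^Event (\d+) occurred at (.+)$")

def count_events(lines):
    """Count how often each event number appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# Header lines taken from the excerpt above.
excerpt = [
    "Event 2960 occurred at Fri Feb 5 13:58:48.020756 2010",
    "Event 2937 occurred at Fri Feb 5 13:58:48.025938 2010",
    "Event 2961 occurred at Fri Feb 5 13:58:48.026394 2010",
    "Event 2960 occurred at Fri Feb 5 13:59:04.015484 2010",
]
counts = count_events(excerpt)
for number, seen in counts.most_common():
    print(number, seen)
```

On the excerpt, event 2960 (the expired awaiting-reply object) is the one that repeats.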
02-05-2010 02:56 PM
Re: System unreachable
Verify with swlist | grep -i online
http://software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=B6191AAE
02-05-2010 02:57 PM
Re: System unreachable
Let us know.
Keith
02-05-2010 03:00 PM
Re: System unreachable
Load averages: 0.02, 0.01, 0.02
187 processes: 171 sleeping, 15 running, 1 zombie
Cpu states:
CPU LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
0 0.01 0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0%
1 0.06 4.2% 0.0% 0.4% 95.4% 0.0% 0.0% 0.0% 0.0%
2 0.00 0.0% 0.0% 0.2% 99.8% 0.0% 0.0% 0.0% 0.0%
3 0.01 0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0%
4 0.01 0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0%
5 0.03 0.0% 0.0% 0.2% 99.8% 0.0% 0.0% 0.0% 0.0%
6 0.01 3.4% 0.0% 0.8% 95.8% 0.0% 0.0% 0.0% 0.0%
7 0.01 0.0% 0.0% 0.6% 99.4% 0.0% 0.0% 0.0% 0.0%
--- ---- ----- ----- ----- ----- ----- ----- ----- -----
avg 0.02 1.0% 0.0% 0.2% 98.8% 0.0% 0.0% 0.0% 0.0%
Memory: 6177968K (1233628K) real, 18980620K (3751632K) virtual, 22543108K free
Page# 1/38
CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
1 ? 6148 prdadm 155 20 14891M 81092K sleep 0:31 12.51 12.49 dw.sapPRD_D
6 ? 15530 prdadm 154 20 14903M 93220K sleep 2:32 1.39 1.39 dw.sapPRD_D
7 ? 2435 root 152 20 587M 96616K run 0:21 0.79 0.79 java
=============================================
using the top command
02-05-2010 03:01 PM
Re: System unreachable
B3929DA 3.5-ga15-04 HP OnLineJFS 3.5
OnlineDiag B.11.11.18.05 HPUX 11.11 Support Tools Bundle, Dec 2006
02-05-2010 03:26 PM
Re: System unreachable
I can remember several issues with v1 and vhand (which were fixed with patches). I don't tend to rely on memory stats from "top"; it's a shame you don't have Glance installed. It's difficult for me to see the RSS figures for your SAP processes (it looks like 30 GB, but that may be shared memory).
I'd definitely consider making SWAP 1.5x RAM (at least).
Now, I wonder if they still have the evaluation copy of Glance somewhere...
( 8 )
Keith
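To put numbers on the 1.5x suggestion, the figures posted earlier in the thread can be plugged in. This Python sketch is arithmetic only, under the assumption that the rule of thumb applies to device swap and that the "Physical" figure is the RAM size; it is not a sizing recommendation, and the need for that much device swap is questioned later in the thread.

```python
# Figures copied from earlier posts in this thread, in KB.
ram_kb = 33552384                      # "Physical" from the memory information
device_swap_kb = 4194304 + 26624000    # the two "dev" rows of swapinfo -t

# Rule of thumb from the post above: device swap of at least 1.5x RAM.
target_kb = ram_kb * 3 // 2
shortfall_kb = max(0, target_kb - device_swap_kb)

print(f"target swap: {target_kb} KB")
print(f"configured:  {device_swap_kb} KB")
print(f"shortfall:   {shortfall_kb} KB")
```

Under those assumptions the configured device swap (30818304 KB) is below the 1.5x target (50328576 KB).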
02-05-2010 03:27 PM
Re: System unreachable
You have messages being sent and nothing listening, that is, if your firmware is up to date. If not, then you have nothing being sent and nothing listening.
02-05-2010 04:33 PM
Re: System unreachable
If you want to know why a system was hung, you need to use TC to get a memory dump. (You first need to make sure crash dumps are enabled.)
>swapinfo -t
(It would be helpful next time to always use -tam.)
>Patrick: A ping indicates some modicum of network connectivity, but that's no guarantee that other things will work.
I've had that too, unfortunately. :-(
>Keith: as HP-UX wants to reserve the same amount of RAM in swap at all times.
As long as you have pseudo-swap enabled, that's a myth about device swap.
02-05-2010 09:03 PM
Re: System unreachable
dbc_min_pct=5
Systems were patched in 2008
We are currently working on a patch plan to update the systems. I will look into installing glance as well.
GOLDAPPS11i B.11.11.0712.475 Applications Patches for HP-UX 11i v1, December 2007
GOLDBASE11i B.11.11.0712.475 Base Patches for HP-UX 11i v1, December 2007
HWEnable11i B.11.11.0612.458 Hardware Enablement Patches for HP-UX 11i v1, December 2006
02-05-2010 09:48 PM
Re: System unreachable
Question for you: Has your application been upgraded recently with any new rollouts?
I ask because a memory leak could freeze the system.
Let me know.
Anyway, you're not going to know where the problem is until you upgrade your diags and firmware. You're probably many, many versions out of date.
02-05-2010 10:02 PM
Re: System unreachable
I'm not sure about the rollouts. I know they did a database refresh a few months back, but this is their app server, so I'm not sure what changes were made on it. I will ask. In the meantime, I will take your suggestion and see what I can do to expedite the patch, firmware, and diag update.
Thanks for your feedback.