05-29-2009 02:57 PM
Event Monitor Notification - only a vague idea
Notification Time: Fri May 29 15:23:24 2009
hnrgnw sent Event Monitor notification information:
/storage/events/disk_arrays/AutoRAID/0000003FEB3E
is >= 3.
Its current value is SERIOUS(4).
Event data from monitor:
Event Time..........: Fri May 29 15:23:24 2009
Severity............: SERIOUS
Monitor.............: armmon
Event #.............: 101
System..............: hnrgnw
Summary:
Disk Array at hardware path : Device removed from monitoring
Description of Error:
The device has been removed from the list of devices being monitored by
this monitor.
Probable Cause / Recommended Action:
The device was removed from the system, has stopped responding to the
system or it has been replaced with a device that is not supported by this
monitor.
Run ioscan to determine the state and type of the device.
Check the /var/stm/data/os_decode_xref for the information indicating
which devices are supported by this monitor.
Check other monitors to determine if they are now monitoring the
device by running /etc/opt/resmon/lbin monconfig and the using the
"Check monitoring" command.
Additional Event Data:
System IP Address...: 205.135.30.74
Event Id............: 0x4a20605c00000000
Monitor Version.....: B.01.01
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_armmon.clcfg
Client Configuration File Version...: A.01.01
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800/L2000-44
EMS Version.....................: A.03.20
STM Version.....................: A.44.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/armmon.htm#101
v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v
05-29-2009 04:02 PM
Re: Event Monitor Notification - only a vague idea
Follow the link and you get:
Event 101
Severity: Critical
Event Summary: Disk at hardware path x/xx/x.x.x : Device removed from monitoring
Event Class: I/O
Problem Description:
The device has been removed from the list of devices being monitored by this monitor.
Probable Cause / Recommended Action
The device was removed from the system, has stopped responding to the system or it has been replaced with a device that is not supported by this monitor.
***Run ioscan to determine the state and type of the device.
*** Check the /var/stm/data/os_decode_xref for the information indicating which devices are supported by this monitor.
*** Check other monitors to determine if they are now monitoring the device by running /etc/opt/resmon/lbin/monconfig and using the "Check monitoring" command.
Automated Recovery: None
Event Generation Threshold: 1 occurrence
Any UNCLAIMED or NO_HW changes noticed from ioscan?
Might want to check the STM log for other issues also.
Could be a number of things. Start with the array subsystem hardware and all system logs.
Any warning shown on the 12H itself?
hth,
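The ioscan question above can be answered quickly with a filter. This is a minimal sketch, assuming the usual ioscan listing in which a problem device shows UNCLAIMED or NO_HW in the S/W State column; the function name flag_ioscan is made up for illustration and is not an HP-UX command.

```shell
# flag_ioscan: hypothetical helper, not part of HP-UX.
# Reads ioscan output on stdin and prints only the lines that mention
# an UNCLAIMED or NO_HW state. Prints nothing if neither state appears.
flag_ioscan() {
    grep -e 'UNCLAIMED' -e 'NO_HW'
}

# Typical use on the HP-UX host (run as root):
#   ioscan -fn | flag_ioscan
```

Empty output only means ioscan reported neither state at that moment; it does not rule out a device that dropped out briefly and came back.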
05-31-2009 09:24 PM
Re: Event Monitor Notification - only a vague idea
Follow the link and you will find a command that you can run to get this information.
Since the severity is SERIOUS(4), which is above the threshold of 3, urgent action is needed.
Go to your SAN and look for RED LIGHTS.
Depending on your SAN ( VA,EVA,XP)
you can hot replace a disk.
06-01-2009 06:35 AM
Re: Event Monitor Notification - only a vague idea
Regarding:
"Go to your SAN and look for RED LIGHTS.
Depending on your SAN ( VA,EVA,XP)
you can hot replace a disk."
On an AutoRAID, I don't think you'll find "red lights". You should be able to look at the status via SAM. The Admin Guide also lists commands for displaying the array logs, which may show something of interest.
Admin Guide here:
http://h10032.www1.hp.com/ctg/Manual/lpg28365.pdf
Other resources at:
http://h10025.www1.hp.com/ewfrf/wc/manualCategory?dlc=en&lc=en&product=61936&cc=us&
Drive(s) should be hot-swappable, so once you locate the culprit and obtain spares, it should be an easy fix.
06-01-2009 11:35 AM
Re: Event Monitor Notification - only a vague idea
1. The system is still up and running.
2. No unclaimed hardware.
3. No missing disk.
4. No warning message when I run arraydsp -a AUTORAID1.
5. No further notification email since May 29.
Should I let it go or just keep an eye on it?
06-01-2009 12:02 PM
Re: Event Monitor Notification - only a vague idea
Was this the only message in the event log?
hth,
06-02-2009 10:18 AM
Re: Event Monitor Notification - only a vague idea
The 12H AutoRAID is still being monitored.
06-02-2009 10:30 AM
Re: Event Monitor Notification - only a vague idea
You might want to run this and check the results for any h/w errors, or redirect the output into a file and post it for review:
# echo "sel dev all;info;wait;il"|/usr/sbin/cstm
hth,
06-02-2009 10:45 AM
Re: Event Monitor Notification - only a vague idea
See the attachment for the output of the command:
echo "sel dev all;info;wait;il"|/usr/sbin/cstm
06-02-2009 11:11 AM
Re: Event Monitor Notification - only a vague idea
Well the good news is 12H appears to be healthy! ;-)
The not-so-great news is that the tool is reporting problems when it tries to obtain status from several PCI adapters. This message appears throughout the report:
"An error was encountered when attempting to obtain the information.
Check the tool activity log for more details."
You'll have to look at the activity log to see where the issue might be. The log can be viewed with:
# cstm
cstm> ial
Check the time stamps in the activity log for today's date to see what it indicates when the tool was run earlier today. It may point to why the tool was unable to retrieve all of the hardware information.
hth,
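To see how widespread that message is in a saved report, a simple count is enough. This is a minimal sketch, assuming the cstm output was redirected to a file first and that the message sits on one line as quoted above; the path /tmp/cstm_info.txt and the function name are placeholders, not anything from the thread.

```shell
# count_info_errors: hypothetical helper, not part of STM.
# $1 = file holding saved cstm information-log output.
# Prints how many lines carry the "could not obtain information" message.
count_info_errors() {
    grep -c 'An error was encountered when attempting to obtain the information' "$1"
}

# Typical use on the HP-UX host (placeholder output path):
#   echo "sel dev all;info;wait;il" | /usr/sbin/cstm > /tmp/cstm_info.txt 2>&1
#   count_info_errors /tmp/cstm_info.txt
```

The count only says how often the tool gave up; the activity log is still where the reason is recorded.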
06-02-2009 11:19 AM
Re: Event Monitor Notification - only a vague idea
Sorry, I forgot to say that you should also check the system activity log, not just the infolog activity log:
# cstm
cstm> ls
Rgds,
06-02-2009 11:36 AM
Re: Event Monitor Notification - only a vague idea
cstm>ial
^-- (InfoActLog) is currently disabled. --
cstm>ls
Thu Feb 19 17:27:42 2009: Aborting daemon process (cclogd) with process
identifier (1541) as support tool system is shutting
down.
Thu Feb 19 17:27:42 2009: Aborting daemon process (diaglogd) with process
identifier (1542) as support tool system is shutting
down.
Thu Feb 19 17:27:42 2009: Aborting daemon process (memlogd) with process
identifier (1543) as support tool system is shutting
down.
Thu Feb 19 17:27:42 2009: Aborting daemon process (psmctd) with process
identifier (1544) as support tool system is shutting
down.
Thu Feb 19 17:35:58 2009: Diagmond daemon started.
Thu Feb 19 17:37:08 2009: Launching daemon executable
(/usr/sbin/stm/uut/bin/sys/cclogd).
Thu Feb 19 17:37:08 2009: Launching daemon executable
(/usr/sbin/stm/uut/bin/sys/diaglogd).
Thu Feb 19 17:37:08 2009: Launching daemon executable
(/usr/sbin/stm/uut/bin/sys/memlogd).
Thu Feb 19 17:37:08 2009: Launching daemon executable
(/usr/sbin/stm/uut/bin/sys/psmctd).
Fri Feb 20 09:35:11 2009: An IPC message of 0 length was read.
Possible Causes/Recommended Action:
The sender of the IPC message connected but did
not send any data or exited before sending any
data.
Check the support tool system activity log and
the tool activity logs for more information on
the sending process.
Fri Feb 20 09:35:11 2009: Attempt to read the IPC header from a port failed.
The port is on system name (hnrgnw) at IP address
(205.135.30.74) and has system port (1508).
Fri Feb 20 09:35:11 2009: Message received from system port had an invalid
header. Message will be ignored.
Fri Feb 20 10:25:44 2009: An IPC message of 0 length was read.
Possible Causes/Recommended Action:
The sender of the IPC message connected but did
not send any data or exited before sending any
data.
Check the support tool system activity log and
the tool activity logs for more information on
the sending process.
Fri Feb 20 10:25:44 2009: Attempt to read the IPC header from a port failed.
The port is on system name (hnrgnw) at IP address
(205.135.30.74) and has system port (1508).
Fri Feb 20 10:25:44 2009: Message received from system port had an invalid
header. Message will be ignored.
06-02-2009 11:58 AM
Re: Event Monitor Notification - only a vague idea
Try this:
echo "sel dev all;info;wait;ial"|/usr/sbin/cstm
-or-
Try the report again manually
# cstm
cstm> sel all
cstm> info
cstm> il
quit
done
cstm> ial
quit
save
file-name
cstm> quit
hth,
06-02-2009 02:01 PM
Re: Event Monitor Notification - only a vague idea
OK, I got something out of that command. Please see the attachment.
06-02-2009 03:59 PM
Re: Event Monitor Notification - only a vague idea
A few things are not right, and they may or may not be related to the system firmware being way out of date. Your system is at 41.39, and critical fixes were published after that version. The latest is 44.28 for your system.
For 11.00:
http://www13.itrc.hp.com/service/patch/patchDetail.do?BC=main|patchDetail{PHSS_31800,{hpux:11.00,}}|&patchid=PHSS_32704&sel={hpux:11.00,}
a) According to the diag tool, one of your processors isn't responding properly. This is interesting because part of the log report shows both processors active and configured, so I'm not sure why it's failing to report properly. PDC, perhaps?
+++++
from infolog report for proc at 166:
Processor PIM Information: PIM contains NO data
+++++
-- Information Tool Activity Log for CPU on path 166 --
Log creation time:
Tue Jun 2 14:44:53 2009
Tue Jun 2 14:44:53 2009: Information tool (cpu) starting on path (166).
Tue Jun 2 14:44:53 2009: Failure Summary:
The returned status from PDC_SYSTEM_INFO call:
(-4).
Possible Causes/Recommended Action:
The PDC_SYSTEMINFO call returned status indicating
that selected processor is not present.
This is an unexpected status. No recommended
action available.
+++++
However this issue was supposedly fixed with PDC 41.39:
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=120&prodSeriesId=377380&prodTypeId=15351&objectID=c00906010
I have a feeling it wasn't, and you may want to have HP update all of your system firmware to the latest supported versions.
b) For some reason, the /dev/diag/diagX device files are either missing or the tool can't read them. You'll need to resolve this or recreate the files with 'insf -e' (see the insf man page). From the activity log:
Tue Jun 2 14:44:52 2009
Tue Jun 2 14:44:52 2009: Information tool (pci) starting on path (0/7/0/0).
Tue Jun 2 14:44:52 2009: The device file /dev/diag/diag1 is not present on the
system. Communications with diag1 diagnostic pseudo
driver is not possible without this file.
Possible Causes/Recommended Action:
Please use "insf -e " from the /dev directory to
create the /dev/diag/diag1 device file.
Tue Jun 2 14:44:52 2009: Failed to obtain device_id, revision_id,
device_status, vendor_id and class_code for the card.
An error or warning occurred while obtaining the
information.
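The missing device files can be pulled out of a saved activity log rather than picked out by eye. This is a minimal sketch, assuming the log wording shown above ("The device file ... is not present") and a log that has been saved to a file; the function name and the file path are hypothetical.

```shell
# missing_device_files: hypothetical helper, not part of STM.
# $1 = file holding saved tool activity log text.
# Prints each /dev path reported as "not present", once, in sorted order.
missing_device_files() {
    sed -n 's|.*The device file \(/dev/[^ ]*\) is not present.*|\1|p' "$1" | sort -u
}

# Typical use (placeholder path):
#   missing_device_files /tmp/cstm_activity.txt
```

Once the list is known, the log's own recommendation applies: run insf -e from the /dev directory to recreate the files.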
You might want to check /etc/rc.log and verify everything started up correctly the last time it booted. If you find errors then investigate.
IMHO, I would get the system firmware updated to match the installed software requirements first. Since the diag tool isn't reporting the adapter type, I can't tell what h/w is supported for your config. You can check here for some adapter minimum PDC versions (although 11.00 isn't listed, you can get an idea of where your system should be):
http://docs.hp.com/en/SFWM1/ch04.html#ftn.dc3
The diagnostic subsystem is not functioning properly and you might not have any accurate warnings until something really breaks.
hth,
BTW:
http://forums13.itrc.hp.com/service/forums/helptips.do?#33
;-)
06-03-2009 09:34 AM
Re: Event Monitor Notification - only a vague idea
I'll do what you said
Thanks