Configuring MC/SG to send hpmcSGSubnetDown trap
06-11-2008 12:08 AM
In the wake of a scheduled total power-off of cells in our data center some weeks ago, it somehow went unnoticed that the standby NIC on one of a cluster's nodes had lost its link for several days.
This was caused by a failed media converter which, being an unmanaged dumb device, wasn't on the radar of the network management monitors either.
The dead link was only noticed eventually, through a cluster reformation and package failover, when the primary VIP NIC also lost its link for long enough.
Both links are fixed by now,
but I would like to guard against another unnoticed link loss by providing a Nagios check.
Right after the event I put together a plugin that runs linkloop checks between all involved NICs.
But this will only catch losses which last longer than the 5 min check interval or (highly unlikely) happen to coincide with a check.
Therefore, I would like to set up a passive Nagios check which would catch some sort of linkDown trap.
After some quick googling I came across the OID for the HP SG Cluster MIB's trap table at
1.3.6.1.4.11.2.3.1.6.3.1.0 (hpmcSGTraps)
Unfortunately, there is no per-NIC linkDown trap, but there is at least an hpmcSGSubnetDown [16] entry, which I think could be useful.
However, I haven't yet found any reference on
where and how to configure cmsnmpd to send such a trap to my Nagios server.
Could anyone help?
Thanks
Ralph
06-11-2008 01:49 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
As I understand it, you are looking for this Technical Knowledge Base document?
http://www12.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=ucr_na-KMN8606299725_ssb-1
Regards,
Asif Sharif
06-11-2008 02:36 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
The TKB doc that you referred me to
describes our situation exactly.
Unfortunately, it says that there was no intention to fix this.
However, as it was issued in 2003, things may have changed by now.
Also, I haven't been able to download the PDF document it refers to.
The link might be stale by now anyway.
06-11-2008 02:41 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
This document is available on HP's internal network, so HP Support personnel can obtain the document and give it to any customer who asks for it.
Regards,
Asif Sharif
06-11-2008 04:15 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
06-11-2008 11:50 AM
06-11-2008 10:54 PM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
That is really sad to read.
It sounds as if the services offered by cmsnmpd were targeted exclusively at SG Cluster Manager,
a product that we don't use.
So there was never any entry point in SG for integrating one's own monitoring solution?
I understand that SNMP may no longer be considered state of the art and may thus be abandoned altogether in future releases.
Nevertheless, such an open interface always offers a relatively easy-to-use hook for extensible monitors like Nagios.
Do you have any idea how else I could monitor the link states of cluster-relevant NICs (apart from my weird linkloop checks)?
Btw, this is the release of SG on the affected cluster.
As this is a production system, I have no chance to upgrade or patch it along the way.
$ swlist|grep -i guard
B3935DA A.11.14 MC / Service Guard
06-12-2008 04:57 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
If a package requires the NIC, ensure that the package configuration has a dependency on the NIC's SUBNET; otherwise Serviceguard will be blind to a network failure and will allow the package to operate without network connectivity (assuming heartbeat traffic is still functioning on at least one network).
To check whether you have a package configured to monitor a subnet, use:
# cmviewconf | grep -i -e "package name" -e "package subnet"
Example output:
package name: P1
package subnet: 16.113.0.0
Edit the package configuration file to add a SUBNET reference to the needed network, then run cmapplyconf on the file to update the cluster binary (cmapplyconf requires the package to be down).
Providing a SUBNET reference for a package causes Serviceguard to fail the package over to the adoptive node if the subnet (primary and standby NICs) is not functioning.
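To spot packages that lack a SUBNET reference without reading the grep output by eye, something like the following might help. It is only a sketch: the function name `packages_without_subnet` is made up here, and it assumes that each package's "package subnet:" lines follow its "package name:" line, as in the example output above.

```sh
#!/usr/bin/sh
# List packages that have no "package subnet" line in cmviewconf-style output.
# Assumption: every package starts with a "package name:" line and its
# subnets follow as "package subnet:" lines before the next package name.
packages_without_subnet() {
    awk -F':[ \t]*' '
        /package name:/ {
            # A new package begins: report the previous one if it had no subnet.
            if (pkg != "" && !has_subnet) print pkg
            pkg = $2
            has_subnet = 0
        }
        /package subnet:/ { has_subnet = 1 }
        END { if (pkg != "" && !has_subnet) print pkg }
    '
}
```

It reads from standard input, so it would be used as `cmviewconf | packages_without_subnet`; empty output means every package references at least one subnet.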
If this is not sufficient, a package RESOURCE based on an EMS monitor could be configured. Try:
# resls /net/interfaces/lan/status
and
# resls /net/interfaces/lan/status/lan0
to verify the monitor can be created.
Then configure a monitor in the package control script.
The package configuration file contains this example:
# Example : RESOURCE_NAME /net/interfaces/lan/status/lan0
# RESOURCE_POLLING_INTERVAL 120
# RESOURCE_START automatic
# RESOURCE_UP_VALUE = running
# RESOURCE_UP_VALUE = online
#
# Means that the value of resource /net/interfaces/lan/status/lan0
# will be checked every 120 seconds, and is considered to
# be 'up' when its value is "running" or "online".
#
06-12-2008 05:35 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
The subnet configuration is already in place
(see below).
However, this didn't prevent it from escaping our notice that the standby link had silently died, until the primary was also hit for an interval long enough to make the node leave the cluster.
I cannot do any intrusive package reconfiguration, such as setting up EMS-backed SG resource monitors,
because this is a production system.
Here are the dumps, stripped of names and addresses:
# cmviewconf|grep -E 'package (name|subnet)'|cut -d: -f1
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
package name
package subnet
# grep -h ^SUBNET /etc/cmcluster/*/*cntl|cut -d= -f1
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[0]
SUBNET[1]
SUBNET[2]
SUBNET[3]
SUBNET[4]
SUBNET[5]
SUBNET[6]
SUBNET[7]
SUBNET[8]
SUBNET[9]
06-12-2008 07:12 AM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
Serviceguard sends lan failure and recovery messages to syslog.
# grep -e fail -e recover /var/adm/syslog/syslog.log| grep lan
Check it periodically and send an alarm email when any NIC has failed that day.
NOTE: I investigated /etc/opt/resmon/lbin/monconfig (EMS resource monitor) but it doesn't have a lan monitor capability.
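A periodic check along these lines could look roughly like the sketch below. It assumes cmcld messages of the form "lanN failed" and "lanN recovered", as in the syslog entries quoted later in this thread; the function names `failed_nics` and `check_lan_syslog` are made up for illustration, and the return codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```sh
#!/usr/bin/sh
# Print every NIC whose most recent cmcld message in the given log is "failed".
# Assumption: messages look like "... cmcld: lan1 failed" or
# "... cmcld: lan1 recovered"; any other wording is ignored.
failed_nics() {
    awk '
        /cmcld:/ {
            for (i = 1; i < NF; i++)
                if ($i ~ /^lan[0-9]+$/ && $(i + 1) ~ /^(failed|recovered)$/)
                    state[$i] = $(i + 1)   # later lines overwrite earlier ones
        }
        END { for (nic in state) if (state[nic] == "failed") print nic }
    ' "$1" | sort
}

# Nagios-style wrapper: returns 0 (OK) if no NIC is down, 2 (CRITICAL) otherwise.
check_lan_syslog() {
    down=$(failed_nics "$1")
    if [ -n "$down" ]; then
        # $down is left unquoted on purpose so the NIC names end up on one line.
        echo "CRITICAL: failed per cmcld:" $down
        return 2
    fi
    echo "OK: no failed NIC reported by cmcld"
    return 0
}
```

A call such as `check_lan_syslog /var/adm/syslog/syslog.log` reports only NICs that failed and have not logged a recovery since, so a short link flap that has already recovered would not raise an alarm.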
06-12-2008 09:26 PM
Re: Configuring MC/SG to send hpmcSGSubnetDown trap
Periodic checking of all syslog files (e.g. syslog.log on HP-UX, messages on Solaris and Linux) is already a standard check that I implement for each new NRPE-enabled host I add to my Nagios config.
So far I've been using single-tag matching (no bloated regex or the like) with the check_log2.pl plugin, looking for occurrences of "vmunix" to catch messages from the kernel.
Thus, I have to admit, "cmcld"-tagged entries like these
May 3 03:23:22 lech cmcld: lan1 failed
May 3 03:23:32 lech cmcld: lan1 recovered
escaped my overly simple filter
(I have adapted the filter in the meantime),
as I didn't know beforehand that a failed NIC isn't necessarily reported by the kernel.
I would much have preferred to be able to catch a trap in addition, even though a UDP datagram is by no means guaranteed to be received and acted on by a manager like my net-snmp snmptrapd/Nagios combo.
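For reference, had such a trap been available, the snmptrapd/Nagios side might have looked roughly like this. It is a sketch only: it assumes the handler is registered through a `traphandle` line in snmptrapd.conf, that snmptrapd feeds it the sender's host name, the transport address and then one "OID VALUE" pair per line on standard input, and that `date +%s` works on the Nagios server. The command file path, the service description and the function name `trap_to_passive_result` are hypothetical.

```sh
#!/bin/sh
# Hypothetical defaults: adjust both to the local Nagios setup.
NAGIOS_CMD_FILE=${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}
SERVICE_DESC=${SERVICE_DESC:-SG_SUBNET}

# Read one trap in snmptrapd traphandle format from stdin and print a
# PROCESS_SERVICE_CHECK_RESULT external command with state 2 (CRITICAL).
trap_to_passive_result() {
    read -r host          # line 1: host name of the sender
    read -r _transport    # line 2: transport address (not used here)
    trap_oid=unknown
    while read -r oid value; do   # remaining lines: "OID VALUE" pairs
        case $oid in
            *snmpTrapOID.0|*1.3.6.1.6.3.1.1.4.1.0) trap_oid=$value ;;
        esac
    done
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;2;trap %s received\n' \
        "$(date +%s)" "$host" "$SERVICE_DESC" "$trap_oid"
}

# When installed as the traphandle program, the script would end with:
# trap_to_passive_result >> "$NAGIOS_CMD_FILE"
```

The host name reported by snmptrapd has to match the host name in the Nagios configuration, and the passive service named by `SERVICE_DESC` has to exist there, otherwise Nagios will discard the result.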