Operating System - HP-UX


 
exec22
Occasional Advisor

SNMP Newbie needs help

Hello all,

I can find my way around HP-UX quite well, but when it comes to monitoring SNMP traffic and making sense of it, I am a total novice. So expect frequent questions from me from now on.

My situation is:

An rp3410 with 2 add-on copper PCI Gbit NIC + 2 Gb fiber-optic HBA combo card (model number A9784A)

One of these interfaces is connected to a port on the core switch that is nailed down to 100 Mb full duplex. My NIC is also nailed down to 100FD, speed- and duplex-wise.

All is well with the system operation and connectivity.

On the monitoring side, I have OpenView NNM running and trapping SNMP events, and every time it detects a state change in any of the network interfaces it is monitoring, it sends out an email. I am currently inundated with "Node is up" messages. According to these messages, the node never goes down, but for some reason it is always coming up. Someone who knows a smidgen more about NNM told me that the interface state may be changing from "UP" to "MARGINAL" and back to "UP" again, and that we may not be getting notifications about the "MARGINAL" state transitions. But this person is not in my group, so is not of much help here. The person who set up NNM is no longer with the company, but I have full access to the server it is running on.

I would like to get to the root of these messages and find the faulty component without trial and error. What can I do?

By the way, there are several other servers (rp3440 and rp4440 machines) with the same interface cards, as well as a few older N and L class machines with older network interfaces, connected to the same core switch on the same blade, and I do not see any messages like this from them.

I already changed the cabling and the port it is connected to, and the result did not change. When I talked to the network guys, they told me that the switch port this fella is connected to has something like 19,000 errors logged, whereas the well-behaved servers have no errors.

Since the connectivity is there most of the time, I am suspicious of the firmware code on this card. But again, the same card is also on the neighboring servers, purchased around the same time frame and they do not have this behavior.

Does anyone have any insight into what might be going on? Any suggestions on how to narrow it down? Thanks in advance. All help will be appreciated, and points will be awarded for valuable responses.
D Block 2
Respected Contributor

Re: SNMP Newbie needs help

It sounds like you are on the right track with the errors and the kernel driver (the driver just might be resetting the card, hence the flood of "back up" messages).


Question: have you also monitored the packets on the network using Glance? Go to Networking, then use the (S) option to select the NIC interface; you might see errors or retransmits. If that is the case, study the lanadmin command and check whether the NIC settings (half or full duplex) match the switch settings (half or full duplex). If the server is at half duplex and the switch is at full, this could cause errors or possibly resets in the driver.

Also, get the kernel gelan driver patches and double-check them; you never know.
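
To make the speed/duplex comparison concrete, here is a minimal Python sketch. It assumes you have already read the two settings by hand (from lanadmin on the server and from the network team on the switch side) and written them in the "100FD" shorthand used in the original post. The function names are hypothetical, and nothing here parses real lanadmin or switch output.

```python
import re


def parse_link_setting(text):
    """Parse shorthand such as '100FD' or '1000 hd' into (speed_mbps, duplex)."""
    match = re.fullmatch(r"\s*(\d+)\s*([FH])D\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized link setting: {text!r}")
    speed = int(match.group(1))
    duplex = "full" if match.group(2).upper() == "F" else "half"
    return speed, duplex


def link_mismatch(nic_setting, switch_setting):
    """Return a list of differences between the NIC and switch port settings."""
    nic_speed, nic_duplex = parse_link_setting(nic_setting)
    sw_speed, sw_duplex = parse_link_setting(switch_setting)
    problems = []
    if nic_speed != sw_speed:
        problems.append(f"speed: NIC {nic_speed} Mb/s vs switch {sw_speed} Mb/s")
    if nic_duplex != sw_duplex:
        problems.append(f"duplex: NIC {nic_duplex} vs switch {sw_duplex}")
    return problems


if __name__ == "__main__":
    # Hypothetical values: a half-duplex server against a full-duplex port.
    for problem in link_mismatch("100HD", "100FD"):
        print(problem)
```

An empty list means both ends agree; anything else is worth raising with whoever manages the switch.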

The firmware as you point out is a good suggestion.

Can you afford some downtime to try pulling all the new cards out and installing just one gelan copper card, then checking for errors on the switch side? If there are no errors, this might point to something in the network configuration when using 2 NICs; if so, maybe something is wrong with your "netconf" file. You might also play around with the "ifconfig" command: set the IP address to 0.0.0.0, then re-plumb or provide a valid IP address. It sounds like h/w or settings on the card. Good luck.
Golf is a Good Walk Spoiled, Mark Twain.
Bob Ingersoll
Valued Contributor

Re: SNMP Newbie needs help

The person that told you that this may be caused by one interface transitioning between states (interface down/interface up) knows what he's talking about. This is the most likely cause of the problem. Simply put, when ALL interfaces on a node have transitioned to a down state this results in NNM generating a Node Down event; when ALL interfaces have transitioned to up NNM generates a Node Up. So a single flapping interface will cause the situation you described.
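
A toy Python model of that logic may help show why only "Node Up" mail arrives. This is a simplified sketch, not NNM's actual implementation: it assumes a node is "Up" when all interfaces are up, "Down" when all are down, and "Marginal" otherwise, and that mail is sent only for the statuses listed in the hypothetical `notify_on` parameter. The interface names are made up.

```python
def node_status(interface_states):
    """Summarize a node from a mapping of interface name -> 'up' or 'down'."""
    states = set(interface_states.values())
    if states == {"up"}:
        return "Up"
    if states == {"down"}:
        return "Down"
    return "Marginal"  # assumption: a mix of up and down interfaces


def node_events(snapshots, notify_on=("Up", "Down")):
    """Return the notifications sent for a sequence of interface snapshots."""
    events = []
    previous = None
    for snapshot in snapshots:
        current = node_status(snapshot)
        # Notify only on a change of status, and only for configured statuses.
        if previous is not None and current != previous and current in notify_on:
            events.append(f"Node {current}")
        previous = current
    return events


if __name__ == "__main__":
    # lan1 flaps twice while lan0 stays up the whole time.
    snapshots = [
        {"lan0": "up", "lan1": "up"},
        {"lan0": "up", "lan1": "down"},
        {"lan0": "up", "lan1": "up"},
        {"lan0": "up", "lan1": "down"},
        {"lan0": "up", "lan1": "up"},
    ]
    print(node_events(snapshots))
```

In this model the node never reaches "Down", so with "Marginal" left out of `notify_on` every flap of a single interface produces only a "Node Up" notification.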

I suspect that the problem is a configuration problem rather than an actual interface problem.

Make sure that in DNS ALL addresses for a node resolve to the same name; also ensure that host names refer to only one interface.
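
One way to check the first point is a short Python sketch that does a reverse lookup on each of the node's addresses and compares the names. The function names are hypothetical and the addresses in the usage example are documentation placeholders, not real ones; substitute the node's actual interface addresses. The `resolve` parameter exists only so the check can be exercised without a live DNS server.

```python
import socket


def reverse_names(addresses, resolve=socket.gethostbyaddr):
    """Map each address to the primary name from a reverse lookup (None on failure)."""
    names = {}
    for address in addresses:
        try:
            names[address] = resolve(address)[0].lower()
        except OSError:  # socket.herror and socket.gaierror are subclasses
            names[address] = None
    return names


def resolves_consistently(addresses, resolve=socket.gethostbyaddr):
    """True only if every address resolves, and all resolve to the same name."""
    names = reverse_names(addresses, resolve)
    return None not in names.values() and len(set(names.values())) == 1


if __name__ == "__main__":
    # Placeholder addresses (192.0.2.0/24 is reserved for documentation).
    node_addresses = ["192.0.2.10", "192.0.2.11"]
    for address, name in reverse_names(node_addresses).items():
        print(address, "->", name)
    print("consistent:", resolves_consistently(node_addresses))
```

If the addresses come back with different names, or some fail to resolve at all, that is the kind of DNS inconsistency described above.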

exec22
Occasional Advisor

Re: SNMP Newbie needs help

thanks