Re: "Device do not respond to ping packet " alarm overflow

 
SOLVED
_OkaY_
Regular Visitor

"Device do not respond to ping packet " alarm overflow

Hi everybody,

I am running IMC 7.0 and I love this software, but I have a problem that is becoming more and more annoying...

I have 120 switches and routers directly managed by IMC. It works fine, but every day I get a lot (around 50 to 60) of "Device do not respond to ping packet " alarms. All my switches are OK, though: when I log into them, there is no problem.

So why does IMC report this alarm?

I think a ping packet was lost. Is it possible to change the IMC configuration to send fewer ping packets, or to change the interval between two pings?

Hoping you can help me.

Regards,

Okay

 

 

9 REPLIES
AlexisKoch
Advisor

Re: "Device do not respond to ping packet " alarm overflow

Hello,

I don't think changing the interval will do what you want: if the device does not respond, you will get the alarm. I had the same problem with remote routers. I ended up filtering the alarm and working with the "device unreachability proportion" instead...

 

Regards,

Alexis

 

LindsayHill
Honored Contributor

Re: "Device do not respond to ping packet " alarm overflow

The default settings are to try 3 ping packets, with a timeout of 2s. You will only get a critical alarm if all 3 packets have failed.
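
To get a feel for how rarely this should fire on a healthy LAN, here is a minimal Python sketch. It assumes each ping is lost independently with the same probability, which real networks do not guarantee (loss often comes in bursts), and the function name is made up for illustration; it is not part of IMC.

```python
def all_pings_lost_probability(loss_rate: float, attempts: int = 3) -> float:
    """Chance that every one of `attempts` pings is lost, assuming independent loss."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return loss_rate ** attempts


if __name__ == "__main__":
    # Illustrative loss rates only; none of these were measured in this thread.
    for loss in (0.01, 0.10, 0.50):
        p = all_pings_lost_probability(loss)
        print(f"loss {loss:.0%}: all 3 pings lost with probability {p:.6f}")
```

Under that assumption, even a fairly lossy path should only rarely lose three pings in a row, so frequent alarms point to bursts of loss rather than the odd dropped packet.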

You can change the system defaults at System -> System Configuration -> System Settings.

Losing 3 ping packets in a row is a lot. Are you sure that you don't have network issues? How are those devices connected - are they at the end of remote WAN links? You probably do have connectivity issues that need investigating, especially if you have a lot of these alerts.

_OkaY_
Regular Visitor

Re: "Device do not respond to ping packet " alarm overflow

First of all, thank you for your answers.

To answer your questions, my network is a LAN, not a WAN. My devices are HP 2620/3600 switches and 5500 routers.

I was surprised by the number of packets lost from IMC, because when I run ping manually from another server (Nagios, by the way), no echoes are lost. I am now capturing the traffic between my IMC server and a switch with Wireshark to look at the echo request/reply packets and see what happens, but I'm not sure I will find anything.
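
A manual check like this can also be scripted so that it runs from the IMC server itself rather than from another host. The sketch below is only an illustration: it assumes a Linux machine with the iputils `ping` (`-c` for count, `-W` for the per-reply timeout in seconds) and its usual summary line, and the function names are hypothetical. On a Windows server the flags and the output format differ.

```python
import re
import subprocess

# Matches the summary line printed by the Linux iputils ping,
# e.g. "10 packets transmitted, 9 received, 10% packet loss, time 9012ms".
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output: str) -> tuple[int, int]:
    """Return (sent, received) from ping output, or raise ValueError."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2))


def measure_loss(host: str, count: int = 20, timeout_s: int = 2) -> float:
    """Ping `host` and return the fraction of echo requests that got no reply."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        capture_output=True, text=True, check=False,
    )
    sent, received = parse_ping_summary(result.stdout)
    return (sent - received) / sent if sent else 1.0
```

Calling `measure_loss("192.0.2.10")` (a placeholder address) at intervals from the IMC server and from the Nagios server would show whether the loss depends on where the ping originates.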

I looked at the statistics of the IMC server's network interface towards the corresponding switch: there is no packet loss, and the average throughput is 1 Mbit/s (1 Gbit/s available).

I'll try reducing the number of pings sent via the system configuration settings, but not by too much because I don't want to lose reliability. I'll also work with the device unreachability proportion; it's a good idea.

I'll post my observations here later!

Thanks. If you have any other ideas, don't hesitate :D

LindsayHill
Honored Contributor

Re: "Device do not respond to ping packet " alarm overflow

Thanks for the additional information. 

If those devices are on a LAN, you should have very little packet loss. Something is clearly not right. I would start with the usual problems - switchport errors, duplex mismatch, etc.

Rather than decrease the number of pings sent, you could increase it. I would keep investigating that packet loss though. If that server has a lot of packet loss, maybe your users have the same problem.

Do you see alarms for ALL your managed devices, or just some of them? Is there a pattern?

_OkaY_
Regular Visitor

Re: "Device do not respond to ping packet " alarm overflow

Thanks for your answer.

Unfortunately, I don't see any pattern in these alarms; they can come from any switch/router at any time.

Yesterday I increased the number of retries under System Configuration > System Settings > Ping configuration from 3 to 5 and increased the timeout from 30 to 60. Result: fewer alarms, down from 50/60 per day to 10.

With retry = 60 and timeout = 1, I immediately get 30 alarms (expected, I presume).

With retry = 20 and timeout = 60, I get just one alarm (but still an alarm :/ ).

When I connect to a switch that raised an alarm, CPU and memory usage are fine (around 10 to 20 %), and there is no packet loss, no switchport errors, and no duplex mismatch. (Also, I don't receive any traps for these errors.)

 

My IMC server is installed on a virtual machine (ESX). Could this explain why I am losing some packets? What is the optimal configuration for the virtual network interface on the VM that hosts IMC?

Regards,

OkaY

LindsayHill
Honored Contributor

Re: "Device do not respond to ping packet " alarm overflow

I would not expect to see SNMP traps for errors. Look at your counters instead.
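
As a rough illustration of what "look at your counters" can mean in practice: read the interface counters twice, a few minutes apart, and see which ones moved. The Python sketch below assumes you already have the readings as plain numbers (how you collect them is up to you), uses the standard IF-MIB names `ifInErrors`, `ifInDiscards`, and `ifOutDiscards` as dictionary keys, and treats them as 32-bit counters that may wrap once between readings. The helper names and the sample values are made up.

```python
COUNTER32_MODULUS = 2 ** 32  # Counter32 values wrap back to 0 after 2**32 - 1


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Increase between two readings of a wrapping counter (at most one wrap)."""
    return (current - previous) % modulus


def changed_counters(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return only the counters that increased between the two snapshots."""
    deltas = {name: counter_delta(before[name], after[name]) for name in before}
    return {name: delta for name, delta in deltas.items() if delta > 0}


if __name__ == "__main__":
    # Made-up readings for one port, taken a few minutes apart.
    before = {"ifInErrors": 120, "ifInDiscards": 4_294_967_290, "ifOutDiscards": 0}
    after = {"ifInErrors": 120, "ifInDiscards": 15, "ifOutDiscards": 0}
    print(changed_counters(before, after))
```

A large absolute value on its own says little, since it may have accumulated over months; it is the change between two readings that shows whether a port is dropping packets right now.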

Retry 20 / timeout 60 is a very long time. If you still have alarms at that level, you have serious network issues going on somewhere. You must have enormous amounts of packet loss.
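
For scale, here is the arithmetic as a small Python sketch. It assumes the timeout is in seconds, that "retry" is the total number of attempts, and that each attempt waits out its full timeout before the next one is sent. None of that is confirmed in this thread, and the helper name is hypothetical.

```python
def worst_case_wait_seconds(retries: int, timeout_s: float) -> float:
    """Longest time spent waiting before giving up, if every attempt times out."""
    return retries * timeout_s


if __name__ == "__main__":
    # Settings mentioned in this thread: the defaults, then the two larger ones.
    for retries, timeout_s in ((3, 2), (5, 60), (20, 60)):
        wait = worst_case_wait_seconds(retries, timeout_s)
        print(f"retry={retries:>2} timeout={timeout_s:>2}s -> up to {wait / 60:.1f} min")
```

Under those assumptions, retry 20 / timeout 60 means a device could be silent for up to 20 minutes before the alarm is raised, compared with 6 seconds at the defaults.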

Have a look at the performance stats for your hypervisor. Anything going on there?

What about IMC's latency reports for devices. Do you see anything unusual there?

NeilR
Esteemed Contributor

Re: "Device do not respond to ping packet " alarm overflow


My IMC server is installed on a virtual machine (ESX). Could this explain why I am losing some packets? What is the optimal configuration for the virtual network interface on the VM that hosts IMC?

I run IMC on VMware. I have never had the type of issue that you are having, and I agree with Lindsay that something is wrong. Most of my stuff is on the local LAN, but even switches located across fixed IPsec tunnels over the public internet do not have those issues.

In our ESX configuration, it's installed along with other servers on the server subnet. It uses a VMXNET 3 adapter, 16 GB RAM, and 6 CPUs. The host has 196 GB RAM and 40 logical CPUs. The only minor issue with this configuration is that SQL latency to the VMDK on the storage system can be a bit higher than with direct-attached storage. That doesn't affect the basic tasks.


 

_OkaY_
Regular Visitor
Solution

Re: "Device do not respond to ping packet " alarm overflow

Hi everybody and thanks for your answers !

Retry 20 / timeout 60 is a very long time. If you still have alarms at that level, you have serious network issues going on somewhere. You must have enormous amounts of packet loss.

I agree with you and, to my surprise, this morning I have no more alarms with this configuration. So I lowered the parameters to retry 15 / timeout 45 to see what happens. At the same time, I checked the switches that are connected to my ESX host and found something: apparently a lot of packets are being dropped due to insufficient bandwidth on the link between my switch and my router... I'll have to redesign my network architecture :/
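
For anyone wanting to check a suspect link the same way, average utilization can be estimated from two readings of an octet counter (for example the 64-bit IF-MIB counter `ifHCOutOctets`). This Python sketch is only an illustration: the function name and the sample figures are made up, and it assumes the counter does not wrap between the two readings.

```python
def link_utilization(octets_before: int, octets_after: int,
                     interval_s: float, link_speed_bps: float) -> float:
    """Average utilization (0.0 to 1.0) of one direction of a link over the interval."""
    if interval_s <= 0 or link_speed_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_sent = (octets_after - octets_before) * 8
    return bits_sent / (interval_s * link_speed_bps)


if __name__ == "__main__":
    # Made-up example: 30 GB sent in 5 minutes over a 1 Gbit/s uplink.
    utilization = link_utilization(0, 30_000_000_000, 300, 1_000_000_000)
    print(f"average utilization: {utilization:.0%}")
```

Keep in mind that an average over several minutes hides short bursts, so a link can still drop packets even when the average looks moderate.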

Have a look at the performance stats for your hypervisor. Anything going on there?

What about IMC's latency reports for devices. Do you see anything unusual there?

The VMware hypervisor shows that my IMC server is working fine. CPU, memory, and network usage have been at approximately 50 % over the last month.

The IMC latency reports show that my switches and routers have a response time of less than 4 ms.

In our ESX configuration, it's installed along with other servers on the server subnet. It uses a VMXNET 3 adapter, 16 GB RAM, and 6 CPUs. The host has 196 GB RAM and 40 logical CPUs.

My server has a VMXNET 3 adapter, but only 8 GB RAM and 2 CPUs. Is this a problem? The host has an identical amount of CPU/RAM.

Another idea is to upgrade IMC from version 7.0 to 7.2. Do I need to increase the memory/CPU first?

Anyway, thank you very much for helping me !

NeilR
Esteemed Contributor

Re: "Device do not respond to ping packet " alarm overflow

I've seen RAM usage above 12 GB, but right now it's running at 8.9. I'd think 12 GB might be better. CPU activity can get high, and it looks like it's threaded enough to use multiple processors, so I'd go to 4. I think 2 is too low.

As to whether this is causing the problem you mention, I can't say for sure. Check the resource usage in the deployment management tool.