
vlad manea
Occasional Advisor

HP Procurve 2900-48 - High collision or drop rate

Hi,

I have a Linux cluster based on Rocks 5.1, and the switch is an HP ProCurve 2900 with 48 ports. On one of the ports (3), where I have plugged in a compute node, I got this warning: "High collision or drop rate". The switch status displayed is Non-critical. I reinstalled the compute node, and I see that all my nodes carry roughly the same load in a parallel job (including the node on port 3).
What should I do in this case? It looks like the warning message is still there...

Thanks,
Vlad
_______
RonniDK
Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Have you checked speed/duplex?

# show interfaces brief

vlad manea
Occasional Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Yes, I checked and this is the output:

...
3 100/1000T | No Yes Up 1000FDx MDIX on 0
...
However, I see 83,778,975 Tx drops on this port, compared with ~2,000,000 on the other ports.
Any idea how to solve this problem?

Thanks,
Vlad
RonniDK
Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Do you have drops on all your ports?

I have a similar case open with HP regarding seven 2900-48G switches, all of which have drops.

I would very much like to hear about your setup, if you do have drops on all your ports.
vlad manea
Occasional Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Well, I had drops on all ports, but on port 3 they were 40 times higher. I had to reboot the switch today because I was starting to get a lot of errors. I started a new MPI job on all the compute nodes, and after 24 hours the switch is still OK.
I looked at "show interfaces" and see that Drops Tx is 0 on all ports. I have no idea what caused the previous behavior; maybe a very large parallel job with lots of MPI communication between the nodes.
vlad manea
Occasional Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

The HPC setup I have is the following:
5 compute nodes (Dell PE SC1425, 2x2 processors, 2.6 GHz, 8 GB RAM)
1 head node (Dell PE 2970, 2x2 processors, 2.8 GHz, 8 GB RAM)
1 switch (HP ProCurve 2900-48)
All the compute nodes and the head node are hooked up into the switch (I am using only one NIC on each machine, so for the moment I am using only 6 ports out of 48).
RonniDK
Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Okay...

We have a setup in which each computer uses bonding, and nearly every switch is full.

Can you send me the output of "show interfaces"?

vlad manea
Occasional Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

There you go:

ProCurve Switch 2900-48G# show interfaces

Status and Counters - Port Counters

                                                              Flow  Bcast
Port  Total Bytes    Total Frames   Errors Rx    Drops Tx     Ctrl  Limit
----- -------------- -------------- ------------ ------------ ----- ------
1     558,337,029    18,584,801     0            0            on    0
2     493,880,984    2,951,289,212  0            0            on    0
3     322,075,451    2,948,338,084  0            7            on    0
4     4,126,074,570  2,941,405,689  0            1            on    0
5     3,394,545,464  1,483,089,562  0            0            on    0
6     3,282,434,817  1,480,859,094  0            0            on    0
7     0              0              0            0            off   0
8     0              0              0            0            off   0
9     0              0              0            0            off   0
10    0              0              0            0            off   0
11    0              0              0            0            off   0
12    0              0              0            0            off   0
13    0              0              0            0            off   0
14    0              0              0            0            off   0
15    0              0              0            0            off   0
16    0              0              0            0            off   0
17    0              0              0            0            off   0
18    0              0              0            0            off   0
19    0              0              0            0            off   0
20    0              0              0            0            off   0
21    0              0              0            0            off   0
22    0              0              0            0            off   0
23    0              0              0            0            off   0
24    0              0              0            0            off   0
25    0              0              0            0            off   0
26    0              0              0            0            off   0
27    0              0              0            0            off   0
28    0              0              0            0            off   0
29    0              0              0            0            off   0
30    0              0              0            0            off   0
31    0              0              0            0            off   0
32    0              0              0            0            off   0
33    0              0              0            0            off   0
34    0              0              0            0            off   0
35    0              0              0            0            off   0
36    0              0              0            0            off   0
37    0              0              0            0            off   0
38    0              0              0            0            off   0
39    0              0              0            0            off   0
40    0              0              0            0            off   0
41    0              0              0            0            off   0
42    0              0              0            0            off   0
43    0              0              0            0            off   0
44    0              0              0            0            off   0
45    0              0              0            0            off   0
46    0              0              0            0            off   0
47    0              0              0            0            off   0
48    383,366,496    4,161,559      0            0            off   0
A1    0              0              0            0            off   0
A2    0              0              0            0            off   0
A3    0              0              0            0            off   0
A4    0              0              0            0            off   0

ProCurve Switch 2900-48G#
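
For anyone who wants to watch these counters over time rather than reading the table by eye, here is a minimal Python sketch that parses the text of a "show interfaces" dump like the one above and lists the ports with non-zero Drops Tx. The names ROW, parse_counters, ports_with_drops, and SAMPLE are made up for illustration, and the row layout is assumed to match the table in this post (port, total bytes, total frames, errors Rx, drops Tx, flow control, broadcast limit); other firmware versions may print different columns.

```python
import re

# One data row of the port counters table, e.g.
# "3 322,075,451 2,948,338,084 0 7 on 0"
# Assumption: columns are exactly those shown in this thread's output.
ROW = re.compile(
    r"^\s*(?P<port>[A-Za-z]?\d+)\s+(?P<total_bytes>[\d,]+)\s+(?P<total_frames>[\d,]+)"
    r"\s+(?P<errors_rx>[\d,]+)\s+(?P<drops_tx>[\d,]+)"
    r"\s+(?P<flow_ctrl>on|off)\s+(?P<bcast_limit>\d+)\s*$"
)


def to_int(value):
    """Convert a counter such as '2,951,289,212' to an int."""
    return int(value.replace(",", ""))


def parse_counters(text):
    """Return one dict per port row; header and prompt lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        rows.append({
            "port": match.group("port"),
            "total_bytes": to_int(match.group("total_bytes")),
            "total_frames": to_int(match.group("total_frames")),
            "errors_rx": to_int(match.group("errors_rx")),
            "drops_tx": to_int(match.group("drops_tx")),
            "flow_ctrl": match.group("flow_ctrl"),
        })
    return rows


def ports_with_drops(rows, threshold=0):
    """List (port, drops_tx) for ports whose Drops Tx exceeds the threshold."""
    return [(r["port"], r["drops_tx"]) for r in rows if r["drops_tx"] > threshold]


# A few rows copied from the output in this post.
SAMPLE = """\
Port Total Bytes Total Frames Errors Rx Drops Tx Ctrl Limit
----- -------------- -------------- ------------ ------------ ----- ------
1 558,337,029 18,584,801 0 0 on 0
2 493,880,984 2,951,289,212 0 0 on 0
3 322,075,451 2,948,338,084 0 7 on 0
4 4,126,074,570 2,941,405,689 0 1 on 0
48 383,366,496 4,161,559 0 0 off 0
"""

if __name__ == "__main__":
    for port, drops in ports_with_drops(parse_counters(SAMPLE)):
        print(f"port {port}: {drops} Tx drops")
```

Running it on the sample rows reports ports 3 and 4, which matches what the table shows after the reboot. How the text gets captured from the switch (console log, SSH session, and so on) is left out on purpose.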
RonniDK
Advisor

Re: HP Procurve 2900-48 - High collision or drop rate

Have you reset the statistics on the switch? According to the "show interfaces" output, you don't have many drops.

As I understand your previous post:
"However, I see 83,778,975 Tx drops on this port, compared with ~2,000,000 on the other ports.
Any idea how to solve this problem?"

You have over 83 million drops on port 3 and around 2 million drops on every other port. Is that correct?
Matt Hobbs
Honored Contributor

Re: HP Procurve 2900-48 - High collision or drop rate

It looks healthy now. Drops simply occur whenever a buffer overflows. For example, you had 2 Gbit/s coming in for a 1 Gbit/s port, the switch couldn't buffer it all, and so packets were dropped. All you can really do is keep an eye on it. If it keeps occurring, maybe you need to set up an aggregated port group (trunk) for the particular servers that are affected.
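
To make the buffer-overflow point concrete, here is a rough Python sketch of a single egress port modeled as a fixed-size queue. It is only a toy model under stated assumptions: the function name simulate_egress, the tick-based timing, and the buffer size of 10 packets are all invented for illustration and say nothing about how the 2900 actually allocates its buffers.

```python
def simulate_egress(arrivals_per_tick, drain_per_tick, buffer_size, ticks):
    """Toy model of one egress port with a finite buffer.

    Each tick, `arrivals_per_tick` packets arrive for the port and up to
    `drain_per_tick` packets are transmitted. Packets that do not fit in
    the buffer are counted as drops.
    Returns (sent, dropped, still_queued).
    """
    queued = 0
    dropped = 0
    sent = 0
    for _ in range(ticks):
        # Accept as many arrivals as the remaining buffer space allows.
        space = buffer_size - queued
        accepted = min(arrivals_per_tick, space)
        dropped += arrivals_per_tick - accepted
        queued += accepted

        # Transmit at the port's line rate.
        out = min(queued, drain_per_tick)
        queued -= out
        sent += out
    return sent, dropped, queued


if __name__ == "__main__":
    # Twice the line rate arriving (the "2 Gig into a 1 Gig port" case).
    print(simulate_egress(arrivals_per_tick=2, drain_per_tick=1,
                          buffer_size=10, ticks=100))
    # Arrivals equal to the line rate: the buffer never fills.
    print(simulate_egress(arrivals_per_tick=1, drain_per_tick=1,
                          buffer_size=10, ticks=100))
```

With arrivals at twice the drain rate, the buffer fills after a few ticks and from then on roughly every second packet is dropped, no matter how large the buffer is; a bigger buffer only delays the point where drops start. With arrivals equal to the drain rate there are no drops at all, which is why spreading the traffic over an aggregated port group can help.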