SOLVED
dmcdonough
Occasional Advisor

Major Network Issues

Over the past week I have been experiencing network issues.

Overall network slowness, packet drops, and RDP connection drops.

I found that when I reboot my main core switch, it fixes everything for a few hours before the issues return.

I have not made any network or server changes that would create these issues.

I have an HP ProCurve layer 3 switch as my core switch. All other switches are HP ProCurve (10 total).

I have worked with HP to check logs. We have enabled spanning-tree on all of my switches (I only have 10 total).

Even after all of that, the issues are still coming back.

Now, from what I understand, if I had a loop on my network, spanning-tree should disable the ports where the possible loop is and not cause mass issues across the entire network.

I am assuming that I don't have a loop and that it must be something else causing havoc on my network.

I have also had Microsoft on my servers to check DHCP, DNS, and AD replication on all my domain controllers.

Any suggestions?

20 REPLIES
dmcdonough
Occasional Advisor

Re: Major Network Issues

The core switch is an HP ProCurve E5406 zl (J8697A) running firmware version K.16.02.0012.

All other switches are a mix of HP ProCurve 4104gl and 4108gl switches. We updated the firmware on all downline switches before we enabled spanning-tree.

 

I have a continuous ping running from my workstation to a server on the other side of my core switch. The ping is fine for a few hours, then it starts to drop packets and the response time grows to 400-500 ms, sometimes peaking even higher. This is when the entire network starts to have drops and slowness. IP phone calls between my 2 buildings also start to have call quality issues. I reboot the core switch and things seem OK again for a few hours, with the ping back to normal, but the problem eventually returns.

 

 

parnassus
Honored Contributor

Re: Major Network Issues

Hard to say without any other evidence (logical/physical network topology? configuration files? logs of the switches involved? relevant hosts and services? switching/routing configurations? etc.).

AFAIK the current software version is K.16.02.0016.

But without any correlated logs (especially from during the slowdown) it's really difficult to work out the cause (or causes) of your random network slowness. The culprit might not be the core switch itself; it could be a service or a host that is impacting your core switch's switching/routing capabilities, acting like a sort of DoS ("Denial of Service").
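
As a rough sketch of what to capture while the slowdown is in progress (before rebooting), something like the following from the core switch's CLI could help. These are standard ProVision show commands as far as I know, but the output differs between software versions, so treat it as a starting point rather than a complete checklist. Lines starting with `;` are annotations, not commands.

```
; most recent event log entries first
show logging -r
; CPU utilization of the management module
show cpu
; link state, speed and mode of every port
show interfaces brief
; root bridge, topology change count and time since last change
show spanning-tree
```

Comparing this against the same output taken right after a reboot should show whether CPU load, port problems or spanning-tree topology changes line up with the ping degradation.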


I'm not an HPE Employee
dmcdonough
Occasional Advisor

Re: Major Network Issues

I have a ticket open with HP. We have been working on this for the past 3 days: updated all switch firmware and enabled spanning-tree across all switches, making the core switch the root with priority 0.

 

The HP tech recommended keeping the firmware where it is even though there is a newer version, as they reported that the version I have is the most stable, known-good firmware for this model of switch.

 

The switch is currently under warranty.

 

I can confirm that, other than turning spanning tree on, no other changes have been made on the core switch that could cause these issues.

 

Software revision K.16.02.0012

 


Running configuration:

; J8697A Configuration Editor; Created on release #K.16.02.0012
; Ver #0e:01.30.02.34.5f.2c.6b.ff.f7.fc.7f.ff.3f.ef:b5
hostname "CASD-MAIN-01"
module 1 type j9549a
module 2 type j9549a
module 6 type j9538a
trunk A24,B21 trk1 trunk
timesync sntp
sntp unicast
sntp server priority 1 10.0.4.13
time daylight-time-rule continental-us-and-canada
time timezone -300
ip route 0.0.0.0 0.0.0.0 10.0.2.5
ip routing
interface F8
   lacp active
   name "VDI"
   exit
snmp-server enable traps mac-notify
snmp-server enable traps startup-config-change
snmp-server enable traps running-config-change
snmp-server enable traps mac-count-notify
snmp-server contact "D MCDONOUGH" location "TECH OFFICE LEFT E5406"
vlan 1
   name "DEFAULT_VLAN"
   no untagged A2-A6,A13,A15,A20,B2,B6-B11,B13-B14,B16-B18,F4-F5,F7-F8
   untagged A7-A9,A11,A14,B1,B3-B5,B12,B15,Trk1
   tagged A1,A10,A12,A16-A19,A21-A23,B19-B20,B22-B24,F1-F3,F6
   ip address 192.168.11.1 255.255.0.0
   ip helper-address 10.0.4.13
   ip helper-address 10.0.4.20
   exit
vlan 2
   name "VL_2_Firewall"
   untagged A10,A12
   tagged A1,A20,B19
   ip address 10.0.2.2 255.255.255.0
   exit
vlan 4
   name "VLAN_4_SERVERS"
   untagged A2-A6,A15-A19,B2,B6-B8,B16-B18,F3-F5,F7-F8
   ip address 10.0.4.1 255.255.254.0
   ip helper-address 10.0.4.13
   exit
vlan 8
   name "VLAN_8_PRINTERS"
   tagged A1-A6,B20,B22-B24,F5,Trk1
   ip address 10.0.8.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 12
   name "VLAN_12_WAPS"
   untagged B11
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.12.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 16
   name "VLAN_16_ADMIN_WIFI"
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B11,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.16.1 255.255.252.0
   ip helper-address 192.168.10.51
   ip helper-address 10.0.4.13
   exit
vlan 20
   name "VLAN_20_STAFF_WIFI"
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B11,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.20.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 24
   name "VLAN_24_STUDENT_WIFI"
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B11,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.24.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 28
   name "VLAN_28_GUEST_WIFI"
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B11,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.28.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 32
   name "VLAN_32_HIGH_SCHOOL"
   tagged A1,B22
   ip address 10.0.32.1 255.255.252.0
   ip helper-address 10.0.4.13
   ip helper-address 10.0.4.20
   exit
vlan 36
   name "VLAN_36_ELEMENTARY_SCHOOL"
   ip address 10.0.36.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 40
   name "VLAN_40_ADMIN_OFFICE"
   untagged A13
   tagged A1
   ip address 10.0.40.1 255.255.252.0
   ip helper-address 10.0.4.13
   ip helper-address 10.0.4.20
   exit
vlan 44
   name "PRESSBOX"
   tagged Trk1
   ip address 10.0.44.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 48
   name "VLAN_48_HVAC"
   tagged B20
   ip address 10.0.48.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 52
   name "VLAN_52_SECURITY_CAMS"
   tagged B19-B20,B22-B24,Trk1
   ip address 10.0.52.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 56
   name "VLAN_56_FOB_DOOR_ACCESS"
   tagged B20
   ip address 10.0.56.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 60
   name "VLAN_60_PHONE_SYSTEM"
   untagged B13-B14
   tagged A15-A19,F3,Trk1
   ip address 10.0.60.1 255.255.252.0
   qos dscp 46
   voice
   exit
vlan 64
   name "CAFE_POS"
   tagged A1,A4,A10,A15-A19,A21-A23,B3-B5,B7-B8,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.64.1 255.255.252.0
   ip helper-address 10.0.4.13
   ip helper-address 10.0.4.20
   exit
vlan 100
   name "FRIENDSHIPHOUSE"
   tagged A1,A4,A10,A21-A23,B3-B5,B7-B8,B19-B20,B22-B24,F1-F6,Trk1
   ip address 10.0.100.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 150
   name "VDI_MNGT"
   untagged B10
   tagged B9,F8
   ip address 10.0.150.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 160
   name "VDI_DESKTOPS"
   tagged B9,F8
   ip address 10.0.160.1 255.255.252.0
   ip helper-address 10.0.4.13
   exit
vlan 200
   name "VLAN_200_FIREWALL"
   no ip address
   ip helper-address 10.0.4.13
   exit
spanning-tree
spanning-tree Trk1 priority 4
no spanning-tree bpdu-throttle
spanning-tree config-name "CPSD"
spanning-tree config-revision 1
spanning-tree pathcost mstp 8021d
spanning-tree priority 0 force-version rstp-operation
no autorun
no dhcp config-file-update
no dhcp image-file-update
device-profile name "default-ap-profile"
   cos 0
   exit
password manager

 

parnassus
Honored Contributor
Solution

Re: Major Network Issues

If I were you I would give K.16.02.0016 a try (or, if you don't feel brave enough [*], at least K.16.02.0014): starting with the K.16.02.0014 software build, the TCP Push Preserve mode [**] parameter is disabled by default (previously it was enabled by default).

Having TCP Push Preserve mode disabled can help in very busy networks (very busy interfaces), so at least check its status. Very busy networks are easy to end up with once you have many concurrent 10Gbps (up)links to servers and/or other switches. If I'm correct, you have slot F equipped with an HPE 8-port 10GbE SFP+ v2 zl Module (J9538A), so a look at the traffic on ports F1-F8 would help (also keep an eye on overprovisioning [***] of the 10Gbps ports).

[*] IMHO choosing between K.16.02.0012, 0014 or 0016 is not particularly relevant to the "most stable software" argument: in all cases you're on the latest K.16.02 software branch (and not a maintenance one).

[**] From latest Release Notes:

"The TCP Push Preserve mode determines the queuing of the TCP packets that have the PUSH flag set. When this mode is enabled and the egress queue is full, TCP packets with the PUSH flag set are queued at the head of the ingress queue for egress queue space. This may delay subsequent incoming packets in the same queue and create a head-of-line blocking situation. When this mode is disabled and the egress queue is full, TCP packets with the PUSH flag set are dropped from the head of the ingress queue. If the current switch TCP Push Preserve mode has been set to DISABLED, it will be preserved as DISABLED and the corresponding configuration entries will be suppressed. If the current switch TCP PUSH preserve mode has been set to ENABLED, it will be changed to DISABLED and the change will be noted in system event logs as The tcp-push-preserve feature was disabled. This is a change to default configuration. The CLI command show tcp-push-preserve indicates the status of TCP push mode ENABLED/DISABLED. CLI command [no] tcp-push-preserve changes the status of TCP push mode."

[***] read this (could be of help, in any case).
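
A minimal sketch of checking and changing the TCP Push Preserve setting by hand, using only the two commands named in the release notes quoted above. `configure`, `exit` and `write memory` are the usual ProVision steps to enter the config context, leave it and save. Lines starting with `;` are annotations, not commands.

```
; show whether TCP push preserve is currently ENABLED or DISABLED
show tcp-push-preserve
; disable it from the global configuration context
configure
no tcp-push-preserve
exit
; confirm the new state, then save
show tcp-push-preserve
write memory
```

Whether disabling it manually on K.16.02.0012 behaves exactly like the new default in K.16.02.0014 is an assumption; the release notes only state that the command exists and that the default changed.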


dmcdonough
Occasional Advisor

Re: Major Network Issues

Thanks, I am going to give this a shot.

 

Quick question. After looking at some other forums, I have read that even with spanning tree turned on, it will not block a loop on a small unmanaged 5-port switch. I have a bunch of small unmanaged 5-port switches in my environment, but all of them are uplinked to one of my managed HP switches, which are now configured with spanning tree.

If this is true, then I think I should take a walk around and see if I can find any loops on these small unmanaged switches.

 

Thanks

 

parnassus
Honored Contributor

Re: Major Network Issues

What about:

trunk A24,B21 trk1 trunk

That's a static (non-protocol) port trunk with port 24 (slot A module) and port 21 (slot B module) as member ports. Where does it go? And why static (i.e. not LACP, IEEE 802.3ad)?

Then (I'm a little bit puzzled here):

interface F8
   lacp active
   name "VDI"
   exit

This is a bit of nonsense (a single port set as LACP active?): there is no trunk assignment here (I mean: no LAG). What's the purpose of setting "LACP Active" on a single port, port 8 of the module in slot F (6), if LACP is meant for port trunking (LAG) configurations?
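
To see how the switch currently treats both of these, a quick check along the following lines could help. This is only a sketch: `show trunks` and `show lacp` are standard ProVision commands, while the last line is a commented-out illustration of what an LACP version of Trk1 would look like, and it assumes the switch at the far end is also configured for LACP. Lines starting with `;` are annotations, not commands.

```
; list trunk groups, member ports and type (Trunk = static, LACP = 802.3ad)
show trunks
; show per-port LACP state, including F8
show lacp
; hypothetical change, config context only, affects both inter-building links:
; trunk A24,B21 trk1 lacp
```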

I guess you're using a single 10Gbps port (F8) to connect to a virtual machine's OS on a hypervisor, or to an OS installed directly on a bare-metal desktop server machine (in both cases through a physical 10Gbps NIC port on a server host), to provide VDI (Virtual Desktop Infrastructure) connectivity. Is that right?


16again
Respected Contributor

Re: Major Network Issues

Is switch ARP table growing out of bounds?

AFAIK, creating a loop on an unmanaged switch should also trigger STP, since the BPDUs are looped back as well.
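
If the BPDUs do not make it back (some unmanaged switches drop them), ProVision switches also have a separate loop-protect feature aimed at this case. A minimal sketch, assuming the edge switch firmware supports it (not verified for the 4104gl/4108gl) and using a made-up range A1-A24 for the access ports. Lines starting with `;` are annotations, not commands.

```
; config context on an edge switch; A1-A24 is a hypothetical access-port range
configure
loop-protect A1-A24
exit
; list protected ports and any loops detected so far
show loop-protect
```

Loop protection is normally applied to edge ports facing unmanaged switches or end devices, not to the uplinks between managed switches.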

dmcdonough
Occasional Advisor

Re: Major Network Issues

I have two 1 gig fiber connections between both buildings. I have combined them in trk1 to improve the inter-building connection. I had the HP tech check trk1, and they verified that it looked fine.

The other port, F8 (VDI), is a 10 gig connection to my Dell VDI switch, which connects to my VDI host hypervisor cluster. I set up LACP now because I will be adding a second 10 gig connection soon, when I add a second Dell VDI switch for redundancy (per my Dell tech, who helped configure the Dell VDI infrastructure side of my network).

So right now I only have F8, but I will be adding a 2nd 10 gig connection to the mix very soon, hence the LACP config on F8.

I have 100 thin clients out on my network (mostly in my 2nd building, on my production network). They connect over my trk1 lacp and then go through the F8 VDI LACP port to my VDI environment, where my virtual desktops reside.

dmcdonough
Occasional Advisor

Re: Major Network Issues

That's what I expected, that spanning tree would also block down line unmanaged switches if they had a loop.

Thanks for clarifying.

Where would I go and what command on the switch would I use to check if my ARP table is growing out of bounds?

Would I see any error in my switch log?

Thanks