
Procurve J9050A (2900-48) Crashing

SOLVED
Wim-Bart van der Waals_1
Occasional Advisor

We have a medium-sized network containing 2900 series, 2910 series, 2600 series and 6400 series switches.

For the past couple of months we have been seeing the following errors on our switches, which cause a lot of problems on the LAN:

00732 chassis:  Ports 1-24 Data Bus Error: Addr=0x704a60ec Data=0xffffffff flags=0x10000750, IP=0x40176f58 Task='mEaseUpdt' TaskID=0x448648a0 cpsr=0x80000013
00274 chassis:  (7D) Ports 1-24: Blade Crash detected - Available

After this, ports 1-24 reinitialize. Luckily we have the ports set to admin-edge, so no STP reconvergence happens.

The events and the Christmas-tree effect on the switches happen randomly, at intervals of 10 minutes to 2 hours. We now have 3 switches with the same problem.
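
To spot these events in a saved copy of the event log, the two lines quoted above can be matched with a small script. This is only a sketch: the patterns are derived from the two sample lines in this post, so other firmware releases may log in a different format, and the function name `classify` is made up for illustration.

```python
import re

# Patterns built from the two sample lines quoted in this post only;
# they are an assumption, not an official log format description.
BUS_ERROR = re.compile(
    r"^(?P<event_id>\d+)\s+chassis:\s+Ports (?P<ports>\d+-\d+) Data Bus Error: "
    r"Addr=(?P<addr>0x[0-9a-fA-F]+).*?Task='(?P<task>[^']+)'"
)
BLADE_CRASH = re.compile(
    r"^(?P<event_id>\d+)\s+chassis:\s+\(\w+\) Ports (?P<ports>\d+-\d+): "
    r"Blade Crash detected"
)


def classify(line):
    """Return (kind, ports, task) for a crash-related line, else None."""
    m = BUS_ERROR.match(line)
    if m:
        return ("bus_error", m.group("ports"), m.group("task"))
    m = BLADE_CRASH.match(line)
    if m:
        return ("blade_crash", m.group("ports"), None)
    return None


if __name__ == "__main__":
    sample = [
        "00732 chassis:  Ports 1-24 Data Bus Error: Addr=0x704a60ec "
        "Data=0xffffffff flags=0x10000750, IP=0x40176f58 Task='mEaseUpdt' "
        "TaskID=0x448648a0 cpsr=0x80000013",
        "00274 chassis:  (7D) Ports 1-24: Blade Crash detected - Available",
    ]
    for entry in sample:
        print(classify(entry))
```

Counting the matches per switch gives a quick view of which units and which port blocks are affected.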

Firmware Release: #T.13.71

For info:
Switch ports 1-48 are populated and running at 1G full duplex. The two 10GbE ports are also in use.

Sample config:
[i]; J9050A Configuration Editor; Created on release #T.13.71

hostname "unique hostname"
time timezone 60
time daylight-time-rule Western-Europe
no cdp run
fastboot
module 1 type J86yyA
module 2 type J86xxA
module 3 type J90XXA
no stack
interface 1-48
name "Name of device/port"
broadcast-limit 10
speed-duplex auto-1000
exit
interface A1
disable
exit
interface A2
name "Core 6400/port"
exit
interface A3
name "my backup link to brother in arms/A3"
exit
interface A4
disable
exit
ip default-gateway
vlan 1
name "Production"
untagged 1-28,30-A4
no untagged 29
no ip address
ip igmp
exit
vlan 800
name "Internet"
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
vlan 1700
name "Beheer"
ip address
tagged 1-12,17-28,33-44,A1-A4
exit
vlan 1740
name "VMotion"
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
vlan 1750
name "Heartbeat"
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
vlan 1900
name "OTAP"
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
vlan 1950
name "OTAPHearbeat"
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
vlan 900
name "TAG"
tagged A2-A3
no ip address
exit
vlan 850
name "DMZ"
untagged 29
tagged 1-12,17-28,33-44,A1-A4
no ip address
exit
banner motd "*******************************************************"
banner motd "* Toegang is alleen toegestaan door geauthoriseerd *"
banner motd "* personeel van de afdeling: *"
banner motd "* Informatisering en Automatisering *"
banner motd "*******************************************************"
banner motd "~"
logging
logging
logging severity warning
sntp unicast
sntp 300
sntp server priority 1
timesync sntp
ip authorized-managers access Manager
ip dns domain-name ""
ip dns server-address priority 1
no ip ssh
snmp-server community "public" Unrestricted
snmp-server host "public" All
snmp-server host "public" All
snmp-server contact "" location ""
spanning-tree
spanning-tree 1-48 admin-edge-port
no spanning-tree 1-48 auto-edge-port
spanning-tree A2 priority 4
spanning-tree A3 priority 4
spanning-tree force-version RSTP-operation
password manager
password operator[/i]
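
For anyone reading the VLAN lines in the config, the port lists can be expanded to see which ports are actually members. The following is a minimal sketch, assuming simple ranges where both ends share the same prefix (such as `1-12` or `A1-A4`); a mixed range like `30-A4` is deliberately not handled, and the helper name `expand_ports` is hypothetical.

```python
import re

PORT = re.compile(r"([A-Za-z]*)(\d+)")


def expand_ports(spec):
    """Expand a list such as '1-12,17-28,A1-A4' into individual port names.

    Only ranges whose two ends share the same prefix are supported;
    anything else raises ValueError (an intentional limitation).
    """
    ports = []
    for part in spec.split(","):
        if "-" not in part:
            ports.append(part)
            continue
        start, end = part.split("-")
        ms, me = PORT.fullmatch(start), PORT.fullmatch(end)
        if not ms or not me or ms.group(1) != me.group(1):
            raise ValueError(f"unsupported range: {part}")
        prefix = ms.group(1)
        first, last = int(ms.group(2)), int(me.group(2))
        ports.extend(f"{prefix}{n}" for n in range(first, last + 1))
    return ports


if __name__ == "__main__":
    tagged = set(expand_ports("1-12,17-28,33-44,A1-A4"))
    fixed = [str(n) for n in range(1, 49)]
    # Fixed ports that do not carry the tagged VLANs.
    print([p for p in fixed if p not in tagged])
```

With the tagged list used for most VLANs above, this shows that ports 13-16, 29-32 and 45-48 are not tagged members.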



6 REPLIES
Gerhard Roets
Esteemed Contributor

Hi Wim

I would suggest you open a call with HP Support on this one.

KR
Gerhard
Wim-Bart van der Waals_1
Occasional Advisor

I did, and new switches are on their way, hopefully in time for today's maintenance window.
Wim-Bart van der Waals_1
Occasional Advisor

Some new info from HP: it seems to be a firmware bug with sFlow in certain situations (probably high load on the module), because in my case it crashes during vMotion and network backups.

I disabled sFlow with the commands:
no sflow 1
no sflow 2
no sflow 3

But then **** happened, and I am a bit disappointed with HP: the tech did not tell me that I would have to disable sFlow on every switch again every 10-15 minutes, because PCM keeps re-enabling sFlow even when I tell PCM not to. A partial solution is to set the SNMP write community for PCM to restricted. But this is **** because now our "less skilled" staff cannot use PCM for their small tasks such as assigning VLANs to ports :(
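
Since the commands have to be repeated on every affected switch, the per-switch command list can be generated rather than typed. A minimal sketch: the only CLI commands used are the three `no sflow` lines from this post, while the switch names and the helper names `sflow_disable_commands` and `plan` are made up for illustration. It only builds the text; how the commands are sent to the switches is left open.

```python
def sflow_disable_commands(instances=(1, 2, 3)):
    """Return the 'no sflow N' commands quoted in this post."""
    return [f"no sflow {n}" for n in instances]


def plan(switches, instances=(1, 2, 3)):
    """Map each switch name to the commands to run on it."""
    return {name: sflow_disable_commands(instances) for name in switches}


if __name__ == "__main__":
    # Hypothetical switch names, for illustration only.
    for name, commands in plan(["sw2900-a", "sw2900-b"]).items():
        print(name)
        for command in commands:
            print("  " + command)
```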
Javed Padinhakara
Respected Contributor
Solution

Regarding the problem:
".....PCM enables sFlow again and again even when I tell PCM not to do this. Partial solution is to set the snmp write community for PCM to restricted. But this is **** because now our "less skilled" staff is not able to use PCM for their little tasks like setting VLAN's to ports :(...."

Instead of setting the SNMP write community for PCM to restricted, you could disable sFlow sampling by PCM this way:
a) Go to the device group folder in PCM and navigate to the "Traffic" tab
b) Sort the devices by IP address, so that you can choose the corresponding ports easily
c) Select the ports across the devices for which you want sFlow sampling turned off (you can multi-select by dragging across the rows)
d) Right-click and select the "Manual" menu
e) Select "...disable sampling and statistics" if you want to do away with traffic monitoring completely; if you only want to avoid sFlow sampling and keep the statistics collection, select "...enable statistics" instead

Note that by doing so you give up the rich traffic views in PCM for this set of devices; you need to take a call on this :)

HTH
Javed


PS: I noticed that you joined recently, so I thought I would share an important etiquette followed in the forum: assign points on a scale of 1-10 to people trying to help by answering your queries; it is an appreciation of the time they spend responding to your questions.
Wim-Bart van der Waals_1
Occasional Advisor

OK, this solves the problem for me for now, but I had to search for the settings. To avoid problems I disabled sFlow for all 11 of our 2900 devices; sFlow continues to work on our 26 26xx devices, our 6 2910 devices and our 2 6400s.

The steps you describe were not 100% correct for PCM 03.10.201 patch 3, but I did the following:

1) Tools -> Preferences -> Statistics -> Global Traffic
2) Default Port Monitoring: Automatic Statistics Polling only
3) Select: Agent Groups -> Default Agent Group -> Devices -> Procurve 2900
4) Select: Traffic
5) Select: All ports showing any symbol other than the "Enable automatic traffic statistics collection" symbol
6) Context menu: Manual -> "Manually Disable Sampling and Statistics"
7) Context menu: Automatic -> "Manually Enable statistics only Mode".

The switches survived the night well, and your answer helped me more than the bare message from HP to disable sFlow. Finally I have a workable workaround, and I am now waiting for a firmware fix to appear.
Wim-Bart van der Waals_1
Occasional Advisor

Final Solution: Updated firmware....