Major Problem with ProCurve 6600 Series Switches Locking Up

GR-SC
Occasional Contributor

Major Problem with ProCurve 6600 Series Switches Locking Up

Has anyone else had their ProCurve 6600 Series switches lock up on them?

 

We have ten of them in one production environment, eight of them being 6600-48G-4XG units and two of them being 6600-24XG units.

 

We first experienced this problem around October 2011, with several of the switches having this issue within a span of a few days, and after being in contact with HP support, we upgraded the software to the latest (and still current) version of K.15.06.0008.

 

We had one switch lock up about a couple weeks later, but since then, over four months went by quietly... until the problem came roaring back a couple days ago.

 

In a span of a few hours, FIVE of the ten switches locked up. The firmware was upgraded on November 1st, 2011, and they locked up on April 1st, 2012. Based on the timing, the lockups seemed to hit almost exactly when the uptime reached 152 days. However, it's not an absolute condition, because three other units with the same uptime escaped unscathed, although one of those units locked up a day later. We now have two more units that are beyond 153 days uptime.
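As a quick sanity check on the timing, the interval between the upgrade and the lockups can be computed directly. This is only a minimal Python sketch; it assumes the lockups fell on April 1st of the year after the upgrade (2012), and the helper name `uptime_days` is made up for illustration.

```python
from datetime import date

def uptime_days(booted: date, failed: date) -> int:
    """Return the number of whole days between two calendar dates."""
    return (failed - booted).days

# Dates taken from the post: firmware upgrade (and reboot), then the lockups.
upgrade = date(2011, 11, 1)
lockup = date(2012, 4, 1)  # assumption: the year after the upgrade

print(uptime_days(upgrade, lockup))  # 152
```

Note that 2012 was a leap year, so the count includes February 29th.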

 

We have two of the same model switch in another environment, and they've been up 170 days without issue.

 

We are at a total loss in understanding what is causing the switches to lock up like this. I don't know if the timing is a clue or a red herring.

 

Just wanted to see if anyone else has had any similar issues like this.

 

We're in dialogue with HP support again, but nothing in the full tech dump and syslogs stands out to them. They may have us upgrade to a publicly unreleased software version, just like last time, but unless this is a known issue that the new software specifically addresses, I'm concerned that it's just another waiting game until the switches lock up again.

 

These are some of HP's higher-end data center switches, and we are using them in our production environment, so having them fail like this is completely unacceptable.

10 REPLIES
Mik-HPN
Valued Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Have you contacted support about the April hang as well? If not, please do; if you did, ask them to escalate the case to L2 support.

Also see PM.

HPN Transceiver guru!
GR-SC
Occasional Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Yes, because these switches are in our production environment, we have opened a new support case and already had it escalated to an L2 engineer, who has collected the show tech dumps from all of the switches in our environment.

plroybal
Frequent Visitor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

The University I am the network engineer for experienced the 152 day time bomb on 4 of our core server switches today within 11 hours of each other.  We have this firmware (K.15.06.0008) on our core 8206zl and all of our 5400zl switches as well as these 6600-48G (P/N: J9451A).  I am blown away that HP didn't do something to warn us.  They know who has their products, ours are registered with HP.  This kind of error is a very poor show on the quality control part of HP's firmware house.  I'm opening a ticket and escalating as well.  Good luck!

EckerA
Respected Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Hi,

We had the same problem with 9 of our ProCurve 6600-24 switches. HP recommended using firmware K.15.06.0017. If you download the software from the HP website, it says:

 

CAUTION: If you are updating from a software version that is susceptible
to the "hang" problem to this software, HP recommends that software updates
and reboots occur when there can be someone onsite to power cycle any switch
that does not recover spontaneously after reload.

These software versions are susceptible:
      K.15.01.xxxx - K.15.05.xxxx
      K.15.06.0006 - K.15.06.0014
      K.15.07.0002 - K.15.07.0005
      K.15.08.0007
      K.15.04.0007m - K.15.04.0010m
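
For anyone auditing a fleet against the list above, the check can be scripted. The following is a rough Python sketch, not an HP tool: the function names `parse_version` and `is_susceptible` are hypothetical, and it assumes that "xxxx" means any build number and that every range is inclusive at both ends.

```python
import re

# Matches strings such as "K.15.06.0008" or "K.15.04.0007m".
_VERSION_RE = re.compile(r"^K\.(\d+)\.(\d+)\.(\d+)(m?)$")

def parse_version(version: str) -> tuple[int, int, int, str]:
    """Split a K-series version string into (major, minor, build, suffix)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, build, suffix = match.groups()
    return int(major), int(minor), int(build), suffix

def is_susceptible(version: str) -> bool:
    """Return True if the version falls in one of the ranges quoted above.

    Assumption: ranges are inclusive and "xxxx" matches any build number.
    """
    major, minor, build, _suffix = parse_version(version)
    if major != 15:
        return False
    if 1 <= minor <= 5:
        # K.15.01.xxxx - K.15.05.xxxx; this also covers the
        # K.15.04.0007m - K.15.04.0010m maintenance builds.
        return True
    if minor == 6:
        return 6 <= build <= 14   # K.15.06.0006 - K.15.06.0014
    if minor == 7:
        return 2 <= build <= 5    # K.15.07.0002 - K.15.07.0005
    if minor == 8:
        return build == 7         # K.15.08.0007 only
    return False

for v in ("K.15.06.0008", "K.15.06.0017", "K.15.04.0008m"):
    print(v, is_susceptible(v))
```

The version string itself would come from the switch (for example from `show version` output); how it is collected is left out here.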

 

 

But nothing is mentioned in the release notes. Our case is still open, but we haven't heard anything for some time now.

 

hth

Arimo
Respected Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Hi

 

Actually the reboot issue is in the release notes:

 

Switch Hang (CR_0000106245, CR_0000109565, CR_0000109696) - The switch might fail to boot fully, requiring a power-cycle to recover.

 

All current software versions have this fixed.


HTH,

Arimo
HPE Networking Engineer
plroybal
Frequent Visitor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Just to clarify, I rebooted all of the failed switches after the initial firmware upgrade.  All four ran with the new code for 152 days and then locked up.

Arimo
Respected Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Hi

 

I think I heard a colleague mention this exact issue today or yesterday. Please call support, let's see.


HTH,

Arimo
HPE Networking Engineer
fiete
Occasional Visitor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

I heard about this problem (four 6600s in an iSCSI config) at a customer yesterday. Any news on this?

Thanks, f

Fix_IT
Occasional Visitor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

Having similar issue with 54xx ZL series switches.

Please post specifically what fixed the issue of the switches hanging or freezing.

We have found that the units seem to hang when there is a power degradation.

We have also found that it could be an issue of time, and we wonder whether this is due to log size or a buffer filling up.

Did anyone have success with firmware (switch software) updates solving the issue?

I am sure everyone on this thread will agree that any responses would be appreciated, because at this point I may move to another vendor's product unless this can be fixed.

Thanks.

GR-SC
Occasional Contributor

Re: Major Problem with ProCurve 6600 Series Switches Locking Up

OP here again after five months.

 

After more than three months in limbo with L2 support, we finally received word that supposedly they have identified the problem.

 

As of version K.15.06.0008, there were two issues that compounded the problem.

 

The first is the 152 day time bomb that causes the switch to become unresponsive.

 

The second is the bug that causes the switch to not reboot properly, as Arimo mentioned above.

 

With K.15.06.0017, the reboot issue seems to have been addressed. Whether the 152 day time bomb issue has been fixed remains to be seen.

 

From what we were told, apparently it was some counter to do with power supplies that caused the 152 day time bomb.

 

We finally upgraded to K.15.06.0017 just a couple of weeks ago, so it'll be a few months before we see results one way or the other.

 

For others that upgraded to K.15.06.0017 earlier, let us know after 160 days whether you're still up or not.

 

Cheers...