Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

SwisspostIT
Valued Contributor

Hi everyone,

 

We have had a strange issue on our blades for over a year.

Sometimes some blade servers (BL460c G7 / BL465c G7) lose one side of their network connection (each of our C7000 enclosures has two 10Gb Ethernet Pass Thru Modules) and then report an SNMP NIC Redundancy Reduced Trap (18014) through our HP SIM. Only one port of a module is affected at a time, so only one blade loses redundancy.

We then always have to reset the whole Pass Thru Module to bring the port back up.

When connecting via SSH to an affected module, the status is as follows:

Port# info


         Uplink                            Downlink
Port Ena Link Speed 1G-Neg  Type  Approved Link Speed Neg
  1  ena  up  10Gb   on    3m DAC    yes    up  10Gb  on
  2  ena  up  10Gb   on    3m DAC    yes    up  10Gb  on
  3  ena down  --    on    3m DAC    yes   down  --   on

 

Port 3 is the "bad" port in this case.
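
For anyone who wants to spot this state across many modules, here is a minimal Python sketch that parses `info` output like the table above and lists enabled ports whose uplink or downlink is down. The regular expression is inferred only from the sample rows posted in this thread, and the names `parse_port_info` and `suspect_ports` are made up for illustration. How the text is collected from the module (for example over SSH) is left out.

```python
import re

# One data row of the `info` table as posted in this thread:
# port, uplink ena/dis, uplink link, speed, 1G-Neg, transceiver type,
# approved flag, then downlink link, speed and negotiation.
# Assumption: the layout is inferred from the sample rows only.
ROW = re.compile(
    r"^\s*(?P<port>\d+)\s+(?P<ena>ena|dis)\s+(?P<up_link>up|down)\s+"
    r"(?P<up_speed>\S+)\s+(?P<neg1g>\S+)\s+(?P<type>.+?)\s+"
    r"(?P<approved>yes|no)\s+(?P<down_link>up|down)\s+"
    r"(?P<down_speed>\S+)\s+(?P<down_neg>\S+)\s*$"
)


def parse_port_info(text):
    """Return one dict per port row; header and prompt lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            row = match.groupdict()
            row["port"] = int(row["port"])
            rows.append(row)
    return rows


def suspect_ports(rows):
    """Port numbers that are enabled but have the uplink or downlink down."""
    return [
        row["port"]
        for row in rows
        if row["ena"] == "ena" and "down" in (row["up_link"], row["down_link"])
    ]


SAMPLE = """\
         Uplink                            Downlink
Port Ena Link Speed 1G-Neg  Type  Approved Link Speed Neg
  1  ena  up  10Gb   on    3m DAC    yes    up  10Gb  on
  2  ena  up  10Gb   on    3m DAC    yes    up  10Gb  on
  3  ena down  --    on    3m DAC    yes   down  --   on
"""

if __name__ == "__main__":
    print(suspect_ports(parse_port_info(SAMPLE)))  # prints [3]
```

Disabled ports are ignored on purpose, since a port that is administratively off is expected to be down.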

 

My question: Is there anyone else using 10Gb Pass Thru and experiencing these problems?

 

We have a case open with HP, but until now they haven't been able to help us...

 

Regards,

Ville

9 REPLIES
armi
Visitor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

Hi Ville

 

We have come across a similar issue and, like you, have just logged a call with HP. Do you have any updates?

 

We are seeing random disconnects at both the FCoE and Ethernet level with 554FLB and 554M adapters.

 

I have upgraded the firmware of the 10Gb pass-thru module, but it has had no positive effect. After a reboot of the interconnect module, some downlink ports are reported as down. We can manually set them to up, but on the next reboot of the interconnect they go down again.

 

It always seems to be the same blades, but not all blades are affected. All blades are identical in firmware and driver levels.

 

Regards

Mick Ryan
Advisor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

We've seen a similar issue: certain 10G ports bouncing for no reason. HP pretty much gave up after the usual round of firmware, drivers, patches, etc. failed to fix anything.

 

As of now, I think the issue is related to the C7000 enclosure model. One has part number 507019-B21 and the other 412152-B21. The ones with part number 412152-B21 seem to have the issue.

 

I'll be swapping some enclosures round this coming weekend to verify that this resolves the issue. I'll add a post on here to confirm the results.

 

Mike

Mick Ryan
Advisor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

Ok, to close this one out. The enclosure was swapped out over the weekend and the bouncing NIC problem has completely gone away. I hope this helps anyone else in the same position.

 

The older enclosure has been loaded up with G1s using 1Gb interfaces, and they seem stable.

 

Regards

 

Mike

SwisspostIT
Valued Contributor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

Hi everyone,

 

thanks for your answers!

We still have a case open and HP is now replacing affected Pass Thru Modules.

But I don't think this will help, because the problem appears randomly on different modules (it's almost never the same blade or port) and we have about 80 Pass Thru Modules in 40 different enclosures...

 

The problem always goes away when you reset the Pass Thru Module.

The Pass Thru Module firmware is 1.0.7.0, which is the newest available...

 

I think the next thing we will try is updating the firmware of all CNAs on all blades to 4.1.450.1707, but this will take a while since we'll have to do it on about 350-400 blades...

 

Regards,

Ville

SwisspostIT
Valued Contributor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

hi everyone,

 

some updates:

 - we updated the CNA Firmware to the newest (currently 4.6.247.5)

 - Pass Thru Firmware to 1.0.11.0

 - BIOS newest

 - Drivers newest (4.6.x currently)

 

We still experience the issue, now also on Gen8 blades (BL460c Gen8).

We even had it on some freshly inserted Gen8 blades without an operating system installed, so the issue seems to be OS-independent...

 

Regards,

Ville

armi
Visitor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

We are in a similar position but have made some progress.

 

We had the issue where the downlink on the I/O module didn't renegotiate with the CNA correctly.

 

The symptom was that, after a power state change of the host or I/O module, the downlink port randomly went down.

 

We could reboot a whole chassis of blades multiple times and each time different ports would be affected. For some reason port 6 seemed to be the worst offender.

 

This appears to be fixed with I/O module firmware 1.0.11.0 and CNA firmware 4.6.*** or later, with the corresponding drivers.

 

However

 

This has now introduced 2 new symptoms.

 

When a host is rebooted, the Nexus port gets stuck in VFC initialising.

 

or

 

The I/O module locks up. The downlink and uplink report as online and up, but no FC traffic gets through when a host reboots. The Nexus doesn't see any FLOGI, but Ethernet traffic is seen.

 

A reset of the I/O module kicks the ports back into life, but a subsequent reboot of the host may have the same effect.

 

This is using the latest levels set in the HP recipe pdf.

 

We have a call open with HP, which is being escalated very slowly, and we have jumped through all the log-gathering hoops requested.

 

In the meantime if anyone has any insight to a possible fix I'd appreciate any ideas.

SwisspostIT
Valued Contributor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

hi,

 

thanks for your inputs!

As you mentioned, firmware 1.0.11.0 on the pass-thru module together with CNA firmware 4.6.x or newer seems to fix this issue; we haven't seen it yet with this combination...

 

About the new issues you are experiencing:

Did you also have this with CNA firmware 4.9.311.20?

I'll check if anyone at our company experienced an issue like that...

 

Regards,

Ville

armi
Visitor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

Yes, we had to go to the latest firmware to satisfy HP and we still see the issue.

 

We've also downgraded to ESXi 5.0 Update 2, as it's not on the current SPOCK support matrix, but we still have the issue.

 

This morning I identified a misreporting of the firmware level, which may or may not be a red herring. I'm still waiting for HP to come back to me.

 

ESXi and the Emulex GUI both report the device at 4.9.311.20, but if you run 'show firmware summary' on the OA command line, it reports that the old version is still installed.

 

Firmware Component                   Current Version      Firmware ISO Version
------------------------------------ -------------------- ---------------------
System ROM                           I31 2013.12.20       I31 2013.12.20      
ILO4                                 1.40                 1.40                
Power Management Controller          3.3                  3.3                 
HP FlexFabric 10Gb 2-port 554FLB Ada firmware : 4.9.311.2 firmware : 4.9.311.2
HP FlexFabric 10Gb 2-port 554FLB Ada firmware : 4.9.311.2 firmware : 4.9.311.2
HP FlexFabric 10Gb 2-port 554M Adapt firmware : 4.6.247.5 firmware : 4.9.311.2!
HP FlexFabric 10Gb 2-port 554M Adapt firmware : 4.6.247.5 firmware : 4.9.311.2!
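
A comparison like this can be scripted. The following is a minimal Python sketch, assuming the `show firmware summary` output keeps the fixed-width layout shown above, with the dashed separator line marking the column widths. The function names `parse_firmware_summary` and `version_mismatches` are made up for illustration, and treating the trailing `!` as a marker rather than part of the version is an assumption. Versions are compared exactly as printed, so the result is only as precise as what the table shows, and long values may be cut off at the column width.

```python
import re


def parse_firmware_summary(text):
    """Split rows into columns using the dashed separator line as a ruler."""
    lines = text.splitlines()
    sep = next(i for i, line in enumerate(lines) if line.startswith("---"))
    # Each run of dashes marks the start and end offset of one column.
    spans = [m.span() for m in re.finditer(r"-+", lines[sep])]
    rows = []
    for line in lines[sep + 1:]:
        if not line.strip():
            continue
        cells = [line[start:end].strip() for start, end in spans]
        rows.append(dict(zip(("component", "current", "iso"), cells)))
    return rows


def version_mismatches(rows):
    """Components whose current version differs from the ISO version.

    Assumption: a trailing '!' in the ISO column is a marker, not part
    of the version string, so it is stripped before comparing.
    """
    return [
        row["component"]
        for row in rows
        if row["current"] != row["iso"].rstrip("!").strip()
    ]


# Sample rebuilt from rows posted in this thread, padded to the
# column widths of the separator line (36, 20 and 21 characters).
WIDTHS = (36, 20, 21)
DATA = [
    ("System ROM", "I31 2013.12.20", "I31 2013.12.20"),
    ("ILO4", "1.40", "1.40"),
    ("Power Management Controller", "3.3", "3.3"),
    ("HP FlexFabric 10Gb 2-port 554FLB Ada",
     "firmware : 4.9.311.2", "firmware : 4.9.311.2"),
    ("HP FlexFabric 10Gb 2-port 554M Adapt",
     "firmware : 4.6.247.5", "firmware : 4.9.311.2!"),
]


def fmt(cells):
    """Pad each cell to its column width and join with single spaces."""
    return " ".join(cell.ljust(width) for cell, width in zip(cells, WIDTHS))


SAMPLE = "\n".join(
    [fmt(("Firmware Component", "Current Version", "Firmware ISO Version")),
     " ".join("-" * width for width in WIDTHS)]
    + [fmt(row) for row in DATA]
)

if __name__ == "__main__":
    print(version_mismatches(parse_firmware_summary(SAMPLE)))
    # prints ['HP FlexFabric 10Gb 2-port 554M Adapt']
```

Slicing by column offsets is used instead of splitting on whitespace because the adapter rows have only a single space between columns and the values themselves contain spaces.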

 

We also have colleagues in another country who are experiencing the same issue and identified it independently of us, so it's not just a few of us, I guess.

 

Hampannabk_emu
Occasional Visitor

Re: Issues with 10Gb Pass Thru Module in C7000 (spontaneous "redundancy lost")

I am observing a weird issue with the pass-thru module in c7000 enclosures with part numbers 412152-B21 and 507019-B21.

The pass-thru module works well in the enclosure with part number 507019-B21, but when it is used in the enclosure with part number 412152-B21, the links on the pass-thru module don't come up.

 

Below is the port info command output

 

Port# info

         Uplink                            Downlink
Port Ena Link Speed 1G-Neg  Type  Approved Link Speed Neg
  1  dis down  --    on    SR SFP+   yes   down  --   on
  2  dis down  --    on    N/A       no    down  --   on
  3  dis down  --    on    N/A       no    down  --   on
  4  ena down  --    on    SR SFP+   yes   down  --   on
  5  dis down  --    on    N/A       no    down  --   on
  6  ena down  --    on    SR SFP+   yes   down  --   on
  7  dis down  --    on    N/A       no    down  --   on
  8  dis down  --    on    N/A       no    down  --   on
  9  dis down  --    on    N/A       no    down  --   on
 10  dis down  --    on    N/A       no    down  --   on
 11  dis down  --    on    N/A       no    down  --   on
 12  ena down  --    on    SR SFP+   yes   down  --   on
 13  ena down  --    on    N/A       no    down  --   on
 14  ena down  --    on    N/A       no    down  --   on
 15  dis down  --    on    N/A       no    down  --   on
 16  dis down  --    on    N/A       no    down  --   on

 

 

I actually have an uplink from a Cisco MDS connected on port 4, and I unsuccessfully tried the command below to bring the link up:

 

Port# ena 4 ena

Port# Port 4 uplink is UP at 10Gb
WARNING: Downlink failed to come up on port 4
Port 4 uplink has been brought DOWN


Port#

 

 

Can anyone help me understand the actual difference between using the pass-thru module in these two enclosures? The same pass-thru module works fine in one and not in the other.