C7000 & BL460 "Device Identification Data Error"

chuckk281
Trusted Contributor

Problems after an upgrade. Kevin is looking for info:

 

******************************************************************************************************************

Good Afternoon All,

 

We recently decided to upgrade our enclosure using the following media

Smart Update Firmware DVD ISO v9.00 - April 12, 2010

 

We upgraded the OA modules to version: 3.00 ( March 19th 2010 )

The Virtual Connect Ethernet Modules to: 2.33

The Virtual Connect FC modules to: 1.40

 

The blades got updated before we kicked off the upgrade of the enclosures

ILO at version: 1.81

System Rom at: l15 ( 2009-7-10)

FC HBA: 280A4

Smart Array Contr: 1.84

 

The blade upgrade was fine; it all worked without any problems. Next up came the enclosure upgrade. The upgrade completed, and we power cycled the enclosure afterwards just for a clean boot of all the components. When the enclosure and Virtual Connect modules had all booted up again, I noticed a big yellow exclamation mark next to the device bays, with the error "Device Identification Data Error". Help please?

******************************************************************************************************

 

Monty provided some info:

*************************************************************************************************************

Device Identification Data Errors are bad.

 

This means the OA finds checksum errors in one of the FRU eeproms on the server or mezz.  When this happens – the OA writes a message to the syslog indicating which mezz card has the problem, and removes this mezz card from the list of ports on that server.

 

The reason the problem was shown after the enclosure was power cycled is that the OA keeps a cached copy of the FRU eeprom information in memory for performance and only reads the server blade raw FRU eeproms if:

  • The server blade is inserted
  • The OA CLI ‘reset server’ command is used to remotely ‘reseat’ the server blade
  • The OA is replaced or rebooted
  • The entire enclosure is power cycled
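To make the caching behavior concrete, here is a minimal Python sketch of a cache that only re-reads the raw EEPROM on the four events listed above. It is an illustration under stated assumptions, not the OA's actual implementation: the class `FruCache`, the event names, and the `read_raw` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical event names modeled on the list above.
PER_BLADE_EVENTS = {"blade_inserted", "reset_server"}
ENCLOSURE_EVENTS = {"oa_replaced_or_rebooted", "enclosure_power_cycled"}


@dataclass
class FruCache:
    """Toy model of a cache that avoids re-reading FRU EEPROMs."""

    read_raw: Callable[[int], bytes]  # hypothetical: reads the raw EEPROM for a bay
    _cache: Dict[int, bytes] = field(default_factory=dict)

    def get(self, bay: int) -> bytes:
        # Serve from memory; touch the hardware only on a cache miss.
        if bay not in self._cache:
            self._cache[bay] = self.read_raw(bay)
        return self._cache[bay]

    def handle_event(self, event: str, bay: Optional[int] = None) -> None:
        if event in PER_BLADE_EVENTS and bay is not None:
            # Re-read only the affected blade.
            self._cache[bay] = self.read_raw(bay)
        elif event in ENCLOSURE_EVENTS:
            # Re-read every blade the cache already knows about.
            for known_bay in list(self._cache):
                self._cache[known_bay] = self.read_raw(known_bay)
        # Any other event (for example a firmware flash) leaves the cache alone.
```

In this model, EEPROM data that goes bad while the enclosure is running stays hidden behind the cached copy until one of the refresh events occurs, which matches the error first appearing after the power cycle.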

 

The FRU eeproms are only written when a Virtual Connect profile is applied to a server blade.

 

The Onboard Administrator and Virtual Connect rely on the FRU eeprom information to determine server connectivity. When the OA detects FRU eeprom data errors it removes that mezz from the list of connections to that server – which will result in Virtual Connect Manager not being able to configure that mezz card with a server profile.
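The effect described above can be sketched in a few lines of Python. This is only an illustration: the real FRU EEPROM layout and the OA's checksum algorithm are not given in this thread, so the sketch assumes a simple 8-bit zero-sum checksum (all bytes add up to 0 modulo 256). The function names, slot names, and log text are hypothetical.

```python
from typing import Dict, List, Tuple


def with_checksum(payload: bytes) -> bytes:
    """Append one byte so that the sum of all bytes is 0 modulo 256 (assumed scheme)."""
    return payload + bytes([(-sum(payload)) % 256])


def checksum_ok(raw: bytes) -> bool:
    """Validate the assumed zero-sum checksum over the whole record."""
    return len(raw) > 0 and sum(raw) % 256 == 0


def usable_mezz_cards(eeproms: Dict[str, bytes]) -> Tuple[List[str], List[str]]:
    """Return (mezz cards kept in the connection list, log messages for dropped cards)."""
    usable: List[str] = []
    log: List[str] = []
    for slot, raw in eeproms.items():
        if checksum_ok(raw):
            usable.append(slot)
        else:
            # Hypothetical message text; the real syslog wording is not shown here.
            log.append(f"FRU data checksum error on {slot}; removed from port list")
    return usable, log
```

A card dropped this way never appears in the connection list handed to Virtual Connect Manager, so no server profile can be applied to it until the card is replaced.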

 

The enclosure name in the screenshot includes “Staging”, so I think this is not a production environment.

 

  • There is no process for a customer to correct FRU eeprom data errors. 
  • Downgrading the OA will not change the behavior – this is a hardware issue that all OA FW versions will report
  • Replacing these FC HBA mezz cards will remove the error and must be done before these servers can be deployed into production.

**************************************************************************************************************

 

Have you encountered this problem? Join the conversation and let us know what you did to solve it.