
P4500 Lefthand SAN

njdgomez23
Occasional Advisor

Hello All, 

I have a P4500 that has been failing over every 30 minutes or so since this morning. It first hung / fell over at around 4:30 am and was brought back up at around 9:30 am. Since then, the array only stays up for around 30 minutes before doing the same again. I have managed to grab what logs I can, but nothing much stands out apart from a number of network port issues and one alert for a firmware issue.

Alerts.log

Event Log (32029) [01000000]: 2016 Jul 07 03:01:39 CSLH01.CITYSPRINT.CO.UK: E32090600 System System SNMP community string 'public' has been deleted.

Event Log (32030) [01000000]: 2016 Jul 07 03:01:44 CSLH01.CITYSPRINT.CO.UK: E32090300 System System SNMP community string is 'public' with access control 'default'.

Event Log (32031) [01000000]: 2016 Jul 07 09:25:53 CSLH01.CITYSPRINT.CO.UK: EF2000100 System System Event manager has been started.

Event Log (32032) [01000000]: 2016 Jul 07 09:25:53 CSLH01.CITYSPRINT.CO.UK: E010A0100 HW System Network Interface 'Motherboard:Port1' Status = 'Up'.

Event Log (32033) [01000000]: 2016 Jul 07 09:25:55 CSLH01.CITYSPRINT.CO.UK: E02000204 System System 'Email' Configuration Status = 'Failed'.

Event Log (32034) [01000000]: 2016 Jul 07 09:26:17 CSLH01.CITYSPRINT.CO.UK: E420D0100 System admin User 'admin' login 'OK'.

Event Log (32035) [01000000]: 2016 Jul 07 09:26:34 CSLH01.CITYSPRINT.CO.UK: E420D0100 System admin User 'admin' login 'OK'.

 Sysevent.log

798 | 07/07/2016 | 02:23:31 | System Firmware Error #0x01 |

 

kernel.error:

Jul  7 02:50:46 CSLH01 kernel: tg3: eth1: transmit timed out, resetting

Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: MAC_TX_STATUS[00000008] MAC_RX_STATUS[00000000]

Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: RDMAC_STATUS[00000008] WDMAC_STATUS[00000000]

Jul  7 02:50:46 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2

Jul  7 02:50:46 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2

Jul  7 02:50:47 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2

Jul  7 02:50:47 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2

Jul  7 02:50:47 CSLH01 kernel: bonding: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond

 

From kernel.info:

Jul  7 02:50:46 CSLH01 kernel: NETDEV WATCHDOG: eth1: transmit timed out

Jul  7 02:50:47 CSLH01 kernel: tg3: eth1: Link is down.

Jul  7 02:50:47 CSLH01 kernel: bonding: bond0: link status definitely down for interface eth1, disabling it

Jul  7 02:50:50 CSLH01 kernel: tg3: eth1: Link is up at 1000 Mbps, full duplex.

Jul  7 02:50:50 CSLH01 kernel: tg3: eth1: Flow control is off for TX and off for RX.

Jul  7 02:50:50 CSLH01 kernel: bonding: bond0: link status definitely up for interface eth1.

Cheers

2 REPLIES
Stor_Mort
HPE Pro

Re: P4500 Lefthand SAN

Hi njd, 

You actually have a P4500 G2 system. I've seen a few of these where the NIC chip on the system board fails and causes these errors in the kernel log.

Jul  7 02:50:46 CSLH01 kernel: tg3: eth1: transmit timed out, resetting

Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: MAC_TX_STATUS[00000008] MAC_RX_STATUS[00000000]

Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: RDMAC_STATUS[00000008] WDMAC_STATUS[00000000]

Jul  7 02:50:46 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2

The only solution is to replace the system board. If you have a support contract, please open a support case at 1-800-633-3600. 
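
For anyone wanting to check their own logs for the same signature, here is a minimal sketch in Python that counts the tg3 timeout messages quoted above. The function name `count_tg3_errors` and the two patterns are assumptions made for illustration, based only on the lines in this thread; this is not an HPE tool, and a match only suggests the symptom, it does not confirm a failed NIC chip.

```python
import re

# Patterns are assumptions derived from the kernel.error lines quoted in this thread.
TG3_PATTERNS = {
    "transmit_timeout": re.compile(r"tg3: (eth\d+): transmit timed out"),
    "stop_block_timeout": re.compile(r"tg3: tg3_stop_block timed out"),
}


def count_tg3_errors(lines):
    """Count tg3 timeout messages and collect the interfaces they name."""
    counts = {name: 0 for name in TG3_PATTERNS}
    interfaces = set()
    for line in lines:
        for name, pattern in TG3_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                # Only the transmit-timeout pattern captures an interface name.
                if match.groups():
                    interfaces.add(match.group(1))
    return counts, interfaces


# Sample input: the four lines quoted in this reply.
sample = [
    "Jul  7 02:50:46 CSLH01 kernel: tg3: eth1: transmit timed out, resetting",
    "Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: MAC_TX_STATUS[00000008] MAC_RX_STATUS[00000000]",
    "Jul  7 02:50:46 CSLH01 kernel: tg3: DEBUG: RDMAC_STATUS[00000008] WDMAC_STATUS[00000000]",
    "Jul  7 02:50:46 CSLH01 kernel: tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2",
]

counts, interfaces = count_tg3_errors(sample)
print(counts, sorted(interfaces))
```

To run it against a saved log, pass an open file object instead of `sample`, since the function accepts any iterable of lines.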

 

I am an HPE employee - HPE StoreVirtual Support
njdgomez23
Occasional Advisor

Re: P4500 Lefthand SAN

Hi, thanks.

Yes, we replaced the system board and all is OK once again :)