SL4540 Gen8 backup/clone problem


CMU version 7.2

 

Backup/Clone target:

SL4540 Gen8 running RHEL 6.4 x86_64

BIOS: 2/10/2014

B120i: 4.50

iLO: 1.50

 

When I try to back up one of the SL nodes, the system reboots and PXE boots successfully, then goes into an endless loop of Call Trace output and USB disconnect/uhci_hcd events (see attached screenshot).

 

The backup job eventually times out and fails, but the SL node stays in this endless loop.

 

I have blacklisted ahci as suggested in the user guide for the B120i controller.

I have also blacklisted hpsa to prevent the P420i controller from loading
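
For anyone wanting to confirm that a blacklist like this took effect, here is a minimal sketch that checks `/proc/modules`-style text for modules that should not be loaded. The function names are made up for illustration, and whether you can run this inside the CMU netboot environment is an assumption; on a normally booted RHEL node you would feed it the contents of `/proc/modules`.

```python
# Sketch only: helper names are hypothetical, not part of CMU.
# /proc/modules lists one loaded module per line, with the module
# name as the first whitespace-separated field.

def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def still_loaded(proc_modules_text, blacklisted=("ahci", "hpsa")):
    """Return the blacklisted modules that are loaded anyway, sorted."""
    present = loaded_modules(proc_modules_text)
    return sorted(name for name in blacklisted if name in present)
```

On a live node, something like `still_loaded(open("/proc/modules").read())` should return an empty list if both ahci and hpsa were kept from loading.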

 

When kicking off the backup job, I select sda partition 3, which is where the / filesystem resides.

 

Has anyone run into an issue like this?

Any suggestions?

 

5 REPLIES
Chintala
Advisor

Re: SL4540 Gen8 backup/clone problem

Hello Ryan,

 

Is this an internal cluster or a customer cluster?

If it is a customer cluster, please raise a support call with your local HP support center.

 

Does this node have Mellanox cards? If so, what is the firmware version?

 

I couldn't find the attached screenshot. Could you please attach it again once you see the stack traces?

 

Regards,

Abhishek Chintala

Chintala
Advisor
Solution

Re: SL4540 Gen8 backup/clone problem

Hello Ryan,

 

You can find the patch (PATCH-CMU_7.2.1-X86_64-0002) on the HPSC site. It fixes the CMU netboot kernel crashes seen on servers with Mellanox NICs.

 

 

 

Patch management -> Find patches by product -> HP Insight Cluster Management Utility -> Insight Cluster Management Utility V7.2 -> PATCH-CMU_7.2.1-X86_64-0002.

 

Let us know how it goes.

 

Regards,

Abhishek Chintala

 

Re: SL4540 Gen8 backup/clone problem

This is an internal testing cluster...

 

Yes it does have Mellanox cards...

 

[root@SL4540-01 ~]# ethtool -i eth0
driver: mlx4_en
version: 2.1.6 (Aug 27 2013)
firmware-version: 2.30.3200
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

 

But I am booting off of the 1Gb onboard NIC

 

I've attached the screenshot again
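
If you need to check several nodes for Mellanox NICs, the `ethtool -i` output above is easy to parse. This is a rough sketch, not anything CMU ships: the function names are hypothetical, and treating any driver whose name starts with `mlx` as Mellanox is a heuristic assumption.

```python
# Sketch only: parses the "key: value" lines printed by `ethtool -i <iface>`.
# SAMPLE is the output pasted in this post.

SAMPLE = """\
driver: mlx4_en
version: 2.1.6 (Aug 27 2013)
firmware-version: 2.30.3200
bus-info: 0000:04:00.0
supports-statistics: yes
"""


def parse_ethtool_info(text):
    """Turn `ethtool -i` output into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def is_mellanox(info):
    """Heuristic (assumption): Mellanox drivers are named mlx4_*, mlx5_*, etc."""
    return info.get("driver", "").startswith("mlx")
```

Splitting on the first colon only matters for fields like `bus-info`, whose value contains colons itself.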

Chintala
Advisor

Re: SL4540 Gen8 backup/clone problem

Have you applied the patch and tried again?

 

If not, please apply the patch mentioned in my post above and try again.

 

Let us know how it goes.

Re: SL4540 Gen8 backup/clone problem

After applying the patch I was able to successfully pull a backup image...

 

Thank you for the help!