ICE-Linux: Image capture fails to reset machine at end

 
Sarah Nordstrom
Frequent Advisor

ICE-Linux: Image capture fails to reset machine at end

I am having an issue where an image is captured from a Linux machine (RHEL4 U6), but after the capture ICE-Linux fails to reset the machine, so the task fails (and deletes the image!). I am getting the following errors, which is odd: it was obviously able to reset the machine at the start of the imaging, and we are not having any other issues with the iLO or general management of this machine. Any help appreciated. Full log attached.

---

Checking to see if power is off.
Attempting to reboot server via SSH.
Failed: Unable to create SSH connection: No route to host
Attempting to reboot via SOAP.
Attempting to reboot server via ACPI.
Unable to communicate with the management processor: Error retrieving BMC for server. Root cause:Error connecting to iLO at 192.168.28.157.
William Athanasiou
Occasional Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

After you see this message, can you still perform operations on this node, such as powering it on/off and initiating another capture? Or do you need to rediscover it?

Is the address for the iLO correct?
Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

The address for the iLO is correct, and after the operation fails I can still perform all operations, including powering on/off and re-running the capture right away (which fails again in the same way).
Mitchell Kulberg
Valued Contributor

Re: ICE-Linux: Image capture fails to reset machine at end

Hi,

I would just like to clarify your last post.

Are you saying that after the node fails to deploy with the "Error retrieving BMC" error, you can immediately use the power on or power off functions? I'm only asking because that would be unusual.

What version of ICE-Linux are you using? V2.0 or 2.01?

After it fails, can you go to the "All Systems" display and verify whether the system and its iLO are properly associated? That means the system is listed with a green check box in the MP column, AND the iLO is listed and says "in server "

Thanks,
Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

Yes, that's what I meant. I just tried using 'power off' right after the failure, and it powered off fine with the messages below. The association is still in place, and all MP functionality seems fine, other than this capture failure.

---

Checking to see if power is on.
Attempting to power down via management processor
Completed succesfully
Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

Also, version information of our system:

Operating system: Linux
Version: Systems Insight Manager 5.2 with Update 2 - Linux
Build version: C.05.02.02.00
Build date: 2008-07-04 10:18

HP Insight Control Environment for Linux 2.00.01
Mitchell Kulberg
Valued Contributor
Solution

Re: ICE-Linux: Image capture fails to reset machine at end

Hi,

Sorry. Not meaning to ignore you.
I keep getting called away.
I honestly don't know what is causing your particular symptom. The "Error connecting to iLO" error is very odd.

There are log files that might tell us something. The SIM log files are in /var/opt/mx/logs

The two biggies are mx.log and mxdomainmgr.0.log

Check these files when you see the error.

You might look in there to see if anything meaningful is being displayed. I will try and look at this more next week, but don't have the time today. Sorry.
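
As a rough aid for checking those two files, here is a minimal Python sketch that lists log lines mentioning a given string, such as the iLO address or part of the error text. The directory and file names come from the post above; the function names are hypothetical, and since the log format is not described here the sketch only does plain case-insensitive substring matching.

```python
from pathlib import Path

LOG_DIR = Path("/var/opt/mx/logs")           # location given in the post
LOG_FILES = ("mx.log", "mxdomainmgr.0.log")  # the two files named in the post


def matching_lines(lines, needles):
    """Yield (line_number, text) for lines containing any needle (case-insensitive)."""
    lowered = [needle.lower() for needle in needles]
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(needle in text.lower() for needle in lowered):
            yield number, text


def scan_logs(needles, log_dir=LOG_DIR, names=LOG_FILES):
    """Return {file name: [(line_number, text), ...]} for the log files that exist."""
    hits = {}
    for name in names:
        path = Path(log_dir) / name
        if path.is_file():
            # errors="replace" avoids failing on stray non-text bytes in a log.
            with path.open(errors="replace") as handle:
                hits[name] = list(matching_lines(handle, needles))
    return hits
```

For example, `scan_logs(["192.168.28.157", "Error retrieving BMC"])` run on the management server right after a failed capture would return any lines that mention the iLO address or the error text.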

Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

Nothing unusual appears in either of the log files given. I've opened a case on this with HP, and they've had me delete both the iLO and the server itself from SIM and then re-discover/identify them. I'm having the same issue after doing that. Imaging all our other servers works fine (found that out today); it's only this one specific server that has the issue. BIOS and iLO firmware are up to date.
Mitchell Kulberg
Valued Contributor

Re: ICE-Linux: Image capture fails to reset machine at end

VERY interesting.

The message specifically says that it's having trouble contacting the iLO, so I think you should test iLO connectivity.

Try using telnet, SSH, and the web interface to connect to the iLO, and make sure you use the username and password you specified during the ICE-Linux installation.

While connected through the web, go to the Administration tab and then select "Management". On that screen click "View XML reply" and see if the data that comes up looks OK. Post it here if there is nothing sensitive.
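
The basic reachability part of that check can be scripted from the management server. The Python sketch below only tests whether a TCP connection can be opened to each service; it does not log in or verify credentials. The helper name is hypothetical, and the port numbers are the conventional defaults for SSH, telnet, and HTTPS, which is an assumption because an iLO can be configured to use other ports.

```python
import socket

# Assumed default ports; adjust if the iLO has been configured differently.
DEFAULT_PORTS = {"ssh": 22, "telnet": 23, "https": 443}


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return {service: bool} saying whether a TCP connection to each port succeeds."""
    results = {}
    for name, port in ports.items():
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout):
                results[name] = True
        except OSError:
            results[name] = False
    return results
```

Calling `check_ports("192.168.28.157")` returns one True/False entry per service. A False for telnet is expected if telnet is disabled on the iLO.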

Thanks,
Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

Telnet is disabled on the iLO, but SSH and web access work fine, and I have double-checked the password.
The XML output looks fine and is attached (numbers censored with #).
Sarah Nordstrom
Frequent Advisor

Re: ICE-Linux: Image capture fails to reset machine at end

The issue is resolved! The support representative suggested a few things to try, and this is what eventually fixed it.
1. Delete server and iLO from HP SIM.
2. Delete the pxelinux.cfg config file for this server's MAC.
3. Remove all DHCP leases for this server's MAC.
4. Boot ICE-Linux and let it autodiscover.
5. Capture an image.

Everything works now. The reboot via SSH fails, then it tries via SOAP, which succeeds.
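
For steps 2 and 3 above, the following Python sketch may help locate what to delete. PXELINUX names a per-machine config file `01-` followed by the MAC address in lower-case hex with dashes, inside the `pxelinux.cfg` directory. The lease lookup assumes the DHCP server keeps leases in the ISC dhcpd `dhcpd.leases` text format, which the thread does not confirm, and it is a simple pattern match rather than a full parser. The function names and the MAC address in the usage note are hypothetical placeholders.

```python
import re


def pxelinux_cfg_name(mac):
    """Return the per-MAC PXELINUX config file name, e.g. 01-00-1a-4b-aa-bb-cc."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    return "01-" + "-".join(pairs)  # 01 is the ARP hardware type for Ethernet


# Assumption: ISC dhcpd style blocks of the form "lease <ip> { ... }".
LEASE_RE = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.DOTALL)


def leases_for_mac(leases_text, mac):
    """Return the IPs of lease blocks whose 'hardware ethernet' entry matches mac."""
    wanted = pxelinux_cfg_name(mac)[3:].replace("-", ":")
    found = []
    for ip, body in LEASE_RE.findall(leases_text):
        match = re.search(r"hardware\s+ethernet\s+([0-9a-fA-F:]+)\s*;", body)
        if match and match.group(1).lower() == wanted:
            found.append(ip)
    return found
```

With a placeholder MAC of `00:1A:4B:AA:BB:CC`, `pxelinux_cfg_name` gives `01-00-1a-4b-aa-bb-cc`, which is the file to look for under `pxelinux.cfg`. `leases_for_mac` only reports matching leases; stop the DHCP service before editing the lease file by hand.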