Insight Control for Linux

Capturing Image from BL680c G5 Server

Can you please advise how I can capture a Linux image from my BL680c G5 server? I tried the steps on page 26 of the user guide, but when I run the command /etc/init.d/hpasm reconfigure it says hpasm not found. Am I supposed to install more SPs or RPMs? My iLO is at 1.43 and the firmware is at 8.

Any advice?
57 REPLIES
Christopher Grandinetti
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

Mario,

If the node you are trying to capture the image from was deployed with ICE-Linux, it should have hpasm installed and there should be no need to run hpasm -reconfigure.

Is this a pre-existing node that was not deployed with ICE-Linux?

If so, you must either deploy the PSPs using ICE-Linux or log in to the node and manually install hpasm.
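
Before choosing between those two options, it may help to confirm whether hpasm is actually present on the node. Here is a minimal sketch, assuming the init script lives at /etc/init.d/hpasm as in the question. The function name is made up for illustration.

```shell
#!/bin/sh
# Report whether an init script exists and is executable.
# The default path is the one quoted in the question; adjust it if yours differs.
has_init_script() {
    script="${1:-/etc/init.d/hpasm}"
    if [ -x "$script" ]; then
        echo "found: $script"
        return 0
    fi
    echo "missing: $script"
    return 1
}
```

Running `has_init_script` with no argument checks the default path. On an RPM-based system, `rpm -q hpasm` would be another way to check, assuming the package is named hpasm.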

-Chris

Re: Capturing Image from BL680c G5 Server

I was able to run hpasm reconfigure without any issue on this preinstalled Linux node. I am still not able to capture the Linux image. I tried to boot this blade via PXE and I get an internal stack error after this prompt:

Boot to default BIOS local Storage
Boot to ICE-linux RAM disk environment

It selects the ICE-Linux entry and then the stack error appears.

Peter Havens
Advisor

Re: Capturing Image from BL680c G5 Server

Mario,

Can you post the details of the stack error that you are seeing?

Thanks,

-Peter
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

The stack error is below:

Int 14: CR2 ffffd030 err 00000000 EIP c01102a0 cs 00000060 flags 00010086
Stack c0110495 c0110545 c010fa9f 00000000 c011f0be c030f0a8 c03be1e0 20600000
Donna Firkser
Regular Advisor

Re: Capturing Image from BL680c G5 Server

Mario,

I have a couple of questions:

a) Did you use ICE-Linux to install the system from which you are trying to capture the image? Or was the system already installed with a supported OS?

b) What OS is running on the system you are trying to capture?

c) On the system you are trying to capture, can you send us the output of the following command.
# /etc/init.d/hpasm status

Thanks,
Donna

Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

The OS was already installed on this Blade.

The output of /etc/init.d/hpasm status is:
Status of hp-ilo:
hp_ilo is loaded... [ OK ]

Using high performance hp-OpenIPMI device driver


hp-OpenIPMI Status:
Module Size Used by
ipmi_si 51928 2
ipmi_devintf 18192 4
ipmi_msghandler 41864 2 ipmi_si,ipmi_devintf

hpasmxld is running... [ OK ]
Status of Foundation Agents (cmafdtn): cmathreshd cmahostd cmapeerd

cmathreshd is running... [ OK ]

cmahostd is running... [ OK ]

cmapeerd is running... [ OK ]
Status of Server Agents (cmasvr): cmastdeqd cmahealthd cmaperfd cpqriisd cmasm2d cmarackd

cmastdeqd is running... [ OK ]

cmahealthd is running... [ OK ]

cmaperfd is running... [ OK ]

cpqriisd is running... [ OK ]

cmasm2d is stopped... [ OK ]

cmarackd is stopped... [ OK ]
Status of Storage Agents (cmastor): cmaeventd cmaidad cmafcad cmaided cmascsid cmasasd

cmaeventd is stopped... [ OK ]

cmaidad is stopped... [ OK ]

cmafcad is stopped... [ OK ]

cmaided is stopped... [ OK ]

cmascsid is stopped... [ OK ]

cmasasd is stopped... [ OK ]
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

Apologies for the piecemeal answer.

a) The OS was already installed on the blade whose image I am trying to capture.

b) Red Hat Enterprise Linux 4 Update 6 on x64

c) Here is the output of the command you asked for:
# /etc/init.d/hpasm status
Status of hp-ilo:
hp_ilo is loaded... [ OK ]

Using high performance hp-OpenIPMI device driver


hp-OpenIPMI Status:
Module Size Used by
ipmi_si 51928 2
ipmi_devintf 18192 4
ipmi_msghandler 41864 2 ipmi_si,ipmi_devintf

hpasmxld is running... [ OK ]
Status of Foundation Agents (cmafdtn): cmathreshd cmahostd cmapeerd

cmathreshd is running... [ OK ]

cmahostd is running... [ OK ]

cmapeerd is running... [ OK ]
Status of Server Agents (cmasvr): cmastdeqd cmahealthd cmaperfd cpqriisd cmasm2d cmarackd

cmastdeqd is running... [ OK ]

cmahealthd is running... [ OK ]

cmaperfd is running... [ OK ]

cpqriisd is running... [ OK ]

cmasm2d is stopped... [ OK ]

cmarackd is stopped... [ OK ]
Status of Storage Agents (cmastor): cmaeventd cmaidad cmafcad cmaided cmascsid cmasasd

cmaeventd is stopped... [ OK ]

cmaidad is stopped... [ OK ]

cmafcad is stopped... [ OK ]

cmaided is stopped... [ OK ]

cmascsid is stopped... [ OK ]

cmasasd is stopped... [ OK ]
Donna Firkser
Regular Advisor

Re: Capturing Image from BL680c G5 Server

So far so good. Did you discover the server and its iLO with SIM? And was SIM able to make the iLO-to-server association? You can tell by bringing up the SIM GUI and looking for the server and iLO in the "All Systems" display. You should see your server in this display (e.g. foo) and the iLO too (e.g. foo-cp). If SIM was able to make the correct association during discovery, the iLO entry should look something like "foo-cp in Server foo" and the "MP" column for the server "foo" entry should have a green check mark. The server must be properly discovered and registered in SIM before you can deploy or capture an image.
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

SIM was able to discover the blade and the iLO.

Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

Mario,

Where and when exactly did you see this stack trace message?

In the CMS system log? CMS SIM log? In the SIM GUI? On the managed node? Etc...

And tell me at what point in the capture did you see it.

Also, you wrote that SIM was able to discover the blade and the iLO. Can you verify that the iLO is properly associated with the server as described above?

Also, there are some additional steps that need to be done when you use SIM to discover a node. These are:

- Verify the iLO association
- Run 'Configure or Repair Agents' against that node
- Apply an ICE-Linux license

Can you verify that these steps were performed?

Thanks,
Mitch
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

Mario,

I see these messages on the console of the BL680c when I boot the target server (the BL680c) in PXE boot mode.

I don't see them during the capture.


The iLO is running.

I ran Configure or Repair Agents and here is the output:

Configuration of agents started, waiting for it to be completed.
Configure Agents and Providers (START) ...
Configuring SSH authentication (START)...
Configure SSH for host based authentication (DONE)................. [SUCCESS]

LINUX configuration command (START)...
Configuring SNMP Settings (START)...
Stopping SNMP daemon...
Stopping snmpd: [ OK ]
Adding read community string public
Trap destination address ifdssim already in /etc/snmp/snmpd.conf
Restarting SNMP Daemon...
Starting snmpd: [ OK ]
Setting SNMP trap destination / SNMP read community string (DONE)....[SUCCESS]


Set Trust relationship to "Trust by Certificate" (START) ...
Setting Trust for System Management Home Page.
Stopping System Management Home Page
Stopping hpsmhd:
Copying /var/opt/mx/tmp/ifdssim.pem to /opt/hp/hpsmh/certs
Restarting System Management Home Page
Starting hpsmhd:
...hpsmhd: Could not determine the server's fully qualified domain name, using 172.25.41.247 for ServerName
[ OK ]
Set Trust relationship to "Trust by Certificate" (DONE)............. [SUCCESS]


Setting admin password/Trust for Insight Management Agents 7.1/earlier (START)..
Setting admin password/Trust for legacy HP Server Management Agents..[SKIPPED] HP Server Management Drivers and Agents, is not installed.


Linux configuration commands (DONE)................................. [SUCCESS]

Re-identifying system to get update information (START) ...
Re-Identification of system (DONE).............................. [SUCCESS].

Subscribing to WBEM / WMI indications ...
Subscribe to WBEM / WMI Indications (DONE).......................... [FAILED]
Check whether target system met the requirements and all of the software required to support indications is installed.
WBEM protocol settings are not valid/enabled for this system in HP SIM. Check your HP SIM settings.


The ICE-Linux license is applied.

Thanks,
Mitch
Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

You see the stack trace on the console of the managed system when you PXE boot? Correct?

Let's try this:

- Record the MAC address from the managed server

- Shut down the managed node

- Go to the SIM CMS and delete the managed node and its iLO from SIM

- Go to /opt/repository/boot/pxelinux.cfg and delete any files you see there that match the MAC address you recorded in step 1

- Power on the node, have it PXE boot, and watch the console

- It should boot up into the ICE-Linux ramdisk and automatically be rediscovered in SIM

- After about 3-5 minutes, the node should shut down and reboot back into the OS you had running

Please let me know if this works, or if you get the stack trace message when it boots the ramdisk.

Thanks
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

When I powered on the node and booted it via PXE, it kept going around in a never-ending loop. When I typed maintenance at the prompt, it gave the stack error. One thing I have noticed is that initially, when I tried to capture a Linux image, I was getting an error at step 1; now it is at step 2. The error is below:

Setting one time PXE.
Could not set one time PXE:
Error retrieving BMC for server. Root cause:Could not determine the BMC associated with the server (admelprd1)
in the database. Probably not discovered yet.


Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

This new error you're getting is the result of SIM losing the server to iLO association. So it doesn't know which iLO belongs to its server. Let's not worry about that too much right now. It will go away if we ever get this working.

The fact that you got this error though, and the fact that you get into the endless loop of PXE booting, makes me think you didn't get the server properly deleted in SIM.

Shut down the managed node.

On the CMS, go to 'all systems', select that server, and select its iLO, and click 'delete' at the bottom of the screen.

Then, log into the CMS and cd to /opt/repository/boot/pxelinux.cfg

In there you will see files with names that look like MAC addresses. You need to delete any files that match any of the MAC addresses on your managed server.
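
To make the matching less error-prone, a small helper can turn a MAC address into the file name PXELINUX looks for. This is only a sketch, assuming the files follow the stock PXELINUX convention of `01-` followed by the MAC in lowercase with dashes. ICE-Linux may name its files differently, so compare against what is actually in the directory before deleting anything. The function name and the sample MAC below are made up.

```shell
#!/bin/sh
# Convert a MAC address (colon- or dash-separated, any case) into the
# per-host config file name used by stock PXELINUX: "01-" + lowercase MAC.
pxe_cfg_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}
```

For example, `pxe_cfg_name 00:1F:29:6F:64:03` prints `01-00-1f-29-6f-64-03`. If a file by that name shows up in `ls /opt/repository/boot/pxelinux.cfg`, that is the one to delete.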

Once those files are deleted, try PXE booting the managed node again and let me know what happens. It should NOT loop, but instead boot into the RAMdisk and discover itself in SIM.

Watch the rediscovery on the console of the managed node. If it works, you should see the system start to reboot after about 2 to 5 minutes. If it stack traces going into the ramdisk, then we need to look elsewhere.
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

followed the process and it stacktraced going to ramdisk
Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

That is what I was afraid of...

So there is something special about this system that is causing our ramdisk, which normally boots fine on all supported platforms, to crash on yours.

First - is there any special hardware on this server? Any mezzanine cards or storage blades or accelerators or anything that might be considered a deviation from a standard server? How many CPUs do you have on this node and how many cores in each?

Second - I have to ask. Is your firmware up to date? Our software relies on recent, if not the latest versions of firmware for your server's BIOS and iLO. Consider downloading and booting the latest firmware update CD. It upgrades not just the iLO and BIOS, but just about every other thing in your server.

Third - If that doesn't reveal anything, I'd like you to reset the system BIOS to factory defaults and try again.

Let me know the results of these items.
Thanks,
Mitch
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

This is a BL680c with 4 quad-core CPUs. No mezzanine cards or storage blades.

The firmware version is at 8. The iLO is at 1.5.

Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

Well - it doesn't get more up to date than that.

Let me see if anyone around here knows anything about this.

Mitch
Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

On a long shot, did you try to reset the BIOS to factory defaults? It's not like this is a known problem or anything like that, but it couldn't hurt.

In the meantime, I am checking to see if we have tested with this particular configuration.
Mitchell Kulberg
Valued Contributor

Re: Capturing Image from BL680c G5 Server

Mario,

Above you listed out the stack trace error you got.

How did you copy that?

Did you copy it off the screen or do you have some type of terminal connected to the console?

We would be interested in getting the surrounding text from the crash.

By the way, we are continuing to try and reproduce what you are seeing.

Mitch
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

I tried restoring the BIOS to its factory default settings, but no luck.

The stack trace was typed by hand. The surrounding text is the "uncompressing linux ... kernel" message.

Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

This is what I see on the SIM page.
Mario Couthino
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

Here is the complete message on the stack trace

loading /icletoolkitboot/kernel
loading /icletoolkitboot/initrd.img

uncompress linux ok booting kernel

Then I get the stack error.
Christopher Grandinetti
Frequent Advisor

Re: Capturing Image from BL680c G5 Server

Hello Mario,

We installed a BL680c with four quad-core CPUs into our c7000 enclosure, which had 5 existing nodes installed with SUSE 10 SP1 running ICE-Linux. The BL680c was previously installed with RHEL4U5 and had BIOS version 10/18/2007 and iLO version 1.50.

We tried the image capture and it worked fine; we saw no issues:

puma1:/opt/repository/image/puma6-1f29-6f64-3-61-200804250320 # ls -ltr
total 1422660
-rw------- 1 root root 1455380480 Apr 25 15:25 puma6-1f29-6f64-3-61-200804250320.tar.gz

So, then we upgraded the BIOS firmware to the latest version: 02/13/2008.

We again tested the image capture and it worked fine too:

puma1:/opt/repository/image # ls -ltr
total 0
drwxr-xr-x 2 root root 104 Apr 25 15:20 puma6-1f29-6f64-3-61-200804250320
drwxr-xr-x 2 root root 112 Apr 25 16:21 puma6-2-1f29-6f64-3-61-200804250421
puma1:/opt/repository/image # cd puma6-2-1f29-6f64-3-61-200804250421
puma1:/opt/repository/image/puma6-2-1f29-6f64-3-61-200804250421 # ls -ltr
total 1423929
-rw------- 1 root root 1456680960 Apr 25 16:26 puma6-2-1f29-6f64-3-61-200804250421.tar.gz
puma1:/opt/repository/image/puma6-2-1f29-6f64-3-61-200804250421 #
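
A listing only shows that the archive exists. As a rough extra check, assuming the captured image is an ordinary gzip-compressed tar file as the `.tar.gz` name suggests, one could test that it decompresses cleanly. The function name is hypothetical.

```shell
#!/bin/sh
# Return 0 if the file is non-empty and passes gzip's integrity test.
check_capture() {
    archive="$1"
    [ -s "$archive" ] || { echo "empty or missing: $archive"; return 1; }
    gzip -t "$archive" 2>/dev/null || { echo "corrupt: $archive"; return 1; }
    echo "ok: $archive"
}
```

It would be called with the full path to the `.tar.gz` file under /opt/repository/image. A multi-gigabyte archive may take a while to test.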

We'll continue to try to reproduce the issue you are seeing.

In the meantime, can you send us the output of 'hplog -v' from the BL680c?

Also, please type 'hpasmcli' to get to the hpasmcli> prompt and reply with your blade's output for the following commands:

hpasmcli> show serial bios
BIOS console redirection port is currently set to COM1/9600.

hpasmcli> show serial embedded
Embedded serial port A: COM2
Embedded serial port B: Disabled

hpasmcli> show serial virtual
The virtual serial port is currently COM1.

hpasmcli> show server (lots of output so I won't paste it here)
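
If typing these interactively is inconvenient, the same information can probably be gathered in one pass. This is a sketch that relies on hpasmcli's `-s` option, which takes a semicolon-separated command string. The function names and the report path are arbitrary choices for illustration.

```shell
#!/bin/sh
# Join arguments into one semicolon-separated string, the form that
# hpasmcli's -s option accepts for non-interactive use.
join_cmds() {
    out=""
    for c in "$@"; do
        out="${out:+$out; }$c"
    done
    printf '%s\n' "$out"
}

# Write the hplog and hpasmcli output requested above into a single file.
# The default report path is an assumption; pass another path if preferred.
collect_bl680c_info() {
    report="${1:-/tmp/bl680c-report.txt}"
    {
        hplog -v
        hpasmcli -s "$(join_cmds 'show serial bios' 'show serial embedded' \
                                 'show serial virtual' 'show server')"
    } > "$report" 2>&1
}
```

Running `collect_bl680c_info` as root on the blade should leave everything in one file that can be attached to a reply.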

Thanks,
-Chris