Server Management - Systems Insight Manager


HP SIM CMS stops responding after sending an install software and firmware task to more than 10 srv

This is a summary of the problem that we are experiencing on the HP SIM CMS. We first saw this problem in version 4.2, and it is still present in version 4.2 SP1. We can reproduce it easily and have reinstalled the HP SIM CMS server 5 times since then, on two different machines (both DL380 G3).

The problem:
Every time we issue a Deploy -> Deploy Drivers, Firmware and Agents -> Install Software and Firmware task to 10 or more machines, the HP SIM server stops responding on the network, and the only fix is to reboot it. This happens when you deploy more than one component to 10 or more machines. For example, the last task that froze the CMS server was the following one, sent to 11 machines (the server had to be rebooted twice; after I rebooted it the first time, it froze again).


When I deploy one component to more than 10 machines, HP SIM processes the first 10 machines correctly (the job randomly selects 10 machines from the target group and deploys the component to them correctly). After that the job stalls, and the status for that task shows that it never finishes installing on the remaining machines. For example, if the total number of machines is 20, it reports correctly for the first 10 machines, the total number of machines in the status field does not increase for at least 20 minutes, and after that it reports "Process timeout" for the remaining 10 machines.

Here is the configuration of the HP SIM Server:

Model: ProLiant DL380 G3
1 CPU Intel Xeon 2.80 GHz
2GB of memory

Windows 2003
HP SIM installed using a local administrator account
HP SIM version 4.2 SP1
Microsoft SQL Server 2000 - 8.00.818 (Intel X86) May 31 2003 16:08:15 Copyright (c) 1988-2003 Microsoft Corporation Standard Edition on Windows NT 5.2 (Build 3790: ) installed locally on the CMS server

After all the machines had been discovered, I followed pages 25-46 of the new SSH PDF document to configure SSH (version 4/2005) on the CMS server. I was able to fix the trust relationship between the CMS server and all the managed servers (180 servers), and I was able to run the "configure and repair agents" task as well as the "copy agents settings" task successfully on any number of servers.

What I cannot do is run the "install software and firmware" task on more than 10 servers. There are no entries in the System or Application log on the CMS server that would indicate a problem.
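A possible stopgap, not tested in this thread and assuming the failure really depends on how many target systems one task receives, would be to split the target list into batches of fewer than 10 and submit one task per batch. The sketch below shows only the batching step in Python. The function name `make_batches`, the host names, and the batch size of 9 are illustrative assumptions, and the code does not call any HP SIM interface.

```python
from typing import List, Sequence


def make_batches(targets: Sequence[str], batch_size: int = 9) -> List[List[str]]:
    """Split targets into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(targets[i:i + batch_size])
        for i in range(0, len(targets), batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical host names standing in for the 180 managed servers.
    servers = [f"srv{n:03d}" for n in range(1, 181)]
    for number, batch in enumerate(make_batches(servers), start=1):
        # Each batch would be submitted as its own deployment task.
        print(f"batch {number}: {len(batch)} systems, {batch[0]} .. {batch[-1]}")
```

With 180 servers and a batch size of 9 this gives 20 batches. Whether consecutive small tasks avoid the freeze is an open question; waiting for one batch to finish before sending the next would be the cautious way to find out.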

Has anybody experienced the same problem? I would really appreciate help resolving it.
Szabolcs Petho_1
Regular Advisor

Re: HP SIM CMS stops responding after sending an install software and firmware task to more than 10 srv

Hi Anamarija,

We have experienced something very similar.

I tried to send an installation to 11 machines.
For up to 10 minutes it showed only 10 machines under Total Systems, and we got 3 Succeeded and 7 Failed statuses.
When the task finished, the 11th server was also displayed and Total Systems went to 11, but I see 1 In Progress and 10 Process timeout statuses...

Also, I don't know why the original task window shows one server's status as In Progress, while querying it from the menu Logs->View Task Results shows that server as Succeeded.

It also happens that hpsmhd.exe consumes all the processor time and the task never completes; all servers stay in the In Progress state. In this case I can only kill the hpsmhd.exe process and restart the HP Systems Management Homepage service. After that the task continues, with the same anomaly I described at the beginning.

In our case the cause of the failures is that hpsmhd.exe fails, but that is a different story; you can find it in this thread:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=823570

Regards, Szabolcs