Server Management - Systems Insight Manager

Running Jobs not completing

Rob Buxton
Honored Contributor


Hi All,
HPSIM 5.0 SP3.
I've got several tasks that are stuck at 0% complete in a running state.
They never end, and you cannot stop or delete them.
Has anyone seen something similar and know of a way to remove them?

I've had a quick look through the various mx jobs, but there doesn't seem to be a way of killing actual jobs by ID.

A couple of the tasks seem to relate to a deployment of a PSP to a new Server that never completed. Another is a custom command.

We also had an issue where events were not being e-mailed and we had to restart HPSIM.
It seems SP3 has made a few things a bit fragile.
3 REPLIES
OlivierV
Trusted Contributor

Re: Running Jobs not completing

Hello Rob.
Do you have pending instances of these tasks?

I had a similar issue on our SIM 4.2 PSP 2, where some HW polling tasks had many pending instances. Today it is back to a normal state. The only difference I see is that one of these tasks used to take 3 minutes to complete and now takes 1 second (HW status polling (ping only) for about 40 servers on an SMHD loop to which the CMS is also connected, so it should be very fast!).
I suppose one of the machines in this list was causing a delay and making the other tasks take a long time. Could you try disabling one of these tasks and see how the others behave?
You may have to restart the SIM service if you have too many pending instances.
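If it helps to narrow down which machine is slow before disabling tasks, here is a minimal Python sketch of the idea, run from the CMS or any box on the same network. It is not anything SIM does itself: SIM's ping-only polling is not a TCP connect, so the TCP probe, the port 80 default, the 1-second threshold and the function names are all assumptions made for illustration.

```python
import socket
import time


def tcp_probe(host, port=80, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    Assumption: a TCP connect is used as a stand-in for SIM's own polling.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return None


def find_slow_hosts(hosts, probe=tcp_probe, threshold=1.0):
    """Return (host, elapsed) pairs that failed or took longer than threshold."""
    suspects = []
    for host in hosts:
        elapsed = probe(host)
        if elapsed is None or elapsed > threshold:
            suspects.append((host, elapsed))
    return suspects
```

Calling `find_slow_hosts` with the names of the systems in the polling task returns only the ones that did not answer or answered slowly, which should be the first candidates to pull out of the task.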

I was considering upgrading my SIM 4.2 PSP 2 to SIM5 SP3, but I think I will upgrade it only to SIM5 SP2; there are too many issues with this release. And no news from HP for the moment, it seems :-((

Regards.
Scott Shaffer
Esteemed Contributor

Re: Running Jobs not completing

Hey Rob, there are a couple of issues here and I think I can help.

There are 3 known ways this can happen.

First, if you have WBEM enabled but don't have any 'WBEM-only devices' (like HP-UX or SAN-storage) then you might want to disable WBEM. It can get in an unfriendly state and cause this. If you do have either of those, leave it on - we are working on this problem right now and should have a fix in a couple of months.

Second, there are unusual situations where an iLO can get messed up - maybe you've seen a state where you browse in and get some strange telnet-looking screen? Resetting the iLO fixes it and it's pretty rare, but this causes SIM trouble - the web server sort-of responds and we get stuck. Again, we're working on making this more bullet-proof for a future release.

Finally, there are issues with tasks like this where sometimes they don't seem to respond properly - we're not sure what these are.

Restarting SIM clears them out, but of course the tasks might or might not be finished.
Dude, we've been totally misled by our album covers!
Rob Buxton
Honored Contributor

Re: Running Jobs not completing

Scott,

Thanks, ahhh WBEM again!
We do use it for our VMware based Servers and are looking to extend its usage to our Solaris Servers.
The two jobs that failed were part of an initial Software Deploy to two Servers. The first completed; the second looks like it did deploy the Software but didn't reboot the Server. So this deploy task and the initiating task were both in a running state.

I've rebooted the Server that the deployment failed on, and overnight the two tasks associated with it that I'd tried to cancel / delete have moved to a cancelled state.

The remaining job is our custom command that runs whenever a Critical Server is detected. I've tried killing it; unlike the other jobs, it comes up with a Cancel or Kill option. I tried both, but neither does much. The stuck job doesn't seem to be blocking anything else, so I'll leave it for now.