Server Management - Systems Insight Manager

How to get a reliable list of problematic iLOs?

 
Occasional Contributor

Hello,

 

Almost every day I run into a random iLO that isn’t reachable (not pingable, or pingable but users can no longer log on because the iLO hangs, etc.). I work at a site with more than 500 physical servers (G7, G6, G5, G4 and even G3 servers, with all kinds of iLOs), so the probability of hitting this problem is high.

 

I want a realistic overview of the iLOs that are unusable at a given time (not pingable, cannot log on, iLO no longer detected, …). Simply looking at the “all management processors” collection doesn’t give a good result: some iLOs shown as critical seem perfectly reachable, while others stay green although they are not reachable. The task “daily system identification” gives me a more trustworthy result, and the target details for the device in the target instance are very useful. This task is probably also responsible for the logging entries “SSH login:… (sim server) and SSH logout:  … (simserver)” in the iLO event log of the server. I want to write a script that extracts this iLO information and produces a list of iLOs where something is wrong. The script could be scheduled to run daily.
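
As an illustration of the kind of script described above, here is a minimal Python sketch that does not touch the SIM database at all. It only checks whether each iLO accepts TCP connections on its web and SSH ports. The port numbers (443 and 22) are assumed defaults, and the names `ILO_PORTS`, `tcp_open`, `classify` and `problem_ilos` are made up for this example. An open port does not prove that a logon will succeed on a hung iLO, so treat the result as a first filter only.

```python
import socket
from typing import Callable, Dict, Iterable

# Assumption: iLO web interface on 443 and SSH on 22 (adjust if your site changed them).
ILO_PORTS = {"https": 443, "ssh": 22}


def tcp_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def classify(host: str, probe: Callable[[str, int], bool] = tcp_open) -> str:
    """Classify one iLO as 'ok', 'degraded' (some ports answer) or 'unreachable'."""
    results = [probe(host, port) for port in ILO_PORTS.values()]
    if all(results):
        return "ok"
    if any(results):
        return "degraded"
    return "unreachable"


def problem_ilos(hosts: Iterable[str],
                 probe: Callable[[str, int], bool] = tcp_open) -> Dict[str, str]:
    """Return only the hosts whose state is not 'ok', mapped to that state."""
    report = {}
    for host in hosts:
        state = classify(host, probe)
        if state != "ok":
            report[host] = state
    return report
```

Called as `problem_ilos(["ilo-srv001", "ilo-srv002"])` (hypothetical host names), it returns a dictionary containing only the iLOs that need attention, which a scheduled job could mail or write to a file.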

 

I have already spent a lot of time exploring the tables in the SIM database and the logs on the SIM server, but without success. My question is: in which table or log can I find the target details information? Can somebody help me?

 

 

Thanks,

 

Alain

 

Honored Contributor

Re: How to get a reliable list of problematic iLOs?

The status of the iLO in SIM (i.e. green, amber, etc.) is not necessarily a reflection of whether we can ping the iLO. I would create a group for all your iLOs and then set up ping timeouts/retries specifically on that group. The hardware status polling task checks the overall state of the iLO. You can create a report and use the SQL it generates to create your task.
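
To show what a timeout/retry policy means outside of SIM, here is a small Python sketch for a standalone script. It is not SIM's own ping setting, only the same idea: an iLO is reported as down only after several failed attempts. The function name `reachable_with_retries` and its parameters are invented for this example, and `check` can be any function that returns True when the iLO answers.

```python
import time
from typing import Callable


def reachable_with_retries(check: Callable[[], bool],
                           retries: int = 3,
                           delay: float = 2.0,
                           sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `retries` times and return True on the first success.

    Waits `delay` seconds between failed attempts (not after the last one).
    """
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False
```

With more retries and a longer delay, an iLO that is merely slow to answer is less likely to be flagged, at the cost of a longer run over 500 systems.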

If my post was helpful please award me Kudos! or Points :)
Occasional Contributor
Solution

Re: How to get a reliable list of problematic iLOs?

Hello shocko,

 

I tried your proposal. It didn’t completely solve my problem, but your idea was a good hint toward a real solution. I created a new collection with all the iLOs we want to manage, then set up a new Identify Systems task for this collection that runs each day. In the task instance result I get a detailed log with a status for each iLO. The problem with the hardware status polling task is that I can’t get a log from it.

 

Thanks !

Honored Contributor

Re: How to get a reliable list of problematic iLOs?

Glad I could help ;)
