ILO, IML issues DL360 G7 -- cpqHeEventLogCondition

 
Jaime Halas
Occasional Contributor

One of our new G7s blue screened, and the IML shows an entry for the blue screen followed immediately by an “uncorrectable memory error.”  Our current memory monitor did not detect the memory failure; long story short, the way its query is written, it will not find an error like this.  That led us to consider querying the IML for “critical” failures, to catch this issue as well as others that might be out in the client base.  I can query the MIB object cpqHeEventLogCondition (1.3.6.1.4.1.232.6.2.11.2), which returns other(1), ok(2), degraded(3), or failed(4).  The object is described as “The overall condition of the Integrated Management Log feature.”  Judging by some documentation out there (unknown validity), the statuses are defined as follows:

                other(1): When System Event Log is not supported or unknown
                ok(2): If there are no degraded, warning, nor failed entries
                degraded(3): If there are degraded or warning but no failed entries
                failed(4): If there are failed entries
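
To make the rule above concrete, here is a minimal Python sketch of the rollup as that documentation describes it. It is not taken from HP's agent code. The per-entry labels ("failed", "degraded", "warning") and the function name overall_condition are assumptions made for illustration, and how IML severities such as "critical" or "caution", or a repaired entry, map onto those labels is not something the documentation states.

```python
from enum import IntEnum


class LogCondition(IntEnum):
    """Values returned by cpqHeEventLogCondition, as listed above."""
    OTHER = 1
    OK = 2
    DEGRADED = 3
    FAILED = 4


def overall_condition(entry_conditions, log_supported=True):
    """Roll up per-entry conditions using the (unverified) documented rule.

    entry_conditions is an iterable of hypothetical per-entry labels;
    the real agent may classify entries differently.
    """
    if not log_supported:
        return LogCondition.OTHER
    seen = set(entry_conditions)
    if "failed" in seen:
        # Any failed entry forces the overall status to failed(4).
        return LogCondition.FAILED
    if seen & {"degraded", "warning"}:
        # Degraded or warning entries, with no failed entry present.
        return LogCondition.DEGRADED
    return LogCondition.OK
```

Under these assumptions, overall_condition(["ok", "warning"]) gives LogCondition.DEGRADED, and adding a single "failed" entry moves the result to LogCondition.FAILED.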

On the particular client that had the blue screen and memory error, there were several critical entries in the IML, among them Network Adapter Link Down, Power Supplies Not Redundant, etc. When I started investigating, the status was failed(4), but once the blue screen and memory error entries were marked as acknowledged/repaired, cpqHeEventLogCondition reported ok(2).  I found this confusing, as there were still “critical” and “caution” severity messages in the IML.  Bottom line: what exactly triggers a failed(4) status?  It can’t be just any “critical” severity entry; perhaps the IML is smart enough to pick out the most critical ones? What are the criteria that put us into an overall failed status?