
CMU 7.1 - Actions and Alerts Issue

 
BrentGee
Advisor


Hi Again:

 

I'm trying to gather statistics in the new GUI. On BL2x220 G6s there is no problem; everything works great. However, on my BL2x220 G5s I can't see any statistics whatsoever. I tried using both the native commands and the collectl commands, and neither returned any data.

 

Is there a way I can troubleshoot this?

Rakshika
Advisor

Re: CMU 7.1 - Actions and Alerts Issue

Hello,

 

Please check the following:

- Is passwordless ssh enabled from the head node to the problematic nodes?

- Is the CMU monitoring agent (cmu_cn-7.1-1.x86_64.rpm) installed on those nodes?

- Is there a firewall running between the head node and the problematic compute nodes?

- Are the monitoring agents running on the problematic nodes?

  Please run this command on the head node:

   # /opt/cmu/bin/pdsh -w <nodelist> ps -ef | grep Monitoring | /opt/cmu/bin/dshbak -c

       <nodelist> should be the hostnames of the nodes having the problem, e.g. n[1-5]

- Look in these logs to debug the issue:

    Head node: 

         /opt/cmu/log/MainMonitoringDaemon_<head node name>.log

    Node on which the secondary monitoring daemon is running (the above command will show this):

         /opt/cmu/log/SecondaryServerMonitoring_<compute node name>.log

    Problematic compute nodes: 

        /opt/cmu/log/SmallMonitoringDaemon_n1.log

           

If this doesn't solve the problem, please post the results of the above checks here and send us the logs.
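As a rough sketch of the agent check above, the helper below reads the pdsh output and prints any node from the list you pass in that shows no Monitoring process. The function name `nodes_missing_agent` is hypothetical and not part of CMU. It assumes pdsh's usual output format, where each line is prefixed with `node:`.

```shell
#!/bin/sh
# Hypothetical helper (not a CMU tool): read pdsh-style output
# ("node: <ps line>") from stdin and print every expected node
# for which no line mentioning "Monitoring" was seen.
nodes_missing_agent() {
    # "$@" is the list of expected node names, e.g. n1 n2 n3
    awk -v expected="$*" '
        /Monitoring/ {
            node = $1
            sub(/:$/, "", node)   # strip the trailing ":" of the pdsh prefix
            seen[node] = 1
        }
        END {
            n = split(expected, want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }
    '
}

# Example use on the head node (node names are placeholders):
#   /opt/cmu/bin/pdsh -w n[1-5] ps -ef | nodes_missing_agent n1 n2 n3 n4 n5
```

An empty result means every listed node reported at least one Monitoring process. Any node that is printed is a good place to start with the logs listed above.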

BrentGee
Advisor

Re: CMU 7.1 - Actions and Alerts Issue


@Rakshika wrote:

Hello,

 

Please check the following:

- Is passwordless ssh enabled from the head node to the problematic nodes?

YES

- Is the CMU monitoring agent (cmu_cn-7.1-1.x86_64.rpm) installed on those nodes?

YES

- Is there a firewall running between the head node and the problematic compute nodes?

NO

- Are the monitoring agents running on the problematic nodes?

YES

 

 

I will get back to you on the following instructions:

 

  Please run this command on the head node:

   # /opt/cmu/bin/pdsh -w <nodelist> ps -ef | grep Monitoring | /opt/cmu/bin/dshbak -c

       <nodelist> should be the hostnames of the nodes having the problem, e.g. n[1-5]

- Look in these logs to debug the issue:

    Head node: 

         /opt/cmu/log/MainMonitoringDaemon_<head node name>.log

    Node on which the secondary monitoring daemon is running (the above command will show this):

         /opt/cmu/log/SecondaryServerMonitoring_<compute node name>.log

    Problematic compute nodes: 

        /opt/cmu/log/SmallMonitoringDaemon_n1.log

           

If this doesn't solve the problem, please post the results of the above checks here and send us the logs.


Sorry for the delay. However, now that all of the other issues have been resolved, I will be able to concentrate on my final problem with 7.1. Thank you for all of these suggestions.