CMU 7.1 - Actions and Alerts Issue

BrentGee
Advisor

CMU 7.1 - Actions and Alerts Issue

Hi Again:

 

I'm trying to gather statistics in the new GUI. On the BL2x220 G6s, everything works great. On my BL2x220 G5s, however, I can't see any statistics whatsoever. I tried both the native commands and the collectl commands, and neither returns any data.

 

Is there a way I can troubleshoot this?

2 REPLIES
Rakshika
Advisor

Re: CMU 7.1 - Actions and Alerts Issue

Hello,

 

Please check the following:

- Is passwordless ssh enabled from the head node to the problematic nodes?

- Is the CMU monitoring agent (cmu_cn-7.1-1.x86_64.rpm) installed on those nodes?

- Is a firewall running between the head node and the problematic compute nodes?

- Are the monitoring agents running on the problematic nodes?

  Please run this command on the head node:

   # /opt/cmu/bin/pdsh -w <nodelist> ps -ef | grep Monitoring | /opt/cmu/bin/dshbak -c

       <nodelist> should be the hostnames of the nodes that have the problem, e.g. n[1-5]

- Look in these logs to debug the issue:

    Head node:

         /opt/cmu/log/MainMonitoringDaemon_<head node name>.log

    Node on which the secondary monitoring daemon is running (the above command will show this):

         /opt/cmu/log/SecondaryServerMonitoring_<compute node name>.log

    Problematic compute nodes:

        /opt/cmu/log/SmallMonitoringDaemon_n1.log

If this doesn't solve the problem, please post the results of the above checks here and send us the logs.
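
The first two checks in the list above can be scripted so that they run the same way on every node. This is only a minimal sketch and not something shipped with CMU: check_node and show_agent_log are hypothetical helper names, the package name cmu_cn is assumed from the rpm file name, and the per-node log name is assumed to follow the SmallMonitoringDaemon_n1.log pattern.

```shell
#!/bin/sh
# Hypothetical helpers, meant to be run from the head node. Not part of CMU.

# check_node NODE: verify passwordless ssh and the agent package on one node.
check_node() {
    node=$1
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
        echo "$node: passwordless ssh FAILED"
        return 1
    fi
    # Assumption: the package is named cmu_cn (from cmu_cn-7.1-1.x86_64.rpm).
    if ! ssh -o BatchMode=yes "$node" rpm -q cmu_cn >/dev/null 2>&1; then
        echo "$node: cmu_cn package NOT installed"
        return 1
    fi
    echo "$node: ssh and package OK"
}

# show_agent_log NODE: print the last 20 lines of the node's agent log.
# Assumption: the log is named after the node, as in SmallMonitoringDaemon_n1.log.
show_agent_log() {
    node=$1
    ssh -o BatchMode=yes "$node" tail -n 20 "/opt/cmu/log/SmallMonitoringDaemon_${node}.log"
}
```

With these defined, something like `for n in n1 n2 n3; do check_node "$n" || show_agent_log "$n"; done` would report each node and show the recent agent log for any node that fails a check. The node names here are placeholders.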

BrentGee
Advisor

Re: CMU 7.1 - Actions and Alerts Issue


Rakshika wrote:

Hello,

 

Please check the following:

- Is passwordless ssh enabled from the head node to the problematic nodes?

YES

- Is the CMU monitoring agent (cmu_cn-7.1-1.x86_64.rpm) installed on those nodes?

YES

- Is a firewall running between the head node and the problematic compute nodes?

NO

- Are the monitoring agents running on the problematic nodes?

YES

 

 

Will get back to you with regard to the following instructions:

 

  Please run this command on the head node:

   # /opt/cmu/bin/pdsh -w <nodelist> ps -ef | grep Monitoring | /opt/cmu/bin/dshbak -c

       <nodelist> should be the hostnames of the nodes that have the problem, e.g. n[1-5]

- Look in these logs to debug the issue:

    Head node:

         /opt/cmu/log/MainMonitoringDaemon_<head node name>.log

    Node on which the secondary monitoring daemon is running (the above command will show this):

         /opt/cmu/log/SecondaryServerMonitoring_<compute node name>.log

    Problematic compute nodes:

        /opt/cmu/log/SmallMonitoringDaemon_n1.log

If this doesn't solve the problem, please post the results of the above checks here and send us the logs.


Sorry for the delay. Now that all of the other issues have been resolved, I can concentrate on my final problem with 7.1. Thank you for all of these suggestions.