
HP-SNMP-AGENTS errors

 
Marco Correnti Techno S
Frequent Advisor


Hi all,

We have many old servers running old iLO versions (< 4) and old OS versions (RHEL 3-6), and we need to monitor them for hardware errors.

So we have configured iLO and installed the required RPMs (on servers with iLO < 4, some of the traps are handled directly by the OS).

On some of these servers, the /var file system occasionally fills up with errors logged by the hp-snmp-agents daemons.

Any idea how to solve this issue?

Thanks in advance

Marco

===============
=====SYSINFO===
===============
ProLiant DL380 G6
System ROM: P62 07/24/2009
iLO 2 Firmware Version: 2.31 11/06/2017 (not updated to the latest version, 2.33, as its fixes are not important for us)

iLO configuration
iLO 2 SNMP Alerts: Enabled
Forward Insight Manager Agent SNMP Alerts: Enabled
SNMP Pass-thru: Enabled

====================
=====OS & RPM=======
====================

[root@DB2 LOG]# cat /etc/redhat-release
Red Hat Enterprise Linux ES release 4 (Nahant Update

[root@DB2 ~]# uname -a
Linux DB2 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:34:33 EDT 2009 i686 i686 i386 GNU/Linux

[root@DB2 LOG]# rpm -qa | fgrep hp-        # these are the latest versions available for RH4
hp-ilo-8.5.0-1.rhel4
hp-health-8.6.0.24-24
hp-snmp-agents-8.6.0.18-16

========================================
=====/VAR/LOG/HP-SNMP-AGENTS/CMA.LOG====
========================================

Starting hp-ilo:
Starting Health agent (cmahealthd):
Starting Standard Equipment agent (cmastdeqd):
Starting Host agent (cmahostd):
Starting Threshold agent (cmathreshd):
Starting RIB agent (cmasm2d):
Starting Rack Infrastructure Info Srv (cpqriisd):
Starting Rack agent (cmarackd):
Jul 13 23:55:49 DB2 cmarackd[5883]: Failed to detect rack, cmarackd exiting....
Starting Performance agent (cmaperfd):
Starting SNMP Peer (cmapeerd):
Jul 13 23:55:50 DB2 cmapeerd[5933]: Started hp Advanced Server Management Agents Peer Manager
Starting Storage Event Logger (cmaeventd):
Starting FCA agent (cmafcad):
Jul 13 23:55:50 DB2 cmastdeqd[5675]: getISASystemID: Bypassed Obsolete code
Starting SAS agent (cmasasd):
Starting IDA agent (cmaidad):
Jul 13 23:55:50 DB2 cmasasd[6032]: SAS agent (cmasasd) does not find any supported SAS controller
Jul 13 23:55:50 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmasasd
Starting IDE agent (cmaided):
Jul 13 23:55:50 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmastdeqd
Starting SCSI agent (cmascsid):
Jul 13 23:55:50 DB2 cmascsid[6176]: for line in `grep -n $1 /proc/scsi/scsi | cut -d: -f1`
Jul 13 23:55:50 DB2 cmascsid[6176]: do
Jul 13 23:55:50 DB2 cmascsid[6176]: endline=`expr $line + 2`
Jul 13 23:55:50 DB2 cmascsid[6176]: sed -n "$line,${endline}p" /proc/scsi/scsi >> $2
Jul 13 23:55:50 DB2 cmascsid[6176]: done
Jul 13 23:55:50 DB2 cmascsid[6176]: SCSI agent (cmascsid) does not find any supported dumb SCSI controller
Jul 13 23:55:50 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmascsid
Starting NIC Agent Daemon (cmanicd):
Copyright (c) 2005 Hewlett-Packard Development Company, L.P.
Copyright (c) 2001, NAI Labs

cmanic started INFO: cmanic monitoring /var/log/messages for link changes, cpqnic_trapd.c,2767
Jul 13 23:55:53 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmahealthd

[OK]
Jul 13 23:55:54 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmasm2d
Jul 13 23:55:54 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmathreshd
Jul 13 23:55:55 DB2 cmafcad[6008]: Compaq CISS Driver version 2.6.20
Jul 13 23:55:55 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmafcad
Jul 13 23:55:56 DB2 cmahostd[5698]: Info: SM BIOS Structures present.
Jul 13 23:55:56 DB2 cmahostd[5698]: Info: cpqHoGUID is obtained from SMBIOS record type 226
Jul 13 23:55:58 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmahostd
Jul 13 23:56:05 DB2 cmapeerd[5933]: [hpasmpeer] Connect agent: cmaided

cmanic: got error -1 before end of iml log for slot 0 port 3, eid=182, logevt.c,429
SendRequest() msgsnd: Invalid argument
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=183, logevt.c,429
SendRequest() msgsnd: Invalid argument
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=184, logevt.c,429
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument


-- The following messages also appear among the others

Jul 13 23:56:11 DB2 cmafcad[6008]: cmafcad: Receive failed: Identifier removed (PEER3004)
SendRequest() msgsnd: Invalid argument
....
Jul 13 23:56:11 DB2 cmahostd[5698]: cmahostd: Receive failed: Identifier removed (PEER3004)
SendRequest() msgsnd: Invalid argument
....
Jul 13 23:56:11 DB2 cmahealthd[5655]: cmahealthd: Receive failed: Identifier removed (PEER3004)
SendRequest() msgsnd: Invalid argument
....
Jul 13 23:56:11 DB2 cmasasd[6032]: cmasasd: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmastdeqd[5675]: cmastdeqd: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmascsid[6176]: cmascsid: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmaided[6127]: cmaided: Receive failed: Identifier removed (PEER3004)
SendRequest() msgsnd: Invalid argument
....
Jul 13 23:56:11 DB2 cmathreshd[5740]: cmathreshd: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmapeerd[5933]: hpasmpeerd: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmapeerd[5933]: Terminating hp Advanced Server Management Agents Peer Manager...
Jul 13 23:56:11 DB2 cmasm2d[5804]: cmasm2d: Receive failed: Identifier removed (PEER3004)
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmasasd]:6040
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmastdeqd]:5705
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmascsid]:6183
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmahealthd]:5692
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmasm2d]:5817
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmathreshd]:5766
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmafcad]:6051
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmahostd]:5737
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: Did not get ACK from [cmaided]:6132
SendRequest() msgsnd: Invalid argument
SendRequest() msgsnd: Invalid argument
Jul 13 23:56:11 DB2 cmapeerd[5933]: done.

....
....
Jul 13 23:56:41 DB2 cmahealthd[5655]: ERROR: ioctl dev_cevt EVT_JALOG failed!
Jul 13 23:56:41 DB2 cmahealthd[5655]: MAIN: Agents code Broke for 1Jul 13 23:57:10 DB2 cmaidad[6073]: cmaidad: Can't get message queue.
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483463, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483462, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483461, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483460, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483459, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483458, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483457, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483456, logevt.c,429
cmanic: got error -1 before end of iml log for slot 0 port 3, eid=-2147483455, logevt.c,429


--- This is the error that fills the /var file system
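
To see which message dominates a log like the one above, the repeated lines can be tallied once their variable parts are masked. The following is only a minimal sketch: the function names `normalize` and `top_messages` are hypothetical, the prefix pattern is inferred from the excerpt above, and Python 3 may not be available on RHEL 3/4 hosts, so it would probably be run on a copy of the log.

```python
import re
from collections import Counter

# Syslog-style prefix as seen in the excerpt above,
# e.g. "Jul 13 23:55:50 DB2 " (assumed format, not an HP specification)
PREFIX = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")
# Numbers that differ between otherwise identical messages (PIDs, eid values)
NUMBER = re.compile(r"-?\d+")


def normalize(line):
    """Drop the timestamp/host prefix and mask numbers so repeats collapse."""
    return NUMBER.sub("N", PREFIX.sub("", line.strip()))


def top_messages(lines, n=5):
    """Return the n most frequent normalized messages with their counts."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return counts.most_common(n)


def report(path, n=5):
    """Print the most frequent messages in the log file at `path`."""
    with open(path, errors="replace") as handle:
        for message, count in top_messages(handle, n):
            print(count, message)
```

For example, `report("/var/log/hp-snmp-agents/cma.log")` would list the few message shapes responsible for most of the file, which makes it easier to tell whether the `SendRequest()` lines or the `cmanic` lines are filling /var.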

 

 

Log from another server:

DL580 G7, iLO 3 version 1.89, RHEL 4.8

Same RPMs, iLO configuration and kernel version as the previous one.

.......

........

Jul 26 12:02:11 FP4 cmafcad[6491]: cmafcad: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmastdeqd[6148]: cmastdeqd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmapeerd[6411]: hpasmpeerd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmaidad[6549]: cmaidad: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmascsid[6597]: cmascsid: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmasasd[6516]: cmasasd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmasm2d[6271]: cmasm2d: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmahealthd[6112]: cmahealthd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmathreshd[6225]: cmathreshd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmaided[6572]: cmaided: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmapeerd[6411]: Terminating hp Advanced Server Management Agents Peer Manager...
Jul 26 12:02:11 FP4 cmahostd[6197]: cmahostd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmaperfd[6389]: cmaperfd: Receive failed: Identifier removed (PEER3004)
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmasasd]:6521
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmasm2d]:6276
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmascsid]:6602
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmastdeqd]:6153
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmathreshd]:6230
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmafcad]:6496
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmaidad]:6554
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmahealthd]:6117
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmaided]:6577
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmahostd]:6202
Jul 26 12:02:11 FP4 cmapeerd[6411]: Did not get ACK from [cmaperfd]:6394
Jul 26 12:02:11 FP4 cmapeerd[6411]: done.
Jul 26 12:02:41 FP4 cmaidad[6549]: cmaidad: Can't get message queue.
Jul 26 12:02:41 FP4 cmahealthd[6112]: ERROR: ioctl dev_cevt EVT_JALOG failed!
Jul 26 12:02:41 FP4 cmahealthd[6112]: MAIN: Agents code Broke for 1hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,379
hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,663
hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,379
hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,663
hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,379
hpGetSemID() semget failed: No such file or directory
cmanic: unable to open /dev/cpqhealth/cevt [2] logevt.c,663
hpGetSemID() semget failed: No such file or directory
hpGetSemID() semget failed: No such file or directory
hpGetSemID() semget failed: No such file or directory
hpGetSemID() semget failed: No such file or directory
hpGetSemID() semget failed: No such file or directory
hpGetSemID() semget failed: No such file or directory

4 REPLIES
Marco Correnti Techno S
Frequent Advisor

Re: HP-SNMP-AGENTS errors

UPDATE:

I am trying to solve this issue in my test environment. Most of the time the errors are:

cmasm2d[12622]: WARNING: Cannot open /dev/cpqhealth/cdt

cmahealthd[12489]: ERROR: Failed to open /dev/cpqhealth/crom

cmasm2d[12622]: cmasm2d : RIB_GETSMIF ioctl failed

cmahealthd[11143]: ERROR: ioctl dev_cevt EVT_JALOG failed!

cmahostd[11222]: cmahostd: Receive failed: Identifier removed (PEER3004)

cmapeerd[11450]: Terminating hp Advanced Server Management Agents Peer Manager...

These errors appear every time I restart hp-snmp-agents.

I have removed the following RPMs, which I had installed in order to use the ipmitool command to set the iLO IP address from the OS.

  • OpenIPMI-devel  
  • OpenIPMI-tools  
  • OpenIPMI 
  • OpenIPMI-libs

The above problems now seem to have disappeared; still checking ...

Bunsol
HPE Pro

Re: HP-SNMP-AGENTS errors

We need to understand whether the errors disappeared after restarting the agents or after removing the RPMs.

 

If the issue disappeared after restarting the SNMP agents, then we would request that you update to a newer version of the SNMP agents.

 

 


I am an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Marco Correnti Techno S
Frequent Advisor
Solution

Re: HP-SNMP-AGENTS errors

Many thanks Bunsol for the reply.

I have just found the cause of the issue: a piece of software running on our servers that, when it starts, indiscriminately removes the IPC resources owned by root, including those used by the HP SNMP agents and by the HP health monitor.

The cma.log contained this message indicating the event:

xxxx xxxx   Receive failed: Identifier removed (PEER3004)
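
"Identifier removed" is the standard Linux error text for EIDRM, which a System V message queue receive returns when the queue is deleted while a process is waiting on it. To find when that happened, the PEER3004 lines can be grouped by timestamp and compared with the start time of the suspected software. This is a rough sketch only: the function name `ipc_removal_events` is hypothetical and the line format is assumed from the excerpts in this thread.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "Jul 13 23:56:11 DB2 cmahostd[5698]: cmahostd: Receive failed: Identifier removed (PEER3004)"
EIDRM_LINE = re.compile(
    r"^(?P<when>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) \S+ "
    r"(?P<daemon>\w+)\[\d+\]: .*Identifier removed \(PEER3004\)"
)


def ipc_removal_events(lines):
    """Map each timestamp to the sorted list of daemons that reported PEER3004."""
    events = defaultdict(set)
    for line in lines:
        match = EIDRM_LINE.match(line)
        if match:
            events[match.group("when")].add(match.group("daemon"))
    return {when: sorted(daemons) for when, daemons in events.items()}
```

When many agents report the error within the same second, as in the logs posted earlier, that points to the queues being removed from outside rather than to a fault in a single agent.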

We can therefore consider this problem solved.

Regards

Marco

Sunitha_Mod
Moderator

Re: HP-SNMP-AGENTS errors

@Marco Correnti Techno S

Hi Marco,  

Awesome! Glad to know the problem has been resolved. 

Thanks,
Sunitha G
I'm an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]