
DLPI error on MC/Service Guard environment

 
robertoguido
Occasional Contributor

Hello there,

I'm trying to start a node in a cluster, but I receive this message on my terminal:

"tlrhci cmcld: Aborting! Failed to communicate with DLPI"

In /var/adm/syslog/syslog.log:

Jul 18 12:57:16 tlrhci cmclconfd[2115]: Unable to lookup cluster information in CDB: No such file or directory
Jul 18 12:57:17 tlrhci cmcld: Lookup of link /nodes/tlrhci/networks/lan/lan3/peers failed.
Jul 18 12:57:17 tlrhci cmcld: Unable to send DLPI info request, Not a stream
Jul 18 12:57:17 tlrhci cmcld: cl_abort: abort cl_kepd_printf failed: Invalid argument
Jul 18 12:57:17 tlrhci cmcld: cl_kepd_printf, fstat: kepd_fd=7, st_dev=1073741827, st_ino=1005, st_rdev=-486539264
Jul 18 12:57:17 tlrhci cmcld: Aborting! Failed to communicate with DLPI
Jul 18 12:57:22 tlrhci cmsrvassistd[2121]: Unable to notify ServiceGuard main daemon (cmcld): Connection reset by peer
Jul 18 12:57:22 tlrhci cmsrvassistd[2114]: The cluster daemon aborted our connection.


What does DLPI mean?

How can I recover from this situation? (I want the node to rejoin the cluster.)

Bye
RobertoGuido
BRG
melvyn burnard
Honored Contributor

Re: DLPI error on MC/Service Guard environment

It looks like you may have hit a known issue that is fixed by patches, or you have a networking issue.
Which version of SG are you running, and is it patched? You can check by running:
what /usr/lbin/cmcld


You may also want to check your networking for bad cables or any other issues.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
rick jones
Honored Contributor

Re: DLPI error on MC/Service Guard environment

DLPI == Data-Link Provider Interface. It is the means by which applications access NICs at the "data-link" (i.e. link-level) layer.
there is no rest for the wicked yet the virtuous have no pillows
Stephen Doud
Honored Contributor

Re: DLPI error on MC/Service Guard environment

Hi Roberto,

I checked our internal case history database for a similar set of messages... found a case that resolved to:

"Problem caused by a lan interface configured in SG, but not connected into the cluster."

Run cmviewconf and compare the LANs it lists with those that are actually connected to the network (lanscan on both nodes) to locate any that are no longer used. You might also run a fresh cmquerycl and check, in the resulting output file, the comments ServiceGuard inserts at the bottom of each node section, such as:

# Possible standby Network Interfaces for lan0: lan1.

If they don't match the current hardware or network configuration, then the cluster configuration is out of sync with the real setup. If this is the case, consider halting the cluster, running 'cmdeleteconf -f', creating a new cluster ASCII file (cmquerycl -C -n -n ...), customizing it, and applying it with cmapplyconf.
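
As an aside, the comparison of configured LANs against the LANs actually present can be roughed out in a few lines of Python. This is only a sketch under assumptions: it expects you to have saved the cmviewconf and lanscan output as text yourself, it simply collects every "lanN" name it sees in each text, and the sample strings and function names below are made up for illustration (they are not real command output or part of ServiceGuard). Note that it only shows whether an interface name appears in lanscan, not whether the cable is actually connected.

```python
import re

# Matches interface names such as lan0, lan3, lan10.
LAN_RE = re.compile(r"\blan\d+\b")


def lan_names(text):
    """Return the set of lanN interface names mentioned anywhere in text."""
    return set(LAN_RE.findall(text))


def stale_interfaces(cluster_text, lanscan_text):
    """Return interfaces named in the cluster text but absent from lanscan text."""
    return sorted(lan_names(cluster_text) - lan_names(lanscan_text))


if __name__ == "__main__":
    # Hypothetical, abbreviated samples; real output has more columns and lines.
    cluster_sample = "node tlrhci: lan0 lan1 lan3"
    lanscan_sample = "UP lan0 snap0\nUP lan1 snap1"
    # Any name printed here is a candidate for being configured but unused.
    print(stale_interfaces(cluster_sample, lanscan_sample))
```

The same helper also works on a syslog line: feeding it the "Lookup of link /nodes/tlrhci/networks/lan/lan3/peers failed" message picks out lan3, which is the interface I would look at first.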

If recreating the cluster configuration does not help, please consider opening a software support case with the response center to get to the core cause.

-s.