Operating System - Tru64 Unix

deregister_fd error on TruCluster 1.6 member after reboot.

 
Howard Anderson_2
Frequent Advisor


I have an ES40 Alpha with Tru64 4.0G and TruCluster 1.6 ASE.
The system is now very slow to boot, and once running it is slow to respond, but it does not show any significant CPU load or memory usage.
In the daemon.log I am getting (from boot up, and on into 'normal' running) the following messages, looping every few seconds:

'hostname ASE: deregister_fd:not registered.
hostname ASE: Local AseMgr Notice: Can't connect to local agent, retrying.'
'AseMgr Notice: msgsvc OpenChannel: Agent not in target's port map'
This member is not running any cluster services.
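
To get a feel for how often each of these messages repeats, a copy of the log could be fed through a small counter. This is only a minimal sketch in Python: the function name and the pattern labels are hypothetical, the patterns are taken from the message text quoted above, and it assumes the log can be read on a machine that has Python available.

```python
import re
from collections import Counter

# Message fragments quoted in the post above; any other line is ignored.
# The label names are made up for this sketch.
PATTERNS = {
    "deregister_fd": re.compile(r"deregister_fd:\s*not registered"),
    "agent_retry": re.compile(r"Can't connect to local agent"),
    "port_map": re.compile(r"Agent not in target's port map"),
}


def count_ase_messages(lines):
    """Return a Counter of how often each known ASE message appears."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Passing an open file object for the copied daemon.log to count_ase_messages() gives the totals per message, which makes it easier to see whether the loop started at a particular boot.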

The other cluster member is running normally and is hosting the two defined cluster services.

I have not made any recent changes to these systems.

Does anybody have an idea what could be causing this error and how to fix it?

(I know I probably should have put this in the TruCluster forum, but this one is more active.)

6 REPLIES
Ivan Ferreira
Honored Contributor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

I have seen system slowdowns when there are problems with SCSI devices, such as disks or tape units.

Check your hardware to make sure that no problems exist. Use the uerf command to identify SCSI problems. Maybe a disk is about to fail.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Pieter 't Hart
Honored Contributor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

What kind of shared storage is used?
I get the impression that one or more of multiple paths is not accessible!
Since you are using 4.0G, you probably have a Memory Channel interconnect?

Does this version of TruCluster have a "clu_check_config"?
(My experience is only with Tru64 5+ clustering.)
Howard Anderson_2
Frequent Advisor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

Thanks for the replies,

clu_check_config does not appear to exist on TruCluster 1.6.

I do not use memory channel. The members are connected to a SAN and LTO tape library via 1Gb Fibre channel and to a network via 100baseT networking for the cluster interconnect and FDDI networking as a backup link.

Ivan, I too thought about SCSI issues, as I have seen drastic slowdowns when a SCSI device is flooding the bus with errors.
I have used uerf -R, but it has only flagged one SCSI error this morning. (I would rather see none, but I still don't think this is the problem.)

I have a spare system hard drive (albeit with an older build of the system on it), and I may try booting off that with the other cluster member shut down (so it doesn't get affected in any way) to see what I get. Alternatively, I could try booting the original drive with the Fibre Channel connection to the SAN and tape library disconnected.

Any more thoughts?


Pieter 't Hart
Honored Contributor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

Howard, thanks for the response. I thought LAN clustering was only available from V5.1 onwards. We started with Tru64 clustering at V5, so I have no experience with V4.0G.
I would make sure the LAN interconnect is fully functional. Is it a direct crossover cable, or is there a connecting component (hub/switch) in between?
I would also check for external causes, such as name-resolution issues or the IP address accidentally being used by another host on the network.
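
The name-resolution part of that check can be sketched roughly as follows. This is a hedged Python illustration, not a TruCluster tool: the function name check_name_resolution and its parameters are hypothetical, the hostnames passed in would be your own member names, and it only compares forward and reverse lookups. It does not detect an IP address that is in use by another host.

```python
import socket


def check_name_resolution(hostnames, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Map each hostname to a short status string.

    forward and reverse are injectable so the check can be tried offline.
    """
    results = {}
    for name in hostnames:
        try:
            address = forward(name)
        except OSError as exc:  # socket.gaierror/herror are OSError subclasses
            results[name] = f"forward lookup failed: {exc}"
            continue
        try:
            reverse_name = reverse(address)[0]
        except OSError as exc:
            results[name] = f"{address}, reverse lookup failed: {exc}"
            continue
        # Compare short names only, so "member1" matches "member1.example.com".
        if reverse_name.split(".")[0].lower() == name.split(".")[0].lower():
            results[name] = f"{address}, consistent"
        else:
            results[name] = f"{address}, reverse lookup returned {reverse_name}"
    return results
```

Running it with the member hostnames from each node, and comparing the results, would show whether both members resolve each other the same way.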

In a post by Ralph Puchner I found the command "clu_ivp -v". Does that exist?
That post also mentions a name or IP address change as a possible cause. (I know you stated "no recent changes", but accidents happen.)
Unfortunately that thread was never closed with a named solution.

It may be a little drastic, but if no other suggestions come up, it may be worth considering clu_remove_node and then re-adding the node.

If all the applications in use are cluster-aware, it should not be much of a problem to make the "new" node a fully functional member.
Howard Anderson_2
Frequent Advisor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

I have now ascertained that the problem was a failing system hard drive. (You were quite right Ivan.)
I have replaced this with a shiny new 36.4 GB 15k rpm drive and restored from a vdump backup. The system is now booted and working at full speed, without the deregister_fd message.

Many thanks for your advice.

For info, clu_ivp highlighted the absence of my NetRAIN network adapter details in /etc/rc.config, but otherwise everything was fine.

Howard Anderson_2
Frequent Advisor

Re: deregister_fd error on TruCluster 1.6 member after reboot.

See my previous post...