ProLiant Servers (ML,DL,SL)

HP Proliant DL580 G2

 
Carine_2
Occasional Visitor

HP Proliant DL580 G2

I got this message in the Event Viewer:
Source: ClusSvc
Event ID: 1123
Description
-----------
The node lost communication with cluster node.

The error message occurred on all 6 servers. We could not find the exact problem.

Please advise

Thank you
Carine
7 REPLIES
Joseph Loo
Honored Contributor

Re: HP Proliant DL580 G2

Hi,

Have you referred to these two Microsoft KB articles?

http://support.microsoft.com/default.aspx?scid=kb;en-us;892422

http://support.microsoft.com/default.aspx?scid=kb;en-us;311081

regards.
what you do not see does not mean you should not believe
Carine_2
Occasional Visitor

Re: HP Proliant DL580 G2

Thanks for your reply, Joseph. We have actually looked at every possibility, including getting help from the HP helpdesk. After two months we still get the error; it keeps occurring in the Event Viewer non-stop. Once in a while, the MOM 2005 server sends an alert reporting "The node lost communication with cluster node".

HP advised that the Hitachi SAN vendor may need to provide the correct drivers, but the customer disagrees: they say the machine is from HP, so all necessary drivers should come from HP.

Please advise..

Thank you
Carine
Steven Clementi
Honored Contributor

Re: HP Proliant DL580 G2

Carine:

Event 1123 is a network communications error. Is it possible that one of the nodes has a faulty NIC that is causing intermittent issues you might not see at any single point in time?

Could be a faulty network switch as well.


This is not related to drivers for your storage, it should have nothing to do with storage.


One thing you can do is provide a backup cluster communications connection. This would require another network interface in each server, configured for Internal Cluster Communications only. (as well as 6 additional ports on the network).

I would have the customer run some diagnostics on their network hardware first, then check the NICs on the cluster nodes.
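
Because an intermittent fault rarely shows up in a one-off test, a long-running reachability log can help narrow it down. Below is a minimal Python sketch, not an HP or Microsoft tool. It assumes Python is available on a machine attached to the cluster network and that the nodes accept TCP connections on port 445; both are assumptions, and the node addresses passed to `watch` are hypothetical. The cluster heartbeat itself uses UDP port 3343, so this only tests general reachability over the same network path.

```python
import socket
import time
from datetime import datetime


def probe(host, port=445, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def summarize(samples):
    """Turn (node, ok) samples into {node: (failed_probes, total_probes)}."""
    stats = {}
    for node, ok in samples:
        failed, total = stats.get(node, (0, 0))
        stats[node] = (failed + (0 if ok else 1), total + 1)
    return stats


def watch(nodes, rounds, interval=5.0):
    """Probe every node once per round, print failures with a timestamp,
    and return the per-node summary."""
    samples = []
    for _ in range(rounds):
        for node in nodes:
            ok = probe(node)
            if not ok:
                print(f"{datetime.now().isoformat()} no reply from {node}")
            samples.append((node, ok))
        time.sleep(interval)
    return summarize(samples)
```

Running something like `watch(["10.0.0.1", "10.0.0.2"], rounds=1000)` from more than one node, and comparing the failure timestamps with the Event 1123 entries, may show whether the losses cluster around one node or one switch port.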


Steven
Steven Clementi
HP Master ASE, Storage and Clustering
MCSE (NT 4.0, W2K, W2K3)
VCP (ESX2, Vi3, vSphere4, vSphere5)
RHCE
NPP3 (Nutanix Platform Professional)
Carine_2
Occasional Visitor

Re: HP Proliant DL580 G2

Thanks Steven...

One of the engineers ran the HP diagnostics on all the NICs, and they showed no problems or errors.

We told the customer to check their switch, and they insist that the switch is OK.

On all servers that are connected to the SAN storage, the Event Viewer shows errors with event ID 11 and event ID 15.

OK, regarding your statement that one of the nodes may have a faulty NIC: would that cause all 6 servers to lose communication? Oh, I have a hint: in the Event Viewer I can see which server is the source of the lost communication.
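
To make that hint easier to act on, the Event 1123 descriptions can be tallied by the node they name. This is a rough Python sketch only. It assumes the descriptions have been exported to plain text and that they follow the form "lost communication with cluster node 'NAME' on network 'NETWORK'"; that message shape is an assumption, so adjust the pattern to whatever your logs actually contain. The function name is made up for illustration.

```python
import re
from collections import Counter

# Assumed message shape; change this pattern to match the real log text.
PATTERN = re.compile(
    r"lost communication with cluster node '([^']+)' on network '([^']+)'"
)


def tally_lost_peers(descriptions):
    """Count Event 1123 descriptions per (node, network) pair.

    Descriptions that do not match the assumed pattern are skipped.
    """
    counts = Counter()
    for text in descriptions:
        match = PATTERN.search(text)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

If one node or one network accounts for most of the entries, that is the first place to check the NIC, cable, and switch port.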

If we ask them to provide a backup cluster communication network, the customer will definitely disagree.

Are there any other diagnostics you would recommend for checking the NICs?

Thank you
Carine
Prashant (I am Back)
Honored Contributor

Re: HP Proliant DL580 G2

Check the IML and the event logs.

To get more clues about this problem, please gather the following information:

1. Which services are running on the cluster?
2. Are you using NIC teaming on it? If so, does the problem still occur when you temporarily disable teaming? Refer to the following MS KB article to understand why teaming should be disabled before further troubleshooting:
254101 Network adapter teaming and server clustering
http://support.microsoft.com/?id=254101
3. Do you have the latest NIC driver installed on this cluster?
4. Download the MPS Reporting utility, run it on each node, and then send the generated CAB files back to me.

MPS Report:
http://download.microsoft.com/download/b/b/1/bb139fcb-4aac-4fe5-a579-30b0bd915706/MPSRPT_CLUSTER.EXE
This will help you validate the cluster.

Prashant S.
Nothing is impossible
Steven Clementi
Honored Contributor

Re: HP Proliant DL580 G2

The cluster nodes replicate Event entries to all nodes so you would see the entry on all 6 nodes.

I can't think of any other diagnostics for NICs.

Can you post more info about Events 11 and 15 so we can see them?


Are the internal communications (heartbeat connection) teamed, or just single NICs?


Steven
Carine_2
Occasional Visitor

Re: HP Proliant DL580 G2

By the way, I am actually helping out my colleague; he is the one doing the configuration and testing. I will try to answer all the questions.

To answer Prashant S.'s questions:

1. Which services are running on the cluster?
Windows Server 2003 cluster
3. Do you have the latest NIC driver installed on this cluster?
Yes
4. Download the MPS Reporting utility, run it on each node, and then send the generated CAB files back to me.
Can I send them to your personal email? The maximum attachment size here is only 1 MB.

To answer Steven's questions:

1. Can you post more info about Events 11 and 15 so we can see them?
Please refer to the attachment.

Are the internal communications (heartbeat connection) teamed, or just single NICs?
Single NICs