Re: Itanium & Alpha Server crashed

Harry M · ‎06-07-2008

Dear Friends,
We have four Alpha servers running OpenVMS 7.3-2 and one Itanium (I64) server running 8.3-1H1. The Alpha servers are clustered over Memory Channel and LAN, and the Itanium is connected to the cluster through the LAN alone. Everything was working fine. Recently the Itanium crashed and was removed from the existing cluster, and at about the same time one Alpha node also crashed. I suspect a network issue occurred at that time: the LAN was disturbed, the Itanium node crashed, and the Alpha node crashed along with it. Please share your knowledge regarding this issue. If you have any questions, please reply.

with regards
Harichander

Harry M · ‎06-07-2008

gf

Harry M · ‎06-07-2008

reply.......

Rob Leadbeater · ‎06-07-2008

Hi,

Crashed how, exactly? Any error messages to go on?

Please help us to help you by supplying much more information.

Cheers,

Rob

Robert Gezelter · ‎06-07-2008

Harichander,

Without more detailed information, it is virtually impossible to identify the cause of the crash.

As a first question, were the systems properly configured to write crash dumps to disk in the event of a failure? If so, were the dumps from these crashes saved?

- Bob Gezelter, http://www.rlgsc.com

Hoff · ‎06-07-2008

What do you mean by "LAN was disturbed"? You are apparently aware of, and reporting, something that happened to the LAN (what?) that then apparently triggered the OpenVMS I64 Integrity server and an OpenVMS Alpha node to crash. Yes, an unstable network can cause cluster member crashes, usually when a timeout fires or when connectivity is restored. You'll typically see CLUEXIT crashes here.

An unstable network makes for an unstable cluster.
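As a rough illustration of the timeout behavior described above, here is a toy model in Python. It is only a sketch under stated assumptions, not how the cluster software is implemented: the names `Outage` and `classify_outage` are hypothetical, and the 20-second default is assumed to mirror the usual default of the OpenVMS RECNXINTERVAL system parameter, which is not mentioned elsewhere in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outage:
    """One loss of LAN connectivity as seen by a single cluster member."""
    start_s: float  # time the node lost contact with the other members
    end_s: float    # time connectivity was restored


def classify_outage(outage: Outage, reconnect_interval_s: float = 20.0) -> str:
    """Toy classification of an outage (hypothetical helper).

    Assumption: if contact returns within the reconnection interval the
    node stays a member; otherwise the timeout fires, the other members
    remove the node, and the node is expected to crash (a CLUEXIT
    bugcheck) once it can see the cluster again.
    """
    duration = outage.end_s - outage.start_s
    if duration < 0:
        raise ValueError("outage ends before it starts")
    if duration <= reconnect_interval_s:
        return "stays in cluster"
    return "removed, expect CLUEXIT on reconnect"


if __name__ == "__main__":
    print(classify_outage(Outage(start_s=0.0, end_s=5.0)))
    print(classify_outage(Outage(start_s=0.0, end_s=45.0)))
```

The model ignores quorum, multiple interconnects such as Memory Channel, and everything else a real cluster considers. It only shows why a brief glitch can pass unnoticed while a longer one ends in a crash at the moment the network comes back.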

If you have HP support, call support now.

If you don't have HP support, consider calling in help. Particularly if this is a production environment, as your periodic pings to this topic might imply.

If you wish to pursue this here in ITRC, then at a minimum first acquire the crashdumps, and post the CLUE CRASH output from those crashdumps here as an attached text file.

If you're not collecting crashdumps for each node, you're effectively not operating in what most would consider a production configuration. Configuring the nodes to acquire and save crashdumps is often central to resolving failures and quickly resuming production operations. Fix that now, and reboot the nodes at your next opportunity.

From what is known so far, this could be buggy software, buggy hardware, a buggy network, excessive radiation or magnetic fields triggering memory errors or similar instabilities, cluster configuration issues, or problems with the local power supply. Who really knows?

Stephen Hoffman
HoffmanLabs LLC
