Operating System - OpenVMS

diagnosing memory chip failures on DS25 Alpha servers

 


We had a cluster member go down over the last several days. It initiated an auto-restart and is now having no success rejoining the cluster. During a conversational boot, while the server is checking memory, the console displays "Error detected by Pchip0.... GPError 00A0000... PCI Read data parity error", followed eventually by the boot prompt. My difficulty is deciphering the message about the PCI read data parity error. Is it implying that software installed through the PRODUCT command in the POLYCENTER facility is at the heart of the problem, or is that way off base? I would also like to evaluate the ERRLOG.SYS from the down system. Any suggestions on how to make it available to the other nodes in the cluster?
You Brought Two Too Many
5 REPLIES
Karl Rohwedder
Honored Contributor

Re: diagnosing memory chip failures on DS25 Alpha servers

In this case, I assume PCI means the PCI bus and has nothing to do with PCSI - the product installation facility.
If your nodes share a common system disk, the errlog can be found under SYS$SYSDEVICE:[SYSn.SYSERR], where n is the system root of the node in question.

regards Kalle
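
As an illustration of the path convention described above, here is a minimal sketch that builds the ERRLOG.SYS file specification for a given system root on a shared system disk. The function name `errlog_spec` is hypothetical, and the sketch assumes the root number is written in hexadecimal in the directory name (SYS0, SYS1, ... SYSA); check the actual root directories on your system disk before relying on it.

```python
def errlog_spec(root: int, device: str = "SYS$SYSDEVICE") -> str:
    """Build the ERRLOG.SYS file spec for one system root (hypothetical helper)."""
    if root < 0:
        raise ValueError("system root number must be non-negative")
    # Assumption: root directories are named SYS<n>, with n in hexadecimal.
    return f"{device}:[SYS{root:X}.SYSERR]ERRLOG.SYS"


if __name__ == "__main__":
    # Print the candidate errlog locations for a few example roots.
    for root in (0, 1, 10):
        print(errlog_spec(root))
```

From a surviving cluster member, the resulting file specification can then be given to whichever error log analysis tool is installed on that node.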

Re: diagnosing memory chip failures on DS25 Alpha servers

You are so right: they do indeed share a common system disk, and I am working on that analysis now. You also seem to confirm my take that the PCI error message does not come from anything associated with POLYCENTER.
Bill Hall
Honored Contributor

Re: diagnosing memory chip failures on DS25 Alpha servers

I would think this is a failure of either one of your PCI adapters or the Pchip if this console error is returned every time the server is power cycled or an init is executed. I know the Pchips are located on the system board on the ES40 and I would expect the same for the DS25. Probably time to log a service call if you have a maintenance contract.

We've seen some very interesting crashes and data corruptions with certain Pchip failures... and no errlog entries to help pinpoint the problem.

Bill.
Wim Van den Wyngaert
Honored Contributor

Re: diagnosing memory chip failures on DS25 Alpha servers

If you can, power cycle the box.
I've seen many problems never come back after that (disk, CPU).

Wim

Re: diagnosing memory chip failures on DS25 Alpha servers

Bill wrote >> I would think this is a failure of either one of your PCI adapters or the Pchip if this console error is returned every time the server is power cycled or an init is executed. I know the Pchips are located on the system board on the ES40 and I would expect the same for the DS25. Probably time to log a service call if you have a maintenance contract.

We've seen some very interesting crashes and data corruptions with certain Pchip failures... and no errlog entries to help pinpoint the problem.

Byron has added>> 11/17/2006: HP support helped identify failures in a recently replaced HBA card, used as the Fibre Channel interface to SAN storage volumes, which compounded the problems already occurring on this PCI bus. My thanks to all who contributed to the solution. In the end, hardware replacement was the fix.