
Clustering

 
Ayman Altounji
Valued Contributor

Clustering

We have two 6400R servers in a cluster. Both are constantly logging an error message in Event Viewer, which reads:
Event Id 11
The driver detected a controller error on \Device\Scsi\cpqfcalm1
This error doesn't appear to affect the cluster at present, which is working, but the errors now appear once a minute.
Can anyone assist as to what the error relates to?
Thank you.
3 REPLIES 3
Ayman Altounji
Valued Contributor

Re: Clustering

The Windows NT miniport driver for the Compaq Fibre Channel Host Controller Arbitrated loop is CPQFCALM.SYS. This driver may be listed as the source of error entries contained in the Windows NT/2000 system event log similar to the following:

Event ID 11 [ CPQFCALM ]
- The driver detected a controller error on \Device\ScsiPort6

Event ID 15
- Device not ready for access

All event log entries generated by CPQFCALM which relate to Fibre Channel events WILL HAVE an event ID of 11. Although the system event log may contain other event IDs (such as 9 or 15) which list CPQFCALM as the source, the CPQFCALM driver DOES NOT generate these errors. Event log entries that list CPQFCALM as the source and have event IDs other than 11 are actually generated by the Windows NT SCSIPORT driver acting on behalf of CPQFCALM.
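
As a quick illustration of the rule above, here is a minimal Python sketch that sorts entries by which driver actually wrote them. The function name and the "other" return value are made up for this example; only the event ID 11 rule comes from the text.

```python
def generating_driver(source: str, event_id: int) -> str:
    """Return which driver wrote an event log entry, per the rule above.

    Hypothetical helper: the name and return strings are illustrative only.
    """
    if source.upper() != "CPQFCALM":
        # Entry does not list CPQFCALM as its source at all.
        return "other"
    # Only event ID 11 entries come from CPQFCALM itself; any other ID
    # listing CPQFCALM as the source is written by SCSIPORT on its behalf.
    return "CPQFCALM" if event_id == 11 else "SCSIPORT"


print(generating_driver("CPQFCALM", 11))
print(generating_driver("CPQFCALM", 15))
```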

DECODING WINDOWS NT CPQFCALM EVENT ID 11 ENTRIES:

To help troubleshoot the cause of CPQFCALM event ID 11 errors, look at the first 32-bit value on the row labeled 0010. This information is located under "EVENT DETAIL" in the DATA box (refer to the bitmap attached to this IQ document for an example). Then, using the table below, look up this value to further define the cause of the error.
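
If you have many entries to check, the lookup step can be scripted. The sketch below assumes you have copied the DATA box contents as text in the "Words" view, with each row written as a label, a colon, and four hexadecimal 32-bit values. That layout, the function name, and the sample values are assumptions for illustration, not something taken from the document.

```python
def first_value_on_row(words_dump: str, row_label: str = "0010") -> int:
    """Return the first 32-bit value on the given row of a Words-view dump.

    Assumed row layout (hypothetical): "0010: 00000309 00000000 ..."
    """
    for line in words_dump.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() == row_label:
            fields = rest.split()
            if not fields:
                break  # row exists but holds no values
            return int(fields[0], 16)
    raise ValueError(f"row {row_label} not found or empty")


# Made-up sample data, for illustration only.
sample = (
    "0000: 00180001 00680001 00000000 c004000b\n"
    "0010: 00000309 00000000 00000000 00000000\n"
)
print(hex(first_value_on_row(sample)))
```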

TABLE FOR DECODING CPQFCALM EVENT ID 11 ENTRIES:

0x01
An unknown Tachyon status value was received.

0x04
An unexpected SCSI frame was received.

0x06
A fatal error was detected by Tachyon, or a request for memory failed. The controller is taken offline.

0x07
A fatal hardware error occurred. The controller is taken offline.

0x09
Fibre Manager warning

0x109
Count of loop initialization primitives received.

0x209
Count of elastic store errors detected.

0x309
The port lost synchronization with the incoming data or the port is not receiving a signal from a node.

0x409
A laser fault was detected.

0x509
A loop timeout occurred because the loop was in a particular state for too long.

0x609
A laser fault, loss of synchronization, loss of signal or an elastic store error occurred.

0x709
The Tachyon is in an offline state.

0x0A
Count of node logins received from remote nodes.

0x0B
An unknown single frame sequence was received.

0x0C
SCSI Command received.

0x0D
SCSI Data received.

0x0E
An unknown multiple frame sequence was received.

0x0F
Count of sequences aborted by a remote node.

0x10
Free buffers.

0x20
Controller ALPA has changed.

0x21
Count of loop down conditions that caused all previous node logins to be invalid.

0x22
Target Reset complete

0x23
Target Reset failed.

0x24
Count of node logouts

0x25
A recoverable PCI bus error occurred.

0x26
An unrecoverable PCI bus error occurred. The controller is taken offline.

0x27
FCMNGR_LOOP_PROBLEM

0x28
A loop problem has occurred that has prevented the host controller from reinitializing the loop.

4000
The link went down during the transmission of a frame.

CCC0002
The driver failed to obtain an arbitrated loop physical address for the controller. This indicates a fibre channel loop problem. The controller is taken offline.

CCC0003
The driver failed to initialize the controller due to a hardware or configuration error.

E0nnnnnn
One or more elastic store errors were detected in the 60 seconds prior to the event log entry being generated. The 3 bytes nnnnnn contain the hexadecimal count of errors that occurred.
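
For anyone who wants to script the lookup, the table above can be written as a small Python dictionary. This is only a sketch: the descriptions are shortened from the table, the entries printed without a 0x prefix (4000, CCC0002, CCC0003) are assumed to be hexadecimal, and the names `CPQFCALM_CODES` and `describe_code` are made up for this example.

```python
# Codes and (shortened) descriptions taken from the table above.
# Assumption: 4000, CCC0002 and CCC0003 are hexadecimal values.
CPQFCALM_CODES = {
    0x01: "Unknown Tachyon status value received",
    0x04: "Unexpected SCSI frame received",
    0x06: "Fatal error detected by Tachyon, or memory request failed; controller offline",
    0x07: "Fatal hardware error; controller offline",
    0x09: "Fibre Manager warning",
    0x109: "Count of loop initialization primitives received",
    0x209: "Count of elastic store errors detected",
    0x309: "Port lost synchronization or is not receiving a signal from a node",
    0x409: "Laser fault detected",
    0x509: "Loop timeout: loop was in a particular state for too long",
    0x609: "Laser fault, loss of sync, loss of signal, or elastic store error",
    0x709: "Tachyon is in an offline state",
    0x0A: "Count of node logins received from remote nodes",
    0x0B: "Unknown single frame sequence received",
    0x0C: "SCSI Command received",
    0x0D: "SCSI Data received",
    0x0E: "Unknown multiple frame sequence received",
    0x0F: "Count of sequences aborted by a remote node",
    0x10: "Free buffers",
    0x20: "Controller ALPA has changed",
    0x21: "Count of loop down conditions that invalidated previous node logins",
    0x22: "Target Reset complete",
    0x23: "Target Reset failed",
    0x24: "Count of node logouts",
    0x25: "Recoverable PCI bus error",
    0x26: "Unrecoverable PCI bus error; controller offline",
    0x27: "FCMNGR_LOOP_PROBLEM",
    0x28: "Loop problem prevented the host controller from reinitializing the loop",
    0x4000: "Link went down during the transmission of a frame",
    0xCCC0002: "Failed to obtain an arbitrated loop physical address; controller offline",
    0xCCC0003: "Failed to initialize the controller (hardware or configuration error)",
}


def describe_code(value: int) -> str:
    """Look up the first 32-bit value from row 0010 of an event ID 11 entry."""
    if value >> 24 == 0xE0:
        # E0nnnnnn: the low 3 bytes hold the elastic store count.
        return f"Elastic store count in the prior 60 seconds: {value & 0xFFFFFF}"
    return CPQFCALM_CODES.get(value, f"code 0x{value:X} is not in the table")


print(describe_code(0x309))
print(describe_code(0xE0000010))
```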


Also, the information provided by the third and fourth values on rows 0020 and 0030 can further assist in troubleshooting the error. If the event is associated with a particular device on the Fibre Channel loop, these values will provide the Windows NT SCSI address (bus number, target ID, and logical unit number) of that device. If these values each contain 000000ff, then the event is not a
Ayman Altounji
Valued Contributor

Re: Clustering

Where can I get a copy of the entire document you referenced, Orion_Quest? I've been searching the Compaq web site for it and cannot find it.

Please email any info to shinerburke@webzone.net

Thanks!!!
Ayman Altounji
Valued Contributor

Re: Clustering

I'm interested in getting a copy as well. Please send it to jonabishop@hotmail.com. Thanks.