
The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

 
Markus Klingelhoefer
Occasional Advisor

Server ProLiant 6500 4x450MHz, 3.5GB,
3xRA4000, 1xRA4100, FC-Hub.
Used as SAP Central Instance & DB-Server (Oracle)

We upgraded from Windows NT to Windows 2000 Advanced Server.
I also updated the BIOS and the ROMs of the RAs, and used the latest drivers from SmartStart 6.20.

Now we have performance problems and the following messages in the event log:

Event Type: Error
Event Source: cpqfcalm
Event Category: None
Event ID: 9
Date: 23.04.2003
Time: 13:45:13
User: N/A
Computer: ERLSV063
Description:
The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.
Data:
0000: 00100000 006a0001 00000000 c0040009
0010: 50000101 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000000 00000007

Event Type: Warning
Event Source: Disk
Event Category: None
Event ID: 51
Date: 23.04.2003
Time: 13:45:13
User: N/A
Computer: ERLSV063
Description:
An error was detected on device \Device\Harddisk3\DR3 during a paging operation.
Data:
0000: 00220004 00720001 00000000 80040033
0010: 0000012d 00000000 00000000 00000000
0020: 00000000 00000000 00000001 00000000
0030: 00000002 0000002a 00000e00 00000000
0040: 8700482a 00002084 0008

Under heavier load, more error messages are written to the log.
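
To check how the error rate follows the load, the exported event text could be counted per hour. The following is only a rough sketch: it assumes the log has been saved as plain text in the same "Field: value" layout as the entries quoted above, and the names `parse_events`, `count_by_hour`, and `SAMPLE` are made up for this illustration.

```python
import re
from collections import Counter

# Sample text in the same "Field: value" layout as the quoted entries.
# A real export would be read from a file instead (assumption).
SAMPLE = """\
Event Type: Error
Event Source: cpqfcalm
Event ID: 9
Date: 23.04.2003
Time: 13:45:13

Event Type: Warning
Event Source: Disk
Event ID: 51
Date: 23.04.2003
Time: 13:45:13
"""

# Only these fields are needed for counting; data dump lines are ignored.
FIELD = re.compile(r"^(Event Type|Event Source|Event ID|Date|Time):\s*(.+)$")


def parse_events(text):
    """Split exported event text into dicts, one per 'Event Type:' block."""
    events = []
    current = None
    for line in text.splitlines():
        m = FIELD.match(line.strip())
        if not m:
            continue
        key, value = m.group(1), m.group(2).strip()
        if key == "Event Type":
            # A new entry starts at every 'Event Type:' line.
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value
    return events


def count_by_hour(events):
    """Count events per (source, event ID, date, hour of day)."""
    counts = Counter()
    for e in events:
        hour = e.get("Time", "??")[:2]
        key = (e.get("Event Source", "?"), e.get("Event ID", "?"),
               e.get("Date", "?"), hour)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(count_by_hour(parse_events(SAMPLE)).items()):
        print(key, n)
```

Comparing these hourly counts with the times of heavy SAP or Oracle activity would show whether the timeouts really scale with I/O load.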

Does anyone have an idea what is causing this problem?

Thanks for any help!
Terry Hutchings
Honored Contributor

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

What firmware is currently on the RA4100s? Are all the storage systems at the latest firmware? If this only started happening after the upgrade to Win 2000, then I would suspect the firmware is the cause.

Also, is Secure Path installed? If so, has it been reinstalled since the upgrade?
The truth is out there, but I forgot the URL..
Terry Hutchings
Honored Contributor

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

Another thing which may cause this is mismatched GBICs. I would recommend checking the GBICs in the loop to verify they're all exactly the same. This is unlikely if the problem only started after the upgrade to Win 2000.
Markus Klingelhoefer
Occasional Advisor

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

We wiped Win NT and did a fresh install of Win 2000.
The firmware version of all the RA4x00s was 2.48 at the time we installed Win 2000. Yesterday I did a ROM update to version 2.60. The errors are still there. We do not have Secure Path installed.

The drivers (cpqFCALM, cpqFCFTR and cpqFCAC) are at level 5.20.0.32

There is also a new HBA (FC2101) for the MSA 1000 installed, but it is not in use yet. We plan to migrate to an MSA 1000 storage system. Could this new HBA cause the problems?
Terry Hutchings
Honored Contributor

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

If this new HBA isn't currently being used, then I would say this probably doesn't have anything to do with the problem.

It appears this may be a problem with mismatched GBICs. I would recommend verifying all the GBICs are from the same manufacturer.
Eric_143
New Member

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

There are two fibre channel diagnostics utilities on the SmartStart CD. One tests the physical path, and the other does actual read tests to the various arrays. You might want to try these tools.
Markus Klingelhoefer
Occasional Advisor

Re: The device, \Device\Scsi\cpqfcalm1, did not respond within the timeout period.

We attached the old RA4x00s to the new FC HBA (via an FC switch for 1 Gb/2 Gb speed conversion). After that there were fewer warnings and no more errors.

After the migration to the MSA 1000 the problem is gone. Unfortunately, we still do not know what caused it.

Thanks to all for your input.