Operating System - Tru64 Unix


 
G-MAN
Frequent Advisor

SCSI CAM Errors

On Jan 2 we started having a problem where our backups do not run. While investigating, I found that cron does not start until 1:04 am, and that the system rebooted at 12:30 am. The backups are scheduled to run at 12:45 am, and we never had a problem until Jan 2; the same thing happened on Jan 3. I was also checking root's mail and found all kinds of SCSI errors, but I don't know what to make of them.
We recently upgraded from 5.1A to 5.1B. The system is an ES40 connected to an MA6000 with two HSG60 controllers.

Subject: EVM ALERT [700]: SCSI event

======================= Binary Error Log event =======================
EVM event name: sys.unix.binlog.hw.scsi

Binary error log events are posted through the binlogd daemon, and
stored in the binary error log file, /var/adm/binary.errlog. This
event is used to report all SCSI device errors, including disk,
tape, HSZ raid events, and adapter errors.

======================================================================

Formatted Message:
SCSI event

Event Data Items:
Event Name : sys.unix.binlog.hw.scsi
Priority : 700
PID : 329
PPID : 1
Event Id : 487
Timestamp : 02-Jan-2005 02:57:28
Host IP address : 128.100.102.2
Host Name : BLOOR-ES40
User Name : root
Format : SCSI event
Reference : cat:evmexp.cat:300

Variable Items:
subid_class (INT32) = 199
subid_num (INT32) = 0
subid_unit_num (INT32) = 0
subid_type (INT32) = 34
binlog_event (OPAQUE) = [OPAQUE VALUE: 1224 bytes]

============================ Translation =============================
Sequence number of error: 226426882
Time of error entry: 02-Jan-2005 02:51:27
Host name: BLOOR-ES40

SCSI CAM ERROR PACKET
Controller type: DISK
SCSI device class: DEC SIM
Bus Number: 0
Target number: 0
Lun Number: 0

Name of routine that logged the event: ss_perform_timeout
Event information: timeout on disconnected request

############### Entry End ###############

Event information: Active CCB at time of error

############### Entry End ###############
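When root's mail holds many of these alerts, it can help to tally the translated packets by bus, target, and LUN to see whether they all point at one device. Below is a minimal sketch, assuming the "Translation" text from the mails has been saved as plain text; the function name `count_cam_errors` and the `SAMPLE` string are made up for illustration, and the field labels are taken from the packet shown above.

```python
import re
from collections import Counter

# Field labels as they appear in the translated packet above.
FIELD_RE = re.compile(
    r"^\s*(Bus Number|Target number|Lun Number)\s*:\s*(\d+)\s*$",
    re.MULTILINE,
)


def count_cam_errors(text):
    """Count translated SCSI CAM packets per (bus, target, lun).

    Hypothetical helper: it assumes every packet starts with the
    banner line "SCSI CAM ERROR PACKET", as in the mail above.
    """
    counts = Counter()
    # Everything before the first banner is mail header, so skip it.
    for packet in text.split("SCSI CAM ERROR PACKET")[1:]:
        fields = {name.lower(): int(value)
                  for name, value in FIELD_RE.findall(packet)}
        key = (fields.get("bus number"),
               fields.get("target number"),
               fields.get("lun number"))
        counts[key] += 1
    return counts


# Sample text copied from the packet in this post.
SAMPLE = """\
SCSI CAM ERROR PACKET
Controller type: DISK
SCSI device class: DEC SIM
Bus Number: 0
Target number: 0
Lun Number: 0

Name of routine that logged the event: ss_perform_timeout
Event information: timeout on disconnected request
"""

if __name__ == "__main__":
    for (bus, target, lun), n in count_cam_errors(SAMPLE).most_common():
        print(f"bus {bus} target {target} lun {lun}: {n} event(s)")
```

If nearly all packets land on a single bus/target/LUN, that device (or its cabling and termination) is the first thing to check.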
Mohamed  K Ahmed
Trusted Contributor

Re: SCSI CAM Errors

Sometimes I get the same message on my ES40 running V 5.1B. What type of tape drive do you have, and what kind of SCSI card is it connected to?

Mohamed
G-MAN
Frequent Advisor

Re: SCSI CAM Errors

The tape drive is a Compaq DAT 12/24 model connected to a KZPBA-CA SCSI controller. We have had to replace this tape drive many times due to failures.
Ralf Puchner
Honored Contributor

Re: SCSI CAM Errors

Simple timeout and SCSI bus reset. Check the cabling and the configuration rules.
Help() { FirstReadManual(urgently); Go_to_it;; }
G-MAN
Frequent Advisor

Re: SCSI CAM Errors

What do you mean by configuration rules? Nothing has changed. These events point to the internal disk drives, so I am wondering if it is the SCSI controller. Again, everything is internal to the ES40, so I cannot imagine a loose or damaged cable being the cause.
Any thoughts?
Ralf Puchner
Honored Contributor

Re: SCSI CAM Errors

Configuration rules are the rules for what may be connected to a cable, and how. There are defined rules, e.g. only one changer per SCSI bus, console settings for termination, which devices are supported, which firmware is required, and so on. Those are the configuration rules!

The ES40 also uses cables internally, so there may be termination problems (see the console variables). If you are lost, open a call with the HP support center and provide binary.errlog for analysis.
Help() { FirstReadManual(urgently); Go_to_it;; }
G-MAN
Frequent Advisor

Re: SCSI CAM Errors

I finally made it in front of the system over the weekend and found the light on one of the disk drives lit solid. Every time I tried to issue a command to check the logs or the drive, the box hung for 6-7 minutes before it became responsive again. So I decided to replace the drive with a spare I had; luckily for me, the drive was just a secondary backup drive.
Anyway, the drive swap seems to have cleared the errors completely, and I have had no issues so far. Thank you for all your replies.