Operating System - Tru64 Unix

Re: Cluster Problem

 
admin1979
Super Advisor

Cluster Problem

Hello,

We have a TruCluster V5.1A node. It was unreachable and nothing was displayed on the screen. We tried to ssh in, but that did not help.
So we rebooted the cluster node and it has come back online.
In the logs we found a few error events.
Please see the attachment for the relevant logs.

The logs show error events pertaining to (B9 T0 L0),
TAPE and DEC SIM.
So it looks like the tape library connected to the host has a problem, but I am not sure what DEC SIM is.

Could you please spare a thought on this?


Thanx,
admin
10 REPLIES
Mobeen_1
Esteemed Contributor

Re: Cluster Problem

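DEC SIM is a SCSI class in the error log: SIM is the SCSI Interface Module, the layer of the CAM SCSI subsystem that talks to the host adapter.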
admin1979
Super Advisor

Re: Cluster Problem

OK, so it's just a class. So is it correct to say that the only problem is with the tape drive?
Mobeen_1
Esteemed Contributor

Re: Cluster Problem

It looks that way. Would you mind posting the error (cut and paste, only the relevant section)?
admin1979
Super Advisor

Re: Cluster Problem

Here is the relevant part of binary.errlog:

----- EVENT INFORMATION -----

EVENT CLASS            ERROR EVENT
OS EVENT TYPE          199. CAM SCSI
SEQUENCE NUMBER        15231.
OPERATING SYSTEM       DEC OSF/1
OCCURRED/LOGGED ON     Sat Apr 24 10:06:19 2010
OCCURRED ON SYSTEM     bwgc559
SYSTEM ID              x000B0022
SYSTYPE                x00000000
PROCESSOR COUNT        2.
PROCESSOR WHO LOGGED   x00000000

----- UNIT INFORMATION -----

CLASS                  x0001 TAPE
SUBSYSTEM              x0001 TAPE
BUS #                  x0009
                       x0240 LUN x0
                             TARGET x0
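
For reference, the (B9 T0 L0) in the first post appears to be the same bus/target/LUN triple shown in the UNIT INFORMATION section. The short Python sketch below pulls the three hexadecimal fields out of the excerpt and prints them in that shorthand. It assumes the "x" prefix marks a hexadecimal value and that B/T/L stand for bus, target and LUN; the helper names parse_btl and format_btl are made up for illustration and are not part of any Tru64 tool.

```python
import re

# Unit-information lines copied from the excerpt above.
# Assumption: values prefixed with "x" are hexadecimal.
UNIT_INFO = """\
BUS # x0009
x0240 LUN x0
TARGET x0
"""


def parse_btl(text):
    """Return (bus, target, lun) as integers from the unit-information text."""
    patterns = {
        "bus": r"BUS #\s+x([0-9A-Fa-f]+)",
        "target": r"TARGET\s+x([0-9A-Fa-f]+)",
        "lun": r"LUN\s+x([0-9A-Fa-f]+)",
    }
    values = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"{name} field not found")
        values[name] = int(match.group(1), 16)  # hex string -> integer
    return values["bus"], values["target"], values["lun"]


def format_btl(bus, target, lun):
    """Render the triple in the (B# T# L#) shorthand used in the first post."""
    return f"B{bus} T{target} L{lun}"


bus, target, lun = parse_btl(UNIT_INFO)
print(format_btl(bus, target, lun))
```

Run against the excerpt, this identifies the device as bus 9, target 0, LUN 0, which is the address to look for when checking which tape drive or library is attached.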




Thanx,
admin
Mobeen_1
Esteemed Contributor

Re: Cluster Problem

Please check the drive.
Rob Leadbeater
Honored Contributor

Re: Cluster Problem

Hi,

I wouldn't have expected a SCSI tape error to completely lock up a node.

Did any of the other members of the cluster log any useful information?

Take a look at the EVM log files - usually easiest through sysman.

Cheers,

Rob
admin1979
Super Advisor

Re: Cluster Problem

The other members did not record any relevant errors. Also, the tape library is quite old. The tape backup runs for hours, and from what we have observed the system is very busy during the backup. So we suspect that is what hung the system.
John Manger
Valued Contributor

Re: Cluster Problem

Examine the binary.errlog SCSI events using DECevent rather than uerf - DECevent gives much more information about the event(s), and it -might- shed some light on what happened.

John M
admin1979
Super Advisor

Re: Cluster Problem

Hi,

I have never used the DECevent viewer. I will check.
admin1979
Super Advisor

Re: Cluster Problem

As mentioned above.