Operating System - Tru64 Unix

admin1979
Super Advisor

ADVFS messages


Hello,

We noticed the following SCSI error in binary.errlog on a Tru64 UNIX V5.1B system.

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 19474.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Sat Oct 10 06:34:44 2009
OCCURRED ON SYSTEM bwga137
SYSTEM ID x000B0022
SYSTYPE x00000000
PROCESSOR COUNT 2.
PROCESSOR WHO LOGGED x00000000

----- UNIT INFORMATION -----

CLASS x0000 DISK
SUBSYSTEM x0000 DISK
BUS # x0000
x001E LUN x6
TARGET x3



This is the output of scu show edt:

WEBSERVER > scu show edt

CAM Equipment Device Table (EDT) Information:

Bus/Target/Lun Device Type ANSI Vendor ID Product ID Revision N/W
-------------- ----------- ------ --------- ---------------- -------- ---
0 1 0 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 1 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 2 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 3 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 4 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 5 Direct SCSI-2 DEC HSZ70CCL V73Z W
0 1 6 Direct SCSI-2 DEC HSZ70 V73Z W
0 1 7 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 0 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 1 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 2 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 3 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 4 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 5 Direct SCSI-2 DEC HSZ70 V73Z W
0 2 6 Direct SCSI-2 DEC HSZ70 V73Z W
0 3 0 Direct SCSI-2 DEC HSZ70 V73Z W
0 3 1 Direct SCSI-2 DEC HSZ70 V73Z W
0 3 2 Direct SCSI-2 DEC HSZ70 V73Z W
0 3 3 Direct SCSI-2 DEC HSZ70 V73Z W
0 3 6 Direct SCSI-2 DEC HSZ70 V73Z W
2 0 0 CD-ROM SCSI-2 COMPAQ CD-224E 9.5B N
6 6 0 Sequential SCSI-2 COMPAQ SDT-9000 4.20 N
7 0 0 Direct SCSI-2 COMPAQ BD036635C5 B020 W
7 1 0 Direct SCSI-2 COMPAQ BD036635C5 B020 W


Now we would like to know which hard disk is reporting the error and which AdvFS domain/fileset it belongs to.
Once we identify the disk, is there any way to take preventive measures to minimize the damage?

Please help.

Thanks,
admin


Vladimir Fabecic
Honored Contributor

Re: ADVFS messages

Please post output of:
# hwmgr -show scsi
# ls -lR /etc/fdmns
In vino veritas, in VMS cluster
admin1979
Super Advisor

Re: ADVFS messages

Thanks. Here is the output:


> hwmgr -show scsi

SCSI DEVICE DEVICE DRIVER NUM DEVICE FIRST
HWID: DEVICEID HOSTNAME TYPE SUBTYPE OWNER PATH FILE VALID PATH
-------------------------------------------------------------------------
78: 0 WEBSERVER disk none 2 1 dsk0 [0/1/0]
79: 1 WEBSERVER disk none 2 1 dsk1 [0/1/1]
80: 2 WEBSERVER disk none 2 1 dsk2 [0/1/2]
81: 3 WEBSERVER disk none 2 1 dsk3 [0/1/3]
82: 4 WEBSERVER disk none 0 1 dsk4 [0/2/0]
83: 5 WEBSERVER disk none 0 1 dsk5 [0/2/1]
84: 6 WEBSERVER disk none 0 1 dsk6 [0/2/2]
85: 7 WEBSERVER disk none 0 1 dsk7 [0/2/3]
86: 8 WEBSERVER disk none 0 1 dsk8 [0/2/4]
87: 9 WEBSERVER disk none 0 1 dsk9 [0/2/5]
88: 10 WEBSERVER disk none 0 1 dsk10 [0/2/6]
89: 11 WEBSERVER disk none 0 1 scp0 [0/1/5]
90: 12 WEBSERVER disk none 0 1 dsk11 [0/3/0]
91: 13 WEBSERVER disk none 0 1 dsk12 [0/3/1]
93: 15 WEBSERVER disk none 2 1 dsk14 [0/3/6]
94: 16 WEBSERVER cdrom none 0 1 cdrom0 [2/0/0]
95: 18 WEBSERVER disk none 0 1 dsk15 [7/0/0]
96: 19 WEBSERVER disk none 0 1 dsk16 [7/1/0]
97: 17 WEBSERVER tape none 0 1 tape0 [6/6/0]
102: 20 WEBSERVER disk none 2 1 dsk17 [0/1/4]
103: 21 WEBSERVER disk none 0 1 (null)
175: 14 WEBSERVER disk none 2 1 dsk20 [0/3/2]
176: 22 WEBSERVER disk none 2 1 dsk21 [0/3/3]
177: 23 WEBSERVER disk none 0 1 dsk22 [0/1/6]
178: 24 WEBSERVER disk none 0 1 dsk23 [0/1/7]


> ls -lR /etc/fdmns
total 112
-r-------- 1 root system 0 Mar 31 2004 .advfslock_archive_domain
-r-------- 1 root system 0 Mar 22 2004 .advfslock_backup_domain
-r-------- 1 root system 0 Mar 13 2004 .advfslock_cluster_root
-r-------- 1 root system 0 Mar 13 2004 .advfslock_cluster_usr
-r-------- 1 root system 0 Mar 13 2004 .advfslock_cluster_var
-r-------- 1 root system 0 Mar 13 2004 .advfslock_fdmns
-r-------- 1 root system 0 Mar 22 2004 .advfslock_log1_domain
-r-------- 1 root system 0 Mar 22 2004 .advfslock_log2_domain
-r-------- 1 root system 0 Mar 22 2004 .advfslock_opt_domain
-r-------- 1 root system 0 Mar 13 2004 .advfslock_oradata_domain
-r-------- 1 root system 0 Mar 13 2004 .advfslock_root1_domain
-r-------- 1 root system 0 Mar 13 2004 .advfslock_root2_domain
-r-------- 1 root system 0 Mar 13 2004 .advfslock_root_domain
-r-------- 1 root system 0 Oct 24 2005 .advfslock_struppi_backup
-r-------- 1 root system 0 Mar 13 2004 .advfslock_usr_domain
drwxr-xr-x 2 root system 8192 Mar 31 2004 archive_domain
drwxr-xr-x 2 root system 8192 Mar 22 2004 backup_domain
drwxr-xr-x 2 root system 8192 Mar 13 2004 cluster_root
drwxr-xr-x 2 root system 8192 Mar 13 2004 cluster_usr
drwxr-xr-x 2 root system 8192 Mar 13 2004 cluster_var
drwxr-xr-x 2 root system 8192 Mar 22 2004 log1_domain
drwxr-xr-x 2 root system 8192 Mar 22 2004 log2_domain
drwxr-xr-x 2 root system 8192 Mar 22 2004 opt_domain
drwxr-xr-x 2 root system 8192 Mar 15 2004 oradata_domain
drwxr-xr-x 2 root system 8192 Mar 13 2004 root1_domain
drwxr-xr-x 2 root system 8192 Mar 13 2004 root2_domain
drwxr-xr-x 2 root system 8192 Mar 13 2004 root_domain
drwxr-xr-x 2 root system 8192 Oct 24 2005 struppi_backup
drwxr-xr-x 2 root system 8192 Mar 13 2004 usr_domain


/etc/fdmns/archive_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 31 2004 dsk17c -> /dev/disk/dsk17c

/etc/fdmns/backup_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 22 2004 dsk21d -> /dev/disk/dsk21d

/etc/fdmns/cluster_root:
total 0
lrwxr-xr-x 1 root system 15 Mar 13 2004 dsk0a -> /dev/disk/dsk0a

/etc/fdmns/cluster_usr:
total 0
lrwxr-xr-x 1 root system 15 Mar 13 2004 dsk0e -> /dev/disk/dsk0e

/etc/fdmns/cluster_var:
total 0
lrwxr-xr-x 1 root system 15 Mar 13 2004 dsk0f -> /dev/disk/dsk0f

/etc/fdmns/log1_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 22 2004 dsk20d -> /dev/disk/dsk20d

/etc/fdmns/log2_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 22 2004 dsk21a -> /dev/disk/dsk21a

/etc/fdmns/opt_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 22 2004 dsk20a -> /dev/disk/dsk20a

/etc/fdmns/oradata_domain:
total 0
lrwxrwxrwx 1 root system 16 Mar 15 2004 dsk14c -> /dev/disk/dsk14c

/etc/fdmns/root1_domain:
total 0
lrwxr-xr-x 1 root system 15 Mar 13 2004 dsk1a -> /dev/disk/dsk1a

/etc/fdmns/root2_domain:
total 0
lrwxr-xr-x 1 root system 15 Mar 13 2004 dsk2a -> /dev/disk/dsk2a

/etc/fdmns/root_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 13 2004 dsk16a -> /dev/disk/dsk16a

/etc/fdmns/struppi_backup:
total 0
lrwxr-xr-x 1 root system 16 Oct 24 2005 dsk21e -> /dev/disk/dsk21e

/etc/fdmns/usr_domain:
total 0
lrwxr-xr-x 1 root system 16 Mar 13 2004 dsk16g -> /dev/disk/dsk16g

Venkatesh BL
Honored Contributor

Re: ADVFS messages

>BUS # x0000
> x001E LUN x6
> TARGET x3

> 93: 15 WEBSERVER disk none 2 1 dsk14 [0/3/6]

/etc/fdmns/oradata_domain:
total 0
lrwxrwxrwx 1 root system 16 Mar 15 2004 dsk14c -> /dev/disk/dsk14c

Looks like 'oradata_domain' to me.
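
The lookup above can be sketched in Python, as a rough illustration only. It assumes the last two fields of each `hwmgr -show scsi` line are the device file and the first valid path, and that AdvFS volumes are named as the disk plus a partition letter a-h. The sample data is a subset copied from the output posted earlier; the function names are made up for this example.

```python
import re

# A few lines copied from the `hwmgr -show scsi` output posted above.
HWMGR_LINES = """\
91: 13 WEBSERVER disk none 0 1 dsk12 [0/3/1]
93: 15 WEBSERVER disk none 2 1 dsk14 [0/3/6]
175: 14 WEBSERVER disk none 2 1 dsk20 [0/3/2]
"""

# Domain -> volume links, copied from the `ls -lR /etc/fdmns` output above.
FDMNS = {
    "oradata_domain": ["dsk14c"],
    "log1_domain": ["dsk20d"],
    "opt_domain": ["dsk20a"],
}


def device_for_btl(hwmgr_text, bus, target, lun):
    """Return the device file whose first valid path is [bus/target/lun]."""
    wanted = f"[{bus}/{target}/{lun}]"
    for line in hwmgr_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-1] == wanted:
            return fields[-2]
    return None


def domains_on_device(fdmns, device):
    """Return the domains with a volume on this disk (disk name + partition a-h)."""
    pattern = re.compile(rf"^{re.escape(device)}[a-h]$")
    return sorted(
        name for name, vols in fdmns.items()
        if any(pattern.match(v) for v in vols)
    )


disk = device_for_btl(HWMGR_LINES, 0, 3, 6)
print(disk, domains_on_device(FDMNS, disk))
```

Matching on the full volume name rather than a prefix keeps dsk1 from being confused with dsk14.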
admin1979
Super Advisor

Re: ADVFS messages

OK, fine, that surely is the one. It is currently mounted and holds data.
So the next question is: is the disk really gone or going bad?
Is taking a data backup the only precaution we can take?
This is a clustered node.
Vladimir Fabecic
Honored Contributor

Re: ADVFS messages

The disk with the problem is:
BUS # x0000
x001E LUN x6
TARGET x3
That means BUS 0, TARGET 3, LUN 6, and that is:
93: 15 WEBSERVER disk none 2 1 dsk14 [0/3/6]
So look for dsk14:
"/etc/fdmns/oradata_domain:
total 0
lrwxrwxrwx 1 root system 16 Mar 15 2004 dsk14c -> /dev/disk/dsk14c"
So oradata_domain is related.
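
As a side note on the x001E value in the unit information: it equals 30 decimal, which is 3 * 8 + 6. The short Python sketch below assumes the field packs the address as target * 8 + LUN. That encoding is an assumption that happens to agree with the TARGET x3 and LUN x6 fields of this entry; it is not taken from the error log documentation.

```python
def split_unit(unit):
    """Split a unit value into (target, lun), ASSUMING unit = target * 8 + lun."""
    return unit >> 3, unit & 0x7


target, lun = split_unit(0x001E)
print(target, lun)
```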

"Once we identify the disk, is there anyway we can take preventive measures to have minimal damage?"

It depends on many things. This message may not indicate a real problem; it may just be a write block retry. It also depends on your storage configuration: are all virtual disks mirrored?

The best preventive measure for minimizing damage is to back up the data every day.

Do you need any other information?
In vino veritas, in VMS cluster
admin1979
Super Advisor

Re: ADVFS messages

Are all disks mirrored? I really don't know.
The HSZ disk configuration follows, but I am not sure how to relate a device (e.g. dsk14) to the disks listed below. Any idea?

HSZ > show disks

Name Type Port Targ Lun Used by
---------------------------------------------
DISK10000 disk 1 0 0 MIRROR0
DISK10100 disk 1 1 0 RAIDFS2
DISK10200 disk 1 2 0 D106
D107
DISK10300 disk 1 3 0 MIRRDB1
DISK20000 disk 2 0 0 MIRROR0
DISK20100 disk 2 1 0 MIRROR1
DISK20200 disk 2 2 0 MIRRDB1
DISK20300 disk 2 3 0 RAIDTLS1
DISK30000 disk 3 0 0 MIRROR1
DISK30100 disk 3 1 0 SPARESET
DISK30200 disk 3 2 0 RAIDTLS1
DISK30300 disk 3 3 0 RAIDFS1
DISK40000 disk 4 0 0 MIRRDB2
DISK40100 disk 4 1 0 RAIDTLS1
DISK40200 disk 4 2 0 MIRRDB2
DISK40300 disk 4 3 0 RAIDFS2
DISK50000 disk 5 0 0 RAIDTLS1
DISK50100 disk 5 1 0 RAIDFS1
DISK50200 disk 5 2 0 SPARESET
DISK50300 disk 5 3 0 D302
D303
DISK60000 disk 6 0 0 RAIDFS1
DISK60100 disk 6 1 0 RAIDFS2
DISK60200 disk 6 2 0 RAIDFS2
DISK60300 disk 6 3 0 RAIDFS1
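
To read the listing above by container rather than by disk, the rows can be regrouped. The Python sketch below is only an illustration: it assumes each disk row has six whitespace-separated fields (name, type, port, target, LUN, used by) and that a line with a single field, such as D107, is a second user of the preceding disk. The sample is a subset of the rows above and the function name is invented. It does not say which unit the host sees as dsk14; that needs the unit listing.

```python
from collections import defaultdict

# A subset of the `show disks` rows posted above.
SHOW_DISKS = """\
DISK10000 disk 1 0 0 MIRROR0
DISK10200 disk 1 2 0 D106
D107
DISK20000 disk 2 0 0 MIRROR0
DISK30100 disk 3 1 0 SPARESET
"""


def members_by_container(text):
    """Group physical disk names by the container listed in the 'Used by' column."""
    groups = defaultdict(list)
    current = None  # most recent disk row, for continuation lines
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[1] == "disk":
            current = fields[0]
            groups[fields[5]].append(current)
        elif len(fields) == 1 and current is not None:
            # Continuation line: another user of the previous disk.
            groups[fields[0]].append(current)
    return dict(groups)


for container, disks in members_by_container(SHOW_DISKS).items():
    print(container, disks)
```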
Venkatesh BL
Honored Contributor

Re: ADVFS messages

Did you get any error messages from AdvFS, for example saying that a metadata read/write failed? If not, you should be OK.

As mentioned by Vladimir, keep a backup handy.
admin1979
Super Advisor

Re: ADVFS messages

So far no errors. But I would still like to know which disk it actually is in the show disks output, so that I can relate the disks in the future. It will also help me identify the physical location of the disk.
Vladimir Fabecic
Honored Contributor

Re: ADVFS messages

Please post output of:
HSZ70 > show unit full
HSZ70 > show failed
HSZ70 > show spare
In vino veritas, in VMS cluster