Operating System - Tru64 Unix

Re: system crash

 
Yong_7
Frequent Advisor

Re: system crash

Hi HY,

I can't access your attachment at the moment (thanks to our company's firewall masterwork).

From the other gurus' posts, I think you've got the point that the problem is with an AdvFS domain.

The reason to look at the controller configuration is to find out whether you have a RAID set under that oracle_domain, so that you can pinpoint the specific physical disk or disks.

You can have a look at the /var/adm/binary.errlog file with
# uerf -R -o full | more

Any findings about bad spots on the hard drive? Or is it just a simple AdvFS domain panic? Or something else?

In any case, have a look at the AdvFS Administration manual. "# verify" is the thing we should do; also have a look at salvage with
"# man salvage" (you may need to fix the hard drive if that's the root cause).

4.0D is not supported by the vendor. Just make sure you keep the patch level up to date.

http://www1.itrc.hp.com/service/patch/search.do?pageContextName=tru%3A%3A&admit=-682735245+1084892897444+28353475

Regards!

YJ
hy_3
Frequent Advisor

Re: system crash

#uerf -R -o full | more
uerf version 4.2-011 (122)


********************************* ENTRY 1. *********************************

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 5149.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Wed May 19 15:22:49 2004
OCCURRED ON SYSTEM sjbdaa
SYSTEM ID x00070016
SYSTYPE x00000000
PROCESSOR COUNT 2.
PROCESSOR WHO LOGGED x00000000

----- UNIT INFORMATION -----

CLASS x001F UNKNOWN
SUBSYSTEM x0000 DISK
BUS # x0010
x043F LUN x7

TARGET x7

----- CAM STRING -----

ROUTINE NAME targ_send_comp

----- CAM STRING -----

Target SEND failed

----- CAM STRING -----

ERROR TYPE Soft Error Detected (recovered)

----- CAM STRING -----

Active CCB at time of error
ERROR - os_std, os_type = 11, std_type = 10


----- ENT_CCB_SCSIIO -----
*MY ADDR xEFE37580
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x0013 CAM_UNEXP_BUSFREE
PATH ID 16.
TARGET ID 7.
TARGET LUN 7.
CAM FLAGS x00001480
CAM_DIR_OUT
CAM_SIM_QFRZDIS
CAM_SIM_QHEAD
*PDRV_PTR xEFE37228
*NEXT_CCB x00000000
*REQ_MAP x00000000
VOID (*CAM_CBFCNP)() x00674B58
*DATA_PTR x07A5FD60
DXFER_LEN x0000008C
*SENSE_PTR xEFE37250
SENSE_LEN xA4
CDB_LEN x06
SGLIST_CNT x0000
CAM_SCSI_STATUS x0000 SCSI_STAT_GOOD
SENSE_RESID x00
RESID x00000000
CAM_CDB_IO x000000000000018C0000E00A
CAM_TIMEOUT x00000005
MSGB_LEN x0000
VU_FLAGS x0000
TAG_ACTION x00

----- ENT_SENSE_DATA -----

ERROR CODE x0000 CODE x0
SEGMENT x00
SENSE KEY x0000 NO SENSE
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x00
ADDITION LEN x00
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x00
CMD SPECIFIC 0 x00
ASC x00
ASQ x00
FRU x00
SENSE SPECIFIC x000000
ADDITIONAL SENSE
0000: 00000000 00000000 00000000 00000000 *................*
0010: 00000000 00000000 00000000 00000000 *................*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 00000000 00000000 00000000 00000000 *................*
0040: 00000000 00000000 00000000 00000000 *................*
0050: 00000000 00000000 00000000 00000000 *................*
0060: 00000000 00000000 00000000 00000000 *................*
0070: 00000000 00000000 00000000 00000000 *................*
0080: 00000000 00000000 00000000 00000000 *................*
0090: 00000000 00000000 7E250000 00005E3C *..........%~<^..*
00A0: 00000000 *.... *
The messages above repeat every 10 seconds. Please help me analyze them. Thank you.
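
A minimal sketch for checking how regularly the entries really repeat, assuming the uerf text output has been saved to a file (for example with "uerf -R -o full > uerf.txt") and copied to a machine that has Python. The function names here are made up for illustration; the only thing taken from the log is the "OCCURRED/LOGGED ON" line format shown above.

```python
import re
from datetime import datetime

# Matches the timestamp line of each uerf entry, e.g.
# "OCCURRED/LOGGED ON Wed May 19 15:22:49 2004"
STAMP = re.compile(
    r"OCCURRED/LOGGED ON\s+(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)


def entry_times(report):
    """Return the timestamps of all entries found in uerf text output."""
    times = []
    for match in STAMP.finditer(report):
        # Collapse any repeated spaces (single-digit days may be padded).
        text = " ".join(match.group(1).split())
        times.append(datetime.strptime(text, "%a %b %d %H:%M:%S %Y"))
    return times


def gaps_in_seconds(times):
    """Return the gaps between consecutive entries, oldest first."""
    ordered = sorted(times)
    return [int((b - a).total_seconds()) for a, b in zip(ordered, ordered[1:])]
```

Reading the saved file and passing its contents through gaps_in_seconds(entry_times(...)) gives a list of intervals. A steady run of identical values would support the "every 10 seconds" observation; irregular gaps would suggest the errors follow I/O activity instead.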


Michael Schulte zur Sur
Honored Contributor

Re: system crash

Hi,

Do you have DECevent on your machine?
If so, post the relevant part from
dia -R | more
It is more precise than uerf.

Thanks,

Michael
Yong_7
Frequent Advisor

Re: system crash

Hi hy,

The message in your last post indicates a bad block replacement sequence, and the system was able to heal itself. If you have many of them there, this could be a sign that a hard drive replacement is necessary. I also noticed that the Alpha 1200 has had a loooong life, and so has your storage, I guess.
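
To judge whether there are "many" of them, a rough sketch like the one below could tally the entries per device and CAM status from saved uerf text output. It assumes each entry starts with an "ENTRY n." banner and carries the PATH ID, TARGET ID, TARGET LUN and CAM_STATUS lines seen in the post above; the helper names are hypothetical and this is not part of uerf itself.

```python
import re
from collections import Counter

# Field patterns taken from the ENT_CCB_SCSIIO block of the posted entry.
FIELDS = {
    "path": re.compile(r"PATH ID\s+(\d+)\."),
    "target": re.compile(r"TARGET ID\s+(\d+)\."),
    "lun": re.compile(r"TARGET LUN\s+(\d+)\."),
    "status": re.compile(r"CAM_STATUS\s+x[0-9A-Fa-f]+\s+(\w+)"),
}


def split_entries(report):
    """Split uerf text output on the '***** ENTRY n. *****' banners."""
    parts = re.split(r"\*+\s*ENTRY\s+\d+\.\s*\*+", report)
    # parts[0] is whatever precedes the first banner (the uerf version line).
    return [part for part in parts[1:] if part.strip()]


def tally(report):
    """Count entries per (path, target, lun, CAM status) combination."""
    counts = Counter()
    for entry in split_entries(report):
        key = []
        for pattern in FIELDS.values():
            match = pattern.search(entry)
            key.append(match.group(1) if match else "?")
        counts[tuple(key)] += 1
    return counts
```

If nearly all entries land on one (path, target, lun) key, that narrows down which device or bus to check first; a spread across several keys would point more toward a shared component such as the bus or controller.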

DECevent may help more, if you paid for it;
uerf is universal. Now CA is in charge of it all.

Anyway, you still need to address that AdvFS domain first. That's the key.

Good Luck !

YJ
Ralf Puchner
Honored Contributor

Re: system crash

The error in the message files indicates an AdvFS write problem. So if a check of the disks involved indicates a hardware problem, please replace the disks.

If you need the data in the domains, use verify or salvage to try to repair the domains. Please read the admin guide first; it explains how to use the tools. Be sure a backup exists in case all else fails.

Help() { FirstReadManual(urgently); Go_to_it;; }