05-16-2004 08:22 PM
system crash
05-16-2004 09:22 PM
Re: system crash
Do you have the crash dump? Here are some things you can do:
1. From the console prompt, run commands such as
show ????
Use help if you need to see your
options.
Look out for any environment errors, etc.
2. After the system is up, you can always
look at your error log and the dump
(if present) using the stack dump analyzer
ANAL/SYS
Regards,
Mobeen
05-16-2004 09:27 PM
Re: system crash
It looks like there's a problem with the metadata for /oracle.
Cheers
Nicolas
05-16-2004 09:53 PM
Re: system crash
I am sorry, I did not realise that there was an attachment. As our friend has highlighted, the following look like the problem areas:
bs_osf_complete: metadata write failed
Oct 22 13:48:42 sjbdaa vmunix: AdvFS Domain Panic; Domain oracle_domain Id 0x36f305f6.0004e6f2
Oct 22 13:48:42 sjbdaa vmunix: An AdvFS domain panic has occurred due to either a metadata write error or an internal inconsistency. This domain is being rendered inaccessible.
Oct 22 13:48:42 sjbdaa vmunix: Please refer to guidelines in AdvFS Guide to File System Administration regarding what steps to take to recover this domain.
Oct 22 13:50:42 sjbdaa vmunix: AdvFS I/O error:
Oct 22 13:50:42 sjbdaa vmunix: Volume: /dev/rzb128d
Oct 22 13:50:42 sjbdaa vmunix: Tag: 0xfffffff7.0000
Oct 22 13:50:42 sjbdaa vmunix: Page: 175
Oct 22 13:50:42 sjbdaa vmunix: Block: 3952
Oct 22 13:50:42 sjbdaa vmunix: Block count: 16
Oct 22 13:50:42 sjbdaa vmunix: Type of operation: Write
Oct 22 13:50:42 sjbdaa vmunix: Error: 5
Oct 22 13:50:42 sjbdaa vmunix:
Oct 22 13:50:42 sjbdaa vmunix: bs_osf_complete: metadata write failed
Oct 22 13:50:42 sjbdaa vmunix: AdvFS Domain Panic; Domain dbf_domain Id 0x36f30612.000011f7
Oct 22 13:50:42 sjbdaa vmunix: An AdvFS domain panic has occurred due to either a metadata write error or an internal inconsistency. This domain is being rendered inaccessible.
Oct 22 13:50:42 sjbdaa vmunix: Please refer to guidelines in AdvFS Guide to File System Administration regarding what steps to take to recover this domain.
Oct 24 22:51:06 sjbdaa vmunix:
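For anyone triaging a longer log, the AdvFS I/O error records above can be pulled apart mechanically. Below is a minimal Python sketch, assuming the syslog layout shown in this thread. The names parse_advfs_io_errors and SAMPLE are illustrative and not part of any Tru64 tool. Error 5 is presumably errno EIO, a generic I/O error.

```python
import re
from collections import Counter

# Detail lines that follow an "AdvFS I/O error:" header in the log above.
FIELD_RE = re.compile(
    r"vmunix:\s+(Volume|Tag|Page|Block count|Block|Type of operation|Error):\s*(\S+)"
)

# Excerpt copied from the log quoted in this post.
SAMPLE = """\
Oct 22 13:50:42 sjbdaa vmunix: AdvFS I/O error:
Oct 22 13:50:42 sjbdaa vmunix: Volume: /dev/rzb128d
Oct 22 13:50:42 sjbdaa vmunix: Tag: 0xfffffff7.0000
Oct 22 13:50:42 sjbdaa vmunix: Page: 175
Oct 22 13:50:42 sjbdaa vmunix: Block: 3952
Oct 22 13:50:42 sjbdaa vmunix: Block count: 16
Oct 22 13:50:42 sjbdaa vmunix: Type of operation: Write
Oct 22 13:50:42 sjbdaa vmunix: Error: 5
"""


def parse_advfs_io_errors(lines):
    """Return one dict per AdvFS I/O error record found in the given lines."""
    records = []
    current = None
    for line in lines:
        if "AdvFS I/O error:" in line:
            # A header line starts a new record.
            current = {}
            records.append(current)
            continue
        match = FIELD_RE.search(line)
        if match and current is not None:
            current[match.group(1)] = match.group(2)
    return records


def errors_per_volume(records):
    """Count how many error records name each volume."""
    return Counter(r.get("Volume", "unknown") for r in records)


if __name__ == "__main__":
    recs = parse_advfs_io_errors(SAMPLE.splitlines())
    for volume, count in errors_per_volume(recs).items():
        print(volume, count)
```

Running it over the whole messages file rather than the excerpt would show whether the errors cluster on one volume or spread across several.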
regards
Mobeen
05-16-2004 11:55 PM
Re: system crash
Looking at the timestamps of the errors I posted in my previous message, they look pretty old, dating as far back as October.
Are we missing something?
regards
Mobeen
05-17-2004 02:45 AM
Re: system crash
May 17 11:50:35 sjbdaa vmunix: Volume: /dev/rzb128f
May 17 11:50:35 sjbdaa vmunix: Tag: 0xfffffff7.0000
May 17 11:50:35 sjbdaa vmunix: Page: 510
May 17 11:50:35 sjbdaa vmunix: Block: 8784
May 17 11:50:35 sjbdaa vmunix: Block count: 32
May 17 11:50:35 sjbdaa vmunix: Type of operation: Write
May 17 11:50:35 sjbdaa vmunix: Error: 5
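The volume name in these records can be mapped back to a SCSI bus, target and LUN, which helps when matching it to a unit on the controller. This is a rough Python sketch, assuming the pre-V5 Digital UNIX naming rule: rz, an optional LUN letter b to h, then bus*8 + target, then the partition letter. decode_rz_name is a hypothetical helper, and the rule itself should be checked against the documentation for your release.

```python
import re

# Assumed layout: [/dev/][r]rz<lun letter b-h, optional><unit number><partition a-h, optional>
NAME_RE = re.compile(r"^(?:/dev/)?r?rz([b-h]?)(\d+)([a-h]?)$")


def decode_rz_name(name):
    """Split an rz device name into bus, target, LUN and partition.

    Assumes unit number = bus * 8 + target, and that a missing LUN letter means LUN 0.
    """
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an rz device name: {name!r}")
    lun_letter, unit, partition = match.groups()
    bus, target = divmod(int(unit), 8)
    lun = ord(lun_letter) - ord("a") if lun_letter else 0
    return {"bus": bus, "target": target, "lun": lun, "partition": partition or None}


if __name__ == "__main__":
    print(decode_rz_name("/dev/rzb128f"))
```

Under that assumption, /dev/rzb128f decodes to bus 16, target 0, LUN 1, partition f.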
Hi,
It seems you have a problem with your HSZ50.
Can you post the output of
show this full
show other (if redundant)
show disk full
show mirror
show fail
thanks,
Michael
05-17-2004 01:20 PM
Re: system crash
05-17-2004 05:09 PM
Re: system crash
No problem. You can safely run all of the requested show commands on a running system. Trust me on that :-). Go ahead and give us the details.
regards
Mobeen
05-17-2004 08:40 PM
Re: system crash
You can use the controller port, SWCC, or hszterm.
Michael
05-17-2004 09:38 PM
Re: system crash
05-18-2004 03:10 AM
Re: system crash
I can't access your attachment at the moment (thanks to the company firewall).
From the other gurus' posts, I think you already have the point that the problem is with an AdvFS domain.
The reason to look at the controller configuration is to find out whether there is a RAID set under oracle_domain, so that the specific physical disk or disks can be pinpointed.
You can look at the /var/adm/binary.errlog file with
# uerf -R -o full | more
Does it show any bad spots on the hard drive, or is it simply an AdvFS domain panic, or something else?
In any case, have a look at the AdvFS admin manual. "# verify" is what we should run; also read up on salvage with
"# man salvage" (you may need to fix the hard drive if that is the root cause).
4.0D is not supported by the vendor. Just make sure to keep the patch level up to date.
http://www1.itrc.hp.com/service/patch/search.do?pageContextName=tru%3A%3A&admit=-682735245+1084892897444+28353475
regards !
YJ
05-18-2004 07:33 PM
Re: system crash
uerf version 4.2-011 (122)
********************************* ENTRY 1. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 5149.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Wed May 19 15:22:49 2004
OCCURRED ON SYSTEM sjbdaa
SYSTEM ID x00070016
SYSTYPE x00000000
PROCESSOR COUNT 2.
PROCESSOR WHO LOGGED x00000000
----- UNIT INFORMATION -----
CLASS x001F UNKNOWN
SUBSYSTEM x0000 DISK
BUS # x0010
x043F LUN x7
TARGET x7
----- CAM STRING -----
ROUTINE NAME targ_send_comp
----- CAM STRING -----
Target SEND failed
----- CAM STRING -----
ERROR TYPE Soft Error Detected (recovered)
----- CAM STRING -----
Active CCB at time of error
ERROR - os_std, os_type = 11, std_type = 10
----- ENT_CCB_SCSIIO -----
*MY ADDR xEFE37580
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x0013 CAM_UNEXP_BUSFREE
PATH ID 16.
TARGET ID 7.
TARGET LUN 7.
CAM FLAGS x00001480
CAM_DIR_OUT
CAM_SIM_QFRZDIS
CAM_SIM_QHEAD
*PDRV_PTR xEFE37228
*NEXT_CCB x00000000
*REQ_MAP x00000000
VOID (*CAM_CBFCNP)() x00674B58
*DATA_PTR x07A5FD60
DXFER_LEN x0000008C
*SENSE_PTR xEFE37250
SENSE_LEN xA4
CDB_LEN x06
SGLIST_CNT x0000
CAM_SCSI_STATUS x0000 SCSI_STAT_GOOD
SENSE_RESID x00
RESID x00000000
CAM_CDB_IO x000000000000018C0000E00A
CAM_TIMEOUT x00000005
MSGB_LEN x0000
VU_FLAGS x0000
TAG_ACTION x00
----- ENT_SENSE_DATA -----
ERROR CODE x0000 CODE x0
SEGMENT x00
SENSE KEY x0000 NO SENSE
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x00
ADDITION LEN x00
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x00
CMD SPECIFIC 0 x00
ASC x00
ASQ x00
FRU x00
SENSE SPECIFIC x000000
ADDITIONAL SENSE
0000: 00000000 00000000 00000000 00000000 *................*
0010: 00000000 00000000 00000000 00000000 *................*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 00000000 00000000 00000000 00000000 *................*
0040: 00000000 00000000 00000000 00000000 *................*
0050: 00000000 00000000 00000000 00000000 *................*
0060: 00000000 00000000 00000000 00000000 *................*
0070: 00000000 00000000 00000000 00000000 *................*
0080: 00000000 00000000 00000000 00000000 *................*
0090: 00000000 00000000 7E250000 00005E3C *..........%~<^..*
00A0: 00000000 *.... *
The messages above repeat every 10 seconds. Please help me analyze them. Thank you.
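Since the entries repeat, it can help to tally them before reading each one in full. Here is a minimal Python sketch, assuming the uerf text layout shown above (an "ENTRY n." banner, then "ERROR TYPE", "CAM_STATUS" and "OCCURRED/LOGGED ON" lines). summarize_uerf and the trimmed SAMPLE are illustrative and not part of uerf itself.

```python
import re
from collections import Counter

# Banner that starts each entry in the uerf listing above.
ENTRY_RE = re.compile(r"^\*+\s*ENTRY\s+(\d+)\.\s*\*+$")

# Trimmed excerpt of the entry quoted in this post.
SAMPLE = """\
********************************* ENTRY 1. *********************************
OCCURRED/LOGGED ON Wed May 19 15:22:49 2004
ROUTINE NAME targ_send_comp
ERROR TYPE Soft Error Detected (recovered)
CAM_STATUS x0013 CAM_UNEXP_BUSFREE
CAM_SCSI_STATUS x0000 SCSI_STAT_GOOD
"""


def summarize_uerf(text):
    """Return one dict per entry with its timestamp, error type and CAM status."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        banner = ENTRY_RE.match(line)
        if banner:
            current = {"entry": int(banner.group(1))}
            entries.append(current)
        elif current is None:
            continue  # skip anything before the first entry banner
        elif line.startswith("OCCURRED/LOGGED ON"):
            current["when"] = line[len("OCCURRED/LOGGED ON"):].strip()
        elif line.startswith("ERROR TYPE"):
            current["error_type"] = line[len("ERROR TYPE"):].strip()
        elif line.startswith("CAM_STATUS"):
            # Keep the symbolic name, e.g. CAM_UNEXP_BUSFREE.
            current["cam_status"] = line.split()[-1]
    return entries


if __name__ == "__main__":
    tally = Counter(
        (e.get("error_type"), e.get("cam_status")) for e in summarize_uerf(SAMPLE)
    )
    for key, count in tally.items():
        print(count, key)
```

If every entry in the full log falls into the same recovered soft error bucket, that is a different picture from a mix of hard errors on the disks behind the AdvFS domains.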
05-18-2004 10:34 PM
Re: system crash
Do you have DECevent on your machine?
If so, post the relevant part of the output from
dia -R | more
It is more precise than uerf.
thanks,
Michael
05-19-2004 05:40 AM
Re: system crash
The message in your last post indicates a bad block replacement sequence, and the system was able to heal itself. If you see many of them, that could be a sign that a hard drive needs replacing. I also noticed that the Alpha 1200 has a loooong life, and I guess your storage does too.
DECevent may help more, if you paid for it;
uerf is universal. Now CA is in charge of it all.
Anyway, you still need to address that AdvFS domain first. That's the key.
Good Luck !
YJ
05-19-2004 07:54 AM
Re: system crash
If you need the data in the domains, use verify or salvage to try to repair them. Please read the admin guide first; it explains how to use the tools. Be sure a backup exists in case all else fails.