Operating System - HP-UX

Re: help,my box has problem again,fanit

 
SOLVED
thebeatlesguru
Regular Advisor

help,my box has problem again,fanit

This morning I couldn't telnet to one of my HP servers, and I couldn't ftp to it either. I thought inetd might have died, so I went to the machine room and used the console to log in. However, after I typed "root" and pressed Enter, the password prompt never appeared; the console seemed to be dead.
Finally I restarted the server and found the following in OLDsyslog.log:
Sep 12 19:20:23 netf-pd vmunix: DIAGNOSTIC SYSTEM WARNING:
Sep 12 19:20:23 netf-pd vmunix: The diagnostic logging facility has started receiving excessive
Sep 12 19:20:23 netf-pd vmunix: errors from the I/O subsystem. I/O error entries will be lost
Sep 12 19:20:23 netf-pd vmunix: until the cause of the excessive I/O logging is corrected.
Sep 12 19:20:23 netf-pd vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Sep 12 19:20:23 netf-pd vmunix: in stm to start it.
Sep 12 19:20:23 netf-pd vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Sep 12 19:20:23 netf-pd vmunix: to determine which I/O subsystem is logging excessive errors.
Sep 12 19:20:48 netf-pd vmunix:
Sep 12 19:20:48 netf-pd vmunix: SCSI: Request Timeout -- lbolt: 1344761, dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: lbp->state: 20
Sep 12 19:20:48 netf-pd vmunix: lbp->offset: 0
Sep 12 19:20:48 netf-pd vmunix: lbp->uPhysScript: 480000
Sep 12 19:20:48 netf-pd vmunix: From most recent interrupt:
Sep 12 19:20:48 netf-pd vmunix: ISTAT: 22, SIST0: 00, SIST1: 04, DSTAT: 00, DSPS: 00480500
Sep 12 19:20:48 netf-pd vmunix: lsp: 444de00
Sep 12 19:20:48 netf-pd vmunix: bp->b_dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: scb->io_id: 11379
Sep 12 19:20:48 netf-pd vmunix: scb->cdb: 2a 00 00 59 0e 34 00 00 02 00
Sep 12 19:20:48 netf-pd vmunix: lbolt_at_timeout: 1344561, lbolt_at_start: 1341561
Sep 12 19:20:48 netf-pd vmunix: lsp->state: 205
Sep 12 19:20:48 netf-pd vmunix: lbp->owner: 4eb6200
Sep 12 19:20:48 netf-pd vmunix: bp->b_dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: scb->io_id: 1137d
Sep 12 19:20:48 netf-pd vmunix: scb->cdb: 2a 00 00 87 13 40 00 00 40 00
Sep 12 19:20:48 netf-pd vmunix: lbolt_at_timeout: 1345261, lbolt_at_start: 1344761
Sep 12 19:20:48 netf-pd vmunix: lsp->state: 5
Sep 12 19:20:48 netf-pd vmunix:
Sep 12 19:20:48 netf-pd vmunix: SCSI: Abort abandoned -- lbolt: 1344811, dev: 1f008000, io_id: 11379, status: 200
Sep 12 19:21:12 netf-pd vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1f008000) is POWERFAILED
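
The scb->cdb lines in the log above can be read offline. The sketch below assumes the standard SCSI 10-byte command layout (opcode in byte 0, big-endian logical block address in bytes 2 to 5, transfer length in bytes 7 to 8), where opcode 0x2a is WRITE(10). The function names are made up for illustration, and the 100 ticks per second used to convert lbolt values is an assumption about this kernel, not something the log states.

```python
def decode_cdb10(cdb_hex):
    """Decode a 10-byte SCSI CDB given as space-separated hex bytes."""
    raw = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB")
    return {
        "opcode": raw[0],
        "is_write10": raw[0] == 0x2A,            # 0x2a = WRITE(10)
        "lba": int.from_bytes(raw[2:6], "big"),   # starting block address
        "blocks": int.from_bytes(raw[7:9], "big"),  # transfer length in blocks
    }


def ticks_to_seconds(lbolt_at_start, lbolt_at_timeout, hz=100):
    """Convert an lbolt difference to seconds (hz=100 is an assumption)."""
    return (lbolt_at_timeout - lbolt_at_start) / hz


if __name__ == "__main__":
    # The two CDBs copied from the log above.
    for cdb in ("2a 00 00 59 0e 34 00 00 02 00",
                "2a 00 00 87 13 40 00 00 40 00"):
        print(decode_cdb10(cdb))
    print(ticks_to_seconds(1341561, 1344561), "seconds allowed for the first request")
```

Under those assumptions, both timed-out requests were writes to the same device, 1f008000.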

I think vg00 may have a problem, but I can't tell for sure. Please tell me what's wrong with the server.
hihi
Patrick Chim
Trusted Contributor

Re: help,my box has problem again,fanit

Hi,

From the SCSI errors, it looks like the SCSI bus reset itself.

The last line of the log shows that a disk in volume group vg[0] (I think that is vg00) has failed. Is the light on that hard disk on now?

In my experience, the hard disk fails first. When the system gets no SCSI response from it, the bus resets itself, and that is what makes your system hang.

I think the best course is to call HP support and have it checked as soon as possible.

Regards,
Patrick
T G Manikandan
Honored Contributor
Solution

Re: help,my box has problem again,fanit

Check whether the disk c0t8d0 has gone bad. The device number in the log is
0x1f008000
The first two hex digits (1f = 31 decimal) are the major device number.
The next digits identify the disk:
00 - c0
8 - t8
0 - d0
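
The same breakdown can be sketched in a few lines of Python, run on any workstation rather than on the server. It assumes the minor-number layout described above (8 bits of controller instance, then 4 bits of target, then 4 bits of LUN, with the low 8 bits unused here); the function names are hypothetical.

```python
def decode_dev(dev):
    """Split an HP-UX style dev_t (as printed in the log) into its fields."""
    major = (dev >> 24) & 0xFF      # top two hex digits: major number
    minor = dev & 0xFFFFFF          # remaining six hex digits: minor number
    return {
        "major": major,
        "instance": (minor >> 16) & 0xFF,  # controller instance -> cN
        "target": (minor >> 12) & 0xF,     # SCSI target ID       -> tN
        "lun": (minor >> 8) & 0xF,         # logical unit number  -> dN
    }


def device_name(dev):
    """Build the cXtYdZ name from a dev_t value."""
    d = decode_dev(dev)
    return "c{instance}t{target}d{lun}".format(**d)


if __name__ == "__main__":
    dev = 0x1F008000  # value taken from the syslog lines in this thread
    print(decode_dev(dev)["major"], device_name(dev))
```

For 0x1f008000 this gives major 31 and the name c0t8d0, matching the manual decoding.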

These errors appear when the disk stops responding to requests.

They can also be caused by bad cabling or missing SCSI termination.

Check the hard disk by reading it from start to end:

dd if=/dev/dsk/c0t8d0 of=/dev/null bs=4096

If this succeeds, you should get
xxx+0 records in
xxx+0 records out

I think it is time to replace the disk.
Once the machine stops responding, nothing can be done other than a hard reboot.

Please post back with the result.

Thanks
thebeatlesguru
Regular Advisor

Re: help,my box has problem again,fanit

# dd if=/dev/dsk/c0t8d0 of=/dev/null bs=4096

2222889+0 records in
2222889+0 records out

Does this mean there is no problem with the hard disk?
hihi
T G Manikandan
Honored Contributor

Re: help,my box has problem again,fanit

Check for proper SCSI termination and cabling.

I have attached a document.

Also check for this patch:

PHKL_24004 1.0 SCSI IO Subsystem Cumulative Patch

Peter Kloetgen
Esteemed Contributor

Re: help,my box has problem again,fanit

Hi Guru,

If the dd command works, there is no problem with your disk.
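
As a rough check on what that dd run actually covered, the record counts it printed can be turned into a byte total. This is only a sketch: the helper names are hypothetical, it assumes the usual "full+partial records" output format, and it only counts full records (a partial record has an unknown size).

```python
import re


def parse_dd_records(line):
    """Return (full, partial) record counts from a dd summary line."""
    match = re.match(r"\s*(\d+)\+(\d+) records (in|out)", line)
    if match is None:
        raise ValueError("not a dd summary line: %r" % line)
    return int(match.group(1)), int(match.group(2))


def bytes_covered(line, block_size):
    """Bytes accounted for by full records at the given bs= value."""
    full, _partial = parse_dd_records(line)
    return full * block_size


if __name__ == "__main__":
    # Summary line and bs=4096 taken from the dd run posted in this thread.
    total = bytes_covered("2222889+0 records in", 4096)
    print(total, "bytes read")
```

A total close to the disk's rated capacity suggests the whole device was readable at the time of the test; it does not by itself rule out intermittent timeouts.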

The problem you describe sounds like a SCSI conflict to me. Please check that the SCSI IDs are set correctly. Is one of them assigned to two devices? That would cause a reset of your console and the kind of trouble you are seeing.

Always stay on the bright side of life!

Peter
I'm learning here as well as helping
thebeatlesguru
Regular Advisor

Re: help,my box has problem again,fanit

I have found that the problem was caused by one application process, and there is no problem with the hard disk.
Thanks, everyone.
hihi