Operating System - HP-UX

help,my box has problem again,fanit

 
thebeatlesguru
Regular Advisor


This morning I could not telnet or ftp to one of my HP servers. I suspected inetd had died, so I went to the machine room to log in on the console. After I typed "root" and pressed Enter, the password prompt never appeared; the console seemed to be dead as well.
In the end I restarted the server and found the following in OLDsyslog.log:
Sep 12 19:20:23 netf-pd vmunix: DIAGNOSTIC SYSTEM WARNING:
Sep 12 19:20:23 netf-pd vmunix: The diagnostic logging facility has started receiving excessive
Sep 12 19:20:23 netf-pd vmunix: errors from the I/O subsystem. I/O error entries will be lost
Sep 12 19:20:23 netf-pd vmunix: until the cause of the excessive I/O logging is corrected.
Sep 12 19:20:23 netf-pd vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Sep 12 19:20:23 netf-pd vmunix: in stm to start it.
Sep 12 19:20:23 netf-pd vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Sep 12 19:20:23 netf-pd vmunix: to determine which I/O subsystem is logging excessive errors.
Sep 12 19:20:48 netf-pd vmunix:
Sep 12 19:20:48 netf-pd vmunix: SCSI: Request Timeout -- lbolt: 1344761, dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: lbp->state: 20
Sep 12 19:20:48 netf-pd vmunix: lbp->offset: 0
Sep 12 19:20:48 netf-pd vmunix: lbp->uPhysScript: 480000
Sep 12 19:20:48 netf-pd vmunix: From most recent interrupt:
Sep 12 19:20:48 netf-pd vmunix: ISTAT: 22, SIST0: 00, SIST1: 04, DSTAT: 00, DSPS: 00480500
Sep 12 19:20:48 netf-pd vmunix: lsp: 444de00
Sep 12 19:20:48 netf-pd vmunix: bp->b_dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: scb->io_id: 11379
Sep 12 19:20:48 netf-pd vmunix: scb->cdb: 2a 00 00 59 0e 34 00 00 02 00
Sep 12 19:20:48 netf-pd vmunix: lbolt_at_timeout: 1344561, lbolt_at_start: 1341561
Sep 12 19:20:48 netf-pd vmunix: lsp->state: 205
Sep 12 19:20:48 netf-pd vmunix: lbp->owner: 4eb6200
Sep 12 19:20:48 netf-pd vmunix: bp->b_dev: 1f008000
Sep 12 19:20:48 netf-pd vmunix: scb->io_id: 1137d
Sep 12 19:20:48 netf-pd vmunix: scb->cdb: 2a 00 00 87 13 40 00 00 40 00
Sep 12 19:20:48 netf-pd vmunix: lbolt_at_timeout: 1345261, lbolt_at_start: 1344761
Sep 12 19:20:48 netf-pd vmunix: lsp->state: 5
Sep 12 19:20:48 netf-pd vmunix:
Sep 12 19:20:48 netf-pd vmunix: SCSI: Abort abandoned -- lbolt: 1344811, dev: 1f008000, io_id: 11379, status: 200
Sep 12 19:21:12 netf-pd vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1f008000) is POWERFAILED

I think vg00 may have a problem, but I can't tell for sure. Please tell me what is wrong with the server.
hihi
Patrick Chim
Trusted Contributor

Re: help,my box has problem again,fanit

Hi,

For the SCSI errors, it looks like the SCSI bus reset itself.

The last line of the log shows that a disk in vg[0] (I think that is vg00) has failed. Is the light on that hard disk on now?

In my experience the hard disk fails first. When the system gets no SCSI response from it, the bus resets itself, and that is what makes your system hang.

I think the best thing is to call HP support and have it checked as soon as possible.

Regards,
Patrick
T G Manikandan
Honored Contributor
Solution

Re: help,my box has problem again,fanit

Check whether disk c0t8d0 has gone bad. The device number in the log is 0x1f008000.
The first two hex digits (1f, 31 decimal) are the major device number.
The next digits are
00 - c0
8 - t8
0 - d0
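
The breakdown above can be sketched as a small shell function. This is only an illustration: the function name decode_sdisk_dev is made up here, and the bit layout (8-bit major, 8-bit card instance, 4-bit target, 4-bit LUN) is an assumption that matches the digits listed above.

```shell
# Hypothetical helper: split a dev_t such as 0x1f008000 into its parts.
# Assumed layout: major (8 bits) | card instance (8) | target (4) | LUN (4) | other bits (8)
decode_sdisk_dev() {
    dev=$(( $1 ))
    major=$(( (dev >> 24) & 0xff ))   # 0x1f -> 31
    card=$(( (dev >> 16) & 0xff ))    # 0x00 -> c0
    target=$(( (dev >> 12) & 0xf ))   # 0x8  -> t8
    lun=$(( (dev >> 8) & 0xf ))       # 0x0  -> d0
    echo "major=$major c${card}t${target}d${lun}"
}

decode_sdisk_dev 0x1f008000   # prints: major=31 c0t8d0
```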

These errors appear when the disk stops responding to requests.
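
The requests that timed out can be read from the scb->cdb bytes in the posted log. As a rough sketch (decode_cdb10 is a hypothetical name, and it only handles the 10-byte CDB format), opcode 2a is the SCSI WRITE(10) command, bytes 2 to 5 hold the logical block address and bytes 7 to 8 the transfer length in blocks:

```shell
# Hypothetical helper: decode a 10-byte SCSI CDB given as space-separated hex bytes.
decode_cdb10() {
    set -- $1                      # split the string into one byte per positional parameter
    op=$1
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    len=$(( (0x$8 << 8) | 0x$9 ))
    case $op in
        2a) name="WRITE(10)" ;;
        28) name="READ(10)" ;;
        *)  name="opcode 0x$op" ;;
    esac
    echo "$name lba=$lba blocks=$len"
}

decode_cdb10 "2a 00 00 59 0e 34 00 00 02 00"   # prints: WRITE(10) lba=5836340 blocks=2
decode_cdb10 "2a 00 00 87 13 40 00 00 40 00"   # prints: WRITE(10) lba=8852288 blocks=64
```

Both entries are writes to the same device, so the disk was being written to when it stopped answering.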

Improper cabling or missing termination could also be the cause.

Check your hard disk with

dd if=/dev/dsk/c0t8d0 of=/dev/null bs=4096

If this is successful you should get
xxx+0 records in
xxx+0 records out
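
If you want a script to act on the result, the exit status of dd can be checked instead of reading the counts by eye. A minimal sketch, where read_check is a made-up wrapper name and the device path is the one from this thread:

```shell
# Hypothetical wrapper: read a whole device (or file) and report whether dd succeeded.
read_check() {
    if dd if="$1" of=/dev/null bs=4096; then
        echo "read OK: $1"
    else
        echo "read FAILED: $1"
        return 1
    fi
}

# Example call (as root, on the affected server):
#   read_check /dev/dsk/c0t8d0
```

A clean pass only shows that every block was readable at that moment; it does not by itself rule out cabling or termination faults.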

I think it is time to replace the disk.
Once the machine stops responding, nothing can be done other than a hard reboot.

Please report back.

Thanks
thebeatlesguru
Regular Advisor

Re: help,my box has problem again,fanit

# dd if=/dev/dsk/c0t8d0 of=/dev/null bs=4096

2222889+0 records in
2222889+0 records out


Does this mean there is no problem with the hard disk?
hihi
T G Manikandan
Honored Contributor

Re: help,my box has problem again,fanit

Check for proper SCSI termination and cabling.

I have attached a document.

Also check for the patch

PHKL_24004 1.0 SCSI IO Subsystem Cumulative Patch

Peter Kloetgen
Esteemed Contributor

Re: help,my box has problem again,fanit

Hi Guru,

If the dd command works, there is no problem with your disk.

The problem you describe sounds like a SCSI conflict to me. Please check that your SCSI IDs are set correctly. Has one ID been given to two devices? That would cause a reset of your console and trouble like you are having.

Always stay on the bright side of life!

Peter
I'm learning here as well as helping
thebeatlesguru
Regular Advisor

Re: help,my box has problem again,fanit

I have found that the problem was caused by an application process; the hard disk is fine.
Thanks, everyone.
hihi