POWERFAILED message in syslog

 
SOLVED
Craig Williams
Frequent Advisor

Hi,

We are having a problem with an HP-UX B.11.00 A 9000/800 server: we get the error messages below in the syslog, after which we can no longer log in to the server and have to reboot it. It stays up for a while and then the problem happens again.

Here is the error we are getting in the syslog:-

Oct 4 08:34:29 bhxcs06 vmunix: DIAGNOSTIC SYSTEM WARNING:
Oct 4 08:34:29 bhxcs06 vmunix: The diagnostic logging facility has started receiving excessive
Oct 4 08:34:29 bhxcs06 vmunix: errors from the I/O subsystem. I/O error entries will be lost
Oct 4 08:34:29 bhxcs06 vmunix: until the cause of the excessive I/O logging is corrected.
Oct 4 08:34:29 bhxcs06 vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Oct 4 08:34:29 bhxcs06 vmunix: in stm to start it.
Oct 4 08:34:29 bhxcs06 vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Oct 4 08:34:29 bhxcs06 vmunix: to determine which I/O subsystem is logging excessive errors.
Oct 4 08:34:36 bhxcs06 vmunix:
Oct 4 08:34:36 bhxcs06 vmunix: SCSI: Request Timeout -- lbolt: 6735469, dev: 1f03f000
Oct 4 08:34:36 bhxcs06 vmunix: lbp->state: 4060
Oct 4 08:34:36 bhxcs06 vmunix: lbp->offset: f8
Oct 4 08:34:36 bhxcs06 vmunix: lbp->uPhysScript: f87fb000
Oct 4 08:34:36 bhxcs06 vmunix: From most recent interrupt:
Oct 4 08:34:36 bhxcs06 vmunix: ISTAT: 22, SIST0: 00, SIST1: 04, DSTAT: 00, DSPS: f87fb500
Oct 4 08:34:36 bhxcs06 vmunix: lsp: 42186000
Oct 4 08:34:36 bhxcs06 vmunix: bp->b_dev: 1f03f000
Oct 4 08:34:36 bhxcs06 vmunix: scb->io_id: 30878cc
Oct 4 08:34:36 bhxcs06 vmunix: scb->cdb: 2a 00 00 a4 28 96 00 00 02 00

Oct 4 08:34:36 bhxcs06 vmunix: lbolt_at_timeout: 6735269, lbolt_at_start: 6732269
Oct 4 08:34:36 bhxcs06 vmunix: lsp->state: 205
Oct 4 08:34:36 bhxcs06 vmunix: lbp->owner: 43843200
Oct 4 08:34:36 bhxcs06 vmunix: bp->b_dev: 1f03f000
Oct 4 08:34:36 bhxcs06 vmunix: scb->io_id: 30878d9
Oct 4 08:34:36 bhxcs06 vmunix: scb->cdb: 2a 00 00 bc 66 a0 00 00 10 00

Oct 4 08:34:36 bhxcs06 vmunix: lbolt_at_timeout: 6735969, lbolt_at_start: 6735469
Oct 4 08:34:36 bhxcs06 vmunix: lsp->state: 5
Oct 4 08:34:37 bhxcs06 vmunix:
Oct 4 08:34:37 bhxcs06 vmunix: SCSI: Abort abandoned -- lbolt: 6735504, dev: 1f03f000, io_id: 30878cc, status: 200
Oct 4 08:34:38 bhxcs06 vmunix: SCSI: Request Timeout -- lbolt: 6735669, dev: 1f03f000
Oct 4 08:34:38 bhxcs06 vmunix: lbp->state: 4060
Oct 4 08:34:38 bhxcs06 vmunix: lbp->offset: f8
Oct 4 08:34:38 bhxcs06 vmunix:
Oct 4 08:34:38 bhxcs06 vmunix: lbp->uPhysScript: f87fb000
Oct 4 08:34:38 bhxcs06 vmunix: From most recent interrupt:
Oct 4 08:34:38 bhxcs06 vmunix: ISTAT: 22, SIST0: 00, SIST1: 04, DSTAT: 00, DSPS: f87fb500
Oct 4 08:34:38 bhxcs06 vmunix: lsp: 42343600
Oct 4 08:34:38 bhxcs06 vmunix: bp->b_dev: 1f03f000
Oct 4 08:34:38 bhxcs06 vmunix: scb->io_id: 30878cd
Oct 4 08:34:38 bhxcs06 vmunix: scb->cdb: 2a 00 00 01 04 f0 00 00 10 00

Oct 4 08:34:38 bhxcs06 vmunix: lbolt_at_timeout: 6735469, lbolt_at_start: 6732469
Oct 4 08:34:38 bhxcs06 vmunix: lsp->state: 205
Oct 4 08:34:38 bhxcs06 vmunix: lbp->owner: 4f379100
Oct 4 08:34:38 bhxcs06 vmunix: bp->b_dev: 1f03f000
Oct 4 08:34:38 bhxcs06 vmunix: scb->io_id: 30878d1
Oct 4 08:34:38 bhxcs06 vmunix: scb->cdb: 2a 00 00 ab 43 b0 00 00 10 00

Oct 4 08:34:38 bhxcs06 vmunix: lbolt_at_timeout: 6736169, lbolt_at_start: 6735669
Oct 4 08:34:38 bhxcs06 vmunix: lsp->state: 5
Oct 4 08:34:39 bhxcs06 vmunix:
Oct 4 08:34:39 bhxcs06 vmunix: SCSI: Abort abandoned -- lbolt: 6735711, dev: 1f03f000, io_id: 30878cd, status: 200
Oct 4 08:35:05 bhxcs06 vmunix: LVM: vg[1]: pvnum=0 (dev_t=0x1f03f000) is POWERFAILED


Francis_12
Trusted Contributor
Solution

Re: POWERFAILED message in syslog

Hello,

It seems that you have a major disk problem at the SCSI layer.

Oct 4 08:34:36 bhxcs06 vmunix: bp->b_dev: 1f03f000

The faulty device should be c3t15d0
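
For anyone wondering where c3t15d0 comes from, here is a minimal Python sketch of the decoding (run it on any workstation, not necessarily the HP-UX box). It assumes the usual HP-UX 11.x layout for sdisk device numbers: the top 8 bits are the major number, then 8 bits of card instance, 4 bits of target and 4 bits of LUN. The function name `decode_dev_t` is made up for this example.

```python
def decode_dev_t(dev_hex):
    """Split a 32-bit HP-UX dev_t (hex string) into major, instance, target, LUN.

    Assumed layout: major = bits 31-24, instance = bits 23-16,
    target = bits 15-12, LUN = bits 11-8 (low 8 bits are driver options).
    """
    dev = int(dev_hex, 16)
    major = (dev >> 24) & 0xFF
    instance = (dev >> 16) & 0xFF
    target = (dev >> 12) & 0xF
    lun = (dev >> 8) & 0xF
    return major, instance, target, lun


major, instance, target, lun = decode_dev_t("1f03f000")
# Major 31 (0x1f) is the sdisk block driver on HP-UX 11.x.
print(f"major={major} device=c{instance}t{target}d{lun}")
```

For 1f03f000 this gives major 31 and c3t15d0, which can be cross-checked against `ll /dev/dsk` or `lssf` on the server.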

Oct 4 08:34:36 bhxcs06 vmunix: scb->cdb: 2a 00 00 a4 28 96 00 00 02 00

The error appeared during a write operation.
#define CMDwrite_ext 0x2A
#define CMDwrite10 0x2A
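
The rest of the CDB can be read the same way. This is a small Python sketch, assuming the standard SCSI WRITE(10) layout (byte 0 opcode, bytes 2-5 big-endian logical block address, bytes 7-8 big-endian transfer length in blocks). The helper name `parse_write10` is only for illustration.

```python
def parse_write10(cdb_text):
    """Parse a 10-byte SCSI CDB given as space-separated hex bytes."""
    cdb = bytes(int(b, 16) for b in cdb_text.split())
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode = cdb[0]
    lba = int.from_bytes(cdb[2:6], "big")      # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # transfer length in blocks
    return opcode, lba, blocks


opcode, lba, blocks = parse_write10("2a 00 00 a4 28 96 00 00 02 00")
print(f"opcode=0x{opcode:02x} lba=0x{lba:08x} blocks={blocks}")
```

Every CDB in the posted log starts with 2a, so all the timed-out requests were writes, to different block addresses on the same disk.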

LVM has therefore marked the faulty device as powerfailed.

You should log a HW call and ask for a HW engineer to replace that disk.

Hope this helps, Bye.

Francis DERDEYN - HP-UX ASCE.
Michael Steele_2
Honored Contributor

Re: POWERFAILED message in syslog

There are several references to the same device: DEV 1f03f000

And DEV 1f03f000 = c3t16d0

Increasing the timeout to disk often works:

pvchange -t 180 /dev/dsk/c3t1d16
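
To see how long the requests in the posted log waited before timing out, the lbolt values can be subtracted. A rough Python sketch, assuming lbolt counts clock ticks at 100 per second (that tick rate is my assumption, check it on your system); the name `ticks_to_seconds` is made up here:

```python
HZ = 100  # assumed lbolt tick rate (ticks per second)


def ticks_to_seconds(lbolt_at_start, lbolt_at_timeout, hz=HZ):
    """Elapsed time between request start and timeout, in seconds."""
    return (lbolt_at_timeout - lbolt_at_start) / hz


# (lbolt_at_start, lbolt_at_timeout) pairs copied from the syslog above
pairs = [
    (6732269, 6735269),
    (6735469, 6735969),
    (6732469, 6735469),
    (6735669, 6736169),
]
for start, timeout in pairs:
    print(start, timeout, ticks_to_seconds(start, timeout), "seconds")
```

Under that assumption the longer waits come out at 3000 ticks, i.e. about 30 seconds, which is what you would be raising with pvchange -t.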
##########################################
#
##########################################

This error:

LVM: vg[1]: pvnum=0 (dev_t=0x1f03f000)

Is referring to the volume group where c3t16d0 resides. LVM: vg[1] can be found if you list out the vgs in /etc/lvmtab.

strings /etc/lvmtab

vg[1] will be the second vg listed.
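
If the list is long, the output of strings /etc/lvmtab can be grouped by volume group with a few lines of Python. This is only a sketch: it assumes physical volumes show up as /dev/dsk/... paths and any other /dev/... line is a volume group name, and the sample lines below are invented for illustration, not taken from this server.

```python
def group_lvmtab(lines):
    """Group `strings /etc/lvmtab` output into [(vg_path, [pv_paths]), ...]."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line.startswith("/dev/"):
            continue  # skip anything that is not a device path
        if line.startswith("/dev/dsk/"):
            if groups:
                groups[-1][1].append(line)  # PV belongs to the latest VG
        else:
            groups.append((line, []))       # a new volume group starts here
    return groups


# Hypothetical sample output, for illustration only
sample = ["/dev/vg00", "/dev/dsk/c0t6d0", "/dev/vg01", "/dev/dsk/c3t15d0"]
for index, (vg, pvs) in enumerate(group_lvmtab(sample)):
    print(f"vg[{index}] = {vg}: {', '.join(pvs)}")
```

On the real system, save the strings output to a file and pass its lines to the function; the entry printed as vg[1] is the one to look at.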
##########################################
#
##########################################

Please attach the following for further evaluation:

STM > TOOLS > UTILITY > RUN > LOGTOOL > FILE > VIEW > RAW SUMMARY.
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: POWERFAILED message in syslog

Sorry, still working on coffee this morning: c3t15d0 is correct, so the device in my previous post should read /dev/dsk/c3t15d0.
Support Fatherhood - Stop Family Law
Steven E. Protter
Exalted Contributor

Re: POWERFAILED message in syslog

You may have a disk that failed, or someone might have unplugged your disks from power for a time period.

lbolt messages like these have always led to eventual disk replacement in my shop, except when we swapped out hot-swap drives. When that is done, a similar message is repeated in the logs until the system is rebooted.

I am attaching a handy script named disk.status to this post. If a disk has dropped off the face of the machine, it will still be listed in /etc/lvmtab. This program will spot that and alert you. I would recommend frequent runs until you have the problem nailed down.

If you use the script, do change the email address. I've gotten disk.status emails from a half dozen systems in the past few weeks. I don't own any of them.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
KCS_1
Respected Contributor

Re: POWERFAILED message in syslog

Hi,

First, see the document at the URL below, which covers the problem related to the message above.

http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&admit=-1335382922+1065421825446+28353475&docId=200000062972816

After that, if you still have a problem,

Verify whether the problem with c3t15d0 in vg[1] is intermittent or a hard failure, using commands like the following:

# diskinfo -v /dev/rdsk/c3t15d0

# ioscan -fnkCdisk

# dd if=/dev/rdsk/c3t15d0 of=/dev/null bs=64k

You might find some abnormal symptoms.


Thanks.


Easy going at all.