HPE 9000 and HPE e3000 Servers

powerfailed message in syslog

 
Dave Chamberlin
Trusted Contributor


While reviewing my syslog.log file, I found a number of messages from a few days ago that began with:
LVM: vg[4]:pvnum=0 (dev_t=0x1f003000) is POWERFAILED
SCSI: request timeout.....
(many other related lines....)
The messages haven't repeated since. Is this telling me that one of my disks is going bad?
Patrick Wessel
Honored Contributor

Re: powerfailed message in syslog

Dave,
This message does not tell you that one of your disks has a hardware problem. It means that LVM wasn't able to complete a request to that disk, or get a response from it, within a defined timeframe. The people who wrote LVM assumed a power-failed disk would be the only possible cause, but it isn't.

You can decode the dev_t address (0x1f003000) like this:

1f: major Number
00: bus
3: target
0: lun
00: some flags
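
The breakdown above can be written as a few bit shifts. This is only a minimal Python sketch: it assumes the field widths implied by the hex digits in the list (8-bit major, 8-bit bus, 4-bit target, 4-bit LUN, 8 bits of flags), the function names are made up for illustration, and the cXtYdZ name assumes the usual device file naming where X is the bus instance, Y the target and Z the LUN.

```python
def decode_dev_t(dev_t: int) -> dict:
    """Split a 32-bit dev_t into the fields listed in the post above."""
    return {
        "major": (dev_t >> 24) & 0xFF,  # top byte, 0x1f in the example
        "bus": (dev_t >> 16) & 0xFF,    # next byte, 0x00
        "target": (dev_t >> 12) & 0xF,  # one hex digit, 0x3
        "lun": (dev_t >> 8) & 0xF,      # one hex digit, 0x0
        "flags": dev_t & 0xFF,          # low byte, 0x00
    }


def device_name(dev_t: int) -> str:
    """Build a cXtYdZ name (assumed mapping: X=bus, Y=target, Z=LUN)."""
    f = decode_dev_t(dev_t)
    return f"c{f['bus']}t{f['target']}d{f['lun']}"


if __name__ == "__main__":
    print(decode_dev_t(0x1F003000))
    print(device_name(0x1F003000))
```

The bus number is the instance shown by ioscan, so the two outputs can be compared directly.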

You can figure out on which bus the disk is by:
ioscan -fC ext_bus

One cause of the message you found is a high I/O load. If the message doesn't reappear, I wouldn't worry about it.
There is no good troubleshooting with bad data
Brian M. Fisher
Honored Contributor
Solution

Re: powerfailed message in syslog

The errors could be the result of one or more of the following:


If the error is accompanied by a message about pv[#] returned to vg[#], then the error can usually be attributed to a low timeout value on the disk driver. By default, this timeout is 30 seconds.

Increase the timeout up to the maximum of 180 seconds:
pvchange -t 180 /dev/dsk/disk_device

Increasing the timeout will not affect I/O performance on the disk.


Make sure that the latest SCSI/LVM patch (and its dependencies) is installed. For s800 10.20, this patch is:
PHKL_16751 :LVM:JFS:PCI:SCSI:SIG_IGN:SIGCLD:LITS:

As with all patches, please use the Patch Database at http://itrc.hp.com to determine the latest version.


Check for an I/O bottleneck on the disk.
sar -d

A high amount of traffic on a disk can cause severe performance problems and can cause requests to time out.
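
If you capture the sar -d output to a file, a small script can pick out the busiest disks. The following is a rough Python sketch, not an HP tool: it assumes each data row has the device name (cXtYdZ) immediately followed by the %busy column, the 50% threshold is an arbitrary choice, and the sample figures are invented purely to show the shape of the input.

```python
import re

# A device name such as c0t3d0 followed by its %busy figure (assumed layout).
ROW = re.compile(r"\b(c\d+t\d+d\d+)\s+(\d+(?:\.\d+)?)")


def busy_disks(sar_output: str, threshold: float = 50.0) -> dict:
    """Return {device: highest %busy seen} for devices at or above threshold."""
    peak = {}
    for line in sar_output.splitlines():
        if line.lstrip().startswith("Average"):
            continue  # skip summary rows, keep only the interval samples
        match = ROW.search(line)
        if match:
            dev, busy = match.group(1), float(match.group(2))
            peak[dev] = max(busy, peak.get(dev, 0.0))
    return {dev: busy for dev, busy in peak.items() if busy >= threshold}


# Invented sample data, for illustration only.
SAMPLE = """\
12:00:05   device   %busy   avque   r+w/s  blks/s  avwait  avserv
12:00:10   c0t3d0   92.00    4.10     310    4960   18.20   24.70
           c0t6d0    3.50    0.50       6      96    4.90    9.80
12:00:15   c0t3d0   88.00    3.70     295    4720   17.10   23.90
           c0t6d0    2.00    0.50       4      64    5.10   10.20
"""

if __name__ == "__main__":
    print(busy_disks(SAMPLE))
```

A disk that shows up here around the time of the POWERFAILED messages points toward load rather than failing hardware.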


If the error is NOT accompanied by a message about pv[#] returned to vg[#], then the error can usually be attributed to a hardware problem on the disk. DO NOT install patches on the system until the hardware has been diagnosed.

Change the timeout value on the disk and watch for further errors. Contact HP Hardware Support immediately if the errors persist.

If the powerfail messages are accompanied by lbolt errors, for example:

SCSI: Request Timeout -- lbolt: #######, dev: ######

check the SCSI controller connections/terminators. Make sure all connections are tight. If the errors persist, contact HP Hardware support immediately.


Check for an I/O bottleneck on the disk.
sar -d

A high amount of traffic on a disk can cause severe performance problems and can mimic a hardware issue.
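
The checks above amount to a small decision tree, which can be sketched as a script run over a copy of syslog.log. This is only an illustrative Python sketch: the POWERFAILED pattern is taken from the message quoted in the original question, while matching the "returned to vg" and "lbolt" messages by plain substring is an assumption, since their exact wording may differ on your system. The function name and category labels are made up.

```python
import re

# Pattern based on the message quoted in the original question.
POWERFAILED = re.compile(
    r"LVM: vg\[(\d+)\]:\s*pvnum=(\d+) \(dev_t=(0x[0-9a-fA-F]+)\) is POWERFAILED"
)


def triage(syslog_lines):
    """Return (category, [dev_t values]) following the checklist above."""
    lines = list(syslog_lines)
    devices = []
    for line in lines:
        match = POWERFAILED.search(line)
        if match:
            devices.append(match.group(3))
    if not devices:
        return ("none", [])
    # Assumed substring checks; adjust to the wording in your own log.
    if any("returned to vg" in line.lower() for line in lines):
        return ("timeout", devices)   # raise the PV timeout, check patches and load
    if any("lbolt" in line.lower() for line in lines):
        return ("cabling", devices)   # check SCSI connections and terminators
    return ("hardware", devices)      # suspect the disk, contact hardware support


# The two lines quoted in the original question.
SAMPLE = [
    "LVM: vg[4]:pvnum=0 (dev_t=0x1f003000) is POWERFAILED",
    "SCSI: request timeout.....",
]

if __name__ == "__main__":
    print(triage(SAMPLE))
```

The result is only a hint about where to look first; it does not replace checking disk activity with sar -d for the same time window.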

This information is from HP document#KBRC00000668
http://us-support2.external.hp.com/cki/bin/doc.pl/

Brian
<*(((>< er
Perception IS Reality
Dave Chamberlin
Trusted Contributor

Re: powerfailed message in syslog

I checked my system activity for the time of the errors and there was minimal I/O. I did not get the "...returned to vg..." messages, so it may be a hardware problem. I have the PHKL_16751 patch installed and the connections are tight. The disk is mirrored, so I will watch for further errors. Thanks for the input.
John Meyer
New Member

Re: powerfailed message in syslog

Hi,

I have had the same problem here. If the disks are Jamaica hot-swap drives, we have found that a number of them have bad header boards, which need to be replaced, and the firmware needs to be upgraded. Contact HP and have a CE come on-site to look at it.

Re: powerfailed message in syslog

There is some hardware problem with the disk or the controller in the VG. To find the disk, do the following:
ll /dev/dsk | grep 003000
This will display a disk, c0t3d0 (minor number 0x003000 is bus 0, target 3, LUN 0).
lssf /dev/dsk/c0t3d0
From this you can find the controller hardware path.
run cstm
cstm> map
cstm>run logtool
logtool > fl
From the logtool output you can confirm whether it was an intermittent problem with the disk. The PV timeout is applicable only for EMC disks.
From your error, the problem could be with the disk, or with the controller if the controller has only one disk.
Sundar