<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: LVM POWERFAIL message in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898966#M403732</link>
    <description>Hi Kevin,&lt;BR /&gt;&lt;BR /&gt;at first - the re-seating and power-cycling of the disk might have worked, but You can only be sure by going through all the log entries the disk produced.&lt;BR /&gt;&lt;BR /&gt;cstm should give You the disk's current status, and dd'ing will ensure every single block is still readable.&lt;BR /&gt;the 'lbolt' errors mean HP-UX was unable to read from or write to a specific disk block. &lt;BR /&gt;usually bad block relocation is active and works automatically, so this should have been handled properly by LVM.&lt;BR /&gt;You should still check that everything is really available.&lt;BR /&gt;&lt;BR /&gt;I'm not completely sure that Your data is completely mirrored - try the following:&lt;BR /&gt;&lt;BR /&gt;# check mirror states&lt;BR /&gt;for vg in `vgdisplay -v | grep "VG Name" | awk '{print $3}'`; &lt;BR /&gt; do&lt;BR /&gt;   lvdisplay $vg/lv* | egrep 'Name|Mirror'&lt;BR /&gt;done&lt;BR /&gt;&lt;BR /&gt;- this should return something like:&lt;BR /&gt;&lt;BR /&gt;LV Name                     /dev/vg00/lvol9&lt;BR /&gt;VG Name                     /dev/vg00&lt;BR /&gt;Mirror copies               1&lt;BR /&gt;&lt;BR /&gt;not everything HAS to be mirrored, but a mirrored LV can serve I/O from the surviving copy when one disk fails, so the chance of the system locking up on outstanding I/O is much lower :)&lt;BR /&gt;&lt;BR /&gt;for every vg in the system, do &lt;BR /&gt;# check that no STALE PEs exist&lt;BR /&gt;lvdisplay -v /dev/vgNN/lvol* | grep -ci stale&lt;BR /&gt;(should return 0)&lt;BR /&gt;&lt;BR /&gt;if an LE is marked as stale, it depends:&lt;BR /&gt;if only one PE is stale, go ahead, let HP check it over and then replace the disk; if both copies are stale, data might have been lost - check for missing data, locate the root cause, and recover if possible (this is rarely the case :)&lt;BR /&gt;&lt;BR /&gt;for the housekeeping part, a weekly defrag is unnecessary - while extent-based filesystems like vxfs are more prone to some fragmentation, in most cases once per quarter is enough. exceptions might be highly volatile filesystems like mail spools, etc.&lt;BR /&gt;&lt;BR /&gt;If You can get some downtime on one of the next weekends, think about the following:&lt;BR /&gt;&lt;BR /&gt;- Ignite backup of the whole vg00 to tape:&lt;BR /&gt;/opt/ignite/bin/make_tape_recovery -A -a /dev/rmt/0mn&lt;BR /&gt;- reboot the system to runlevel 1&lt;BR /&gt;- run fsck -F vxfs -o full,nolog on all filesystems&lt;BR /&gt;&lt;BR /&gt;after that, and with no additional disk issues, I'd feel confident that everything is still in good shape.&lt;BR /&gt;&lt;BR /&gt;recommended reading would be the STM and EMS manuals, and You could think about running a &lt;BR /&gt;tail -f /var/adm/syslog/syslog.log&lt;BR /&gt;via ssh from Your workstation, so that You're immediately notified when something goes wrong and get a feeling for normal and less normal system behaviour.</description>
    <pubDate>Fri, 29 Apr 2005 13:32:51 GMT</pubDate>
    <dc:creator>Florian Heigl (new acc)</dc:creator>
    <dc:date>2005-04-29T13:32:51Z</dc:date>
    <item>
      <title>LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898963#M403729</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I had a problem on our L1000 machine running 11.11, where an Oracle 9.2.0.4 database was having an "Update Statistics" job run while the application that uses that database was active. After several minutes it became clear that there was some kind of deadlock in the system, as simple commands like "bdf" and "ls -ltr" did not respond in the telnet sessions where they were invoked. Also, it became impossible to log in via telnet as anything but root.&lt;BR /&gt;&lt;BR /&gt;I tried (logged on as root) to kill (using kill -9 pid) all of the user processes that may have been interfering with the Oracle Update Stats job, and all but two could be killed. So then the only thing that I could think of doing was to reboot the machine, which I tried to do. When I initiated the restart (shutdown -r 0) everything seemed to go OK, until it got to the point where it needed to unmount the file systems. This "hung" for 1.5 hours while I was at lunch (this machine is not critical in our environment). I checked the console at this point and saw the following text: &lt;BR /&gt;&lt;BR /&gt;LVM: vg[1]: pvnum=0 (dev_t=0x1f022000) is POWERFAILED&lt;BR /&gt;DIAGNOSTIC SYSTEM WARNING: &lt;BR /&gt;   The diagnostic logging facility has started receiving excessive errors from the I/O subsystem. I/O error entries will be lost until the cause of the excessive I/O logging is corrected.&lt;BR /&gt;If the diaglogd is not active use the Daemon Startup command in stm to start it.&lt;BR /&gt;If the diaglogd daemon is active, use the logtool utility in stm to determine which I/O subsystem is logging exces&lt;BR /&gt;&lt;BR /&gt;at which point the text was cut off as if in mid-sentence.&lt;BR /&gt;&lt;BR /&gt;I have to admit to being a novice when it comes to HP-UX system administration, so I turned to this forum for answers.&lt;BR /&gt;&lt;BR /&gt;I found several hits on "LVM POWERFAILED" and they seemed to indicate that there may be an imminent h/w failure of the disk c2t2d0 (which I identified from "ll /dev/rdsk | grep 22000", found in one of those hits).&lt;BR /&gt;&lt;BR /&gt;Since then I have done the following:&lt;BR /&gt;1) powered off the hung system&lt;BR /&gt;2) powered on the system (which booted normally, but had some lbolt errors in the syslog)&lt;BR /&gt;3) shutdown -h -y 0&lt;BR /&gt;4) removed power from the system&lt;BR /&gt;5) opened the cabinet and re-seated the drives&lt;BR /&gt;6) powered on the system (this time no errors in syslog)&lt;BR /&gt;&lt;BR /&gt;So, after all that, now for my question...&lt;BR /&gt;&lt;BR /&gt;Is there any way to check the disk that was reported in error, e.g. a utility similar to SCANDISK in Windoze, or some other utility that will report on the health of the disk? I am sure one must exist and I am assuming that it is my novice status that prevents me from finding it. Any other advice on housekeeping or such would also be welcome. I already run a defrag of the disk on a weekly basis.&lt;BR /&gt;&lt;BR /&gt;Thanks in advance for detailed answers.&lt;BR /&gt;&lt;BR /&gt;Regards&lt;BR /&gt;Kevin</description>
      <pubDate>Fri, 29 Apr 2005 10:54:47 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898963#M403729</guid>
      <dc:creator>Kevin Bingham</dc:creator>
      <dc:date>2005-04-29T10:54:47Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898964#M403730</link>
      <description>You can check with a dd:&lt;BR /&gt;&lt;BR /&gt;dd if=/dev/rdsk/c20t5d0 of=/dev/null bs=64k&lt;BR /&gt;&lt;BR /&gt;Change the c20t5d0 to your device...&lt;BR /&gt;&lt;BR /&gt;If there are any errors, the dd will abort...&lt;BR /&gt;&lt;BR /&gt;Rgds...Geoff</description>
      <pubDate>Fri, 29 Apr 2005 11:00:02 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898964#M403730</guid>
      <dc:creator>Geoff Wild</dc:creator>
      <dc:date>2005-04-29T11:00:02Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898965#M403731</link>
      <description>Kevin,&lt;BR /&gt;&lt;BR /&gt;If you have support tools installed, you can check for errors logged to the disk with stm.&lt;BR /&gt;&lt;BR /&gt;cstm&lt;BR /&gt;cstm&amp;gt;map&lt;BR /&gt;cstm&amp;gt;select device &amp;lt;#&amp;gt; -- # of disk from list&lt;BR /&gt;cstm&amp;gt;info&lt;BR /&gt;cstm&amp;gt;infolog&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Eric</description>
      <pubDate>Fri, 29 Apr 2005 11:07:37 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898965#M403731</guid>
      <dc:creator>erics_1</dc:creator>
      <dc:date>2005-04-29T11:07:37Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898966#M403732</link>
      <description>Hi Kevin,&lt;BR /&gt;&lt;BR /&gt;at first - the re-seating and power-cycling of the disk might have worked, but You can only be sure by going through all the log entries the disk produced.&lt;BR /&gt;&lt;BR /&gt;cstm should give You the disk's current status, and dd'ing will ensure every single block is still readable.&lt;BR /&gt;the 'lbolt' errors mean HP-UX was unable to read from or write to a specific disk block. &lt;BR /&gt;usually bad block relocation is active and works automatically, so this should have been handled properly by LVM.&lt;BR /&gt;You should still check that everything is really available.&lt;BR /&gt;&lt;BR /&gt;I'm not completely sure that Your data is completely mirrored - try the following:&lt;BR /&gt;&lt;BR /&gt;# check mirror states&lt;BR /&gt;for vg in `vgdisplay -v | grep "VG Name" | awk '{print $3}'`; &lt;BR /&gt; do&lt;BR /&gt;   lvdisplay $vg/lv* | egrep 'Name|Mirror'&lt;BR /&gt;done&lt;BR /&gt;&lt;BR /&gt;- this should return something like:&lt;BR /&gt;&lt;BR /&gt;LV Name                     /dev/vg00/lvol9&lt;BR /&gt;VG Name                     /dev/vg00&lt;BR /&gt;Mirror copies               1&lt;BR /&gt;&lt;BR /&gt;not everything HAS to be mirrored, but a mirrored LV can serve I/O from the surviving copy when one disk fails, so the chance of the system locking up on outstanding I/O is much lower :)&lt;BR /&gt;&lt;BR /&gt;for every vg in the system, do &lt;BR /&gt;# check that no STALE PEs exist&lt;BR /&gt;lvdisplay -v /dev/vgNN/lvol* | grep -ci stale&lt;BR /&gt;(should return 0)&lt;BR /&gt;&lt;BR /&gt;if an LE is marked as stale, it depends:&lt;BR /&gt;if only one PE is stale, go ahead, let HP check it over and then replace the disk; if both copies are stale, data might have been lost - check for missing data, locate the root cause, and recover if possible (this is rarely the case :)&lt;BR /&gt;&lt;BR /&gt;for the housekeeping part, a weekly defrag is unnecessary - while extent-based filesystems like vxfs are more prone to some fragmentation, in most cases once per quarter is enough. exceptions might be highly volatile filesystems like mail spools, etc.&lt;BR /&gt;&lt;BR /&gt;If You can get some downtime on one of the next weekends, think about the following:&lt;BR /&gt;&lt;BR /&gt;- Ignite backup of the whole vg00 to tape:&lt;BR /&gt;/opt/ignite/bin/make_tape_recovery -A -a /dev/rmt/0mn&lt;BR /&gt;- reboot the system to runlevel 1&lt;BR /&gt;- run fsck -F vxfs -o full,nolog on all filesystems&lt;BR /&gt;&lt;BR /&gt;after that, and with no additional disk issues, I'd feel confident that everything is still in good shape.&lt;BR /&gt;&lt;BR /&gt;recommended reading would be the STM and EMS manuals, and You could think about running a &lt;BR /&gt;tail -f /var/adm/syslog/syslog.log&lt;BR /&gt;via ssh from Your workstation, so that You're immediately notified when something goes wrong and get a feeling for normal and less normal system behaviour.</description>
      <pubDate>Fri, 29 Apr 2005 13:32:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898966#M403732</guid>
      <dc:creator>Florian Heigl (new acc)</dc:creator>
      <dc:date>2005-04-29T13:32:51Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898967#M403733</link>
      <description>Hi&lt;BR /&gt;&lt;BR /&gt;Do  lvdisplay -v /dev/vgxx/lvolxx  on all the LVs on the disk and look for any stale extents. Also, do&lt;BR /&gt;# pvdisplay -v /dev/dsk/cxtxdx&lt;BR /&gt;on your disk and check the PV status. If it is unavailable, you will have to replace the disk.&lt;BR /&gt;&lt;BR /&gt;James&lt;BR /&gt;</description>
      <pubDate>Fri, 29 Apr 2005 14:57:29 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898967#M403733</guid>
      <dc:creator>James George_1</dc:creator>
      <dc:date>2005-04-29T14:57:29Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898968#M403734</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;This message normally comes when the power to the drive is cycled. This can have any number of causes - a loose connection in the power connectors, a power connector extender, etc. You said that even after the power cycle it was displaying lbolt errors, and that you then re-seated the disk after powering off the machine again. Did you notice anything loose at that point?&lt;BR /&gt;&lt;BR /&gt;It seems it was an intermittent disk problem / loose connection. Still, it is not advisable to use this device for important data until you are sure it will not create any problems in the future. Good advice here would be to mirror this disk onto a spare in your system and observe it for some time, so that your system is not driven to deadlock if the disk happens to experience problems again. Although dd will check every block of the disk now, it is not reliable if your drive has something other than media problems, especially intermittent ones.&lt;BR /&gt;&lt;BR /&gt;HTH,&lt;BR /&gt;Devender</description>
      <pubDate>Fri, 29 Apr 2005 23:20:12 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898968#M403734</guid>
      <dc:creator>Devender Khatana</dc:creator>
      <dc:date>2005-04-29T23:20:12Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898969#M403735</link>
      <description>Hi&lt;BR /&gt;&lt;BR /&gt;Thanks to all those who have replied.&lt;BR /&gt;&lt;BR /&gt;More info about our machine: we do NOT use mirroring at all; the machine is used only for porting work.&lt;BR /&gt;&lt;BR /&gt;I have tried several of the (applicable) suggestions to find errors on the disk, but so far they come back clean. I am currently running "dd if=/dev/rdsk/c2t2d0 of=/dev/null bs=64k" and I guess that this will probably take a while to complete. I will let you know the results.&lt;BR /&gt;&lt;BR /&gt;Also, during the weekend we suffered a mains power failure and the UPS did not allow us to shut the machine down properly (it's quite low down in the pecking order as it is used for porting only), so the machine had a hard power-off restart. I checked the syslog and again there was an occurrence of the "lbolt" message, with details below:&lt;BR /&gt;&lt;BR /&gt;---begin snip---&lt;BR /&gt;May  2 20:25:19 hp2 vmunix: SCSI: First party detected bus hang -- lbolt: 3059, bus: 2&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             lbp-&amp;gt;state: 1060&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             lbp-&amp;gt;offset: 40&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             scb-&amp;gt;io_id: 2000003&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             scb-&amp;gt;cdb: 12 00 00 00 80 00&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             lbolt_at_timeout: 2959, lbolt_at_start: 2459&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             lsp-&amp;gt;state: 5&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:     scratch_lsp: 0000000041218800&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:     Pre-DSP script dump [fffffffff87ba020]:&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             00000000 00000000 41020000 f87ba290&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             78344000 0000000a 78351000 00000000&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:     Script dump [fffffffff87ba040]:&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             0e000005 f87ba540 e0100004 f87ba7f8&lt;BR /&gt;May  2 20:25:19 hp2 vmunix:             870b0000 f87ba2d8 98080000 00000005&lt;BR /&gt;May  2 20:25:19 hp2 vmunix: SCSI: Resetting SCSI -- lbolt: 3659, bus: 2&lt;BR /&gt;May  2 20:25:19 hp2 vmunix: SCSI: Reset detected -- lbolt: 3659, bus: 2&lt;BR /&gt;&lt;BR /&gt;---end snip---&lt;BR /&gt;&lt;BR /&gt;I don't know if this sheds any more light on the nature of the problem...&lt;BR /&gt;&lt;BR /&gt;Thanks in advance&lt;BR /&gt;Kevin&lt;BR /&gt;&lt;BR /&gt;PS: I have STM, but where do I find the docs for STM and EMS?</description>
      <pubDate>Tue, 03 May 2005 04:51:29 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898969#M403735</guid>
      <dc:creator>Kevin Bingham</dc:creator>
      <dc:date>2005-05-03T04:51:29Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898970#M403736</link>
      <description>Luckily for us the machine was still inside warranty, so I have a new disk and am busy transferring data... &lt;BR /&gt;&lt;BR /&gt;Thanks to all who replied.</description>
      <pubDate>Thu, 05 May 2005 06:39:04 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898970#M403736</guid>
      <dc:creator>Kevin Bingham</dc:creator>
      <dc:date>2005-05-05T06:39:04Z</dc:date>
    </item>
    <item>
      <title>Re: LVM POWERFAIL message</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898971#M403737</link>
      <description>as per previous reply</description>
      <pubDate>Thu, 05 May 2005 06:39:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/lvm-powerfail-message/m-p/4898971#M403737</guid>
      <dc:creator>Kevin Bingham</dc:creator>
      <dc:date>2005-05-05T06:39:40Z</dc:date>
    </item>
  </channel>
</rss>