HPE 9000 and HPE e3000 Servers

james below_1
Occasional Contributor

resolving POWERFAILED problems.

Hi everyone,

About a month ago, I attached a 36gb hvd-fwd drive to my external hvd-fwd port on my k360.
This controller is the same as my internal drives (10/0.[2-6]0).

I can copy large quantities of data (100 MB tar files) from the new drive onto the old drive, and I can copy lots of data to the new drive, without any problems.

But when I try extracting a 100 MB+ tar file on the new drive, I get the following (on HP-UX 10.20):

Apr 26 09:45:05 arroyo vmunix: LVM: VG 1 : PV 0 (device 0x1f002000) is POWERFAILED
Apr 26 09:45:05 arroyo vmunix: LVM: Recovered Path (device 0x1f002000) to PV 0 in VG 1.
Apr 26 09:45:05 arroyo vmunix: LVM: Restored PV 0 to VG 1.

I may only get 1 message or I may get a couple dozen.

Under HP-UX 11, I saw more error messages, like:
Apr 15 22:43:01 arroyo vmunix: SCSI: Read error -- dev: b 31 0x001000, errno: 126, resid: 1024,
Apr 15 22:43:01 arroyo vmunix: blkno: 7583038, sectno: 15166076, offset: -824903680, bcount: 1024.
Apr 15 22:43:01 arroyo vmunix: SCSI: Write error -- dev: b 31 0x001000, errno: 126, resid: 5120,
Apr 15 22:43:01 arroyo vmunix: blkno: 7530960, sectno: 15061920, offset: -878231552, bcount: 5120.
Apr 15 22:43:01 arroyo vmunix: SCSI: Read error -- dev: b 31 0x001000, errno: 126, resid: 8192,
Apr 15 22:43:01 arroyo vmunix: blkno: 1756800, sectno: 3513600, offset: 1798963200, bcount: 8192.
Apr 15 22:43:01 arroyo vmunix: LVM: Recovered Path (device 0x1f001000) to PV 0 in VG 0.
Apr 15 22:43:01 arroyo vmunix: LVM: Restored PV 0 to VG 0.
---------end cut.

This is my only hpux system. I have tested the drive/cable/terminator on one of my Sun Solaris systems and it performs wonderfully.

One question I have: if this is a controller error, why don't I see these messages from the internal drives, since they are all on the same bus?

If anyone has any ideas that would be great.

I guess I should also mention that the 36 GB external drive is an SE drive with an SE<->HVD converter in the middle.

If this should go to a different group please let me know.

Thanks.
5 REPLIES
Anu Mathew
Valued Contributor

Re: resolving POWERFAILED problems.

Hi James,

The log entries in your posting are not all from the same date: they show Apr 15 and Apr 26, as far as VG[0] is concerned.

On Apr 15, PV[0] on VG[0] had some SCSI read errors.

I would run some read tests on that drive, PV[0], and maybe even get it replaced.

It looks like intermittent read/write errors.

Please note that VG 0 means the first VG and PV 0 means the first volume in the concerned VG.
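
As a rough illustration of how the device numbers in these log lines can be read, the sketch below splits a value such as 0x1f002000 into a major number and a cXtYdZ name. It assumes the usual HP-UX sdisk layout (8-bit major, then 8-bit card instance, 4-bit target, 4-bit LUN). The function name decode_lvm_dev is made up for this example, and the result should be checked against ls -l /dev/dsk on the actual system.

```shell
# Hypothetical helper: decode a device number as printed in LVM messages.
# Assumed layout: bits 31-24 major, 23-16 card instance, 15-12 target, 11-8 LUN.
decode_lvm_dev() {
    dev=$(( $1 ))
    major=$(( (dev >> 24) & 0xff ))
    card=$(( (dev >> 16) & 0xff ))
    target=$(( (dev >> 12) & 0xf ))
    lun=$(( (dev >> 8) & 0xf ))
    echo "major=${major} c${card}t${target}d${lun}"
}

decode_lvm_dev 0x1f002000   # prints: major=31 c0t2d0
decode_lvm_dev 0x1f001000   # prints: major=31 c0t1d0
```

Under that assumption, the Apr 26 messages (0x1f002000) and the Apr 15 messages (0x1f001000) would point at different SCSI targets on the same card.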

Hope this helps,

Anu Mathew
S.K. Chan
Honored Contributor

Re: resolving POWERFAILED problems.

I would probably run the "exerciser" in stm to determine whether there is a disk problem. That way you can determine whether it's the controller or the disk that's "screwy". I'm not sure what the deal is with using an SE<->HVD converter, or whether it has an effect on this. Anyway, to run the exerciser:

# cstm
cstm> map
===> Take note of the "dev num" (1st column) that corresponds to each disk (in your case c0t1d0 and c0t2d0)
cstm> sel dev 12
===> Assuming the device number of one of the disks is 12.
cstm> info
cstm> infolog
===> Shows general info and any errors. Usually there is not much info to get from here.
cstm> help
===> See help for list of commands.
cstm> exercise
===> Run exerciser on disk at device num 12. This will take a while.
cstm> exeractlog
===> Look at its activity log.
cstm> exerfaillog
===> Look at its failure log.
cstm> exerinfo
===> Look at its info log.

Once you're done, to select another disk for diagnostics, run:

cstm> unselall
cstm> sel dev

and repeat the above.

If you can, post the findings here.
T G Manikandan
Honored Contributor

Re: resolving POWERFAILED problems.

Hello James,
We had POWERFAILED errors on a disk several times, and the server stopped responding after the errors.
We logged a call with HP.
The HP service engineer told us the hard disks were Seagate models and that the disk firmware version was HP01. He told us that upgrading the disk firmware to version HP04 should solve the problem. Just last week we upgraded the disk firmware.
The disk is to be examined again.


Thanks
G Manikandan
james below_1
Occasional Contributor

Re: resolving POWERFAILED problems.

I performed a "read-only" exercise and it did not return any errors. This corresponds with what I saw on my Sun system.

The only difference I see on my Sun system is that I can't push data to the drive as fast as I can on the HP.
On the Sun I can push it at around 8-10Mbps read/write.
On the HP, I was seeing rates as high as 14-15Mbps.

Under HP-UX 11.0, I had the drive in "vg00", and under HP-UX 10.20 it was in "vgnew"; that's why there are some discrepancies.

c0t2d0 = 36gb disk
I have run:
dd if=/dev/dsk/c0t4d0 of=/dev/dsk/c0t2d0 bs=2048

and the other way as well:
dd if=/dev/dsk/c0t2d0 of=/dev/dsk/c0t4d0 bs=2048

Both of these produce no errors.
When I run the following (/new is my mount point):
cd /new
cp /usr/local/bigtar.tar /new
tar -xvf bigtar.tar
(the tar file uses relative pathnames and is about 150mb)

Now if I run:
cd /new
tar -xvf /usr/local/bigtar.tar
the drive works fine.
Pretty strange.

This drive was purchased from Seagate and has no HP-specific firmware on it.

Does anyone know if there is a Seagate version of HP04? Is it possible to flash a drive bought from Seagate with firmware from HP?
james below_1
Occasional Contributor

Re: resolving POWERFAILED problems.

I forgot to mention that the first tar example produces the POWERFAILED messages.

When I run the following (/new is my mount point):

cd /new
cp /usr/local/bigtar.tar /new
tar -xvf bigtar.tar
(the tar file uses relative pathnames and is about 150 MB)
** This produces the POWERFAILED messages.

But if I run:

cd /new
tar -xvf /usr/local/bigtar.tar

the drive works fine.
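
The two cases above differ in one respect: in the failing case the tar file is read from the same disk that is being written to. A minimal sketch for exercising that pattern without tar follows. It is only a suggestion, not something that was run in this thread. The helper name same_disk_rw_test, the 64k block size, and the bigtar.copy file name are all made up for illustration.

```shell
# Hypothetical helper: copy a file to another path on the same filesystem
# and verify the copy, so the disk is read from and written to together.
same_disk_rw_test() {
    src=$1
    dst=$2
    # 64k is an arbitrary block size, larger than the bs=2048 dd runs
    dd if="$src" of="$dst" bs=64k 2>/dev/null || return 1
    # cmp exits non-zero if the copy differs from the source
    cmp -s "$src" "$dst"
}

# Example with the paths from the posts (the .copy name is an assumption):
# same_disk_rw_test /new/bigtar.tar /new/bigtar.copy && echo "copy verified"
```

If this also triggers the POWERFAILED messages in syslog, that would suggest the problem is tied to mixed read/write traffic on the external drive rather than to tar itself.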