Operating System - HP-UX

Re: dmesg errors on HP 10x

 
Norman England
Occasional Advisor

dmesg errors on HP 10x

I was recently assigned about 70 servers to look after, and I have been reconfiguring them to improve performance. Below is a list of dmesg error messages that I don't know much about.

Note that these are not all from the same machine. All of the machines are b132L, b132L+, or d270s. The b132L(+)s are configured with either 128 or 256 MB RAM, one 4 GB internal HD, and one 4 GB external HD (mirrored). The d270s have 256 MB RAM and two internal 9 GB HDs (mirrored).

Any help with any of these errors would be appreciated. Please reference which exhibit you are responding to in your reply:

Exhibit A:
SCSI: Resetting SCSI -- lbolt: 855798665, bus: 0
SCSI: Reset detected -- lbolt: 855798665, bus: 0
LVM: vg[0]: pvnum=1 (dev_t=0x1f008000) is POWERFAILED
LVM: vg[0]: pvnum=0 (dev_t=0x1f005000) is POWERFAILED
LVM: vg[0].lv[1] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[2] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[3] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[4] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[5] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[7] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[8] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: vg[0].lv[9] has 2 fault disks on : pvnum=0 (dev=0x1f005000) pvnum=1 (dev=0x1f008000)
LVM: PV 0 has been returned to vg[0].

Exhibit B:
sysmap: rmap ovflo, lost [9609l,9610l)
sysmap: rmap ovflo, lost [9603l,9604l)

Exhibit C:
Unable to add all file system swap from: /prod/paging. Increase the tunable parameter maxswapchunks by 90 and re-configure your system.

Exhibit D:
lv_syncx: returning error: 126
vxfs: mesg 056: vx_dataioerr - /dev/vg01/lvol1 file system file data read error

Exhibit E:
vxfs: mesg 056: vx_dataioerr - /dev/vg01/lvol1 file system file data read error
vxfs: mesg 055: vx_metaioerr - /dev/vg01/lvol1 file system meta data read error
vxfs: mesg 017: vx_trunc - /data file system inode 1581 marked bad

Exhibit F:
SCSI: Abort Tag -- lbolt: 123034437, dev: 1f00e000, io_id: 562648
LVM: vg[0]: pvnum=1 (dev_t=0x1f00e000) is POWERFAILED
SCSI: First party detected bus hang -- lbolt: 123034637, bus: 0

Exhibit G:
lv_syncx: returning error: 5

Exhibit H:
ps2: invalid ack byte 31 for command ed


Any input would be helpful. Let me know if you need any more information.

thanks






Sachin Patel
Honored Contributor

Re: dmesg errors on HP 10x

Hi Norman
A and F: the problem is in your cable, controller, or disk. The system is losing its connection to the disks that LVM uses.

Over the weekend I had a problem on one of our systems, and that turned out to be a controller problem. But sometimes we have a fibre cable problem and it reports the same error message. Sometimes it is a GBIC problem, and sometimes it is the controller on the system, and the error is still the same.

Sep 29 09:41:08 bessel vmunix: LVM: vg[14]: pvnum=0 (dev_t=0x1f090200) is POWERFAILED
Sep 29 09:41:08 bessel vmunix: LVM: PV 0 has been returned to vg[14].
Sep 30 22:20:54 bessel vmunix: LVM: vg[13]: pvnum=0 (dev_t=0x1f090000) is POWERFAILED
Sep 30 22:20:54 bessel vmunix: LVM: vg[14]: pvnum=0 (dev_t=0x1f090200) is POWERFAILED
Sep 30 22:20:54 bessel vmunix: LVM: vg[15]: pvnum=0 (dev_t=0x1f090400) is POWERFAILED
Sep 30 22:20:54 bessel vmunix: LVM: vg[16]: pvnum=0 (dev_t=0x1f090600) is POWERFAILED
Sep 30 22:20:59 bessel vmunix: LVM: PV 0 has been returned to vg[13].
Sep 30 22:21:09 bessel vmunix: LVM: PV 0 has been returned to vg[14].
Sep 30 22:21:09 bessel vmunix: LVM: PV 0 has been returned to vg[15].
Sep 30 22:21:09 bessel vmunix: LVM: PV 0 has been returned to vg[16].

Sachin
Is photography a hobby or another way to spend $
harry d brown jr
Honored Contributor

Re: dmesg errors on HP 10x

Because these are most likely user workstations, is it possible that some people are treating them like PCs and physically powering them off instead of properly shutting them down?
Live Free or Die
John Bolene
Honored Contributor

Re: dmesg errors on HP 10x

Exhibit C: you need to increase that kernel parameter (maxswapchunks) to use all of the swap that is configured.
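To put the "increase maxswapchunks by 90" message from Exhibit C in perspective, here is a rough sketch of the arithmetic. It assumes the usual relationship in which the total configurable swap is maxswapchunks times swchunk, with swchunk counted in 1 KB blocks and defaulting to 2048 (2 MB per chunk). The function name `swap_limit_mb` and the starting value of 256 are illustrative assumptions, not values taken from these machines; check the real values in SAM or the kernel configuration before changing anything.

```shell
#!/bin/sh
# Sketch only: estimate the swap ceiling implied by maxswapchunks.
# Assumption: swchunk is expressed in 1 KB blocks (default 2048).
swap_limit_mb() {
    maxswapchunks=$1
    swchunk=${2:-2048}          # assumed default, in 1 KB blocks
    echo $(( maxswapchunks * swchunk / 1024 ))
}

swap_limit_mb 256               # hypothetical current value: 512 MB
swap_limit_mb $(( 256 + 90 ))   # after adding the 90 chunks: 692 MB
```

Under those assumptions, 90 extra chunks correspond to roughly 180 MB of file system swap that currently cannot be added.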

The others all look like disk problems, either with the disk, cables, power supplies, or controllers.
It is always a good day when you are launching rockets! http://tripolioklahoma.org, Mostly Missiles http://mostlymissiles.com
Norman England
Occasional Advisor

Re: dmesg errors on HP 10x

Don't laugh, but these servers are not user workstations. They are actually used in a 24x7 production environment. They are locked in a cabinet in a server room, and they have SmartUPS PowerChute software running so that they will shut down automatically if there is a power failure. I don't know who spec'd these systems out, but they are very underpowered and have very little disk space. They also have high disk access, which is why we see these errors all over the place. I just wanted to see if anyone has seen some of these errors and has some idea of what exactly is causing them. Thanks.
Frank Slootweg
Honored Contributor

Re: dmesg errors on HP 10x

As others mentioned, start with the disks, controllers, cables, etc.

A pointer: Start with the "dev=" information. What follows is an 8-digit hexadecimal number. 1f hex is 31 decimal, which is the block major number of the sdisk driver (see lsdev(1M)). The following *example* will probably point you to the problem disk(s):

ll /dev/dsk/* | grep 008000

This will give the device file name(s). Now run lssf(1M) on these device files. This will give you the hardware address. Next, "ioscan -f -H ...." and "diskinfo /dev/rdsk/..." will give you more details about the disks, interfaces, etc.

Exhibit B: "sysmap: rmap ovflo, lost [9609l,9610l)" is not a problem, but can become a problem. Try increasing nproc by a factor of 2. If that does not help, then I advise searching the ITRC databases and/or forum on this.
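As a cross-check on the grep approach, the dev_t value can also be decoded by hand. The sketch below assumes the common sdisk layout in which the top 8 bits are the major number and the 24-bit minor number is card instance (8 bits), SCSI target (4 bits), LUN (4 bits), and driver options (8 bits). The function name `decode_dev_t` is made up for illustration, and the resulting cXtYdZ name should still be confirmed with lssf(1M).

```shell
#!/bin/sh
# Sketch only: split a dev_t from an LVM/SCSI kernel message into its fields.
# Assumed sdisk layout: 0xMMIITLoo (major, instance, target, LUN, options).
decode_dev_t() {
    case "$1" in
        0x*|0X*) dev=$(( $1 )) ;;
        *)       dev=$(( 0x$1 )) ;;   # Exhibit F prints the value without 0x
    esac
    major=$((    (dev >> 24) & 0xff ))
    instance=$(( (dev >> 16) & 0xff ))
    target=$((   (dev >> 12) & 0xf  ))
    lun=$((      (dev >> 8)  & 0xf  ))
    printf 'major=%d c%dt%dd%d\n' "$major" "$instance" "$target" "$lun"
}

decode_dev_t 0x1f008000   # major=31 c0t8d0
decode_dev_t 1f00e000     # major=31 c0t14d0
```

If the assumed layout holds, the two POWERFAILED disks in Exhibit A would be /dev/dsk/c0t5d0 and /dev/dsk/c0t8d0, both on the same bus, which fits the bus reset reported just before them.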