Bad Error Messages ( PVLINK)

P_F · ‎07-27-2007

Hello:

Aside from my syslog filling up, I'm getting a bad error message. Part of the problem is that I don't know which VG and PV this refers to, since I can't figure out the hex and match it up to a name in /dev.

LVM: VG 64 0x2c0000: PVLink 31 0x1f8000 Failed!

Any help please:

############FULL OUTPUT############

0x2c0000

P_F · ‎07-27-2007

Whoops here's the log entry...

Jul 10 06:21:02 pkdh0085 vmunix: DIAGNOSTIC SYSTEM WARNING:

Jul 10 06:21:02 pkdh0085 vmunix: The diagnostic logging facility has started receiving excessive

Jul 10 06:21:02 pkdh0085 vmunix: errors from the I/O subsystem. I/O error entries will be lost

Jul 10 06:21:02 pkdh0085 vmunix: until the cause of the excessive I/O logging is corrected.

Jul 10 06:21:02 pkdh0085 vmunix: If the diaglogd daemon is not active, use the Daemon Startup command

Jul 10 06:21:02 pkdh0085 vmunix: in stm to start it.

Jul 10 06:21:02 pkdh0085 vmunix: If the diaglogd daemon is active, use the logtool utility in stm

Jul 10 06:21:02 pkdh0085 vmunix: to determine which I/O subsystem is logging excessive errors.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: Lost quorum.

Jul 10 06:21:02 pkdh0085 vmunix: This may block configuration changes and I/Os. In order to reestablish quorum at least 7 of the following PVs (represented by current link) must become available:

Jul 10 06:21:02 pkdh0085 vmunix: <31 0x1d5700> <31 0x1d6200> <31 0x1d6300> <31 0x1d6400> <31 0x1d6500> <31 0x1d6600> <31 0x1d6700> <31 0x1d7000> <31 0x1d7100> <31 0x1d7400> <31 0x1d7500> <31 0x1d7600> <31 0x1d7700> <31 0x1d8000> <31 0x1d8100>

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f5600 Failed!

The PV is still accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f5700 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6200 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6300 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6400 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6500 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6600 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f6700 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7000 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7100 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7400 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7500 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7600 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f7700 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f8000 Failed!

The PV is not accessible.

Jul 10 06:21:02 pkdh0085 vmunix: LVM: VG 64 0x2c0000: PVLink 31 0x1f8100 Failed!

The PV is not accessible.
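
With this many entries it can help to reduce the log to one line per failed link. A minimal sketch, assuming a POSIX shell with awk and sort; the function name failed_pvlinks is made up for illustration, and the log path in the usage comment is an assumption to adjust for your system:

```shell
# List each distinct PV link reported as failed, with how often it appears.
# Reads syslog text on stdin and looks for "PVLink <major> <minor> Failed!".
failed_pvlinks() {
    awk '/LVM: VG .* PVLink .* Failed!/ {
             for (i = 1; i < NF; i++)
                 if ($i == "PVLink") { count[$(i + 2)]++; break }
         }
         END { for (m in count) print count[m], m }' | sort -k2
}

# Usage on the host (log path is an assumption):
#   failed_pvlinks < /var/adm/syslog/syslog.log
```

Each output line is a count followed by the minor number of the failed link, sorted by minor number.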

Tim Nelson · ‎07-27-2007

If I remember correctly,

The 0x1f is the major type of devices.

the 8 is the controller instance the 00 is target

so something like /dev/dsk/c8t0d0

There are some other threads in this forum with regards to this, search on "lbolt".

Your latest post looks like you lost a number of devices c6t.. through c8t...

Also, ioscan -fnC disk should show NO_HWR. This may also help you determine which disks are affected.

Geoff Wild · ‎07-27-2007

How to decode those device numbers

0x1f8000

The first 2 hex digits (1f hex = 31 decimal) indicate the major device number. Do an lsdev and look for "31". You will find that major block device 31 is SCSI disk. Thus this is a /dev/dsk device node.

The next 2 hex digits (80 hex = 128 decimal) indicate the bus "instance" number or controller number; "c128" in this case.

The next hex digit (0) indicates the SCSI ID or target. "t0" in this case.

The next hex digit (0) indicates the LUN (d0) in this case.

The remaining 2 hex digits are device driver specific.

So, your bad disk is: /dev/rdsk/c128t0d0

Try doing a diskinfo on it - anything?

Rgds...Geoff

Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.

P_F · ‎07-27-2007

Hello Tim:

I ran

ioscan -fnCdisk | grep NO_HWR

and found nothing...

Curious...

all the

LV Status
VG Status
PV Status

show available when I run

vgdisplay -v

for all Volume Groups.

P_F · ‎07-27-2007

Hello Geoff:

Weird, we simply don't have a disk named that:

So, your bad disk is: /dev/rdsk/c128t0d0

Geoff Wild · ‎07-27-2007

Actually, it is NO_HW

ioscan -funC disk | grep NO_HW

Rgds...Geoff


Geoff Wild · ‎07-27-2007

Do you have a vg44?

Rgds...Geoff


P_F · ‎07-27-2007

# vgdisplay -v /dev/vg44
vgdisplay: Volume group "/dev/vg44" does not exist in the "/etc/lvmtab" file.
vgdisplay: Cannot display volume group "/dev/vg44".

-----------------------

here's a list of the VGs

/dev/vg00
/dev/vgtools
/dev/vgSANora00
/dev/vgSANora01
/dev/vgSANora02
/dev/vgSANora03
/dev/vgSANora04
/dev/vgSANora05
/dev/vgSANora06
/dev/vgSANora07
/dev/vgSANora08
/dev/vgSANora09
/dev/vgSANora10
/dev/vgSANora11
/dev/vgSANora12
/dev/vgSANora13
/dev/vgSANora14
/dev/vgSANora15
/dev/vgSANora16
/dev/vgESSRestore01
/dev/vgESSRestore02
/dev/vgESSRestore03
/dev/vgESSRestore04
/dev/vgESSCDR01
/dev/vgESSCDR13
/dev/vgESSCDR14
/dev/vgESSCDR15
/dev/vgESSCDR16
/dev/vgESSMCCR03
/dev/vgESSMCCR04
/dev/tgt_vgExtendCDR00
/dev/tgt_vgExtendCDR01
/dev/tgt_vgExtendCDR02
/dev/tgt_vgExtendCDR03
/dev/tgt_vgExtendCDR04
/dev/tgt_vgExtendCDR05
/dev/tgt_vgExtendCDR06
/dev/tgt_vgExtendCDR08
/dev/tgt_vgExtendCDR10
/dev/tgt_vgExtendCDR09
/dev/vgESSMCCR02
/dev/vgESSCDR09
/dev/vgESSCDR11
/dev/vgESSCDR10
/dev/vgESSMCCR01
/dev/tgt_vgExtendCDR07
/dev/vgESSCDR12
/dev/vgESSCDR08
/dev/vgESSCDR07
/dev/vgESSCDR06
/dev/vgESSCDR05
/dev/vgESSCDR04
/dev/vgESSCDR03
/dev/vgESSCDR02

Geoff Wild · ‎07-27-2007

How about:

ll /dev/vg*/group |grep 2c

or

ll /dev/vg*/group |grep 44

Rgds...Geoff


P_F · ‎07-27-2007

FYI... the

#ioscan -funC disk | grep NO_HW

was silent.

P_F · ‎07-27-2007

Ah, that turned up something...

# ll /dev/vg*/group |grep 44

cr--r--r-- 1 root sys 64 0x030000 Mar 22 21:44 /dev/vgSANora01/group
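
Note that the line above matched on the 21:44 timestamp; its minor number is 0x030000, not 0x2c0000. A minimal sketch of a stricter lookup, assuming a POSIX shell with awk and the usual long-listing column order (major number in field 5, minor number in field 6); the function name find_group_by_minor is made up for illustration:

```shell
# Print the group file whose minor number equals the one in the LVM
# message (e.g. 0x2c0000). Reads "ls -l"-style lines on stdin and compares
# the minor field exactly, so timestamps such as 21:44 cannot match.
find_group_by_minor() {
    awk -v want="$1" '$5 == 64 && ($6 "") == want { print $NF }'
}

# Usage on the host:
#   ls -l /dev/*/group | find_group_by_minor 0x2c0000
```

The glob /dev/*/group is deliberate: it also covers volume groups whose names do not start with vg.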

P_F · ‎07-27-2007

I don't see anything wrong?

# vgdisplay -v /dev/vgSANora01
--- Volume groups ---
VG Name /dev/vgSANora01
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 2
Open LV 2
Max PV 16
Cur PV 1
Act PV 1
Max PE per PV 27232
VGDA 2
PE Size (Mbytes) 8
Total PE 27227
Alloc PE 26880
Free PE 347
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0

--- Logical volumes ---
LV Name /dev/vgSANora01/ora_db01
LV Status available/syncd
LV Size (Mbytes) 107520
Current LE 13440
Allocated PE 13440
Used PV 1

LV Name /dev/vgSANora01/ora_db02
LV Status available/syncd
LV Size (Mbytes) 107520
Current LE 13440
Allocated PE 13440
Used PV 1

--- Physical volumes ---
PV Name /dev/dsk/c6t2d2
PV Name /dev/dsk/c9t2d2 Alternate Link
PV Name /dev/dsk/c18t5d2 Alternate Link
PV Name /dev/dsk/c21t5d2 Alternate Link
PV Status available
Total PE 27227
Free PE 347
Autoswitch On

Todd McDaniel_1 · ‎07-27-2007

You don't have to decode the hex numbers to find the disk...

___________________________________________
LVM: VG 64 0x2c0000: PVLink 31 0x1f8000 Failed!
___________________________________________

cd /dev/dsk
ls -la |grep 0x2c0000

The output gives you a disk that you can run pvdisplay on.

easy as pie!!!

Unix, the other white meat.

Geoff Wild · ‎07-27-2007

Nope - that's not it....

I thought 2c was the minor number or vg...

Example - yesterday, a san admin took away a disk from a server before I removed it from the vg:

VG 64 0x150000: PVLink 31 0x160400 Failed! The PV is not accessible.

0x15 = minor number of group file:

ll /dev/vg*/group |grep 15
crw-r----- 1 root sys 64 0x150000 Jul 11 11:02 /dev/vg21/group

See - now I know the disk was in vg21.

16H = 22D

c22t0d4 was the disk
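
The same split can be done mechanically. A minimal sketch, assuming a POSIX shell and the layout implied by the example above (first two hex digits of the minor number = controller instance, next digit = target, next digit = LUN, last two driver specific); the function name decode_pv_minor is made up for illustration:

```shell
# Decode the minor number printed after "PVLink 31" into a cXtYdZ name.
# Assumed layout, taken from the 0x160400 -> c22t0d4 example: 0xCCTLxx,
# where CC = controller instance, T = target, L = LUN, xx = driver bits.
decode_pv_minor() {
    minor=$(( $1 ))                      # accepts 0x-prefixed hex
    ctrl=$(( (minor >> 16) & 0xff ))     # top two hex digits
    tgt=$(( (minor >> 12) & 0xf ))       # next hex digit
    lun=$(( (minor >> 8) & 0xf ))        # next hex digit
    echo "c${ctrl}t${tgt}d${lun}"
}

decode_pv_minor 0x160400    # prints c22t0d4
```

The result is only the name under /dev/dsk; whether that device file still exists has to be checked separately, e.g. with ioscan.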

SO, your first post:

LVM: VG 64 0x2c0000: PVLink 31 0x1f8000 Failed!

doesn't seem to jibe with the rest...

There should be a vg with a group minor number of 0x2c0000, and the disk should be c31t8d0 (0x1f = 31 is the controller, 8 the target, 0 the LUN).

Rgds...Geoff


Geoff Wild · ‎07-27-2007

How about:

ll /dev/tgt*/group |grep 0x2c0000

Not to be picky, but VGs named vgXX instead of vgoracle are easier to troubleshoot. And the XX should be the decimal equivalent of the minor number...

I.e. for 0x2c0000 (0x2c = 44) I would name the VG vg44.
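
The hex-to-decimal step is easy to script. A minimal sketch, assuming a POSIX shell and a group-file minor number of the form 0xNN0000; the function name vg_name_for_minor is made up for illustration:

```shell
# Suggest a vgNN name from a group-file minor number of the form 0xNN0000,
# where NN is the VG number in hex.
vg_name_for_minor() {
    echo "vg$(( ($1 >> 16) & 0xff ))"
}

vg_name_for_minor 0x2c0000    # prints vg44
```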

Rgds...Geoff


P_F · ‎07-27-2007

Thanks so much for your help. The SAN folks must have done something?

I just can't find a 0x2c entry:

# ls -l /dev/vg* | grep 0x2 | more

crw------- 1 root sys 64 0x200000 Jul 6 17:23 group
brw------- 1 root sys 64 0x200001 Jul 6 17:48 lvol1
crw------- 1 root sys 64 0x200001 Jul 6 17:48 rlvol1
crw------- 1 root sys 64 0x210000 Jul 6 17:23 group
brw------- 1 root sys 64 0x210001 Jul 6 17:47 lvol1
crw------- 1 root sys 64 0x210001 Jul 6 17:47 rlvol1
crw------- 1 root sys 64 0x220000 Jul 6 22:27 group
brw------- 1 root sys 64 0x220001 Jul 7 14:52 lvol1
crw------- 1 root sys 64 0x220001 Jul 7 14:52 rlvol1
br--r----- 1 root sys 64 0x230001 Mar 23 21:30 ESSCDR13
cr--r--r-- 1 root sys 64 0x230000 Mar 23 21:30 group
cr--r----- 1 root sys 64 0x230001 Mar 23 21:30 rESSCDR13
br--r----- 1 root sys 64 0x240001 Mar 23 21:33 ESSCDR14
cr--r--r-- 1 root sys 64 0x240000 Mar 23 21:33 group
cr--r----- 1 root sys 64 0x240001 Mar 23 21:33 rESSCDR14
br--r----- 1 root sys 64 0x250001 Mar 23 21:37 ESSCDR15
cr--r--r-- 1 root sys 64 0x250000 Mar 23 21:36 group
cr--r----- 1 root sys 64 0x250001 Mar 23 21:37 rESSCDR15
br--r----- 1 root sys 64 0x260001 Mar 23 21:43 ESSCDR16
cr--r--r-- 1 root sys 64 0x260000 Mar 23 21:42 group
cr--r----- 1 root sys 64 0x260001 Mar 23 21:43 rESSCDR16
crw------- 1 root sys 64 0x270000 Jul 6 17:23 group
brw------- 1 root sys 64 0x270001 Jul 6 17:51 lvol1
crw------- 1 root sys 64 0x270001 Jul 6 17:51 rlvol1
crw------- 1 root sys 64 0x280000 Jul 6 15:08 group
brw------- 1 root sys 64 0x280001 Jul 6 15:37 lvol1
crw------- 1 root sys 64 0x280001 Jul 6 15:37 rlvol1
br--r----- 1 root sys 64 0x290001 Mar 26 01:28 ESSMCCR03
cr--r--r-- 1 root sys 64 0x290000 Mar 26 01:28 group
cr--r----- 1 root sys 64 0x290001 Mar 26 01:28 rESSMCCR03
br--r----- 1 root sys 64 0x2a0001 Mar 26 01:32 ESSMCCR04
cr--r--r-- 1 root sys 64 0x2a0000 Mar 26 01:32 group
cr--r----- 1 root sys 64 0x2a0001 Mar 26 01:32 rESSMCCR04

P_F · ‎07-27-2007

Oh, this looks like the culprit:

DMC2 root@pkdh0085 [/root]
# vgdisplay -v /dev/tgt_vgExtendCDR01/
vgdisplay: Couldn't read the internal id of volume group "/dev/tgt_vgExtendCDR01/" from "/etc/lvmtab".
vgdisplay: Cannot display volume group "/dev/tgt_vgExtendCDR01/".

FYI...

About the naming...well, that was dictated to us by the application vendor.

P_F · ‎07-27-2007

# ll /dev/tgt*/group |grep 0x2c0000
crw------- 1 root sys 64 0x2c0000 Jul 5 18:33 /dev/tgt_vgExtendCDR01/group

Geoff Wild · ‎07-27-2007

There you go:

# ll /dev/tgt*/group |grep 0x2c0000
crw------- 1 root sys 64 0x2c0000 Jul 5 18:33 /dev/tgt_vgExtendCDR01/group

Your disks are in that vg.

strings /etc/lvmtab |more

/tgt_vgExtendCDR01

the disks follow...

Rgds...Geoff


Geoff Wild · ‎07-27-2007

BTW - though not required, it does help others when searching the forums and reading threads:

http://forums1.itrc.hp.com/service/forums/helptips.do?#28

Thanks...Geoff


P_F · ‎07-27-2007

Thanks to all the folks that responded. On Monday I'll try to see what the SAN folks did to cause this. If it is relevant I'll update the post.

Thanks Again.

P_F · ‎07-30-2007

Question:

Is it ok for me to turn off the diaglogd?

I've stopped both daemons diagmond and diaglogd since they are sending large amounts of info to the syslog.

I restarted both diagmond and diaglogd but when I restarted diaglogd the log messages continued.

Here are the messages in syslog.log:

Jul 29 05:34:10 vmunix: DIAGNOSTIC SYSTEM WARNING:
Jul 29 05:34:10 vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Jul 29 05:34:10 vmunix: in stm to start it.
Jul 29 05:34:10 vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Jul 29 05:34:10 vmunix: to determine which I/O subsystem is logging excessive errors.

NOTE:

In an earlier post I said I'd report what caused this. Apparently the entire incident happened earlier in the month and was fixed then, but diaglogd was left in a bad state:

Here is the report from July 7 when they set up the temporary array:
------------------------------------------

/var/opt/resmon/log/event.log

Notification Time: Sat Jul 7 14:34:29 2007

pkdh0085 sent Event Monitor notification information:

/adapters/events/TL_adapter/0_0_10_1_0 is >= 1.
Its current value is INFORMATION(1).

Event data from monitor:

Event Time..........: Sat Jul 7 14:34:29 2007
Severity............: INFORMATION
Monitor.............: dm_TL_adapter
Event #.............: 19
System..............: pkdh0085

Summary:
Adapter at hardware path 0/0/10/1/0 : Received an interrupt indicating
that a primitive was received

Description of Error:

lbolt value: 4228

The Fibre Channel Driver received an interrupt indicating
that a primitive was received
Frame Manager Status Register = 0xa002c4b0

Probable Cause / Recommended Action:

The Tachyon TL adapter received a primitive sequence.
No action needed. Informative message.

Additional Event Data:
System IP Address...: 10.206.148.28
Event Id............: 0x468fa47500000002
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_TL_adapter.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
0x468fa14500000001
Additional System Data:
System Model Number.............: 9000/800/rp7420
OS Version......................: B.11.11
EMS Version.....................: A.04.20
STM Version.....................: A.49.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/dm_TL_adapter.htm#19

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v

Component Data:
Physical Device Path....: 0/0/10/1/0
Vendor Id...............: 0x0000103C
Serial Number(WWN)......: 50060B0000BD6A96

I/O Log Event Data:

Driver Status Code..................: 0x00000013
Length of Logged Hardware Status....: 0 bytes.
Offset to Logged Manager Information: 0 bytes.
Length of Logged Manager Information: 61 bytes.

Manager-Specific Information:

Raw data from FCMS Adapter driver:
00000006 00001084 00000001 00000001 A002C4B0 2F75782F 6B65726E 2F6B6973
752F544C 2F737263 2F636F6D 6D6F6E2F 7773696F 2F74645F 6973722E 63

Torsten. · ‎07-30-2007

Of course you can disable your diagnostics; this version is old and unsupported anyway.

But I would update diagnostics and drivers first, then check again and find the root cause of this.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.
__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!
