LVM errors in the syslog

 
SOLVED
Aziz Zouagui
Frequent Advisor

LVM errors in the syslog

 
10 REPLIES 10
Helen French
Honored Contributor

Re: LVM errors in the syslog

One quick look: do you have all the latest patches applied on the system? If not, I would suggest loading all SCSI, LVM and hardware patches soon.
Life is a promise, fulfill it!
Dario_1
Trusted Contributor

Re: LVM errors in the syslog

Hi!

Other things to check:

ioscan -fnC disk

and make sure all the disks are CLAIMED.

Then pvdisplay /dev/dsk/cXtXdX on the disks that are giving you errors. You will get an output like this:

--- Physical volumes ---
PV Name /dev/dsk/c4t0d0
VG Name /dev/vgu01
PV Status unavailable
Allocatable yes
VGDA 2
Cur LV 1
PE Size (Mbytes) 4
Total PE 4340
Free PE 2840
Allocated PE 1500
Stale PE 0
IO Timeout (Seconds) 180
Autoswitch On


Make sure IO Timeout is set to something other than the default; I would say 180.

To change that:

pvchange -t 180 /dev/dsk/cXtXdX

If this does not fix the problem, contact HP and have them check the SCSI cable, SCSI controller and the disk(s).

Regards,

DR
Steven E. Protter
Exalted Contributor

Re: LVM errors in the syslog

Three ways to get an lbolt, that I know of.

1) Swap out a hot swappable drive while the system is running. No problem, the message will go away next system boot.

2) Timeout issues on a stressed box with load balancing issues and heavy i/o

3) Bad disk, SCSI card, drive cage or cable.

95% of the time, in my experience, it's #3 and you need the machine serviced.

Take good backups and get someone on site.

The prior suggestions in this thread are also excellent ideas.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
A. Clay Stephenson
Acclaimed Contributor

Re: LVM errors in the syslog

The one thing that I notice is that all the failovers occur from c3 to c4. I would change the primary path for your PVs to use c4 (probably controller Y) and the alternate to use c3 (probably controller X). If the problems disappear, then you have a bad host SCSI controller, possibly a bad 12H controller, or cabling problems. A very common cause of exactly your problem is termination. Have you checked the termination on the host SCSI cards? Also, is at least one device on each bus supplying termination power? Some cards have fused termination power; if the fuse blows (and this is the only source of termination power), you are not terminated even though the terminators themselves might be perfect.
Surprisingly, in many cases, even completely unterminated SCSI buses (or buses terminated on only one end) will work almost perfectly, which makes this a very difficult problem to isolate and track down.


It would also help to decode the IO addresses and determine if all your problems are tied to one 12H.

If it ain't broke, I can fix that.
Aziz Zouagui
Frequent Advisor

Re: LVM errors in the syslog


How do you decode all those hex numbers to device files?

Is there an easier way of doing this?

Thank you all for the suggestions, keep them coming.



A. Clay Stephenson
Acclaimed Contributor
Solution

Re: LVM errors in the syslog

Look for the device numbers, e.g. 0x1f031100

The first two hex digits (1f) are the major device number. 1f = 31 (dec). Run lsdev and look for the matching driver. You will find that block major device 31 is "sdisk" - SCSI disk. The next two hex digits (03) are the bus instance number. The next hex digit (1) is the SCSI ID and the next hex digit (1) is the LUN. The final two hex digits are driver specific. In any event, 1f031100 decodes to /dev/dsk/c3t1d1. You then do an ioscan -fn and find the host bus adapter that corresponds to c3.
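
If you have many of these to decode, the field layout described above can be scripted. The following is only a minimal sketch in Python, assuming the device number follows the sdisk layout given here (8-bit major, 8-bit bus instance, 4-bit SCSI ID, 4-bit LUN, 8 driver-specific bits). The function name decode_sdisk_dev is made up for illustration, and the sketch does not check that the resulting device file actually exists on your system.

```python
def decode_sdisk_dev(dev_hex):
    """Split a device number such as 0x1f031100 into the fields described above.

    Assumes the sdisk layout from this thread; other drivers may use the
    minor number bits differently.
    """
    value = int(dev_hex, 16)          # accepts "1f031100" or "0x1f031100"
    major = (value >> 24) & 0xFF      # first two hex digits: major number
    instance = (value >> 16) & 0xFF   # next two hex digits: bus instance
    target = (value >> 12) & 0xF      # next hex digit: SCSI ID
    lun = (value >> 8) & 0xF          # next hex digit: LUN
    options = value & 0xFF            # last two hex digits: driver specific
    return {
        "major": major,
        "instance": instance,
        "target": target,
        "lun": lun,
        "options": options,
        "name": "/dev/dsk/c%dt%dd%d" % (instance, target, lun),
    }


if __name__ == "__main__":
    print(decode_sdisk_dev("0x1f031100")["name"])
```

The name it produces is only the conventional one for that minor number, so it is still worth confirming against the real device files before acting on it.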
If it ain't broke, I can fix that.
Dario_1
Trusted Contributor

Re: LVM errors in the syslog

Aziz:

Check Stephen's answer in the following post for information on how to decode the hex numbers.

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0xbfdffef4d250d611abdb0090277a778c,00.html

Regards,

DR
John Dvorchak
Honored Contributor

Re: LVM errors in the syslog

I love A. Clay's answer for the technical explanation. Thank you. What I have done in the past, 'cause I didn't know what A. Clay knows, is to run ls -l (or ll) on the /dev/dsk directory and grep for the hex number.
If it has wheels or a skirt, you can't afford it.
Frank Slootweg
Honored Contributor

Re: LVM errors in the syslog

John,

Personally I think that grep(1)-ping for (the last six 'digits' of) the hex number is better.

1f031100 *should* correspond to /dev/dsk/c3t1d1, but if someone messed up, it won't and you will be looking at the wrong disk.

So grep(1) to get the name of the device file, then do a lssf(1M) to get the hardware path and then do a "ioscan -H ..." to get information about the disk. No silly decoding of hex numbers required! :-) After all, HP wrote lssf(1M) for a reason.
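
For anyone who wants to script the grep step, here is a rough sketch in Python of matching the last six hex digits against a saved listing of /dev/dsk. It assumes the listing shows the minor number as a 0x-prefixed six-digit field and the device file name as the last field on each line. The two sample lines are invented for illustration and are not output from the system in this thread, and find_by_minor is a hypothetical helper name.

```python
def find_by_minor(listing, dev_hex):
    """Return device file names whose minor number matches dev_hex.

    listing is the text of a long listing of /dev/dsk; dev_hex is the full
    device number from syslog, e.g. "0x1f031100".
    """
    # Keep only the low 24 bits (the last six hex digits) as the minor number.
    minor = "0x%06x" % (int(dev_hex, 16) & 0xFFFFFF)
    matches = []
    for line in listing.splitlines():
        fields = line.split()
        if minor in [f.lower() for f in fields]:
            matches.append(fields[-1])   # assumed: file name is the last field
    return matches


# Invented sample listing, for illustration only.
SAMPLE = """\
brw-r-----   1 bin        sys         31 0x031100 Jun 10  1999 c3t1d1
brw-r-----   1 bin        sys         31 0x041100 Jun 10  1999 c4t1d1
"""

if __name__ == "__main__":
    print(find_by_minor(SAMPLE, "0x1f031100"))
```

Whatever name comes back is the one to hand to lssf(1M), since it reflects the device files as they really are rather than what the hex number suggests they should be.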