
Which device is giving me this error?

 
dictum9
Super Advisor


N-class running 11.00

I am getting these messages, what do they mean? And what is this STD3 stuff? There is nothing abnormal in the volume groups it refers to.

syslog:

Aug 14 10:25:31 sbapca02 symmir[27097]: 'Incremental Establish' for device STD3 in group vgdata082 - Cannot perform the operation because the device is not in a valid state
Aug 14 10:26:46 sbapca02 vmunix:
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 21303360, sectno: 42606720, offset: 339804160, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Async write error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 14223262, sectno: 28446524, offset: 1679718400, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 13372944, sectno: 26745888, offset: 808992768, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: blkno: 10365832, sectno: 20731664, offset: 2024677376, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: blkno: 10027768, sectno: 20055536, offset: 1678499840, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: blkno: 9920208, sectno: 19840416, offset: 1568358400, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: blkno: 7628832, sectno: 15257664, offset: -778010624, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Async write error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 above message repeats 4 times
Aug 14 10:26:46 sbapca02 vmunix: blkno: 3901102, sectno: 7802204, offset: -300238848, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 3393040, sectno: 6786080, offset: -820494336, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Async write error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 3199488, sectno: 6398976, offset: -1018691584, bcount: 8192.
Aug 14 10:26:46 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x053200, errno: 126, resid: 8192,
Aug 14 10:26:46 sbapca02 vmunix: blkno: 3015504, sectno: 6031008, offset: -1207091200, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 35057680, sectno: 70115360, offset: 1539325952, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: blkno: 3393040, sectno: 6786080, offset: -820494336, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Async write error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 3199496, sectno: 6398992, offset: -1018683392, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 3015504, sectno: 6031008, offset: -1207091200, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: blkno: 2331864, sectno: 4663728, offset: -1907138560, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: blkno: 33759624, sectno: 67519248, offset: 210116608, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Async write error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:54 sbapca02 above message repeats 2 times
Aug 14 10:26:53 sbapca02 vmunix: blkno: 26511672, sectno: 53023344, offset: 1378148352, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 25737304, sectno: 51474608, offset: 585195520, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 2048,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 14219694, sectno: 28439388, offset: 1676064768, bcount: 2048.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 11144592, sectno: 22289184, offset: -1472839680, bcount: 8192.
Aug 14 10:26:53 sbapca02 vmunix: SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 1024,
Aug 14 10:26:53 sbapca02 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 1024.
Aug 14 10:27:29 sbapca02 vmunix: LVM: VG 64 0x230000: PVLink 31 0x053200 Failed! The PV is not accessible.
Aug 14 10:26:53 sbapca02 vmunix:
Aug 14 10:27:29 sbapca02 above message repeats 21 times
Aug 14 10:27:29 sbapca02 vmunix: LVM: VG 64 0x230000: PVLink 31 0x233200 Failed! The PV is not accessible.
Aug 14 10:36:03 sbapca02 vmunix: LVM: VG 64 0x230000: PVLink 31 0x053200 Recovered.
Aug 14 10:36:03 sbapca02 vmunix: LVM: VG 64 0x230000: PVLink 31 0x233200 Recovered.

dmesg:

SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
blkno: 2331864, sectno: 4663728, offset: -1907138560, bcount: 8192.

SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
blkno: 33759624, sectno: 67519248, offset: 210116608, bcount: 8192.

SCSI: Async write error -- dev: b 31 0x233200, errno: 126, resid: 8192,
blkno: 26511672, sectno: 53023344, offset: 1378148352, bcount: 8192.

SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
blkno: 25737304, sectno: 51474608, offset: 585195520, bcount: 8192.

SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 2048,
blkno: 14219694, sectno: 28439388, offset: 1676064768, bcount: 2048.

SCSI: Read error -- dev: b 31 0x233200, errno: 126, resid: 8192,
blkno: 11144592, sectno: 22289184, offset: -1472839680, bcount: 8192.
dictum9
Super Advisor

Re: Which device is giving me this error?

P.S. Ten points will be assigned to every answer.
Sandman!
Honored Contributor
Solution

Re: Which device is giving me this error?

Check disk "/dev/dsk/c35t3d2" for problems. It may be bad and need to be replaced.
Pete Randall
Outstanding Contributor

Re: Which device is giving me this error?

Also check c5t3d2. Your very first read error was against that. You might want to run dd to exercise the disk:

dd if=/dev/rdsk/c5t3d2 of=/dev/null bs=1024k

Any errors indicate a bad disk that needs to be replaced.


Pete

Sandman!
Honored Contributor

Re: Which device is giving me this error?

Actually check disk "/dev/dsk/c5t3d2" since "/dev/dsk/c35t3d2" seems to be its alternate link.

~cheers
Michael Steele_2
Honored Contributor

Re: Which device is giving me this error?

0x053200 = c5t3d2

# ll /dev/dsk | grep 0x053200

Check the disk with the commands below:

# ioscan -nfC disk | grep -i 'no_hw'

# diskinfo -v /dev/rdsk/c5t3d2

You might have a bad disk, so verify with the dd command.

dd if=/dev/dsk/c5t3d2 of=/dev/null count=10000
Support Fatherhood - Stop Family Law
Pupil_1
Trusted Contributor

Re: Which device is giving me this error?

Try the following command; it will show the status of the disk:

echo 2400?20X | adb /dev/dsk/cxtxdx

A good disk should produce a report similar to this:

# echo 2400?20X |adb /dev/dsk/c2t6d0
2400: 44454645 43543031 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
#


dd on a failed disk can take an awful lot of time!

There is always something new to learn everyday !!
A. Clay Stephenson
Acclaimed Contributor

Re: Which device is giving me this error?

Why not learn how to do this yourself so that you never have to ask again -- sort of like teaching a man to fish.

Since this was the first error, it should be addressed first, because the first error is generally the most significant:

dev: b 31 0x053200

The 'b' refers to a "block" as opposed to a character device. Do an "lsdev" and note the driver and class associated with block device 31. You will find that it corresponds to the "sdisk" driver and class "disk". You now know this is a disk device. When you see a 6 hex digit device address, it is decoded as follows. The first 2 hex digits identify the controller instance number, '05' or "c5"; the next hex digit identifies the SCSI ID or target, '3' or "t3"; the next hex digit identifies the LUN, '2' or "d2"; the last two hex digits are device dependent but can generally be ignored for device identification purposes.

Your device is c5t3d2 and, because we have identified the device class associated with it (disk), that corresponds to /dev/dsk/c5t3d2.

Now, if you see an 8 hex digit device number in syslog, then the first two hex digits identify the major device number -- but remember lsdev reports in decimal and these major device numbers are hexadecimal, so you must translate them. The remaining 6 hex digits are decoded just as above.
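
The decoding rule described in this reply can be sketched as a small shell function. This is only an illustration: decode_minor is a made-up helper name, not an HP-UX command, and it assumes the 6 hex digit layout given above (two digits of controller instance, one digit of target, one digit of LUN, two device-dependent digits). Applied to the two device numbers from the syslog, it gives c5t3d2 and c35t3d2, the same device files named in the other replies.

```shell
#!/bin/sh
# Decode a 6 hex digit sdisk device number (e.g. 0x053200) into cXtYdZ form.
# decode_minor is a hypothetical helper name, not part of HP-UX.
decode_minor() {
    hex=${1#0x}                                 # drop the leading "0x"
    ctrl=$(printf '%s' "$hex" | cut -c1-2)      # controller instance (hex)
    tgt=$(printf '%s' "$hex" | cut -c3)         # SCSI target (hex)
    lun=$(printf '%s' "$hex" | cut -c4)         # LUN (hex)
    # The last two digits are device dependent and ignored here.
    # printf %d converts each hex field to decimal, e.g. 0x23 -> 35.
    printf 'c%dt%dd%d\n' "0x$ctrl" "0x$tgt" "0x$lun"
}

decode_minor 0x053200
decode_minor 0x233200
```

Note that the controller field is hexadecimal, so 0x23 is instance 35, not 23.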
If it ain't broke, I can fix that.
Torsten.
Acclaimed Contributor

Re: Which device is giving me this error?

Looks like you are running database stuff.

Are you using anything like snapclone, business copy ...etc? Looks like something prevents the access to the array (zoning, security settings ...). Ask your SAN admins if they changed something ;-)
(this applies for the b 31 0x233200 device)

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
IT_2007
Honored Contributor

Re: Which device is giving me this error?

Run vgdisplay -v and check whether any devices are missing; that might be the cause. You may need to replace the disk if it is local. If it is on the SAN, ask the SAN people why it isn't visible now.
V.Manoharan
Valued Contributor

Re: Which device is giving me this error?

Hi ,
The issue is with /dev/dsk/c5t3d2. You may not find the issue using ioscan or vgdisplay, since the OS is able to sense the disk but is not able to do any read/write operations on it.

Please check with the following:
1. cstm command
cstm>scl
select disk
cstm>info
cstm>il
and check the error in the particular disk.

2. adb command.
echo 2400?20X |adb /dev/dsk/c5t3d2

Please let us know whether this disk is a SAN disk or an externally connected SCSI disk. In parallel, log a case with HP.

Thanks and regards
V.Manoharan
dictum9
Super Advisor

Re: Which device is giving me this error?

It's almost certainly a SAN issue. Before I posted, I ran basic troubleshooting commands like ioscan, vgdisplay, pvdisplay and did not see anything abnormal (yet). I just wanted to eliminate that possibility.

It's a production environment and I am hesitant to run dd and other troubleshooting commands at this time; maybe in the off hours. My guess is that nothing will show and that it's a SAN issue with zoning and such. The disk is an EMC Symmetrix. Yes, there is Business Copy involved.
dictum9
Super Advisor

Re: Which device is giving me this error?

Also, about the adb command: could you please enlighten me as to what it does? I looked at the man page but it doesn't really make sense; adb is a debugger.

2. adb command.
echo 2400?20X |adb /dev/dsk/c5t3d2
Torsten.
Acclaimed Contributor

Re: Which device is giving me this error?

The questionable device files "/dev/dsk/c5t3d2" and "/dev/dsk/c35t3d2" look like 2 different paths to the same LUN.
To confirm this, first find the volume group using these devices (strings /etc/lvmtab).

It seems plausible that this was a business copy and is no longer available to the host for some reason.

You have to discuss this issue with your SAN people.

Hope this helps!
Regards
Torsten.

IT_2007
Honored Contributor

Re: Which device is giving me this error?

You said this is EMC and Business Copy is involved. Did somebody remove something?
spex
Honored Contributor

Re: Which device is giving me this error?

Hi,

If your disk array has an LCD screen, it would be a good idea to check for any errors there, too.

PCS
Sandman!
Honored Contributor

Re: Which device is giving me this error?

>2. adb command.
>echo 2400?20X |adb /dev/dsk/c5t3d2

The above command checks the bad block directory (BBDIR) for entries. If the tabular output is not mostly zeros then there is something wrong with that disk. Output of the adb command for a normal disk should look like:

2400: 44454645 43543031 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
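
As a side note on reading that output: the two non-zero words are printable ASCII. A small sketch (hex_to_ascii is a made-up helper name, not a system command) shows that 44454645 43543031 spells DEFECT01, which looks like a header marker for the structure at that offset; the zeros after it are the part that matters for the check described above.

```shell
#!/bin/sh
# Print the ASCII text encoded by a string of hex digits.
# hex_to_ascii is a hypothetical helper name used only for illustration.
hex_to_ascii() {
    rest=$1
    while [ -n "$rest" ]; do
        pair=$(printf '%s' "$rest" | cut -c1-2)   # next byte as two hex digits
        rest=${rest#??}                           # drop those two digits
        # Convert the byte to a 3-digit octal escape and let printf emit it.
        printf "\\$(printf '%03o' "0x$pair")"
    done
    printf '\n'
}

# The two non-zero words from the adb output, joined together.
hex_to_ascii 4445464543543031
```

This prints DEFECT01.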
dictum9
Super Advisor

Re: Which device is giving me this error?

It was a timeout, not a disk failure... still trying to narrow down the root cause in the complex SAN environment.

Appears to be a non-hp problem but it was necessary to remove that possibility.
A. Clay Stephenson
Acclaimed Contributor

Re: Which device is giving me this error?

Almost certainly the device timeout was left at its default value of 30 seconds, which is OK for a discrete disk drive but generally too short for an array LUN. You will find that if you run pvchange -t and set the timeout to something in the 120-180 second range for array LUNs, these timeouts will be far less frequent.
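
A cautious way to apply this is to generate the pvchange commands first and review them before running anything on a production host. The sketch below only prints the commands. The 180 second value and the device list are assumptions for illustration (pv_timeout_cmds is a made-up helper name), and whether the alternate link needs its own call is worth confirming in pvchange(1M) and with the array vendor.

```shell
#!/bin/sh
# Dry run: print the pvchange commands instead of executing them.
# TIMEOUT is an assumed value inside the 120-180 second range suggested above.
TIMEOUT=180

# pv_timeout_cmds is a hypothetical helper; it takes physical volume paths.
pv_timeout_cmds() {
    for pv in "$@"; do
        echo "pvchange -t $TIMEOUT $pv"
    done
}

# Example device files taken from this thread; substitute your own.
pv_timeout_cmds /dev/dsk/c5t3d2 /dev/dsk/c35t3d2
```

Once the printed commands look right they can be run by hand, and pvdisplay on the physical volume should then report the new IO timeout.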
If it ain't broke, I can fix that.
dictum9
Super Advisor

Re: Which device is giving me this error?

Hi,
Well yeah, it's at its default. Then the question arises, what value should be picked for all Symm drives? Is there an EMC-recommended value?
Michael Steele_2
Honored Contributor

Re: Which device is giving me this error?

Well, there are some additional logs to look into for timeout problems that could be related to the fibre channel. Refer to fcmsutil, tdutil and stm > logtool.

# tdutil /dev/td#

BadRxChar, etc.

# fcmsutil /dev/td# devstat all | grep -i bad

# stm > tools > utility > run > logtool > file > view > raw summary

note the time start and time end and large error numbers by HW address

So you'll need the HW address of c5t3d2
Support Fatherhood - Stop Family Law
sathish kannan
Valued Contributor

Re: Which device is giving me this error?

Hi etc,

The recommended value of PVtimeout is 90 sec with PowerPath and 180 sec if PV links are used.

I would recommend that you contact EMC and straighten this out based on your configuration.

I believe you are running a "BCV" incremental establish. Check your device group list and make sure that it has not changed. Check your source and target device status using "symdev show SYMDEV".

If you are not sure, I would recommend that you log a call with EMC and fix this issue.

Regards
Sathish
Don't Think too much
dictum9
Super Advisor

Re: Which device is giving me this error?

sathish kannan et al,

You are right on the money... I will try to get a solid commitment out of EMC regarding this value.