Re: Error message

 
Adithyan
Frequent Advisor

Error message

Hi,

I get the error messages below on the server very frequently. Can anyone help me fix this?


Sep 3 11:39:38 uapkb025 vmunix: SCSI: Request Timeout -- lbolt: 352891964, dev: 1f00c000
Sep 3 11:39:38 uapkb025 vmunix: lbp->state: 4020
Sep 3 11:39:38 uapkb025 vmunix: lbp->offset: ffffffff
Sep 3 11:39:38 uapkb025 vmunix: lbp->uPhysScript: 380000
Sep 3 11:39:38 uapkb025 vmunix: From most recent interrupt:
Sep 3 11:39:38 uapkb025 vmunix: ISTAT: 22, SIST0: 04, SIST1: 00, DSTAT: 80, DSPS: 00000006
Sep 3 11:39:38 uapkb025 vmunix: NCR chip register access history (most recent last): 297 accesses
Sep 3 11:39:38 uapkb025 vmunix: 301549, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 43, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 0, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 0, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 16060, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 0, ISTAT<-20
Sep 3 11:39:38 uapkb025 vmunix: 1774164, ISTAT: 22
Sep 3 11:39:38 uapkb025 vmunix: 4, SIST0: 04
Sep 3 11:39:38 uapkb025 vmunix: 2, SIST1: 00
Sep 3 11:39:38 uapkb025 vmunix: 2, DSTAT: 80
Sep 3 11:39:38 uapkb025 vmunix: 1, DSPS: 00000006
Sep 3 11:39:38 uapkb025 vmunix: 1, SCRATCHA: ff000867
Sep 3 11:39:38 uapkb025 vmunix: 3, DSP: 00380058
Sep 3 11:39:38 uapkb025 vmunix: 2, SCRATCHA1<-00
Sep 3 11:39:38 uapkb025 vmunix: 1, CTEST3<-04
Sep 3 11:39:38 uapkb025 vmunix: 0, STEST3<-82
Sep 3 11:39:38 uapkb025 vmunix: lsp: 5806880
Sep 3 11:39:38 uapkb025 vmunix: bp->b_dev: 1f00c000
Sep 3 11:39:38 uapkb025 vmunix: scb->io_id: d2fe2
Sep 3 11:39:38 uapkb025 vmunix: scb->cdb: 28 00 00 00 00 10 00 00 04 00
Sep 3 11:39:38 uapkb025 vmunix: lbolt_at_timeout: 352888864, lbolt_at_start: 352888864
Sep 3 11:39:38 uapkb025 vmunix: lsp->state: 10d
Sep 3 11:39:38 uapkb025 vmunix: lbp->owner: 5806880
Sep 3 11:39:38 uapkb025 vmunix: scratch_lsp: 0
Sep 3 11:39:38 uapkb025 vmunix: Pre-DSP script dump [576b030]:
Sep 3 11:39:38 uapkb025 vmunix: 78346700 0000000a 78350800 00000000
Sep 3 11:39:38 uapkb025 vmunix: 0e000004 003803c0 80000000 00000000
Sep 3 11:39:38 uapkb025 vmunix: Script dump [576b050]:
Sep 3 11:39:38 uapkb025 vmunix: 9f0b0000 00000006 0a000000 003803c8
Sep 3 11:39:38 uapkb025 vmunix: 721a0000 00000000 c0000004 0038035c
Sep 3 11:39:38 uapkb025 vmunix:

Thanks
Adithyan
Keen to learn HP UX
11 REPLIES
Marco A.
Esteemed Contributor

Re: Error message

Hello,

Do you have Ultrium drives attached there?

Regards,

Marco
Just unplug and plug in again ....
Steven E. Protter
Exalted Contributor

Re: Error message

Shalom Adithyan,

Appears to be a standard lbolt, as in bad disk.

b_dev: 1f00c000

That is the device.

It can be caused by a timeout, and the disk can continue to work. It can also be caused by swapping out a hot-swap disk. In that case it would be corrected by a reboot.

Most likely, however, a disk or part of a disk subsystem has failed, and it will be necessary to replace the disk in the near future.

cstm, mstm or xstm can be used to test the disk.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Fabian Briseño
Esteemed Contributor

Re: Error message

Hello adithyan.

How frequently?

That indicates a SCSI reset; hardware in your machine is failing.

Run
ioscan -Fn | more

to see if there is any hardware with state NO_HW.
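If the listing is long, the check can be scripted. The following is only a rough sketch: it assumes the whitespace-separated column layout printed by `ioscan -fn` (not the colon-separated `-F` form), assumes every column is populated on each hardware line, and the function name `find_suspect_devices` is made up for illustration.

```python
def find_suspect_devices(ioscan_text):
    """Return (hw_path, sw_state, description) for entries not in state CLAIMED.

    Assumes `ioscan -fn` style lines:
    Class  I  H/W Path  Driver  S/W State  H/W Type  Description
    """
    suspects = []
    for line in ioscan_text.splitlines():
        parts = line.split(None, 6)
        # Skip the header, the ===== ruler and the /dev/... device file lines.
        if len(parts) < 6 or not parts[1].isdigit():
            continue
        hw_path, sw_state = parts[2], parts[4]
        description = parts[6] if len(parts) > 6 else ""
        if sw_state != "CLAIMED":
            suspects.append((hw_path, sw_state, description))
    return suspects


if __name__ == "__main__":
    import sys
    # Example: ioscan -fn > /tmp/ioscan.out ; python this_script.py < /tmp/ioscan.out
    for hw_path, sw_state, description in find_suspect_devices(sys.stdin.read()):
        print(hw_path, sw_state, description)
```

An empty result only means every parsed entry is CLAIMED; it does not prove the hardware is healthy.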
Knowledge is power.
Marco A.
Esteemed Contributor

Re: Error message

When does that happen? While running fbackup, etc.?

You can try patching the SCSI subsystem too. For example, PHKL_21607 (s700_800 11.00 SCSI IO Subsystem Cumulative Patch) has the fix for similar issues on 11.0; the logging could be reduced after patching the server.

Let us know your results,

Regards,

Marco
Just unplug and plug in again ....
Adithyan
Frequent Advisor

Re: Error message

Hi Steven,

# ioscan -fnC disk
Class   I  H/W Path    Driver  S/W State  H/W Type  Description
============================================================================
disk    3  8/4.8.0     sdisk   CLAIMED    DEVICE    SEAGATE ST34573WC
                       /dev/dsk/c0t8d0   /dev/rdsk/c0t8d0
disk    4  8/4.9.0     sdisk   CLAIMED    DEVICE    SEAGATE ST34573WC
                       /dev/dsk/c0t9d0   /dev/rdsk/c0t9d0
disk    5  8/4.10.0    sdisk   CLAIMED    DEVICE    SEAGATE ST34573WC
                       /dev/dsk/c0t10d0   /dev/rdsk/c0t10d0
disk    6  8/4.11.0    sdisk   CLAIMED    DEVICE    SEAGATE ST34572WC
                       /dev/dsk/c0t11d0   /dev/rdsk/c0t11d0
disk    7  8/4.12.0    sdisk   CLAIMED    DEVICE    IBM DGHS09Y
                       /dev/dsk/c0t12d0   /dev/rdsk/c0t12d0
disk    8  8/4.13.0    sdisk   CLAIMED    DEVICE    IBM DGHS09Y
                       /dev/dsk/c0t13d0   /dev/rdsk/c0t13d0
disk    9  8/4.14.0    sdisk   CLAIMED    DEVICE    IBM DGHS09Y
                       /dev/dsk/c0t14d0   /dev/rdsk/c0t14d0
disk   10  8/4.15.0    sdisk   CLAIMED    DEVICE    IBM DGHS09Y
                       /dev/dsk/c0t15d0   /dev/rdsk/c0t15d0
disk    0  8/16/5.2.0  sdisk   CLAIMED    DEVICE    TOSHIBA CD-ROM XM-5701TA
                       /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
disk    1  8/16/5.5.0  sdisk   CLAIMED    DEVICE    SEAGATE ST34573N
                       /dev/dsk/c1t5d0   /dev/rdsk/c1t5d0
disk    2  8/16/5.6.0  sdisk   CLAIMED    DEVICE    SEAGATE ST34573N
                       /dev/dsk/c1t6d0   /dev/rdsk/c1t6d0

Are cstm, mstm and xstm third-party tools or built in? Please give a bit more detail. Thanks.
Keen to learn HP UX
Adithyan
Frequent Advisor

Re: Error message

What is the easiest way to find the faulty disk among these physical disks? The OS is HP-UX 10.20 and the server has been up for the last 530 days, so I can't consider a reboot.
Keen to learn HP UX
Marco A.
Esteemed Contributor

Re: Error message

Hello,

A known problem with large buffer cache configurations can cause the operating system to hold a spinlock for a long time (several seconds), which can cause interrupts to be missed by the SCSI driver.

There is a patch available to prevent the buffer cache issue, which led to the SCSI timeout events being logged.

You can search for the patch called "Filesystem buffer cache performance fix"

This patch introduces a new kernel tunable (bcvmap_size_factor) which can be used to tune the buffer cache.

Rgds,

Marco
Just unplug and plug in again ....
Fabian Briseño
Esteemed Contributor

Re: Error message

For 10.20 patches try this link.

This could be patch related.

ftp://ftp.itrc.hp.com/archived_patches/

Your ioscan output tells me that this is probably not disk related, unless you have experienced a slow system or system hangs when you try to access a particular directory.
Knowledge is power.
Amit Parui
Valued Contributor

Re: Error message

Hi Adithyan,

Since you are getting lbolt errors, I suggest you try out pvchange -t
If Life gives u a ROCK, its upto u to build a BRIDGE or a WALL !!!
Andrew Merritt_2
Honored Contributor

Re: Error message

One minor point of clarification of some of the comments. What you're getting is a "SCSI Request Timeout".

It is NOT an 'lbolt' or 'lbolt error'. The lbolt value is just a timestamp, not an indication of any error at all.
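To make the "timestamp" point concrete, here is a small sketch that treats the lbolt values from the posted log as clock ticks. The tick rate of 100 per second is an assumption on my part, not something the log states, and `elapsed_seconds` is just an illustrative name.

```python
HZ = 100  # assumed clock ticks per second; not stated anywhere in the log


def elapsed_seconds(lbolt_reported, lbolt_at_start, hz=HZ):
    """Seconds between the start of the request and the timeout report."""
    return (lbolt_reported - lbolt_at_start) / hz


if __name__ == "__main__":
    # Values copied from the "Request Timeout" line and the lbolt_at_start field.
    print(elapsed_seconds(352891964, 352888864))
```

Under that assumption the two values are 3100 ticks apart, i.e. about 31 seconds between the request starting and the timeout being logged. The numbers say when something happened, not what went wrong.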

Andrew
Rita C Workman
Honored Contributor

Re: Error message

Your timeout is on 1f00c000. If memory serves me that equates to c0t12d0.
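For anyone who wants to check that mapping rather than trust memory, a minimal sketch follows. It assumes the usual layout for these disk device numbers: top 8 bits major number, then 8 bits card instance, 4 bits target, 4 bits LUN, and 8 bits of driver options. The function names are hypothetical.

```python
def decode_dev(dev_hex):
    """Split a 32-bit device number (hex string) into its assumed fields."""
    dev = int(dev_hex, 16)
    return {
        "major": (dev >> 24) & 0xFF,     # driver major number
        "instance": (dev >> 16) & 0xFF,  # interface card instance (the "c" number)
        "target": (dev >> 12) & 0xF,     # SCSI target ID (the "t" number)
        "lun": (dev >> 8) & 0xF,         # SCSI LUN (the "d" number)
        "options": dev & 0xFF,           # driver-specific option bits
    }


def device_file(dev_hex):
    """Build the block device file name under the layout assumed above."""
    f = decode_dev(dev_hex)
    return "/dev/dsk/c{instance}t{target}d{lun}".format(**f)


if __name__ == "__main__":
    print(device_file("1f00c000"))  # /dev/dsk/c0t12d0
```

With 1f00c000 the target nibble is hex c, which is 12, so under this layout the device is c0t12d0.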

You have a disk like that. See what VG it belongs to and make sure you're not seeing any stale extents.

But as was mentioned it could be a simple timeout. So first try to change the timeout on the disk:
pvchange -t 60 /dev/dsk/c0t12d0
(changes the timeout to 60 seconds, or whatever you want to set it to)

If you continue to see issues, then the disk may be going flaky and need replacing.

Rgrds,
Rita