11-07-2005 01:55 AM
Disk Controller Problem
We are getting intermittent SCSI errors, and I am fairly confident it is a controller issue rather than a disk issue.
The problem causes logical volumes to become unavailable or corrupted. We recently hot-swapped 3 new disks, and the problem has appeared since we started using them.
We have 2 internal disks in our server and 7 disks sitting in an external array. The problem seems to be in the external box.
We have powered down the server and disk array, but the errors still come back after a while.
grep -i scsi syslog.log
Nov 7 11:02:38 hpdev vmunix: c8xx BUS: 4 SCSI C1010 Ultra160 Wide LVD A6828-60101 assigned CPU: -1
Nov 7 11:02:38 hpdev vmunix: c8xx BUS: 5 SCSI C1010 Ultra160 Wide LVD A6828-60101 assigned CPU: -1
Nov 7 11:02:38 hpdev vmunix: c8xx BUS: 6 SCSI C1010 Ultra160 Wide LVD A6828-60101 assigned CPU: -1
Nov 7 11:02:38 hpdev vmunix: c8xx BUS: 7 SCSI C1010 Ultra160 Wide LVD A6828-60101 assigned CPU: -1
Nov 7 11:24:18 hpdev vmunix: SCSI: Unexpected Disconnect -- lbolt: 143986, dev: 1f074000, io_id: 708ccb9
Nov 7 13:10:04 hpdev vmunix: SCSI Gross Error on 0/12/0/0:
Nov 7 13:10:04 hpdev vmunix: SCSI: isrEscape Controller at 0/12/0/0.
Nov 7 13:10:04 hpdev vmunix: SCSI: -- lbolt: 778514, dev: cb07f002
Nov 7 13:10:05 hpdev vmunix: SCSI: Resetting SCSI -- lbolt: 778614, bus: 7 path: 0/12/0/0
Nov 7 13:10:05 hpdev vmunix: SCSI: Reset detected -- lbolt: 778614, bus: 7 path: 0/12/0/0
Nov 7 13:10:08 hpdev vmunix: SCSI: Reset detected -- path: 0/12/0/0
Nov 7 13:10:08 hpdev vmunix: SCSI: -- lbolt: 778914, bus: 7
Nov 7 13:10:08 hpdev vmunix: SCSI: Ultra160 Controller at 0/12/0/0: Error: The domain validation test for target 15 determined that communication may not be possible to this target. Verify the hardware configuration.
Nov 7 13:10:08 hpdev vmunix: SCSI: Ultra160 Controller at 0/12/0/0: Error: The domain validation test for target 1 determined that communication may not be possible to this target. Verify the hardware configuration.
hpdev-root$:ioscan -fnC ctl
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
ctl 0 0/0/1/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c0t7d0
ctl 1 0/0/1/1.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c1t7d0
ctl 2 0/0/2/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c2t7d0
ctl 3 0/0/2/1.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c3t7d0
ctl 4 0/8/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c4t7d0
ctl 5 0/9/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c5t7d0
ctl 6 0/10/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c6t7d0
ctl 7 0/12/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c7t7d0
ctl 9 0/12/0/0.15.0 sctl CLAIMED DEVICE HP A6491A
/dev/rscsi/c7t15d0
hpdev-root$:ioscan -fnCdisk
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
disk 5 0/0/1/1.2.0 sdisk CLAIMED DEVICE HP 36.4GMAN3367MC
/dev/dsk/c1t2d0 /dev/rdsk/c1t2d0
disk 6 0/0/2/0.2.0 sdisk CLAIMED DEVICE HP 36.4GMAN3367MC
/dev/dsk/c2t2d0 /dev/rdsk/c2t2d0
disk 0 0/0/2/1.2.0 sdisk CLAIMED DEVICE HP DVD-ROM 305
/dev/dsk/c3t2d0 /dev/rdsk/c3t2d0
disk 3 0/12/0/0.0.0 sdisk CLAIMED DEVICE HP 73.4GST373405LC
/dev/dsk/c7t0d0 /dev/rdsk/c7t0d0
disk 4 0/12/0/0.1.0 sdisk CLAIMED DEVICE HP 73.4GATLAS10K3_73_SCA
/dev/dsk/c7t1d0 /dev/rdsk/c7t1d0
disk 7 0/12/0/0.2.0 sdisk CLAIMED DEVICE HP 73.4GATLAS10K3_73_SCA
/dev/dsk/c7t2d0 /dev/rdsk/c7t2d0
disk 8 0/12/0/0.3.0 sdisk CLAIMED DEVICE HP 73.4GST373405LC
/dev/dsk/c7t3d0 /dev/rdsk/c7t3d0
disk 9 0/12/0/0.4.0 sdisk CLAIMED DEVICE MAXTOR ATLAS10K5_73SCA
/dev/dsk/c7t4d0 /dev/rdsk/c7t4d0
disk 10 0/12/0/0.5.0 sdisk CLAIMED DEVICE MAXTOR ATLAS10K5_73SCA
/dev/dsk/c7t5d0 /dev/rdsk/c7t5d0
disk 11 0/12/0/0.6.0 sdisk CLAIMED DEVICE MAXTOR ATLAS10K5_73SCA
/dev/dsk/c7t6d0 /dev/rdsk/c7t6d0
Any ideas?
Thanks
Mike
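The two outputs above can be cross-referenced mechanically. The following is a minimal Python sketch, not a vendor tool: it pulls the targets that failed domain validation out of syslog lines and looks each one up in `ioscan -fn` output. The function names and the regular expressions are assumptions written against the line formats pasted in this thread; other HP-UX releases may format these lines differently.

```python
import re

# Matches the domain-validation error lines as they appear in the syslog excerpt.
DV_RE = re.compile(
    r"Controller at (?P<path>[\d/]+): Error: The domain validation test "
    r"for target (?P<target>\d+)"
)

# Matches the first line of an ioscan -fn entry: class, instance, H/W path,
# driver, S/W state, H/W type, then a free-form description.
IOSCAN_RE = re.compile(
    r"^(?P<cls>\w+)\s+(?P<inst>\d+)\s+(?P<hw>[\d/.]+)\s+(?P<driver>\w+)\s+"
    r"(?P<state>\w+)\s+(?P<type>\w+)\s+(?P<desc>.+)$"
)


def failed_targets(syslog_lines):
    """Return sorted (controller_path, target) pairs that failed domain validation."""
    hits = set()
    for line in syslog_lines:
        m = DV_RE.search(line)
        if m:
            hits.add((m.group("path"), int(m.group("target"))))
    return sorted(hits)


def ioscan_index(ioscan_lines):
    """Map (controller_path, target) -> (driver, description) from ioscan -fn output."""
    index = {}
    for line in ioscan_lines:
        m = IOSCAN_RE.match(line.strip())
        if not m:
            continue  # header, separator, or device-file line
        parts = m.group("hw").split(".")
        if len(parts) != 3:
            continue  # not a path.target.lun style hardware path
        path, target, _lun = parts
        index[(path, int(target))] = (m.group("driver"), m.group("desc").strip())
    return index


def describe_failures(syslog_lines, ioscan_lines):
    """Return one readable line per failed target, with the ioscan entry if known."""
    index = ioscan_index(ioscan_lines)
    report = []
    for path, target in failed_targets(syslog_lines):
        driver, desc = index.get((path, target), ("?", "not found in ioscan output"))
        report.append(f"{path} target {target}: {driver} {desc}")
    return report
```

Fed the excerpts from this post, target 1 on 0/12/0/0 resolves to the sdisk entry for c7t1d0 and target 15 resolves to the sctl entry described as HP A6491A, so both flagged targets sit on the external bus.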
11-07-2005 02:06 AM
Re: Disk Controller Problem
1) One of the new disks is bad. I suggest running an xstm or mstm check on the hardware.
2) The system has not been rebooted since the hot swap. A reboot makes simple lbolt messages caused by disk swapping go away.
3) An improper LVM procedure was used when you switched the disks.
I'm sure there are other possibilities. Was pvcreate used on the new disks? Were they temporarily excluded from the volume group to do this?
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
11-07-2005 02:12 AM
Re: Disk Controller Problem
Decoding dev: cb07f002 from your syslog:
cb == Major device 203
07 == c7
f == t15
0 == d0
02 == driver specific, if 203 is sctl then 02 means inhibit inquiry on open.
Do an lsdev to make sure that major device 203 on your box is sctl.
You didn't mention why you replaced the disks, so the problem may be older than you indicate. There is not enough data to nail this down. Is your external array some kind of smart array, or is it simply a JBOD?
Have you ensured that your total bus length (including internal connections) does not exceed (or closely approach) the maximum bus length? Is at least one device on the bus supplying termination power? Is the bus terminated in EXACTLY two places, at the physical ends of the bus? Did you possibly leave the terminator jumpers/switches "ON" on one or more of the new drives? Before jumping to the conclusion that you have a bad controller, make certain that your termination is okay. Replace the terminators, because poor termination can cause exactly this sort of problem: a SCSI bus that almost works perfectly.
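The dev field breakdown at the top of this reply can be sketched in a few lines of Python. This is only an illustration of the bit layout described there (8-bit major, 8-bit card instance, 4-bit target, 4-bit LUN, 8 driver-specific bits); the function and key names are made up for the example, and the layout is assumed to hold for whatever driver owns the major number.

```python
def decode_dev(dev_hex):
    """Split a 32-bit dev value from an HP-UX SCSI syslog line into its fields.

    Layout assumed from the breakdown in this reply:
    major (8 bits) | card (8 bits) | target (4 bits) | LUN (4 bits) | driver bits (8 bits)
    """
    dev = int(dev_hex, 16)
    return {
        "major": (dev >> 24) & 0xFF,
        "card": (dev >> 16) & 0xFF,
        "target": (dev >> 12) & 0xF,
        "lun": (dev >> 8) & 0xF,
        "driver_bits": dev & 0xFF,
    }


def device_name(fields):
    """Build the cXtYdZ device name from decoded fields."""
    return "c{card}t{target}d{lun}".format(**fields)


fields = decode_dev("cb07f002")
print(fields["major"], device_name(fields), hex(fields["driver_bits"]))
```

Under the same assumed layout, the dev: 1f074000 value on the Unexpected Disconnect line would decode to major 31 and c7t4d0. Running lsdev against that major number would show which driver it belongs to before reading too much into the result.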
11-07-2005 02:12 AM
Re: Disk Controller Problem
Check whether patch PHKL_32089 (11.11 SCSI Ultra160 Cumulative Patch) is installed.
http://www4.itrc.hp.com/service/patch/patchDetail.do?BC=patch.breadcrumb.main|patch.breadcrumb.search|&patchid=PHKL_32089&context=hpux:800:11:11
Regards,
Robert-Jan
11-07-2005 02:23 AM
Re: Disk Controller Problem
This looks like a disk failure that is causing the SCSI lbolt errors.
I advise you to log a call with HP if the system is under contract. You can send the syslog and event logs to the HP solution center to decode the error.
Regards,
Sunil
11-07-2005 03:20 AM
Re: Disk Controller Problem
Thanks for the speedy replies. Here's some more clarification.
Firstly, the external box is an HP 2300 Disk System A6491A storage cabinet. It has 14 disk bays, which appear to be divided into 2 sets of 7.
There are 2x2 SCSI ports at the back of the enclosure. There are 2 SCSI cables, one from each side, going into 2 different servers (including the server in question). The other 2 ports are terminated.
We usually have the 2 servers up, reading from the one disk system. The other server was also complaining of SCSI errors. I shut down this other server to simplify troubleshooting.
We haven't actually replaced disks; we are adding extra disks. We originally had 4 disks in each side and have now added 3 new disks to both sides.
The LVM configuration is pretty simple for these 3 disks. Each disk is in a separate, standalone volume group.
Here's the output of lsdev:
hpdev-root$:lsdev 203
Character Block Driver Class
203 -1 sctl ctl
We do not have PHKL_32089 installed. I will certainly investigate it.
We have just powered down again and checked the SCSI connectors, reseated the disks, etc.
As the problem is intermittent, we are now trying to reproduce it using dd on the lvols, prealloc to create huge files, and the stm "exercise" option. I presume this stress testing is good enough?
Regards
Mike
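The dd part of the stress test described above amounts to repeated sequential reads. Below is a minimal Python sketch of that read loop, roughly what `dd if=<device> of=/dev/null` does, repeated for several passes so an intermittent fault has more chances to show up. The function names, block size, and pass count are assumptions for illustration. It only reads, so it does not cover the prealloc write load or the stm exerciser.

```python
def read_pass(path, block_size=1024 * 1024):
    """Read path sequentially to the end, discarding the data.

    Returns (bytes_read, error), where error is None on success or the
    OSError that stopped the read.
    """
    total = 0
    try:
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:
                    break  # end of file or device
                total += len(chunk)
    except OSError as exc:
        return total, exc
    return total, None


def soak(path, passes=3):
    """Run up to `passes` read passes, stopping at the first I/O error."""
    results = []
    for _ in range(passes):
        result = read_pass(path)
        results.append(result)
        if result[1] is not None:
            break
    return results
```

A hypothetical call would be `soak("/dev/vgXX/rlvolN", passes=10)`, with the placeholder replaced by a real raw lvol path, followed by another grep of syslog.log for new SCSI messages.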