SCSI error, performance related?
06-07-2007 02:12 AM
We have a D380 that we look after hardware-wise, but we are not sure about the application running on it (an Alcatel/Lucent application with an Informix DB).
Recently we got a disk error, where the disk was showing as NO_HW in an ioscan. The disk is the only disk in vg02, which looks like a temporary dump area for the application.
We replaced the disk, and all looked OK for about a day, when the same error occurred. We have since replaced the machine with another D class; again, it ran for about a day before giving the same error.
Is it possible that the application is writing far more than the SCSI bus / disk can cope with? Are there some kernel parameters I can look at?
A sar on the disk is showing very high %wio (99%).
Any ideas appreciated.
Thanks
Mark
06-07-2007 02:15 AM
Re: SCSI error, performance related?
Can you post the actual errors you are getting?
HTH
Duncan
I am an HPE Employee

06-07-2007 02:22 AM
Re: SCSI error, performance related?
After about 18 hours of uptime, the syslog shows:
Jun 6 18:12:28 lucluc20 vmunix: SCSI: Request Timeout -- lbolt: 6229688, dev: 1f009000
Jun 6 18:12:28 lucluc20 vmunix: lbp->state: 20
Jun 6 18:12:28 lucluc20 vmunix: lbp->offset: ffffffff
Jun 6 18:12:28 lucluc20 vmunix: lbp->uPhysScript: 480000
Jun 6 18:12:28 lucluc20 vmunix: From most recent interrupt:
Jun 6 18:12:28 lucluc20 vmunix: ISTAT: 22, SIST0: 00, SIST1: 04, DSTAT: 00, DSPS: 00480580
Jun 6 18:12:28 lucluc20 vmunix: lsp: 000000004281c600
Jun 6 18:12:28 lucluc20 vmunix: bp->b_dev: 1f009000
Jun 6 18:12:28 lucluc20 vmunix: scb->io_id: 7e895
Jun 6 18:12:28 lucluc20 vmunix: scb->cdb: 2a 00 00 1f 48 40 00 00 10 00
Jun 6 18:12:28 lucluc20 vmunix: lbolt_at_timeout: 6226560, lbolt_at_start: 6226560
Jun 6 18:12:28 lucluc20 vmunix: lsp->state: 10d
Jun 6 18:12:28 lucluc20 vmunix: lbp->owner: 000000004281c600
Jun 6 18:12:28 lucluc20 vmunix: scratch_lsp: 0000000000000000
Jun 6 18:12:28 lucluc20 vmunix: Pre-DSP script dump [0000000041001020]:
Jun 6 18:12:28 lucluc20 vmunix: fbf44810 004807c8 41090000 00480290
Jun 6 18:12:28 lucluc20 vmunix: 78347200 0000000a 78350800 00000000
Jun 6 18:12:28 lucluc20 vmunix: Script dump [0000000041001040]:
Jun 6 18:12:28 lucluc20 vmunix: 0e000004 00480580 80000000 00000000
Jun 6 18:12:28 lucluc20 vmunix: 870b0000 004802d8 0a000000 00480588
Jun 6 18:12:28 lucluc20 vmunix: SCSI: Abort abandoned -- lbolt: 6229688, dev: 1f009000, io_id: 7e895, status: 200
Jun 6 18:12:28 lucluc20 vmunix:
Jun 6 18:12:28 lucluc20 vmunix: SCSI: Read error -- dev: b 31 0x009000, errno: 126, resid: 2048,
Jun 6 18:12:28 lucluc20 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Jun 6 18:12:28 lucluc20 vmunix: LVM: vg[2]: pvnum=0 (dev_t=0x1f009000) is POWERFAILED
AND
We are continually getting SCSI read errors like these:
Jun 7 15:08:37 lucluc20 vmunix:
Jun 7 15:08:37 lucluc20 vmunix: SCSI: Read error -- dev: b 31 0x009000, errno: 126, resid: 2048,
Jun 7 15:15:51 lucluc20 vmunix:
Jun 7 15:16:01 lucluc20 above message repeats 42 times
Jun 7 15:15:51 lucluc20 vmunix: SCSI: Read error -- dev: b 31 0x009000, errno: 126, resid: 2048,
Jun 7 15:16:01 lucluc20 above message repeats 42 times
Jun 7 15:15:51 lucluc20 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Jun 7 15:16:01 lucluc20 above message repeats 115 times
Jun 7 15:16:01 lucluc20 vmunix:
Jun 7 15:16:01 lucluc20 vmunix: SCSI: Read error -- dev: b 31 0x009000, errno: 126, resid: 2048,
Jun 7 15:16:01 lucluc20 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
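For anyone reading the timeout record above, a few of its fields can be decoded by hand. The following is a minimal Python sketch, not anything produced by HP-UX itself. It assumes that lbolt is a tick counter advancing 100 times per second (the HZ value is an assumption, not stated in the thread) and that the scb->cdb bytes follow the standard 10-byte SCSI command layout. The function names are made up for illustration.

```python
# Values are copied from the 18:12:28 "Request Timeout" record in the syslog.
HZ = 100  # assumption: lbolt ticks per second


def ticks_to_seconds(ticks, hz=HZ):
    """Convert a count of lbolt ticks to seconds."""
    return ticks / hz


def decode_cdb10(cdb_hex):
    """Decode a 10-byte SCSI CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    return {
        "opcode": opcodes.get(cdb[0], hex(cdb[0])),
        # Bytes 2-5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7-8: transfer length in blocks, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


if __name__ == "__main__":
    lbolt_now = 6229688       # "lbolt:" in the timeout line
    lbolt_at_start = 6226560  # "lbolt_at_start:" in the same record

    print("uptime (hours):", ticks_to_seconds(lbolt_now) / 3600)
    print("request outstanding (s):", ticks_to_seconds(lbolt_now - lbolt_at_start))
    print(decode_cdb10("2a 00 00 1f 48 40 00 00 10 00"))
```

Under those assumptions, lbolt 6229688 works out to roughly 17.3 hours, which agrees with the 18 hours of uptime mentioned. The request had been outstanding for about 31 seconds when it timed out, and the CDB decodes as a WRITE(10) of 16 blocks starting at logical block 2050112.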
06-07-2007 02:42 AM
Re: SCSI error, performance related?
In any event, you are having problems with /dev/rdsk/c0t9d0. If this is an internal disk, it's the second from bottom of a D3xx.
The fact that a disk is nearly 100% busy should be invisible to the application; it simply issues a read() or write() system call and waits until that operation is completed --- and the disk being busy could easily be a symptom of a failing disk.
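The device name can be read off the dev number in the syslog entries (dev: 1f009000, also printed as b 31 0x009000). A minimal Python sketch of that decoding follows. It assumes the conventional layout for legacy HP-UX SCSI disk device files, with the major number in the top 8 bits and the minor number holding the card instance, target and LUN as 0xIITL00. That layout is an assumption here rather than something stated in the thread, and decode_dev is a hypothetical helper name.

```python
def decode_dev(dev_hex):
    """Split a dev number such as '1f009000' into (major, 'c#t#d#').

    Assumed layout: 8-bit major, then a 24-bit minor of the form 0xIITL00,
    where II = interface card instance, T = SCSI target, L = LUN.
    """
    dev = int(dev_hex, 16)
    major = dev >> 24                  # top 8 bits
    minor = dev & 0xFFFFFF             # low 24 bits
    instance = (minor >> 16) & 0xFF    # card instance -> c#
    target = (minor >> 12) & 0xF       # SCSI target ID -> t#
    lun = (minor >> 8) & 0xF           # LUN -> d#
    return major, f"c{instance}t{target}d{lun}"


if __name__ == "__main__":
    major, name = decode_dev("1f009000")
    print(major, name)
```

For 1f009000 this gives major 31 and c0t9d0, matching the "b 31 0x009000" form in the read-error lines.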
06-07-2007 03:00 AM
Re: SCSI error, performance related?
The disk was replaced, and the volume etc re-created.
Our customer is now telling us to transfer everything to an L class, but I'm not convinced it is a hardware issue. Why does it work for the best part of a day after a reboot?
06-07-2007 03:25 AM
Solution
06-07-2007 03:35 AM
Re: SCSI error, performance related?
Apparently HP changed the disk shortly before we took over maintenance; they will not tell us how long HP's new disk lasted.
We have also changed the disk.
06-07-2007 03:36 AM
Re: SCSI error, performance related?
06-07-2007 04:56 AM
Re: SCSI error, performance related?
I have a couple of D boxes in my home. Kinda strange, but I learned things from them.
One thing I learned is that the D class boxes route power and bandwidth to the disk through something called a drive cage.
One of my D boxes had a problem around the turn of the century. Disks just kept going bad. It was eating disks like popcorn, every couple of weeks.
Until I got HP to replace the drive cage, disks kept going bad. A few of those disks turned out not to have actually gone bad; I kept one and tested it after the drive cage was replaced.
Sometimes you have to argue with hardware to get them to replace this part, but it's worth looking into.
These D systems were Oracle development servers for a number of years. They were severely stressed in both CPU and I/O. They never lost a disk after the drive cage replacement, in spite of being pushed very, very hard for a number of years.
So no, I don't believe performance issues can cause disks to fail.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
06-07-2007 09:25 AM
Re: SCSI error, performance related?
The disks were changed twice and showed the same error, so that's three different disks in the first D class. Then all the current internal disks from the system were put into a completely different D class.
Would the fact that the disk has been in a damaged D class mean it could stop working in another D class?
Strange one, this!
I'm off work now until Tuesday, but I'm sure one of the other lads will check this.
Thanks
Mark
06-07-2007 09:53 AM
Re: SCSI error, performance related?
If I were you, I would carefully check cooling and power supply voltages, but I'm betting on poor cooling. I suspect that you have actually fixed the problem by moving to the second D-box, and now you are simply dealing with the artifacts of the drives having previously been operated in a harsh environment.
Of course, the newest HVD SCSI drives are probably 8 or 9 years old now --- so what do you expect?
06-07-2007 09:28 PM
Re: SCSI error, performance related?
Just to add some more details to this fault: the original configuration had vg02 containing two disks, one internal (c0t9d0) and one external (c3t2d0).
The D380 always failed in the same way: first the external disk would fail with the syslog messages that have already been posted, then several hours later the internal disk would fail with the same messages.
Both disks have been swapped out three times, and the entire D380 has also been swapped out. I have used STM to exercise the disks, CPU and memory of the original D class, and everything tests OK.
To try to narrow the fault to a single disk we removed the external disk; however, it made no difference. We then removed the entire external array from the config.
We know little of what the application does; however, the D380 is running 11i version 1.
Are there any patches that might cause this?
Thanks for your help
steve
06-13-2007 09:26 PM
Re: SCSI error, performance related?
Mark