05-09-2001 06:49 AM
scsi problems causing backup to fail; what's going on?
On May 4, our backup failed. One of the NFS-mounted NT stations was powered off, so I assumed that this was the problem.
On Monday morning I ran a backup manually. I got errors saying that the device output file for /dev/rmt/0m was bad (sorry, I don't have the exact error). I told it to keep going, without backing up the NFS mounts, and it worked fine. That night the regular backup (all files plus the NFS mounts) died again; the br_log shows an exit code of 2 for the backup command. I tried the backup again during the day, and it did not work: it would start with the output file errors and exit in the middle. I finally rebooted the server (thinking that the powered-off NT station had seriously confused it) and let the backup run. It appears to have run successfully last night.
The kicker is the syslog. It is peppered with SCSI errors. Every time I ran a backup over the last two days, these errors appeared. It looks like a new set is there from last night, even though that backup succeeded. I have attached the syslog entries and one of the mails to root from backup. I have no idea what these all mean. I assume that they appear when a backup fails and show why, but what about the ones for the successful backup (May 8)?
We had similar messages a while back, and HP replaced the SCSI card on the server. Could it be the RAID drive or controller? What about the server? What is backup doing to trigger this? I'd like an idea of which tree to bark up. Any ideas are welcome; I'm all out of my own. Thanks!
05-09-2001 06:53 AM
Was anything changed on your server? This does sound like a SCSI problem: duplicate SCSI address, bad terminator, ...
Good luck,
Thierry.
05-09-2001 07:01 AM
Definitely looks like SCSI problems; I would make sure that the bus is properly terminated on both ends. Did HP install the resistor packs in the controller (if the controller is at the end of the bus)? I once had a problem like this on a K-box, and it turned out that one of the terminators was bad, but replacing it didn't fix the problem: the bad terminator had blown the on-board term power fuse.
Also, are you anywhere near the maximum cable length?
My 2 cents, Clay
05-09-2001 07:09 AM
The only thing that was different on the server was the one NT station that got turned off while still mounted, so the server would look for it but not find it. It was powered back on Monday, and the server was rebooted Tuesday night. The NFS mount shouldn't be interacting with the RAID, but I could be wrong. Thanks for the ideas so far.
Mark
05-09-2001 07:13 AM
One other thought. You didn't mention the tape device type. If it's a DLT (and especially a DLT7000 or 8000, or an Ultrium), it shouldn't be on the same bus as the disks.
Clay
05-09-2001 07:24 AM
It is an HP DDS-3 DAT24 external drive.
05-09-2001 07:29 AM
Mark
05-09-2001 07:37 AM
If you have some space on two different disks, I would try putting some disk-to-disk traffic on the bus, e.g. copy a 500 MB lvol using dd or similar.
If this works, I think it has to be the tape (or the tape cable).
If you get faults during this copy as well, maybe even with the tape disconnected, it points more toward controller trouble or a disk going bad.
In addition, I would recommend installing the diagnostics and looking at the error logs of each disk. HP Support has to give you a password to log in to the tools (diagmon or so) and coach you through the menus, but it is fairly simple. I did it twice with HP on the phone.
Volker
05-09-2001 07:42 AM
To get clear information about the tape and the disks, can you run
ioscan -C tape
ioscan -C disk
and give us the output?
Thanks
Volker
05-09-2001 07:46 AM
# ioscan -C tape
H/W Path    Class  Description
============================================
8/16/5.3.0  tape   HP C1537A
# ioscan -C disk
H/W Path    Class  Description
============================================
8/0.0.0     disk   ARTECON LynxRAID
8/0.5.0     disk   SEAGATE ST34573WC
8/0.8.0     disk   SEAGATE ST34573WC
8/16/5.2.0  disk   TOSHIBA CD-ROM XM-5701TA
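A minimal Python sketch for reading the listing above, assuming the usual HP-UX convention that the last two dotted fields of a hardware path are the SCSI target and LUN, so everything before them identifies the bus. The name group_by_bus is made up for illustration; the device lines are copied from the ioscan output.

```python
from collections import defaultdict

# Device lines copied from the ioscan output above (header and rule lines omitted).
IOSCAN_LINES = """\
8/16/5.3.0 tape HP C1537A
8/0.0.0 disk ARTECON LynxRAID
8/0.5.0 disk SEAGATE ST34573WC
8/0.8.0 disk SEAGATE ST34573WC
8/16/5.2.0 disk TOSHIBA CD-ROM XM-5701TA
"""


def group_by_bus(text):
    """Map each bus path to a list of (target, lun, class, description)."""
    buses = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        path, dev_class, description = line.split(None, 2)
        # Assumption: the hardware path ends in ".<target>.<lun>".
        bus, target, lun = path.rsplit(".", 2)
        buses[bus].append((int(target), int(lun), dev_class, description))
    return dict(buses)


for bus, devices in sorted(group_by_bus(IOSCAN_LINES).items()):
    print(bus)
    for target, lun, dev_class, description in sorted(devices):
        print(f"  target {target} lun {lun}: {dev_class:5} {description}")
```

Run against the listing above, this puts the tape drive and the CD-ROM on 8/16/5 and the LynxRAID and both Seagate disks on 8/0.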
05-09-2001 08:01 AM
Most (all?) of the SCSI messages refer to cd013000, and the following line:
SCSI TAPE: dev = 0xcd013000 I/O error during close
identifies that device as the tape, so I think your disks and the corresponding bus are OK. The disk copy test I mentioned before (if you can run it) should go fine.
Replace the SCSI cable for the tape first (easiest and cheapest shot).
Try another tape if available.
Check the terminator on the tape.
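To check whether every message really points at the same device, the device numbers can be tallied with a short script. This is only a sketch in Python: the helper name count_devices is hypothetical, and the pattern is built from the single log line quoted above. Accepting a "dev: cd013000" style without the 0x prefix is an assumption about how other messages may be written.

```python
import re
from collections import Counter

# Matches the "dev = 0x..." field from the quoted line. The ":" separator and
# the optional "0x" prefix are assumptions about other message styles.
DEV_RE = re.compile(r"\bdev\s*[:=]\s*(?:0x)?([0-9a-fA-F]{8})")


def count_devices(lines):
    """Count how often each device number appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = DEV_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# The only log line quoted in this thread.
sample = ["SCSI TAPE: dev = 0xcd013000 I/O error during close"]
for dev, n in count_devices(sample).most_common():
    print(f"0x{dev}: {n} message(s)")
```

To run it on a real log, pass an open file instead of the sample list, for example open("/var/adm/syslog/syslog.log"), assuming that is where syslog writes on your system.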
Good hunting
Volker
05-09-2001 08:10 AM
If everything else checks out, since it appears that you are on maintenance, I would have the tape drive replaced. Regardless of the status of individual backup runs, you shouldn't be seeing all those syslog errors.
However, all of this could be termination. Don't overlook the internal termination. It's actually amazing how well SE SCSI does with no termination. It works just well enough to drive you crazy.
Clay
05-09-2001 12:36 PM
Generally the reported errors give a description and perhaps also the status of the device's CSR. In this case it looks like the device is reporting lots of parity errors, and it looks to me as though the device itself is fine. Please do the following:
1. Check the terminations on the bus and the devices (which should be fine, as I assume the system was working before).
2. Check for bent pins in the SCSI bus connectors, as they can cause lots of intermittent problems.
3. Finally, you can try changing the controller, which could be your solution.
Manoj Srivastava
05-09-2001 01:07 PM
The thing to note in the LBOLT messages is the device, in your case 'cd013000'. This breaks down as:
cd - major device number, 0xcd = 205 decimal. If you run lsdev you will see that 205 is stape, so it must be a SCSI tape drive.
01 - bus (controller number), c1
3 - SCSI target ID, t3
0 - LUN, d0
00 - the last 2 hex digits are device-driver-specific flags (they set things like no-rewind, density, and compression on a tape drive, but the same values might do completely different things on a disk drive; it depends on the driver)
So in your case we know it's a tape drive, c1t3d0.
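The breakdown above can be sketched as a few shifts and masks. This is a minimal Python illustration that assumes exactly the field layout described in this post (8-bit major, 8-bit bus, 4-bit target, 4-bit LUN, 8-bit flags); ScsiDev and decode_dev are names invented for the example, not part of any HP-UX tool.

```python
from typing import NamedTuple


class ScsiDev(NamedTuple):
    major: int   # driver major number (0xcd = 205, stape according to the post)
    bus: int     # controller instance, the "c" number
    target: int  # SCSI target ID, the "t" number
    lun: int     # logical unit, the "d" number
    flags: int   # driver-specific option bits


def decode_dev(dev: int) -> ScsiDev:
    """Split a 32-bit device number using the layout described above."""
    return ScsiDev(
        major=(dev >> 24) & 0xFF,
        bus=(dev >> 16) & 0xFF,
        target=(dev >> 12) & 0xF,
        lun=(dev >> 8) & 0xF,
        flags=dev & 0xFF,
    )


d = decode_dev(0xCD013000)
print(f"major {d.major}, c{d.bus}t{d.target}d{d.lun}, flags 0x{d.flags:02x}")
# prints: major 205, c1t3d0, flags 0x00
```

The flags byte is left undecoded on purpose, since its meaning depends on the driver.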
When that fails, we use the force.
Hope this helps a bit, Clay
05-10-2001 05:45 AM
Mark
05-10-2001 06:17 AM
Clearly it is a hardware error. Run stm on both the tape and the disks.
You said your tape drive is DDS-3... For this device, the tapes must be 125 m (DDS-3) tapes.
Try my program at:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,1150,0x298bee3e323bd5118fef0090279cd0f9,00.html
It gets full statistics from the DDS driver.
05-10-2001 07:41 AM
Now it looks as though you have two independent problems going on. Since the tape drive is on one bus and your RAID is on another, the problems are unrelated (unless there is a common thread such as the system board).
But all in all, your machine seems to be working too well for a system board failure.
I didn't see the type of RAID you are using. Do you have any monitoring software for it?