
Re: parity error

 
Russ Givens_2
Occasional Contributor

parity error

I've been getting an error on my ML530 for months now and I was wondering if anybody had any insight on getting some resolution. I have an external 160/320 SDLT1 attached to the ML530. I keep getting this error: A parity error was detected on \device\scsi\cpq32fs22. I haven't been able to get consistent, reliable backups because of this error.

Here is what has been done. The first time this happened I ran Library and Tape Tools and it identified the device as not working. HP sent me a new tape drive. I installed the tape drive and backups worked for about 3 to 4 months without error. It happened again about a month ago, and this time they replaced the drive, cable, and terminator. I also updated the controller driver with one that HP forwarded to me. It worked again for about two weeks without error, then the error reappeared. This time they came out and supposedly replaced the system board, because the SCSI controller is built in. It worked fine for about two weeks again, and now the errors have reappeared and I am unable to get a good backup.

I would love to find out what the deal is with this error. If anybody has any insight it would be much appreciated. Thank you
4 REPLIES
Marino Meloni_1
Honored Contributor

Re: parity error

It looks like you have a bad tape. The problem appears after a few weeks: one tape can contaminate the drive head, and then you will have trouble from that tape onward.
I would suggest running the Media Validation test on the involved tapes via L&TT.
marino
Curtis Ballard
Honored Contributor

Re: parity error

If the problem actually says "A parity error was detected" in the event log then it is a hardware error and has nothing to do with bad tapes. That error is almost never caused by a bad tape drive. Whoever changed the tape drive did you a disservice.

The most common cause of that error is a bad terminator. The next is a bad HBA, followed by a bad cable, followed by a power/grounding problem. You say that you have replaced the cables, terminator, and HBA. If that is true then you may be looking at power/grounding. I'd probably try replacing the terminator again just to be certain though. It is rare, but I've seen cases where two in a row were bad.

If you are absolutely certain that your cables and terminator are good, first check to make certain that every connection is tight. Then check the routing. There shouldn't be any tight turns, there should be a maximum of 1 ferrite bead on any single bus, and the cables shouldn't run through any areas where high power or radio frequencies are present.

Next check your power distribution system. If possible, use a meter and check for a voltage difference between the ground pins of the outlets that the server and the library are plugged into. If there is a difference you can really create problems on the bus, since the bus ties the grounds of the server and the library together. There can also be problems when the server and the library are on UPSs that isolate the ground.

Angus Crome
Honored Contributor

Re: parity error

These are the types of problems that led us away from DLT. We struggled with it for almost 4 years. When LTO became available, we decided to switch from DLT(7/8)000 devices and haven't had a problem since. In almost all cases, our problems turned out to be the firmware on the tape drives. It had gotten somewhat stable by the end, but we had outgrown the technology by then anyway and were tired of dealing with the problems.

We went through the same functional problems and errors that you describe and each time the hardware would be replaced and work for a short while. We ran power and ground tests and replaced cables/terminators and devices like they were going out of style. Finally, the firmware updates started correcting all the timing issues in our library and drives.

The short of it: if HP support can't get it fixed, they need to replace everything or help you transition to something that does work, at a highly reduced cost.

You should definitely get your power and ground checked first though, if you haven't already.
There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown
Curtis Ballard
Honored Contributor

Re: parity error

To address the previous reply: while there are a lot of errors that are not as clear, the one referenced here, "a parity error was detected", can ONLY be caused by hardware. I prefer LTO drives to DLT also, but there is nothing you can do in firmware in either drive to cause a parity error on the bus. The parity is entirely hardware controlled, by the HBA SCSI chip on writes and by the SCSI chip in the drive on reads. It is extremely rare for one of those chips to fail in a way that causes only occasional errors, so if there are parity errors the problem is almost always somewhere on the bus between the HBA and the drive. It can be cables, terminators, something else on the bus, or power/grounding. There isn't anything else there.
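For anyone wondering why this points at the wiring, here is a rough sketch of what the parity check amounts to. It is only an illustration in Python, not how the SCSI chips are implemented. Parallel SCSI sends one odd-parity bit alongside each data byte; the function names and the sample byte below are made up for the example.

```python
def odd_parity_bit(byte: int) -> int:
    """Parity bit the sender adds so data bits + parity hold an odd number of 1s."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


def receiver_accepts(byte: int, parity: int) -> bool:
    """Receiver recounts the 1s; an even total means a parity error."""
    ones = bin(byte & 0xFF).count("1") + parity
    return ones % 2 == 1


sent = 0b10110010                 # hypothetical data byte, four 1 bits
parity = odd_parity_bit(sent)     # sender drives the parity line

# Clean transfer: the receiver sees exactly what was sent.
clean_ok = receiver_accepts(sent, parity)

# Noise on the cable flips one data line between sender and receiver.
corrupted = sent ^ 0b00000100
corrupted_ok = receiver_accepts(corrupted, parity)

# Two flipped lines cancel out and slip past the check.
double_flip_ok = receiver_accepts(sent ^ 0b00000110, parity)

print(clean_ok, corrupted_ok, double_flip_ok)
```

The sender and receiver each do the same trivial count, so when the check fails the bits changed somewhere in between, which is the cable, terminator, connectors, or noise coupled onto the bus.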

If the problem goes away for a while when connections are changed, I start to suspect external interference. When the connections have just been reseated they are at their best. If there is a source of high energy interference it might not start causing problems for a few days, until the contacts have developed a little surface oxidation. That oxidation is usually not enough to cause a problem on its own, but when combined with high energy interference it can.

You'd be surprised what kind of problems an MRI machine a few doors down can cause.