Tape Libraries and Drives

SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

RalfG
Frequent Advisor

SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Hi,

I'm having a hard time getting our Overland Neo 4100 library (which should be identical to the MSL6060) with two HP Ultrium 1840 drives (OEM firmware B12H) to negotiate correctly with the SCSI HBAs. At the moment both drives only negotiate U160.

The server has two U320 LSI HBAs (20320). One controller is connected directly to one drive; the other controller is connected to the second drive and the library. I updated both controllers to the current (non-RAID) firmware. For testing I also added an Adaptec U29320 controller, with no change.

I'm running Debian Etch, but I've also tried other distros. At boot time I get the kernel message that the drives are only connected at U160 speed. I already compiled the latest LSI modules for my Debian kernel, but this didn't change anything (same for the Adaptec modules).

mptbase: ioc0: mpt_send_handshake_request start, WaitCnt=1
Vendor: HP Model: Ultrium 4-SCSI Rev: B12H
Type: Sequential-Access ANSI SCSI revision: 05
Disabling QAS due to noQas=01 on id=1!
target5:0:1: Beginning Domain Validation
target5:0:1: Ending Domain Validation
target5:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 64)

The write speed is also limited:

# ./hptapeperf -o /dev/nst1 -i 3 -r 8192 -R -b 131072
16384.00 Mbytes transferred in 144 seconds, 113.78 Mbytes/sec.
16384.00 Mbytes transferred in 137 seconds, 119.59 Mbytes/sec.

# dd if=/dev/zero of=/dev/nst1 bs=262144 count=50000
50000+0 records in
50000+0 records out
13107200000 bytes (13 GB) copied, 96.4847 seconds, 136 MB/s
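
As a quick sanity check, the rates printed by both tools can be recomputed from the reported amounts and times. This is only a sketch of the arithmetic; the helper name `rate` is made up for illustration, and whether hptapeperf's "Mbytes" means 10^6 or 2^20 bytes is not stated in its output, so the two tools may not use the same unit.

```python
# Recompute the throughput figures quoted above from amount and elapsed time.
# "rate" is a hypothetical helper, not part of hptapeperf or dd.

def rate(amount, seconds):
    """Return amount per second."""
    return amount / seconds

# hptapeperf: 16384 Mbytes per run, 144 s and 137 s
for seconds in (144, 137):
    print(f"hptapeperf: {rate(16384.0, seconds):.2f} Mbytes/sec")

# dd: bs=262144 count=50000, 96.4847 s; dd reports MB as 10^6 bytes
dd_bytes = 262144 * 50000
print(f"dd: {dd_bytes} bytes, {rate(dd_bytes, 96.4847) / 1e6:.0f} MB/s")
```

Either way, the drive is writing well below what a 160 MB/s bus allows in theory, but close enough to it that bus overhead could plausibly be the limit.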


I was in contact with the LSI kernel module developer, who claims that the drives are not able to do U320. The debug information shows that the controller offers U320 (factor 0x08), but the drives do not accept that offer.

debug information:
(http://pastebin.ca/765223)

mptbase: ioc0: Initiating bringup
ioc0: LSI53C1030 C0: Capabilities={Initiator,Target}
mptbase: ioc0: PortPage0 minSyncFactor=8
mptspi: ioc0: saf_te 0
scsi5 : ioc0: LSI53C1030 C0, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=66
mptspi: ioc0: id=0 Requested = 0x00000a00 ( factor = 0x0a @ offset = 0x00 )
mptspi: ioc0: id=1 Requested = 0x00000a00 ( factor = 0x0a @ offset = 0x00 )
Vendor: HP Model: Ultrium 4-SCSI Rev: B12H
Type: Sequential-Access ANSI SCSI revision: 05
Disabling QAS due to noQas=01 on id=1!
mptspi: ioc0: id=1 min_period=0x08 max_offset=0x7f max_width=1
target5:0:1: Beginning Domain Validation
mptspi: ioc0: id=1 Requested = 0x00000a00 ( factor = 0x0a @ offset = 0x00 )
mptspi: ioc0: id=1 Requested = 0x00000a00 ( factor = 0x0a @ offset = 0x00 )
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
mptspi: ioc0: id=1 Requested = 0x20000a00 ( Wide factor = 0x0a @ offset = 0x00 )
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
mptspi: ioc0: id=1 Requested = 0x207f0a00 ( Wide factor = 0x0a @ offset = 0x7f )
mptspi: ioc0: id=1 Requested = 0x207f0803 ( Wide factor = 0x08 @ offset = 0x7f IU DT )
mptspi: ioc0: id=1 Requested = 0x207f0803 ( Wide factor = 0x08 @ offset = 0x7f IU DT )
mptspi: ioc0: id=1 Requested = 0x207f0823 ( Wide factor = 0x08 @ offset = 0x7f IU DT RDSTRM )
mptspi: ioc0: id=1 Requested = 0x207f0833 ( Wide factor = 0x08 @ offset = 0x7f IU DT WRFLOW RDSTRM )
mptspi: ioc0: id=1 Requested = 0x207f0873 ( Wide factor = 0x08 @ offset = 0x7f IU DT WRFLOW RDSTRM RTI )
mptspi: ioc0: id=1 Requested = 0x207f08f3 ( Wide factor = 0x08 @ offset = 0x7f IU DT WRFLOW RDSTRM RTI PCOMP )
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
scsi 5:0:1:0:
command: Inquiry: 12 00 00 00 60 00
mptspi: ioc0: id=1 Read = 0x20400902 ( Wide factor = 0x09 @ offset = 0x40 DT )
scsi 5:0:1:0:
command: Test Unit Ready: 00 00 00 00 00 00
scsi 5:0:1:0:
command: Test Unit Ready: 00 00 00 00 00 00
scsi 5:0:1:0:
command: Test Unit Ready: 00 00 00 00 00 00
scsi 5:0:1:0:
command: Test Unit Ready: 00 00 00 00 00 00
target5:0:1: Domain Validation skipping write tests
target5:0:1: Ending Domain Validation
mptspi: ioc0: id=1 Read = 0x20400902 ( Wide factor = 0x09 @ offset = 0x40 DT )
target5:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 64)
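
To make the "Requested" and "Read" words in this log easier to compare, here is a small decoder sketch. The bit layout is an assumption inferred only from how the driver annotates each hex value in the lines above (it is not taken from LSI documentation), and the function and table names are made up for illustration. The factor-to-period table uses the usual SPI sync factor codes (0x08 = 6.25 ns, 0x09 = 12.5 ns, 0x0a = 25 ns); the 12.5 ns entry agrees with the final kernel line.

```python
# Decode the negotiation words printed by mptspi in the log above.
# ASSUMPTION: bit positions are inferred from the log annotations only.

WIDE_BIT = 0x20000000          # set on every line annotated "Wide"
FLAG_BITS = {                  # low byte, matched against the log annotations
    0x01: "IU",
    0x02: "DT",
    0x10: "WRFLOW",
    0x20: "RDSTRM",
    0x40: "RTI",
    0x80: "PCOMP",
}
PERIOD_NS = {0x08: 6.25, 0x09: 12.5, 0x0A: 25.0}  # sync factor -> period

def decode(word):
    """Split a negotiation word into factor, offset, width, flags and rate."""
    factor = (word >> 8) & 0xFF
    offset = (word >> 16) & 0xFF
    wide = bool(word & WIDE_BIT)
    flags = [name for bit, name in sorted(FLAG_BITS.items()) if word & bit]
    period = PERIOD_NS.get(factor)
    # Offset 0 means asynchronous transfers, so no synchronous rate applies.
    rate = None
    if period is not None and offset:
        rate = 1000.0 / period * (2 if wide else 1)  # MB/s
    return {"factor": factor, "offset": offset, "wide": wide,
            "flags": flags, "mb_per_s": rate}

for label, word in (("requested", 0x207F08F3), ("read back", 0x20400902)):
    print(label, decode(word))
```

With these assumptions, the last request decodes to factor 0x08 at 320 MB/s with all six options set, while the value read back decodes to factor 0x09 at 160 MB/s with only DT, which is exactly the gap between what the HBA offered and what the drive accepted.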


The LTT tool also warns about the U160 mode:

The current SCSI configuration is likely to be limiting the performance of the drive.
Please check that your HBA is the correct type for the drive and that the cabling is good.
The SCSI configuration referenced is the one for which device analysis was run and/or the support ticket was pulled.
If this is not via your backup server then you may not have an issue.
Current SCSI speed limit: Ultra3 or faster.
Recommended: Ultra3 or faster.
Current SCSI transfer rate limited to: 160 MB/sec.
Recommended: 320 MB/sec. or better.


I'm a bit frustrated at the moment because I've spent several days compiling new modules and adding different SCSI HBAs.

Any idea what to check next? Is anyone running LTO-3/4 drives correctly in U320 mode (and if so, with which controller)?

Ralf
13 REPLIES
Curtis Ballard
Honored Contributor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

I can't say much about the Overland version of the HP drives, but I can say that the HP version of the drives will negotiate at U320 speeds and run much faster than 160 MB/s.

There could be some hardware in the Neo that prevents running at U320 or Overland may have special firmware that limits the speed for some reason.

One issue I have seen quite a bit is that the bus must be absolutely clean, and it is very difficult to get a bus with two devices to negotiate to U320 speeds. We typically see domain validation forcing the bus to U160 if more than one device is connected.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Well, the library reports that the drives are capable of U320 mode.

***** SPI HARDWARE VERSIONS AREA *****

Shuttle: 01 (LTO capable)
Passthru: 00
Vertical: 01 (Geared head)
Touch: 00
Drives 0-1: 07 (Fan stall capable, Ultra320 capable)
Drives 2-3: 07 (Fan stall capable, Ultra320 capable)


Only one of the two drives shares a controller with the library. The other drive is directly connected to the second controller. So everything seems to be OK and was installed as documented in the manual.

I'm not sure which version of HP's firmware is the equivalent of Overland's B12H firmware, but it's the most recent I could find.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

I haven't received a useful response from Overland support yet, but I found this in their support database:

http://support.overlandstorage.com/jive/entry!default.jspa?categoryID=3&externalID=4824&fromSearchPage=true

Problem: Slow throughput using LTO3 SCSI tape drives with a Ultra 320 SCSI card.

Solution: Disable Domain Validation.

Domain Validation uses a sequence of I/O commands to determine the optimum transfer rate between the SCSI card and the SCSI device on initial boot. In the same way that a modem will step down the transfer rate based on a telephone line quality to ensure data integrity, Domain Validation will limit the bandwidth to a device to guarantee reliable data transfers.
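
The step-down idea in that description can be sketched roughly as follows. This is a simplified illustration only, not how any real HBA driver implements it: `validate_domain` and `link_is_reliable` are hypothetical names, and the reliability check stands in for the test I/O a real implementation would run at each speed.

```python
# Simplified illustration of stepping down the transfer rate until the
# link passes an integrity check. All names here are hypothetical.

RATES_MB_S = [320, 160, 80, 40]  # Ultra320, Ultra160, Wide Ultra2, Wide Ultra

def validate_domain(rates, link_is_reliable):
    """Return the fastest rate that passes the check, or None if none does."""
    for rate in sorted(rates, reverse=True):
        if link_is_reliable(rate):
            return rate
    return None  # caller would fall back to asynchronous transfers

# Example: a bus (or device) that only works up to 160 MB/s.
print(validate_domain(RATES_MB_S, lambda rate: rate <= 160))
```

In this picture, skipping the check does not make a marginal bus any better; it only removes the test that would have caught the problem.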
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Disabling domain validation and forcing U320 sounds a bit... strange to me.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Can anyone confirm that he/she is able to get a U320 connection between LTO-3/4 drives and LSI/Adaptec SCSI HBAs (on Linux or at least Windows)?

Right now LSI support tells me that the attached HP LTO-4 drives are not able to do U320.


#### quote lsi support ####
What I see happening the device was asked if it could negotiate at U320,
however it came back with U160. The reason you see the same results
with the adaptec driver is the same spi transport driver is shared
between both my driver, and there's.

Here is where U320 negotiation was sent:

> mptspi: ioc0: id=1 Requested = 0x207f08f3 ( Wide factor = 0x08 @ offset = 0x7f IU DT WRFLOW RDSTRM RTI PCOMP )

Then three inquiries sent

> st 5:0:1:0:
> command: Inquiry: 12 00 00 00 60 00

Then negotiation parameters read back. The device said it could do
U160:

> mptspi: ioc0: id=1 Read = 0x20400902 ( Wide factor = 0x09 @ offset = 0x40 DT )
#### quote end #####


On the other hand, Overland is telling me that it's not a problem with the library. If there is a problem, it must be with the LTO-4 drives. Their only "solution" right now is to disable domain validation. But this is not possible with the LSI Linux driver and is not recommended at all.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

I got an external Ultrium 1840 drive for testing which correctly negotiated to U320 mode. With this drive I got 180/160 MB/s (read/write) instead of 130/120 MB/s.

It seems that either Overland has a problem with their library's firmware or SCSI backplane, or their OEM drive firmware is too old or buggy.


Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Hello Ralf,

Where is your terminator located?
The termination has to be on the drive.

Regards
Roland
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

The terminator is at the rear of the Overland library, as described in the library's manual.

I've already backed up more than 10 TB and have not seen any SCSI errors. It's just the speed and the limit to U160 that is wrong. Maybe Overland's OEM firmware is simply out of date. The library's drive firmware is B12H; the firmware of the U320-capable HP drive was B22D.
Curtis Ballard
Honored Contributor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

The previous poster was correct: for the drive that is on the same bus as the library controller, the terminator MUST be connected to the connector on the drive and not to the connector on the library controller. The correct cabling is host -> library controller -> drive -> terminator. Keep the cables as short as possible, with really good, tight connections. It is really hard to get connections good enough for U320 on a multi-drop bus.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

The library has two LTO-4 drives (as written in my first post) and the server has two SCSI HBAs.

I'm seeing this behaviour with both drives. One drive is connected exclusively to one HBA; the other drive and the library are connected to the second HBA. The terminators and cables are connected to the rear of the library.

This is described here (pages 2-12 to 2-15):
http://support.overlandstorage.com/jive/servlet/KbServlet/download/3518-102-1144/104248-105_A.pdf

The problem is not the cabling or the termination. If I take the same cable and connect an external Ultrium 1840 drive (firmware B22D) to the HBA, everything is OK.

Maybe it's the old firmware, or something with the library's backplane.
Curtis Ballard
Honored Contributor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

I agree that changing the cables is unlikely to solve your problem, since your other drive won't negotiate to U320 either. However, the cable diagram shown in figure 2-10 of the Overland manual is incorrect, and if you did get the system running at U320 speeds you would have problems with that drive. The terminator needs to be on the drive, not on the library controller.

It is possible for the library firmware to tell the drive never to negotiate to U320, if the library supports that function. I can't say whether the Neo does, but it may have restricted the drive to a slower speed.

You could try their suggestion of disabling domain validation to see whether it is ever possible to get the drive to negotiate to U320. However, I agree with you that if domain validation is forcing the drive to a slower speed, then running at U320 is likely to cause problems, so that wouldn't be a solution, just a test.
RalfG
Frequent Advisor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

I was in contact with the LSI Linux driver developer. He was very clear on disabling DV:

| That make no sense. The devices will net 5
| MB transfer speeds if you don't do DV.

With the LSI Linux driver, it's not possible to disable DV.

Since the external drive was able to do U320 with the LSI HBA, I'll now wait to see what Overland tells me next week. The frustrating thing is that they used LTT for some basic tests and got ~130 MB/s. They then claimed that U320 wouldn't be needed because the drives couldn't do more. In 5 weeks I got no answer as to whether their drive/library was working correctly in U320 mode.

Curtis Ballard
Honored Contributor

Re: SCSI Domain Validation problems with LTO-4 drive and several SCSI HBAs

Thanks for the info from LSI. It sounds like their HBA defaults to the lowest transfer rate if domain validation isn't run, which is backwards from the cards I have played with, where not running DV causes the card to always attempt the highest possible speed.

You already know from your own tests that the drive can run faster than 160 MB/s, so there is no sense in even addressing that point.