Operating System - OpenVMS

VMS Poor SDLT performance

 
Mohamed  K Ahmed
Trusted Contributor

Re: VMS Poor SDLT performance

POST NUMBER 100 :)

What should the actual backup rate be on the SDLTs?
I have seen different rates according to the compression ratio and if I am backing up flat files or database files.
Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

Did a cut&paste and then:
$ grep GMT tmp.txt | wc -l
103

$

There is one quoted line:
"> Sep 16, 2004 16:02:11 GMT"

So that makes 102 messages (101 responses) including yours.

--

I don't know what you mean by "actual rate of backup", but the speed depends on many factors, some of which you have already named:

- the ability to compress the input data
- whether the backup deals with many small or a few large files


For optimal speed you want to feed enough data so that the tape drive's compressor is not stalled and the tape mechanism can run at full speed. That requires about 2-3x the media transfer rate.

The SDLT600's uncompressed transfer rate is 36 MegaBytes/second, so you should be able to feed it with 108 Megabytes/second.
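As a rough illustration of the rule of thumb above, the required feed rate is just the native media rate times a multiplier. This is a minimal sketch: the function name and the default factor of 3 are assumptions for illustration (the rule quoted above gives a range of 2-3x), not drive specifications.

```python
def required_feed_rate(native_mb_per_s, factor=3.0):
    """Host data rate (MB/s) needed to keep a compressing drive streaming.

    `factor` is the rule-of-thumb multiplier (2-3x in the post above);
    3.0 is an assumed default, not a vendor figure.
    """
    if native_mb_per_s <= 0 or factor <= 0:
        raise ValueError("rate and factor must be positive")
    return native_mb_per_s * factor


# SDLT600 native rate quoted above: 36 MB/s
print(required_feed_rate(36))       # 108.0
print(required_feed_rate(36, 2.0))  # 72.0
```

With the lower bound of the rule (2x), the same drive would still need about 72 MB/s from the host.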

Did anybody say tape drives are slow? ;-)
.
Jan van den Ende
Honored Contributor

Re: VMS Poor SDLT performance

Uwe,

now I think you are being a little unfair towards Mohamed!

The index page shows, among other things, a column with the number of replies, and just before I started this, that read "101".
That includes yours, so I conclude that Mohamed saw "99", which would make his reply #100.

The only way I _THINK_ that I can reconcile that with your grep is that, by coincidence, a reply to another thread was entered in that same second...

Not that it is terribly important, but in general our business is all about exactness!

Proost.

Have one on me,
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

I didn't intend to be unfair and, honestly, I didn't look at the index page.

Such claims just ask for being verified by somebody else, don't they?

Looks like we've found a problem in the ITRC database, but I agree it is not that important. Have a nice weekend, everybody.
.
Robert Gezelter
Honored Contributor

Re: VMS Poor SDLT performance

Gentlemen,

My US$0.02.

- The error containment parts of BACKUP (/CRC, /GROUP_SIZE, etc.) are still, IMHO, needed. If you review the technical history, this was demonstrated a LONG, LONG time ago when there were problems with certain hardware devices (e.g., the DMC-11 on the VAX-11/780, and some of the tape drives) where data could be silently corrupted. The error containment support at least prevents this "silent" corruption. While it is a nice feature that modern disks and tapes have more extensive error detection/recovery schemes than their ancestors, this is not a substitute for end-to-end error detection/recovery. There are many stages between a copy of a file's blocks in memory and the copy on the tape. The error containment is good assurance that the bits are intact.

- BACKUP's performance history needs to be considered similarly with regard to efficiency. BRU on RSX-11 was done slightly earlier and, if I recall correctly, suffered some problems related to its attempts to optimize disk operations. Increasing the number of read buffers, which I have always advocated, should not be a problem. Writing disk blocks to the tape would be the type of change that got BRU into trouble.

- Actually, my recollection of the history is that the "single parity" problem originated with the TK50. I would have to go digging in my library, but I believe that the problem may even be alluded to in the original article in the Digital Technical Journal that described the design and development of the TK50. I am remarking "off the cuff" here; check the article to confirm that my [vague] recollection is correct. I prefer to have citations, but the article is, sadly, not next to me at the moment.

Overall, I would agree with the comment that this is an important area. I would say that BACKUPs need to be an active area, particularly in light of the recent public incidents of data lost/unaccounted for. I do not want to add to the size of the "to do" list, but a strong checksum (recent generation of SHA-n) [to verify integrity--no pun] and AES encryption [to maintain confidentiality if a BACKUP set escapes its intended custodian] should also be placed on the list.

- Bob Gezelter, http://www.rlgsc.com
Ian Miller.
Honored Contributor

Re: VMS Poor SDLT performance

With current SCSI tape drives I'm not convinced that /GROUP=x where x > 0 gains anything. I would not be surprised to see the default become /GROUP=0 in a future version of VMS. /CRC is needed as it is an end-to-end check, and upgrading it to a more modern algorithm would be a good idea. I note that the encryption routines are included in VMS V8.2. I had a quick play but BACKUP/ENCRYPT did not seem to work. Encryption of backup savesets is often not considered.
____________________
Purely Personal Opinion
Tom O'Toole
Respected Contributor

Re: VMS Poor SDLT performance

Ian, I will ask the question again - if we are writing an 'end-to-end' check, why should we not write 'end-to-end' XOR blocks which make recovery of the data checked end-to-end possible?
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

Redundancy doesn't help when a tape drive doesn't let you go past the bad spot on the tape to get at the XOR data. CRC is still useful as an end-to-end check to detect, e.g., corruption of a backup save set.
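For readers unfamiliar with the XOR redundancy being debated, the idea can be sketched as follows. This is generic XOR parity over equal-length blocks, assumed here for illustration; it is not BACKUP's actual on-tape format, and the function names are hypothetical. It also shows the limitation raised above: recovery needs every other block in the group, plus the XOR block, to be readable.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, xor_block):
    """Rebuild the single missing block of a group from the rest."""
    return xor_blocks(list(surviving_blocks) + [xor_block])


group = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks of one redundancy group
parity = xor_blocks(group)            # extra block written after the group

# Pretend the middle block was unreadable on restore.
rebuilt = recover_missing([group[0], group[2]], parity)
print(rebuilt == group[1])            # True
```

If two blocks of the same group are lost, or the drive refuses to read past the bad spot, a single XOR block cannot rebuild the data.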
.
Jan van den Ende
Honored Contributor

Re: VMS Poor SDLT performance

Uwe,

is that not exactly what Tom was referring to in his April 22 posting?

I still have no real feel for what it _IS_ and what it _IS NOT_, but, reading the text he came up with, I have the same nagging feeling that he seems to have:

_DOES THERE EXIST A FUNCTION THAT TELLS A SCSI DRIVE TO CONTINUE GIVING DATA AFTER A PARITY ERROR?_
Is there a way to convert the error to a non-fatal status, and let BACKUP do its magic again?
Of course the data will be inconsistent, and has to be re-constructed, but BACKUP is good at that.

Even if that would mean that the restore time got multiplied many times, and consuming lots of CPU, would that not be VERY, VERY MUCH preferable over losing the contents of the tape?

I know (... censored...) sure that _WE_ would be VERY happy to pay that price if the alternative is NO restore!!!

Proost.

Have one on me.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Antoniov.
Honored Contributor

Re: VMS Poor SDLT performance

Jan,
I agree: we would be VERY happy to pay that price if the alternative is NO restore.

In my experience, DAT units are not reliable devices.

Antonio Vigliotti
Antonio Maria Vigliotti
Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

It _looks_ like you are talking about a posting from 22-MAR-2005...

No, I don't think so. To me, it looks like a speed optimization for writes.

OpenVMS Alpha Version 7.3-1 Release Notes:
http://h71000.www7.hp.com/DOC/731FINAL/6652/6652pro_011.html

OpenVMS Alpha Version 7.3 (or higher) has implemented stricter requirements for SCSI Mode Page 01h (the Read Write Error Recovery Page) for SCSI tape drives. These requirements help guard against possible data loss during **write** operations to SCSI tape, by defining the recovery actions to be taken in the event of deferred recoverable errors. For most Compaq-supported drives, these changes will not affect the drive's behavior. For some drives, however, these new requirements may impact SCSI tape behavior in the following two ways:
...
.
Jan van den Ende
Honored Contributor

Re: VMS Poor SDLT performance

Uwe,

I fear you are right, but I still hope Tom is, until he is really proven wrong.
And even then, somehow something should be made available again.. :-(


Proost,

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Cass Witkowski
Trusted Contributor

Re: VMS Poor SDLT performance

I won't be able to attend boot camp, but if anyone from HP wants to talk to me about my ideas for BACKUP posted above, feel free to contact me.
Tom O'Toole
Respected Contributor

Re: VMS Poor SDLT performance

Uwe said:

Redundancy doesn't help when a tape drive doesn't let you go past the bad spot on the tape to get at the XOR data. CRC is still useful as an end-to-end check to detect, e.g., corruption of a backup save set.


Sorry to belabor this, but it seems to me that in the case where a CRC error could be successfully used to detect corruption, a recovery could also be attempted. The case you detail, where a tape drive prevents going beyond the bad spot - it also will prevent you from going to the bad spot itself - where the CRC and the CRCed data is written - so the CRC is useless too.

That said, going back to another response where you pointed out the VMS 7.3 doc with scsi tape comments - am I correct to assume that these unrecoverable tape errors will occur on writing only - this is where I am seeing them, and while I don't think I should be getting so many, I can redo the backup in this case and down the tape. This is certainly preferable to getting unrecoverable READ ERRORS! I don't think I've gotten any of these in recent memory.

So should the tapes which get -f-parity on write be considered defective and returned to the manufacturer? What kind of factor is keeping the drive. Also should we be cleaning the drives more often (than the library calls for)?
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Cass Witkowski
Trusted Contributor

Re: VMS Poor SDLT performance

I was wondering about the /CRC qualifier as well. When BACKUP reports an error because of a "CRC" mismatch that BACKUP itself detected, is the error message different from the one issued when the tape drive detects a parity error? If so, what is the error message, and has anyone seen these errors recently?

As Tom said, if we are never going to get past the tape drive's error checking, then why bother with the /CRC qualifier? If the /CRC qualifier was meant for 9-track tape technology, which I believe had only longitudinal error detection, then the CRC was a needed feature. But if tape drives these days write CRCs or ECCs with the data, does this become irrelevant? Again, can anyone from the HP StorageWorks group chime in?
Tom O'Toole
Respected Contributor

Re: VMS Poor SDLT performance

I said "what kind of factor is keep the drive".

I meant "what kind of factor is keep the drive STREAMING"
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Guenther Froehlin
Valued Contributor

Re: VMS Poor SDLT performance

Error checking and recovery is done transparently in the tape drive (for some drives this can be changed at the SCSI level). BACKUP would not notice such an event.

I had a case a long time ago where the firmware in a tape drive had a bug. It wrote bogus data to tape with a good checksum...yuck! Hence it returned bogus data without a notice. BACKUP's CRC (old but still excellent algorithm) detected the corrupted data.
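The value of an end-to-end check in a case like this can be sketched as below. Python's zlib.crc32 stands in for BACKUP's CRC here; that is an assumption for illustration only, as are the function names. The point is that the check is computed by the application before the data enters the I/O path, so it does not depend on the drive's own checksum being honest.

```python
import zlib


def write_with_crc(payload):
    """Return the payload together with a CRC computed by the application."""
    return payload, zlib.crc32(payload)


def read_and_verify(payload, stored_crc):
    """Recompute the CRC on the data that came back and compare."""
    return zlib.crc32(payload) == stored_crc


block, crc = write_with_crc(b"saveset block contents")

# Simulate a drive that returns altered data while reporting success:
# flip one bit in the first byte.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]

print(read_and_verify(block, crc))      # True
print(read_and_verify(corrupted, crc))  # False
```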

And back to the performance discussions in this thread:

Always try a save to "NLA0:a.a/SAVE" first. This gives you an idea how fast the data can be read from disk.

Always try a /PHYSICAL to tape to give you an idea on the top transfer speed of your gear (disk->memory->tape).

Always try without /LIST/LOG because it can slow down performance noticeably.

Always...trust in fast hardware and not in magic system/process parameters.
Guenther Froehlin
Valued Contributor

Re: VMS Poor SDLT performance

About write errors to tape:

BACKUP rewrites blocks to tape that have failed with an error. The whole sequence of buffers after the failed buffer is re-issued. The re-issued I/Os are synchronous, and BACKUP "barfs" (requests operator assistance) when there have been more than 100 errors in total and 10 out of 100 writes (10%) fail with an error.

About SS$_PARITY errors from tape:

Multiple SCSI error conditions are mapped in MKDRIVER to SS$_PARITY. Check the error log for the TRUE SCSI condition and look it up in the manual for this drive (easier said than done, because mostly you a) don't get a technical manual or b) the manual "saves" you from technical details - BAH!)
Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

Most tape drives these days have advanced error detection and correction, yes. They will certainly attempt to correct media errors. The CRC in the BACKUP header, as often said, is an end-to-end check. I think it is still useful, because I have seen BACKUP create corrupted save sets on its own when the destination was a DECnet remote node - no tape drive involved.

Tom,
I don't think I wrote that unrecoverable tape errors happen on write only - at least I didn't intend to say that. If you refer to the discussion about MKSET, it is my understanding (I haven't spent too much time thinking about this, though) that the deferred error reporting is some kind of 'writeback cache' for writes to tape.

I have been told that one should NOT clean a modern tape drive unless it really asks for it.


"what kind of factor is keep the drive STREAMING" ?

Easy: as the tape drive is a sequential medium, it is just megabytes per second. The rule of thumb is that you need about 3 times the media transfer rate if you let the tape drive work with compression.


> Always...trust in fast hardware and not in magic system/process parameters.

Guenther, I have to disagree with this! At least on VMS V5.5-2 I did get corrupted savesets when the process quotas were not set up according to specs.
.
Jan van den Ende
Honored Contributor

Re: VMS Poor SDLT performance

Tom,


That said, going back to another response where you pointed out the VMS 7.3 doc with scsi tape comments - am I correct to assume that these unrecoverable tape errors will occur on writing only -


well, see my 10/3/2005 posting:
Fatal READ errors _DO_ occur!

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Guenther Froehlin
Valued Contributor

Re: VMS Poor SDLT performance

>> Always...trust in fast hardware and not in magic system/process parameters.

>Guenther, I have to disagree with this! At least on VMS V5.5-2 I did get corrupted savesets when the process quotas were not set up according to specs.

Uwe, folklore has it that there are always bugs in software :-(. My hint was about SPEED not BUGS.
Tom O'Toole
Respected Contributor

Re: VMS Poor SDLT performance

Sorry Uwe, what I am wondering is: what kind of factor to reliability is keeping the tape streaming - like, if it streams almost continuously, will I get fewer unrecoverable write errors than if there is some shoeshining occurring, and is this a major factor.

I've been doing quite a bit of testing lately with SDLT (160/320) drives, and it can be difficult to keep them streaming, they are so fast. We have the recommended SCSI cards (SYM54C895 LVD SCSI) for a MSL5000 library with two drives/library. When both drives are running, performance sucks, with frequent shoeshining (easily verifiable by watching the front panel - drive goes writing-idle-writing-idle...). But with one drive running it goes at just about full speed. I have tried different blocksizes, and the problem remains.

We have been testing these on an MDR as well. The best throughput we can get on the MDR is about 55MB/sec total for the unit, (we have tried numerous configurations and have not been able to exceed that value). By the way, I've been told the fastest drive supported on the MDR is 110/220 and that 55MB/sec is actually quite respectable (I would agree it's not bad!).

In testing on the MDR to one library/two drives, I also found that the maximum blocksize of 65535 is faster than 32K (and as you go down it gets slower still). On the SCSI, 65K was not faster, perhaps because with two drives running, it was already overloaded.
To qualitatively summarize:

one MDR one drive -- full speed
one MDR two drives -- rare shoeshining
one MDR four drives -- regular shoeshining
one MDR eight drives -- constant shoeshining
one SCSI one drive -- rare shoeshining
one SCSI two drives -- constant shoeshining

Has anybody else seen this? I'm wondering what's going on with the SCSI performance, and if something is set up wrong, but also not overly concerned since there seems to be a better alternative with fibre.
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Cass Witkowski
Trusted Contributor

Re: VMS Poor SDLT performance

I think that on the SCSI bus the tape drive is considered or was considered a slow device and could hold onto the bus for a while. That is why they recommended never to put tape drives and disk drives on the same SCSI bus.

We generally try to have one Tape drive per SCSI bus.

Uwe Zessin
Honored Contributor

Re: VMS Poor SDLT performance

Tom,
I haven't done much with VMS BACKUP recently, but from discussions here on ITRC it looks like BACKUP still does not use double buffering (filling one buffer with data from the disk while writing out the other buffer to tape). So if the tape drive's internal buffer is empty while BACKUP is still trying to fill its buffer you get the shoeshining.
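Double buffering as described above can be sketched with a bounded queue: one thread fills buffers from the source while the main thread drains them to the destination. This is a generic illustration under assumed names (`read_block`, `write_block`, `depth`); it says nothing about how BACKUP is actually implemented.

```python
import queue
import threading


def double_buffered_copy(read_block, write_block, depth=2):
    """Overlap reading and writing using at most `depth` in-flight buffers.

    `read_block()` returns the next block, or None at end of input.
    `write_block(block)` consumes one block. Both are supplied by the caller.
    """
    buffers = queue.Queue(maxsize=depth)
    end_marker = object()

    def reader():
        while True:
            block = read_block()
            if block is None:
                buffers.put(end_marker)
                return
            buffers.put(block)  # blocks when both buffers are full

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    written = 0
    while True:
        block = buffers.get()  # blocks when the reader has fallen behind
        if block is end_marker:
            break
        write_block(block)
        written += 1
    thread.join()
    return written


source = iter([b"block-1", b"block-2", b"block-3"])
tape = []
count = double_buffered_copy(lambda: next(source, None), tape.append)
print(count, tape)
```

When the reader cannot keep up, the writer waits in `buffers.get()`; on a real drive that wait is where the internal buffer drains and shoeshining starts.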

On the other hand, you say you usually can run at least one drive at full speed.

The SDLT320 has a native transfer rate to media of 16 MegaBytes / second. If I read the specifications of the MDR correctly, it uses an Ultra2 (80 MegaByte / second) interface. For an effective compression and to keep the 64MB buffer filled, the rule of thumb says you need to deliver about 3 times the native data rate (=48 MegaBytes/s).
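The arithmetic above also gives a quick estimate of how many drives one bus can keep streaming. The following is a back-of-the-envelope sketch; the function name and the `bus_efficiency` parameter are assumptions for illustration, and the default of 1.0 is optimistic because a parallel SCSI bus rarely delivers its nominal rate.

```python
def drives_that_fit(bus_mb_per_s, native_mb_per_s, factor=3.0, bus_efficiency=1.0):
    """Number of drives one bus can feed at `factor` times the native rate.

    `bus_efficiency` (0..1) is a hypothetical derating for protocol and
    bridging overhead; no measured value is implied.
    """
    usable = bus_mb_per_s * bus_efficiency
    per_drive = native_mb_per_s * factor
    return int(usable // per_drive)


# Ultra2 bus (80 MB/s nominal) with SDLT320 drives (16 MB/s native)
print(drives_that_fit(80, 16))   # 1

# Same bus with an SDLT600 (36 MB/s native, quoted elsewhere in this thread)
print(drives_that_fit(80, 36))   # 0
```

By this estimate a second SDLT320 on the same Ultra2 bus would already need 96 MB/s, more than the nominal bus rate.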

Usually, you can't run a parallel SCSI bus at 'full speed' and there certainly is some loss in the FC/SCSI bridging as well (no, I don't have any numbers).


Cass,
I've heard that some old tape drives could not DISCONNECT from the bus after the data transfer (or had disconnects disabled by default - e.g. early Exabytes). DISCONNECT is a feature where the initiator sends a command (and data) to the target. Then the target disconnects from the bus (releases it for other nodes) and processes the command. After it has finished, it acquires the bus on its own to send the result back to the initiator.

Today's tape drives (SDLT, LTO) have large buffers, so they can transfer the data at full bus speed. But with today's drive speeds (the SDLT600 can do 36MB/s, the Ultrium 960 can do 80MB/s to the media) you still need to check if the bus has enough capacity, so 1 tape drive per SCSI bus isn't a bad idea ;-)
.
Guenther Froehlin
Valued Contributor

Re: VMS Poor SDLT performance

Tom,

>I've been doing quite a bit of testing lately with SDLT

how? Using OpenVMS BACKUP? What is the transfer rate for 1, 2, ... drives?

One thing to note is that S/DLT drives have only one speed and stop. LTO drives have various speeds and adjust better to lack of input data. But for the gain in speed you pay a higher price for the LTOs.

The typical bottleneck with file oriented backups is the input disk. Try a BACKUP/PHYSICAL/BLOCK=65024 just for the fun-of-it.