Tape Libraries and Drives

detect shoe-shining

RalfG
Frequent Advisor

detect shoe-shining

Hi,

I'm backing up large amounts of data to a lib with 2 LTO-4 drives. Because of the long time the backup takes to complete, I don't want to spool the data to disk and then write to tape for full backups.

The disk storage (file server) has a seq. read performance of 200-300 MB/s and is connected by GbE to the backup server.

The data are mainly large files.

The following is the output of dstat on the file server during a full backup with spooling enabled.

--dsk/sde-- --net/eth1-
read _writ|_recv _send
104M 0 |2509k 105M
104M 0 |2541k 106M
104M 0 |2536k 106M
72M 0 |2032k 85M
0 0 | 12k 1848k
58M 0 |1035k 42M
102M 0 |2527k 105M
106M 0 |2558k 107M
104M 0 |2543k 106M
107M 0 |2548k 106M
106M 0 |2558k 107M
106M 0 |2555k 107M
104M 0 |2558k 107M
16M 0 |1413k 60M
0 0 | 11k 1313k
0 0 | 70B 70B
64M 0 | 295k 12M
104M 0 |2555k 107M


It looks similar without spooling. I'm not sure what the reason for the short dropouts is.

I'm a bit worried about tape shoe-shining without spooling and want to know if there is a way to get the status about how often a drive had to start/stop during a full backup (scsi command, ltt...).
18 REPLIES
Duncan Edmonstone
Honored Contributor

Re: detect shoe-shining

Shoe-shining doesn't really happen on modern LTO tape drives. LTO has always supported something called DRM (Data Rate Matching), which allows the drive to adjust the tape speed to the rate of the incoming data. LTO-4, for instance, will adjust its speed between 40 MB/s and 120 MB/s to prevent shoe-shining.

So in short: don't worry about it.

HTH

Duncan

RalfG
Frequent Advisor

Re: detect shoe-shining

I know that LTO-4 adjusts the speed. But if the data stream drops below 40 MB/s for a short time (as seen in my dstat output above), doesn't the drive have to stop and restart writing? I think the drive has a 128 MB buffer, which will help, but if the data comes in too slowly (e.g. incremental backups), shoe-shining can still happen. Or am I missing something?
Marino Meloni_1
Honored Contributor

Re: detect shoe-shining

You can get some indication from the support ticket.
You can also test (with LTT) where the problem is:
In LTT there are several performance tests available: system performance and device performance.
First use the device performance test. It generates data in host memory and sends it to the LTO drive, which gives you some indication of the SCSI subsystem. You should be able to reach the theoretical tape throughput.
Then there is the system performance test. Here you can test the source of the data and get an idea of the maximum throughput your system can deliver. You can also create a fixed amount of data (this is the restore test) and then test the system with that data, which will give you an idea of whether your problem is coming from the hardware or from the data (fragmentation, open files, antivirus...).
RalfG
Frequent Advisor

Re: detect shoe-shining

I have performed various tests with ltt, hptapeperf and dd. The general performance is fine: I can write to tape at ~60-80 MB/s (real backup, highly compressed data, single client).

With dd and the other tests, I get read/write performance up to 190 MB/s.

The problem is, I don't know whether there are periods during a backup with an average throughput of 70 MB/s where the throughput drops below 40 MB/s. I can see with dstat that there are periods where the network throughput between file server and backup server drops to <10 MB/s, but I don't know if the drive starts shoe-shining then.

If this happens only a few times during a long running backup, it'd be fine. But if this happens often, I'm worried about the additional stress for the tape and drive.

Hence I would like to see some numbers on how many times the drive had to stop/rewind/start during a given time.

Maybe I'm too afraid about this. But sometimes people forget to put the data in gzip format on the file server, the result is many very small files...
Marino Meloni_1
Honored Contributor

Re: detect shoe-shining

Well, LTT records the info you need, but you would have to contact HP to extract it from the EEPROM data in the support ticket, as it is not normally accessible.

see example:

|__ Mechanism EEPROM Page
||__ Power on time : 109243379
||__ Power cycles : 88
||__ Tape pulled : 30012154
||__ Pulling time : 12134187
||__ Load/unloads : 1595
||__ Start/stops : 1295420
||__ Thread cycles : 3537
||__ Repositions : 7032
||__ Turnarounds : 1219383
||__ SDRAM errors : 1
||__ CRC errors : 0
||__ Cleanings : 16
||__ Upgrades : 3
RalfG
Frequent Advisor

Re: detect shoe-shining

I see that the relevant code is in scripts/LTO_ReadLogs.sts. There is no way for me to get this info without HP's help?
Curtis Ballard
Honored Contributor

Re: detect shoe-shining

If you really need to get to the information we can tell you how to extract the reposition information from the logs without using L&TT but most of the time there isn't any real need to figure out if the drive is repositioning if your backups are fast enough to satisfy you.

There are two things that happen with Shoe-Shining.

1) Extra mechanical wear
2) Performance impact

With HP LTO tapes and drives both the drive and the tape cartridge are extensively tested and required to meet standards where the extra mechanical wear isn't an issue. You will get the rated life in worst case operation.

2) Performance impact

Here HP's LTO tape drives really shine. We have variable rate matching, which isn't available in some other LTO tape drives. With variable rate matching the tape drive slows down as the data rate slows down, to as little as 1/3 of the max speed. The drive input buffer is designed such that at 1/3 of the max speed the buffer is large enough to hold all of the data that can come in during a reposition. That means that the drive can always write the data at the speed it arrives, something not every LTO drive can do.

The specific log page you will need if you want to see the reposition information is 34h. If you have a way to issue the SCSI Log Sense command and get back a log page 34h I can provide the information on what field has the reposition information.
Tom O'Toole
Respected Contributor

Re: detect shoe-shining


What kind of library do you have? On the MSL5000/6000 series, the front panel status display tells you what the drive is doing. You can spot possible shoe-shining if it's changing from writing... to idle... repeatedly.

It seems like the breaks you are seeing could just be an artifact of buffering on the host file system, and maybe your samples are too frequent to smooth these out. You don't say what the interval is.

Assuming this is linux, you could maybe try piping through the 'buffer' program and see how that goes.
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Cass Witkowski
Trusted Contributor

Re: detect shoe-shining

I have not looked at the capacity of a single pass on an LTO-4 tape cartridge. Be aware that when you write to the tape, the drive writes with several heads along the length of the tape, then moves the heads and writes data going back the other way. It can do this many times. What you may be seeing is not shoe-shining but the LTO-4 tape reaching the end and then switching to write in the other direction.
Cass Witkowski
Trusted Contributor

Re: detect shoe-shining

How frequently does this happen? If you see it every 2-3 minutes or 5-6 minutes then it is the tape reversing. There are four tape bands on the tape each with 12 passes. So to fill a tape 48 passes are done on the tape.
Cass Witkowski
Trusted Contributor

Re: detect shoe-shining

From the QS

Ultrium LTO-3 and the new LTO-4 tape drive features a breakthrough in recording technology by writing sixteen tracks simultaneously on a linear format of 704 tracks and 896 tracks (respectively).

896/16 = 56 passes on the tape.
1600 GB / 56 = 28.5 GB per pass
28000 MB / 107 MB/Sec = 261 seconds or 4.35 minutes
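
The same arithmetic as a small sketch, in case you want to plug in other numbers. It only restates the figures quoted above (896 tracks, 16 at a time, 1600 GB, 107 MB/s); the function name is made up for illustration.

```python
# Rough time between tape direction reversals on LTO-4, following the
# arithmetic above. All inputs are the figures quoted in this thread.
TRACKS = 896            # total tracks on LTO-4 media (from the QuickSpecs quote)
TRACKS_PER_PASS = 16    # tracks written simultaneously
CAPACITY_GB = 1600      # capacity figure used above (the 2:1 compressed rating)
RATE_MB_S = 107         # host data rate seen in the dstat output


def seconds_per_pass(capacity_gb, tracks, tracks_per_pass, rate_mb_s):
    """Return (number of passes, MB per pass, seconds per pass)."""
    passes = tracks // tracks_per_pass
    mb_per_pass = capacity_gb * 1000 / passes
    return passes, mb_per_pass, mb_per_pass / rate_mb_s


passes, mb_per_pass, secs = seconds_per_pass(
    CAPACITY_GB, TRACKS, TRACKS_PER_PASS, RATE_MB_S
)
print(f"{passes} passes, {mb_per_pass / 1000:.1f} GB per pass, {secs:.0f} s per pass")
```

Without rounding down to 28000 MB this gives about 267 seconds per pass. If the data is already compressed, the 800 GB native capacity of LTO-4 is probably the better input, which roughly halves the time per pass.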
RalfG
Frequent Advisor

Re: detect shoe-shining

Thanks for all your responses.

I'll try to get the SCSI log sense information next week.

The lib we use is an Overland Neo4100; I think it is identical to an MSL6000 lib. Watching the lib's display is not an option, because it's located in a datacenter.


Here's the network throughput (received on the backup server), from dstat with a 2-second delay. The data is already highly compressed, mostly zip files. ~830 GB of data per LTO-4 tape.

58M
10M
52M
95M
104M
90M
98M
60M
622k
71M
94M
88M
96M
93M
57M
937k
60M
94M
96M
88M
101M
60M
886k
60M
99M
96M
98M
90M
56M
849k
58M
96M
102M
104M
84M
56M
499k
72M
98M
92M
101M
81M
56M
2687k
90M
96M
89M
101M


There is a short dropout every 10-20 seconds.

The average performance I get from this backup running without spooling is: Bytes/sec=69,971,539. It seems that backing up directly over the net without spooling is faster than despooling from a Linux md raid1 device (despooling speed was ~60 MB/s).
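
To count such dropouts over a whole backup instead of eyeballing them, something like the following sketch could be run over the dstat column. It assumes that dstat reports per-second averages with 1024-based units (B, k, M, G), and it uses the 40 MB/s lower streaming limit mentioned earlier in the thread as the threshold; the function names are invented for this example.

```python
# Flag dstat samples that fall below the drive's minimum streaming rate.
# MIN_STREAM_MB_S is the 40 MB/s lower bound quoted earlier in this thread.
MIN_STREAM_MB_S = 40.0

# Assumption: dstat suffixes are 1024-based.
UNIT_TO_MB = {"B": 1 / (1024 * 1024), "k": 1 / 1024, "M": 1.0, "G": 1024.0}


def to_mb(sample):
    """Convert a dstat value such as '104M', '622k' or '70B' to MB."""
    sample = sample.strip()
    if sample == "0":
        return 0.0
    return float(sample[:-1]) * UNIT_TO_MB[sample[-1]]


def dropouts(samples, interval_s, threshold=MIN_STREAM_MB_S):
    """Return (time offset in seconds, MB/s) for each sample below threshold."""
    rates = [to_mb(s) for s in samples]
    return [(i * interval_s, r) for i, r in enumerate(rates) if r < threshold]


# First 16 samples from the listing above, taken with a 2 second delay.
samples = "58M 10M 52M 95M 104M 90M 98M 60M 622k 71M 94M 88M 96M 93M 57M 937k".split()
for offset, rate in dropouts(samples, interval_s=2):
    print(f"t={offset:3d}s  {rate:6.2f} MB/s")
```

A sample below the threshold does not prove that the drive repositioned, since the drive buffer may absorb a short gap. It only shows where to look.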

RalfG
Frequent Advisor

Re: detect shoe-shining

Here's the information about log page 34h:

sg_logs -p=34 /dev/sg6
HP Ultrium 4-SCSI B12H
No ascii information for page = 0x34, here is hex:
00 34 00 00 1e 00 00 60 02 00 00 00 01 60 02 00 d8
10 00 02 60 02 0a 50 00 03 60 02 00 ee 00 04 40 02
20 04 b0


sg_logs -h -p=34 /dev/sg4
HP Ultrium 4-SCSI B12H
Log page code=0x34, DS=0, SPF=0, page_len=0x1e
00 34 00 00 1e 00 00 60 02 00 01 00 01 60 02 00 00
10 00 02 60 02 0a 50 00 03 60 02 00 00 00 04 40 02
20 04 b0
Tom O'Toole
Respected Contributor

Re: detect shoe-shining


We are already up to B34W on the firmware, you must have a pretty early version, and should probably update that. Ours came with B23W, and I thought we had received a very early version of the drive.
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
RalfG
Frequent Advisor

Re: detect shoe-shining

AFAIK Overland has not released a more recent firmware for the Neo 4100 lib yet. And it's not possible to flash another version.
Curtis Ballard
Honored Contributor
Solution

Re: detect shoe-shining

The parameters in log page 34h have to
be interpreted with the operation that
just completed in mind. All parameters
are for the current tape, are for writes,
and are cleared at the start of a tape
load except for the native data rate which
is fixed. If the previous load didn't write much then the values can be deceptive.

Full decode from sg6 - kind of slow but
not showing shoe shining. I wonder if
this was a short read or write as I
would expect some repositioning at this
speed.

Header: 34 00 00 1e

Parameter: 00 00
Ignore this byte: 60
Parameter length: 02
Repositions per 100MB written: 00 00 << This is the shoeshine value

Parameter: 00 01
Ignore this byte: 60
Parameter length: 02
Data rate onto tape after compression in 100KB/s: 00 d8 == 216 -> 21.6 MB/s

Parameter: 00 02
Ignore this byte: 60
Parameter length: 02
Maximum Possible Data rate with this data in 100KB/s: 0a 50 == 2640 -> 264 MB/s

Parameter: 00 03
Ignore this byte: 60
Parameter length: 02
Rate of data into the drive in 100KB/s: 00 ee == 238 -> 23.8 MB/s

Parameter: 00 04
Ignore this byte: 40
Parameter length: 02
Native data rate with LTO-4 media in 100KB/s: 04 b0 == 1200 -> 120.0 MB/s

Pulling out just the shoeshine value
from sg4: 1 reposition/100MB, not bad at
all, but the rest of the data doesn't
show any data written to tape at all, so
the reposition value probably doesn't
mean much.

sg_logs -h -p=34 /dev/sg4
HP Ultrium 4-SCSI B12H
Log page code=0x34, DS=0, SPF=0, page_len=0x1e
00 34 00 00 1e 00 00 60 02 00 01 00 01 60 02 00 00
10 00 02 60 02 0a 50 00 03 60 02 00 00 00 04 40 02
20 04 b0
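
For anyone who wants to automate the decode, here is a minimal sketch that parses the hex dump as printed by sg_logs. It assumes the layout described above (4-byte header, then for each parameter a 2-byte code, one control byte, one length byte and the value), and the parameter names are simply the labels from this post, not taken from a specification.

```python
# Decode the hex dump of log page 34h as printed by sg_logs.
# The parameter names follow the decode above; they are not from a spec.
PARAM_NAMES = {
    0: "Repositions per 100MB written",
    1: "Data rate onto tape after compression (100KB/s)",
    2: "Maximum possible data rate with this data (100KB/s)",
    3: "Rate of data into the drive (100KB/s)",
    4: "Native data rate with this media (100KB/s)",
}


def dump_to_bytes(dump):
    """Strip the leading offset column from each dump line and join the rest."""
    data = bytearray()
    for line in dump.strip().splitlines():
        fields = line.split()
        data.extend(int(f, 16) for f in fields[1:])  # fields[0] is the offset
    return bytes(data)


def decode_page_34(page):
    """Return {parameter code: value} for the parameters in the page."""
    if page[0] & 0x3F != 0x34:
        raise ValueError("not log page 34h")
    page_len = int.from_bytes(page[2:4], "big")
    params, pos, end = {}, 4, 4 + page_len
    while pos + 4 <= end:
        code = int.from_bytes(page[pos:pos + 2], "big")
        length = page[pos + 3]          # page[pos + 2] is the control byte
        value = int.from_bytes(page[pos + 4:pos + 4 + length], "big")
        params[code] = value
        pos += 4 + length
    return params


sg6 = """
00 34 00 00 1e 00 00 60 02 00 00 00 01 60 02 00 d8
10 00 02 60 02 0a 50 00 03 60 02 00 ee 00 04 40 02
20 04 b0
"""
for code, value in decode_page_34(dump_to_bytes(sg6)).items():
    print(f"{PARAM_NAMES.get(code, 'unknown')}: {value}")
```

Divide the rate values by 10 to get MB/s, e.g. 216 becomes 21.6 MB/s.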
RalfG
Frequent Advisor

Re: detect shoe-shining

Thanks for the detailed information! I'll check the numbers after the next backup and report back.

RalfG
Frequent Advisor

Re: detect shoe-shining

Here are some results from last night's incremental backups: 2 jobs, only 23 GB of data backed up.

Header:
34 00 00 1e

Repositions per 100MB:
00 00 60 02 00 01 -> 1

Data rate onto tape after compression in 100KB/s:
00 01 60 02 00 94 -> 148 -> 14.8 MB/s

Maximum Possible Data rate with this data:
00 02 60 02 0a 50 -> 2640 -> 264 MB/s

Rate of data into the drive in 100KB/s:
00 03 60 02 02 f8 -> 760 -> 76.0 MB/s

Native data rate with LTO-4 media:
00 04 40 02 04 b0 -> 120.0 MB/s


And the job results from the backup program:

#1
Job write elapsed time = 00:00:03, Transfer rate = 31.13 K bytes/second

#2
Job write elapsed time = 00:12:54, Transfer rate = 29.85 M bytes/second

I'm not sure how the data rate values from 34h and the backup program correspond.