frecover part 2 - trouble

 
SOLVED
Fred Martin_1
Valued Contributor

frecover part 2 - trouble

This is an addendum to my previous frecover thread, closed earlier today.

When I did the fbackup, I used this:

fbackup -g graphfile -v -c fbconf -f $TAPE

I followed it with:

frecover -v -I tapelist -f $TAPE

The tapelist looked great; all files matched my file list of what was in the file systems backed up.

Now I'm doing this:

frecover -x -i /test -v -f $TAPE

I'm trying to restore /test, one of three file systems that were backed up.

But, after the tape blinks away for 3-4 minutes, I get the following:

frecover(2114): read error from input device (I/O error)
frecover(2114): read error from input device (No such device or address)
frecover(2113): unable to resync backup media
frecover(4301): /test not recovered from backup media

...this is very, very bad.

TAPE = /dev/rmt/c4t6d0BEST

That appears correct per ioscan and hasn't changed.

Any advice? S.O.S.

Fred
fmartin@applicatorssales.com
24 REPLIES
James R. Ferguson
Acclaimed Contributor

Re: frecover part 2 - trouble

Hi Fred:

Try cleaning the tape drive. Do this at least twice. If you have another drive, clean it and try your restore there. If you have a duplicate copy of the backup, try that.

If the 'frecover' fails after cleaning and/or switching tape drives, get a new tape and try 'tar'ing a directory like '/etc' to it. It is possible that the tape drive itself has gone bad.

Regards!

...JRF...
Steven E. Protter
Exalted Contributor

Re: frecover part 2 - trouble

Shalom Fred,

I would run some dd read tests on the tape.

Keep trying devices until you get one to work.

It might simply be a bad tape. But the "No such device or address" message tells me something else.

Then manually rewind the tape from the command line; if there is anything on the tape, you should be good to go.
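
For anyone who has not used dd in a while, a read test can be as simple as copying the tape to /dev/null and watching for errors. The following is a minimal sketch rather than a definitive procedure: the function name tape_read_test and the 64k block size are assumptions for illustration. Note that dd stops at the first tape file mark, so on a tape holding several tape files it may exercise only the first segment.

```shell
# Hypothetical helper: read a tape device and discard the data.
# The 64k default block size is an assumption; adjust it for your drive.
tape_read_test() {
    dev=${1:?usage: tape_read_test device [blocksize]}
    bs=${2:-64k}
    # dd exits non-zero on a read error; its record counts go to stderr.
    if dd if="$dev" of=/dev/null bs="$bs"; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev" >&2
        return 1
    fi
}
```

Usage would be something like tape_read_test "$TAPE", once per drive and tape combination you want to compare.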


SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Thanks. I'm remote at the moment but will be going in today or tomorrow and will clean the drive. I'll get back to you.
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Steven, can you tell me a little more about dd tests? I haven't used dd in about 10 years when I was at another job.

Cleaning the tape drive now. I have an external tape drive which I can hook up if needed. That's been idle about three years but worked when I took it out.

Fred
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Anyone, this tape drive is alone on a cable, it's the only device on the SCSI card. Can I power off the drive and swap it with another without bringing the server down?
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Folks,

/test is backed up daily so I tried three tapes, two recent and one a few weeks old. Each failed.

Re-seated cables for the tape drive (model C6375A), nothing there.

I tried swapping in the other tape drive (model C5658A) but couldn't get it to be seen by ioscan. I'm not much of a hardware guy. I seem to recall having to do something special with tape drivers, but I don't know. Advice there would be welcome.

Anyway, that means I'm still stuck. It has occurred to me that this drive has been bad for weeks and that all my tapes are useless because of it. Hoping that's not the case.

Interesting that fbackup runs a reasonable amount of time, doesn't bomb. When I pull the tape listing off with frecover -I it gets the listing without complaint. It must be at the head of the tape, it only takes a moment to retrieve it.

On my frecover -x to get /test back, it runs for maybe 6-8 minutes before failing.

/test is a misnomer. The data is development stuff and is important but not modified often. There are 10 or so daily tapes here that have the critical stuff on them.

So, still in a fix.
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Additional information: for kicks I grabbed a tape that's been sitting around, a temp backup I'd done and labeled November '08. Pulled the tape listing off of it; found a directory on there that had since been deleted.

Restored it, no problem.

So - not a bad drive? But, 15 bad recent backups?

Bill Hassell
Honored Contributor

Re: frecover part 2 - trouble

You are correct. The -V and -I options read the head of the tape where the header (-V) and a complete index (-I) of all the files reside.

The I/O error indicates that the tape drive could not read the data and then got confused. ioscan is useless as a diagnostic except to indicate that the drive responded to a SCSI ID command. Modern streaming tape drives have extremely high bit densities and can indeed become unreliable after years of usage. You can replace the tape drive but there are many, many models. If this is a DDS drive (DDS is sometimes called DAT, but only the plastic case around the tape is similar), note that DDS drives come in DDS1, 2, 3, 4 and 5 models and are not backward compatible (i.e., a DDS4 tape cannot be read on a DDS3 or earlier drive).

There are also several SCSI interfaces, some of which are completely incompatible with each other. You can swap tape drives but they should have the same model number. Otherwise, you'll need to post your current model and the others that you have to see what may work as a replacement.

If you have critical data on these backup tapes, there are several data recovery companies that can read most unreadable tapes -- for a price. Reading the data involves very expensive recovery equipment that can be adjusted to compensate for the wear that is causing the I/O errors.


Bill Hassell, sysadmin
Dennis Handly
Acclaimed Contributor

Re: frecover part 2 - trouble

>This is an addendum to my previous frecover thread, closed earlier today.

Note: You can always reopen your thread and continue there if this is a continuation. Your other thread was:
http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1352621
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

These are both DLTs, model numbers noted above. My plan of using one as the spare turned out to be a bad one. At least, I should have tested that out beforehand.

The logical conclusion I come to is that this drive is failing to write, but can read. It appears that all 15 of my recent daily tapes are unreadable, although the drive just easily read a much older tape.

The implications are really ugly. I can't recover the /test file system now, but worse I haven't a single backup of other far more critical data.

The spare tape drive that I have - which I couldn't get working today - was previously on a D class server which I have mothballed.

I think I will need to start up the D, copy my critical data to it, and write it to that tape drive to get it off-site.

Next week when I can talk to vendors I'll get another drive here.
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

A question about verifying a tape, given the above disaster. How do you verify an fbackup then? (I know that the only real way is to restore from it, but aside from that...)

My "verify" reads the tape listing back, with frecover and -I option.

So I have three lists in the end.

I have the file listing. This was done with a find command before the backup. For example:

find /db /test > flist

Then I do the fbackup, like this:

fbackup -i /db /test -v -f $TAPE 1>>vlist

When the backup is done, I get a tape listing like this:

frecover -I rlist -f $TAPE

I'm having trouble restoring now. Is it likely that the tape is corrupt if I can see all the files, in all three lists?

I just tried a "live test" and got the same errors as above during the frecover, even though all the files listed during the fbackup.
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Sorry for the typo.

fbackup -i /db /test -v -f $TAPE 1>>vlist

That should read:

fbackup -i /db -i /test -v -f $TAPE 1>>vlist
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

...and

frecover -I rlist -f $TAPE

should read:

frecover -x -I rlist -f $TAPE
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

...not my day...

frecover -I rlist -f $TAPE

should be:

frecover -v -I rlist -f $TAPE
James R. Ferguson
Acclaimed Contributor

Re: frecover part 2 - trouble

Hi (again) Fred:

> A question about verifying a tape, given the above disaster. How do you verify an fbackup then?

You never do, in my opinion. You can write without errors to a tape; read it back successfully; put it in storage; and then when you _really_ need it you find it's unreadable. That's why multiple backups are another level of protection for really important data!

> My "verify" reads the tape listing back, with frecover and -I option.

And that only means that the _index_ of files that 'fbackup' wrote to the _beginning_ of the tape can be read again.

If you _really_ want to read the entire 'fbackup' tape in a verification-only mode, you need to use the '-N' switch for no-recovery. Even the contents of the index file read by the '-I' option may not match what actually got written to the tape. Using '-N' will both give a correct accounting of the tape content and actually read the whole tape. See the 'frecover' manpages!
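
As a rough illustration of a verification-only pass, the sketch below wraps frecover with '-N' and keeps the file listing and any error messages in separate files. The function name verify_tape and the output file names are hypothetical, and pairing '-r' (whole backup) with '-N' is an assumption to confirm against the 'frecover' manpage for your release. Because the exit codes are not relied on here, any output on standard error is simply treated as a reason to look closer.

```shell
# Hypothetical wrapper: read the whole backup without writing anything to disk.
# Assumes frecover is on PATH; files.out and errors.out are made-up names.
verify_tape() {
    dev=${1:?usage: verify_tape device [outdir]}
    out=${2:-.}
    # -N: no recovery (read only); -v: list each file as it is read from tape.
    frecover -r -N -v -f "$dev" >"$out/files.out" 2>"$out/errors.out"
    # Exit status is ignored on purpose; a non-empty stderr capture is the signal.
    if [ -s "$out/errors.out" ]; then
        echo "verify: messages on stderr, see $out/errors.out"
        return 1
    fi
    echo "verify: no messages on stderr"
}
```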

Regards!

...JRF...
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Thanks so much.

As it turns out, I got an older tape drive to fire up, finally, and the tapes in fact can be read. I'm doing restores right now. Whew.

This one scared me. I've relied on the tapes and was trusting what I thought was a good verify indicator.

So my bad drive is writing OK but can't read. Replacements on the way.

I don't know why I never used the -N switch, that one got by me but I'll be doing that now.

And, I guess we'll look at other options now too, disk-to-disk-to-tape, or something. Just so I have some other plan.
Bill Hassell
Honored Contributor

Re: frecover part 2 - trouble

And just to amplify the -N option a bit, frecover actually reads the tape records, checks the tape drive status, and then computes a checksum based on the data read from the tape. Then that checksum is compared with one that is written at the end of each block by fbackup. This offers the best verification that the tape is OK without having to compare every byte to the original disk records.

A backup strategy starts with a backup frequency slightly more often than you can afford to lose the data. Then realize that all mechanical devices will wear out. fbackup writes a usage counter on the header (the -V option in frecover) and after 200 writes, will reject the tape. You can erase this header with tar or any other backup tool and start over but it would be like riding on bald tires.

Just a note about your C6375A. This is a HIGH-VOLTAGE SCSI fast-wide differential interface. NOTHING is compatible with it that is not HV-FWD. That's why the C5658A did not show up in ioscan -- it is electrically incompatible. The HV-FWD interface is unusual in that it can drive extremely long cables (something like 30 meters or 100 ft if I remember correctly). But the interface was never adopted by other vendors and it quietly faded away. And because it has a unique (high) voltage, it can't be used with any other SCSI device including FWD. The only products that work are long since out of production -- which is good because they are sometimes a real bargain.


Bill Hassell, sysadmin
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Thanks for the detail on the tape drives. I mentioned that I finally got the older tape drive to fire up. Just to be clear, it was on an older server that I brought out of mothballs for this event, not my current server. It saved the day.

Two drives are on the way, a replacement and a proper spare.

Now I need a plan for the future. I got the "this can never happen again" thing from my boss; and he is right.

I can't restore every daily BU that I do, just to prove that the tape can be read.

It sounds like frecover with -v -N is good. How about if I did that every day, comparing both the resulting file list and the messages on standard error? (I didn't see anything about error codes in man frecover.)

Then once or twice a month, do a restore of that day's tape to a temp space and have a look?
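
The daily comparison described above could be sketched roughly as follows. This is only an illustration: missing_from_tape is a hypothetical helper, and it assumes both inputs have already been reduced to one bare path per line (the find output is already in that shape, but frecover -v output may need trimming first).

```shell
# Hypothetical helper: print paths that are in the pre-backup list
# but absent from the list produced while reading the tape back.
missing_from_tape() {
    expected=${1:?usage: missing_from_tape expected-list tape-list}
    actual=${2:?usage: missing_from_tape expected-list tape-list}
    # comm requires sorted input; sorted copies are written beside the originals.
    sort -u "$expected" >"$expected.sorted"
    sort -u "$actual" >"$actual.sorted"
    # -23 suppresses lines unique to the tape list and lines common to both.
    comm -23 "$expected.sorted" "$actual.sorted"
}
```

An empty result means every path in the first list also appeared in the second; anything printed is worth a look before the tape goes off-site.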

What would be the common practice?

I'm sure there must be threads here already on the subject of backup strategies, I'll be searching.
Bill Hassell
Honored Contributor

Re: frecover part 2 - trouble

As a tape strategy, I would decide on daily full backups versus incremental. The incremental will run much faster, thus minimizing the backup time. But restoring gets a little more complicated although fbackup/frecover make this fairly easy with the frontend index. If you have a small system and a full backup fits on a single tape, the strategy is fairly easy.

You'll also need to decide how long to keep the old tapes. A lot depends on regulations or the need to go back several days for certain files. A starting point is daily incremental, weekly full, keep 1 month of all tapes, and keep one full backup for each month older than 1 month.
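
A weekly-full, daily-incremental rotation like that starting point might be scripted along these lines. It is a sketch under assumptions: the function names are hypothetical, Sunday is picked arbitrarily as the full-backup day, and level 1 is used for the other days, which picks up changes since the last lower-level backup. The -u option needs a graph file (-g) so that fbackup can record the session for later incrementals.

```shell
# Hypothetical schedule: level 0 (full) on Sunday, level 1 on other days.
# The argument is the day of week as printed by `date +%w` (0 = Sunday).
backup_level() {
    case ${1:?usage: backup_level day-of-week} in
        0) echo 0 ;;
        1|2|3|4|5|6) echo 1 ;;
        *) echo "bad day of week: $1" >&2; return 2 ;;
    esac
}

# Not invoked here; GRAPH and TAPE are assumed to be set by the caller.
run_backup() {
    level=$(backup_level "$(date +%w)") || return
    # -u updates fbackup's record of past backups; it requires -g.
    fbackup -"$level" -u -g "${GRAPH:-graphfile}" -f "${TAPE:?set TAPE first}"
}
```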

And of course you are taking a weekly or 2x monthly Ignite backup for vg00. frecover can't run without HP-UX so you need the bootable Ignite tape for complete loss of vg00 or the boot area.

As far as testing old tapes, reading the backup with -N is a good choice but this doubles the wear on the tape and tape drives. I would check a midweek and a weekly full.

There are other strategies that depend on the magnitude of the data (GB, TB, PB...), quantity of servers and number of tape drives and tape changers. And as you leave the world of small machines, fbackup/frecover can become cumbersome for record keeping. That's when HP's Data Protector (or similar) becomes mandatory.


Bill Hassell, sysadmin
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Hate to admit it, but I tried to get a bootable tape made with ignite but was getting some errors that I couldn't figure out. Shelved that project and never got back to it.

So, no I don't. When my new tape drives arrive I will make that a priority, will open a thread here for it if I still have troubles.

Fred
Bill Hassell
Honored Contributor
Solution

Re: frecover part 2 - trouble

> I tried to get a bootable tape made with ignite

I hope your vg00 disks are mirrored. If you lose your boot disk, none of the fbackup tapes are of any use.

I would run your Ignite image immediately (before your disk fails). Then post the errors at the bottom of the listing. Ignite actually performs a sanity check on your system and it often points out big problems that you haven't seen yet. Open a new forum topic with your Ignite results.


Bill Hassell, sysadmin
Fred Martin_1
Valued Contributor

Re: frecover part 2 - trouble

Poniatowski's HP-UX 11 book calls the utility out as make_recovery, but it's make_tape_recovery on my system. He mentions a -C switch which I guess I don't have.

Anyway, I just ran it as:

# make_tape_recovery -p -a $TAPE -A -v

...no errors. That was a surprise.

I'll open a new thread about it, I'm sure I'll have questions.
James R. Ferguson
Acclaimed Contributor

Re: frecover part 2 - trouble

Hi (again) Fred:

> Poniatowski's HP-UX 11 book calls the utility out as make_recovery, but it's make_tape_recovery on my system

The name 'make_recovery' (as I recall) was old nomenclature. You now have 'make_tape_recovery' and 'make_net_recovery'.

Regards!

...JRF...
Bill Hassell
Honored Contributor

Re: frecover part 2 - trouble

As mentioned, Marty's book has aged (as does all documentation...) and the correct command is the one you have.

I would recommend:

make_tape_recovery -I -v -x inc_entire=vg00 -a $TAPE

This makes sure you are saving just vg00 and that the restore tape will give an interactive menu for recovery.


Bill Hassell, sysadmin