fbackup failing overnight with unrecoverable media write error

 
SOLVED
Ian Foster_2
Frequent Advisor

The full backup on our HP N Class (HP-UX 11.11) has failed consistently for the last few nights, towards the end of the copy, with the following error:

fbackup(3013): WRITE ERROR while writing data record, at media record 2992719
fbackup(3102): attempting to make this volume salvagable
fbackup(3105): writing 2 EOFs and rewinding the tape
fbackup(3106): please mount a good tape
fbackup(3310): enter '^[yY]' when volume 1 is ready on /dev/rmt/2mn,
or '^[nN]' to discontinue:
fbackup(4001): automatic 'yes'
fbackup(3202): this is volume 1 OF THIS SESSION!
rejecting this volume
fbackup(3004): writer aborting
fbackup(1002): Backup did not complete : Reader or Writer process exit
0

The only thing that has changed lately is that the amount of data on the server has grown, increasing the size of the backup by about 8 GB to roughly 50 GB (approx. 500,000 files).

Normally I would have assumed that this was a media or hardware problem (DLT 8000), EXCEPT that: there is only one error logged in syslog and dmesg (relating to one isolated media write failure: bad media, drive needs cleaning); other (albeit smaller) backups written to the same drive with dd complete without error; and, more confusingly, the same fbackup run to the same drive with the same tapes on an ad hoc basis during the day has worked perfectly.

Under the circumstances I was a little reluctant to assume it was the drive, and I have found the following potential problems with the fbackup process (though these are not 'new' changes): 1. Our fbackup job has been written to use the no-rewind device. 2. There is no config file, so I guess blocksperrecord is at its default (16?).

Also we do not have the latest fbackup / stape patches for 11.11 installed.

I have also looked at the possibility that another process is interfering with the backup overnight. The backup has also failed when written to new media.

Has anyone come across this type of error after a backup has grown in size, where the cause was not a hardware or media problem?

Our Patrol system history suggests that system resources (CPU/memory/kernel) overnight are not an issue, and this is borne out by the fact that the backup works OK during the working day.

Any ideas gratefully received.

Pretty soon I'm just going to have to bite the bullet and change the drive to eliminate it if nothing else. Thx.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: fbackup failing overnight with unrecoverable media write error

I think you are dealing with nothing more (and nothing less) than an end-of-media condition.
If it ain't broke, I can fix that.
Helen French
Honored Contributor

I agree with Clay. This is normally an end-of-tape warning, which means you need two tapes to finish the backup. If you want it to fit on a single tape, either use a higher-capacity drive/media or split the backup.

You can find the same error and its description in the fbackup(1M) man page, under the WARNINGS section.
Life is a promise, fulfill it!
Ian Foster_2
Frequent Advisor

Thanks for the input guys. I must admit I had looked at that as a possibility, but I thought the size of the copy (i.e. the actual size of the raw data, based on bdf output for the filesystems) was well within the 70-80 GB compressed limit for the media. Maybe I am looking at this too simplistically? Additionally, the backup worked OK again last night (the only thing I changed was the device: rewind rather than no-rewind). I would have thought that once the backup got too big it would be too big every time, unless we reduced the amount of data?

If we continue to have issues I will certainly try splitting the backup over two separate tape copies.
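One way to plan such a split is to group filesystems so that each group stays under a chosen per-tape limit, then write one fbackup graph file per group using `i path` include lines. The sketch below is only an illustration: the filesystem names and sizes are hypothetical placeholders (real figures would come from bdf), the 35 GB limit is an arbitrary safety margin, and `split_into_volumes` and `graph_file_lines` are made-up helper names. It uses a simple first-fit-decreasing heuristic, which is not guaranteed to give the fewest groups.

```python
def split_into_volumes(fs_sizes_gb, limit_gb):
    """Group filesystems so each group's total size stays within limit_gb.

    First-fit decreasing: take filesystems largest first and place each one
    in the first existing group that still has room.
    """
    groups = []  # each group is {"paths": [...], "total": size_in_gb}
    by_size = sorted(fs_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for path, size in by_size:
        if size > limit_gb:
            raise ValueError(f"{path} ({size} GB) exceeds the {limit_gb} GB limit")
        for group in groups:
            if group["total"] + size <= limit_gb:
                group["paths"].append(path)
                group["total"] += size
                break
        else:
            # No existing group has room, so start a new one.
            groups.append({"paths": [path], "total": size})
    return groups


def graph_file_lines(paths):
    """Return fbackup graph file include lines ('i path') for the given paths."""
    return ["i " + p for p in sorted(paths)]


# Hypothetical sizes in GB, totalling 50 GB; not taken from the real server.
example_sizes = {"/data": 22, "/home": 12, "/opt": 8, "/var": 5, "/usr": 3}
example_groups = split_into_volumes(example_sizes, limit_gb=35)
for number, group in enumerate(example_groups, start=1):
    print(f"tape {number}: {group['total']} GB")
    print("\n".join(graph_file_lines(group["paths"])))
```

Each group's lines could then be saved to its own graph file and passed to a separate fbackup run with the -g option.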

I am also going to get fbackup and the stape patched up to date.

I can find no real evidence in the logs of a hardware problem.

Thanks again.
Chris Wilshaw
Honored Contributor

Personally, I'd be more inclined to suspect a tape, or possibly even a tape drive fault (although this is less likely given the error).

An end of tape is (or should be) reported as

fbackup(3003): normal EOT
fbackup(3316): enter '^[yY]' when volume 2 is ready on /dev/rmt/0m,
or '^[nN]' to discontinue:

As you can see, this specifies volume 2, whereas your error refers to volume 1:

fbackup(3106): please mount a good tape
fbackup(3310): enter '^[yY]' when volume 1 is ready on /dev/rmt/2mn,
or '^[nN]' to discontinue:
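For unattended overnight runs, the two cases quoted above could be told apart by scanning the fbackup log for message numbers. The following is a minimal sketch that relies only on the two message numbers shown in this thread (3003 and 3013); `fbackup_codes` and `classify` are made-up helper names, and the labels describe only what fbackup reported, not the underlying cause.

```python
import re

# Message numbers taken from the log excerpts quoted in this thread.
NORMAL_EOT = 3003   # "normal EOT"
WRITE_ERROR = 3013  # "WRITE ERROR while writing data record"

# Matches lines such as "fbackup(3013): WRITE ERROR while writing data record"
MESSAGE_RE = re.compile(r"fbackup\((\d+)\):\s*(.*)")


def fbackup_codes(log_text):
    """Return (message_number, message_text) pairs found in an fbackup log."""
    return [(int(m.group(1)), m.group(2).strip())
            for m in MESSAGE_RE.finditer(log_text)]


def classify(log_text):
    """Label a log by the most serious message number it contains."""
    codes = {code for code, _ in fbackup_codes(log_text)}
    if WRITE_ERROR in codes:
        return "reported write error"
    if NORMAL_EOT in codes:
        return "reported normal end of tape"
    return "neither message found"


sample = (
    "fbackup(3013): WRITE ERROR while writing data record, at media record 2992719\n"
    "fbackup(3106): please mount a good tape\n"
    "fbackup(3004): writer aborting\n"
)
print(classify(sample))
```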
Steve Steel
Honored Contributor

Hi

What type of drive is it?

It could be a firmware problem that only shows up after a while. This has been reported occasionally, probably when you have a lot of open files.


Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
Ian Foster_2
Frequent Advisor

Hi - the tape drive is an HP (Quantum) DLT8000 in an L20 library. Ian.
Steve Steel
Honored Contributor

Hi


Patches look like the most likely solution, in particular the stape patch.


Also try to make a new config file.
See man page

set maxretries 2


Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
Ian Foster_2
Frequent Advisor

I have an update on this problem: by playing around with the graph file for the fbackup procedure, I have determined that the backup completes successfully overnight as long as the size of the copy is under 40 GB. This suggests it may well be an end-of-media condition, though I would have expected a more meaningful EOM message.

If this is the case, it would also suggest that the drive is not using any kind of compression by default. Does anybody know how I can check this, or set it up on the library?
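A rough way to sanity-check the single-tape theory is to compare the backup size against the tape's native capacity multiplied by an assumed compression ratio. The sketch below is illustrative only: the 40 GB default is the nominal native capacity of a DLT 8000 with DLTtape IV media, the compression ratios are guesses (real ratios depend heavily on the data), and `fits_on_tape` is a made-up helper name.

```python
def fits_on_tape(backup_gb, native_gb=40.0, compression_ratio=1.0):
    """Return True if backup_gb fits on one tape.

    native_gb is the uncompressed tape capacity; compression_ratio is an
    assumed average (1.0 means the drive is not compressing at all).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return backup_gb <= native_gb * compression_ratio


# Roughly 50 GB of data, as described at the top of the thread.
for ratio in (1.0, 1.5, 2.0):
    print(f"ratio {ratio}: fits = {fits_on_tape(50.0, compression_ratio=ratio)}")
```

With no compression a 50 GB backup does not fit in 40 GB, which is consistent with the under-40 GB threshold observed above; with any assumed ratio of 1.25 or more it would fit.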

Thx - Ian.
Ian Foster_2
Frequent Advisor

Believe this was simply an end of media condition. Housekeeping to get the size of the copy down seems to have resolved the issue.