Make Tape Recovery Hanging

Jay Core
Frequent Advisor

OK Ladies and Germs, a toughie.

We perform a make_tape_recovery every month. This month, it decided to hang. We use this command:

make_tape_recovery -x inc_entire=vg00 -B /r001/uxinstlf.recovery

It usually runs about 2-4 hours (T600 - I know, I know) and this time it decided to run for 3 days. I killed it, but it still had a hung pax process that I identified with lsof. I couldn't kill that, so I had to reboot (not an easy thing to do). I then had to remove the flag file, then was able to rerun it. It ran for another 2 days, so I killed it again. I'm thinking maybe it's a bad, old, 4mm drive, but we don't have support - so.... I'm trying here.

Thanks for your time,
Joe
Marcel Boogert_1
Trusted Contributor

Re: Make Tape Recovery Hanging

Joe,

Have you tried writing to the tape device with tar? If that works, I think the problem lies elsewhere.

Regards, MB.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Make Tape Recovery Hanging

The first thing I would do is run pax (as a command) and output to /dev/null. This will help distinguish between a corrupt, looping filesystem and a bad tape drive.
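
A minimal sketch of that test, assuming a POSIX shell, find and pax. The function name pax_dry_run and the log path are made up for illustration, and this is not the exact pax command that make_tape_recovery runs internally. It archives one filesystem to /dev/null and logs each member name, so if pax stalls here too, the tape drive is not the cause.

```shell
# Archive everything under a directory to /dev/null so no tape is involved.
# Usage: pax_dry_run DIR LOGFILE   (hypothetical helper, not part of Ignite-UX)
pax_dry_run() {
    dir=$1
    log=$2
    # find -xdev stays on one filesystem; pax -d keeps pax from descending
    # into directories a second time; -v lists each member on stderr.
    find "$dir" -xdev -print | pax -w -d -v > /dev/null 2> "$log"
}
```

For example, pax_dry_run /usr /tmp/pax_usr.log followed by tail -1 /tmp/pax_usr.log shows the last file pax handled. Because of -xdev, each filesystem in vg00 would need its own run.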
If it ain't broke, I can fix that.
Carlos Roberto Schimidt
Regular Advisor

Re: Make Tape Recovery Hanging

Hi Joe,

Look for a symbolic link that points to itself.

I believe you can use the "find" command to list all symbolic links.
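
One way to sketch that search, assuming a POSIX shell and find (the function name list_unresolvable_links is invented here). It prints every symbolic link that does not resolve to anything. A link that points to itself fails the test, but so does an ordinary dangling link, so the output still needs a look by hand with ls -l.

```shell
# List symbolic links under a directory that do not resolve.
# Usage: list_unresolvable_links DIR   (hypothetical helper)
list_unresolvable_links() {
    find "$1" -xdev -type l -print | while IFS= read -r link; do
        # -e follows the link; it fails for self-referencing,
        # circular, and dangling links alike.
        [ -e "$link" ] || printf '%s\n' "$link"
    done
}
```

Running list_unresolvable_links / checks the root filesystem only, since -xdev does not cross mount points.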

Rgds

Schimidt
Jay Core
Frequent Advisor

Re: Make Tape Recovery Hanging

Marcel - good suggestion - I'll try this when I get a chance.

A. Clay - thanks - I think this may have nailed it - I'll let you know when I resolve this and close the thread.

Carlos - thanks for the info. - I'll check for the links.

Thanks again everyone. I'll let you know.
Bill Hassell
Honored Contributor

Re: Make Tape Recovery Hanging

Even though the T600 is a bit dated, the pax process should keep the tape moving all the time. I would always run the MTR command with -v so you can monitor the steps. However, pax is apparently stuck, so you have two options. You could run the same pax command with its output sent to /dev/null. Or, once the tape activity light stops, kill the pax process, then use mt -f /dev/rmt/berkeley_no_rewind to position the tape and pax -v -f /dev/rmt/berkeley_no_rewind to read the second file on the tape. The last file listed before pax aborts (because of the partial file written during the hang) is where the problem may exist.

Usually, a hang that can't be killed (I assume you resorted to kill -9, which did nothing) is due to I/O that has failed. Check syslog for any hardware diag messages, and be sure to run a cleaning tape followed by a new backup tape. This could certainly also be an infinite loop caused by a looping link.
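
A rough sketch of the syslog check, assuming the standard HP-UX log location /var/adm/syslog/syslog.log. The function name and the search patterns are guesses at what a failing drive might log, not a list of the actual diagnostic messages.

```shell
# Print syslog lines that mention SCSI, tape, or I/O errors.
# Usage: scan_syslog_for_io_errors [LOGFILE]   (hypothetical helper)
scan_syslog_for_io_errors() {
    # The patterns are assumptions; widen or narrow them as needed.
    grep -i -E 'scsi|tape|i/o error' "${1:-/var/adm/syslog/syslog.log}"
}
```

Entries dated around the time the backup stalled are the ones worth reading closely.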


Bill Hassell, sysadmin
Jay Core
Frequent Advisor

Re: Make Tape Recovery Hanging

Thanks Bill - good recommendations - I'll try this when I can reboot again.

Joe