Operating System - HP-UX

Problems on an old apollo 715/50

 
Shane Travis
Frequent Advisor


Recently had to resurrect an apollo 715/50 box running HPUX 9.05 to support an old project.

Everything worked fine on first startup, and for a couple of days thereafter (which was nice, given that the machine had been turned off and in storage for 4 years). Today, however, things started to get a little flaky.

First issue was that on trying to rlogin to the machine, it hung completely. (It had worked fine yesterday.) Tried to telnet, and it kicked me immediately. Walked down to the machine itself to see what was up, and started working on it directly... couple of longer hangs and freezeups, but nothing untenable... only then I tried to remove something, and got the error "/bin/rm: syntax error at line 1: '(' unexpected" Wuh-oh.

On reboot, I saw many more of these /bin/rm errors. On startup, the machine no longer knew who it was, thinking that it was unnamed; this meant that the netlinkrc files didn't execute properly, so I have no networking any more.

I'm going to guess that this is some sort of library corruption, but I have NO idea how to track down what or where. I have tape backups of this machine from before it went into storage; are they going to show the same corruption, or since it worked fine for a couple of days are the backups probably fine?

Other problem with backups is that there is no tape drive attached to this machine; backups were made over the network on another machine's drive. With no networking, it's going to be difficult to get information off the tapes...

Any suggestions, thoughts, similar experiences, moving stories, or offers of beer in condolence?
Rick Garland
Honored Contributor

Re: Problems on an old apollo 715/50

Maybe a fifth of Wild Turkey would be better condolence.

Do you have the 9x install CDs? Maybe try to reinstall the OS and see if you can scrounge a tape drive from somewhere to connect to this system.

What about installing 10.20 on this system? I am pretty sure 10.20 will run on this box. Don't know about the apps though.

I believe you are SOL otherwise.

Jeff_Traigle
Honored Contributor

Re: Problems on an old apollo 715/50

Um... it's so tempting to be sarcastic since I left a place that was still running 9.00 and 9.07 on a couple of production systems, but I'll restrain myself. :) I actually had a library go bad on the 9.00 system which caused part of the application to stop working, though the OS itself was fine. We left it broken hoping that it would convince someone to replace the application and the system... Y2K plus a few years and it was still running with the broken application.

As for your situation...

With that old of a box and it having sat idle for 4 years, I'm amazed it booted in the first place, but wouldn't be surprised if the hard drive is choking.

Are the backups likely good? Well, maybe... but tape reliability deteriorates after a while too, so even if the backup *was* good when it was made, no telling if the tape is still good now. You won't know until you try... assuming you can get the system limping well enough to have network connectivity again to access the remote tape and frecover isn't equally broken at this point. If you can manage to find the 9.05 media, you might try restaging from that and then restoring from the remote tape drive.

The next option might be to visit whoever came up with the idea of reviving this system to support an old project after four years with a baseball bat... especially if they say it's a "critical" project and the system *has* to work. :)
--
Jeff Traigle
Shane Travis
Frequent Advisor

Re: Problems on an old apollo 715/50


History: the box is part of a whole setup that was commissioned in 1991, and is still running on (basically) the original hardware... with a couple of upgrades, such as moving the workstations from 300-series to these nice Apollo boxes, and going from HP-UX 8.0 to HP-UX 9.05 and going to G30s for the servers. Moving to any OS beyond what currently exists is not an option, because the project is very large (about 300k lines of original code) and was written specifically to work on the 8/9 architecture. They considered going to HPUX 10, but the entire software suite we wrote would have to have been regression-tested. We only ever had 5 customers, and none of them were willing to pay for such testing *then*, so our sole remaining one certainly isn't willing to pay for it now.

The last remaining installation is in India, and they don't really care about the fact that the hardware is 10 years old, or the OS archaic, as long as it works... which it still does, currently. They're the ones that need an upgrade to some functionality, because one of their input providers is changing the message format, so it's either upgrade, or buy a whole new system. Upgrading is cheaper, so I get to make this stuff work again.


More bad news: I did try doing an 'frestore' from the backup tape onto another drive somewhere else, and it told me that it wasn't a recognizable fbackup. :(


Now the good news: Turns out that there's another HD in the system that has another copy of the OS on it. Perusing it, looks like the initial HD was basically copied wholesale onto this second one at some point... so I've got something that's not quite a mirror (in that it's not current with latest changes) but is at least a snapshot of what the system looked like about a year before it was crated up. Since this is just a workstation, and the mission-critical software is actually on another machine, this image being a year old isn't a problem at all. I've since booted from it, and it networks fine, runs fine, etc. Looks like I'm saved...

... for the moment.

I'm well aware that this stuff is old, and old stuff is (more) prone to failure. I'd like to be as secure as I can. So, my questions are:
1) I can mount the mostly-but-not-quite working original HD, and look at all its files. What should I try copying over from the working system to see if I can get it going again? bin, lib, opt... anything else?
2) In place of #1, could I just copy the whole of the working drive onto the non-working drive? If so, what commands? I know 'cp' ain't gonna cut it... I'd like more of a bitwise copy to get an image, I think... yes? If so, what?
3) Right now, I've got one good HD, and no (usable) backups. As I write, I'm doing an fbackup to tape. What else can I do with a system of this vintage?

I think I'll take that Wild Turkey now, Rick...
Gerard Leclercq
Trusted Contributor

Re: Problems on an old apollo 715/50

Hello,

I have no solution on your problem. It can be a local hardware problem on your HD, or a lan ID or IP address identical to another machine on the network.
It seems that /bin/rm no longer has the status of a compiled executable, but that of an ASCII text file. I do not know how to check that status. It is as if you had typed
"sh /bin/rm".

About your questions :

1)If the 2 HD are hpux 9, save the original files you want to replace by renaming the files. So they will occupy the same place as before.
Then copy the new files from the working new HD.
The new files will be on a new physical place on the old not-working-well HD.
Begin with /bin/rm and test it, something like :
# mv oldHD/bin/rm oldHD/bin/rm_original
# cp newHD/bin/rm oldHD/bin

2)I would do 1) before, to try to know what happens.
Then, if you can learn nothing from it, or if it does not work, use dd instead of cp.
If the two HDs are the same size, it will work.
(When rebooting on the old HD, it will do fsck, since the fsclean bit is not set)
If not the same, I do not know.
Before doing dd, I would do a mediainit, which physically checks the hardware.

3)There is a command to do a bootable system backup on tape, but I do not know its name (fbackup ?).
Anyway, find it and do it.
Then find another 715 for spare. I've got one, but you are a little bit too far.
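
The rename-then-copy idea from 1) can be wrapped in a small function so each file gets the same treatment. This is only a sketch, written for a POSIX-style shell while booted from the good disk; the mount points passed in (the good root and the bad disk's mount point) are hypothetical, and the tools on 9.05 may need small adjustments.

```shell
# replace_from_good GOOD_ROOT BAD_ROOT RELPATH
# Keep the suspect file beside its original location under a new name,
# then copy the known-good version into place.
# GOOD_ROOT and BAD_ROOT are hypothetical mount points chosen by the caller.
replace_from_good() {
    good="$1/$3"
    bad="$2/$3"
    # Refuse to touch anything if there is no good copy to install.
    [ -f "$good" ] || { echo "missing good copy: $good" >&2; return 1; }
    # Rename rather than delete, so the suspect file can still be examined.
    if [ -f "$bad" ]; then
        mv "$bad" "${bad}_original" || return 1
    fi
    # -p keeps the mode and timestamps of the good copy.
    cp -p "$good" "$bad"
}
```

Called as "replace_from_good newHD oldHD bin/rm", it leaves oldHD/bin/rm_original behind for later inspection.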

Gerard
hein coulier
Frequent Advisor

Re: Problems on an old apollo 715/50

Hi,

If i dig deep, very, very deep in my memories, i recall that the reason (or the luck) for us to expel our old apollo servers, was that there was a problem with 'time since epoch'. I think it was an 8bit-value that would overrun (was it november 97, i can't recall).

Although i can't remember if the problem was software (we ran Domain OS), or hardware, perhaps it can be of some help to you.

you are a brave human being.
Dave Unverhau_1
Honored Contributor

Re: Problems on an old apollo 715/50

Shane,

A couple of points to ponder...

1- Any idea how the backup was made (what utility)? If it's an ignite tape (heck...I don't know if ignite existed before 10.x), you might be in luck.

On the system with the tape drive, put in the tape and execute the command

mt fsf 1

Then try using tar to examine the contents. If this works, you probably have a complete backup and can recover by hooking the tape drive to the 715 and booting from it.

2- When you boot off of the alternate drive, does the make_recovery command work? This would also tell you whether Ignite existed on this system. (try "man make_recovery")

3- As Gerard said, the dd utility would be the best way to duplicate the working drive onto the non-working one, but the destination drive must be at least as large as the source drive. (If the destination drive is larger than the source, the dd will effectively reduce the size of the destination to match the source!)

One more thing. A quick check of the non-working drive would be to use dd to dump the contents to the bit bucket:

dd if=/dev/rdsk/ of=/dev/null bs=4096k

If it completes without error, your drive is *probably* physically healthy.

Good luck!

Dave
Romans 8:28
Gerard Leclercq
Trusted Contributor

Re: Problems on an old apollo 715/50

Hein, just a note about date.

The 715/50 is maybe not supported for year 2000, but it works. hpux 9.xx is not supported by HP for year 2000, but it works. Maybe there will be some malfunctions, but not of the type mentioned by Shane.

The "time since epoch" problem, one of the many date problems raised around year 2000, only strikes in year 2038. I think that Shane has a good safety margin ...
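
For anyone wondering where 2038 comes from, the arithmetic can be checked in a shell. This sketch assumes the clock is a signed 32-bit count of seconds since 1 January 1970 (an assumption about the time type, not something verified on a 9.05 box), and it needs a shell with $(( )) arithmetic such as ksh.

```shell
# Largest value a signed 32-bit seconds counter can hold (assumed time type).
max_seconds=2147483647
# Average Gregorian year: 365.2425 days * 86400 seconds.
seconds_per_year=31556952

# Year in which the counter runs out, counting from the 1970 epoch.
epoch_limit_year() {
    echo $(( 1970 + max_seconds / seconds_per_year ))
}

epoch_limit_year    # prints 2038
```

The exact rollover moment is 19 January 2038 at 03:14:07 UTC.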

I had many 715/xx running hpux 9.xx after year 2000, and now still have two 715/xx running hpux 10.20 without any problem.

Gerard
V. Nyga
Honored Contributor

Re: Problems on an old apollo 715/50

Hi Shane,

do you still need help?

If you can mount the first disk without problem, then it should be no problem to copy rm.

To your tape:
Have you tried to read it through SAM?
There's a menu 'Backup and Recovery'. It also works with backups over network.

HTH
Volkmar
*** Say 'Thanks' with Kudos ***
Kent Ostby
Honored Contributor

Re: Problems on an old apollo 715/50

Your other option is to create a new backup off of the second disk and restore rm back to it.

Also, can you get both disks up enough to check the differences in installed patches ?

If there are no differences then you'd even be able to pull some libraries over from "working old disk" to "messed up newer disk".
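
To decide which libraries are worth pulling over, one approach is to compare the two trees byte for byte and list only what differs. A rough sketch follows, assuming a POSIX-style shell with find and cmp; both directory arguments are hypothetical mount points. Since the good disk is an older snapshot, some differences will be legitimate changes rather than corruption.

```shell
# list_differing GOOD_DIR BAD_DIR
# Print the relative path of every regular file under GOOD_DIR whose
# counterpart under BAD_DIR is missing or not byte-identical.
list_differing() {
    # The cd happens in a subshell, so the loop still runs in the caller's directory.
    ( cd "$1" && find . -type f -print ) | while IFS= read -r rel; do
        if [ ! -f "$2/$rel" ]; then
            echo "missing: $rel"
        elif ! cmp -s "$1/$rel" "$2/$rel"; then
            echo "differs: $rel"
        fi
    done
}
```

Running it on one directory at a time (lib first, then bin) keeps the output short enough to read.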

Finally if you want to see if that first disk is broke, you may be able to verify it by:

dd if=/dev/rdsk/ of=/dev/null bs=64k

This command will run for a long time... it should come back and say so many records in and so many records out. If it also says "I/O error" or something similar, then it's broke.
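
The same read test can be wrapped so the verdict is explicit instead of buried in the dd output. This is a sketch only: the device name is whatever raw device the caller passes in (none is assumed here), and it relies on dd returning a non-zero exit status when a read fails.

```shell
# read_check DEVICE
# Read DEVICE from start to end and throw the data away.
# dd's own "records in / records out" summary still goes to stderr.
read_check() {
    if dd if="$1" of=/dev/null bs=64k; then
        echo "read OK"
    else
        echo "read FAILED"
        return 1
    fi
}
```

A clean pass only shows that every block could be read; it says nothing about whether the contents are correct.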

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Shane Travis
Frequent Advisor

Re: Problems on an old apollo 715/50

Thank you everyone for your help; sorry to take so long to respond and assign points.

Dave - no ignite in HPUX 8 or 9 - it was a new utility in 10.

Hein - Gerard is right; 'not supported' != 'not working'. Every 9.0 (and even 8.0) box we have still runs fine well past Y2K... it's just that HP won't *guarantee* that they'll run fine without errors past Y2K.

Gerard - if these machines aren't turned off by 2038, it won't be my problem any more, as I'll be safely retired. :-) :-)

Kent - patch levels were the same. Disk 2 (good) was a copy of disk 1 (no longer good) at one point, and little changed past that point.

I ended up copying over a number of libraries from the good disk to the faulty one, and that seemed to do the trick. This gives me two working disks again, which is good enough for now. By Christmas I'll know if I have to keep supporting this system for the next five years (in which case I'll be getting lots and lots of backup HDs, and at least one more complete Apollo box) or if it's going to be turned off early next year.

Thanks everyone for the help.