System Hangs

 
Michael LaRoche
Frequent Advisor

System Hangs

Hi,

I'm managing a system running OpenVMS 6.2-1H3 that hangs intermittently. It doesn't crash, so we have to hit the reset button. The machine is an AlphaServer 400 4/233. Most of the patches are not installed, and I'm wondering whether I need to install a patch to fix the problem and, if so, which one.

Thanks,
Mike
18 REPLIES
Kris Clippeleyr
Honored Contributor

Re: System Hangs

Mike,

Welcome.

Instead of hitting the reset button, can't you press Ctrl/P on the console and then type "crash" at the chevron prompt (>>>)?
This will produce a crashdump which can then be analyzed.
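
A rough sketch of the analysis step after the reboot, assuming the dump was written to the default dump file SYS$SYSTEM:SYSDUMP.DMP (an assumption; some systems dump to the page file instead) and that the account used can read that file:

```
$ ! Assumption: the dump file is the default SYS$SYSTEM:SYSDUMP.DMP
$ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
SDA> SHOW CRASH
SDA> SHOW SUMMARY
SDA> SHOW STACK
SDA> EXIT
```

SHOW CRASH reports the bugcheck information and the state of the CPU at the time of the crash, and SHOW SUMMARY lists the processes that existed at that point, which is usually where the analysis of a hang starts.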

Greetz,

Kris (aka Qkcl)
I'm gonna hit the highway like a battering ram on a silver-black phantom bike...
Mohamed K Ahmed
Trusted Contributor

Re: System Hangs

You should consider upgrading; version 7.3-2 is available now.
Also install the patches.
Most probably, the patches or the upgrade will solve the problem.

Mohamed
Ian Miller.
Honored Contributor

Re: System Hangs

Anything in the error log?
Check that you have the latest firmware:
ftp://ftp.digital.com/pub/DEC/Alpha/firmware/archive/astn400.html
____________________
Purely Personal Opinion
Michael LaRoche
Frequent Advisor

Re: System Hangs

As for upgrading the system, that's not an option as the machine will be replaced within the next year.

We will try Ctrl/P the next time this happens.

Nothing in the error log for the time it hangs.
Robert Gezelter
Honored Contributor

Re: System Hangs

Mike,

Getting the crash is one thing. Applying the available patches for 6.2-1H3 and updating the firmware are good suggestions (at least they eliminate possibilities).

Can you tell us anything about what is happening on the system at the point that it appears to hang? Is it hung for all users, or only for some users?

The more information that you can provide, the more helpful we can be.

- Bob Gezelter, http://www.rlgsc.com
Volker Halle
Honored Contributor

Re: System Hangs

Mike,

troubleshooting a hanging system is NOT trivial. Here are a couple of steps to take to at least get a forced crash. This is the ONLY way to analyze the hang without pure speculation and guessing:

- the AlphaServer 400 does not have a separate HALT button. You need to re-jumper the Restart/Halt button to cause a HALT, so that you can type CRASH at the >>> console prompt once the system is hung. CTRL-P may work on the serial console.

http://h18002.www1.hp.com/alphaserver/download/ek-pcdsa-ui-b01.pdf

- without a forced crash, errlog buffers from a hanging system can't be written to the system disk, so there is nearly no chance to find something in ERRLOG.SYS
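
As a sketch of how to check the error log once a forced crash and reboot have flushed the buffers, assuming the log is in its default location SYS$ERRORLOG:ERRLOG.SYS and that the hang happened within the last day (both are assumptions to adjust):

```
$ ! Assumption: default error log location; adjust /SINCE to the time of the hang
$ ANALYZE/ERROR_LOG/SINCE=YESTERDAY SYS$ERRORLOG:ERRLOG.SYS
```

Entries time-stamped shortly before the hang are the ones worth reading first.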

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: System Hangs

If you have Performance Advisor active, I would check for high-priority CPU loops, the number of processes and memory usage before anything else.

Wim
Wim
Jan van den Ende
Honored Contributor

Re: System Hangs

Mike,


"As for upgrading the system, that's not an option as the machine will be replaced within the next year."

Replaced by a newer VMS machine? In that case, it is still very wise to do the upgrade now and the move later.
That way, you separate any potential upgrade issues from any potential move issues.

-- any other replacement is to be considered a serious downgrade :-)

Proost.

Have one on me.

Seasonal greetings to all!

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Anton van Ruitenbeek
Trusted Contributor

Re: System Hangs

Mike,


"As for upgrading the system, that's not an option as the machine will be replaced within the next year."


An upgrade will take roughly several hours. That is peanuts considering you have users who hang when the system hangs.
If you are not a 24x7 company, you can do this, e.g., over a weekend. If you have special applications, the best thing is to rent an Alpha for a week, do an upgrade (not a reinstall) of your current system on it, and check. If everything works OK, do it on your production system. If you're in a cluster, you can do it on the fly .....

If you have a 24x7 environment, then you of course have a cluster, and then you can also test it on the fly.

AvR
NL: Meten is weten, maar je moet weten hoe te meten! - UK: Measuremets is knowledge, but you need to know how to measure !
Keith Parris
Trusted Contributor

Re: System Hangs

Availability Manager (or DECamds) is a good way to see what's going on within a system during a "hang". This comes free with a VMS license on Alpha or VAX and there's even a kit on the OS media.

The Data Analyzer for AM can run on either a PC or a VMS box; the earlier DECamds software runs only on a VMS box. There's a Data Provider (the same for either AM or DECamds) that is installed on the box to be monitored (it shows up as RMDRIVER).

Since queries come in over the LAN at IPL 8, unless the box is locked up tight at hardware interrupt level or something, you can see what's going on even if you can't log in interactively. See http://h71000.www7.hp.com/openvms/products/availman/index.html
David B Sneddon
Honored Contributor

Re: System Hangs

Mike,

How intermittent are the hangs?
Is it possible something is using up all your
pagefile space?
As Keith suggests, DECamds can be very useful
for these types of situations.
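
A quick way to check page file usage while the system is still responsive (for example, from a periodic batch job in the hours before the hangs usually occur; the scheduling is an assumption, not something described in this thread):

```
$ ! Shows the page and swap files with their free and in-use space
$ SHOW MEMORY/FILES
```

If the free space in the page file keeps shrinking from one run to the next, that points toward a process steadily consuming it.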

Regards
Dave
Wim Van den Wyngaert
Honored Contributor

Re: System Hangs

I am always amazed at how readily everyone does "upgrades" without requalifying the applications. I freeze my systems the moment the qualification is done.
And no one-time, uninvestigated system hang will motivate an upgrade.

Wim
Wim
Volker Halle
Honored Contributor

Re: System Hangs

Mike,

using DECamds/Availability Manager also allows you to force a crash remotely (without the need to change motherboard jumpers) if the system hangs (assuming it does not hang at or above IPL 8).

You just need to configure the correct security in AMDS$DRIVER_ACCESS.DAT to allow WRITE access from your AMDS Data Collector node.

If you keep an AMDS Collector node running all the time (until the next system hang), you can also look at the AMDS Event messages (AMDS$LOG:AMDS$*.LOG) prior to the hang. If the system hangs, try to look for problems using AMDS, then force a crash (to capture the dump and get the system back up) and then you can take your time to look at the problem in the dump. This will also cause error log entries to be written to ERRLOG.SYS.

Volker.
Mohamed K Ahmed
Trusted Contributor

Re: System Hangs

In response to Wim's reply,
Most applications running on old versions of OpenVMS will have been upgraded to accommodate the new versions. If not, then you have a problem with the applications and should seek advice from the vendor.

In this situation, he has version 6.2, which is very old compared with the current versions, so an upgrade would be the most appropriate thing to do (of course after checking the applications).

Mohamed
Michael LaRoche
Frequent Advisor

Re: System Hangs

Upgrading the OS was my first interest, but the decision makers are saying let's wait until we get the new hardware and applications; then all we have to do is move the database files and update them on the new system.

I also wanted to install all the patches to bring the system up to date, but according to the powers that be it would take over a year and a half to put them in piecemeal, which is the preferred way here: nothing all at once, just a couple at a time.
Wim Van den Wyngaert
Honored Contributor

Re: System Hangs

Mohamed,

The real price of testing applications is underestimated. To test a big application properly, weeks if not months are needed. You also need a new machine, because otherwise you don't have the two versions available for development.

And if you don't take this time, you are taking risks. And that can cost even more (things not working after the upgrade).

6.2 may be old, but we recently had machines running it that stayed up for years.

Wim (not upgrading anything)
Wim
Antoniov.
Honored Contributor

Re: System Hangs

Hi Mike,
I'm using AlphaServer 400 4/233 machines with VMS V6.2 and with V7.3-2.
Last year, one machine hung intermittently or could not boot. After some weeks it crashed. I changed the power supply and now it works fine!
Last week, the power supply on another machine failed!
Maybe you have similar trouble.

Antonio Vigliotti
Antonio Maria Vigliotti
Michael LaRoche
Frequent Advisor

Re: System Hangs

We will need to look at it when it happens again. So far no users have been affected; it happens at night or on weekends when no one is on, sometimes while backups are running and sometimes when the machine is just up and running with no other interaction.