Operating System - OpenVMS

What kills X sessions?

 
Selden Ball
Advisor

Is there anything in OpenVMS 8.3 which might cause an active X session to be terminated after it has been running for a short while?

We are seeing X sessions being killed, typically a few tens of seconds after login and after DECterm windows have been opened, whether or not the user is actively working. Sometimes the session is killed even before the user has had a chance to log in.

The X host (client) systems are a variety of Alphas running OpenVMS 8.3 (update 8) in a cluster, with Multinet TCP/IP v5.2.

The problem has been seen both with Linux display systems (using gdm in chooser mode and xorg) and with NCD X terminals.

When it happens, the NCDs generate a popup menu asking if the session should be killed, giving the user the ability to keep it from happening. The Linux display systems, however, simply kill all the windows and redisplay the XDMCP chooser. We have not found any option in xorg to prevent this.

Thanks for whatever help anyone can provide.
Robert Gezelter
Honored Contributor

Re: What kills X sessions?

Selden,

A suggestion: Check the accounting logs on the OpenVMS system and see if there is anything about why the process terminated.

- Bob Gezelter, http://www.rlgsc.com
Selden Ball
Advisor

Re: What kills X sessions?

The processes associated with MBA "terminal" devices or no terminal devices are terminating with
%SYSTEM-F-EXITFORCED, forced exit of image or process by SYS$DELPRC
and the processes associated with FTA devices are terminating with
%CLI-S-NORMAL, normal successful completion

The last few lines in DECW$SM.LOG are
=====================================
X connection to WSA136: broken (explicit kill or server shutdown).

Fatal error detected, image exiting -- final message:

Exiting DECW$STARTSM.COM
==============================

But we essentially already knew that. What we don't know is what is causing the X session termination.

For the weekend we've reconfigured one of the Linux systems so people can use ssh and then invoke CREATE/TERMINAL and other X programs. Jobs started that way don't have this abnormal termination problem. That's really not ideal, though, since the operations staff are much more familiar with the VMS login procedures than Linux.

Rick Retterer
Respected Contributor

Re: What kills X sessions?

Selden,

Are you running any kind of idle-process killer on your system? One that comes to mind is "hitman", which, if not configured correctly, will kill some DECwindows processes such as the Decw$te_nnnn processes, or any process that shows up as "Disconnected", as the Motif Window Manager sometimes does.

How exactly are you starting your X Session on the OpenVMS system?

Are you using XDMCP, to start a session to display to the Linux system?

Are you manually running DECW$STARTSM.COM via a command procedure or some type of application launcher?

When you start your X session from the OpenVMS system, are you using CDE or traditional DECwindows?

Between DECwindows and Linux, only one window manager can manage the windows on the display. Is it possible that the window managers are colliding with each other, causing the CDE or DECwindows Motif window manager to exit and take down the X session as well?

- Rick Retterer



Steve Reece_3
Trusted Contributor

Re: What kills X sessions?

As well as what's already been suggested, you've not changed accounts to be captive or anything like that?

I'm also guessing that the network isn't heavily loaded (I've seen situations where Xsessions die because someone was doing copies of windows system disk images around the network. I wasn't happy that they were screwing my systems up!)

Finally, you haven't been editing any of the login command procedures to explicitly do a LOGOUT where they never used to do them?

Steve
Hoff
Honored Contributor

Re: What kills X sessions?

... some background information. Has this ever worked before? If so, what has changed?

... clean out LOGIN.COM and SYLOGIN.COM and DECW$LOGIN.COM. Make it all go away, just for the testing.

... remove some pieces. Start by removing xdm.

... add parallel pieces. Add an OpenVMS-based DECwindows via IP display.

... look at the network level. Switches and vlans and firewalls can be set to block traffic, and can cause weirdness.

... confirm that the IP traffic is enabled underneath DECwindows. (That transport was optional and had to be lit manually for eons; I haven't looked to see if it's now lit by default.)
Selden Ball
Advisor

Re: What kills X sessions?

Rick,

We're not running any idle process killer.

The failing sessions are started using the Gnome Display Manager's chooser, which uses XDMCP to bring up the standard VMS login screen.

We're using the traditional Motif window manager, not CDE.

There is no window manager collision: the Linux systems are not running any window manager.

Steve,

We're not using captive accounts.

The X clients are running at 10Mbits/half duplex. The servers are running at 100Mbits/full duplex. This is intentional to make sure they have plenty of bandwidth available. The session shutdowns don't seem to be related to network traffic levels.

The sessions seem to die at random times, unrelated to when the login command files were run: sometimes before the login prompt has appeared, sometimes tens of seconds after the login has completed.

Hoff,

Part of our frustration is that there have been no known system-level changes between the time jobs were logging in OK and now, when they die.

In particular, there have been no changes to login.com of the affected accounts in recent weeks, since before the problem started. The problem is seen with accounts with sophisticated logins and accounts with near-empty ones. The things in common between them have been in place for years.

One of the common threads does seem to be that XDMCP is involved in starting the jobs. Windows opened in other ways seem to be unaffected.

Since the XDMCP server is part of Multinet, I'll try posting to info-multinet to see if anyone there has any thoughts.
Edwin Gersbach_2
Valued Contributor

Re: What kills X sessions?

>> The last few lines in DECW$SM.LOG are
>> =====================================
>> X connection to WSA136: broken (explicit kill or server shutdown).
>> Fatal error detected, image exiting -- final message:
>> Exiting DECW$STARTSM.COM

I would pay more attention to the display side! I guess the culprit is the display server that WSA136 (or whatever the active display is) is pointing to. Check the logs of the X server there and the logs of the transport protocol (TCP/IP).

Edwin
Wim Van den Wyngaert
Honored Contributor

Re: What kills X sessions?

I once had the same problem with CDE sessions being killed on 7.3. It mostly happened when the session was left idle for a long time (night/weekend).

I took a TCP trace and found only that both sides (KEA1X and VMS) agreed to stop. There was no explanatory message at all. The problem was never solved, just phased out.

You could take a TCP trace of a session and try to understand that.

Wim
Richard Brodie_1
Honored Contributor

Re: What kills X sessions?

"One of the common threads does seem to be that XDMCP is involved in starting the jobs. Windows opened in other ways seem to be unaffected."

It's likely that your clients are using XDMCP keepalives and missing some responses. The X server then goes and kills all the client connections.

Have a look at the XDMCP request/response traffic on one of the Linux clients. It typically should be one request/response every 15 seconds.
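
To check a capture for missed keepalive responses, something like the following Python sketch may help. It assumes the XDMCP packets have already been transcribed into (timestamp, opcode name) pairs; the function name, the event format, and the 5-second reply timeout are illustrative assumptions, not values taken from the X server or from Multinet.

```python
from typing import List, Tuple

# Hypothetical event format: (timestamp in seconds, XDMCP opcode name),
# transcribed from a packet capture on the display (Linux) side.
Event = Tuple[float, str]


def unanswered_keepalives(events: List[Event], timeout: float = 5.0) -> List[float]:
    """Return timestamps of KeepAlive requests with no Alive reply within `timeout`."""
    missing: List[float] = []
    pending = None  # timestamp of the KeepAlive still waiting for a reply

    for ts, opcode in sorted(events):
        # A pending request that has waited longer than the timeout counts as missed.
        if pending is not None and ts - pending > timeout:
            missing.append(pending)
            pending = None

        if opcode == "KeepAlive":
            if pending is not None:
                # A new request was sent before the previous one was answered.
                missing.append(pending)
            pending = ts
        elif opcode == "Alive":
            pending = None

    # A request still pending when the capture ends is reported too;
    # it may simply mean the capture stopped before the reply arrived.
    if pending is not None:
        missing.append(pending)
    return missing


if __name__ == "__main__":
    sample = [
        (0.0, "KeepAlive"), (0.1, "Alive"),
        (15.0, "KeepAlive"),                 # no reply to this one
        (30.0, "KeepAlive"), (30.2, "Alive"),
    ]
    print(unanswered_keepalives(sample))
```

If unanswered requests show up shortly before each session dies, that would support the keepalive theory; if sessions die with every request answered, the cause is probably elsewhere.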

Selden Ball
Advisor

Re: What kills X sessions?

Thanks to all who've been trying to help.

The network traces are very strange. Here's what our Linux expert reports.
===============================
The problem now seems to be jumping between cesr27 and cesr28.
Various logs of packets can be found in /home/dab66/xdmcp_packets/.

I captured packets for two separate attempts using:
tshark -R "ip.addr == 192.168.1.27"

I started an XDMCP session on cesr27 as follows.
/usr/X11R6/bin/X -ac -once -query cesr27

On both attempts, the login window for cesr27 appeared for more than
10 seconds before the session terminated automatically. You can find
the packet captures in cesr27.log and cesr27.1.log.

Later on, the connections succeeded. A log of the packets from a
successful session can be found at cesr27_successful.log.

And finally, a log of packets from a successful connection to cesr28
is in cesr28_successful.log.
========================
Some of the failing sessions have keepalives in the packet logs and some do not.
Some of the working sessions have keepalives in the packet logs and some do not.
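
When comparing packet logs like these, it can help to classify each XDMCP datagram by opcode. Below is a minimal Python sketch, assuming the raw UDP payload of each packet is available as bytes (XDMCP normally runs over UDP port 177). The function name is hypothetical, and the opcode numbers follow the XDMCP specification as I understand it, so check them against your own reference before relying on them.

```python
import struct

# Opcode numbers per the XDMCP specification (worth verifying against
# your own copy of the spec or the Xdmcp.h header).
XDMCP_OPCODES = {
    1: "BroadcastQuery", 2: "Query", 3: "IndirectQuery", 4: "ForwardQuery",
    5: "Willing", 6: "Unwilling", 7: "Request", 8: "Accept", 9: "Decline",
    10: "Manage", 11: "Refuse", 12: "Failed", 13: "KeepAlive", 14: "Alive",
}


def parse_xdmcp_header(payload: bytes):
    """Decode the 6-byte XDMCP header: version, opcode, body length (big-endian CARD16s)."""
    if len(payload) < 6:
        raise ValueError("payload too short for an XDMCP header")
    version, opcode, length = struct.unpack(">HHH", payload[:6])
    name = XDMCP_OPCODES.get(opcode, f"Unknown({opcode})")
    return version, name, length


if __name__ == "__main__":
    # Hand-built KeepAlive: header, then display number (CARD16) and session ID (CARD32).
    packet = struct.pack(">HHHHI", 1, 13, 6, 0, 0x1234)
    print(parse_xdmcp_header(packet))
```

Listing the opcode sequence for a failing session next to a working one may make it easier to see whether anything other than the keepalives differs.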

I can provide the logs to anyone who is willing to take a look.

*sigh*

Richard W Hunt
Valued Contributor

Re: What kills X sessions?

Take a look at the release notes for DWMOTIF ECO05, which include information about problems with X connections between OpenVMS and a UNIX-class system. If you haven't applied that patch, it might be useful to you.
Sr. Systems Janitor
Olivier B
Advisor

Re: What kills X sessions?

I also used to lose windows after a short delay. I don't remember whether it was the same error as yours, but my solution was to recreate the display in sys$sylogin with these few commands:

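$ if (f$trnlnm ("Decw$Display") .nes. "")
$ then
$!   Remember whether ON error handling is active, then turn it off.
$    TmpSetOn = (f$environ ("ON_SEVERITY") .nes. "NONE")
$    Set NoOn
$
$!   Discard the output of the next command; Show Display /Symbol
$!   defines the DECW$DISPLAY_* symbols used below.
$    Define /User Sys$output NLA0:
$    Show Display /Symbol
$    if (DECW$DISPLAY_TRANSPORT .nes. "LAT")
$    then
$!      Recreate the display with the same node, transport, screen and server.
$       Set Display /Create /Node= "''DECW$DISPLAY_NODE'" -
            /Transport= "''DECW$DISPLAY_TRANSPORT'" -
            /Screen= "''DECW$DISPLAY_SCREEN'" -
            /Server= "''DECW$DISPLAY_SERVER'"
$    endif
$
$!   Restore ON error handling if it was active before.
$    if (TmpSetOn) then Set On
$ endif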

Selden Ball
Advisor

Re: What kills X sessions?

@Richard: we've updated to ECO 5. No change :(
(we were at 3)

@Olivier: the timeouts happen even with just the login prompt on the screen, before logging in, so I fear that setting the display variables won't help :(

I sent a tcpdump to Process Software (the Multinet vendor). They're puzzled, too.

More *sigh*s.
Steve Reece_3
Trusted Contributor

Re: What kills X sessions?

Stupid thought but...
have you looked at what OPCOM is telling you when the jobs die?
have you tried increasing quotas to see if it's actually a quotas related issue?