Operating System - OpenVMS

Re: Securing the console port on an ES47

 
John A.  Beard
Regular Advisor

Re: Securing the console port on an ES47

Hi Volker,

I am attaching the CLUE info pertaining to one of the 8 servers in question.

Although there have been more instances on this particular server, we were not always able to get to the SRM prompt from MBM in order to initiate a crash command. On those occasions we had to issue a power off/power on command from MBM, which automatically rebooted the server (despite AUTO_ACTION being set to HALT).

Glacann fear críonna comhairle.
Art Wiens
Respected Contributor

Re: Securing the console port on an ES47

I'll ask again, do you have an Alpha Management Station (AMS)? That is, you don't need to go through the MBM console to get to the SRM console (if you set up the AMS appropriately). If I recall, MBM/SRM access seemed to be a bit hit and miss, while going straight to the SRM works without issue. And like I said, AMS gives you more control over who can access any consoles, plus some very granular monitoring points.

A two-NIC Linux box for AMS and a private VLAN for the ES47 side works well for us.

Cheers,
Art
John A.  Beard
Regular Advisor

Re: Securing the console port on an ES47

Hi Art,

Sorry for not responding to the question earlier. The answer is no, we don't have AMS, but I have just started to look into what it would take to set it up.

Glacann fear críonna comhairle.
Volker Halle
Honored Contributor

Re: Securing the console port on an ES47

John,

this crash footprint certainly looks like a software problem! Whether it's an error in OpenVMS itself or in third-party code remains to be seen.

It's an ACCVIO in PROCESS_MANAGEMENT code. The code tries to access an invalid S2 space address. The symbolization of the failing PC is misleading: there is no lock manager code in PROCESS_MANAGEMENT. I would guess that the failing code stream is in the Kernel Thread Support routines.

The crash happens during image rundown (in kernel mode) of the MQ series Execution Controller.

For further confirmation, one would need the source listing of the failing code from this OpenVMS build (from 29-AUG-2007). This information is ONLY available within OpenVMS engineering!

Please have these crashes escalated to HP OpenVMS engineering for further analysis.

The fact that CPU 1 is HALTed in this crash has absolutely nothing to do with 'noise on the console line'. It's because the invalid exception happened on CPU 0, and this CPU stopped all the other CPUs in the system while handling the bugcheck.

Feel free to mail me the other CLUE files.

Volker.
John A.  Beard
Regular Advisor

Re: Securing the console port on an ES47

Hi Volker,

Thanks for all your efforts.

I know it's difficult to include all the relevant history in cases like this, but apart from attaching some additional clue listings I should also mention a few other facts surrounding this case.

The server for which I sent the first listing has actually "hung" a total of nine times since Jan '09 (including the incident associated with the listing I attached earlier). There was only one CLUE listing in sys$errorlog, and it seems no CLUE files were generated for any of the subsequent incidents.

In the cases where we were able to connect to SRM we issued the crash command and sent the dump, errorlog and MBM information off to HP.

The reference to potential "line noise" was given as a possible cause and not a definitive answer.

On another occasion one of my colleagues attempted to connect to the console after being informed that the system appeared to have frozen. As soon as he issued a connect command from MBM, the system dropped down to the chevron prompt. The dump file for this event actually showed the last piece of activity to have been the connection attempt by my colleague. Before this, my colleague had pinged the box, but the command timed out.

I am attaching a CLUE file from another node that is located in a different subnet. This particular incident occurred only last Saturday.

FYI: from January '09 to date we have experienced a total of 40 such events across all 8 VMS instances (4 ES47s in P2 drawers).
Glacann fear críonna comhairle.
Volker Halle
Honored Contributor

Re: Securing the console port on an ES47

John,

if an OpenVMS system appears to 'hang' and you can't access the console for a CTRL-P CRASH command, consider using the AMDS (Availability Manager) software. You can force a crash remotely (from the same LAN), provided at least the network interface and drivers still work.

Volker.
Volker Halle
Honored Contributor

Re: Securing the console port on an ES47

John,

the CLUE file from 17-APR-2010 represents a 'forced crash' (OPERCRASH bugcheck), i.e. a CTRL-P >>> CRASH sequence. For such a crash, analysis of the full system dumpfile is required. There is nothing suspicious in the CLUE file data from this crash.
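
With 40 events to go through, it may help to sort the CLUE listings into forced crashes and everything else before looking at any of them in detail. The following is only a rough Python sketch: it assumes each listing contains a line of the form "Bugcheck Type: NAME, description", which should be checked against a real CLUE file first. The function names are made up for illustration.

```python
import re

# ASSUMPTION: the listing has a line like "Bugcheck Type:  OPERCRASH, ...".
# Verify this against an actual CLUE file before relying on the pattern.
BUGCHECK_LINE = re.compile(r"^\s*Bugcheck Type:\s*(\w+)", re.MULTILINE)


def bugcheck_type(clue_text):
    """Return the bugcheck name found in a CLUE listing, or None if absent."""
    match = BUGCHECK_LINE.search(clue_text)
    return match.group(1) if match else None


def is_forced_crash(clue_text):
    """True when the listing records an operator-forced crash (OPERCRASH)."""
    return bugcheck_type(clue_text) == "OPERCRASH"
```

Listings for which is_forced_crash() returns True need the full system dump file for analysis; the remaining ones are the candidates where the CLUE data itself may point at the failing code.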

Volker.

John A.  Beard
Regular Advisor

Re: Securing the console port on an ES47

Volker, does AMDS require a regular network interface connection? Nobody was able to Telnet into the box once the problem arose.


Glacann fear críonna comhairle.
Volker Halle
Honored Contributor
Solution

Re: Securing the console port on an ES47

John,

AMDS runs as a separate protocol on the LAN interface. As long as the LAN interface and the RMDRIVER still work, you can force a crash with AMDS.

If the system does not respond to a PING, the chances are pretty low that RMDRIVER will still work.

TELNETting into the system needs process scheduling to work as well.
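
The reasoning above can be written down as a small triage helper to run from another box on the same LAN. This is a minimal Python sketch, not a tested procedure: the function names are hypothetical, the ping flags assume a Linux-style ping, and whether a login works is something the operator observes and passes in.

```python
import subprocess


def responds_to_ping(host, timeout_s=2):
    """Send one ICMP echo request via the system ping command.

    ASSUMPTION: Linux-style flags (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def triage(ping_ok, login_ok):
    """Suggest the next step from what still responds on the hung node."""
    if login_ok:
        # A working login means process scheduling is still running.
        return "login works: system is not fully hung, investigate from a session"
    if ping_ok:
        # The LAN path is alive, so RMDRIVER has a chance of responding.
        return "ping only: try forcing a crash with AMDS"
    # No ping reply: RMDRIVER is unlikely to work either.
    return "no ping: use the console (CTRL-P, then CRASH)"
```

For example, triage(responds_to_ping("node1"), login_ok=False) would suggest AMDS if the node (here a placeholder name) still answers a ping, and the console otherwise.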

Volker.
John A.  Beard
Regular Advisor

Re: Securing the console port on an ES47

A quick update.

The HP analysis of the dump file from last Saturday showed (once again) that there was nothing untoward happening: no identifiable processes showed up as potentially causing the problem.
Glacann fear críonna comhairle.