Operating System - OpenVMS

Re: Server Restarts Automatically

 
Steven Schweda
Honored Contributor

Re: Server Restarts Automatically

Unfortunately, this case was not well described, and we have little information to work with.

 

Post 1 states: "I have an Alpha Server DS15 on a cluster with another DS15 server.  The first machine restarts randomly while the second machine is ok".

 

The suggestion in post 10 is absolutely valid, but in my opinion an analysis of the system error log (using the right tools) may also help in cases like this one:

 

1- we can confirm whether the root cause of this behavior is a system crash (a CLUEXIT bugcheck, for example?)

2- whether the system is suffering from hardware problems

3- whether this DS15 is really just resetting

 

and so on

 

/Maurizio

Purely Personal Opinion

[ I am a HPE Employee and an OpenVMS Ambassador ]
Eddy_fj
Advisor

Re: Server Restarts Automatically

Thanks for replying.

 

I will upload the operator log file the next time the machine restarts. Maybe then we will know where to look for the cause of the restart.

 

The server is running OpenVMS V7.3.

 

B Claremont
Frequent Advisor

Re: Server Restarts Automatically

In the interim, post the results of SHOW DEV and SHOW ERROR and let's see if you are logging any device errors.

www.MigrationSpecialties.com
Eddy_fj
Advisor

Re: Server Restarts Automatically

Hello all, the server restarted again this Saturday at 11.09 am. I looked at the operator log file for that day and time; please see below.

 

%%%%%%%%%%%  OPCOM  21-SEP-2013 09:32:03.89  %%%%%%%%%%%
Message from user TCPIP TELNET on LBAWB3
TELNET Logout Request from Remote Host: 10.100.30.111 Port: 59569
 
%%%%%%%%%%%  OPCOM  21-SEP-2013 10:14:39.23  %%%%%%%%%%%
Logfile time stamp

 

The server restarted here at 11.09, but nothing much was logged.


%%%%%%%%%%%  OPCOM  21-SEP-2013 11:14:39.25  %%%%%%%%%%%
Logfile time stamp

%%%%%%%%%%%  OPCOM  21-SEP-2013 12:08:24.07  %%%%%%%%%%%
Logfile has been initialized by operator _LBAWB3$OPA0:
Logfile is LBAWB3::SYS$SYSROOT:[SYSMGR]OPERATOR.LOG;1440
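
A minimal Python sketch, run on a PC against a saved copy of the log text, for narrowing down when the node went away. It pairs each OPCOM banner with the line that follows it and reports the interval between the last entry of the old log and each "Logfile has been initialized" entry. It assumes that this message marks OPCOM starting after a boot, as it appears to in the excerpt above; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Month abbreviations as they appear in OpenVMS absolute times.
MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

# Matches banners such as
# "%%%%%%%%%%%  OPCOM  21-SEP-2013 09:32:03.89  %%%%%%%%%%%"
BANNER = re.compile(
    r"^%+\s+OPCOM\s+(\d{1,2})-([A-Z]{3})-(\d{4})\s+"
    r"(\d{2}):(\d{2}):(\d{2})\.\d{2}\s+%+$")

# Assumption: this text marks a new log file being opened after a boot.
INIT_TEXT = "Logfile has been initialized"


def parse_events(text):
    """Return (timestamp, first message line) pairs from operator log text."""
    lines = [line.strip() for line in text.splitlines()]
    events = []
    for index, line in enumerate(lines):
        match = BANNER.match(line)
        if not match:
            continue
        day, month, year, hour, minute, second = match.groups()
        stamp = datetime(int(year), MONTHS[month], int(day),
                         int(hour), int(minute), int(second))
        message = lines[index + 1] if index + 1 < len(lines) else ""
        events.append((stamp, message))
    return events


def outage_windows(events):
    """Return (last entry before, log initialized at) timestamp pairs."""
    windows = []
    for previous, current in zip(events, events[1:]):
        if current[1].startswith(INIT_TEXT):
            windows.append((previous[0], current[0]))
    return windows
```

Applied to the excerpt above, outage_windows(parse_events(text)) gives a single window, from 11:14:39 to 12:08:24 on 21-SEP-2013. That only bounds the time; it says nothing about the cause.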

 

 

 

 

Also, while going through the log file, I found this, logged while the server was booting.

 

%%%%%%%%%%%  OPCOM  21-SEP-2013 12:08:32.04  %%%%%%%%%%%
Message from user SYSTEM on LBAWB3
%LICENSE-E-NOAUTH, DEC OPENVMS-ALPHA use is not authorized on this node
-LICENSE-F-EXCEEDED, attempted usage exceeds active license limits
-LICENSE-I-SYSMGR, please see your system manager

 

However, the machine booted, has been online since then, and is working fine.

LBAWB3> sh dev
 
Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  WBDATA        65389230   116   2
$1$DKA0:      (LBAWB3)  ShadowCopying        0  (copy trgt DSA0:  52% copied)
$1$DKA100:    (LBAWB3)  Mounted              0  AXPVMSSYS     61614378   397   2
$1$DQA0:      (LBAWB3)  Online               0
$1$DQA1:      (LBAWB3)  Offline              1
$1$DQB0:      (LBAWB3)  Offline              1
$1$DQB1:      (LBAWB3)  Offline              1
$2$DKA0:      (LBAWB4)  ShadowSetMember      0  (member of DSA0:)
$2$DKA100:    (LBAWB4)  Mounted              0  ALPHA_0722    65796399     1   2
$2$DQA0:      (LBAWB4)  Online               0
 
Device                  Device           Error
 Name                   Status           Count
OPA0:                   Online               0
OPA2:                   Online               0
OPA3:                   Online               0
ASN0:                   Online               0
FTA0:                   Offline              0
LTA0:                   Offline mounted      0
LTA2:                   Online               0
LTA3:                   Online               0
LTA4:                   Online               0
LTA5:                   Online               0
LTA101:                 Online spooled       0
                        alloc
LTA102:                 Online               0
LTA103:                 Online               0
LTA104:                 Online               0
LTA5022:                Online               0
LTA5033:                Online               0
LTA5034:                Online               0
LTA5035:                Online               0
RTA0:                   Offline              0
RTB0:                   Offline              0
TNA0:                   Online               0
TNA5:                   Online               0
TNA6:                   Online               0
TNA7:                   Online               0
TTA0:                   Online               0
 
Device                  Device           Error
 Name                   Status           Count
LRA0:                   Online               0
 
Device                  Device           Error
 Name                   Status           Count
EIA0:                   Online               0
EIA2:                   Online               0
EIA5:                   Online               0
EIA6:                   Online               0
EIA7:                   Online               0
EIA9:                   Online               0
EIB0:                   Online               0
EIB2:                   Online               0
EIB4:                   Online               0
MPA0:                   Online               0
PEA0:                   Online               0
PKA0:                   Online               0
PKB0:                   Online               0
PPP0:                   Online               0
SMA0:                   Online               0

LBAWB3> sh error
Device                           Error Count
$1$DQA1: (LBAWB3)                        1
$1$DQB0: (LBAWB3)                        1
$1$DQB1: (LBAWB3)                        1
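
If SHOW ERROR is captured to a text file from time to time, a small Python sketch like the following could show which device counts are climbing between two snapshots. It is only an illustration, assuming the three-column layout shown above; the function names are hypothetical.

```python
import re

# One data row: device name ending in ":", optional "(NODE)", then the count.
ROW = re.compile(r"^(\S+:)\s+(?:\(([^)]+)\)\s+)?(\d+)$")


def parse_show_error(text):
    """Map device name -> (node or None, error count) from SHOW ERROR text."""
    result = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            device, node, count = match.groups()
            result[device] = (node, int(count))
    return result


def increased(before, after):
    """Return device -> increase for devices whose error count went up."""
    changes = {}
    for device, (_node, count) in after.items():
        earlier = before.get(device, (None, 0))[1]
        if count > earlier:
            changes[device] = count - earlier
    return changes
```

A device that is missing from the earlier snapshot is treated as having started at zero, so a newly listed device shows up with its full count.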

 

 

 

Thanks for the support
Volker Halle
Honored Contributor

Re: Server Restarts Automatically

Eddy,

 

OPERATOR.LOG almost never contains information about a crash or restart reason.

 

You need to capture the console (OPA0:) output from such a 'restart'. The ERRLOG.SYS file or a system dump file may contain additional information, if the 'restart' is caused by a system crash. Unfortunately, you need a tool like DECevent to decode the ERRLOG.SYS file on OpenVMS Alpha V7.3.

 

Also consider setting the console variable AUTO_ACTION to RESTART if it is currently set to BOOT. You can retrieve the current setting with WRITE SYS$OUTPUT F$GETENV("AUTO_ACTION")

 

Volker.

Brad McCusker
Respected Contributor

Re: Server Restarts Automatically

Eddy,

 

You need to get this system configured to create a dump file.  Or, confirm that in fact it is properly configured to create a dump file.  Please read the OpenVMS System Manager's Manual - look for sections that discuss the System Dump and generating crash dumps.  Until you get the dump file properly configured, you are wasting your time.  (If you need help doing this, there are plenty of people around who can do this - many responding in this thread, our company included)

 

Just out of curiosity - how do you connect to the console of this server?  Is there a console server involved?  You also need to set up your console so that output to the console is captured and preserved, somewhere.  

 

Brad McCusker

Software Concepts International

www.sciinc.com
B Claremont
Frequent Advisor

Re: Server Restarts Automatically

Using a USB serial adapter, connect a laptop to the console port and use a PuTTY session to capture the console output.  Be sure to set the PuTTY session scroll back buffer to a large value.  I use 20000.
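
A scrollback buffer is lost when the terminal window closes, so it may also be worth writing the console output to a file with a timestamp on every line. The Python sketch below shows the idea under stated assumptions: the stream argument stands for any object that yields console text line by line (for example a wrapper around the serial port, which is not shown here), and the function name is hypothetical.

```python
import io
from datetime import datetime


def log_console(stream, logfile, clock=datetime.now):
    """Copy each line from stream to logfile, prefixed with the PC's local time.

    Returns the number of lines written.
    """
    count = 0
    for line in stream:
        stamp = clock().strftime("%Y-%m-%d %H:%M:%S")
        logfile.write(f"{stamp} {line.rstrip()}\n")
        logfile.flush()  # push each line out promptly so little is lost on a failure
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in input; a real capture would read from the console port instead.
    demo_input = io.StringIO("first console line\nsecond console line\n")
    demo_output = io.StringIO()
    log_console(demo_input, demo_output)
    print(demo_output.getvalue(), end="")
```

The timestamps come from the PC clock, not from the Alpha, so the two clocks should be compared before the times are matched against OPERATOR.LOG.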

www.MigrationSpecialties.com

Re: Server Restarts Automatically

Eddy,

 

As I have said several times in this thread, an analysis of the system error log (not OPERATOR.LOG) might help us better understand the root cause of your problem.

 

On the DS15 where the problem occurs:

 

$ create/dir sys$sysdevice:[temp_errlog]

$ copy/log sys$errorlog:errlog.sys sys$sysdevice:[temp_errlog]*

 

Transfer sys$sysdevice:[temp_errlog]errlog.sys from the Alpha to your PC in binary mode.

When you have done that, contact me offline so that we can arrange a way for me to analyze the error log for you.

 

Regards,

/Maurizio

[ I am a HPE Employee and an OpenVMS Ambassador ]
Volker Halle
Honored Contributor

Re: Server Restarts Automatically

Maurizio,

 

If the problem is a restart-crash (like MACHINECHK etc.), copying and analyzing the ERRLOG.SYS file does not help if AUTO_ACTION is NOT set to RESTART and no valid SYSDUMP.DMP file is set up.

 

Capturing the console output is the only way to find out why your server is restarting unexpectedly. Once you know that, more analysis may be required (dump file, error log, etc.).

 

Volker.