HPE 9000 and HPE e3000 Servers

Server restart suddenly with no error

 
Dree
Advisor

Server restart suddenly with no error

Hi expert,

 

I have a problem with my HP-UX server (L1000 type): it suddenly restarts by itself, with no error in syslog or on the MP.

Could a corrupted OS cause this, or is it purely a hardware problem? What should I check?

 

Thanks

Your help will be appreciated.

 

Adrian Tjahjana

DeafFrog
Valued Contributor

Re: Server restart suddenly with no error

Hi Adrian, running ioscan | grep -i "NO_HW" will tell you about any hardware issue you might have had. OLDsyslog.log in the /var/adm/syslog directory will also give you some hints. In addition, check event.log in the /var/opt/resmon/log directory; you may find something there. Regards, FrogIsDeaf
FrogIsDeaf
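The NO_HW check suggested above can be tried as a small filter. This is only a sketch: `find_no_hw` is a hypothetical helper name, and the last sample line is invented to show a match (the first two lines are copied from the ioscan output posted later in this thread).

```shell
# Print only ioscan lines that mention NO_HW.
# Reads ioscan-style text on stdin, so it can be tried without an HP-UX host.
find_no_hw() {
    grep -i "NO_HW" || true   # "|| true": finding nothing is not an error here
}

# Lines 1-2 come from this thread; line 3 is a HYPOTHETICAL lost device.
sample='target       9  0/4/0/0.8.0.255.0.1     tgt       CLAIMED     DEVICE
disk         6  0/4/0/0.8.0.255.0.1.0   sdisk     CLAIMED     DEVICE       SEAGATE ST118202FC
disk         7  0/4/0/0.8.0.255.0.2.0   sdisk     NO_HW       DEVICE       HYPOTHETICAL EXAMPLE'

printf '%s\n' "$sample" | find_no_hw

# On the server itself the equivalent would be:
#   ioscan -fn | find_no_hw
```

An empty result only means no device is currently in the NO_HW state; it does not rule out a hardware fault.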
Dree
Advisor

Re: Server restart suddenly with no error

Hi,

I ran what you suggested and found this in event.log.

Does the log below mean there is a problem with one of the hard disks?

 

Summary:
     Enclosure at hardware path 0/4/0/0.8.0.255.0.10.0 : Hardware failure


Description of Error:

     The enclosure services controller has failed.

Probable Cause / Recommended Action:

     The enclosure services controller has failed. Inspect/verify the
     controller card in slot A.


DeafFrog
Valued Contributor

Re: Server restart suddenly with no error

Hi ,

 

Can you please run # ioscan -funH <hardware_path_you_mentioned> and check whether the S/W State column shows NO_HW for the devices attached through that path? Also make sure that the events logged in the event.log file are current, i.e. their timestamps match the reboot time of your server. The OLDsyslog.log file should also contain some messages, for example if the server rebooted due to high temperature. Check rc.log and rc.log.old in the /etc directory as well.

 

 

 

FrogIsDeaf
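A quick way to see which of the log files mentioned so far exist, and when each was last written, might look like the sketch below. The paths are the ones given in the posts above; `check_logs` is a hypothetical helper name.

```shell
# Log files named in this thread (paths as given by the posters).
LOGS="/var/adm/syslog/OLDsyslog.log /var/opt/resmon/log/event.log /etc/rc.log /etc/rc.log.old"

# Report each log as present (ls -l shows its modification time) or missing.
check_logs() {
    for f in $LOGS; do
        if [ -f "$f" ]; then
            ls -l "$f"
        else
            echo "missing: $f"
        fi
    done
}

check_logs

# To compare against the reboot, "who -b" prints the last boot time
# on most UNIX systems.
```

Entries whose timestamps sit well before the reboot are probably unrelated to it.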
Torsten.
Acclaimed Contributor

Re: Server restart suddenly with no error

Looks like there is a problem with the external disk enclosure controller.

 

Post the output of

 

# ioscan -fn

 

to see if it is still accessible.


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Dree
Advisor

Re: Server restart suddenly with no error

Hi,

I think it is accessible. For example, I get the lines below:

 

                            /dev/dsk/c4t0d0   /dev/rdsk/c4t0d0
target       9  0/4/0/0.8.0.255.0.1     tgt       CLAIMED     DEVICE
disk         6  0/4/0/0.8.0.255.0.1.0   sdisk     CLAIMED     DEVICE       SEAGATE ST118202FC
                            /dev/dsk/c4t1d0   /dev/rdsk/c4t1d0
target      10  0/4/0/0.8.0.255.0.2     tgt       CLAIMED     DEVICE
disk         7  0/4/0/0.8.0.255.0.2.0   sdisk     CLAIMED     DEVICE       HP 18.2GST318304FC

 

Is this a normal condition, or could something else be wrong?

 

Thanks

Your help would be useful.

 

Adrian Tj

Torsten.
Acclaimed Contributor

Re: Server restart suddenly with no error

This could be a DS2405 or something similar.

 

Please post the output of

 

# echo "map" | cstm

 

This should show a map of all devices. If we get the index number of the enclosure controller, we can get more details about the enclosure.


Hope this helps!
Regards
Torsten.

Dree
Advisor

Re: Server restart suddenly with no error

Hi,

That's right, it is a DS2405. How did you know? :D

 

Here is the output from cstm:

 

 Num  Path                 Product                   Active Tool Status
  ===  ==================== ========================= =========== =============
    1  system               system ()                 Information Successful
    2  0                    Bus Adapter (582)         Information Successful
    3  0/0                  PCI Bus Adapter (782)     Information Successful
    4  0/0/0/0              Core PCI 100BT Interface  Information Successful
    5  0/0/1/0              PCI SCSI Interface (10000 Information Successful
    6  0/0/1/1              PCI SCSI Interface (10000 Information Successful
    7  0/0/1/1.0.0          SCSI Disk (SEAGATEST31820 Information Successful
    8  0/0/1/1.2.0          SCSI Disk (IBMDMVS18D)    Information Successful
    9  0/0/2/0              PCI SCSI Interface (10000 Information Successful
   10  0/0/2/0.0.0          SCSI Disk (SEAGATEST17340 Information Successful
   11  0/0/2/0.2.0          SCSI Disk (SEAGATEST39204 Information Successful
   12  0/0/2/1              PCI SCSI Interface (10000 Information Successful
   13  0/0/4/0              RS-232 Interface (103c104 Information Successful
   14  0/0/5/0              RS-232 Interface (103c104 Information Successful
   15  0/1                  PCI Bus Adapter (782)     Information Successful
   16  0/2                  PCI Bus Adapter (782)     Information Successful
   17  0/3                  PCI Bus Adapter (782)     Information Successful
   18  0/3/0/0              Fibre Channel Interface ( Information Successful
   19  0/4                  PCI Bus Adapter (782)     Information Successful
   20  0/4/0/0              Fibre Channel Interface ( Information Successful
   21  0/4/0/0.8            Fibre Channel Driver (Mas
   22  0/4/0/0.8.0.255.0.0. SCSI Disk (HP18.2GST31830 Information Successful
   23  0/4/0/0.8.0.255.0.1. SCSI Disk (SEAGATEST11820 Information Successful
   24  0/4/0/0.8.0.255.0.2. SCSI Disk (HP18.2GST31830 Information Successful
   25  0/4/0/0.8.0.255.0.3. SCSI Disk (SEAGATEST11820 Information Successful
   26  0/4/0/0.8.0.255.0.4. SCSI Disk (SEAGATEST31820 Information Successful
   27  0/4/0/0.8.0.255.0.5. SCSI Disk (SEAGATEST31820 Information Successful
   28  0/4/0/0.8.0.255.0.6. SCSI Disk (SEAGATEST31820 Information Successful
   29  0/4/0/0.8.0.255.0.7. SCSI Disk (HP18.2GST31830 Information Successful
   30  0/4/0/0.8.0.255.0.8. SCSI Disk (SEAGATEST31820 Information Successful
   31  0/4/0/0.8.0.255.0.9. SCSI Disk (SEAGATEST11820 Information Successful
   32  0/4/0/0.8.0.255.0.10 Disk Enclosure (HPA5236A) Information Successful
   33  0/5                  PCI Bus Adapter (782)     Information Successful
   34  0/6                  PCI Bus Adapter (782)     Information Successful
   35  0/7                  PCI Bus Adapter (782)     Information Successful
   36  8                    MEMORY (95)               Information Successful
   37  160                  CPU (5c9)                 Information Successful
   38  166                  CPU (5c9)                 Information Successful

 

 

Thanks

Adrian Tj

Torsten.
Acclaimed Contributor

Re: Server restart suddenly with no error

This is the controller of the enclosure:

32 0/4/0/0.8.0.255.0.10 Disk Enclosure (HPA5236A) Information Successful

Now do

# echo "sel dev 32;info;wait;il" | cstm
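The lookup done by eye here (find the "Disk Enclosure" row in the map, take its Num value, pass it to "sel dev") could also be sketched with awk. `enclosure_index` is a hypothetical helper name, and the sample row is the one copied from the map posted above.

```shell
# Print the Num column of every "Disk Enclosure" row in cstm map output (stdin).
enclosure_index() {
    awk '/Disk Enclosure/ { print $1 }'
}

# Row copied from the map posted earlier in this thread.
map_row='   32  0/4/0/0.8.0.255.0.10 Disk Enclosure (HPA5236A) Information Successful'

idx=$(printf '%s\n' "$map_row" | enclosure_index)

# Build the same cstm command string as above, with the index filled in.
echo "sel dev $idx;info;wait;il"

# On the server itself the index would come from:
#   echo "map" | cstm | enclosure_index
```

With more than one enclosure attached, the helper prints one index per line, so each would need its own "sel dev" command.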

Hope this helps!
Regards
Torsten.

Dree
Advisor

Re: Server restart suddenly with no error

Hi,

This is what I get. Does the information below mean that controller A is in a critical state?

 

Controller B
------------
       FC Loop ID: 255 (decimal)
     Enclosure ID: 0
    Hardware Path: 0/4/0/0.8.0.255.0.10.0
        Serial No: USSA12036855
   WW Name (node): 50 06 0B 00 00 0C 70 44
   WW Name (port): 50 06 0B 00 00 0C 70 44
    Firmware Rev.: HP10

       Annotation:


Enclosure Status
----------------

Disk Modules
------------

       -----------------------------------------------------------------------
SLOT   |   0  |   1  |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9  |
BUS ID |   0  |   1  |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9  |
STATUS |  OK  |  OK  |  OK  |  OK  |  OK  |  OK  |  OK  |  OK  |  OK  |  OK  |
       -----------------------------------------------------------------------


Controllers       Status
-----------       -------------
   A:             CRITICAL
                  The enclosure services controller has failed.  Replace the
                  controller card.
   B:             OK               (Reporting Controller)

Fan Modules       Status
-----------       -------------
   A:             OK
   B:             OK

Power Supplies    Status
--------------    -------------
   A:             OK
   B:             OK

GBIC              Status
--------------    -------------
   Controller A
   ------------
        Primary:  UNKNOWN
      Expansion:  UNKNOWN

   Controller B
   ------------
        Primary:  OK
      Expansion:  NOT INSTALLED

Voltage Sensors   Voltage    Status
---------------   -------    -------------
   Controller A
   ------------
      3.3v:                  UNKNOWN
      5.0v:                  UNKNOWN
       12v:                  UNKNOWN

   Controller B
   ------------
      3.3v:          3.29    OK
      5.0v:          5.06    OK
       12v:         12.22    OK

Temp Sensors      Temperature    Status
------------      -----------    -------------
   Sensor 1:                     UNKNOWN
   Sensor 2:                     UNKNOWN
   Sensor 3:      25 (Celsius)   OK
   Sensor 4:      26 (Celsius)   OK
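To pick the problem entries out of a long report like this one, a simple filter may help. This is a rough sketch: `not_ok` is a hypothetical helper name, it matches only the two non-OK status strings that appear in this output (CRITICAL and UNKNOWN), and the sample is an excerpt of the lines posted above.

```shell
# Print status lines from cstm enclosure info (stdin) that are not OK.
# Only the status strings seen in this thread are matched.
not_ok() {
    grep -E "CRITICAL|UNKNOWN" || true
}

# Excerpt from the enclosure status posted above.
status='   A:             CRITICAL
   B:             OK               (Reporting Controller)
        Primary:  UNKNOWN
        Primary:  OK
   Sensor 1:                     UNKNOWN
   Sensor 3:      25 (Celsius)   OK'

printf '%s\n' "$status" | not_ok
```

The matched lines lose their section headings, so the full report is still needed to tell which component each one belongs to. Here every UNKNOWN reading belongs to the side of the failed controller A.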