diagmond stopped across many servers

 
mattmadey
Visitor


The diagmond process stopped on several servers at around the same time. Trying to start the daemon produces no output or errors, but the process does not start:

 

 

[/etc/opt]# /sbin/init.d/diagnostic start

[/etc/opt]#

 

No errors have been written to syslog or activity_log

 

The only relevant information I found in the log files points to a larger problem with lost LUN paths. That may be what crashed the daemon, but I don't see why it won't start, or at least report an error:

 

Jun 15 08:15:34 hpx-oss-3 vmunix: DIAGNOSTIC SYSTEM WARNING:
Jun 15 08:15:34 hpx-oss-3 vmunix:    The diagnostic logging facility has started receiving excessive
Jun 15 08:15:34 hpx-oss-3 vmunix:    errors from the I/O subsystem.  I/O error entries will be lost
Jun 15 08:15:34 hpx-oss-3 vmunix:    until the cause of the excessive I/O logging is corrected.
Jun 15 08:15:34 hpx-oss-3 vmunix:    If the diaglogd daemon is not active, use the Daemon Startup command
Jun 15 08:15:34 hpx-oss-3 vmunix:    in stm to start it.
Jun 15 08:15:34 hpx-oss-3 vmunix:    If the diaglogd daemon is active, use the logtool utility in stm
Jun 15 08:15:34 hpx-oss-3 vmunix:    to determinclass : disk, instance 66
Jun 15 08:15:34 hpx-oss-3 vmunix: All available lunpaths of a LUN (dev=0x1000033)
Jun 15 08:15:34 hpx-oss-3 vmunix: have gone offline. The LUN has entered a transient
Jun 15 08:15:34 hpx-oss-3 vmunix: condition. The transient time threshold is 120 seconds.
Jun 15 08:15:34 hpx-oss-3 vmunix: 2 lunpaths are currently in a failed state.

 

Any ideas??

9 REPLIES
Shibin_2
Honored Contributor

Re: diagmond stopped across many servers

What's your OS version?

Regards
Shibin
Bill Hassell
Honored Contributor

Re: diagmond stopped across many servers

Run this command to determine the diagnostic version:

 

# echo "selclass qualifier system;info;wait;infolog" | cstm
Running Command File (/usr/sbin/stm/ui/config/.stmrc).

-- Information --
Support Tools Manager

Version C.53.00

Product Number B4708AA

...

 



Bill Hassell, sysadmin
mattmadey
Visitor

Re: diagmond stopped across many servers


Shibin_2 wrote:

What's your OS version?


B.11.31 U ia64

mattmadey
Visitor

Re: diagmond stopped across many servers


Bill Hassell wrote:

Run this command to determine the diagnostic version:

 

# echo "selclass qualifier system;info;wait;infolog" | cstm

Running Command File (/usr/sbin/stm/ui/config/.stmrc).

-- Information --
Support Tools Manager

Version C.53.00

Product Number B4708AA

...

 

[/]# echo "selclass qualifier system;info;wait;infolog" | cstm

 This command is not supported on this server model.
 Please use System Management Homepage (see hpsmh(1M)).


 

Torsten.
Acclaimed Contributor

Re: diagmond stopped across many servers

What does this command return?

# /opt/sfm/bin/sfmconfig -w -q

 

 


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
mattmadey
Visitor

Re: diagmond stopped across many servers


Torsten. wrote:

What result gives this:

 

# /opt/sfm/bin/sfmconfig -w -q

 

 


[/]# /opt/sfm/bin/sfmconfig -w -q
SysFaultMgmt is the default monitoring mode.
Note: EMS hardware monitors(OnlineDiag) are not supported on this platform.

Torsten.
Acclaimed Contributor

Re: diagmond stopped across many servers

Then please post the result of

# model

# cimprovider -l -s

Hope this helps!
Regards
Torsten.

mattmadey
Visitor

Re: diagmond stopped across many servers


Torsten. wrote:
Then please post the result of

# model

# cimprovider -l -s

hpx-oss-3:[/]# model
ia64 hp Integrity BL870c i2
hpx-oss-3:[/]# cimprovider -l -s
MODULE                                        STATUS
OperatingSystemModule                         OK
ComputerSystemModule                          OK
ProcessModule                                 OK
IPProviderModule                              OK
DNSProviderModule                             OK
NTPProviderModule                             OK
NISProviderModule                             OK
IOTreeModule                                  OK
HPUXLVMProviderModule                         OK
AmgrAgentProviderModule                       OK
EMSHAProviderModule                           OK
HPUX_ProviderModule                           OK
HP_NParProviderModule                         OK
HP_ResParProviderModule                       OK
Hewlett-Packard:CDM-ProvidersModule           OK
HPUXRAIDSAProviderModule                      OK
HPUXRAIDSACSProviderModule                    OK
HPUXRAIDSANativeIndicationProviderModule      OK
HPUXRAIDSAIndicationProviderModule            OK
HPUXSASProviderModule                         OK
HPUXSASCSProviderModule                       OK
HPUXSASNativeIndicationProviderModule         OK
HPUXSASIndicationProviderModule               OK
HPUXSCSIProviderModule                        OK
HPUXSCSICSProviderModule                      OK
SFMProviderModule                             OK
HP_VParProviderModule                         OK
HPUXFCCSProviderModule                        OK
HPUXFCIndicationProviderModule                OK
HPUXFCNativeIndicationProviderModule          OK
HPUXFCProviderModule                          OK
FSProviderModule                              OK
HPUXIOTreeIndicationProviderModule            OK
HPUXIOTreeCSProviderModule                    OK
HPUXLANProviderModule                         OK
HPUXLANCSProviderModule                       OK
HPUXLANIndicationProviderModule               OK
HPUXStorageIndicationProviderModule           OK
HPUXStorageNativeProviderModule               OK
HP_iCODProviderModule                         OK
HP_iCAPProviderModule                         OK
HP_GiCAPProviderModule                        OK
HP_OLOSProviderModule                         OK
HP_UtilizationProviderModule                  OK
HPVMProviderModule                            OK
SDProviderModule                              OK
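
With a list this long, a failed provider is easy to miss. A minimal sketch for filtering the output, assuming the two-column MODULE/STATUS layout shown above and that a healthy module always reports exactly OK. The function name failed_providers is made up for illustration:

```shell
# failed_providers: hypothetical helper, not part of HP-UX or SFM.
# Reads `cimprovider -l -s` style output on stdin and prints every
# module whose last column is something other than "OK".
failed_providers() {
    awk '$1 != "MODULE" && NF >= 2 && $NF != "OK" { print $1 ": " $NF }'
}

# Usage on the affected host (prints nothing if all modules are OK):
#   cimprovider -l -s | failed_providers
```

For the output posted here it would print nothing, since every module reports OK.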

Torsten.
Acclaimed Contributor

Re: diagmond stopped across many servers

This system runs System Fault Management (SFM), not the old online diagnostics, so the old diagnostic daemons such as diagmond are no longer used on it.

Check the System Management Homepage for the current status, or connect to the MP (iLO) and review the system logs (SL).
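
To repeat the monitoring-mode check on each of the affected servers, the sfmconfig output can be classified with a small shell function. This is only a sketch: it assumes the exact message text posted earlier in this thread, and monitoring_mode is a hypothetical name, not an HP-UX tool:

```shell
# monitoring_mode: hypothetical helper, not an HP-UX command.
# Reads `/opt/sfm/bin/sfmconfig -w -q` output on stdin and prints
# "sfm" if SysFaultMgmt is the default monitoring mode, else "other".
monitoring_mode() {
    if grep -q 'SysFaultMgmt is the default monitoring mode'; then
        echo sfm
    else
        echo other
    fi
}

# Usage:
#   /opt/sfm/bin/sfmconfig -w -q | monitoring_mode
```

Hosts that print sfm are in the same situation as the one discussed here. Anything else would need a closer look at that host's actual sfmconfig output.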

Hope this helps!
Regards
Torsten.
