Idham
Frequent Advisor

dmesg got error

Can anybody assist?
I had just been ignoring this message (dmesg output below).
But yesterday, 15 Dec 08, the server rebooted by itself, maybe because of this error. How can I troubleshoot this situation?

FYI, I already ran ioscan on the FC and disk devices. No problem detected.

Nov 26 22:07:56 camrac01 vmunix: DIAGNOSTIC SYSTEM WARNING:
Nov 26 22:07:56 camrac01 vmunix: The diagnostic logging facility has started receiving excessive
Nov 26 22:07:56 camrac01 vmunix: errors from the I/O subsystem. I/O error entries will be lost
Nov 26 22:07:56 camrac01 vmunix: until the cause of the excessive I/O logging is corrected.
Nov 26 22:07:56 camrac01 vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Nov 26 22:07:56 camrac01 vmunix: in stm to start it.
Nov 26 22:07:56 camrac01 vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Nov 26 22:07:56 camrac01 vmunix: to determine which I/O subsystem is logging excessive errors.


Dennis Handly
Acclaimed Contributor

Re: dmesg got error

>the server crashed. Maybe due to this error? so, how to overcome/troubleshoot this situation?

First look at the crash/logs from yesterday.
Do you have a crash dump in /var/adm/crash?
Look at /var/adm/syslog/OLDsyslog.log to see if there are any I/O errors logged from Nov or yesterday.
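
A minimal sketch of that log check, assuming a POSIX shell. The function name `syslog_for_day` is made up for illustration; the log path in the usage comment is the one mentioned above. It prints only the lines stamped with a given day, so the 26 Nov and 15 Dec entries can be looked at separately:

```shell
# Print the syslog lines whose timestamp starts with the given day.
# Syslog pads single-digit days with an extra space ("Dec  5"),
# so always quote the day argument.
syslog_for_day() {
    day=$1
    file=$2
    # Anchor at the start of the line so only the timestamp column is matched.
    grep "^$day " "$file"
}

# Usage (paths as mentioned in this thread):
#   syslog_for_day "Nov 26" /var/adm/syslog/OLDsyslog.log
#   syslog_for_day "Dec 15" /var/adm/syslog/OLDsyslog.log
```

If the second call prints nothing, the log simply has no entries for the day of the reboot, and the crash dump directory becomes the next place to look.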
Analyst
Trusted Contributor

Re: dmesg got error

Hi Idham,

The dmesg errors may be old ones, but:

Run the CSTM utility:
# cstm>selall>infolog ; wait

Check the output for errors.
Check the old syslog for any errors.
Check whether the diaglogd daemon is active.

Thanks,
Analyst.
Idham
Frequent Advisor

Re: dmesg got error

Hi Analyst,

I'm not familiar with STM. Can you assist?
Will it affect the running Oracle DB?
How do I check whether diaglogd is running?
Kapil Jha
Honored Contributor

Re: dmesg got error

These errors are from 26 Nov, and if these are the last few lines of dmesg then they are not at all the cause of the server crash.
The message only says that the logging facility is receiving a lot of alerts/errors, which could come from anything in the server.

Check the /var/adm/messages entries for 15 Dec.

Do you have any other errors in syslog.log, or any crash dumps or a ts99 file?

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Idham
Frequent Advisor

Re: dmesg got error

I've checked, but there were no errors in syslog/OLDsyslog or in /var/adm/crash/.

What is a "crash dumps ts99 file" anyway?
Kapil Jha
Honored Contributor

Re: dmesg got error

Just check the /var/adm/crash directory to see if you have anything.
The ts99 file is in the /var/tombstones directory.

These are the files used to investigate the cause of a crash. Crash files are actually the memory image of the crashed system, saved when the system reboots.

As there are no errors after 26 Nov, you will need a crash file; otherwise it will be tough to find the exact cause of the crash.
BR,
Kapil+


Analyst
Trusted Contributor

Re: dmesg got error

1. cstm is a Support Tools Manager tool; use it the same way I wrote:
#cstm
#cstm>selall
#cstm>infolog ; wait

This displays the complete hardware information.

2. #ps -ef |grep diaglogd

> will it affect running oracle db.
No. You are not stopping anything; it is only checking.

Thanks,
Analyst.

Suraj K Sankari
Honored Contributor

Re: dmesg got error

Hi,

According to your log, the date is old (Nov 26 22:07:56). What do you get in /var/adm/syslog/syslog? Did you find any error messages there?

Suraj
Idham
Frequent Advisor

Re: dmesg got error

Analyst,

I need your help. I've run STM>selall>infolog.
Can you please check the warnings below for me? Is the problem with the memory or the optical device?
Is the data up to date, or do I need to run some command to collect new data?
                                               Last        Last Op
Path                 Product                   Active Tool Status
==================== ========================= =========== =============
system               system (1005)             Information Successful
memory               IPF_MEMORY (1005)         Information Successful
0                    Bus Adapter (103c1229)    Information Successful
0/0                  PCI Bus Adapter (103c122e Information Successful
0/0/1/0              RS-232 Interface (103c129 Information Successful
0/0/1/1              RS-232 Interface (103c104 Information Successful
0/0/2/0              USB Open Host Controller
0/0/2/1              USB Open Host Controller
0/0/2/1.0            Generic USB Interface (30 Information Incomplete
0/0/2/2              USB Enhanced Host Control
0/0/3/0              IDE Bus Adapter (10950649 Information Successful
0/0/3/0.0            IDE Interface (10950649)  Information Successful
0/0/3/0.0.0.0        Optical Storage Device (T Information Warning
0/0/3/0.1            IDE Interface (10950649)  Information Successful
0/0/4/0              Graphics Interface (GRAPH Information Warning
0/1                  PCI Bus Adapter (103c122e Information Successful
0/1/1/0              MPT SCSI Adapter (HP_A717 Information Successful
0/1/1/0.0.0          SCSI Disk (HP73.4GST37345 Information Successful
0/1/1/0.1.0          SCSI Disk (HP73.4GST37345 Information Successful
0/1/1/1              MPT SCSI Adapter (HP_A717 Information Successful
0/1/1/1.3.0          SCSI Tape (HPC7438A)
0/1/2/0              1000BaseT 2 port Adapter  Information Successful
0/1/2/1              1000BaseT 2 port Adapter  Information Successful
0/2                  PCI Bus Adapter (103c122e Information Successful
0/2/1/0              1000BaseT 2 port Adapter  Information Successful
0/2/1/1              1000BaseT 2 port Adapter  Information Successful
0/3                  PCI Bus Adapter (103c122e Information Successful
0/3/2/0              Fibre Channel Interface ( Information Successful
0/3/2/0.145          Fibre Channel Driver (Mas
0/3/2/0.145.0.0.0.0. SCSI Disk (DGCCX3-40cWDR5
0/3/2/0.145.0.0.0.0. SCSI Disk (DGCCX3-40cWDR5
0/3/2/0.145.0.0.0.0. SCSI Disk (DGCCX3-40cWDR5
0/3/2/0.145.0.0.0.0. SCSI Disk (DGCCX3-40cWDR5
0/3/2/0.145.0.0.0.0. SCSI Disk (DGCCX3-40cWDR5


Memory Board Inventory

DIMM Location         Size(MB)  DIMM Location         Size(MB)
--------------------  --------  --------------------  --------
DIMM 0A               1024      DIMM 0B               1024
DIMM 0C               1024      DIMM 0D               1024
DIMM 1A               512       DIMM 1B               512
DIMM 1C               512       DIMM 1D               512
DIMM 2A               ----      DIMM 2B               ----
DIMM 2C               ----      DIMM 2D               ----
DIMM 3A               ----      DIMM 3B               ----
DIMM 3C               ----      DIMM 3D               ----

Total: 6144 (MB)

===========================================================================

Memory Error Log Summary

DIMM Location           Error Address     Error Type  Page           Count
----------------------  ----------------  ----------  -------------  -----
DIMM 0C                 0x40586c3580      Single-Bit  0x40586c3      2

System start: Mon May 14 17:15:22 2007.
Last error detected: Mon May 14 17:15:22 2007.
Logging interval: 900 seconds.
1 address(es) with errors logged in memory error log.

The Logtool Utility provides full details about the memory error log.

Page Deallocation Table (PDT)

DIMM Location           Error Address     Error Type  Page
----------------------  ----------------  ----------  -------------
DIMM 0C                 0x40586c3580      Single-Bit  0x40586c3

PDT Entries Used: 1
PDT Entries Free: 99
PDT Total Size: 100
Analyst
Trusted Contributor

Re: dmesg got error

Idham,

#cstm>selall>runutil> choose logtool-->SAV-->save the file and find the errors.


It seems to be a memory hardware issue.

Otherwise, upload the file and we will analyse it.


Thanks,
Analyst.
cnb
Honored Contributor
Solution

Re: dmesg got error

To check if diaglogd is running:

# ps -ef | grep diag

you should see something like this:
root 1467 1240 0 Dec 12 ? 0:19 diaglogd
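
One caveat with the grep above: it can also match the grep process itself, so a hit does not always mean the daemon is up. A small sketch, assuming a POSIX shell (the function name `diaglogd_running` is hypothetical), that uses the bracket trick so only a real diaglogd entry counts:

```shell
# Reads `ps -ef` output on stdin and succeeds only if a diaglogd entry is present.
# The pattern '[d]iaglogd' matches the text "diaglogd" but not its own spelling,
# so the grep process's own entry in the ps listing is not counted.
diaglogd_running() {
    grep '[d]iaglogd' >/dev/null
}

# Usage:
#   if ps -ef | diaglogd_running; then
#       echo "diaglogd is active"
#   else
#       echo "diaglogd is NOT active"
#   fi
```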


To check the Raw Summary (rs) of I/O errors in logtool:

# cstm
cstm> ru logtool
Logtool> rs
Logtool> q
Logtool> saveas
(give it a location and name you'll remember)
Logtool> done

To format the current raw file to get detailed I/O error information:

Logtool> fl
(select default location)
Logtool> q
Logtool> saveas
(choose a name and location you'll remember)
Logtool> done
Logtool> exit
cstm> exit

View the files you created to see where your errors are, or post the files here.

HTH,