03-21-2011 02:30 AM
Error mesg in kernel
I found the following error message in /var/log/messages:
Unable to handle kernel NULL pointer dereference at 0000000000000001 RIP
My server rebooted at about 2 o'clock. When I checked the log afterwards, this was the message I found.
Can anyone help me find the root cause of this error?
03-21-2011 10:51 AM
Re: Error mesg in kernel
Normally this message is followed by many other lines that will provide more details about what the processor was doing when the error happened.
From this one line only, it is impossible to say anything more.
An example of a complete "Unable to handle kernel NULL pointer dereference..." message is in the first post of this Fedoraforum thread:
http://forums.fedoraforum.org/showthread.php?t=52508
All the lines dated "Apr 22 13:00:06" were generated by a single error event. Most of the numeric content is useful only to serious kernel programmers, but the "Call Trace" might be the easiest part to understand: it is a list of kernel functions executed just before the error was detected.
If the error happens again and the Call Trace is the same, you may have found a kernel bug. If the system reboots again but the Call Trace is different every time, a failing hardware component (processor, RAM or system board) is a likely cause.
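To make the "same Call Trace or different?" comparison less error-prone, something like the following rough Python sketch could be used once two complete error messages have been captured. The regular expression and the sample traces are assumptions for illustration only: the exact layout of the Call Trace differs between kernel versions and architectures, and the function names below are made up.

```python
import re

# Assumed frame layouts: "<addr>{func+off}" and "[<addr>] func+0xoff/0xlen".
# Adjust the pattern to whatever your own kernel actually prints.
FRAME_RE = re.compile(r"<[0-9a-f]+>\]?\s*\{?([A-Za-z_][\w.]*)\+")

def call_trace_functions(oops_text):
    """Return the function names listed after 'Call Trace' in ONE error event."""
    _, found, tail = oops_text.partition("Call Trace")
    if not found:
        return []
    return FRAME_RE.findall(tail)

def same_trace(oops_a, oops_b):
    """True if both events list the same functions in the same order."""
    a = call_trace_functions(oops_a)
    b = call_trace_functions(oops_b)
    return bool(a) and a == b

# Hypothetical sample data, not taken from any real crash.
SAMPLE_A = """kernel: Call Trace:<ffffffff80100001>{hypothetical_read+10}
kernel:        <ffffffff80100002>{hypothetical_syscall+20}"""
SAMPLE_B = """kernel: Call Trace: [<c0100001>] hypothetical_read+0x10/0x40
kernel:  [<c0100002>] hypothetical_syscall+0x20/0x80"""

if __name__ == "__main__":
    print(call_trace_functions(SAMPLE_A))
    print(same_trace(SAMPLE_A, SAMPLE_B))
```

Only the function names are compared, because addresses and offsets can differ between kernel builds even when the code path is the same.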
MK
03-21-2011 09:38 PM
Re: Error mesg in kernel
The message log is showing the following messages.
Mar 20 02:03:01 vpp211 crond(pam_unix)[25484]: session closed for user linus
Mar 20 02:04:01 vpp211 crond(pam_unix)[25809]: session opened for user linus by (uid=0)
Mar 20 02:04:02 vpp211 crond(pam_unix)[25809]: session closed for user linus
Mar 20 02:05:01 vpp211 crond(pam_unix)[26141]: session opened for user linus by (uid=0)
Mar 20 02:05:01 vpp211 crond(pam_unix)[26141]: session closed for user linus
Mar 20 02:06:01 vpp211 crond(pam_unix)[26467]: session opened for user linus by (uid=0)
Mar 20 02:06:02 vpp211 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000001 RIP:
Mar 20 02:20:33 vpp211 syslogd 1.4.1: restart.
Mar 20 02:20:33 vpp211 syslog: syslogd startup succeeded
Mar 20 02:20:33 vpp211 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Mar 20 02:20:33 vpp211 kernel: Bootdata ok (command line is ro root=LABEL=/ hdd=ide-scsi console=tty0 console=ttyS1,115200)
After that error, the server restarted automatically (ASR is enabled on this server).
We are using the RHEL 64-bit kernel:
Linux 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
Some scheduled cron jobs run automatically on this server. They were running when the error happened, and nobody manually interrupted any process at that time.
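Since the hang recurs, it may help to pull the same information out of /var/log/messages after every incident: the last line written before each "syslogd ... restart." entry, and how long the log was silent. The Python sketch below is only an illustration. The function names are made up, the year has to be supplied because syslog timestamps do not include one, and it assumes each log entry is on a single line, as it is in the real file.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Parse the 'Mon DD HH:MM:SS' prefix of a syslog line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def gaps_before_restart(lines, year):
    """Yield (last line before a syslogd restart, length of the silence)."""
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        if "syslogd" in line and "restart" in line and previous is not None:
            yield previous, parse_stamp(line, year) - parse_stamp(previous, year)
        previous = line

# Two lines copied from the log excerpt in this post.
SAMPLE = [
    "Mar 20 02:06:02 vpp211 kernel: Unable to handle kernel NULL pointer "
    "dereference at 0000000000000001 RIP:",
    "Mar 20 02:20:33 vpp211 syslogd 1.4.1: restart.",
]

if __name__ == "__main__":
    for last_line, silence in gaps_before_restart(SAMPLE, 2011):
        print(silence, "|", last_line)
```

A clean shutdown or a log rotation also produces a syslogd restart line, so only the entries with a long silence and no shutdown messages before them point to a hang.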
03-22-2011 09:33 PM
Re: Error mesg in kernel
The server hangs at least once a month. Since ASR is enabled, it recovers from the hang by rebooting automatically, so I'm not able to capture the exact error, and the message log does not show complete information.
Can I assume this frequent hanging is a hardware issue? Is that a possibility?
I'm hoping for a reply ...
03-22-2011 11:38 PM
Re: Error mesg in kernel
Apparently the error causes problems with the disk I/O subsystem, because the kernel seems to be unable to write the full error message to /var/log/messages. (You might want to check driver versions and/or firmware levels. If newer versions are available, read the release notes to see if any of the descriptions of fixed problems matches your problem.)
There are a few things you might want to do to gather more information when the error happens again:
* If the server can remain in a hung state for a while (i.e. not a critical service) and the server is conveniently accessible physically, you might disable the ASR and wait for the problem to happen. Most likely it will cause the server to hang. Then look at the console display: the full error message might be visible there, even if it cannot be written to disk. Once you've captured the error message from the screen (a camera might be useful here), you can reboot the system and enable ASR again.
* If the server has a serial port that is not used for anything else, you might connect it to another system and send all kernel messages to that serial port. Then the other system can be used to record all incoming data from the serial port. (A serial port is a very simple device, and it may remain functional even if more complex parts of the system are disabled because of this error.)
For example, if you want to send all kernel messages to serial port ttyS0 (=COM1), you'll need to add this to your boot options:
console=ttyS0 console=tty0
This will keep the normal console display working as usual, but all kernel messages are sent to the serial port too.
* You might also try configuring a crash dump feature (in your RHEL 4, either diskdump or netdump is available). This may require more configuration changes than the other options, but it has a chance of providing much more information for troubleshooting.
These Red Hat Knowledge Base documents might be helpful (Red Hat Network access required):
https://access.redhat.com/kb/docs/DOC-5413
https://access.redhat.com/kb/docs/DOC-7075
https://access.redhat.com/kb/docs/DOC-6855
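For the serial-port option above, the recording side does not need anything elaborate. A minimal Python sketch of a recorder is below. It copies every line it receives into a log file and adds the recording machine's own timestamp, which is useful because the crashing server may not be able to timestamp anything itself. The function name and the timestamp format are my own choices, not part of any standard tool.

```python
import time

def record_console(stream, log, clock=time.strftime):
    """Copy lines from a binary stream to a text log with a local timestamp.

    stream: any object with readline() returning bytes (b"" means end of data)
    log:    any writable text file object
    clock:  function taking a strftime format and returning the timestamp text
    """
    count = 0
    for raw in iter(stream.readline, b""):
        # A crashing kernel may emit garbage bytes; replace them, don't fail.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        log.write(f"{clock('%Y-%m-%d %H:%M:%S')} {text}\n")
        log.flush()  # keep the file current in case the recorder is interrupted
        count += 1
    return count
```

On the recording machine this could be called with the serial device opened in binary mode and a log file opened for appending, after the port speed has been set to match the sending side. A terminal program with a logging feature does the same job, if one is already installed.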
You might also want to run memtest86 or a similar hardware test utility in a loop over a weekend or so. If the test utility produces error messages or crashes, it's a hardware issue.
MK