
memtest/memtest86 on Linux?

 
TJ_16
Frequent Advisor


I have a ProLiant DL580 G2 machine running Red Hat Advanced Server 2.1. This machine locks up every so often and requires a hard reboot.
I suspect memory, but I do not get any HP Management alerts saying I have a bad DIMM. The only thing I see when this happens is an NMI handler message on the console:
"Uhhuh. NMI received. Dazed and confused, but trying to continue".
I have also seen a few iLO log entries about "correctable memory error".
Anyway, someone suggested running memtest(86), but I am clueless as to how to run it.

Any ideas or help with memtest?
5 REPLIES
Gopi Sekar
Honored Contributor

Re: memtest/memtest86 on Linux?



Memtest is a standalone program that does not depend on any OS. It has its own bootable image, which launches memtest directly.

I believe the SUSE installation CD (CD 1) includes an option to launch memtest. You can choose it from the list shown when you boot from the CD.

See http://www.memtest86.com to download the source or an ISO image.

The source can be compiled to produce a bootable image, which can be booted from either a CD or a floppy. A simpler method is to download the ISO image, burn it to a CD, and boot from that.

Here is a snippet from the memtest86 README:

Memtest86 is a stand alone program and can be loaded from either a disk
partition or from a floppy disk.

To build Memtest86:
1) Review the Makefile and adjust options as needed.
2) Type "make"

This creates a file named "memtest.bin" which is a bootable image. This
image file may be copied to a floppy disk or lilo may be used to boot this
image from a hard disk partition.

To create a Memtest86 bootdisk
1) Insert a blank write enabled floppy disk.
2) As root, Type "make install"

To boot from a disk partition via lilo
1) Copy the image file to a permanent location (ie. /memtest).
2) Add an entry in the lilo config file (usually /etc/lilo.conf) to boot
   memtest86. Only the image and label fields need to be specified.
   The following is a sample lilo entry for booting memtest86:

       image = /memtest
       label = memtest
3) As root, type "lilo"

At the lilo prompt enter memtest to boot memtest86.

If you encounter build problems a binary image has been included (precomp.bin).
To create a boot-disk with this pre-built image do the following:
1) Insert a blank write enabled floppy disk.
2) Type "make install-precomp"
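
For the lilo route in the README excerpt above, step 2 could be sketched in Python roughly as follows. This is only an illustration: the helper name `add_memtest_entry` is hypothetical, it works on the config text as a string rather than touching /etc/lilo.conf itself, and the `/memtest` path and `memtest` label are simply the sample values from the README.

```python
def add_memtest_entry(lilo_conf, image="/memtest", label="memtest"):
    """Return lilo_conf with a memtest stanza appended.

    If a 'label = <label>' line already exists, the text is returned unchanged,
    so calling this twice does not duplicate the entry.
    """
    for line in lilo_conf.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "label" and value.strip() == label:
            return lilo_conf

    # Only the image and label fields are needed, per the README.
    stanza = f"image = {image}\n    label = {label}\n"
    if lilo_conf and not lilo_conf.endswith("\n"):
        lilo_conf += "\n"
    return lilo_conf + stanza


# Example with a made-up, minimal config (not a real server's lilo.conf).
existing = "boot = /dev/sda\n"
print(add_memtest_entry(existing))
```

After the real config file is updated, "lilo" still has to be run as root (step 3) before the new entry shows up at the boot prompt.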



Regards,
Gopi
Never Never Never Giveup
TJ_16
Frequent Advisor

Re: memtest/memtest86 on Linux?

Thanks. I have downloaded the .iso and created a boot disk. My server is currently running memtest (I think).
Any idea how long the test takes and how many tests it runs?

Also, do I need to interact with it, or is it pretty much automatic?
Gopi Sekar
Honored Contributor

Re: memtest/memtest86 on Linux?

See this link for the execution time and the tests that memtest runs: http://www.memtest86.com/#timing

Also see this link for the list of available commands:
http://www.memtest86.com/#commands

Simply press ESC to exit and reboot the machine.

In my view, memtest86 is automatic and does not require any input unless you want to configure advanced options.

Regards,
Gopi
Never Never Never Giveup
TJ_16
Frequent Advisor

Re: memtest/memtest86 on Linux?

OK, one more question...
I assume that if memtest keeps running for a few tests, then everything is OK? If it does find a problem, does it stop and let you know?
Gopi Sekar
Honored Contributor

Re: memtest/memtest86 on Linux?

From the memtest86 documentation:


Memtest has two options for reporting errors. The default is to report
individual errors. In BadRAM Patterns mode patterns are created for
use with the Linux BadRAM feature. This slick feature allows Linux to
avoid bad memory pages. Details about the BadRAM feature can be found at:

http://home.zonnet.nl/vanrein/badram

For individual errors the following information is displayed when a memory
error is detected. An error message is only displayed for errors with a
different address or failing bit pattern. All displayed values are in
hexadecimal.

Tst:              Test number
Failing Address:  Failing memory address
Good:             Expected data pattern
Bad:              Failing data pattern
Err-Bits:         Exclusive or of good and bad data (this shows the
                  position of the failing bit(s))
Count:            Number of consecutive errors with the same address
                  and failing bits
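
To make the Err-Bits field concrete, here is a small Python sketch of how the exclusive or of the Good and Bad values points at the failing bit positions. The function name `failing_bits` and the sample values are made up for illustration; they are not taken from a real memtest86 run.

```python
def failing_bits(good, bad):
    """Return the bit positions (0 = least significant) where good and bad differ."""
    err_bits = good ^ bad  # same calculation as the Err-Bits column
    return [pos for pos in range(err_bits.bit_length()) if (err_bits >> pos) & 1]


# Hypothetical values: the pattern written versus the pattern read back.
good = 0xFFFFFFFF
bad = 0xFFFFFFEF

print(f"Err-Bits: {good ^ bad:08x}")
print("Failing bit positions:", failing_bits(good, bad))
```

If the same bit position keeps showing up across different addresses, that may be a hint worth noting when deciding which DIMM to look at, though the mapping from bit to module depends on the hardware.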

Regards,
Gopi
Never Never Never Giveup