Re: ML530 ASR(s)

 
Phil Cornwall
Occasional Contributor

ML530 ASR(s)

The server in question:
ML530
1 GHz Pentium III Xeon processor
1.2 GB RAM
Smart Array 5300 controller
6 x 18.2 GB Ultra3 SCSI 10K drives (RAID 5, 1 hot spare)

It is a file and print server running Win2K Server SP3.

We have experienced ASRs since it was put into service almost 2 years ago.

The ASRs were very infrequent at first, but now seem to occur every couple of weeks, usually during the nightly backup. (Using Retrospect Server v. 5.6)

Two weeks ago, I finally got around to installing update pack 6.3, and installed all components as recommended. The problem was not solved.

I have attached the IML with all the details, but basically there are blue screen traps, POST errors, and most importantly, uncorrectable memory errors.

The IML indicates the Memory Module(s) in question for each hard memory error.

The modules originally occupied the slots as follows (by recommended bank order):

DIMM 1: 512 MB (Kingston)
DIMM 5: 512 MB (Kingston)
DIMM 2: 256 MB (Compaq)

The IML originally reported the bad module as the 512 in slot 1, so I swapped it with the 512 from slot 5. The next ASR with a hard memory error reported the module in slot 2 as bad. I then swapped the 256 in slot 2 with the 512 in slot 1. The next ASR reported slot 2 again, which by then held the 512, so I removed that module from the equation altogether. I have not seen uncorrectable memory errors since, but I DO get correctable errors every time the machine boots up.
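To make the swap history easier to reason about, here is a rough sketch that tallies the reported hard errors by slot and by module. The layouts are my reading of the sequence above, and the labels "Kingston A" and "Kingston B" are hypothetical names used only to tell the two 512 MB modules apart; nothing here comes from the IML itself.

```python
from collections import Counter

# Each entry: (slot -> module layout at the time of the error, slot reported as bad).
# "Kingston A" / "Kingston B" are made-up labels for the two 512 MB modules.
observations = [
    ({1: "Kingston A 512", 5: "Kingston B 512", 2: "Compaq 256"}, 1),
    ({1: "Kingston B 512", 5: "Kingston A 512", 2: "Compaq 256"}, 2),
    ({1: "Compaq 256", 5: "Kingston A 512", 2: "Kingston B 512"}, 2),
]

def tally(observations):
    """Count reported hard errors per slot and per physical module."""
    by_slot = Counter()
    by_module = Counter()
    for layout, bad_slot in observations:
        by_slot[bad_slot] += 1
        by_module[layout[bad_slot]] += 1
    return by_slot, by_module

by_slot, by_module = tally(observations)
print("errors per slot:  ", dict(by_slot))
print("errors per module:", dict(by_module))
```

If this reading is right, the errors landed on three different modules but only two slots, with slot 2 reported twice while holding two different modules. That would point more toward the slots or the board than toward any single module, though three data points are hardly conclusive.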

The only things I haven't tried yet are:

1. Try it out with just the 256 in slot 1, in case the Kingston memory is no good. (2 bad modules?)

2. Boot the server with no memory modules installed to 'reset' the slots, then re-load them all and try it again.

3. Buy a brand new module and give it a try.

I am, however, beginning to think that the slots/motherboard may be bad, since by my estimation I've tried enough combinations of module placement.

The last ASR we had 4 days ago logged a blue screen trap.

The system event log is still reporting the soft memory errors after each boot, as well as SNMP warnings about DHCPMib, WINSMib, and CPQASMMib having misconfigured or missing registry entries.

Any help here would be appreciated.

Phil Cornwall

A closed mouth gathers no foot.
1 REPLY
Phil Cornwall
Occasional Contributor

Re: ML530 ASR(s)

I was wrong about the "CPQASMMibAgent" warning in the event log. I just get the same 2 SNMP warnings about the soft memory errors on each and every boot-up.

See the attachment showing the SNMP trap with the memory module details from the system log.

As I said before, I've played musical slots with the memory modules, and removed one module that might be bad, but still get these correctable memory errors.

Phil
A closed mouth gathers no foot.