rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

SOLVED
moonchild
Regular Advisor

8 x 1 GB DIMMs were installed in February on cell 0, for a total of 16 GB on cell 0.

In August we noticed that 8 GB on cell 0 had been deconfigured after a patch installation.

We reinstalled the system from an Ignite image and we still see the 8 GB deconfigured.

The MP log is attached.
There are no STM events regarding the memory deallocation.
The memlog shows 0 errors in the PDT.

your help is much appreciated
thank you in advance
10 REPLIES
Steven E. Protter
Exalted Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

Shalom,

I still suspect a hardware problem and would call the hardware group in for diagnostics.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
moonchild
Regular Advisor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

after upgrading STM:

Memory Board Inventory

CAB/CELL: 0/0
        DIMM A      DIMM B      DIMM C      DIMM D
Slot    Size (MB)   Size (MB)   Size (MB)   Size (MB)
----    ---------   ---------   ---------   ---------
0       1024        1024        1024        1024
1       0           0           0           0
2       1024        1024        1024        1024
3       0           0           0           0
Cell Total (MB): 8192
-------------------------------------------------
CAB/CELL: 0/1
        DIMM A      DIMM B      DIMM C      DIMM D
Slot    Size (MB)   Size (MB)   Size (MB)   Size (MB)
----    ---------   ---------   ---------   ---------
0       1024        1024        1024        1024
1       1024        1024        1024        1024
2       1024        1024        1024        1024
3       0           0           0           0
Cell Total (MB): 12288
-------------------------------------------------
CAB/CELL: 0/2
        DIMM A      DIMM B      DIMM C      DIMM D
Slot    Size (MB)   Size (MB)   Size (MB)   Size (MB)
----    ---------   ---------   ---------   ---------
0       2048        2048        2048        2048
1       1024        1024        1024        1024
2       1024        1024        1024        1024
3       0           0           0           0
Cell Total (MB): 16384
-------------------------------------------------
System Total (MB): 36864

Memory Error Log Summary
The memory error log is empty.

Page Deallocation Table (PDT)
PDT Entries Used: 0
PDT Entries Free: 200
PDT Total Size: 200
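
As a quick sanity check on the inventory above, the per-cell totals can be recomputed and compared with what should be installed. This is only a minimal sketch in Python: the `inventory` dict is typed in by hand from the STM output, and `expected_mb` is an assumption taken from the 16G total for cell 0 mentioned in the first post. STM seems to print 0 for a slot whether it is empty or deconfigured, so a zero slot is a hint, not proof.

```python
# Sizes in MB per slot (DIMM A, B, C, D), copied by hand from the STM inventory.
inventory = {
    "0/0": {0: [1024] * 4, 1: [0] * 4, 2: [1024] * 4, 3: [0] * 4},
    "0/1": {0: [1024] * 4, 1: [1024] * 4, 2: [1024] * 4, 3: [0] * 4},
    "0/2": {0: [2048] * 4, 1: [1024] * 4, 2: [1024] * 4, 3: [0] * 4},
}

# Assumption: 16 GB physically installed in cell 0, per the first post.
expected_mb = {"0/0": 16 * 1024}


def cell_total(slots):
    """Sum the sizes of all DIMMs that STM reports for one cell."""
    return sum(sum(sizes) for sizes in slots.values())


def zero_slots(slots):
    """Return the slot numbers where all four DIMMs read 0 MB."""
    return [slot for slot, sizes in sorted(slots.items()) if not any(sizes)]


totals = {cell: cell_total(slots) for cell, slots in inventory.items()}
system_total = sum(totals.values())
missing_mb = {cell: expected_mb[cell] - totals[cell] for cell in expected_mb}

for cell, total in totals.items():
    print(f"cell {cell}: {total} MB, zero slots: {zero_slots(inventory[cell])}")
print(f"system total: {system_total} MB")
print(f"missing on 0/0: {missing_mb['0/0']} MB")
```

With the figures above, the recomputed totals agree with STM (8192, 12288, 16384 and 36864 MB), and cell 0 comes out 8192 MB short, with slots 1 and 3 reading zero.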
Khairy
Esteemed Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

hi,

"the memlog shows 0 orrors in PDT"
AFAIK, you will see entries in the PDT if the system firmware encounters single- or double-bit errors in the memory DIMMs.

In my opinion, you have bad memory DIMMs. I think one or two DIMMs in memory ranks 1 and 3 of cell 0 are failing, or maybe all of the DIMMs in ranks 1 and 3.

If you have a contract with HP, it would be good to contact them.

If you want to try on your own, do this:

I see cell 2 is not associated with any partition. Take all the working DIMMs from cell 2 and install them in cell 1.

Then power up par 0. If all is OK, you could begin troubleshooting with cell 2, which now contains all the memory DIMMs from cell 0.

Try swapping DIMM by DIMM.

Post your progress.

Rgds
Torsten.
Acclaimed Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

PDT is something different.

You will see entries if you have bad areas on a DIMM (4k areas will be excluded).

If you have a bad DIMM, there will be nothing in the PDT.

Look at these chassis codes:

ERR_MEM_SUBSYS_FAIL_MID1
MEM_BUS_PARITY_ERROR
MEM_DIMM_DEALLOCATED_PARITY_ERROR

I would suggest giving HP support a call ...

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Torsten.
Acclaimed Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

Try to execute "in me" from the PDC boot menu level (BCH).

Example for cell 6 (yours is cell 0):

Main Menu: Enter command or menu > in me 6

CELL MEMORY INFORMATION

Memory Information for Cell: 6 Cab/Slot: 0/ 6

          ---- DIMM A ----      ---- DIMM B ----
          DIMM    Current       DIMM    Current
Echelon   Size    Status        Size    Status
-------   ------  ----------    ------  ----------
0         1024MB  Active        1024MB  Active
1         1024MB  HW Deconf     1024MB  HW Deconf
2         1024MB  Active        1024MB  Active
3         1024MB  Active        1024MB  Active
4         1024MB  Active        1024MB  Active
5         1024MB  HW Deconf     1024MB  HW Deconf
6         1024MB  Active        1024MB  Active
7         1024MB  Active        1024MB  Active
8         1024MB  Active        1024MB  Active
9         1024MB  HW Deconf     1024MB  HW Deconf
A         1024MB  Active        1024MB  Active
B         1024MB  Active        1024MB  Active
C         ---                   ---
D         ---                   ---
E         ---                   ---
F         ---                   ---
...
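
If the "in me" output is captured to a text file, the non-active DIMMs can be picked out with a few lines of Python. This is a rough sketch, not an HP tool: the row layout is assumed from the example above, `SAMPLE` just repeats the first rows of it, and the "1A"-style labels (echelon plus DIMM column) are my own naming for illustration.

```python
import re

# A few rows copied from the example "in me" output for cell 6.
SAMPLE = """\
0 1024MB Active 1024MB Active
1 1024MB HW Deconf 1024MB HW Deconf
2 1024MB Active 1024MB Active
5 1024MB HW Deconf 1024MB HW Deconf
9 1024MB HW Deconf 1024MB HW Deconf
C --- ---
"""

# Echelon, then size and status for DIMM A, then size and status for DIMM B.
ROW = re.compile(r"^([0-9A-F])\s+(\d+)MB\s+(.+?)\s+(\d+)MB\s+(.+)$")


def deconfigured(text):
    """Return (label, size_mb) for every DIMM whose status is not 'Active'."""
    hits = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # empty echelons ("---") and other lines are skipped
        echelon, size_a, status_a, size_b, status_b = match.groups()
        if status_a != "Active":
            hits.append((echelon + "A", int(size_a)))
        if status_b != "Active":
            hits.append((echelon + "B", int(size_b)))
    return hits


for label, size_mb in deconfigured(SAMPLE):
    print(label, size_mb, "MB")
```

Any status other than "Active" is reported, so the sketch does not depend on the exact "HW Deconf" wording.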

Results?

Hope this helps!
Regards
Torsten.

moonchild
Regular Advisor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

Attached is the memory info for cell 0.

But could it be that all 8 DIMMs went bad, and so they got deconfigured?
Torsten.
Acclaimed Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

On this system, memory must be installed in "ranks" (= 4 DIMMs). If there is a bad DIMM, the whole rank will probably be disabled during selftest.

You have either a number of bad DIMMs in both ranks or a problem on the cell board itself.

If you want to test, you can try moving all DIMMs from 1x and 3x to 0x and 2x, and vice versa.

Hope this helps!
Regards
Torsten.

Stefan Stechemesser
Honored Contributor
Solution

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

Hi Moonchild,

Your system had an address or control parity error (look at the chassis codes that Torsten already filtered out):

470 PDC 0,0,0 *6 0x20000760000063dc 0x00ffff00ffffff93 ERR_MEM_SUBSYS_FAIL_MID1
470 PDC 0,0,0 *6 0x58000f00000063d0 0x00006c07030a0904 08/03/2008 10:09:04
471 PDC 0,0,0 *10 0x180001ad76001091 0x0000000000000001 MEM_BUS_PARITY_ERROR
471 PDC 0,0,0 *10 0x5800090000001090 0x00006c07030a0904 08/03/2008 10:09:04


Before that you had many system panics. It could be that the memory error was responsible for those panics, too.
Note that this was not a data error; a data error would have been fixed by ECC.
The HP DIMMs contain a special chip that also parity-protects the address and control lines. If such an error happens during normal operation, an HPMC will occur.

I guess that the error above happened in the selftest after an OS panic, and the firmware deconfigured ALL DIMMs on that memory bus ("MID1") to prevent a crash loop.

Each rp8400 cell has two memory buses (MID 0 and MID 1); this is why half of the DIMMs are now deconfigured.

I would reseat all deconfigured DIMMs. Maybe one is not properly seated. Then you have to reconfigure them in the BCH configuration menu.
If this does not help, then one of the DIMMs or the cell board may be defective.
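
For anyone who wants to filter a saved MP log for such events, the chassis code lines quoted above can be paired up with their timestamp lines in a few lines of Python. This is only a sketch under assumptions: the field names (`source`, `location`, `alert`) are my own reading of the columns, the date is taken as MM/DD/YYYY because the problem was noticed in August, and `LOG` simply repeats the four lines from this post.

```python
from datetime import datetime

# The four chassis code lines quoted in this post.
LOG = """\
470 PDC 0,0,0 *6 0x20000760000063dc 0x00ffff00ffffff93 ERR_MEM_SUBSYS_FAIL_MID1
470 PDC 0,0,0 *6 0x58000f00000063d0 0x00006c07030a0904 08/03/2008 10:09:04
471 PDC 0,0,0 *10 0x180001ad76001091 0x0000000000000001 MEM_BUS_PARITY_ERROR
471 PDC 0,0,0 *10 0x5800090000001090 0x00006c07030a0904 08/03/2008 10:09:04
"""


def parse_entries(text):
    """Group lines by entry number; each entry gets a keyword and a time."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 7:
            continue  # not a chassis code line in the layout assumed here
        number, source, location, alert = parts[:4]
        tail = " ".join(parts[6:])  # whatever follows the two hex words
        entry = entries.setdefault(
            int(number),
            {"source": source, "location": location,
             "alert": int(alert.lstrip("*"))},
        )
        try:
            # Assumed MM/DD/YYYY, matching the August date in the thread.
            entry["time"] = datetime.strptime(tail, "%m/%d/%Y %H:%M:%S")
        except ValueError:
            entry["keyword"] = tail
    return entries


for number, entry in sorted(parse_entries(LOG).items()):
    print(number, entry["alert"], entry.get("keyword"), entry.get("time"))
```

Filtering the result for keywords containing "MEM" would give the same kind of short list that Torsten extracted by hand.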

best regards

Stefan
Torsten.
Acclaimed Contributor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

How to activate them again?

DimmDealloc ON    Schedule the DIMM for reconfiguration

Service Menu: Enter command > dd 0 1a on

DIMM: 1A is scheduled for allocation after the next reboot.
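
Since a whole bus worth of DIMMs is deconfigured, the command has to be repeated for each one. A small Python sketch to print the list of commands to type: the `dd <cell> <dimm> on` form is inferred only from the single example above, and the choice of slots 1 and 3 with DIMMs a to d is an assumption based on the STM inventory for cell 0, so check it against the "in me" output first.

```python
def dd_commands(cell, slots, dimms="abcd", state="on"):
    """Build one 'dd' command per DIMM, e.g. 'dd 0 1a on'."""
    return [f"dd {cell} {slot}{dimm} {state}"
            for slot in slots
            for dimm in dimms]


# Assumption: slots 1 and 3 on cell 0 are the deconfigured ones.
commands = dd_commands(0, [1, 3])

for command in commands:
    print(command)
```

This prints eight commands, from "dd 0 1a on" to "dd 0 3d on"; each DIMM is then scheduled for allocation after the next reboot.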


Hope this helps!
Regards
Torsten.

moonchild
Regular Advisor

Re: rp8400 - 11.11 - 4 x 2G DIMMS are deallocated

One bad DIMM was the culprit.

Thank you all for your help.