HPE 9000 and HPE e3000 Servers

Possible Memory Issue on 8420

 
Paul Coffey_1
Advisor

We recently had a memory problem with DIMM 1b on cell 0, which was deconfigured. Today we replaced the DIMMs in slots 1b and 1a.

When we boot the system, BCH reports all of the memory, but CSTM reports cell 0 as having one DIMM's worth of memory less than it should.

We replaced the DIMMs a second time and still have the same issue.

Swapinfo also shows less total memory than what is in the system, but dmesg shows the correct amount.

I believe BCH, but I need to determine why CSTM and swapinfo show the incorrect amount, especially CSTM, since that is what our team monitors. I'm assuming the STM issue is caused by cached data or a missing patch.

Below is the output showing memory. Any help is greatly appreciated.


DoCalllist done



Memory Class Setup
-------------------------------------------------------------------------
Class      Physmem       Lockmem       Swapmem
-------------------------------------------------------------------------
System :   159661 MB     159661 MB     159661 MB
Kernel :   159661 MB     159661 MB     159661 MB
User   :   152142 MB     132602 MB     133122 MB


swapinfo -tm
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev       73728       0   73728    0%       0       -    1  /dev/vg00/lvol2
dev       40928       0   40928    0%       0       -    1  /dev/vg00/lvol11
reserve       -    1533   -1533
memory   159661   14727  144934    9%
total    274317   16260  258057    6%       -       0    -
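
As a sanity check on the swapinfo figures above, the total row can be recomputed from the other rows. This is only a minimal Python sketch with the values transcribed by hand; the variable names are illustrative, and treating "-" as 0 is an assumption made so the columns can be summed.

```python
# Rows transcribed from the swapinfo -tm output above: (type, avail, used, free) in MB.
# A "-" in the output is recorded here as 0 so the columns can be summed (assumption).
rows = [
    ("dev",     73728,  0,     73728),
    ("dev",     40928,  0,     40928),
    ("reserve", 0,      1533,  -1533),
    ("memory",  159661, 14727, 144934),
]
reported_total = (274317, 16260, 258057)  # AVAIL, USED, FREE on the "total" row

# Sum each numeric column across all rows.
computed_total = tuple(sum(row[i] for row in rows) for i in (1, 2, 3))

print("computed:", computed_total)
print("reported:", reported_total)
print("match:", computed_total == reported_total)
```

The memory row is counted into the total in the same way as the device rows, so it appears to be part of the swap accounting rather than a plain report of installed memory.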

Support Tools Manager


OUTPUT OF BCH
Cell    Echelon 0-3           Echelon 4-7
        Size     Status       Size     Status
----    ------   ---------    ------   ---------
0       8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
1       8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
        8192MB   Active       8192MB   Active
2       4096MB   Active       4096MB   Active
        4096MB   Active       4096MB   Active
        4096MB   Active       4096MB   Active
        4096MB   Active       4096MB   Active

Partition Total Memory: 163840 MB
Partition Active Memory: 163840 MB
Partition Deconfigured Memory: 0 MB
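
To cross-check the BCH figures above, the echelon sizes can be summed per cell and compared with the partition total. This is a minimal Python sketch with the sizes transcribed from the listing; the dictionary name and layout are illustrative only.

```python
# Echelon sizes in MB per cell, transcribed from the BCH listing above.
# Each cell shows 4 rows x 2 columns = 8 echelons. Names are illustrative.
bch_echelons_mb = {
    0: [8192] * 8,
    1: [8192] * 8,
    2: [4096] * 8,
}

# Per-cell totals and the partition-wide total.
cell_totals_mb = {cell: sum(sizes) for cell, sizes in bch_echelons_mb.items()}
partition_total_mb = sum(cell_totals_mb.values())

for cell, total in sorted(cell_totals_mb.items()):
    print(f"cell {cell}: {total} MB")
print(f"partition total: {partition_total_mb} MB")
```

The per-cell sums add up to the 163840 MB that BCH reports as Partition Total Memory, with 65536 MB on cell 0.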


OUTPUT OF STM

Version C.58.00

-- Information Tool Log for MEMORY on path 0/5 --

Log creation time: Sun May 8 03:52:10 2011

Hardware path: 0/5


Basic Memory Description

Module Type: MEMORY
Total Configured Memory : 159744 MB
Page Size: 4096 Bytes

Memory interleaving is supported on this machine and is ON.

Memory Board Inventory

CAB/CELL: 0/0

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 61440
-------------------------------------------------

CAB/CELL: 0/1

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 65536
-------------------------------------------------

CAB/CELL: 0/2

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 2048 2048
1 2048 2048
2 2048 2048
3 2048 2048
4 2048 2048
5 2048 2048
6 2048 2048
7 2048 2048

Cell Total (MB): 32768
-------------------------------------------------

System Total (MB): 159744

Memory Error Log Summary

Error
CAB/CELL DIMM Error Address Error Type Page Count
------------- ----------------- ---------- --------- -----
0/1 1b 0x00000008f5b8b200 Single-Bit 0x08f5b8b 1
0/1 6a 0x00000018014bcd80 Single-Bit 0x18014bc 1
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b 0
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d 0

Page Deallocation Table (PDT)

CAB/CELL Slot/Set Error Address Error Type Error Page
----------------- -------------- ---------- ---------
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b

PDT Entries Used: 2
PDT Entries Free: 198
PDT Total Size: 200

=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=

-- Information Tool Log for MEMORY on path 1/5 --

Log creation time: Sun May 8 03:52:10 2011

Hardware path: 1/5


Basic Memory Description

Module Type: MEMORY
Total Configured Memory : 159744 MB
Page Size: 4096 Bytes

Memory interleaving is supported on this machine and is ON.

Memory Board Inventory

CAB/CELL: 0/0

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 61440
-------------------------------------------------

CAB/CELL: 0/1

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 65536
-------------------------------------------------

CAB/CELL: 0/2

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 2048 2048
1 2048 2048
2 2048 2048
3 2048 2048
4 2048 2048
5 2048 2048
6 2048 2048
7 2048 2048

Cell Total (MB): 32768
-------------------------------------------------

System Total (MB): 159744

Memory Error Log Summary

Error
CAB/CELL DIMM Error Address Error Type Page Count
------------- ----------------- ---------- --------- -----
0/1 1b 0x00000008f5b8b200 Single-Bit 0x08f5b8b 1
0/1 6a 0x00000018014bcd80 Single-Bit 0x18014bc 1
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b 0
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d 0


Page Deallocation Table (PDT)

CAB/CELL Slot/Set Error Address Error Type Error Page
----------------- -------------- ---------- ---------
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b

PDT Entries Used: 2
PDT Entries Free: 198
PDT Total Size: 200

=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=

-- Information Tool Log for MEMORY on path 2/5 --

Log creation time: Sun May 8 03:52:10 2011

Hardware path: 2/5


Basic Memory Description

Module Type: MEMORY
Total Configured Memory : 159744 MB
Page Size: 4096 Bytes

Memory interleaving is supported on this machine and is ON.

Memory Board Inventory

CAB/CELL: 0/0

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 61440
-------------------------------------------------

CAB/CELL: 0/1

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 4096 4096
1 4096 4096
2 4096 4096
3 4096 4096
4 4096 4096
5 4096 4096
6 4096 4096
7 4096 4096

Cell Total (MB): 65536
-------------------------------------------------

CAB/CELL: 0/2

DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 2048 2048
1 2048 2048
2 2048 2048
3 2048 2048
4 2048 2048
5 2048 2048
6 2048 2048
7 2048 2048

Cell Total (MB): 32768
-------------------------------------------------

System Total (MB): 159744

Memory Error Log Summary

Error
CAB/CELL DIMM Error Address Error Type Page Count
------------- ----------------- ---------- --------- -----
0/1 1b 0x00000008f5b8b200 Single-Bit 0x08f5b8b 1
0/1 6a 0x00000018014bcd80 Single-Bit 0x18014bc 1
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b 0
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d 0

Page Deallocation Table (PDT)

CAB/CELL Slot/Set Error Address Error Type Error Page
----------------- -------------- ---------- ---------
0/1 6a/b 0x000000166909d4c0 Single-Bit 0x166909d
0/2 6a/b 0x000000207428b600 Single-Bit 0x207428b

PDT Entries Used: 2
PDT Entries Free: 198
PDT Total Size: 200
-- Information Tool Log for each selected
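
One detail in the STM logs above is that the inventory for cell 0/0 lists all sixteen DIMMs at 4096 MB, yet the Cell Total line is 61440 MB. The sketch below, in Python with hand-transcribed values and illustrative names, compares what each inventory table lists against the total STM prints for that cell.

```python
# Per cell: (DIMMs listed, size of each in MB, "Cell Total (MB)" as printed by STM).
# Values are transcribed from the STM inventory above; names are illustrative.
stm_cells = {
    "0/0": (16, 4096, 61440),
    "0/1": (16, 4096, 65536),
    "0/2": (16, 2048, 32768),
}
bch_partition_total_mb = 163840  # "Partition Total Memory" from BCH
stm_system_total_mb = 159744     # "System Total (MB)" from STM

def shortfalls(cells):
    """Return {cell: listed_mb - reported_mb} for cells where the two differ."""
    result = {}
    for cell, (count, size_mb, reported_mb) in cells.items():
        listed_mb = count * size_mb
        if listed_mb != reported_mb:
            result[cell] = listed_mb - reported_mb
    return result

print("cells with a mismatch:", shortfalls(stm_cells))
print("BCH total minus STM total:", bch_partition_total_mb - stm_system_total_mb, "MB")
```

Both differences come out to 4096 MB, the size of one DIMM on cell 0, so the discrepancy is in STM's totals rather than in the DIMMs it lists.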
7 REPLIES
Dennis Handly
Acclaimed Contributor

Re: Possible Memory Issue on 8420

>swapinfo also shows less total memory than what is in the system but dmesg shows the correct amount.

swapinfo(1m) only shows pseudo-swap, not memory size.
What HP-UX version are you using?
Paul Coffey_1
Advisor

Re: Possible Memory Issue on 8420

11.23
Emil Velez
Honored Contributor

Re: Possible Memory Issue on 8420

Make sure the extended memory tests are enabled at boot.
Paul Coffey_1
Advisor

Re: Possible Memory Issue on 8420

How do I enable extended memory tests?

It will be several months before this system can be rebooted and the extended tests run.
Paul Coffey_1
Advisor

Re: Possible Memory Issue on 8420

According to syslog, the system saw the correct amount of memory at boot, and parstatus reports all of the memory. I thought this might be an interleaving issue, but there is only one partition on this server. It looks like an STM problem to me.


May 8 05:07:01 vmunix: Memory Information:
May 8 05:07:01 vmunix: physical page size = 4096 bytes, logical page size = 4096 bytes
May 8 05:07:01 vmunix: Physical: 163492864 Kbytes, lockable: 128700440 Kbytes, available: 148709816 Kbytes
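
It may be worth checking the units here, since the syslog line is in Kbytes while BCH and STM report MB. The following is a minimal Python sketch of that conversion, assuming 1 MB = 1024 Kbytes; the variable and function names are illustrative, and the comparison values are copied from the BCH and swapinfo output earlier in the thread.

```python
KB_PER_MB = 1024  # assumption: "Kbytes" in the syslog line means 1024-byte units

syslog_physical_kb = 163492864   # "Physical:" value from the vmunix line above
bch_partition_total_mb = 163840  # "Partition Total Memory" from BCH
swapinfo_memory_mb = 159661      # AVAIL on the swapinfo "memory" row

def kb_to_mb(kb):
    """Convert Kbytes to whole MB using integer division."""
    return kb // KB_PER_MB

syslog_physical_mb = kb_to_mb(syslog_physical_kb)

print(f"syslog physical: {syslog_physical_mb} MB")
print(f"BCH minus syslog: {bch_partition_total_mb - syslog_physical_mb} MB")
print(f"matches swapinfo memory row: {syslog_physical_mb == swapinfo_memory_mb}")
```

Under that assumption the kernel's physical figure works out to 159661 MB, which equals the Physmem value in the dmesg output and the swapinfo memory row, and is about 4 GB below the 163840 MB that BCH reports.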


# parstatus -c 0 -V
[Cell]
Hardware Location : cab0,cell0
Global Cell Number : 0
Actual Usage : Active Core
Normal Usage : Base
Connected To : cab0,bay0,chassis0
Core Cell Capable : yes
Firmware Revision : 24.4
Failure Usage : Normal
Use On Next Boot : yes
Partition Number : 0
Partition Name : g01
Requested CLM value : 0.0 GB
Allocated CLM value : 0.0 GB
Cell Architecture Type : PA-RISC
CPU Compatibility : BCE-0

[CPU Details]

Type : 88D0
Speed : 1000 MHz
CPU Status
=== ======
0 OK
1 OK
2 OK
3 OK
4 OK
5 OK
6 OK
7 OK

CPUs
===========
OK : 8
Deconf : 0
Max : 8

[Memory Details]

DIMM Size (MB) Status
==== ========= =========
0A 4096 OK
4A 4096 OK
0B 4096 OK
4B 4096 OK
1A 4096 OK
5A 4096 OK
1B 4096 OK
5B 4096 OK
2A 4096 OK
6A 4096 OK
2B 4096 OK
6B 4096 OK
3A 4096 OK
7A 4096 OK
3B 4096 OK
7B 4096 OK

Memory
=========================
DIMM OK : 16
DIMM Deconf : 0
Max DIMMs : 16
Memory OK : 64.00 GB
Memory Deconf : 0.00 GB


# parstatus -c 1 -V
[Cell]
Hardware Location : cab0,cell1
Global Cell Number : 1
Actual Usage : Active
Normal Usage : Base
Connected To : cab0,bay0,chassis1
Core Cell Capable : yes
Firmware Revision : 24.4
Failure Usage : Normal
Use On Next Boot : yes
Partition Number : 0
Partition Name : g01
Requested CLM value : 0.0 GB
Allocated CLM value : 0.0 GB
Cell Architecture Type : PA-RISC
CPU Compatibility : BCE-0

[CPU Details]

Type : 88D0
Speed : 1000 MHz
CPU Status
=== ======
0 OK
1 OK
2 OK
3 OK
4 OK
5 OK
6 OK
7 OK

CPUs
===========
OK : 8
Deconf : 0
Max : 8

[Memory Details]

DIMM Size (MB) Status
==== ========= =========
0A 4096 OK
4A 4096 OK
0B 4096 OK
4B 4096 OK
1A 4096 OK
5A 4096 OK
1B 4096 OK
5B 4096 OK
2A 4096 OK
6A 4096 OK
2B 4096 OK
6B 4096 OK
3A 4096 OK
7A 4096 OK
3B 4096 OK
7B 4096 OK

Memory
=========================
DIMM OK : 16
DIMM Deconf : 0
Max DIMMs : 16
Memory OK : 64.00 GB
Memory Deconf : 0.00 GB


# parstatus -c 2 -V
[Cell]
Hardware Location : cab0,cell2
Global Cell Number : 2
Actual Usage : Active
Normal Usage : Base
Connected To : -
Core Cell Capable : no
Firmware Revision : 24.4
Failure Usage : Normal
Use On Next Boot : yes
Partition Number : 0
Partition Name : g01
Requested CLM value : 0.0 GB
Allocated CLM value : 0.0 GB
Cell Architecture Type : PA-RISC
CPU Compatibility : BCE-0

[CPU Details]

Type : 88D0
Speed : 1000 MHz
CPU Status
=== ======
0 OK
1 OK
2 OK
3 OK
4 OK
5 OK
6 OK
7 OK

CPUs
===========
OK : 8
Deconf : 0
Max : 8

[Memory Details]

DIMM Size (MB) Status
==== ========= =========
0A 2048 OK
4A 2048 OK
0B 2048 OK
4B 2048 OK
1A 2048 OK
5A 2048 OK
1B 2048 OK
5B 2048 OK
2A 2048 OK
6A 2048 OK
2B 2048 OK
6B 2048 OK
3A 2048 OK
7A 2048 OK
3B 2048 OK
7B 2048 OK

Memory
=========================
DIMM OK : 16
DIMM Deconf : 0
Max DIMMs : 16
Memory OK : 32.00 GB
Memory Deconf : 0.00 GB
Andrew Rutter
Honored Contributor

Re: Possible Memory Issue on 8420

Paul,

I would tend to agree with you that it looks like a Support Tools bug.

Your version is a few releases behind, so if you have the time, it would be worthwhile looking through the release notes for the later versions to see if anything crops up.

http://docs.hp.com/en/diag/stm/stm_upd.htm

Also, if you have a quieter time on the system, you could always run the exercise tool or the verify tool on the memory and see what that comes back with.

Andy
Paul Coffey_1
Advisor

Re: Possible Memory Issue on 8420

Still working on getting a maintenance window. I'll post back what I find. Thanks for all the assistance.