07-16-2010 04:19 PM
memory dimm problem rx7640
Hi,
We have an rx7640 server that freezes very frequently. It leaves no crash dump; even if we do a TOC from the MP, it doesn't leave MCA files in the /var/tombstones directory.
We had some DIMM errors in STM, so we replaced the DIMMs with errors: 1A, 1B, 3A, 3B.
After a PDT clear, we did a reset and booted the server. The problem is that the errors are still in STM:
===========================================================================
Memory Error Log Summary
DIMM Location          Error Address    Error Type Page          Count
---------------------- ---------------- ---------- ------------- -----
Cab 0 Cell 1 DIMM 1A   0x372b1c500      Single-Bit 0x372b1c      1
Cab 0 Cell 1 DIMM 1A   0x13f1a4100      Single-Bit 0x13f1a4      1
Cab 0 Cell 1 DIMM 1A   0x11e2a7500      Single-Bit 0x11e2a7      1
Cab 0 Cell 1 ECHELON 3 0x107b46000      Multi-Bit  0x107b46      N/A
Cab 0 Cell 1 DIMM 1A   0x40ff1a2100     Single-Bit 0x40ff1a2     1
Cab 0 Cell 1 DIMM 1A   0x40fdea6500     Single-Bit 0x40fdea6     1
Cab 0 Cell 1 DIMM 1A   0x40df1ae100     Single-Bit 0x40df1ae     1
Cab 0 Cell 1 DIMM 1A   0x40df1a4500     Single-Bit 0x40df1a4     1
Cab 0 Cell 1 DIMM 1A   0x40d231ad00     Single-Bit 0x40d231a     1
Cab 0 Cell 1 DIMM 1A   0x3dea0100       Single-Bit 0x3dea0       1
Cab 0 Cell 1 DIMM 1A   0x3231c100       Single-Bit 0x3231c       1
System start: Wed Jul 14 01:15:13 2010.
Last error detected: Wed Jul 14 01:15:13 2010.
Logging interval: 900 seconds.
11 address(es) with errors logged in memory error log.
The Logtool Utility provides full details about the memory error log.
Page Deallocation Table (PDT)
DIMM Location          Error Address    Error Type Page
---------------------- ---------------- ---------- -------------
Cab 0 Cell 1 ECHELON 3 0x107b46000      Multi-Bit  0x107b46
PDT Entries Used: 1
PDT Entries Free: 199
PDT Total Size: 200
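For anyone reading a log like this, a rough Python sketch (not part of STM; the regular expression and the function names `parse_rows` and `summarize` are my own assumptions based only on the rows shown above) can tally the entries per location and error type. It also checks that, in these rows, the Page column is the error address shifted right by 12 bits.

```python
import re
from collections import Counter

# Hypothetical pattern matching the row layout shown in the STM output above.
ROW = re.compile(
    r"Cab\s+(?P<cab>\d+)\s+Cell\s+(?P<cell>\d+)\s+"
    r"(?P<kind>DIMM|ECHELON)\s+(?P<unit>\w+)\s+"
    r"(?P<addr>0x[0-9a-fA-F]+)\s+(?P<etype>Single-Bit|Multi-Bit)\s+"
    r"(?P<page>0x[0-9a-fA-F]+)"
)

SAMPLE = """\
Cab 0 Cell 1 DIMM 1A 0x372b1c500 Single-Bit 0x372b1c 1
Cab 0 Cell 1 ECHELON 3 0x107b46000 Multi-Bit 0x107b46 N/A
Cab 0 Cell 1 DIMM 1A 0x3dea0100 Single-Bit 0x3dea0 1
"""

def parse_rows(text):
    """Return one dict per log row; non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.search(line)
        if m:
            row = m.groupdict()
            row["addr"] = int(row["addr"], 16)
            row["page"] = int(row["page"], 16)
            rows.append(row)
    return rows

def summarize(rows):
    """Count rows per (DIMM or ECHELON, unit, error type)."""
    return Counter((r["kind"], r["unit"], r["etype"]) for r in rows)

if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for key, count in summarize(rows).items():
        print(key, count)
    # In the rows above, page == address >> 12.
    print(all(r["page"] == r["addr"] >> 12 for r in rows))
```

Multi-bit entries are the ones worth singling out first, since in this log only the multi-bit error ended up in the PDT.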
I have several questions:
Do I have to run any more commands to get rid of the old memory errors in STM?
What does Cab 0 Cell 1 ECHELON 3 mean? DIMMs 3A and 3B?
If I replaced the DIMMs and still get the same errors on the same DIMMs, could it be the cell board?
In the FPL I have:
65126 SFW 0,1,0 0 03000f1d10e00000 0000000000000003 SAL_INFO_REC_CLEAR
65125 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65124 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65123 SFW 0,1,0 0 03000f1d10e00000 0000000000000003 SAL_INFO_REC_CLEAR
65122 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65121 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65120 SFW 0,1,0 0 03000f1d10e00000 0000000000000003 SAL_INFO_REC_CLEAR
65119 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65118 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65117 SFW 0,1,0 0 03000f1d10e00000 0000000000000003 SAL_INFO_REC_CLEAR
65116 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65115 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65114 SFW 0,1,4 0 03000f1d14e00000 0000000000000000 SAL_INFO_REC_CLEAR
65113 SFW 0,1,4 1 2e000a6314e00000 0000000000000000 NO_UNCONSUMED_LOGS_FOUND
65112 SFW 0,1,6 0 03000f1d16e00000 0000000000000000 SAL_INFO_REC_CLEAR
65111 SFW 0,1,6 1 2e000a6316e00000 0000000000000000 NO_UNCONSUMED_LOGS_FOUND
65110 SFW 0,1,2 0 03000f1d12e00000 0000000000000000 SAL_INFO_REC_CLEAR
65109 SFW 0,1,2 1 2e000a6312e00000 0000000000000000 NO_UNCONSUMED_LOGS_FOUND
65108 SFW 0,1,0 0 0300122810e00000 0000000000000000 SAL_SET_VECTORS_CS1
65107 SFW 0,1,0 0 0300122710e00000 0000000000000002 SAL_SET_VECTORS_TYPE
65106 214c40e4da020000 ff0f066f001f0300 IPMI Type-02 Event
65106 07/16/2010 23:01:46
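To see which location keeps logging which event, the FPL lines can be counted with a short Python sketch. The field names (`log_id`, `source`, `location`, `level`, `data0`, `data1`, `keyword`) are my own guesses at the column meanings, not taken from HP documentation, and `tally_events` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed layout: id, source, location triple, level, two 64-bit hex words, keyword.
EVENT = re.compile(
    r"(?P<log_id>\d+)\s+(?P<source>[A-Z]+)\s+(?P<location>\d+,\d+,\d+)\s+"
    r"(?P<level>\d+)\s+(?P<data0>[0-9a-f]{16})\s+(?P<data1>[0-9a-f]{16})\s+"
    r"(?P<keyword>[A-Z0-9_]+)"
)

FPL_SAMPLE = """\
65126 SFW 0,1,0 0 03000f1d10e00000 0000000000000003 SAL_INFO_REC_CLEAR
65125 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65124 SFW 0,1,0 0 03000f1c10e00000 0000013000000003 SAL_INFO_REC
65113 SFW 0,1,4 1 2e000a6314e00000 0000000000000000 NO_UNCONSUMED_LOGS_FOUND
65106 214c40e4da020000 ff0f066f001f0300 IPMI Type-02 Event
"""

def tally_events(text):
    """Count events per (location, keyword); lines in other formats are ignored."""
    counts = Counter()
    for line in text.splitlines():
        m = EVENT.search(line)
        if m:
            counts[(m["location"], m["keyword"])] += 1
    return counts

if __name__ == "__main__":
    for (location, keyword), n in tally_events(FPL_SAMPLE).items():
        print(location, keyword, n)
```

In the excerpt above, the repeating SAL_INFO_REC / SAL_INFO_REC_CLEAR pattern is all on location 0,1,0.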
Thanks
Windows?, no thanks
2 REPLIES
07-17-2010 07:48 PM
Re: memory dimm problem rx7640
Hi,
==> What does Cab 0 Cell 1 ECHELON 3 mean? DIMMs 3A and 3B?
Yes.
The newer memory term "echelon" replaces "rank".
An echelon refers to the smallest allocatable memory unit.
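As a small illustration of that naming, here is a Python sketch that turns an echelon number into the slot labels to pull. It assumes, as confirmed above for echelon 3, that echelon N corresponds to slots NA and NB; `echelon_to_slots` is a hypothetical helper, and the pairing should be checked against the rx7640 service documentation before swapping hardware.

```python
from typing import List

def echelon_to_slots(echelon: int, dimms_per_echelon: int = 2) -> List[str]:
    """Map an echelon number to DIMM slot labels.

    Assumption from this thread: echelon N is the DIMM pair NA and NB.
    """
    if echelon < 0:
        raise ValueError("echelon must be non-negative")
    if not 1 <= dimms_per_echelon <= 26:
        raise ValueError("dimms_per_echelon must be between 1 and 26")
    # Slot letters start at 'A' and run upward, one per DIMM in the echelon.
    letters = [chr(ord("A") + i) for i in range(dimms_per_echelon)]
    return [f"{echelon}{letter}" for letter in letters]

if __name__ == "__main__":
    print(echelon_to_slots(3))
```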
==> If I changed the DIMMs and still get the same errors on the same DIMMs, could it be the cell board?
Do you have a chance to interchange the modules with ones from other cells? If you still face the issue after that, you may need to replace the cell board.
Regards,
Sooraj
"UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity" - Dennis Ritchie
07-18-2010 11:08 PM
Re: memory dimm problem rx7640
So when you have an error on echelon 3, like this multi-bit error:
DIMM Location          Error Address    Error Type Page
---------------------- ---------------- ---------- -------------
Cab 0 Cell 1 ECHELON 3 0x107b46000      Multi-Bit  0x107b46
you should replace the DIMMs in slots 3A and 3B?
We finally replaced the cell and 2 CPUs, because the machine still kept freezing after we replaced the memory.
Windows?, no thanks