SuperDome Hardware
03-19-2003 11:57 PM
I suppose the memory on the cell boards is globally available. In case of a memory fault, even though the remaining memory may still be available to the system, does the system continue to run or does it reboot?
03-20-2003 02:26 AM
Re: SuperDome Hardware
It depends on what is executing on that DIMM at the time. If a kernel thread is running on the faulty DIMM, the system will panic and reboot.
Sunil
03-20-2003 02:55 AM
Re: SuperDome Hardware
Is that still the case?
03-20-2003 05:15 AM
Second, most HW on SDs is hot swappable.
Third, what does "parstatus" indicate?
parstatus -A -C (* all available cells *)
parstatus -P (* partitions *)
Fourth, what does dmesg from the partition indicate your physical memory is?
dmesg | grep -i phy (* cross reference with what should be in there *)
Fifth, what does your virtual front panel say? Any faults indicated? Especially upon boot up?
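The dmesg cross-check above can be scripted. What follows is only a sketch: it assumes the boot messages contain a line of the form "Physical: <n> Kbytes, ...", and the function names phys_kb and check_mem as well as the sample sizes are made up for illustration. Adjust the pattern if your dmesg output looks different.

```shell
#!/bin/sh
# Extract the physical memory size (in Kbytes) from dmesg-style text on stdin.
# Assumption: the text contains a line like "Physical: <n> Kbytes, ...".
phys_kb() {
    sed -n 's/.*[Pp]hysical: *\([0-9][0-9]*\) *Kbytes.*/\1/p' | head -n 1
}

# Compare the size found on stdin with the size you expect (both in Kbytes).
# Returns 0 if enough memory is reported, 1 if it is short, 2 if no line matched.
check_mem() {
    expected_kb=$1
    actual_kb=$(phys_kb)
    if [ -z "$actual_kb" ]; then
        echo "UNKNOWN: no physical memory line found"
        return 2
    elif [ "$actual_kb" -lt "$expected_kb" ]; then
        echo "SHORT: expected ${expected_kb} KB, found ${actual_kb} KB"
        return 1
    else
        echo "OK: ${actual_kb} KB"
    fi
}

# On a live partition you would run:  dmesg | check_mem <expected Kbytes>
# Sample input (made-up values) so the sketch runs anywhere:
printf '    Physical: 8388608 Kbytes, lockable: 6291456 Kbytes, available: 7340032 Kbytes\n' \
    | check_mem 16777216 || true
```

A SHORT result would suggest that some memory was deconfigured at boot, which is worth comparing against what parstatus and the virtual front panel report.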
03-20-2003 05:22 AM
Re: SuperDome Hardware
I meant reboot the cell not the superdome :-)
03-20-2003 09:12 PM
Re: SuperDome Hardware
I would like to rephrase my statement.
In case of a cell board failure, the system reboots and comes up with the other cell board within the partition or system. What I am trying to figure out is this: if an individual CPU or one of the memory DIMMs fails within a cell board, does the system isolate these components and continue to run, or does it reboot with the remaining working cell board?
It's clear that the Superdome does not function like a non-stop mainframe but more like an HA system. I am trying to understand how much redundancy can be built into it.
Your answers have given me a clearer idea of the architecture, but I would appreciate any links to articles that discuss the Superdome architecture in detail.
03-20-2003 11:26 PM
Re: SuperDome Hardware
As I said in my earlier post, whether the server reboots or recovers from the error depends entirely on the instruction or process thread running on the defective part at that moment. In HP terms, if the system receives an LPMC (low priority machine check) it can recover, but if it is an HPMC (high priority machine check) it cannot. Suppose a real-time process is running on cpu1 when that CPU fails: the system will panic. If a regular process such as an application process is running, only that process will crash and dump core.
Sunil
03-21-2003 12:02 AM
Re: SuperDome Hardware
http://www.hp.com/products1/servers/scalableservers/superdome/infolibrary/index.html
Look for examples using Superdomes and Serviceguard. If one cell does receive an HPMC and panics, Serviceguard can be used to switch the application to another cell or even another Superdome.
03-21-2003 06:46 AM
Re: SuperDome Hardware
Can you run parmgr and get the partition configuration where the total memory is listed?
You have a Genesis cell which is the largest on the SD and located in partition 0, cell 0 in cabinet 0. User 1 has access to this cell. What is your GSP user id?
Also, SDs will HPMC, so check for the /var/tombstones/ts99 file as well. Let me know if it's there.
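A quick way to check for the tombstone file is sketched below. The function name check_tombstone is made up for illustration. The default path is the one mentioned above, and you can pass a different path as the first argument.

```shell
#!/bin/sh
# Report whether a tombstone file exists and is non-empty.
# Default path is the one named in this post; pass another path to override.
check_tombstone() {
    ts_file=${1:-/var/tombstones/ts99}
    if [ -s "$ts_file" ]; then
        echo "found: $ts_file"
        ls -l "$ts_file"    # the timestamp shows when the last entry was written
    else
        echo "missing or empty: $ts_file"
        return 1
    fi
}

check_tombstone || true
```

If the file is there, its modification time is a useful first clue as to whether the HPMC is recent or left over from an older event.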
Learn these GSP commands for SD and check the GSP logs:
control b
login
co - console (* list of partitions *)
cm - command menu
cl - console logs (* GSP>cl *)
sl - chassis logs (* GSP>sl>e>return *)
vfp - virtual front panel (* GSP>vfp>partition #>E>errors since boot *)
who - list of logged in users
x - exit