Crashdump and HeartBeat with MC/ServiceGuard
06-12-2003 08:45 PM
Our system runs Oracle 9i RAC (Real Application Clusters) with MC/ServiceGuard 11.14 as a two-node cluster. We have two heartbeat LANs, each connected with a cross cable (1000Base-SX).
When we tested heartbeat failure by disconnecting the heartbeat LANs, Node2 went down.
So far that seemed fine.
However, we found that /var/adm/crash was broken after Node2 booted.
In more detail, mounting /dev/vg00/lvol10 on /var/adm/crash failed.
We had to run "newfs -F vxfs /dev/vg00/rlvol10", after which the mount succeeded.
We have never faced such a case before.
We repeated the same test four times, and the result was the same every time we disconnected the heartbeat LANs.
We cannot find out what is wrong.
Thanks.
06-12-2003 08:57 PM
1) Is /var/adm/crash a separate FS? Or is it just a directory under the /var file system?
2) Did you check the crash file (crash analysis) in /var/adm/crash? Does it say anything? Is it too big to fit in the FS?
3) Post the exact error message you are getting.
06-12-2003 08:57 PM
Can you post your cluster configuration script so we can see how you have configured the standby heartbeat?
Cheers
Rajeev
06-12-2003 10:28 PM
I'm sorry that my question did not include enough information.
Here are my answers, with some more detail.
1. Is /var/adm/crash a separate FS?
Yes it is.
/dev/vg00/lvol9 /var
/dev/vg00/lvol10 /var/adm/crash
and "/var/adm/crash" is the same size as physical memory.
2. Error message
When we checked /etc/rc.log, we found the following messages.
Save system crash dump if needed
Output from "/sbin/rc1.d/S440savecrash start":
----------------------------
savecrash directory not set; defaulting to: /var/adm/crash
savecrash: savecrash running in the background
EXIT CODE: 4 - savecrash proceeding in background
"/sbin/rc1.d/S440savecrash start" FAILED
However, we checked "lvlnboot -v", and it showed no obvious problem.
# lvlnboot -v
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
/dev/dsk/c1t0d0 (0/0/1/1.0.0) -- Boot Disk
/dev/dsk/c2t0d0 (0/0/2/0.0.0) -- Boot Disk
Boot: lvol1 on: /dev/dsk/c1t0d0
/dev/dsk/c2t0d0
Root: lvol3 on: /dev/dsk/c1t0d0
/dev/dsk/c2t0d0
Swap: lvol2 on: /dev/dsk/c1t0d0
/dev/dsk/c2t0d0
Dump: lvol10 on: /dev/dsk/c1t0d0, 0
3. Cluster configuration script
NODE_NAME node1
NETWORK_INTERFACE lan1
HEARTBEAT_IP 172.16.247.185
NETWORK_INTERFACE lan4
HEARTBEAT_IP 172.16.247.189
NETWORK_INTERFACE lan5
STATIONARY_IP 172.16.247.5
FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t0d0
NODE_NAME node2
NETWORK_INTERFACE lan1
HEARTBEAT_IP 172.16.247.186
NETWORK_INTERFACE lan4
HEARTBEAT_IP 172.16.247.190
NETWORK_INTERFACE lan5
STATIONARY_IP 172.16.247.6
We do not have an alternate data LAN.
Our customer approved this.
As for the crash dump, there were so many files and directories that we could not tell which messages we should show.
Thanks.
06-12-2003 11:43 PM
We often get questions asking whether Crossover cables are supported for use in a ServiceGuard cluster. The short answer is YES, but there are some important issues that you should be aware of:
This solution only works in a two-node cluster. There is no way to have a Standby LAN card when using a Crossover LAN cable.
When either LAN card fails, or the crossover cable is disconnected, both LAN cards go down. This is because the electrical signals necessary for the cards to determine that a valid LAN connection exists are not present. The result is that since both nodes appear to have a bad LAN card, ServiceGuard may TOC the wrong node. If a hub was used between the two LAN cards, then the hub would provide the electrical signals to the other card, allowing it to stay up.
On multi-speed cards, such as 10/100Base-T, the cards must negotiate which speed will be used when the system boots up. If only one system is booted and the remote system is down, then the negotiation will fail, and the card will not be enabled at all. So when the second node eventually comes up, its LAN will also be down. If a hub is used, then the negotiation will succeed, so the LAN cards will come up at bootup, even if only one node is running.
It may be possible to force some multi-speed LAN cards to bypass the negotiation at bootup and to use a predetermined fixed speed. If this is possible, then it would allow the two systems to boot up at different times and still use the Crossover cable connected LAN cards once they are both booted up.
Since both cards may go down when there is a failure when a Crossover cable is used, it can be difficult to determine where the problem lies. Another problem with Crossover cables is that if they are not properly labeled, they may accidentally be used in situations where they will not work.
For the reasons listed above, HP does not recommend using Crossover cables for ServiceGuard configurations. However, they are still supported as long as you are willing to accept the above limitations. Using a Crossover cable is cheaper than using a hub, but it compromises the HA solution.
06-12-2003 11:45 PM
Please post your netstat -in output.
Also, you say the problem is the same every time. Which problem? That a node crashes, or that it loses the file system?
06-13-2003 12:08 AM
Your problem is clearly not related to ServiceGuard. You would have the same disaster with any TOC or PANIC. To get a dump you need:
- a dump volume, used as a raw device at crash time to receive a copy of memory. This can be a swap device, but that could slow down the reboot, so I prefer to use a dedicated lvol
- a filesystem to receive the image as files at reboot
These devices CAN'T be the same. Each TOC uses your /dev/vg00/lvol10 as the dump device (and destroys the filesystem), so at reboot the dump can't be saved.
I think you configured /var/adm/crash after running lvlnboot -d; otherwise it would have been refused.
The solution is to create 2 separate lvols or to configure your swap as the dump device.
Hope this helps
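The conflict described above (one lvol serving as both the dump device and a mounted filesystem) can be checked mechanically. The following is only a minimal Python sketch: the sample text is trimmed from the "lvlnboot -v" output posted earlier in this thread, the mount table is typed in by hand from the same post, and the function names dump_lvols and dump_fs_conflicts are hypothetical helpers, not part of any HP-UX tool.

```python
import re

# Trimmed from the "lvlnboot -v" output posted in this thread.
LVLNBOOT_OUTPUT = """\
Boot: lvol1 on: /dev/dsk/c1t0d0
Root: lvol3 on: /dev/dsk/c1t0d0
Swap: lvol2 on: /dev/dsk/c1t0d0
Dump: lvol10 on: /dev/dsk/c1t0d0, 0
"""

# Device -> mount point, copied by hand from the same post.
MOUNTS = {
    "/dev/vg00/lvol9": "/var",
    "/dev/vg00/lvol10": "/var/adm/crash",
}


def dump_lvols(lvlnboot_text, vg="/dev/vg00"):
    """Return the full device path of every 'Dump:' entry in the text."""
    pattern = re.compile(r"^\s*Dump:\s+(\S+)\s+on:", re.MULTILINE)
    return [f"{vg}/{m.group(1)}" for m in pattern.finditer(lvlnboot_text)]


def dump_fs_conflicts(lvlnboot_text, mounts, vg="/dev/vg00"):
    """Return {device: mount point} for dump lvols that also hold a filesystem."""
    return {dev: mounts[dev] for dev in dump_lvols(lvlnboot_text, vg) if dev in mounts}


print(dump_fs_conflicts(LVLNBOOT_OUTPUT, MOUNTS))
# prints {'/dev/vg00/lvol10': '/var/adm/crash'}
```

An empty result would mean no dump lvol is also listed as a mounted filesystem; here lvol10 is reported, which matches the diagnosis above.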
06-13-2003 12:12 AM
We have already recommended that the heartbeat LANs be connected through a hub and that an alternate data LAN be prepared.
However, our customer said they could NOT pay for it and did not approve it.
Regarding "netstat -in": we separate the network segments by subnet mask, as shown below.
lan1 and lan4 are the heartbeat LANs.
------ Output of netstat -in (node1) ------
Name    Mtu   Network         Address         Ipkts    Opkts
lan3    1500  172.16.247.176  172.16.247.177  371915   401603
lan2    1500  172.16.247.32   172.16.247.36   623902   550509
lan5:1  1500  172.16.247.0    172.16.247.8    0        0
lan9    1500  172.16.247.180  172.16.247.181  371090   400735
lan1    1500  172.16.247.184  172.16.247.185  587419   532177
lan0    1500  172.16.247.64   172.16.247.78   407238   442279
lo0     4136  127.0.0.0       127.0.0.1       89712    89712
lan5    1500  172.16.247.0    172.16.247.5    1115549  855003
lan4    1500  172.16.247.188  172.16.247.189  587938   532532
Furthermore, "every time" means
"whenever we pull out both heartbeat LAN cables".
Sorry for not giving enough information.
Thanks.
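As a cross-check of the segment separation described above, the heartbeat addresses from the posted cluster configuration can be grouped by subnet. This is a minimal Python sketch, not a ServiceGuard tool. The /30 prefix is an assumption: the actual subnet mask was not posted, and /30 is only inferred from the Network column of the netstat output (172.16.247.184 and 172.16.247.188). The helper names are hypothetical.

```python
import ipaddress

# HEARTBEAT_IP values from the cluster configuration posted in this thread.
HEARTBEAT_IPS = {
    "node1": {"lan1": "172.16.247.185", "lan4": "172.16.247.189"},
    "node2": {"lan1": "172.16.247.186", "lan4": "172.16.247.190"},
}

# ASSUMPTION: the mask is not shown in the thread; /30 is inferred, not confirmed.
ASSUMED_PREFIX = 30


def subnet_of(ip, prefix=ASSUMED_PREFIX):
    """Return the network that contains ip under the assumed prefix length."""
    return ipaddress.ip_interface(f"{ip}/{prefix}").network


def mismatched_lans(ips, prefix=ASSUMED_PREFIX):
    """Return the interfaces whose heartbeat IPs are not on one subnet across nodes."""
    per_lan = {}
    for lans in ips.values():
        for lan, ip in lans.items():
            per_lan.setdefault(lan, set()).add(subnet_of(ip, prefix))
    return sorted(lan for lan, nets in per_lan.items() if len(nets) != 1)


for lan in ("lan1", "lan4"):
    print(lan, subnet_of(HEARTBEAT_IPS["node1"][lan]))
print("mismatched:", mismatched_lans(HEARTBEAT_IPS))
```

Under the /30 assumption, each heartbeat interface pairs with its peer on the other node, and lan1 and lan4 sit on different subnets.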
06-13-2003 12:30 AM
Solution
Dump: lvol10 on: /dev/dsk/c1t0d0, 0
You have your dump device set to the lvol which you are using for the filesystem /var/adm/crash.
When HP-UX dumps core, it writes to a raw device, NOT to a file system. When the system reboots, the savecrash command takes care of moving the dump off the raw device to a file system (usually /var/adm/crash). What's happening is that when your system TOCs, it writes a crash dump to the raw lvol lvol10 and overwrites your file system headers and structure. So every time you get a crash, you're having to recreate the filesystem.
You should tell HP-UX to dump core to a raw device that does not hold a file system. Most people use swap for this, as there's nothing you need to keep in swap after a reboot, but you can create another raw logical volume to use solely as dump if you really want to. If you do create another device for use as dump, then remember that it MUST be contiguous and have bad block relocation turned OFF (that's the -C y -r N options on lvcreate). To tell HP-UX to use a different dump lvol, use lvrmboot and lvlnboot (you may need to do these in LVM maintenance mode). The following example sets the dump to go to swap, assuming this is a default install with swap in lvol2:
lvrmboot -v -d lvol10 /dev/vg00
lvlnboot -d /dev/vg00/lvol2
lvlnboot -R /dev/vg00
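For a layout with different lvol names, the same three commands can be generated rather than typed by hand. This is a dry-run sketch in Python: it only prints the command lines and executes nothing. The function name dump_move_commands is hypothetical, and the flags are exactly those shown in the commands above.

```python
def dump_move_commands(vg, old_dump_lv, new_dump_lv):
    """Build the command lines that move the dump definition to another lvol.

    vg           volume group path, e.g. /dev/vg00
    old_dump_lv  lvol name currently defined as dump, e.g. lvol10
    new_dump_lv  lvol name that should become the dump device, e.g. lvol2
    """
    return [
        ["lvrmboot", "-v", "-d", old_dump_lv, vg],  # remove the old dump definition
        ["lvlnboot", "-d", f"{vg}/{new_dump_lv}"],  # define the new dump lvol
        ["lvlnboot", "-R", vg],                     # same final step as in the post
    ]


# Dry run only: print the commands for review, do not execute them.
for argv in dump_move_commands("/dev/vg00", "lvol10", "lvol2"):
    print(" ".join(argv))
```

With the arguments shown, the printed lines are identical to the three commands above. Running "lvlnboot -v" afterwards, as earlier in this thread, shows which lvol the Dump entry now points to.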
I would also share Melvyn's concerns about your network config and the use of X-over cables, particularly as you've gone to the expense of using 9iRAC! A couple of switches/hubs are *very* cheap in comparison.
HTH
Duncan
I am an HPE Employee

06-13-2003 02:44 AM
Our team really appreciates it.
Best regards, and thank you.