bonding 10g rac failover
11-08-2004 03:22 AM
Hope someone has a solution:
I have the following:
3 dl380 g3
6 nics per machine
RedHat AS 3 update 2
bcm5700-7.1.9e-1
e1000-5.2.16b-1
bonding-1.0.4o-1
Oracle 10g RAC 10.1.0.2/10.1.0.3
NetApp FAS 270c
All NICs are bonded successfully and the system is stable. We are doing fault testing and are trying to mimic a complete loss of the bond that manages the NFS storage, or of the bond that manages the interconnect, so we pull both cables. The OS sees that the NICs have faulted, but the bond interface stays up and the database keeps running (or hangs). When we plug the cables back in, the database faults and the system reboots.
What we want is for the system to reboot as soon as all NICs in a bond are gone. I know the documentation says the bond interface stays up even when its NICs fail, but has anyone found a way to make the bond itself fail in that scenario?
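For reference, if your driver version exposes /proc/net/bonding/bond0, the per-slave state the OS sees is visible there. A quick sketch (the "Slave Interface:" / "MII Status:" layout is assumed from the usual 2.4-era bonding driver output, not taken from this thread) that counts live slaves:

```shell
#!/bin/bash
# Sketch only: count slaves reporting "MII Status: up" in the bonding
# driver's status file, e.g. /proc/net/bonding/bond0. The first
# "MII Status:" line describes the bond itself, so only the line that
# follows a "Slave Interface:" header is counted for each slave.
active_slaves() {
    awk '/^Slave Interface:/ { inslave = 1 }
         inslave && /^MII Status: up/ { n++; inslave = 0 }
         END { print n + 0 }' "$1"
}
```

A result of 0 from `active_slaves /proc/net/bonding/bond0` would be the trigger to down the bond (or reboot) from a cron job.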
Thanks.
Solved!
11-08-2004 07:10 AM
Re: bonding 10g rac failover
Here is the doc I used to bond two Intel NICs. Note: if you are not using an Intel card or another card whose driver supports bonding, it won't work, so check that your NIC explicitly supports bonding.
http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/ref-guide/s1-modules-ethernet.html
It may say it's bonded but still not work.
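The setup from that doc boils down to something like the following (bond0, eth0/eth1, the address, and the mode/miimon values are just examples; miimon enables the link monitoring that detects a dead slave):

```
# /etc/modules.conf
alias bond0 bonding
options bond0 mode=1 miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 (likewise for eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```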
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
11-08-2004 07:28 AM
Re: bonding 10g rac failover
The scenario is this:
bond0 has two NICs, eth0 and eth1.
eth0 fails and the bond fails over to eth1.
eth1 fails and there is no active NIC left in the bond, but bond0 does not fail at all, which causes the database to hang.
I have to bring the bond down manually, which then causes the server to reboot (as expected within the RAC environment).
So instead of bringing the bond down by hand, is there a parameter I can pass to the bonding driver, or some other setting, that makes the bond fail when all of its NICs are dead?
11-08-2004 11:10 PM
Solution
I have no experience with NIC bonding under Linux, but I know a workaround: run this script from cron every minute (or whatever interval you want).
#!/bin/bash
# bring bond0 down once neither slave shows the UP flag
ifconfig eth0 2>/dev/null | grep UP >/dev/null 2>&1
returnvalue=$?
if [ $returnvalue -ne 0 ]; then
    ifconfig eth1 2>/dev/null | grep UP >/dev/null 2>&1
    returnvalue=$?
    if [ $returnvalue -ne 0 ]; then
        ifconfig bond0 down
    fi
fi
11-09-2004 01:48 AM
Re: bonding 10g rac failover
Here's what I got after reviewing your response:
#!/bin/bash
# This little script checks that the bond and its associated slaves are up.
# Bonding does not let the bond fail even when all slaves in the bond are
# down, so this script brings the bond down once all slaves are down.
# The sleep at the start is so that a "service network restart" does not
# cause the script to drop the bond.
sleep 60
echo "Slept for 1 minute"
sleep 60
echo "Slept for 2 minutes"
sleep 60
echo "Slept for 3 minutes"
ifconfig bond2 2>/dev/null | grep UP >/dev/null 2>&1
valup=$?
echo $valup
# a value of 0 means the card is up; a value of 1 means the card is down
if [ $valup -eq 0 ]; then
    echo 'Start test'
    until [ $valup -ne 0 ]
    do
        sleep 10
        # check the first slave in the bond
        ifconfig eth4 2>/dev/null | grep UP >/dev/null 2>&1
        valeth4up=$?
        echo "A value of 0 means the card is up"
        echo "State of eth4"
        echo $valeth4up
        # check the second slave in the bond
        ifconfig eth5 2>/dev/null | grep UP >/dev/null 2>&1
        valeth5up=$?
        echo "State of eth5"
        echo $valeth5up
        if [ $valeth4up -ne 0 ]; then
            echo "First step:"
            echo $valeth4up
            if [ $valeth5up -ne 0 ]; then
                echo "Second step might kill the bond"
                echo $valeth5up
                ifconfig bond2 down
            fi
        fi
        # re-check the bond itself so the loop actually ends once it is down
        ifconfig bond2 2>/dev/null | grep UP >/dev/null 2>&1
        valup=$?
    done
fi
Do you think this will do the trick, or am I missing something?
11-09-2004 07:04 AM
Re: bonding 10g rac failover
I think the script should work, but I have a few small notes on it:
1) Let cron run the script so you don't need to sleep for 180 seconds. Add the following line to /etc/crontab:
*/3 * * * * root /path/scriptname
2) Don't echo anything in the script: if the script produces any output, cron will generate an e-mail every 3 minutes. Echo only when both NICs fail, so you get an e-mail when the server restarts.
3) Why not do a 'shutdown -r now' when both NICs fail, instead of 'ifconfig bond2 down'?
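Point 2 can be sketched like this (eth4/eth5/bond2 carry over from the script above; the `nic_is_up` helper and the demo flags line are just illustrations, not from the original script):

```shell
#!/bin/bash
# Sketch of the "quiet for cron" idea: no output on the healthy path,
# so cron only mails when something is actually wrong.
nic_is_up() {
    # reads `ifconfig <dev>` output on stdin; succeeds if the UP flag is present
    grep -qw 'UP'
}

# Demo with a canned ifconfig-style flags line:
flags='UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'
printf '%s\n' "$flags" | nic_is_up && echo "interface looks up"
```

In the real check you would pipe `ifconfig eth4 2>/dev/null | nic_is_up` (and the same for eth5), and only when both fail print one message and run `ifconfig bond2 down` or `shutdown -r now`.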
best regards,
johannes
11-09-2004 07:09 AM
Re: bonding 10g rac failover
I just do the ifconfig down because I let the Oracle cluster software decide when the node reboots, so the other nodes can do a quick reconfiguration on their own before the problem node goes down.
11-09-2004 07:52 AM
Re: bonding 10g rac failover
It would be nice if you assigned some points to the answers that helped you.
thanks,
johannes