03-19-2013 10:19 PM
Error while switching the cluster
Let me explain the scenario.
We have two nodes, both running Red Hat Linux, with HP Serviceguard installed for clustering.
The problem we are facing is that when NODE1 goes down, failover to the configured NODE2 does not happen automatically. I don't know whether it works when switched manually, because I was only just assigned this case and no one is available with the details.
Also, when I run cmviewcl -v, I get this output on node1:
CLUSTER STATUS
JISP_DATABASE_CLUSTER up
NODE STATUS STATE
hathdb1 up running
Cluster_Lock_LUN:
DEVICE STATUS
/dev/cciss/c0d0p1 up
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
PACKAGE STATUS STATE AUTO_RUN NODE
oracle up running disabled hathdb1
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 oracle_db_mon
Service up 5 0 oracle_lsnr_mon
Subnet up 202.88.149.0
Subnet up 192.168.0.0
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled hathdb1 (current)
Alternate up enabled hathdb2
NODE STATUS STATE
hathdb2 up running
Cluster_Lock_LUN:
DEVICE STATUS
/dev/cciss/c0d0p1 up
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
And when I go to /oracle and run more on the clusterciew file, I get the output below:
[root@hathdb1 oracle]#
[root@hathdb1 oracle]# ll
total 2352
-rw-r--r-- 1 root root 8106 Feb 14 2011 1
drwxr-xr-x 2 root root 4096 Apr 24 2010 backup
-rw-r--r-- 1 root root 1603 Apr 9 2010 clusterciew
-rwx------ 1 root root 8105 Feb 15 2011 oracle.conf
-rwx------ 1 root root 8106 Feb 14 2011 oracle.conf-FEB02
-rwx------ 1 root root 8105 Aug 18 2010 oracle.conf.old
-rwx------ 1 root root 39407 Aug 4 2011 oracle.ctrl
-rwx------ 1 root root 39407 Aug 4 2011 oracle.ctrl_04-08-2011
-rwx------ 1 root root 39407 Feb 7 2007 oracle.ctrl.back
-rwx------ 1 root root 39407 Aug 18 2010 oracle.ctrl.back.old
-rwx------ 1 root root 39457 Feb 14 2011 oracle.ctrl-FEB02
-rwx------ 1 root root 39407 Feb 14 2011 oracle.ctrl-FEB-13-11
-rw-r--r-- 1 root root 610796 Mar 11 15:25 oracle.ctrl.log
-rw-r--r-- 1 root root 1454460 Apr 9 2010 oracle.ctrl.log_primary
-rwx------ 1 root root 39407 Aug 18 2010 oracle.ctrl.old
[root@hathdb1 oracle]# more clusterciew
CLUSTER STATUS
JISP_DATABASE_CLUSTER down
NODE STATUS STATE
hathdb1 down unknown
Cluster_Lock_LUN:
DEVICE STATUS
/dev/cciss/c0d0p1 unknown
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY unknown eth0
PRIMARY unknown eth1
NODE STATUS STATE
hathdb2 down unknown
Cluster_Lock_LUN:
DEVICE STATUS
/dev/cciss/c0d0p1 unknown
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY unknown eth0
PRIMARY unknown eth1
UNOWNED_PACKAGES
PACKAGE STATUS STATE AUTO_RUN NODE
oracle down unowned
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover unknown
Failback unknown
Script_Parameters:
ITEM STATUS NODE_NAME NAME
Subnet unknown hathdb1 202.88.149.0
Subnet unknown hathdb1 192.168.0.0
Subnet unknown hathdb2 202.88.149.0
Subnet unknown hathdb2 192.168.0.0
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary down hathdb1
Alternate down hathdb2
[root@hathdb1 oracle]#
I did not understand this output; can you explain why it is like this?
Also, I am new to clustering, so if you give me a solution I will be thankful.
For details please find the attachment.
Thanks and regards,
Ashish
03-20-2013 01:17 AM - edited 03-20-2013 01:23 AM
Re: Error while switching the cluster
> PACKAGE STATUS STATE AUTO_RUN NODE
> oracle up running disabled hathdb1
This is the problem.
When AUTO_RUN is disabled, the automatic failover will not happen.
When you halt a package with cmhaltpkg, it will automatically disarm AUTO_RUN for that package as a side effect. If you then start the package with cmrunpkg, you must remember to re-arm automatic failover using the "cmmodpkg -e" command. For the package listed above, the command would be:
# cmmodpkg -e oracle
Newer versions of Serviceguard will actually remind you of this requirement each time you use the cmhaltpkg/cmrunpkg commands.
If the package is started as part of a cluster startup (cmruncl), then the AUTO_RUN state of each package will automatically be set to the default value set in the package configuration.
Note: there is also another form of the cmmodpkg command: "cmmodpkg -n <node_name> -e <package_name>". It looks very similar to the command listed above, but has a different purpose. If a package is disabled from starting on a particular node (e.g. because it failed starting up last time Serviceguard tried it there), you can use this command to re-enable it. In effect, this command tells Serviceguard: "The problem that prevented this package from starting on this node is now fixed, the package can be allowed to run on this node again."
The timestamp of the /oracle/clusterciew file is April 9, 2010. So it's almost three years old now. It seems to be an old copy of cmviewcl output, and certainly not relevant any more. The only thing it can tell you is that the cluster was down at some time on that day. This file is not used by standard Serviceguard configuration: most likely you can just delete it.
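To spot this condition quickly, here is a small sketch (a hypothetical helper, not a Serviceguard tool) that scans `cmviewcl -v` output for packages whose AUTO_RUN state is disabled. The sample input is the package line from this thread; real output may be indented differently, so treat the field positions as an assumption and pipe in `cmviewcl -v` instead of the heredoc in practice:

```shell
# Hypothetical helper: warn about packages with AUTO_RUN disabled.
# Assumes the package line directly follows the "PACKAGE STATUS ..." header,
# with AUTO_RUN as the 4th column, as in the cmviewcl output above.
check_auto_run() {
  awk '
    /^PACKAGE[ \t]+STATUS/ { in_pkg = 1; next }
    in_pkg && NF >= 4 {
      if ($4 == "disabled")
        print "WARNING: package " $1 " has AUTO_RUN disabled (no automatic failover)"
      in_pkg = 0
    }
  '
}

check_auto_run <<'EOF'
PACKAGE        STATUS       STATE        AUTO_RUN     NODE
oracle         up           running      disabled     hathdb1
EOF
```

With the sample above it prints a warning for the oracle package; a package showing "enabled" produces no output.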
03-20-2013 02:58 AM
Re: Error while switching the cluster
So there is no relation between the /oracle/clusterciew file from April 9, 2010 and the present state.
One more thing I want to ask: I have searched everywhere on the server but could not find the cmclustercl.ascii file. On HP-UX I have seen that there is an ASCII file; is that not the case with Linux?
So if I am not wrong, the above is the only error in my cluster, and that is what I have to fix.
03-20-2013 05:15 AM
Re: Error while switching the cluster
The ASCII file is actually used only when submitting configuration changes to Serviceguard using the cmapplyconf command. After the cmapplyconf command has completed successfully, the ASCII configuration file is not needed, since the configuration has been stored in the binary configuration file, which is automatically kept in sync between cluster members by Serviceguard.
It is the habit of many Serviceguard administrators (including myself) to leave the latest ASCII file around for documentation/reference purposes, but if you don't have it, you can easily get an ASCII copy of the current configuration with the cmgetconf command if you need it.
(In some situations, it would be better to use cmgetconf instead of relying on old, possibly out-of-date or out-of-sync copies of the ASCII configuration files. Therefore, if the current configuration is sufficiently documented elsewhere, it is possible to argue that it might actually be a Good Thing to *not* have the old ASCII files around: it removes the temptation to look at the possibly-obsolete files and forces the sysadmin to always get the up-to-date configuration information with cmgetconf.
The counterargument is that in disaster-recovery situations, having the ASCII files around will speed up recovery. Well, yes; but *only* if they are up to date and there is no need to significantly change them to adapt to the post-disaster situation.)
03-20-2013 09:29 PM
Re: Error while switching the cluster
But if you look at the attachment, you can see that it is enabled, i.e. AUTO_RUN YES.
Please reply ASAP.
Regards, Ashish
03-21-2013 02:11 AM
Re: Error while switching the cluster
The AUTO_RUN configuration setting and the AUTO_RUN package state are not quite the same thing.
When you run "cmviewcl", the AUTO_RUN field in the output indicates the package state.
The configuration setting is listed in the package configuration file and is either YES or NO. It determines what happens to the package when the cluster is being started:
- If it is set to YES, the package will be automatically started at cluster start-up and its AUTO_RUN package state will be set to "enabled" when the cluster startup completes, leaving the package up and running and fully ready for failover.
- If it is set to NO, the package will not be automatically started at cluster start-up, and its AUTO_RUN package state will be "disabled" at the end of the cluster start-up.
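As a quick way to read the configured value, here is a minimal sketch assuming the legacy-style package ASCII file format, where the file contains an `AUTO_RUN YES` (or `NO`) line. The file contents below are a stand-in for illustration, not the actual oracle.conf from this thread:

```shell
# Stand-in package ASCII file; in practice you would point the awk
# command at your real file, e.g. /oracle/oracle.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
PACKAGE_NAME    oracle
AUTO_RUN        YES
NODE_NAME       hathdb1
NODE_NAME       hathdb2
EOF

# Print the configured AUTO_RUN value (YES or NO), skipping comment lines.
awk '!/^#/ && $1 == "AUTO_RUN" { print $2 }' "$conf"
rm -f "$conf"
```

This prints the *configured* initial value only; as explained above, the live AUTO_RUN state reported by cmviewcl may differ from it.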
The AUTO_RUN package state is dynamic: it is maintained by Serviceguard as part of the overall cluster state information. It can be updated by commands like cmhaltpkg and cmmodpkg. Updating the package state will not change the configuration setting: it just means that the package state has been changed from the configured initial state. When the cluster is halted, all the package state information will be forgotten: when the cluster starts up, the package states will be initialized from scratch to the values set in the package configuration file.
If the AUTO_RUN package state is "enabled" and the package is down, Serviceguard will immediately attempt to restart the package on the most appropriate node (as determined by the package configuration). For this reason, the cmhaltpkg command must set the AUTO_RUN package state to "disabled" before halting the package, or else Serviceguard would just immediately restart the package. That would be rather silly.
Sometimes you may want to disable failover temporarily but keep the package running on the current node, for example when you're performing maintenance on the alternate node and don't want the package to fail over there while you're doing the maintenance work. This is easy to do: just use "cmmodpkg -d <package name>" to set the AUTO_RUN package state to "disabled", and the package won't fail over automatically. When the maintenance is over, you can use "cmmodpkg -e <package name>" to restore the package state to "enabled", and automatic failover will work again.