08-30-2008 03:45 PM
Reason for Package failure ?
They are connected together to form a cluster consisting of the two nodes. The servers have 3 shared volume groups, with two mirrored disks in each volume group. These shared disks are activated in exclusive mode by whichever server is running the single package at any one time.
The problem I have is that two disks in these shared volume groups were recently replaced (each failed disk was in a different volume group). Each failed disk was replaced and re-mirrored from the remaining good disk of its volume group.
If I manually activate the shared volume groups on each of the nodes in turn and view them and their logical volumes in SAM, everything looks good.
However, when the servers are now restarted, the cluster forms OK and the package appears to start OK, but then the node reboots with the message "A crucial package has failed" and the package attempts to move to the other node. The package then fails in exactly the same way on the new node.
I have looked at the syslog.log file and the package.cntl.log files, but I cannot see a reason for the package failure.
Is there somewhere I should be looking in order to determine the reason for the package failure on both nodes?
08-30-2008 04:16 PM
Re: Reason for Package failure ?
What do you mean by the following?
>> package appears to start OK >>
Can you see the package status as running in the output of
cmviewcl -vp pkgname
or
cmviewcl -v
08-30-2008 05:00 PM
Re: Reason for Package failure ?
#swlist |grep -i serviceguard
also
#what /usr/lbin/cmcld
08-30-2008 05:12 PM
Re: Reason for Package failure ?
08-31-2008 01:43 AM
Re: Reason for Package failure ?
With the cluster up on both nodes, I can manually start the package (the package is named "package") on a node by entering:
# cmrunpkg package
The response is "cmrunpkg completed successfully on all packages specified"
If I quickly enter "cmviewcl -vp package" at this time I see the following response
PACKAGE      STATUS    STATE      PKG_SWITCH   NODE
package      up        running    disabled     kwamc0s

Policy Parameters
POLICY NAME      CONFIGURED_VALUE
Failover         configured_node
Failback         manual

Script_Parameters
ITEM       STATUS   MAXRESTARTS   RESTARTS   NAME
Service    up       0             0          cmsmgr
Service    up       0             0          ovdm
Service    up       0             0          tmn
Service    up       0             0          oracle
Service    up       0             0          neos
Service    up       0             0          lms
Service    up       0             0          shut
Subnet     up                                128.4.0.0
Subnet     up                                192.168.1.0

Node_Switching_Parameters
NODE_TYPE    STATUS   SWITCHING   NAME
Primary      up       enabled     kwamc0s (current)
Alternate    up       enabled     kwamc1s
So at this point everything looks OK (neos and lms are our two custom application programs) and the package appears to be up and running.
But after a few seconds the message appears:
kwamc0s cmcld: Halting kwamc0s to preserve data integrity
Reason: A crucial package failed.
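A node halt on package failure is the behaviour Serviceguard is expected to show when a fail-fast option is enabled, so that may be worth ruling out. The following is only a minimal sketch, assuming the package uses a legacy ASCII package configuration file containing `NODE_FAIL_FAST_ENABLED` and `SERVICE_FAIL_FAST_ENABLED` lines. The helper name `fail_fast_settings` and the sample text are hypothetical and are not taken from the poster's system.

```shell
#!/bin/sh
# Hypothetical helper: print every *_FAIL_FAST_ENABLED parameter set to YES.
# Reads package configuration text on stdin; comment lines are skipped.
fail_fast_settings() {
    awk '
        $1 ~ /^#/ { next }    # ignore commented-out parameters
        $1 ~ /FAIL_FAST_ENABLED$/ && toupper($2) == "YES" { print $1, $2 }
    '
}

# Illustrative sample only (assumed layout of a package ASCII file).
sample_conf='PACKAGE_NAME package
# NODE_FAIL_FAST_ENABLED YES
NODE_FAIL_FAST_ENABLED NO
SERVICE_NAME neos
SERVICE_FAIL_FAST_ENABLED YES
SERVICE_NAME lms
SERVICE_FAIL_FAST_ENABLED NO'

printf '%s\n' "$sample_conf" | fail_fast_settings
```

On the real system the function would be fed the package's own configuration file on stdin (its path is not given in this thread). Any line it prints points at a setting that could turn a single service or package failure into a node halt.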
08-31-2008 01:53 AM
Re: Reason for Package failure ?
PHSS_17581 1.0 MC ServiceGuard 11.05 Cumulative Patch
#what /usr/lbin/cmcld reports the following:
HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.0) $Revision 7403 $
Build Date: Wed Mar 3 14:17:00 PST 1999
Build id: ibld_sg_a1105_patch
A.100.05 Date 99/02/22 PHSS_17581 (SG English/Japanese) PHSS_17483(LM English) PHSS_17484 (LM Japanese) Date: 99/01/13 PHSS_17230
Daemon
Config DB
Cluster Monitor
Command Srv
CommunicationSrv
Config
Dlm
Local Comm
Network Sensor
Package Manager
Remote Comm
API
Service Sensor
Cluster LVM
Status DB
Sync
Util
A.01.01 Resource Monitor API (11_00_AR: Oct 17 1997 09:24:32)
08-31-2008 02:01 AM
Re: Reason for Package failure ?
All software in this installation is out of date and out of support.
That being said, the package seems to be shut down due to concerns about data integrity on shared storage.
You need to check all lock disks and shared disks configured within the packages and the cluster configuration for trouble.
Start with dmesg, then perhaps exercise the disks with mstm/cstm/xstm (your choice) to find the root of the problem.
It would not hurt to plan to bring this cluster back into the world of supported OS/system software and perhaps gain help from your HP Software service contract.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
08-31-2008 02:11 AM
Re: Reason for Package failure ?
However, I had wondered about the cluster lock before, and I think we performed the correct procedure to ensure that the cluster lock is recreated correctly.
What I did was this:
cmhaltcl
vgchange -c n vg04
vgchange -c n vg05
vgchange -a y vg04
vgchange -a y vg05
cd /etc/cmcluster
I then ran the cmapplyconf command to recompile and distribute the package, before deactivating the volume groups and starting the cluster again.
So, I believe the clusterlocks are OK, but is there a way to check this for sure?
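For reference, the steps described above can be written down as a dry-run script. This is a sketch under stated assumptions, not a verified procedure: the `run` and `reapply_cluster_conf` helpers are hypothetical, the `-v -C` form of `cmapplyconf` is an assumption about how it was invoked, the cluster file path is a placeholder, and `cmruncl` is assumed to be how the cluster was restarted. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the lock re-initialisation steps described in the post.
# Prints each command by default; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reapply_cluster_conf() {
    cluster_conf=$1      # cluster ASCII file; real path not given in the post
    shift                # remaining arguments: the shared volume groups
    run cmhaltcl
    for vg in "$@"; do
        run vgchange -c n "$vg"    # clear the cluster-aware flag
    done
    for vg in "$@"; do
        run vgchange -a y "$vg"    # activate the volume group
    done
    run cmapplyconf -v -C "$cluster_conf"
    for vg in "$@"; do
        run vgchange -a n "$vg"    # deactivate before restarting the cluster
    done
    run cmruncl
}

# Placeholder path only; vg04 and vg05 are the groups named in the post.
reapply_cluster_conf /etc/cmcluster/cluster-configfilename vg04 vg05
```

Reading the printed sequence against what was typed at the time is a cheap way to confirm that no step, such as the final deactivation, was skipped on either node.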
Note: I have remotely logged into kwamc0s from a separate console and used the command
#tail -f /var/adm/syslog/syslog.log
to follow events in this log when the package is started on kwamc0s.
I see the line
cmcld: Service PKG*3841 terminated due to an exit(0)
However, this line appears immediately BEFORE the line shown below with the same timestamp which says:
cmcld: Started package package on node kwamc0s
I've no idea why the package is failing and rebooting the node.
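Because the interesting lines scroll past quickly before the node halts, it may help to filter the log afterwards rather than watch it live. Below is a minimal sketch: the helper name `cmcld_events` is hypothetical, and the sample is assembled from the messages quoted in this thread plus one made-up unrelated line.

```shell
#!/bin/sh
# Hypothetical helper: keep only cmcld messages about services,
# package starts, or node halts. Reads syslog text on stdin.
cmcld_events() {
    grep 'cmcld' | grep -E 'Service |Started package|Halting'
}

# Sample built from messages quoted in the thread (timestamps omitted);
# the inetd line is invented purely to show that other daemons are dropped.
sample_log='kwamc0s cmcld: Service PKG*3841 terminated due to an exit(0)
kwamc0s cmcld: Started package package on node kwamc0s
kwamc0s inetd: unrelated line for illustration
kwamc0s cmcld: Halting kwamc0s to preserve data integrity'

printf '%s\n' "$sample_log" | cmcld_events
```

On the node itself this would be run as `cmcld_events < /var/adm/syslog/syslog.log`. Since HP-UX normally moves the previous log to /var/adm/syslog/OLDsyslog.log at boot, that file is the one to check after the node has halted and restarted.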
08-31-2008 02:26 AM
Re: Reason for Package failure ?
I think the problem with support is that the applications being run use HP OpenView DM TMN, and HP has discontinued support for it. The servers help supervise an older submarine cable system and there is no means to update the application software, so the OS has not been touched for many years and we would not be permitted to upgrade it (by managers on high!). These types of systems are installed and commissioned, and after the initial commissioning/proving period we are not permitted to upgrade or patch them further unless there is an exceptional requirement and proof that the work will not affect the current custom-built application software.
08-31-2008 05:09 AM
Re: Reason for Package failure ?
Run
# cmapplyconf -v -C /etc/cmcluster/cluster-configfilename
and then start package.
I guess this could be due to missing lock information on the newly replaced disk.
Or, if a previous backup of the lock VG configuration exists, restore it with:
#vgcfgrestore -n /dev/lock-vg-name disk-device-name
For example:
#vgcfgrestore -n /dev/vg_lock /dev/rdsk/c4t6d0
(vgcfgrestore expects the raw device path of the physical volume.)
Yes, it is absolutely true that the Serviceguard version you are running is no longer supported by HP.