06-09-2005 10:55 AM
Problem with Linux ‘ps’ command can cause false failover of packages
Various Serviceguard for Linux toolkits use the ‘ps’ command, and it is also a suggested method for users writing their own monitor scripts. These scripts check the pid to see whether the process being monitored is still running. The ‘ps’ error causes the monitor script to conclude, wrongly, that the process is no longer running, which causes the package to fail over.
The exact command lines that have problems are:
pid=`ps $p_pid | grep $PROC | awk '{print $1}'`
if [ -z "$pid" ]; then
This should be replaced with:
grep $PROC /proc/$p_pid/stat >/dev/null
if [ $? -ne 0 ]; then
Rather than looking through all of the pids in the /proc filesystem, this just checks the pid that is being monitored.
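The replacement lines above could be wrapped in a small function, roughly as in the sketch below. This is only an illustration: `is_running` is a hypothetical helper name, the positional arguments stand in for `$p_pid` and `$PROC` from the post, and the usage lines assume a Linux kernel recent enough to provide `/proc/<pid>/comm`.

```shell
#!/bin/sh
# Sketch of a monitor check based on the replacement lines in the post.
# is_running is a hypothetical helper name; $1 is the monitored pid
# (p_pid in the post) and $2 is the process name (PROC in the post).
is_running() {
    # Look only at the stat file of the monitored pid.
    grep "$2" "/proc/$1/stat" >/dev/null
    if [ $? -ne 0 ]; then
        return 1    # pid is gone, or its stat file does not mention the name
    fi
    return 0
}

# Usage: check the current shell against its own name as reported by /proc.
self_name=$(cat "/proc/$$/comm")
if is_running "$$" "$self_name"; then
    echo "process $$ ($self_name) is running"
fi
```

As in the post, stderr is left alone here, so grep still reports an error when the pid directory no longer exists.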
If you think you have experienced a false failover, then check the monitor scripts and make this change.
Even if you have not experienced a false failover, it is recommended that you make this change. Any contributed toolkit that uses the ‘ps’ command in this way will be changed in its next release. Because each change has to be tested, this may take up to 3 months for any specific toolkit.
Remember to make the change on all servers that may run the package. Note that because the file is open on the server running the package, it will not be updated immediately. This last node will only be updated after the package is moved. During a maintenance period, move the package and recheck the file on all nodes. Remember, if a server fails between a change to the file and the maintenance period, the file may not have been updated. That is why it is CRITICAL to recheck all nodes after the package move.
As new or updated toolkits are released,
06-09-2005 11:14 AM
Re: Problem with Linux ‘ps’ command can cause false failover of packages
A further refinement would be to put the grep straight into the if:
if grep -q $PROC /proc/$p_pid/stat 2>/dev/null
as 'if' checks the exit status of the command itself. That is just a bit quicker than launching 'test' ([) and checking $?.
Anyway, just some thoughts.
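For comparison, the variant suggested in this reply might look like the sketch below. `check_proc` and its two positional arguments are hypothetical names chosen for illustration; the `-q` option and the `2>/dev/null` redirection are taken from the line in the reply.

```shell
#!/bin/sh
# Sketch of the "grep directly in the if" variant from this reply.
# check_proc is a hypothetical name; $1 is the pid, $2 the process name.
check_proc() {
    # 'if' tests grep's exit status directly; -q suppresses the matched
    # line and 2>/dev/null hides the error for a pid that no longer exists.
    if grep -q "$2" "/proc/$1/stat" 2>/dev/null; then
        echo "running"
    else
        echo "not running"
    fi
}

# Usage: a pid far above any valid pid has no /proc entry.
check_proc 99999999 some_daemon
```

With stderr discarded, a missing pid and a name that does not match look the same to the caller, which is the trade-off raised later in the thread.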
06-09-2005 10:17 PM
Re: Problem with Linux ‘ps’ command can cause false failover of packages
Will find/search all my scripts for the use of this.
I did read the Bugzilla entry to try to understand it all, but it seems to me that Stuart Browne's thoughts are the correct way to go. Or is there something we missed?
Jean-Pierre Huc
06-13-2005 06:04 AM
Re: Problem with Linux ‘ps’ command can cause false failover of packages
We will keep it this way because we'll get any errors from "grep" logged. Also, test is a shell builtin, so there is no major launch overhead.
There may be some advantage to the -q.
We really want to change as little as possible to minimize the risk of introducing another problem.
Huc,
That's why I posted it with the description - to make everyone who uses this aware of possible problems. Glad it may help.
06-13-2005 10:36 AM
Re: Problem with Linux ‘ps’ command can cause false failover of packages
lrwxrwxrwx 1 root root 4 May 1 2004 /usr/bin/[ -> test
and that '[[]]' are inbuilt. Sometimes shell man pages are just too long:
man bash: under 'CONDITIONAL EXPRESSIONS'
Conditional expressions are used by the [[ compound command and the test and [ builtin commands to test file attributes and perform string and arithmetic comparisons.
My apologies.
In any case, as the grep isn't discarding STDERR, you'll still get an ugly error when '$p_pid' doesn't exist.
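One way to tell the two failure cases apart without the error message is to look at grep's exit status, which is 0 for a match, 1 for no match, and 2 for an error such as a missing file. The sketch below is not from any toolkit; `proc_state` and its output strings are hypothetical, and it discards stderr, so it gives up the error logging mentioned earlier in the thread.

```shell
#!/bin/sh
# Sketch: classify the /proc check by grep's exit status.
# proc_state is a hypothetical name; $1 is the pid, $2 the process name.
proc_state() {
    grep "$2" "/proc/$1/stat" >/dev/null 2>&1
    case $? in
        0) echo "running" ;;
        1) echo "pid exists, name mismatch" ;;   # stat file read, no match
        *) echo "pid gone" ;;                    # stat file could not be read
    esac
}

# Usage: a pid far above any valid pid has no /proc entry.
proc_state 99999999 some_daemon
```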