09-24-2009 01:08 AM
Service guard cluster failed to halt - corrupted the filesystem
Hello Everyone,
We are facing a critical issue during failover.
The cluster failed to halt properly, and the filesystem was corrupted.
The sequence during the halt was:
1. Unexporting the NFS filesystems
2. Stopping the NFS service (failed)
3. Unable to unmount the filesystems
4. fsck ran on the still-mounted filesystem during cluster startup and corrupted the data
5. Filesystem gone.
The filesystems are NFS exported.
==============
Aug 14 14:29:08 - Node "stcrm93a": Unexporting filesystem on *:/opt/Car
Aug 14 14:29:08 - Node "stcrm93a": Unexporting filesystem on *:/users
ERROR: sync_rmtab: can't open rmtab_sync file for write
ERROR: sync_rmtab: fail to export the rmtab data
ERROR: Aug 14 14:29:08 - Failed to stop NFS.
ERROR: Function verify_ha_server; Failed to stop HA servers
==================
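Step 4 in the post above suggests that fsck ran against a device that was still mounted. A minimal sketch of a guard that could run before any fsck follows. It assumes a Linux node where /proc/mounts lists the mounted device in the first field of each line; the function names is_mounted and safe_fsck and any device paths passed to them are hypothetical and are not part of Serviceguard. LVM devices may appear in /proc/mounts under a /dev/mapper name, so the string passed in has to match what the kernel reports.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if device $1 appears as a mounted source
# in the mounts table $2 (defaults to /proc/mounts), exit 1 otherwise.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # Field 1 of each line is the mounted device; compare it exactly.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical wrapper: print the fsck command only when the device is
# not mounted. It is a dry run and never executes fsck itself.
safe_fsck() {
    if is_mounted "$1" "$2"; then
        echo "refusing fsck: $1 is still mounted" >&2
        return 1
    fi
    echo "fsck -y $1"
}
```

Calling safe_fsck with a device that is still listed in the mounts table returns non-zero, so a startup script could stop there instead of checking a live filesystem.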
3 REPLIES
09-24-2009 01:33 AM
Re: Service guard cluster failed to halt - corrupted the filesystem
We tried the FS_UNMOUNT_COUNT=3 option earlier, and it did not fix the issue.
After the cluster failed to halt the package, we tried fuser and umount manually a number of times.
The filesystems (/opt/Carmen and /users) that are part of this cluster are NFS exported, and many clients access them through NFS.
Because clients are accessing these filesystems through NFS, the filesystems cannot be unmounted.
We simulated this in our test environment (without the cluster) and were able to unmount the filesystem only after:
1. exportfs -a, 2. stopping the NFS service, 3. umount
The cluster control script also unexports the filesystems, then stops the NFS services, and then tries to unmount the filesystems, but it fails at the step that stops the NFS service.
We suspect that the NFS service stop is the issue.
======
Please find attached cluster control script and log files.
=======
Please suggest how to resolve the issue.
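The order described in this post (unexport, stop NFS, unmount) can be sketched as a small script that stops at the first failing step and reports which one failed. This is only an illustration under assumptions: the post writes exportfs -a, but the sketch assumes the first step means unexporting, which exportfs does with -u; it also assumes the NFS init script is called nfs, which varies by distribution. The names run_step and halt_nfs_fs and the DRY_RUN variable are hypothetical and do not come from the Serviceguard control script.

```shell
#!/bin/sh
# Run one step; with DRY_RUN=1 only print the command instead of running it.
run_step() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical halt sequence for one exported filesystem ($1 = mount point).
# Each step must succeed before the next one is attempted; the return
# code (1, 2 or 3) tells the caller which step failed.
halt_nfs_fs() {
    mnt=$1
    run_step exportfs -u "*:$mnt" || { echo "unexport failed: $mnt" >&2; return 1; }
    # Assumption: init script name "nfs"; adjust for the distribution in use.
    run_step service nfs stop     || { echo "NFS stop failed" >&2; return 2; }
    run_step umount "$mnt"        || { echo "umount failed: $mnt" >&2; return 3; }
}
```

With DRY_RUN=1, halt_nfs_fs /users only prints the three commands in order, which makes it easy to compare against what the package log shows before anything is run for real.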
09-24-2009 04:21 AM
Re: Service guard cluster failed to halt - corrupted the filesystem
Shalom,
Check the umount man page for its options.
You are probably getting a "device busy" error on the umount. If you use a more forceful option in your SG configuration, you can probably kick out the users.
If something like an Oracle database is involved, you may need to configure a second package, or add a step to this package, that shuts it down immediately as part of the failover process.
SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
09-24-2009 07:12 AM
Re: Service guard cluster failed to halt - corrupted the filesystem
Thanks for the update.
Yes, we have tried umount -l.
((-l Lazy unmount. Detach the filesystem from the filesystem hierarchy now, and cleanup all references to the filesystem as soon as it is not busy anymore. This option allows a busy filesystem to be unmounted.))
But this did not help either.
The filesystem was corrupted after using this option.
We also tried fuser and kill -9 to kill the processes, but the issue remains.
There is no Oracle database here; the filesystems are accessed over NFS.
Please let me know if you need more details.
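The manual attempts described in this post (umount, then fuser and kill, then umount again) amount to a bounded retry loop, similar in spirit to what FS_UNMOUNT_COUNT asks of the control script. A rough sketch follows. The function name retry_umount and the UMOUNT, FUSER and RETRY_DELAY variables are hypothetical, introduced here only so the commands can be swapped for stubs. One caveat: fuser only sees user-space processes, and it typically does not report a filesystem that is kept busy by the in-kernel NFS server, so a loop like this is unlikely to help while NFS is still running.

```shell
#!/bin/sh
# Try to unmount $1 up to $2 times (default 3). Between attempts, kill
# user-space processes that use the mounted filesystem.
# Returns 0 on success, 1 if every attempt failed.
# UMOUNT and FUSER may be set to stub commands for testing.
retry_umount() {
    mnt=$1
    tries=${2:-3}
    i=1
    while [ "$i" -le "$tries" ]; do
        if ${UMOUNT:-umount} "$mnt"; then
            return 0
        fi
        # fuser -k kills the processes; -m selects everything using the mount.
        ${FUSER:-fuser} -km "$mnt" >/dev/null 2>&1 || true
        sleep "${RETRY_DELAY:-1}"
        i=$((i + 1))
    done
    echo "giving up: $mnt still busy after $tries attempts" >&2
    return 1
}
```

The non-zero return on the final failure matters more than the retries: a caller can use it to stop the package halt and leave the filesystem alone, rather than continuing to a state where fsck might later run on it.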
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.