<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Service guard cluster failed to halt - corrupted the filesystem in Operating System - Linux</title>
    <link>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502700#M56516</link>
    <description>We tried the FS_UNMOUNT_COUNT=3 option earlier, but it did not fix the issue.&lt;BR /&gt;&lt;BR /&gt;After the cluster failed to halt the package, we tried fuser and umount manually a number of times.&lt;BR /&gt;&lt;BR /&gt;The filesystems (/opt/Carmen &amp;amp; /users) that are part of this cluster are NFS exported, and many clients access them through NFS.&lt;BR /&gt;&lt;BR /&gt;Because clients are accessing these filesystems through NFS, the filesystems cannot be unmounted.&lt;BR /&gt;&lt;BR /&gt;We simulated this in our test environment (without the cluster) and were able to unmount the filesystem only after:&lt;BR /&gt;1. exportfs -a, 2. stopping the NFS service, 3. umount&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The cluster control script also unexports the filesystems, then stops the NFS services and tries to unmount the filesystems, but it fails when trying to stop the NFS service.&lt;BR /&gt;&lt;BR /&gt;We suspect that stopping the NFS service is the issue.&lt;BR /&gt;======&lt;BR /&gt;Please find attached the cluster control script and log files.&lt;BR /&gt;=======&lt;BR /&gt;&lt;BR /&gt;Please suggest how to resolve the issue.</description>
    <pubDate>Thu, 24 Sep 2009 08:33:26 GMT</pubDate>
    <dc:creator>skd</dc:creator>
    <dc:date>2009-09-24T08:33:26Z</dc:date>
    <item>
      <title>Service guard cluster failed to halt - corrupted the filesystem</title>
      <link>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502699#M56515</link>
      <description>Hello Everyone,&lt;BR /&gt;&lt;BR /&gt;We are facing a critical issue during failover.&lt;BR /&gt;&lt;BR /&gt;The cluster failed to halt properly and the filesystem was corrupted.&lt;BR /&gt;&lt;BR /&gt;The scenario during halt is:&lt;BR /&gt;1. Unexporting the NFS filesystems&lt;BR /&gt;2. Stopping the NFS service (failed)&lt;BR /&gt;3. Unable to unmount the filesystems&lt;BR /&gt;4. fsck ran on the still-mounted filesystem during cluster startup and corrupted the data&lt;BR /&gt;5. Filesystem lost.&lt;BR /&gt;&lt;BR /&gt;The filesystems are NFS exported.&lt;BR /&gt;==============&lt;BR /&gt;Aug 14 14:29:08 - Node "stcrm93a": Unexporting filesystem on *:/opt/Car&lt;BR /&gt;Aug 14 14:29:08 - Node "stcrm93a": Unexporting filesystem on *:/users&lt;BR /&gt;ERROR: sync_rmtab: can't open rmtab_sync file for write&lt;BR /&gt;ERROR: sync_rmtab: fail to export the rmtab data&lt;BR /&gt;ERROR: Aug 14 14:29:08 - Failed to stop NFS.&lt;BR /&gt;ERROR: Function verify_ha_server; Failed to stop HA servers&lt;BR /&gt;==================</description>
      <pubDate>Thu, 24 Sep 2009 08:08:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502699#M56515</guid>
      <dc:creator>skd</dc:creator>
      <dc:date>2009-09-24T08:08:15Z</dc:date>
    </item>
    <item>
      <title>Re: Service guard cluster failed to halt - corrupted the filesystem</title>
      <link>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502700#M56516</link>
      <description>We tried the FS_UNMOUNT_COUNT=3 option earlier, but it did not fix the issue.&lt;BR /&gt;&lt;BR /&gt;After the cluster failed to halt the package, we tried fuser and umount manually a number of times.&lt;BR /&gt;&lt;BR /&gt;The filesystems (/opt/Carmen &amp;amp; /users) that are part of this cluster are NFS exported, and many clients access them through NFS.&lt;BR /&gt;&lt;BR /&gt;Because clients are accessing these filesystems through NFS, the filesystems cannot be unmounted.&lt;BR /&gt;&lt;BR /&gt;We simulated this in our test environment (without the cluster) and were able to unmount the filesystem only after:&lt;BR /&gt;1. exportfs -a, 2. stopping the NFS service, 3. umount&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The cluster control script also unexports the filesystems, then stops the NFS services and tries to unmount the filesystems, but it fails when trying to stop the NFS service.&lt;BR /&gt;&lt;BR /&gt;We suspect that stopping the NFS service is the issue.&lt;BR /&gt;======&lt;BR /&gt;Please find attached the cluster control script and log files.&lt;BR /&gt;=======&lt;BR /&gt;&lt;BR /&gt;Please suggest how to resolve the issue.</description>
      <pubDate>Thu, 24 Sep 2009 08:33:26 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502700#M56516</guid>
      <dc:creator>skd</dc:creator>
      <dc:date>2009-09-24T08:33:26Z</dc:date>
    </item>
    <item>
      <title>Re: Service guard cluster failed to halt - corrupted the filesystem</title>
      <link>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502701#M56517</link>
      <description>Shalom,&lt;BR /&gt;&lt;BR /&gt;Check the options in the umount man page.&lt;BR /&gt;&lt;BR /&gt;You are probably getting a device busy error on the umount. If you use a more forceful option in your SG configuration, you can probably kick out the users.&lt;BR /&gt;&lt;BR /&gt;If this is something like an Oracle database, you may need to configure a second package, or add a step to this package, to do an immediate shutdown as part of the failover process.&lt;BR /&gt;&lt;BR /&gt;SEP</description>
      <pubDate>Thu, 24 Sep 2009 11:21:35 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502701#M56517</guid>
      <dc:creator>Steven E. Protter</dc:creator>
      <dc:date>2009-09-24T11:21:35Z</dc:date>
    </item>
    <item>
      <title>Re: Service guard cluster failed to halt - corrupted the filesystem</title>
      <link>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502702#M56518</link>
      <description>Thanks for the update.&lt;BR /&gt;&lt;BR /&gt;Yes, we have tried umount -l&lt;BR /&gt;((-l     Lazy unmount. Detach the filesystem from the filesystem hierarchy now, and cleanup all references to the filesystem as soon as it is not busy anymore. This option allows a busy filesystem to be unmounted.))&lt;BR /&gt;&lt;BR /&gt;But this did not help either.&lt;BR /&gt;&lt;BR /&gt;The filesystem was corrupted after using this option.&lt;BR /&gt;&lt;BR /&gt;We also tried fuser &amp;amp; kill -9 to kill the processes, but the issue remains.&lt;BR /&gt;&lt;BR /&gt;There is no Oracle database; NFS-mounted filesystems are in use here.&lt;BR /&gt;&lt;BR /&gt;Please let me know if you need more details.&lt;BR /&gt;</description>
      <pubDate>Thu, 24 Sep 2009 14:12:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/service-guard-cluster-failed-to-halt-corrupted-the-filesystem/m-p/4502702#M56518</guid>
      <dc:creator>skd</dc:creator>
      <dc:date>2009-09-24T14:12:42Z</dc:date>
    </item>
  </channel>
</rss>

