<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic EDF 7.3: cldb CID1 container disk marked as failed IO Timeout in HPE Ezmeral Software platform</title>
    <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</link>
    <description>EDF 7.3: CLDB CID1 container disk marked as failed with an I/O timeout in the HPE Ezmeral Software platform. Two disks on the single CLDB control node, holding the Name Container (CID1), were marked as failed by handle_disk_failure.sh after a Proxmox/Ceph virtual-disk issue, and CLDB cannot start without CID1.</description>
    <pubDate>Wed, 01 Jan 2025 09:09:30 GMT</pubDate>
    <dc:creator>filip_novak</dc:creator>
    <dc:date>2025-01-01T09:09:30Z</dc:date>
    <item>
      <title>EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;&lt;P&gt;We have a problem with the disks used for &lt;STRONG&gt;maprfs&lt;/STRONG&gt;. After a mistake with Ceph key creation for the virtual disks in Proxmox, some disks were read-only for a while but then returned to normal read-write status. However, two disks on one control node, which hold &lt;STRONG&gt;CLDB&lt;/STRONG&gt; and the &lt;STRONG&gt;Name Container (CID1)&lt;/STRONG&gt;, were marked as failed by the MapR&amp;nbsp;&lt;STRONG&gt;handle_disk_failure.sh&lt;/STRONG&gt; script, and &lt;STRONG&gt;CLDB&lt;/STRONG&gt; cannot start without &lt;STRONG&gt;CID1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;############################ Disk Failure Report ###########################

Disk                :    sdc
Failure Reason      :    I/O time out
Time of Failure     :    Thu 26 Dec 2024 01:27:49 AM EET
Resolution          :
   Please refer to Data Fabric online documentation at https://docs.datafabric.hpe.com/home/AdministratorGuide/Managing-Disks.html
   on how to handle disk failures. If you have further questions, please either post on https://community.datafabric.hpe.com/s/
   or contact Data Fabric technical support.

############################ Disk Failure Report ###########################

Disk                :    sdb
Failure Reason      :    I/O time out
Time of Failure     :    Thu 26 Dec 2024 01:33:55 AM EET
Resolution          :
   Please refer to Data Fabric online documentation at https://docs.datafabric.hpe.com/home/AdministratorGuide/Managing-Disks.html
   on how to handle disk failures. If you have further questions, please either post on https://community.datafabric.hpe.com/s/
   or contact Data Fabric technical support.&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There is a suggestion to use &lt;STRONG&gt;/opt/mapr/server/fsck&lt;/STRONG&gt; to check the disks, but as I understand it we would first need to remove them with &lt;STRONG&gt;mrconfig disk remove&lt;/STRONG&gt;, and that could cause the loss of &lt;STRONG&gt;CID1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;When we try to bring the SP back online, we get this in mfs.log-3:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1764 SP1:/dev/sdc on DG Concat1-3 consists of 2 disks:
2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1767 SP1:0 /dev/sdc
2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1767 SP1:1 /dev/sdd
2024-12-26 18:55:32,2630 INFO IOMgr spinit.cc:251 Read SP /dev/sdc Superblock
2024-12-26 18:55:32,2639 ERROR IOMgr spinit.cc:310 SP SP1:/dev/sdc online failed, it was previously marked with disk ERROR: I/O time out error, 110. To bring it online first repair the SP using fsck utility.
2024-12-26 18:55:32,2639 INFO IOMgr spinit.cc:51 Storage Pool DeInit()
2024-12-26 18:55:32,2639 INFO IOMgr spserver.cc:1001 &amp;lt; SPOnline ctx 0x558b4f3b4000 err 110&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;mrconfig sp list
ListSPs resp: status 0:3
No. of SPs (3), totalsize 3234281 MiB, totalfree 1122840 MiB

SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc
SP 1: name SP3, Online, size 1667316 MiB, free 595549 MiB, path /dev/sdb
SP 2: name SP4, Online, size 1566964 MiB, free 527290 MiB, path /dev/sde

mrconfig disk list
ListDisks resp: status 0 count=4
ListDisks /dev/sdc
size 1843200MB
	DG 0: Single SingleDisk1 Online
	DG 1: Raid0 Stripe1-2 Online
	DG 2: Concat Concat1-3 Online
	SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc
ListDisks /dev/sdb
size 1740800MB
	DG 0: Single SingleDisk5 Online
	DG 1: Concat Concat5-2 Online
	SP 0: name SP3, Online, size 1667316 MiB, free 595549 MiB, path /dev/sdb
ListDisks /dev/sde
size 1638400MB
	DG 0: Single SingleDisk6 Online
	DG 1: Concat Concat6-2 Online
	SP 0: name SP4, Online, size 1566964 MiB, free 527290 MiB, path /dev/sde
ListDisks /dev/sdd
size 1843200MB
	DG 0: Single SingleDisk3 Online
	DG 1: Raid0 Stripe1-2 Online
	SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc&lt;/LI-CODE&gt;&lt;P&gt;We believe those disks are healthy now and that there is no actual failure on them (although there still could be).&lt;BR /&gt;We are using Ezmeral Data Fabric v7.3 with a single CLDB node.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;How can we unmark those disks and try to restart the cluster? Is there any risk of losing data by doing this?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jan 2025 09:09:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</guid>
      <dc:creator>filip_novak</dc:creator>
      <dc:date>2025-01-01T09:09:30Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232153#M858</link>
      <description>&lt;P&gt;Hi, there's no need to remove the disks.&lt;/P&gt;&lt;P&gt;If fsck fails with 'Device or resource busy', the storage pool first needs to be either taken offline or unloaded:&lt;/P&gt;&lt;P&gt;mrconfig sp offline /dev/sdc&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;mrconfig sp unload SP1&lt;/P&gt;&lt;P&gt;One of these should allow fsck -r to complete; then use&lt;/P&gt;&lt;P&gt;mrconfig sp refresh&lt;/P&gt;&lt;P&gt;to bring it back online.&lt;/P&gt;
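&lt;P&gt;A minimal sketch of that sequence, assuming the failed pool is SP1 on /dev/sdc as shown in your output (adjust the SP name and device path to your layout):&lt;/P&gt;&lt;PRE&gt;# take the storage pool out of service so fsck can open its devices
/opt/mapr/server/mrconfig sp offline /dev/sdc
# or, alternatively: /opt/mapr/server/mrconfig sp unload SP1

# read-only check first, then repair the SP
/opt/mapr/server/fsck -n SP1
/opt/mapr/server/fsck -n SP1 -r

# reload the disktab to bring the repaired SP back online, then verify
/opt/mapr/server/mrconfig sp refresh
/opt/mapr/server/mrconfig sp list&lt;/PRE&gt;</description>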
      <pubDate>Tue, 31 Dec 2024 15:21:55 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232153#M858</guid>
      <dc:creator>ldarby</dc:creator>
      <dc:date>2024-12-31T15:21:55Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232445#M859</link>
      <description>&lt;P&gt;&lt;a href="https://community.hpe.com/t5/user/viewprofilepage/user-id/2052722"&gt;@ldarby&lt;/a&gt;&amp;nbsp;Thank you for the advice!&lt;BR /&gt;We ran&lt;/P&gt;&lt;PRE&gt;/opt/mapr/server/fsck -n SP1&lt;/PRE&gt;&lt;P&gt;without the &lt;FONT face="andale mono,times"&gt;&lt;STRONG&gt;-r&lt;/STRONG&gt;&lt;/FONT&gt; option first, to see the problem:&lt;/P&gt;&lt;PRE&gt;2025-01-06 12:03:06,5367 INFO CacheMgr cachemgr.cc:3517 cachePercentagesIn: inode:0:log:0:meta:2:dir:0:small:0:db:0:valc:0
2025-01-06 12:03:06,5367 INFO CacheMgr cachemgr.cc:3533 CacheSize 74126 MB, inode:0:meta:2:dir:0:small:0:large:98:db:0:valc:0:spillc:0:segmc:0
2025-01-06 12:03:07,1887 INFO CacheMgr cachemgr.cc:3243 lru   meta 0: start      1 end 189762 blocks 189762 [1482M], dirtyquota  75904 [ 593M]
2025-01-06 12:03:07,2472 INFO CacheMgr cachemgr.cc:3243 lru  large 2: start 189763 end 9488128 blocks 9298366 [72643M], dirtyquota 8368529 [65379M]
2025-01-06 12:03:07,2524 INFO CacheMgr cachemgr.cc:3610 BlockCacheCount 9488128
2025-01-06 12:03:42,4551 INFO IO iodispatch.cc:110 using IO maxEvents: 5000
2025-01-06 12:03:42,4553 INFO IOMgr iomgr.cc:363 maxSlowIOs 30, slowDiskTimeOut 240 s, maxOutstandingIOsPerDisk 50000, MaxStoragePools 129, port 0, isDARE 0
fsck.cc:656 Repair flag: 0
iomgr.cc:3069 found 4 disks in disktab
lun.cc:1127 Loading disk:/dev/sdc
lun.cc:1139 /dev/sdc LoadDisk 0x55a97ecfb200 retry 0
lun.cc:838 disk /dev/sdc numaid -1
lun.cc:775 Disk Open /dev/sdc isSSD_ initialized to 0
lun.cc:1127 Loading disk:/dev/sdd
lun.cc:1139 /dev/sdd LoadDisk 0x55a97ecfb608 retry 0
lun.cc:838 disk /dev/sdd numaid -1
lun.cc:775 Disk Open /dev/sdd isSSD_ initialized to 0
lun.cc:1127 Loading disk:/dev/sdb
lun.cc:1139 /dev/sdb LoadDisk 0x55a97ecfba10 retry 0
lun.cc:735 target device open /dev/sdb failed: Device or resource busy, errno 16
lun.cc:1143 OnlineDisk /dev/sdb failed Device or resource busy, errno 16
lun.cc:1127 Loading disk:/dev/sde
lun.cc:1139 /dev/sde LoadDisk 0x55a97ecfbe18 retry 0
lun.cc:735 target device open /dev/sde failed: Device or resource busy, errno 16
lun.cc:1143 OnlineDisk /dev/sde failed Device or resource busy, errno 16
lun.cc:1318 Disk /dev/sdc, Loading concat DG Concat1-3 readystate(0)
iomgr.cc:1807 SP SP1 found on disk /dev/sdc
lun.cc:1435 /dev/sdc Disk Loaded
lun.cc:1436 Disk /dev/sdc loaded numRecords 3
lun.cc:1318 Disk /dev/sdd, Loading concat DG Concat1-3 readystate(1)
lun.cc:1374 DG already added to sptable
lun.cc:1435 /dev/sdd Disk Loaded
lun.cc:1436 Disk /dev/sdd loaded numRecords 2
12:03:50 phase1.cc:39 ERROR FSERR Superblock is marked with error 110
phase2.cc:725 start orphanage container processing
phase2.cc:2367 WalkContainer 64: rw 64 inodes 333824 clus 1304 rblock 0x1013419 size 85458944: con 1 of 276
phase2.cc:746 done orphanage container processing
phase2.cc:929 runningSnapChainWalks 7 maxSnapChains 0 maxInodeScans 35, numInodeScansPerContainer 5
phase2.cc:2367 WalkContainer 2052: rw 2052 inodes 256 clus 1 rblock 0x0 size 65536: con 8 of 276
phase2.cc:2367 WalkContainer 2067: rw 2067 inodes 256 clus 1 rblock 0x0 size 65536: con 8 of 276&lt;/PRE&gt;&lt;P&gt;...&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;phase2.cc:2367 WalkContainer 3624: rw 3624 inodes 4096 clus 16 rblock 0x5460048 size 1048576: con 275 of 276
phase2.cc:2367 WalkContainer 3634: rw 3634 inodes 4096 clus 16 rblock 0x10cc0380 size 1048576: con 276 of 276
12:04:24 fsck.cc:542 FSCK start time(1736157830 | 348397)
fsck.cc:544 FSCK end time(1736157864 | 639374)
fsck.cc:545 FSCK time taken: 34 sec
fsck.cc:551 FSCK read-ahead stats: t-was: 1155, i-was: 0, y-was: 128, n-was: 0, btd: 18, btr: 18, dd: 28528, dr: 16027
fsck.cc:554 FSCK cache stats: lu: 2802772, mi: 1453250
fsck.cc:561 FSCK IO stats: reads: 1410027, readBlocks: 1500527 writes: 2331, writeBlocks: 27106
alloc.cc:297 Number of Data blocks 310818696 shared 0
alloc.cc:297 Number of Inode blocks 48689 shared 0
alloc.cc:297 Number of Orphanage blocks 0 shared 0
alloc.cc:297 Number of BTreeIntr blocks 52371 shared 0
alloc.cc:297 Number of BTreeLeaf blocks 1317911 shared 0
alloc.cc:297 Number of Log blocks 51200 shared 0
alloc.cc:297 Number of BlockBitmap blocks 14400 shared 0
alloc.cc:297 Number of SPMetaBlock blocks 67 shared 0
alloc.cc:297 Number of DGPrivate blocks 0 shared 0
alloc.cc:297 Number of Fidmap blocks 0 shared 0
alloc.cc:297 Number of Misc blocks 260 shared 0
alloc.cc:297 Number of SymLink blocks 0 shared 0
alloc.cc:297 Number of Unknown blocks 0 shared 0
alloc.cc:303 Total Number of blocks 312303594 shared 0 crc checked 722
fsck.cc:570 errorsInFsck = 1
fsck.cc:576 ERROR
FSCK completed with errors.&lt;/PRE&gt;&lt;P&gt;So the Superblock was marked with error 110 (a timeout, I guess):&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;12:03:50 phase1.cc:39 ERROR FSERR Superblock is marked with error 110&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Is it safe to run &lt;FONT face="andale mono,times"&gt;fsck -r&lt;/FONT&gt; now?&lt;/P&gt;&lt;P&gt;Also, there is no &lt;FONT face="andale mono,times"&gt;faileddisk.log&lt;/FONT&gt; inside &lt;FONT face="andale mono,times"&gt;/opt/mapr/logs&lt;/FONT&gt;; does &lt;FONT face="andale mono,times"&gt;fsck&lt;/FONT&gt; delete it?&lt;/P&gt;&lt;P&gt;I have also checked mfs.conf, and at the bottom I saw:&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times"&gt;mfs.on.virtual.machine=0&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;But we are running the nodes on Proxmox VMs; should I change it to 1 on all nodes?&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 14:29:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232445#M859</guid>
      <dc:creator>filip_novak</dc:creator>
      <dc:date>2025-01-07T14:29:51Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232453#M860</link>
      <description>&lt;P&gt;&lt;a href="https://community.hpe.com/t5/user/viewprofilepage/user-id/2380377"&gt;@filip_novak&lt;/a&gt;&amp;nbsp;Since the virtual disk issue is resolved now, you should reboot the node and observe whether you are still getting "I/O time out" error messages. If you are, verify that there is no issue at the disk or OS level. If the team has confirmed there is no issue at the disk or OS level and you are still seeing the same error message, then try running fsck with the "-r" option for SP1. Before running the fsck command, please make sure that the "CID:1" replicas are available and fully resynced.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Vineet&lt;/P&gt;
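&lt;P&gt;A quick way to check the disk/OS side after the reboot (a sketch using generic Linux commands, not Data Fabric specific; adjust the device names to your layout):&lt;/P&gt;&lt;PRE&gt;# look for fresh I/O errors on the affected devices
dmesg -T | grep -iE 'sd[bcd]|i/o error'

# confirm the devices are no longer read-only (0 = read-write, 1 = read-only)
blockdev --getro /dev/sdb
blockdev --getro /dev/sdc
blockdev --getro /dev/sdd

# non-destructive read test; this does not modify the disks
dd if=/dev/sdc of=/dev/null bs=1M count=256&lt;/PRE&gt;</description>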
      <pubDate>Tue, 07 Jan 2025 20:59:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232453#M860</guid>
      <dc:creator>VineetKumar</dc:creator>
      <dc:date>2025-01-07T20:59:40Z</dc:date>
    </item>
  </channel>
</rss>

