<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic EDF 7.3: cldb CID1 container disk marked as failed IO Timeout in HPE Ezmeral Software platform</title>
    <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</link>
    <description>EDF 7.3: CLDB CID1 container disk marked as failed with an I/O timeout in the HPE Ezmeral Software platform. Two disks on the single CLDB control node, holding the Name Container (CID1), were marked as failed by handle_disk_failure.sh after a Proxmox/Ceph virtual-disk issue, and CLDB cannot start without CID1.</description>
    <pubDate>Wed, 01 Jan 2025 09:09:30 GMT</pubDate>
    <dc:creator>filip_novak</dc:creator>
    <dc:date>2025-01-01T09:09:30Z</dc:date>
    <item>
      <title>EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;&lt;P&gt;We have a problem with the disks used for &lt;STRONG&gt;maprfs&lt;/STRONG&gt;. After a mistake with Ceph key creation for the virtual disks in Proxmox, some disks were read-only for a while but then returned to normal read-write status. However, two disks on one control node, which hold &lt;STRONG&gt;CLDB&lt;/STRONG&gt; and the &lt;STRONG&gt;Name Container (CID1)&lt;/STRONG&gt;, were marked as failed by the MapR&amp;nbsp;&lt;STRONG&gt;handle_disk_failure.sh&lt;/STRONG&gt; script, and &lt;STRONG&gt;CLDB&lt;/STRONG&gt; cannot start without &lt;STRONG&gt;CID1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;############################ Disk Failure Report ###########################

Disk                :    sdc
Failure Reason      :    I/O time out
Time of Failure     :    Thu 26 Dec 2024 01:27:49 AM EET
Resolution          :
   Please refer to Data Fabric online documentation at https://docs.datafabric.hpe.com/home/AdministratorGuide/Managing-Disks.html
   on how to handle disk failures. If you have further questions, please either post on https://community.datafabric.hpe.com/s/
   or contact Data Fabric technical support.

############################ Disk Failure Report ###########################

Disk                :    sdb
Failure Reason      :    I/O time out
Time of Failure     :    Thu 26 Dec 2024 01:33:55 AM EET
Resolution          :
   Please refer to Data Fabric online documentation at https://docs.datafabric.hpe.com/home/AdministratorGuide/Managing-Disks.html
   on how to handle disk failures. If you have further questions, please either post on https://community.datafabric.hpe.com/s/
   or contact Data Fabric technical support.&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There is a suggestion to use &lt;STRONG&gt;/opt/mapr/server/fsck&lt;/STRONG&gt; to check the disks, but as I understand it we would first need to remove them with &lt;STRONG&gt;mrconfig disk remove&lt;/STRONG&gt;, and that could cause the loss of &lt;STRONG&gt;CID1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;When we try to bring the SP back online, we get this in mfs.log-3:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1764 SP1:/dev/sdc on DG Concat1-3 consists of 2 disks:
2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1767 SP1:0 /dev/sdc
2024-12-26 18:55:32,2630 INFO IOMgr iomgr.cc:1767 SP1:1 /dev/sdd
2024-12-26 18:55:32,2630 INFO IOMgr spinit.cc:251 Read SP /dev/sdc Superblock
2024-12-26 18:55:32,2639 ERROR IOMgr spinit.cc:310 SP SP1:/dev/sdc online failed, it was previously marked with disk ERROR: I/O time out error, 110. To bring it online first repair the SP using fsck utility.
2024-12-26 18:55:32,2639 INFO IOMgr spinit.cc:51 Storage Pool DeInit()
2024-12-26 18:55:32,2639 INFO IOMgr spserver.cc:1001 &amp;lt; SPOnline ctx 0x558b4f3b4000 err 110&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;mrconfig sp list
ListSPs resp: status 0:3
No. of SPs (3), totalsize 3234281 MiB, totalfree 1122840 MiB

SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc
SP 1: name SP3, Online, size 1667316 MiB, free 595549 MiB, path /dev/sdb
SP 2: name SP4, Online, size 1566964 MiB, free 527290 MiB, path /dev/sde

mrconfig disk list
ListDisks resp: status 0 count=4
ListDisks /dev/sdc
size 1843200MB
	DG 0: Single SingleDisk1 Online
	DG 1: Raid0 Stripe1-2 Online
	DG 2: Concat Concat1-3 Online
	SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc
ListDisks /dev/sdb
size 1740800MB
	DG 0: Single SingleDisk5 Online
	DG 1: Concat Concat5-2 Online
	SP 0: name SP3, Online, size 1667316 MiB, free 595549 MiB, path /dev/sdb
ListDisks /dev/sde
size 1638400MB
	DG 0: Single SingleDisk6 Online
	DG 1: Concat Concat6-2 Online
	SP 0: name SP4, Online, size 1566964 MiB, free 527290 MiB, path /dev/sde
ListDisks /dev/sdd
size 1843200MB
	DG 0: Single SingleDisk3 Online
	DG 1: Raid0 Stripe1-2 Online
	SP 0: name SP1, Offline, size 3686398 MiB, free 0 MiB, path /dev/sdc&lt;/LI-CODE&gt;&lt;P&gt;We believe those disks are healthy now and that there is no actual failure on them (although there still could be).&lt;BR /&gt;We are using Ezmeral Data Fabric v7.3 with a single CLDB node.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;How can we unmark those disks and try to restart the cluster? Is there any risk of losing data by doing this?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jan 2025 09:09:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232142#M857</guid>
      <dc:creator>filip_novak</dc:creator>
      <dc:date>2025-01-01T09:09:30Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232153#M858</link>
      <description>&lt;P&gt;Hi, there's no need to remove the disks.&lt;/P&gt;&lt;P&gt;If fsck fails with 'Device or resource busy', the storage pool first needs to be either taken offline or unloaded:&lt;/P&gt;&lt;P&gt;mrconfig sp offline /dev/sdc&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;mrconfig sp unload SP1&lt;/P&gt;&lt;P&gt;One of these should allow fsck -r to complete; then use&lt;/P&gt;&lt;P&gt;mrconfig sp refresh&lt;/P&gt;&lt;P&gt;to bring it back online.&lt;/P&gt;
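&lt;P&gt;A minimal sketch of that sequence, assuming the failed pool is SP1 on /dev/sdc as shown in your output (adjust the SP name and device path to your layout):&lt;/P&gt;&lt;PRE&gt;# take the storage pool out of service so fsck can open its devices
/opt/mapr/server/mrconfig sp offline /dev/sdc
# or, alternatively: /opt/mapr/server/mrconfig sp unload SP1

# read-only check first, then repair the SP
/opt/mapr/server/fsck -n SP1
/opt/mapr/server/fsck -n SP1 -r

# reload the disktab to bring the repaired SP back online, then verify
/opt/mapr/server/mrconfig sp refresh
/opt/mapr/server/mrconfig sp list&lt;/PRE&gt;</description>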
      <pubDate>Tue, 31 Dec 2024 15:21:55 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232153#M858</guid>
      <dc:creator>ldarby</dc:creator>
      <dc:date>2024-12-31T15:21:55Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232445#M859</link>
      <description>&lt;P&gt;&lt;a href="https://community.hpe.com/t5/user/viewprofilepage/user-id/2052722"&gt;@ldarby&lt;/a&gt;&amp;nbsp;Thank you for the advice!&lt;BR /&gt;We ran&lt;/P&gt;&lt;PRE&gt;/opt/mapr/server/fsck -n SP1&lt;/PRE&gt;&lt;P&gt;without the &lt;FONT face="andale mono,times"&gt;&lt;STRONG&gt;-r&lt;/STRONG&gt;&lt;/FONT&gt; option first, to see the problem:&lt;/P&gt;&lt;PRE&gt;2025-01-06 12:03:06,5367 INFO CacheMgr cachemgr.cc:3517 cachePercentagesIn: inode:0:log:0:meta:2:dir:0:small:0:db:0:valc:0
2025-01-06 12:03:06,5367 INFO CacheMgr cachemgr.cc:3533 CacheSize 74126 MB, inode:0:meta:2:dir:0:small:0:large:98:db:0:valc:0:spillc:0:segmc:0
2025-01-06 12:03:07,1887 INFO CacheMgr cachemgr.cc:3243 lru   meta 0: start      1 end 189762 blocks 189762 [1482M], dirtyquota  75904 [ 593M]
2025-01-06 12:03:07,2472 INFO CacheMgr cachemgr.cc:3243 lru  large 2: start 189763 end 9488128 blocks 9298366 [72643M], dirtyquota 8368529 [65379M]
2025-01-06 12:03:07,2524 INFO CacheMgr cachemgr.cc:3610 BlockCacheCount 9488128
2025-01-06 12:03:42,4551 INFO IO iodispatch.cc:110 using IO maxEvents: 5000
2025-01-06 12:03:42,4553 INFO IOMgr iomgr.cc:363 maxSlowIOs 30, slowDiskTimeOut 240 s, maxOutstandingIOsPerDisk 50000, MaxStoragePools 129, port 0, isDARE 0
fsck.cc:656 Repair flag: 0
iomgr.cc:3069 found 4 disks in disktab
lun.cc:1127 Loading disk:/dev/sdc
lun.cc:1139 /dev/sdc LoadDisk 0x55a97ecfb200 retry 0
lun.cc:838 disk /dev/sdc numaid -1
lun.cc:775 Disk Open /dev/sdc isSSD_ initialized to 0
lun.cc:1127 Loading disk:/dev/sdd
lun.cc:1139 /dev/sdd LoadDisk 0x55a97ecfb608 retry 0
lun.cc:838 disk /dev/sdd numaid -1
lun.cc:775 Disk Open /dev/sdd isSSD_ initialized to 0
lun.cc:1127 Loading disk:/dev/sdb
lun.cc:1139 /dev/sdb LoadDisk 0x55a97ecfba10 retry 0
lun.cc:735 target device open /dev/sdb failed: Device or resource busy, errno 16
lun.cc:1143 OnlineDisk /dev/sdb failed Device or resource busy, errno 16
lun.cc:1127 Loading disk:/dev/sde
lun.cc:1139 /dev/sde LoadDisk 0x55a97ecfbe18 retry 0
lun.cc:735 target device open /dev/sde failed: Device or resource busy, errno 16
lun.cc:1143 OnlineDisk /dev/sde failed Device or resource busy, errno 16
lun.cc:1318 Disk /dev/sdc, Loading concat DG Concat1-3 readystate(0)
iomgr.cc:1807 SP SP1 found on disk /dev/sdc
lun.cc:1435 /dev/sdc Disk Loaded
lun.cc:1436 Disk /dev/sdc loaded numRecords 3
lun.cc:1318 Disk /dev/sdd, Loading concat DG Concat1-3 readystate(1)
lun.cc:1374 DG already added to sptable
lun.cc:1435 /dev/sdd Disk Loaded
lun.cc:1436 Disk /dev/sdd loaded numRecords 2
12:03:50 phase1.cc:39 ERROR FSERR Superblock is marked with error 110
phase2.cc:725 start orphanage container processing
phase2.cc:2367 WalkContainer 64: rw 64 inodes 333824 clus 1304 rblock 0x1013419 size 85458944: con 1 of 276
phase2.cc:746 done orphanage container processing
phase2.cc:929 runningSnapChainWalks 7 maxSnapChains 0 maxInodeScans 35, numInodeScansPerContainer 5
phase2.cc:2367 WalkContainer 2052: rw 2052 inodes 256 clus 1 rblock 0x0 size 65536: con 8 of 276
phase2.cc:2367 WalkContainer 2067: rw 2067 inodes 256 clus 1 rblock 0x0 size 65536: con 8 of 276&lt;/PRE&gt;&lt;P&gt;...&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;phase2.cc:2367 WalkContainer 3624: rw 3624 inodes 4096 clus 16 rblock 0x5460048 size 1048576: con 275 of 276
phase2.cc:2367 WalkContainer 3634: rw 3634 inodes 4096 clus 16 rblock 0x10cc0380 size 1048576: con 276 of 276
12:04:24 fsck.cc:542 FSCK start time(1736157830 | 348397)
fsck.cc:544 FSCK end time(1736157864 | 639374)
fsck.cc:545 FSCK time taken: 34 sec
fsck.cc:551 FSCK read-ahead stats: t-was: 1155, i-was: 0, y-was: 128, n-was: 0, btd: 18, btr: 18, dd: 28528, dr: 16027
fsck.cc:554 FSCK cache stats: lu: 2802772, mi: 1453250
fsck.cc:561 FSCK IO stats: reads: 1410027, readBlocks: 1500527 writes: 2331, writeBlocks: 27106
alloc.cc:297 Number of Data blocks 310818696 shared 0
alloc.cc:297 Number of Inode blocks 48689 shared 0
alloc.cc:297 Number of Orphanage blocks 0 shared 0
alloc.cc:297 Number of BTreeIntr blocks 52371 shared 0
alloc.cc:297 Number of BTreeLeaf blocks 1317911 shared 0
alloc.cc:297 Number of Log blocks 51200 shared 0
alloc.cc:297 Number of BlockBitmap blocks 14400 shared 0
alloc.cc:297 Number of SPMetaBlock blocks 67 shared 0
alloc.cc:297 Number of DGPrivate blocks 0 shared 0
alloc.cc:297 Number of Fidmap blocks 0 shared 0
alloc.cc:297 Number of Misc blocks 260 shared 0
alloc.cc:297 Number of SymLink blocks 0 shared 0
alloc.cc:297 Number of Unknown blocks 0 shared 0
alloc.cc:303 Total Number of blocks 312303594 shared 0 crc checked 722
fsck.cc:570 errorsInFsck = 1
fsck.cc:576 ERROR
FSCK completed with errors.&lt;/PRE&gt;&lt;P&gt;So the Superblock was marked with error 110 (a timeout, I guess):&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;12:03:50 phase1.cc:39 ERROR FSERR Superblock is marked with error 110&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Is it safe to run &lt;FONT face="andale mono,times"&gt;fsck -r&lt;/FONT&gt; now?&lt;/P&gt;&lt;P&gt;Also, there is no &lt;FONT face="andale mono,times"&gt;faileddisk.log&lt;/FONT&gt; inside &lt;FONT face="andale mono,times"&gt;/opt/mapr/logs&lt;/FONT&gt;; does &lt;FONT face="andale mono,times"&gt;fsck&lt;/FONT&gt; delete it?&lt;/P&gt;&lt;P&gt;I have also checked mfs.conf, and at the bottom I saw:&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times"&gt;mfs.on.virtual.machine=0&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;But we are running the nodes on Proxmox VMs; should I change it to 1 on all nodes?&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 14:29:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232445#M859</guid>
      <dc:creator>filip_novak</dc:creator>
      <dc:date>2025-01-07T14:29:51Z</dc:date>
    </item>
    <item>
      <title>Re: EDF 7.3: cldb CID1 container disk marked as failed IO Timeout</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232453#M860</link>
      <description>&lt;P&gt;&lt;a href="https://community.hpe.com/t5/user/viewprofilepage/user-id/2380377"&gt;@filip_novak&lt;/a&gt;&amp;nbsp;Since the virtual disk issue is resolved now, you should reboot the node and observe whether you are still getting "I/O time out" error messages. If you are, verify that there is no issue at the disk or OS level. If the team has confirmed there is no issue at the disk or OS level and you are still seeing the same error message, then try running fsck with the "-r" option for SP1. Before running the fsck command, please make sure that the "CID:1" replicas are available and fully resynced.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Vineet&lt;/P&gt;
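&lt;P&gt;A quick way to check the disk/OS side after the reboot (a sketch using generic Linux commands, not Data Fabric specific; adjust the device names to your layout):&lt;/P&gt;&lt;PRE&gt;# look for fresh I/O errors on the affected devices
dmesg -T | grep -iE 'sd[bcd]|i/o error'

# confirm the devices are no longer read-only (0 = read-write, 1 = read-only)
blockdev --getro /dev/sdb
blockdev --getro /dev/sdc
blockdev --getro /dev/sdd

# non-destructive read test; this does not modify the disks
dd if=/dev/sdc of=/dev/null bs=1M count=256&lt;/PRE&gt;</description>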
      <pubDate>Tue, 07 Jan 2025 20:59:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/edf-7-3-cldb-cid1-container-disk-marked-as-failed-io-timeout/m-p/7232453#M860</guid>
      <dc:creator>VineetKumar</dc:creator>
      <dc:date>2025-01-07T20:59:40Z</dc:date>
    </item>
  </channel>
</rss>

