06-05-2024 12:11 AM
Unable to connect to any of the cluster's CLDBs
Hi,
I have an EDF 7.3 cluster with 8 nodes. The CLDB service was running on the first 3 nodes. Now I am getting the errors below when I try to initialize a MapR ticket and when I access HDFS.
[mapr@node ~]$ maprlogin password
[Password for user 'mapr' at cluster 'cluster.ezm.tst': ]
Unable to connect to any of the cluster's CLDBs. CLDBs tried: node2.ezm.tst:7443, node3.ezm.tst:7443, node1.ezm.tst:7443. Please check your cluster configuration.
[mapr@node ~]$ hdfs dfs -ls
ls: Could not create FileClient err: 104
I have checked the cldb.log and cldb.out files; their contents are below, respectively.
java.lang.Exception: Username in ticket file doesn't match with cluster owner
at com.mapr.fs.cldb.CLDBServer.initSecurity(CLDBServer.java:1605)
at com.mapr.fs.cldb.CLDBServer.<init>(CLDBServer.java:523)
at com.mapr.fs.cldb.CLDBServerHolder.getInstance(CLDBServerHolder.java:24)
at com.mapr.fs.cldb.CLDB.<init>(CLDB.java:76)
at com.mapr.fs.cldb.CLDB.main(CLDB.java:411)
2024-06-05 12:05:37,5476 :1596 Obtained CLDB key from PKCS#11 file store
CLDBJNI: Initializing cldb jni with memory 838860800 estContainerSize:144 maxContainersInCache:5825422 mapr-version: $Id: mapr-version: 7.3.0.0.20230425002320.GA 35c1bacac83b999156e2572f2619da84fe2e225e $
fs/common/daremgr.cc:189: HSM enabled, but DARE key not found on HSM. Check log for details
I am using the same mapr user to initialize the ticket that I used while creating the setup. Can anyone please help me bring up the CLDB service?
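In case it helps narrow this down, I believe the checks relevant to the "Username in ticket file doesn't match with cluster owner" message are something like the following (assuming the default secure-cluster file locations; please correct me if I should look elsewhere):
# which user owns the server ticket the CLDB daemon reads
maprlogin print -ticketfile /opt/mapr/conf/maprserverticket
# the configured cluster-owner (daemon) user
cat /opt/mapr/conf/daemon.conf
# UID/GID of the cluster user on this node
id mapr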
06-05-2024 01:17 AM
Re: Unable to connect to any of the cluster's CLDBs
Hello,
Let's check cluster health.
1. Are the SPs (storage pools) online on all the nodes?
/opt/mapr/server/mrconfig sp list -v
2. Is the cluster user present on all the nodes?
grep CMD /opt/mapr/logs/configure.log
3. How many nodes are in the cluster?
4. Were any configuration changes made recently? (See the sketch below for items 3 and 4.)
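For items 3 and 4, a rough sketch of what to run (adjust as needed for your environment):
# node count and the services on each node, if the CLDB responds
maprcli node list -columns hostname,svc
# recently modified configuration files and recent configure.sh runs
ls -lt /opt/mapr/conf | head
tail -n 20 /opt/mapr/logs/configure.log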
I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

06-05-2024 10:22 PM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
1. Only one node shows the message below:
[mapr@node1 ~]$ /opt/mapr/server/mrconfig sp list -v
ListSPs resp: status 0:1
No. of SPs (1), totalsize 290266 MiB, totalfree 288052 MiB
SP 0: name SP1, Online, size 290266 MiB, free 288052 MiB, path /dev/sdc, log 200 MiB, port 5660, guid c015e6bc572b86630065800c66090a38, clusterUuid -5881150811843788447--6420490077672223050, disks /dev/sdc /dev/sdb /dev/sdd, dare 0, label default:0
All other nodes show:
[mapr@node8 ~]$ /opt/mapr/server/mrconfig sp list -v
2024-06-06 10:00:10,1077 ERROR Global mrconfig.cc:782 ListSPs rpc failed Connection reset by peer.(104).
2024-06-06 10:00:10,1078 ERROR Global mrconfig.cc:10539 ProcessSPList failed Connection reset by peer.(104).
2. Yes, the cluster user (mapr) is present on all nodes.
[mapr@node8 ~]$ grep CMD /opt/mapr/logs/configure.log
2023-12-18 15:20:31.46 node8.ezm.tst configure.sh(14967) Install main:4180 CMD: /opt/mapr/server/configure.sh -N cluster.ezm.tst -u mapr -g mapr -f -no-autostart -on-prompt-cont y -secure -v -no-autostart -HS node4.ezm.tst -OT node5.ezm.tst,node6.ezm.tst,node7.ezm.tst -C node1.ezm.tst,node2.ezm.tst,node3.ezm.tst -Z node1.ezm.tst,node2.ezm.tst,node3.ezm.tst -EC -hiveMetastoreHost node4.ezm.tst
2023-12-18 15:34:06.796 node8.ezm.tst configure.sh(31059) Install main:4180 CMD: /opt/mapr/server/configure.sh -R -v -no-autostart -HS node4.ezm.tst -OT node5.ezm.tst,node6.ezm.tst,node7.ezm.tst -EPcollectd -all -EC -hiveMetastoreHost node4.ezm.tst
2023-12-27 09:34:39.842 node8.ezm.tst configure.sh(27919) Install main:4180 CMD: /opt/mapr/server/configure.sh --noRecalcMem -R
2023-12-27 09:45:55.252 node8.ezm.tst configure.sh(8256) Install main:4180 CMD: /opt/mapr/server/configure.sh -R -v -no-autostart -HS node4.ezm.tst -OT node5.ezm.tst,node6.ezm.tst,node7.ezm.tst -EPcollectd -all -EC -hiveMetastoreHost node4.ezm.tst
3. The cluster contains 8 nodes in total.
4. No configuration changes were made recently.
06-05-2024 11:02 PM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
Hi
Can you share the output of the commands below?
grep "ZK-Connect" /opt/mapr/logs/cldb.log
grep "FATAL" /opt/mapr/logs/cldb.log
Also, please confirm that the mapr user is present on all the nodes and whether the UID of the mapr user has changed.
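For example, run these on each node and compare the output across the cluster:
id mapr
getent passwd mapr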
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

06-05-2024 11:30 PM
Re: Unable to connect to any of the cluster's CLDBs
Hello,
If this is the output of "/opt/mapr/server/mrconfig sp list -v" on all 7 of those nodes, then there is a high chance that CID 1 (the primary container) is down, because the SPs are down on those 7 nodes.
And in all other nodes, it is showing as
[mapr@node8 ~]$ /opt/mapr/server/mrconfig sp list -v
2024-06-06 10:00:10,1077 ERROR Global mrconfig.cc:782 ListSPs rpc failed Connection reset by peer.(104).
2024-06-06 10:00:10,1078 ERROR Global mrconfig.cc:10539 ProcessSPList failed Connection reset by peer.(104).
Request: could you please share the output of the commands below as well, along with the details already requested, so we can check this further?
Run the commands below on all the CLDB nodes:
#maprcli dump cldbstate
#/opt/mapr/server/mrconfig info dumpcontainers | grep "cid:1"
And the following on all the ZooKeeper nodes:
#/opt/mapr/initscripts/zookeeper qstatus
#/opt/mapr/initscripts/zookeeper status
Note: what is the status of Warden and ZooKeeper on all 7 nodes?
#systemctl status mapr-zookeeper
#systemctl status mapr-warden
Thanks,
I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

06-06-2024 12:01 AM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
[mapr@node1 ~]$ grep "ZK-Connect" /opt/mapr/logs/cldb.log
2023-12-18 14:41:16,343 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-18 14:41:36,345 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient : No KvStore Epoch info found in ZooKeeper. New Installation, becoming Master
2023-12-18 14:41:36,345 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_MASTER_LOCK
2023-12-18 14:41:36,349 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: CLDB is current Master
2023-12-18 14:41:36,349 INFO ZooKeeperClient [ZK-Connect]: CLDB became master. Creating new KvStoreContainer with no fileservers for cid: 1
2023-12-18 14:41:36,352 INFO ZooKeeperClient [ZK-Connect]: Storing KvStoreContainerInfo to ZooKeeper Container ID:1 Servers: Inactive: Unused: Epoch:3 SizeMB:0 CType:NameSpaceContainer
2023-12-18 14:41:36,366 INFO ZooKeeperClient [ZK-Connect]: CLDB became master. Initializing KvStoreContainer for cid: 1
2023-12-18 14:41:36,369 INFO ZooKeeperClient [ZK-Connect]: becomeMasterForKvStoreContainer: CID 1 servers info Container ID:1 Servers: Inactive: Unused: Epoch:3 SizeMB:0 CType:NameSpaceContainer
2023-12-18 14:41:36,369 INFO ZooKeeperClient [ZK-Connect]: Storing KvStoreContainerInfo to ZooKeeper Container ID:1 Servers: Inactive: Unused: Epoch:3 SizeMB:0 CType:NameSpaceContainer
2023-12-18 14:41:36,371 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from INITIALIZE to MASTER_REGISTER_READY
2023-12-18 14:41:36,371 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to AWAITING_FS_REGISTER
2023-12-18 14:41:36,371 INFO CLDBServer [ZK-Connect]: Starting thread to monitor waiting for local kvstore to become master
2023-12-18 15:36:03,285 ERROR CLDB [main-EventThread]: Thread: ZK-Connect ID: 21
2023-12-18 15:37:45,957 INFO CLDBServer [ZK-Connect]: tryBecomeMaster: Waiting for cldb init to complete.
2023-12-18 15:37:48,961 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-18 15:37:48,962 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_MASTER_LOCK
2023-12-18 15:37:48,964 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-18 15:37:48,964 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-18 15:37:48,965 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-18 15:37:48,965 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from INITIALIZE to BECOMING_SLAVE
2023-12-18 15:37:48,965 INFO CLDBServer [ZK-Connect]: Starting thread to become slave CLDB
2023-12-18 15:38:15,252 ERROR CLDB [Becoming Slave Thread]: Thread: ZK-Connect ID: 21
2023-12-18 15:38:47,791 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-18 15:39:07,796 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-18 15:39:07,798 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_MASTER_LOCK
2023-12-18 15:39:07,801 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-18 15:39:07,802 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-18 15:39:07,802 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-18 15:39:07,802 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from INITIALIZE to BECOMING_SLAVE
2023-12-18 15:39:07,802 INFO CLDBServer [ZK-Connect]: Starting thread to become slave CLDB
2023-12-18 15:39:14,431 ERROR CLDB [Becoming Slave Thread]: Thread: ZK-Connect ID: 21
2023-12-18 15:39:46,963 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-18 15:40:06,968 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-18 15:40:06,968 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_MASTER_LOCK
2023-12-18 15:40:06,971 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-18 15:40:06,971 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-18 15:40:06,971 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-18 15:40:06,971 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from INITIALIZE to BECOMING_SLAVE
2023-12-18 15:40:06,972 INFO CLDBServer [ZK-Connect]: Starting thread to become slave CLDB
2023-12-18 15:40:13,539 ERROR CLDB [Becoming Slave Thread]: Thread: ZK-Connect ID: 21
2023-12-18 15:51:20,547 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-18 15:51:40,551 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-18 15:51:40,552 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_MASTER_LOCK
2023-12-18 15:51:40,557 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-18 15:51:40,557 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-18 15:51:40,557 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-18 15:51:40,558 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from INITIALIZE to BECOMING_SLAVE
2023-12-18 15:51:40,558 INFO CLDBServer [ZK-Connect]: Starting thread to become slave CLDB
2023-12-27 09:01:05,202 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-27 09:01:25,204 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-27 09:01:25,204 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from CLDB_IS_SLAVE_READ_ONLY to AWAITING_MASTER_LOCK
2023-12-27 09:01:25,208 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-27 09:01:25,208 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-27 09:01:25,208 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-27 09:01:25,208 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to CLDB_IS_SLAVE_READ_ONLY
2023-12-27 09:01:34,934 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-27 09:01:54,935 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-27 09:01:54,935 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from CLDB_IS_SLAVE_READ_ONLY to AWAITING_MASTER_LOCK
2023-12-27 09:01:54,938 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-27 09:01:54,938 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-27 09:01:54,939 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-27 09:01:54,939 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to CLDB_IS_SLAVE_READ_ONLY
2023-12-29 11:07:26,420 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-29 11:07:46,421 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-29 11:07:46,421 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from CLDB_IS_SLAVE_READ_ONLY to AWAITING_MASTER_LOCK
2023-12-29 11:07:46,422 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-29 11:07:46,423 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-29 11:07:46,423 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-29 11:07:46,423 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to CLDB_IS_SLAVE_READ_ONLY
2023-12-29 17:06:36,196 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-29 17:06:56,197 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-29 17:06:56,197 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from CLDB_IS_SLAVE_READ_ONLY to AWAITING_MASTER_LOCK
2023-12-29 17:06:56,202 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient createActiveEphemeralMasterZNode: /datacenter/controlnodes/cldb/active/CLDBMaster already exists
2023-12-29 17:06:56,202 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: Some other CLDB become master. Current CLDB is Slave
2023-12-29 17:06:56,203 INFO ZooKeeperClient [ZK-Connect]: CLDB got role of slave
2023-12-29 17:06:56,203 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to CLDB_IS_SLAVE_READ_ONLY
2023-12-29 17:07:06,937 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-12-29 17:07:26,940 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore is of latest epoch CLDB trying to become Master
2023-12-29 17:07:26,941 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from CLDB_IS_SLAVE_READ_ONLY to AWAITING_MASTER_LOCK
2023-12-29 17:07:26,949 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: CLDB is current Master
2023-12-29 17:07:26,949 INFO ZooKeeperClient [ZK-Connect]: CLDB became master. Initializing KvStoreContainer for cid: 1
2023-12-29 17:07:26,952 INFO ZooKeeperClient [ZK-Connect]: becomeMasterForKvStoreContainer: CID 1 servers info Container ID:1 Master: 10.1.84.67-5(3162807577909630400) SPGUID:8db905e34208aef00065800c660aea86 Servers: 10.1.84.67-5(3162807577909630400) SPGUID:8db905e34208aef00065800c660aea86 10.1.84.66-5(1017878648718736928) SPGUID:c015e6bc572b86630065800c66090a38 10.1.84.68-5(4530750987536816096) SPGUID:2d9f890c872085ea0065800c66058d47 Inactive: Unused: Epoch:5 SizeMB:0 CType:NameSpaceContainer
2023-12-29 17:07:26,953 INFO ZooKeeperClient [ZK-Connect]: Storing KvStoreContainerInfo to ZooKeeper Container ID:1 Servers: 10.1.84.66-5(1017878648718736928) SPGUID:c015e6bc572b86630065800c66090a38 Inactive: 10.1.84.67-5(3162807577909630400) SPGUID:8db905e34208aef00065800c660aea86 10.1.84.68-5(4530750987536816096) SPGUID:2d9f890c872085ea0065800c66058d47 Unused: Epoch:5 SizeMB:0 CType:NameSpaceContainer
2023-12-29 17:07:26,955 INFO CLDBConfiguration [ZK-Connect]: cldb mode changed from SLAVE_READ_ONLY to MASTER_REGISTER_READY
2023-12-29 17:07:26,955 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_MASTER_LOCK to AWAITING_FS_REGISTER
2023-12-29 17:07:26,955 INFO CLDBServer [ZK-Connect]: Starting thread to monitor waiting for local kvstore to become master
2024-01-09 04:30:16,049 INFO CLDBServer [ZK-Connect]: This CLDB is not currently connected to ZooKeeper. It will try to reestablish a connection to the ZooKeeper ensemble for up to 10000 milliseconds before giving up and shutting down
2024-01-23 11:43:52,998 INFO CLDBServer [ZK-Connect]: This CLDB is not currently connected to ZooKeeper. It will try to reestablish a connection to the ZooKeeper ensemble for up to 10000 milliseconds before giving up and shutting down
2024-01-23 11:43:53,107 ERROR CLDB [main-EventThread]: Thread: ZK-Connect ID: 21
2024-06-05 12:05:39,681 INFO CLDBServer [ZK-Connect]: tryBecomeMaster: Waiting for cldb init to complete.
2024-06-05 12:05:42,685 INFO ZooKeeperClient [ZK-Connect]: ZooKeeperClient: KvStore does not have epoch entry CLDB trying to wait until it is Ready
2024-06-05 12:05:42,686 INFO CldbDiagnostics [ZK-Connect]: cldbState changed from AWAITING_ZK_CONNECT to AWAITING_CID1_EPOCH
2024-06-05 12:05:45,702 INFO ZooKeeperClient [ZK-Connect]: Waiting for local KvStoreContainer to become valid. KvStore ContainerInfo Container ID:1 Servers: 10.1.84.67-75508(3162807577909630400) SPGUID:8db905e34208aef00065800c660aea86 Inactive: 10.1.84.73-75508(5333387396529766432) SPGUID:aa1fb11bae23e61d006580162b0392d6 10.1.84.68-75508(4530750987536816096) SPGUID:2d9f890c872085ea0065800c66058d47 Unused: Epoch:75508 SizeMB:0 CType:NameSpaceContainer CLDB ServerID : 1017878648718736928
2024-06-05 12:07:58,229 WARN ZooKeeperClient [ZK-Connect]: ZooKeeperClient : KvStoreContainerInfo read received connection loss exception. Sleeping for 30 Number of retry left 1
[mapr@node1 ~]$ grep "FATAL" /opt/mapr/logs/cldb.log
2023-12-18 15:36:03,258 FATAL CLDB [main-EventThread]: CLDBShutdown: This CLDB will shutdown now because it was holding the master CLDB lock and received notification from the ZooKeeper ensemble that the lock was deleted
2023-12-18 15:38:15,224 FATAL BecomeSlaveThread [Becoming Slave Thread]: license not found for CLDB HA: shutting down
2023-12-18 15:38:15,224 FATAL CLDB [Becoming Slave Thread]: CLDBShutdown: license not found for CLDB HA: shutting down
2023-12-18 15:39:14,412 FATAL BecomeSlaveThread [Becoming Slave Thread]: license not found for CLDB HA: shutting down
2023-12-18 15:39:14,413 FATAL CLDB [Becoming Slave Thread]: CLDBShutdown: license not found for CLDB HA: shutting down
2023-12-18 15:40:13,533 FATAL BecomeSlaveThread [Becoming Slave Thread]: license not found for CLDB HA: shutting down
2023-12-18 15:40:13,534 FATAL CLDB [Becoming Slave Thread]: CLDBShutdown: license not found for CLDB HA: shutting down
2024-01-23 11:43:53,090 FATAL CLDB [main-EventThread]: CLDBShutdown: This CLDB will shutdown now because it was holding the master CLDB lock and received notification from the ZooKeeper ensemble that the lock was deleted
Yes, the mapr user is present on all the nodes and there has been no UID change.
06-06-2024 12:19 AM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
Hi
Thanks for sharing the logs. From the recent entries I can see that container 1 does not yet have a valid copy, so the CLDB is waiting for CID 1 to become valid. Please check the storage pool status on the node, and check /opt/mapr/logs/mfs.log-3 for any errors while loading the storage pools (a quick grep for this is sketched at the end of this post).
/opt/mapr/server/mrconfig sp list -v
Also, please share the output of the following command:
maprcli dump cldbstate
And the following from all three CLDB nodes:
/opt/mapr/server/mrconfig info dumpcontainers | grep -w "cid:1"
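A quick way to scan that log (a generic grep; adjust the pattern if needed):
grep -iE "error|fatal" /opt/mapr/logs/mfs.log-3 | tail -n 50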
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

06-06-2024 12:55 AM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
The CLDB runs on the first three nodes (node1 to node3) of the cluster.
1. On node1, I am getting the message below:
[mapr@node1 ~]$ /opt/mapr/server/mrconfig sp list -v
ListSPs resp: status 0:1
No. of SPs (1), totalsize 290266 MiB, totalfree 288052 MiB
SP 0: name SP1, Online, size 290266 MiB, free 288052 MiB, path /dev/sdc, log 200 MiB, port 5660, guid c015e6bc572b86630065800c66090a38, clusterUuid -5881150811843788447--6420490077672223050, disks /dev/sdc /dev/sdb /dev/sdd, dare 0, label default:0
On all other nodes, I am getting the error below:
[mapr@node2 ~]$ /opt/mapr/server/mrconfig sp list -v
2024-06-06 13:18:13,4211 ERROR Global mrconfig.cc:782 ListSPs rpc failed Connection reset by peer.(104).
2024-06-06 13:18:13,4211 ERROR Global mrconfig.cc:10539 ProcessSPList failed Connection reset by peer.(104).
2. I am getting the same message on all the nodes for cldbstate:
[mapr@node2 ~]$ maprcli dump cldbstate
ip error
10.x.x.x Couldn't connect to the CLDB service
10.x.x.x Couldn't connect to the CLDB service
10.x.x.x Couldn't connect to the CLDB service
3. I am getting the message below on node1:
[mapr@node1 ~]$ /opt/mapr/server/mrconfig info dumpcontainers | grep -w "cid:1"
cid:1 volid:1 sp:SP1:/dev/sdc spid:c015e6bc572b86630065800c66090a38 prev:0 next:0 issnap:0 isclone:0 deleteinprog:0 fixedbyfsck:0 stale:1 querycldb:0 resyncinprog:0 shared:0 owned:0 logical:0 snapusage:0 snapusageupdated:0 ismirror:0 isrwmirrorcapable:0 role:-1 awaitingrole:0 totalInodes:0 freeInodes:0 dare:0 istiered:0 numtotalblocks:0 numpurgedblocks:0 numoffloadedblocks:0 isConStatsEnabled:0 mirrorId:0 rollforwardInProg:0 rollforwardpending:0 maxUniq:0 isResyncSnapshot:0 snapId:0 port:5660
And the error below on node2 and node3:
[mapr@node3 ~]$ /opt/mapr/server/mrconfig info dumpcontainers | grep -w "cid:1"
2024-06-06 13:19:21,0401 ERROR Global mrconfig.cc:4069 RPC to dump containers failed Connection reset by peer.(104).
06-06-2024 02:34 AM - last edited on 09-16-2024 02:18 AM by support_s
Re: Unable to connect to any of the cluster's CLDBs
Hi
Thanks for sharing the details. The storage pool on the first CLDB node seems to be fine, but CID 1 is in a stale state, and on the other two nodes the storage pools are not yet loaded. For the CLDB to start, CID 1 must have a minimum number of valid replicas. Please check whether the mapr-warden service is running on the 2nd and 3rd CLDB nodes; if not, please start it.
systemctl status mapr-warden
systemctl start mapr-warden
Once it started please check the storage status.
/opt/mapr/server/mrconfig sp list -v
If the above command still fails, please check /opt/mapr/logs/mfs.log-3 and share any errors you find.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

06-06-2024 02:55 AM - last edited on 06-06-2024 11:10 PM by Sunitha_Mod
Re: Unable to connect to any of the cluster's CLDBs
@Mahesh_S I started mapr-warden and checked the storage status. It shows the error below:
2024-06-06 15:23:49,1548 ERROR Global mrconfig.cc:782 ListSPs rpc failed Connection reset by peer.(104).
2024-06-06 15:23:49,1548 ERROR Global mrconfig.cc:10539 ProcessSPList failed Connection reset by peer.(104).
Then I checked the /opt/mapr/logs/mfs.log-3 file and noticed that it was last updated two days ago and shows the entries below:
2024-06-04 16:14:00,2027 INFO CLDBHA cldbha.cc:1170 Above message hit 2 times in 1717497825236 ms
2024-06-04 16:14:02,1826 INFO CLDBHA cldbha.cc:1170 RegisterToCldbDone iid 0 regnErr 30 from CLDB 10.x.x.x:7222
2024-06-04 16:14:02,2112 INFO CLDBHA cldbha.cc:1170 RegisterToCldbDone iid 0 regnErr 3 from CLDB 10.x.x.x:7222
2024-06-04 16:14:03,2142 INFO CLDBHA cldbha.cc:1170 RegisterToCldbDone iid 0 regnErr 3 from CLDB 10.x.x.x:7222
2024-06-04 16:14:03,2142 INFO CLDBHA cldbha.cc:1170 Above message hit 1 times in 1717497824236 ms
06-07-2024 12:34 AM
Re: Unable to connect to any of the cluster's CLDBs
Hi
Thanks for sharing the details. It seems MFS has not started yet. Please check /opt/mapr/logs/warden.log for any errors.
Also, please check whether any PID files exist in /opt/mapr/pid. If they do, stop the Warden service and clear all the PID files. Check whether any mapr processes are still running (ps -ef | grep mapr); if so, stop them and restart the mapr-warden service. Once it has started, please monitor warden.log and mfs.log-3.
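A minimal sequence for that cleanup, assuming the default paths (run with root privileges):
systemctl stop mapr-warden
# look for stale PID files
ls -l /opt/mapr/pid
# clear them (assuming they follow the usual *.pid naming)
rm -f /opt/mapr/pid/*.pid
# confirm no leftover mapr processes; stop any that remain
ps -ef | grep mapr
systemctl start mapr-warden
# watch startup progress
tail -f /opt/mapr/logs/warden.log /opt/mapr/logs/mfs.log-3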
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
