cldb.pid exists with pid 11461 but no CLDB
09-26-2023 05:57 AM - edited 09-26-2023 06:35 AM
After shutting down two nodes at the OS level and restarting them, the cluster cannot communicate with CLDB. A few command outputs are below, and the full cluster logs are in the Drive link at the end of this post.
> jps
16643 FsShell
25748 Jps
16390 FsShell
24621 CentralConfigCopyHelper
11533 WardenMain
2189 QuorumPeerMain
13327 CLDB
> sudo /etc/init.d/mapr-cldb status
/opt/mapr/pid/cldb.pid exists with pid 11461 but no CLDB.
> tail -n 25 /opt/mapr/logs/cldb.log
mapr@node0:~$ tail -n 25 /opt/mapr/logs/cldb.log
2023-09-26 08:36:05,591 INFO ZooKeeperClient [main-EventThread]: Setting Cldb Info in ZooKeeper, external Port:7222
2023-09-26 08:36:05,598 INFO CLDBServer [main-EventThread]: The CLDB received notification that a ZooKeeper event of type None occurred on path null
2023-09-26 08:36:05,603 INFO CLDBServer [ZK-Connect]: Previous CLDB was not a clean shutdown waiting for 20000ms before attempting to become master
2023-09-26 08:36:05,614 INFO ECTierManager [main]: Subscribed for EC gateway registration notifications. Current gateways...
2023-09-26 08:36:05,619 INFO ClusterGroup [Thread-7]: isInited: isClusterGroupDbInited:false, isExternalServerDbInited:false
2023-09-26 08:36:05,619 INFO ClusterGroup [Thread-7]: isInited: isClusterGroupDbInited:false, isExternalServerDbInited:false
2023-09-26 08:36:07,115 INFO HttpServer [main]: Disabled algorithms are : TLS_AES_128_GCM_SHA256
2023-09-26 08:36:07,115 INFO HttpServer [main]: Disabled protocols are : TLSv1.3
2023-09-26 08:36:07,157 INFO CLDB [main]: CLDBState: CLDB State change : WAIT_FOR_FILESERVERS
2023-09-26 08:36:07,160 INFO CLDBWatchdog [main]: CLDB memory threshold(heap + non heap) is set to : 8096 MB. Xmx: 4000, Configured Non-Heap: 4096
2023-09-26 08:36:07,161 INFO CLDB [main]: [Starting RPCServer] port: 7222 num threads: 10 heap size: 4000MB IPGutsShm 32768 startup options: -Xms2400m -Xmx4000m -XX:ErrorFile=/opt/cores/hs_err_pid%p.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/cores -XX:ThreadStackSize=1024
2023-09-26 08:36:07,164 INFO CLDB [main]: Starting 2 RPC Instances for CLDB
2023-09-26 08:36:07,169 ERROR CLDB [main]: Exception in RPC init
2023-09-26 08:36:07,169 ERROR CLDB [main]: Could not initialize RPC.. aborting
2023-09-26 08:36:22,953 INFO CLDB [main]: Loading properties file : /opt/mapr/conf/cldb.conf
2023-09-26 08:36:23,297 INFO CLDBMetrics [main]: Initializing CLDB Metrics with serviceName: cldbServer
2023-09-26 08:36:23,301 INFO CLDB [main]: CLDBInit: Using hostname file /opt/mapr/hostname and hostid file /opt/mapr/hostid
2023-09-26 08:36:23,302 INFO CLDB [main]: CLDB Properties from configuration file: num.volmirror.threads=1cldb.numthreads=10cldb.web.https.port=7443cldb.port=7222cldb.detect.dup.hostid.enabled=falsecldb.min.fileservers=1cldb.zookeeper.servers=node0.cluster:5181cldb.web.port=7221enable.replicas.invariant.check=falsecldb.jmxremote.port=7220hadoop.version=3.3.4
2023-09-26 08:36:23,302 INFO CLDB [main]: CLDB Command line args: /opt/mapr/conf/cldb.conf
2023-09-26 08:36:23,302 INFO CLDB [main]: CLDBInit: Initializing CLDB
2023-09-26 08:36:23,303 INFO CLDB [main]: MapR BuildVersion: 7.4.0.0.20230728133744.GA
2023-09-26 08:36:23,303 INFO CLDB [main]: $Id: mapr-version: 7.4.0.0.20230728133744.GA 9601b852f5443980b667e1a7910c66d39cb84c77 $
2023-09-26 08:36:23,303 INFO CLDB [main]: CLDBInit: Start CLDBServer
2023-09-26 08:36:23,349 INFO CLDBServer [main]: CLDBInit: HostName: node0.cluster ServerId: 3977623316996854432
2023-09-26 08:36:23,444 ERROR CLDBServer [main]: Username in ticket file /opt/mapr/conf/maprserverticket doesn't match with cluster owner's username. ticket's user: mapr cluster owner: root
mapr@node0:~$ sudo maprlogin password
[Password for user 'root' at cluster 'cluster.treo.com.tr': ]
MapR credentials of user 'root' for cluster 'cluster.treo.com.tr' are written to '/tmp/maprticket_0'
mapr@node0:~$ sudo /etc/init.d/mapr-cldb stop
CLDB not running.
mapr@node0:~$ sudo /etc/init.d/mapr-cldb start
Starting CLDB, logging to /opt/mapr/logs/cldb.log
mapr@node0:~$ sudo /etc/init.d/mapr-cldb status
/opt/mapr/pid/cldb.pid exists with pid 30042 but no CLDB.
mapr@node0:~$ tail -n 15 /opt/mapr/logs/cldb.log
2023-09-26 08:36:23,303 INFO CLDB [main]: $Id: mapr-version: 7.4.0.0.20230728133744.GA 9601b852f5443980b667e1a7910c66d39cb84c77 $
2023-09-26 08:36:23,303 INFO CLDB [main]: CLDBInit: Start CLDBServer
2023-09-26 08:36:23,349 INFO CLDBServer [main]: CLDBInit: HostName: node0.cluster ServerId: 3977623316996854432
2023-09-26 08:36:23,444 ERROR CLDBServer [main]: Username in ticket file /opt/mapr/conf/maprserverticket doesn't match with cluster owner's username. ticket's user: mapr cluster owner: root
2023-09-26 08:39:28,657 INFO CLDB [main]: Loading properties file : /opt/mapr/conf/cldb.conf
2023-09-26 08:39:29,008 INFO CLDBMetrics [main]: Initializing CLDB Metrics with serviceName: cldbServer
2023-09-26 08:39:29,012 INFO CLDB [main]: CLDBInit: Using hostname file /opt/mapr/hostname and hostid file /opt/mapr/hostid
2023-09-26 08:39:29,013 INFO CLDB [main]: CLDB Properties from configuration file: num.volmirror.threads=1cldb.numthreads=10cldb.web.https.port=7443cldb.port=7222cldb.detect.dup.hostid.enabled=falsecldb.min.fileservers=1cldb.zookeeper.servers=node0.cluster:5181cldb.web.port=7221enable.replicas.invariant.check=falsecldb.jmxremote.port=7220hadoop.version=3.3.4
2023-09-26 08:39:29,013 INFO CLDB [main]: CLDB Command line args: /opt/mapr/conf/cldb.conf
2023-09-26 08:39:29,013 INFO CLDB [main]: CLDBInit: Initializing CLDB
2023-09-26 08:39:29,014 INFO CLDB [main]: MapR BuildVersion: 7.4.0.0.20230728133744.GA
2023-09-26 08:39:29,014 INFO CLDB [main]: $Id: mapr-version: 7.4.0.0.20230728133744.GA 9601b852f5443980b667e1a7910c66d39cb84c77 $
2023-09-26 08:39:29,014 INFO CLDB [main]: CLDBInit: Start CLDBServer
2023-09-26 08:39:29,059 INFO CLDBServer [main]: CLDBInit: HostName: node0.cluster ServerId: 3977623316996854432
2023-09-26 08:39:29,154 ERROR CLDBServer [main]: Username in ticket file /opt/mapr/conf/maprserverticket doesn't match with cluster owner's username. ticket's user: mapr cluster owner: root
> tail -n 25 /opt/mapr/logs/cldb.out
fs/common/daremgr.cc:189: HSM enabled, but DARE key not found on HSM. Check log for details
2023-09-26 09:00:05,8499 :2732 Listen: 2732: bind: error 98 port 7222java.io.IOException: Could not intialize RPC java.io.IOException: Exception in RPC init at com.mapr.fs.cldb.CLDB.initializeRpcInstances(CLDB.java:179)
at com.mapr.fs.cldb.CLDB.<init>(CLDB.java:95)
at com.mapr.fs.cldb.CLDB.main(CLDB.java:411)
CLDBShm: ***shmget with key 7222, size: 70848
CLDBShm created rpc guts shared memory, size 70848
2023-09-26 09:00:06,4176 :1606 Obtained CLDB key from PKCS#11 file store
CLDBJNI: Initializing cldb jni with memory 838860800 estContainerSize:144 maxContainersInCache:5825422 mapr-version: $Id: mapr-version: 7.4.0.0.20230728133744.GA 9601b852f5443980b667e1a7910c66d39cb84c77 $
fs/common/daremgr.cc:189: HSM enabled, but DARE key not found on HSM. Check log for details
2023-09-26 09:00:08,5433 :2732 Listen: 2732: bind: error 98 port 7222java.io.IOException: Could not intialize RPC java.io.IOException: Exception in RPC init at com.mapr.fs.cldb.CLDB.initializeRpcInstances(CLDB.java:179)
at com.mapr.fs.cldb.CLDB.<init>(CLDB.java:95)
at com.mapr.fs.cldb.CLDB.main(CLDB.java:411)
CLDBShm: ***shmget with key 7222, size: 70848
CLDBShm created rpc guts shared memory, size 70848
2023-09-26 09:00:09,1707 :1606 Obtained CLDB key from PKCS#11 file store
CLDBJNI: Initializing cldb jni with memory 838860800 estContainerSize:144 maxContainersInCache:5825422 mapr-version: $Id: mapr-version: 7.4.0.0.20230728133744.GA 9601b852f5443980b667e1a7910c66d39cb84c77 $
fs/common/daremgr.cc:189: HSM enabled, but DARE key not found on HSM. Check log for details
2023-09-26 09:00:11,2179 :2732 Listen: 2732: bind: error 98 port 7222java.io.IOException: Could not intialize RPC java.io.IOException: Exception in RPC init at com.mapr.fs.cldb.CLDB.initializeRpcInstances(CLDB.java:179)
at com.mapr.fs.cldb.CLDB.<init>(CLDB.java:95)
at com.mapr.fs.cldb.CLDB.main(CLDB.java:411)
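Note on the output above: `bind: error 98` is most likely errno 98, which on Linux is EADDRINUSE, so something already holds port 7222. The `jps` output earlier lists a CLDB process (13327) whose PID differs from the one recorded in cldb.pid (11461), which would fit an orphaned CLDB still holding the port. A minimal sketch for checking this from the node follows. The function name `port_in_use` is hypothetical, it relies on bash's `/dev/tcp` redirection, and it only probes the loopback address, so a listener bound to a single non-loopback interface would be missed.

```shell
#!/usr/bin/env bash
# Sketch only: relies on bash's /dev/tcp pseudo-device, so run it with bash.
# port_in_use is a hypothetical helper name, not part of any MapR tooling.

# port_in_use PORT -> exit 0 if something accepts TCP connections on 127.0.0.1:PORT
port_in_use() {
    # The subshell opens a throwaway connection on fd 3; it is closed when the subshell exits.
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

if port_in_use 7222; then
    echo "port 7222 is already in use; a new CLDB would likely fail to bind"
else
    echo "port 7222 looks free on this host"
fi
```

If the port is in use while the init script reports "no CLDB", the listener is probably a leftover process that has to be stopped before CLDB can start cleanly.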
I tried the command below to create a new server ticket:
mapr@node0:~$ sudo /opt/mapr/server/configure.sh -N cluster.treo.com.tr -C node0.cluster -Z node0.cluster -secure
Node setup configuration: apiserver cldb collectd drill-bits drill-internal fileserver gateway grafana hadoop-client hadoop-util hbase hbaserest hbmaster hbregionserver httpfs mastgateway nodemanager resourcemanager s3server spark spark-historyserver spark-thriftserver zookeeper
Log can be found at: /opt/mapr/logs/configure.log
CLDB node list: node0.cluster:7222
Zookeeper node list: node0.cluster:5181
External Zookeeper node list:
FIPS is not enabled. Verifying JKS, P12 and PEM key and trust stores
ERROR: Required /opt/mapr/conf/ssl_truststore.pem not present. Please copy from first CLDB node.
Configuring nodemanager
Configuring hbase
Configuring collectd
find: paths must precede expression: `/opt/mapr/lib/slf4j-api-1.7.36.jar'
find: possible unquoted pattern after predicate `-regex'?
awk: not an option: -r
awk: not an option: -r
awk: not an option: -r
awk: not an option: -r
awk: not an option: -r
awk: not an option: -r
awk: not an option: -r
Configuring resourcemanager
Configuring hadoop-util
Configuring httpfs
Configuring spark
Configuring hadoop-client
Configuring grafana
usage: /opt/mapr/grafana/grafana-7.5.10/bin/configure.sh [-help] [-nodeCount <cnt>] [-nodePort <port>] [-grafanaPort <port>]
[-loadDataSourceOnly] [-customSecure] [-secure] [-unsecure] [-EC <commonEcoOpts>]
[-password <pw>] [-R] -OT "ip:port,ip1:port,"
Configuring drill
OTNodesList:
Configuring apiserver
Running restart script /opt/mapr/conf/restart/hbaserest-1.4.14.restart
Running restart script /opt/mapr/conf/restart/hbmaster-1.4.14.restart
Running restart script /opt/mapr/conf/restart/hbregionserver-1.4.14.restart
Cluster Logs: https://drive.google.com/open?id=199KEwyPeXtmXI1zCR4XkHV0usd7krcZp&usp=drive_fs
09-28-2023 02:39 AM - edited 09-28-2023 02:42 AM
Solution
With the help of @Mirza12332, the solution is:
Stop all MapR processes on the node by stopping mapr-warden, and kill any remaining stuck or unresponsive processes if necessary:
cd /opt/mapr/initscripts
Stop the warden and cldb services from this directory.
Clean up all pid files:
sudo rm -f /opt/mapr/pid/*.pid
Then start mapr-warden.
Keep in mind that the only initscripts MapR supports are mapr-warden and mapr-zookeeper. The others are wrappers for warden.
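Before deleting anything under /opt/mapr/pid, it can help to confirm which pid files are actually stale, meaning they name a PID that is no longer running. The sketch below is one way to do that, not a MapR-provided tool. The function name `pidfile_is_stale` is hypothetical, it assumes each pid file holds a single numeric PID on its first line, and it uses /proc, so it is Linux-only. It only reports; it does not delete anything.

```shell
#!/usr/bin/env bash
# Sketch only: pidfile_is_stale is a hypothetical helper, not part of MapR.
# Assumption: each pid file holds one numeric PID on its first line.

# pidfile_is_stale FILE
#   exit 0 -> FILE names a PID with no running process (stale)
#   exit 1 -> the PID is alive
#   exit 2 -> FILE is missing or unreadable
pidfile_is_stale() {
    local file=$1 pid
    [ -r "$file" ] || return 2
    pid=$(head -n 1 "$file" | tr -dc '0-9')
    # An empty or non-numeric file cannot refer to a live process.
    [ -n "$pid" ] || return 0
    # On Linux, /proc/<pid> exists only while the process does.
    if [ -d "/proc/$pid" ]; then
        return 1
    fi
    return 0
}

# Report stale pid files; nothing is removed here.
for f in /opt/mapr/pid/*.pid; do
    if pidfile_is_stale "$f"; then
        echo "stale: $f"
    fi
done
```

A pid file reported as stale here corresponds to the "exists with pid ... but no CLDB" message from the status command. A live PID is not proof that the process is healthy, only that the PID is in use.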
09-28-2023 04:46 AM
Re: cldb.pid exists with pid 11461 but no CLDB
Hello @msaidbilgehan,
That's awesome!
We are extremely glad to know that you found the solution, and we appreciate you keeping us updated.