Incremental Installation of Zookeeper at Data Fabric
08-07-2023 05:43 AM
During the incremental installation on the current cluster, we are expanding ZooKeeper from one node to three nodes. The problem below occurred.
All ZooKeeper logs from every node (node 1: master; nodes 2 and 3: slaves): Logs
08-07-2023 12:39 PM
Re: Incremental Installation of Zookeeper at Data Fabric
Make sure ports 5181, 2888, and 3888 are open between all three ZooKeeper nodes. Also, share your zoo.cfg file from all three ZooKeeper nodes for review.
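For a quick reachability test from each ZooKeeper node toward each of its peers, something like this works (a sketch, assuming nc/netcat is installed; node-2 is a placeholder hostname):
nc -zv node-2 5181
nc -zv node-2 2888
nc -zv node-2 3888
Repeat against each peer node.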
08-08-2023 12:14 AM
Re: Incremental Installation of Zookeeper at Data Fabric
zoo.cfg files for all nodes: zoo_conf_drive
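For reference, the quorum section of a three-node zoo.cfg would typically look like the sketch below (hostnames here are placeholders; 5181 is the client port and 2888/3888 are the quorum/election ports mentioned above):
clientPort=5181
server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888
Each node's myid file (in the ZooKeeper dataDir) must match its own server.N number.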
I checked the firewall status and which processes hold the ports open on all nodes; output below:
node 1 (Master - ZK and CLDB):
> mapr@node-1:~$ sudo ufw status verbose
Status: inactive
> mapr@node-1:~$ sudo lsof -t -i:3888
1960
> mapr@node-1:~$ sudo lsof -t -i:2888
> mapr@node-1:~$ sudo lsof -t -i:5181
1960
Nodes 2 and 3 give the same output for the same commands (slaves; the incremental installation fails on these):
> sudo ufw status verbose
Status: inactive
> sudo lsof -t -i:5181
> sudo lsof -t -i:2888
> sudo lsof -t -i:3888
> sudo systemctl status mapr-zookeeper.service
● mapr-zookeeper.service - MapR Technologies, Inc. zookeeper service
Loaded: loaded (/etc/systemd/system/mapr-zookeeper.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2023-08-07 10:09:30 EDT; 16h ago
Process: 3205 ExecStart=/opt/mapr/initscripts/zookeeper start (code=exited, status=0/SUCCESS)
Main PID: 3378 (code=exited, status=1/FAILURE)
Aug 07 10:09:30 node-2.treo.com.tr systemd[1]: mapr-zookeeper.service: Scheduled restart job, restart counter is at 3.
Aug 07 10:09:30 node-2.treo.com.tr systemd[1]: Stopped MapR Technologies, Inc. zookeeper service.
Aug 07 10:09:30 node-2.treo.com.tr systemd[1]: mapr-zookeeper.service: Start request repeated too quickly.
Aug 07 10:09:30 node-2.treo.com.tr systemd[1]: mapr-zookeeper.service: Failed with result 'exit-code'.
Aug 07 10:09:30 node-2.treo.com.tr systemd[1]: Failed to start MapR Technologies, Inc. zookeeper service.
> sudo systemctl start mapr-zookeeper.service
Job for mapr-zookeeper.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status mapr-zookeeper.service" and "journalctl -xe" for details.
> sudo systemctl status mapr-zookeeper.service
● mapr-zookeeper.service - MapR Technologies, Inc. zookeeper service
Loaded: loaded (/etc/systemd/system/mapr-zookeeper.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: protocol) since Tue 2023-08-08 03:08:33 EDT; 1s ago
Process: 122255 ExecStart=/opt/mapr/initscripts/zookeeper start (code=exited, status=0/SUCCESS)
The incremental installation failed, so node 2 and node 3 have no ZooKeeper running.
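To dig into why the unit keeps exiting, the journal and the ZooKeeper log are the places to look (a sketch; the log path assumes the zookeeper-3.5.6 layout seen elsewhere in this thread and may differ):
> sudo journalctl -u mapr-zookeeper.service -n 50 --no-pager
> sudo tail -n 100 /opt/mapr/zookeeper/zookeeper-3.5.6/logs/zookeeper.log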
08-08-2023 12:23 PM
Re: Incremental Installation of Zookeeper at Data Fabric
Can you restart the mapr-zookeeper service on all three ZooKeeper nodes and then share the output of the command below from all three nodes?
/opt/mapr/initscripts/zookeeper qstatus
Also, please upload the ZooKeeper logs from all three nodes.
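If it's easier, a loop like this does both in one pass (a sketch, assuming ssh access as the mapr user with sudo rights; hostnames are placeholders):
for h in node-1 node-2 node-3; do ssh mapr@$h "sudo systemctl restart mapr-zookeeper && /opt/mapr/initscripts/zookeeper qstatus"; done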
08-09-2023 02:15 AM
Re: Incremental Installation of Zookeeper at Data Fabric
After the restart, the ZooKeeper logs from every node: zk_after_restart_nodes
Here is the "/opt/mapr/initscripts/zookeeper qstatus" output:
> ssh mapr@10.34.2.129 "/opt/mapr/initscripts/zookeeper qstatus"
(mapr@10.34.2.129) Password:
Using config: /opt/mapr/zookeeper/zookeeper-3.5.6/conf/zoo.cfg
Client port found: 5181. Client address: localhost.
Error contacting service. It is probably not running.
> ssh mapr@10.34.2.131 "/opt/mapr/initscripts/zookeeper qstatus"
(mapr@10.34.2.131) Password:
Using config: /opt/mapr/zookeeper/zookeeper-3.5.6/conf/zoo.cfg
Client port found: 5181. Client address: localhost.
Error contacting service. It is probably not running.
> ssh mapr@10.34.2.135 "/opt/mapr/initscripts/zookeeper qstatus"
(mapr@10.34.2.135) Password:
Using config: /opt/mapr/zookeeper/zookeeper-3.5.6/conf/zoo.cfg
Client port found: 5181. Client address: localhost.
Error contacting service. It is probably not running.
08-09-2023 04:31 AM
Re: Incremental Installation of Zookeeper at Data Fabric
Let us know if you have gone through https://docs.ezmeral.hpe.com/datafabric-customer-managed/61/AdministratorGuide/AddingZKrole.html
08-09-2023 09:35 AM
Re: Incremental Installation of Zookeeper at Data Fabric
From the provided logs, it looks like /opt/mapr/conf/cldb.key is missing or has an issue. Can you check whether that file is present? Also, you mentioned earlier that the cluster is running with one ZooKeeper, but from the command output it seems none of the ZK nodes are running. Are all three nodes you are trying to add new?
2023-08-09 04:09:59,082 [myid:1] - ERROR [main:MaprSecurityLoginModule@71] - Failed to set cldb key file /opt/mapr/conf/cldb.key err com.mapr.security.MutableInt@d83da2e
2023-08-09 04:09:59,084 [myid:1] - ERROR [main:MaprSecurityLoginModule@79] - Cldb key can not be obtained: 2
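For a quick presence/ownership check on each node (a sketch):
ls -l /opt/mapr/conf/cldb.key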
08-09-2023 03:30 PM - edited 08-09-2023 03:31 PM
Solution
@msaidbilgehan I've reproduced this at last and found the problem: the installer needs to copy the file /opt/mapr/conf/maprhsm.conf and the whole directory /opt/mapr/conf/tokens to the new ZooKeeper nodes. I've raised a bug against the installer about this.
To copy them manually, do this:
scp /opt/mapr/conf/maprhsm.conf mapr@<new zk node>:/opt/mapr/conf/maprhsm.conf
scp -r /opt/mapr/conf/tokens/* mapr@<new zk node>:/opt/mapr/conf/tokens/
(Run these as the mapr user, not root, so the copied files remain readable by the mapr user.)
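To verify the copy on each new ZK node, something like this should show everything owned and readable by mapr (a sketch):
ls -l /opt/mapr/conf/maprhsm.conf
ls -lR /opt/mapr/conf/tokens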
Unfortunately, these steps are missing from the docs page posted above as well, and there's a separate bug with the docs team about that.
Regards,
Laurence Darby
08-10-2023 03:25 AM - edited 08-10-2023 03:39 AM
Re: Incremental Installation of Zookeeper at Data Fabric
@ldarby At which step should I run these copy commands? Should I wait for the installation to fail, copy the files, and then retry?
BTW, I just did what you told me, and now it seems all ZKs are running; service status output below.
Now, as @Shishir_Prakash mentioned, the cldb key is missing on node-1, which should have it. There is also one more problem: the installer reset itself, as explained here (the ticket). Once these two remaining problems are solved, it should work well, I guess.
08-10-2023 03:47 AM - edited 08-10-2023 03:48 AM
Re: Incremental Installation of Zookeeper at Data Fabric
As you mentioned, the "cldb.key" file is missing. CLDB and ZK were on node-1; the other nodes (node-2 and node-3) run other services. For the incremental installation, I selected node-2 and node-3 in addition to node-1. This issue appeared after the failure. Also, the installer reset itself, as explained in this ticket.
I checked all three nodes for "cldb.key" and did not find it.
08-10-2023 04:01 AM
Re: Incremental Installation of Zookeeper at Data Fabric
Hi @msaidbilgehan,
Apologies for the confusion: cldb.key no longer exists. It was replaced by maprhsm.conf and the tokens directory in version 7.0, but the error message about cldb.key was never updated. I've again requested engineering to fix this error message. @Shishir_Prakash, you may also want to push engineering to fix this wrong error message.
Also, apologies: I'm not super familiar with the installer, and I'm not sure if there is a way to tell it that the problem has been resolved manually. Possibly the only way is to re-run the incremental install, which should work now that you've copied the files. I'll check internally about this.
Regards,
Laurence Darby
08-10-2023 04:25 AM
Re: Incremental Installation of Zookeeper at Data Fabric
I see, all good then. I can't re-run the incremental installation because the installer is stuck at the installation page, and I can't change that right now. I will continue with the installer issue ticket, then. Thanks for the support.
08-10-2023 04:32 AM - last edited on 08-14-2023 12:48 AM by Sunitha_Mod
Re: Incremental Installation of Zookeeper at Data Fabric
Before leaving the ticket: I just realized that the qstatus command output shows some errors, as below.
Here are the zk logs of every node: https://drive.google.com/drive/folders/1QavBt3D1L4wU3cLECsROr7awe7voA1xC?usp=drive_link
08-14-2023 03:26 AM - edited 08-14-2023 03:26 AM
Re: Incremental Installation of Zookeeper at Data Fabric
Hi @msaidbilgehan,
Unfortunately, these error logs with 'java.io.IOException: ZK down' are pretty hard to diagnose. One known cause is replacing the self-signed certificate with one signed by a CA where the cert is missing the 'TLS Web Client Authentication' flag; the ZooKeepers need this flag to connect to each other with SSL client certificates (mutual SSL auth). Have you done this? (I haven't seen you mention custom CAs so far, so I think not.)
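To check a certificate for that flag, something like this works (a sketch; the PEM path is a placeholder for wherever the CA-signed cert lives):
openssl x509 -in /path/to/cert.pem -noout -text | grep -A1 'Extended Key Usage'
The output should include 'TLS Web Client Authentication' alongside 'TLS Web Server Authentication'.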
To debug this, on one of the ZK nodes, edit /opt/mapr/zookeeper/zookeeper-3.5.6/bin/zkServer.sh and change the line with ZK_SUPPORT_OPTS="-XX..." to be this:
ZK_SUPPORT_OPTS="-XX:ErrorFile=${ZOO_LOG_DIR}/hs_err_pid%p.log -Djavax.net.debug=ssl:trustmanager:verbose -Djavax.net.debug=ssl:handshake:verbose "
Then start a tcpdump:
tcpdump -i any -s 0 -n -w zk.pcap
Then restart that ZK:
systemctl restart mapr-zookeeper
Then the tcpdump should capture ZK starting up and giving the error message for the first time after startup, and the logs hopefully have more info.
(this is what I had to do earlier to discover the missing TLS Client Auth flag issue).
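(If useful, the capture can be inspected afterwards by filtering on the ZK ports, e.g.:
tcpdump -nn -r zk.pcap port 5181 or port 2888 or port 3888
or by opening zk.pcap in Wireshark.)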
Regards,
Laurence Darby
08-17-2023 03:20 AM
Re: Incremental Installation of Zookeeper at Data Fabric
I've started installing it from scratch, so the next time this happens, I will try your suggestion and create a new ticket with detailed logs.