12-03-2024 07:54 PM - last edited on 12-04-2024 07:41 PM by support_s
HPE Ezmeral Data Fabric YARN service down
We are running Data Fabric 6.1 on 5 nodes. ZooKeeper is operating normally on three nodes. However, YARN does not run properly after the service is restarted. ZooKeeper appears to be fine, but because YARN is not working, neither the Application Manager nor the NodeManager works. I tried restarting the service through Warden, but that did not fix the error.
The error log is as follows:
>> org.apache.hadoop.metrics2.MetricsException: Metrics source ClusterMetrics already exists
I stopped the service using Warden and confirmed with the ps -ef command that it terminated normally. The node's disk and license are also fine.
Please help me solve the problem.
12-03-2024 08:55 PM
Query: Hpe Ezmeral Data Fabric Yarn service down
System recommended content:
1. HPE Ezmeral Data Fabric – Customer-Managed 7.6.1 Documentation | Administering Services
2. HPE Ezmeral Data Fabric – Customer-Managed 7.7.0 Documentation | Administering Services
Please click the "Thumbs Up/Kudo" icon to give a "Kudo".
Thank you for being a valued HPE community member.
12-03-2024 09:21 PM
Re: Query: Hpe Ezmeral Data Fabric Yarn service down
I have been using the service reliably for two years. But this situation happened suddenly. My yarn-site.xml file doesn't seem to have any problems at all.
<configuration>
<!-- Resource Manager MapR HA Configs -->
<property>
<name>yarn.resourcemanager.ha.custom-ha-enabled</name>
<value>true</value>
<description>MapR Zookeeper based RM Reconnect Enabled. If this is true, set the failover proxy to be the class MapRZKBasedRMFailoverProxyProvider</description>
</property>
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider</value>
<description>Zookeeper based reconnect proxy provider. Should be set if and only if mapr-ha-enabled property is true.</description>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
<description>RM Recovery Enabled</description>
</property>
<property>
<name>yarn.resourcemanager.ha.custom-ha-rmaddressfinder</name>
<value>org.apache.hadoop.yarn.client.MapRZKBasedRMAddressFinder</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>20480</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<source>yarn-default.xml</source>
</property>
12-04-2024 08:07 AM
Re: Query: Hpe Ezmeral Data Fabric Yarn service down
One thing you can try is moving the contents of FSRMStateRoot to a backup directory and then deleting the original directory:
# hadoop fs -mv /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/* /FSRMStateRoot_backup/
# hadoop fs -rm -r /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/
After making the above changes, restart the ResourceManager service.
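A minimal sketch of the steps above as a script with a dry-run switch, so the commands can be reviewed before anything is moved or deleted. The `mkdir` step, the `DRY_RUN` variable, and the function names are assumptions added for illustration; the paths are the ones quoted in this reply. The sketch uses `-rm -r`, the non-deprecated form of `-rmr`.

```shell
#!/usr/bin/env bash
# Sketch only: back up the ResourceManager state store, then remove it.
# Paths come from the reply above; DRY_RUN, run, and the function name
# are illustrative and not part of any HPE tooling.
STATE_ROOT="/var/mapr/cluster/yarn/rm/system/FSRMStateRoot"
BACKUP_DIR="/FSRMStateRoot_backup"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

backup_and_clear_rm_state() {
  # Assumption: the backup directory may not exist yet, so create it first.
  run hadoop fs -mkdir -p "$BACKUP_DIR" || return 1
  # The glob is quoted so that hadoop, not the local shell, expands it.
  run hadoop fs -mv "$STATE_ROOT/*" "$BACKUP_DIR/" || return 1
  # Remove the state directory only after the move succeeded.
  run hadoop fs -rm -r "$STATE_ROOT" || return 1
}

backup_and_clear_rm_state
```

Run as-is, it only prints the three commands. Setting `DRY_RUN=0` executes them, after which the ResourceManager can be restarted as described.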
I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

12-04-2024 08:29 PM
Re: Query: Hpe Ezmeral Data Fabric Yarn service down
I checked the logs and found the output below. Would it be effective to restart the ResourceManager and NodeManager after the steps you provided?
INFO org.apache.hadoop.yarn.server.api.ConfigurableAuxServices: Adding auxiliary service RMVolumeManager
INFO com.mapr.hadoop.yarn.resourcemanager.RMVolumeManager: Checking for ResourceManager volume. If volume not present command will create and mount it. Command invoked as: /opt/mapr/server/createdTVolume.sh abigufa2 /var/mapr/cluster/yarn/rm /var/mapr/cluster/yarn/rm/system with permission: rwx---
com.mapr.hadoop.yarn.resourcemanager.RMVolumeManager: Successfully created ResourceManager volume and mounted at /var/mapr/cluster/yarn/rm.
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to active state.
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Recovery started.
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Loaded RM state version info 1.2.
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Done loading applications from FS state store.
ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to load/recover state.
com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$AMRMTokenSecretManagerStateProto.<init>(YarnServerResourceManagerRecoveryProtos.java:3938)
at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$AMRMTokenSecretManagerStateProto.<init>(YarnServerResourceManagerRecoveryProtos.java:3902)
at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$AMRMTokenSecretManagerStateProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4006)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:206)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:1032)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
INFO org.apache.hadoop.service.AbstractService: Service RMActiveServices failed in state STARTED; cause: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
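The trace above points at the recovery step: `FileSystemRMStateStore.loadState` fails while parsing `AMRMTokenSecretManagerStateProto`, which would be consistent with a damaged entry in the ResourceManager state store rather than a yarn-site.xml problem. As a hedged sketch, the check below scans a ResourceManager log for that signature. The function name and the log path argument are hypothetical, so substitute the log file of your own installation.

```shell
#!/usr/bin/env bash
# Sketch only: detect the state-recovery failure quoted above in a log file.
# rm_recovery_failed and its argument are illustrative names.
rm_recovery_failed() {
  local log_file="$1"
  # Both markers must be present: the ResourceManager error line and the
  # protobuf exception reported as its cause.
  grep -q -F "Failed to load/recover state" "$log_file" &&
    grep -q -F "InvalidProtocolBufferException" "$log_file"
}

# Usage: pass the ResourceManager log path as the first argument.
if [ -n "${1:-}" ]; then
  if rm_recovery_failed "$1"; then
    echo "state-store recovery failure found in $1"
  else
    echo "no state-store recovery failure found in $1"
  fi
fi
```

If the check matches, the failure happens while the ResourceManager loads saved state, which is the part that clearing FSRMStateRoot is meant to address.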
12-04-2024 08:33 PM
Re: Query: Hpe Ezmeral Data Fabric Yarn service down
I checked the logs and found the output below. Would it be effective to restart the ResourceManager and NodeManager after the steps you provided?
INFO org.apache.hadoop.yarn.server.api.ConfigurableAuxServices: Adding auxiliary service RMVolumeManager
INFO com.mapr.hadoop.yarn.resourcemanager.RMVolumeManager: Checking for ResourceManager volume. If volume not present command will create and mount it. Command invoked as: /opt/mapr/server/createdTVolume.sh abigufa2 /var/mapr/cluster/yarn/rm /var/mapr/cluster/yarn/rm/system with permission: rwx---
com.mapr.hadoop.yarn.resourcemanager.RMVolumeManager: Successfully created ResourceManager volume and mounted at /var/mapr/cluster/yarn/rm.
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to active state.
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Recovery started.
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Loaded RM state version info 1.2.
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Done loading applications from FS state store.
ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to load/recover state.
...
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
INFO org.apache.hadoop.service.AbstractService: Service RMActiveServices failed in state STARTED; cause: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
12-05-2024 08:30 AM
Re: Query: Hpe Ezmeral Data Fabric Yarn service down
If you prefer, you could restart Warden, which forces all services to restart.
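As a rough sketch, the two restart options discussed in this thread could look like the following. `maprcli node services` restarts a single service on the named node, while restarting the `mapr-warden` system service (shown here for a systemd-based host) restarts everything Warden manages on that node. The hostname `node1.example.com` and the `DRY_RUN` switch are placeholders introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: restart the YARN services individually, or restart Warden.
# NODE is a placeholder hostname; DRY_RUN and run are illustrative helpers.
NODE="${NODE:-node1.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restart_yarn_services() {
  # Restart only the ResourceManager and NodeManager on the chosen node.
  run maprcli node services -name resourcemanager -action restart -nodes "$NODE" || return 1
  run maprcli node services -name nodemanager -action restart -nodes "$NODE" || return 1
}

restart_warden() {
  # Run on the node itself with root privileges; assumes systemd.
  run systemctl restart mapr-warden
}

restart_yarn_services
```

The per-service restart is the narrower option. `restart_warden` is defined but not called, since restarting Warden affects every service on the node.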
12-09-2024 03:02 AM
Query: Hpe Ezmeral Data Fabric Yarn service down
Hello,
Let us know whether you were able to resolve the issue.
If you have no further questions and are satisfied with the answer, please mark the topic as Solved so that it helps other community members.