Comparison of node selection methods in Hadoop 3 YARN
Are you looking for a comparison of available methods in Hadoop 3 YARN for selecting specific nodes to optimize performance and placement? Read on to gain insights into the selection of the right method based on your application needs.
Now that Hadoop is widely used and continues to grow its sizeable market share, the size, number, and types of clusters running Hadoop have also increased. Inevitably, this growth has left customers with heterogeneous nodes in their clusters, either because of technology advancements available when doing cluster upgrades or consolidation, or because of new specialized hardware, software, and workloads being used. As a result, many customers need a way to target an application to a subset of the cluster. The sought-after benefits of this approach are:
- Use specialized hardware on certain nodes, like GPUs, FPGAs
- Isolate workloads on specific nodes, like putting HBase on nodes with SSDs
- Isolate tenants on specific nodes, like allocating 5 nodes to financing department
- Use older hardware for low-priority applications and/or cold data
- Distribute applications in a specific pattern across nodes and racks, like making sure each node gets a single HBase region server
- Distribute applications to nodes that have appropriate resources to execute, like the proper set of libraries, OS version and licenses
The recent major release of Hadoop introduced new YARN features in this regard. Today, Hadoop 3.2.0 YARN provides four different ways to select the nodes needed for an application:
- YARN node labels
- YARN resource types
- YARN node attributes
- YARN allocation tags
Technical details for basic node selection
Now let's cover the technical details for each of these, so that you can be in a better position to decide which one is a better fit for your application needs.
Before we begin, however, it's important to know the most basic node selection method, which applies to any application, not just those running on YARN: installing certain Hadoop services only on select nodes. Customers use this technique frequently, especially when allocating separate nodes for streaming services. It is also the method used in the HPE Elastic Platform for Analytics (EPA) architecture to separate compute resources from storage resources, allowing them to scale independently based on your application’s needs. Simply by installing the HDFS service only on storage nodes and all other compute services only on compute nodes, we can easily separate these resources.
Find more details about the HPE EPA architecture here. And below is a sample HPE EPA solution diagram with associated building blocks. The most important aspect is separating storage and compute resources on different physical servers.
Analysis of the four node selection methods
- YARN node labels —This is the first method that was available in YARN to target applications to specific compute resources, and the HPE EPA team helped develop it. It is important to note that while it provides additional benefits, it is not mandatory to use it in an HPE EPA solution. YARN node labels require the Capacity scheduler and provide a way to group similar nodes to partition a cluster into multiple sections. Each node can have a maximum of one node label, so it can belong to a single partition. In addition, it is an application-level setting, meaning the whole application gets assigned to those nodes. Node labels can be managed dynamically, and each application can request to run on nodes with a certain label.
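As a sketch of how node labels are managed dynamically, a cluster administrator could partition nodes as follows (the label and host names are hypothetical; the commands are from the YARN admin CLI):

```shell
# Add two labels to the cluster. "exclusive=true" means only
# applications that explicitly request the label may run on
# nodes carrying it.
yarn rmadmin -addToClusterNodeLabels "gpu(exclusive=true),archive(exclusive=false)"

# Attach labels to specific NodeManagers (at most one label per node).
yarn rmadmin -replaceLabelsOnNode "host1.example.com=gpu host2.example.com=archive"

# Verify the current set of cluster node labels.
yarn cluster --list-node-labels
```

An application (or its queue) then opts into a partition, for example via the Capacity scheduler property `yarn.scheduler.capacity.<queue-path>.default-node-label-expression`.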
- YARN resource types —This is a new method introduced with Hadoop 3.0. Before this, YARN recognized only CPU and RAM as resources. Now you can add any arbitrary countable resource. It works with both YARN schedulers (Fair and Capacity). Most customers will use this feature to assign nodes with GPU resources to appropriate workloads, but there are many other possible use cases. A few examples include managing costly licenses that are applied to a limited set of nodes, targeting nodes with SSDs, and splitting parts of the application across different sets of nodes. The last example is possible because you can specify different resource needs for mappers, reducers, and application masters. It can also be combined with YARN node labels when using the Capacity scheduler. When used with the Capacity scheduler, YARN needs the resource calculator set to DominantResourceCalculator. If you encounter an issue when running with the Capacity scheduler, please take a look at this JIRA. For this issue, a quick workaround is to run ‘yarn rmadmin -refreshQueues’
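To illustrate, here is a hedged sketch of declaring an arbitrary countable resource. The resource name (`license`, modeling a limited pool of software licenses), the node count, and the file locations are assumptions for this example:

```shell
# resource-types.xml on the ResourceManager declares the new
# countable resource so YARN starts tracking it.
cat > "$HADOOP_CONF_DIR/resource-types.xml" <<'EOF'
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>license</value>
  </property>
</configuration>
EOF

# node-resources.xml on each licensed NodeManager advertises how many
# units of the resource that particular node offers.
cat > "$HADOOP_CONF_DIR/node-resources.xml" <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.resource-type.license</name>
    <value>4</value>
  </property>
</configuration>
EOF
```

A MapReduce job could then request the resource per task type, e.g. `-Dmapreduce.map.resource.license=1`, which is what makes it possible to send mappers, reducers, and application masters to different sets of nodes. With the Capacity scheduler, remember to set `yarn.scheduler.capacity.resource-calculator` to `org.apache.hadoop.yarn.util.resource.DominantResourceCalculator`.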
- YARN node attributes —Hadoop 3.2 introduced YARN node attributes and YARN allocation tags. Both are designed more from an application point of view. The intent is to make sure the application has the proper resources to run, like the appropriate libraries, OS versions, or selection of containers per node. In order to use them, “yarn.resourcemanager.placement-constraints.handler” needs to be enabled using either the “placement-processor” or “scheduler” value. The “placement-processor” option has more flexibility and allows for node selection before the YARN scheduler is invoked. The “scheduler” option uses the Capacity scheduler and its ordering rules for queues. Node attributes are string properties that you can attach to nodes, and you can have multiple such attributes per node. For example, you can have node attributes like ‘java=1.8’ and ‘os=rhel7.5’.
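For example, using the centralized attribute mapping through the `yarn nodeattributes` CLI added in Hadoop 3.2 (the host name here is hypothetical):

```shell
# Attach two string attributes to a node.
yarn nodeattributes -add "host1.example.com:java(STRING)=1.8,os(STRING)=rhel7.5"

# List all attributes known to the cluster.
yarn nodeattributes -list

# Show which nodes carry which attributes.
yarn nodeattributes -attributestonodes
```

Alternatively, attributes can be supplied in a distributed fashion from each NodeManager's configuration or a script, which fits cases where the node itself knows its properties best.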
- YARN allocation tags —This recent addition is especially useful for long-running applications, which are handled by the new YARN Service framework. It allows greater container placement control using affinity, anti-affinity, and cardinality constraints. The official documentation has a great example of how to experiment with it to get a better understanding, starting with the YARN distributed shell and a placement constraint. For example, this placement specification: “zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3” encodes three constraints:
- Place 3 containers with tag “zk” (standing for ZooKeeper) with node anti-affinity to each other, i.e., do not place more than one container per node
- Place 5 containers with tag “hbase” with affinity to a rack on which containers with tag “zk” are running
- Place 7 containers with tag “spark” in nodes that have at least one, but no more than three, containers with tag “hbase”
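Following the official documentation's distributed shell example, the specification above can be exercised with a command along these lines (the jar path varies with your installation):

```shell
# Launch 3+5+7 "sleep" containers whose placement is governed by the
# zk/hbase/spark affinity, anti-affinity, and cardinality constraints
# described above.
yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command sleep -shell_args 10 \
  -placement_spec "zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3"
```

Watching where the resulting containers land (for example in the ResourceManager UI) is a quick way to build intuition for how each constraint type behaves.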
Summary of differences between node selection methods
Now that we've taken a quick look at the available methods, the following table summarizes the main differences between them. We organized our findings based on the following criteria:
- YARN scheduler—Shows which scheduler is supported by each method
- Granularity level—Shows if you can apply the method only at whole application level or you can apply it at individual stages of an application
- Management type—Shows the typical use case for it, where “Cluster” is used to describe an intent to manage cluster resources, while “Application” manages application needs
- Constraint type—Shows the range of values for the node constraint, “Countable” being a numeric value (like requesting 32GB of RAM), “String” allowing string comparisons (like requesting Python version 2.7), and “On/Off” offering a binary selection of whether the node can be used or not
- Decision level—Shows what element of the application is used to decide the node selection, being either container-at-a-time or whole application
- Attachment—Shows where you define the constraint, with “Node” describing constraints at the node level and “Application” describing constraints at the application level
As we've discussed in this blog, no method is clearly superior for all use cases. Certain methods are available only to application developers (Node Attributes, Allocation Tags), allowing them to describe their hardware and software requirements with great granularity, while a cluster administrator can use Physical Separation, Node Labels, and Resource Types to better distribute workloads across the cluster. The recently introduced Resource Types offer the advantage of working with any YARN scheduler and allow targeting different parts of a job, instead of just the whole job.
All these methods provide the greatest benefits when combined with a variety of worker nodes configurations, each node type being tuned to the specific needs of individual workloads.
If you are interested in learning more about any of the options discussed here, or if you want to know more about HPE EPA, please reach out to your HPE representative.
Meet Infrastructure Insights blogger Daniel Pol. Dani is part of HPE’s Data & Analytics team, creating Solution Reference Architectures for the Big Data landscape.