HPE Ezmeral Software platform


 
parker9584
Occasional Visitor

How to get default configuration properties in MapR

I'm trying to find the default configuration properties for MapR Hadoop. From the HPE website (MapR is now owned by HPE) I found the path to the configuration files, but those files show only some explicitly set properties, not all of them.

 

3 REPLIES
okalinin
Frequent Visitor

Re: How to get default configuration properties in MapR

Hi,

One way to list Hadoop configuration properties (including defaults) is the 'hadoop conf' command. Its output includes the default values for properties that are not explicitly configured. You may also want to check out the 'hadoop conf-details' option, which may be of interest.
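A minimal sketch of that workflow, assuming a MapR client node where 'hadoop' is on the PATH. The grep filter itself works on any key=value stream, so it is illustrated below on a couple of sample lines (the names and values are made up for illustration, not taken from a live cluster):

```shell
# On a MapR client node (assumption: 'hadoop' is on the PATH):
#   hadoop conf | sort > all_props.txt
#   grep '^dfs\.' all_props.txt
# The same filter, shown on sample key=value lines so it runs anywhere:
printf 'dfs.blocksize=268435456\nfs.defaultFS=maprfs:///\n' | grep '^dfs\.'
```

Sorting first makes it easy to diff the effective configuration between two nodes or two releases.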

Hope this helps.

Best Regards,
Alex

tdunning
HPE Pro

Re: How to get default configuration properties in MapR

Configuration management with Hadoop can be very complex.

In my experience, because it can be tricky to determine which configuration files are actually being used, it is best to go directly to the ground truth. I do this via two mechanisms (see Alex's answer for a slightly different and less home-made approach).

First, I use a small Java program to dump the entire contents of a default Configuration object. This is easy because Configuration objects are iterable. I adapted a Stack Overflow answer and built a GitHub repo with the code; see the README for detailed instructions.

 

$ git clone https://github.com/tdunning/config-print.git
Cloning into 'config-print'...
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 13 (delta 1), reused 12 (delta 0), pack-reused 0
Unpacking objects: 100% (13/13), done.
$ cd config-print/
$ mvn -q package
$ HADOOP_HOME=/opt/mapr/hadoop/hadoop-2.7.0/ ./target/config-printer | sort
dfs.ha.fencing.ssh.connect-timeout = 30000
file.blocksize = 67108864
file.bytes-per-checksum = 512
file.client-write-packet-size = 65536
...

 

Second, I use system tracing to find all of the configuration files that are accessed, and in what order. I don't have a Hadoop environment handy, but here is how I found out which files R reads from my home directory:

 

$ strace -o r.log R
R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
   ... stuff deleted ...
Type 'q()' to quit R.

> q()
Save workspace image? [y/n/c]: n
$ grep open r.log | grep /home/
openat(AT_FDCWD, "/home/tdunning/.Renviron.", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/home/tdunning/.Renviron", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/home/tdunning/.Rprofile", O_RDONLY) = -1 ENOENT (No such file or directory)
$

 

 

 

I work for HPE
Dave Olker
HPE Pro

Re: How to get default configuration properties in MapR

To add to Alex's answer, I tried playing with the "hadoop conf-details" command and wanted to format the output as pretty, more human-readable XML. Here's an example of capturing all the Hadoop configuration parameters into a nicely formatted XML file and then displaying a few of the properties:

# hadoop conf-details print-all-effective-properties | xmllint --output hadoop_conf.xml --format -
# head -32 hadoop_conf.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>yarn.ipc.rpc.class</name>
    <value>org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC</value>
    <source>yarn-default.xml</source>
  </property>
  <property>
    <name>mapreduce.job.maxtaskfailures.per.tracker</name>
    <value>3</value>
    <source>mapred-default.xml</source>
  </property>
  <property>
    <name>yarn.client.max-cached-nodemanagers-proxies</name>
    <value>0</value>
    <source>yarn-default.xml</source>
  </property>
  <property>
    <name>mapreduce.job.speculative.retry-after-speculate</name>
    <value>15000</value>
    <source>mapred-default.xml</source>
  </property>
  <property>
    <name>ha.health-monitor.connect-retry-interval.ms</name>
    <value>1000</value>
    <source>core-default.xml</source>
  </property>
  <property>
    <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
    <value>true</value>
    <source>yarn-default.xml</source>
  </property>

 

The above XML file now contains all the Hadoop parameters with their respective values and source files in a nicely formatted output.
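Once such a file exists, individual values can be pulled out with ordinary text tools, no XML library needed. A hedged sketch: the tiny stand-in file below (containing one property from the sample output above) just makes the snippet self-contained; on a real node you would point the same grep/sed at the generated hadoop_conf.xml:

```shell
# Create a small stand-in for hadoop_conf.xml so the snippet runs anywhere.
cat > hadoop_conf.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.job.maxtaskfailures.per.tracker</name>
    <value>3</value>
    <source>mapred-default.xml</source>
  </property>
</configuration>
EOF

# Grab the <value> line that follows a given <name> and strip the tags.
grep -A1 '<name>mapreduce.job.maxtaskfailures.per.tracker</name>' hadoop_conf.xml |
  sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
```

This relies on xmllint's one-element-per-line formatting; for anything more complicated, 'xmllint --xpath' on the same file is the sturdier choice.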

Hope this helps,

Dave

I work for HPE