Posted to common-dev@hadoop.apache.org by Niels Basjes <Ni...@basjes.nl> on 2014/08/21 10:15:24 UTC

Deprecated configuration settings set from the core code / {core,hdfs,...}-default.xml ??

Hi,

I ran into this while wondering why starting something as trivial as the
Pig Grunt shell prints the following messages during startup:

2014-08-21 09:36:55,171 [main] INFO
 org.apache.hadoop.conf.Configuration.deprecation - *mapred.job.tracker is
deprecated*. Instead, use mapreduce.jobtracker.address
2014-08-21 09:36:55,172 [main] INFO
 org.apache.hadoop.conf.Configuration.deprecation - *fs.default.name is
deprecated*. Instead, use fs.defaultFS

What I found is that these settings are not part of my own configuration;
they come from the core Hadoop code and default files.
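
To illustrate why the messages show up even with a clean user config, here
is a toy model in Python. It is only a sketch: ToyConfiguration, the
"source" labels and the property values are made up for illustration, and
it does not reproduce how Hadoop's Configuration class actually decides
when to log. Only the two old/new key pairs come from the log messages
above.

```python
import logging

log = logging.getLogger("toy.deprecation")

# Old key -> new key, taken from the two log messages above.
DEPRECATIONS = {
    "mapred.job.tracker": "mapreduce.jobtracker.address",
    "fs.default.name": "fs.defaultFS",
}


class ToyConfiguration:
    """Toy stand-in for a configuration object; NOT Hadoop's Configuration."""

    def __init__(self):
        self.values = {}
        self.warnings = []  # (source, message) pairs, kept for inspection

    def set(self, key, value, source):
        new_key = DEPRECATIONS.get(key)
        if new_key is not None:
            # A deprecated key triggers the message no matter who set it.
            msg = f"{key} is deprecated. Instead, use {new_key}"
            self.warnings.append((source, msg))
            log.info(msg)
            key = new_key
        self.values[key] = value


conf = ToyConfiguration()
# Placeholder values; the point is only where each key comes from.
conf.set("fs.default.name", "file:///", source="core-default.xml")
conf.set("mapred.job.tracker", "local", source="framework code")
conf.set("fs.defaultFS", "hdfs://example:8020", source="user config")
```

In this model both warnings are attributed to the shipped defaults and the
framework code, while the user's own setting of fs.defaultFS produces none.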

I found that mapred.job.tracker is set from code when the mapred package
is used (this is probably what Pig uses)
https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java#L869

and that fs.default.name is explicitly defined, and described as
deprecated, in core-default.xml, one of the *-default.xml config files.
https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml#L524

I did some more digging and found that there are several other properties
that have been defined as deprecated that are still present in the various
*-default.xml files throughout the hadoop code base.

I used this list as a reference:
https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/site/apt/DeprecatedProperties.apt.vm

The ones I found so far:
./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:
 <name>fs.default.name</name>
./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:
 <name>io.bytes.per.checksum</name>
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml:
 <name>mapreduce.job.counters.limit</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapred.job.map.memory.mb</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapred.job.reduce.memory.mb</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapreduce.reduce.class</name>
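
For anyone who wants to repeat this kind of search, a minimal Python
sketch follows. It is not the exact command I ran. The DEPRECATED set is a
hand-copied subset of the names in DeprecatedProperties.apt.vm (the real
list is much longer), and the function names are made up for this example.
The regex is a rough match on <name> elements, so it would also report
names that only appear inside XML comments.

```python
import re
import sys
from pathlib import Path

# Assumption: a small hand-copied subset of the deprecated names; extend
# it from DeprecatedProperties.apt.vm before relying on the result.
DEPRECATED = {
    "fs.default.name",
    "io.bytes.per.checksum",
    "mapred.job.tracker",
    "mapred.job.map.memory.mb",
    "mapred.job.reduce.memory.mb",
}

# Rough match for <name>some.property</name>, tolerating whitespace.
NAME_TAG = re.compile(r"<name>\s*([^<\s]+)\s*</name>")


def deprecated_names_in(xml_text, deprecated=DEPRECATED):
    """Return deprecated property names defined in xml_text, in file order."""
    return [n for n in NAME_TAG.findall(xml_text) if n in deprecated]


def scan_tree(root, deprecated=DEPRECATED):
    """Yield (path, name) for each deprecated name in *-default.xml under root."""
    for path in sorted(Path(root).rglob("*-default.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for name in deprecated_names_in(text, deprecated):
            yield str(path), name


if __name__ == "__main__":
    # Usage: python scan_deprecated.py /path/to/hadoop/checkout
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, name in scan_tree(root):
        print(f"{path}: <name>{name}</name>")
```

The script name in the usage comment is arbitrary; any file name works.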

It seems to me that fixing these would remove a lot of pointless clutter
from the console output that end users see.

Or is there a good reason to keep it like this?

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes