Posted to common-user@hadoop.apache.org by Jim Twensky <ji...@gmail.com> on 2009/11/14 02:08:36 UTC

Creating a new configuration

The documentation on configuration states:

------------------------------------------------------------------------------------------------------------------------------------------------------------
Unless explicitly turned off, Hadoop by default specifies two
resources, loaded in-order from the classpath:

   1. core-default.xml : Read-only defaults for hadoop.
   2. core-site.xml: Site-specific configuration for a given hadoop
installation.
------------------------------------------------------------------------------------------------------------------------------------------------------------

Does this mean when creating a configuration, we have to explicitly
add the resources like:

		conf.addResource("hdfs-default.xml");
		conf.addResource("hdfs-site.xml");
		conf.addResource("mapred-default.xml");
		conf.addResource("mapred-site.xml");

to make those settings visible to the client program? Skimming through
the source code, I got the impression that hdfs and mapred settings
are not loaded by default and I keep adding the above four lines to every
client I implement.

Speaking of which, what's the difference between addResource and
addDefaultResource methods? Should I use the addDefaultResource method to
add the above resources instead? The documentation is not clear to me
as to which one is appropriate.

Thanks,
Jim

Re: Creating a new configuration

Posted by Steve Loughran <st...@apache.org>.
Jim Twensky wrote:
> The documentation on configuration states:
> 
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unless explicitly turned off, Hadoop by default specifies two
> resources, loaded in-order from the classpath:
> 
>    1. core-default.xml : Read-only defaults for hadoop.
>    2. core-site.xml: Site-specific configuration for a given hadoop
> installation.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> Does this mean when creating a configuration, we have to explicitly
> add the resources like:
> 
> 		conf.addResource("hdfs-default.xml");
> 		conf.addResource("hdfs-site.xml");
> 		conf.addResource("mapred-default.xml");
> 		conf.addResource("mapred-site.xml");
> 
> to make those settings visible to the client program? Skimming through
> the source code, I got the impression that hdfs and mapred settings
> are not loaded by default and I keep adding the above four lines to every
> client I implement.

Looking at the source is always handy.

The Configuration(boolean loadDefaults) constructor loads every file
that has been registered as a "default" via the static
addDefaultResource() method. Use addDefaultResource() to register
resources that should always be loaded whenever the constructor is
called with loadDefaults=true.

You can rummage through the source to see where it is called: the
JobConf class (deprecated in 0.21+) registers the mapred files, and the
static initializer of DistributedFileSystem registers the hdfs ones.

> Speaking of which, what's the difference between addResource and
> addDefaultResource methods? Should I use the addDefaultResource method to
> add the above resources instead? The documentation is not clear to me
> as to which one is appropriate.

Only use addDefaultResource to add XML files to every configuration
created from then on. If you are only working with one instance, call
addResource on that instance instead.
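
The distinction described above can be illustrated with a small toy model. This is only a sketch, not Hadoop's implementation: the DefaultResourceSketch and Conf classes are hypothetical stand-ins, resources are tracked as plain names rather than parsed XML, and the model assumes that a default registration only affects instances created afterwards.

```java
import java.util.ArrayList;
import java.util.List;

class DefaultResourceSketch {

    /** Hypothetical stand-in for a configuration class; not Hadoop's code. */
    static class Conf {
        // Names registered for every instance built with loadDefaults=true.
        private static final List<String> DEFAULTS = new ArrayList<>();

        // Names visible to this particular instance, in load order.
        private final List<String> resources = new ArrayList<>();

        Conf(boolean loadDefaults) {
            if (loadDefaults) {
                resources.addAll(DEFAULTS);
            }
        }

        /** Class-wide: affects instances created after this call. */
        static void addDefaultResource(String name) {
            if (!DEFAULTS.contains(name)) {
                DEFAULTS.add(name);
            }
        }

        /** Instance-only: no other instance sees this resource. */
        void addResource(String name) {
            resources.add(name);
        }

        List<String> getResources() {
            return new ArrayList<>(resources);
        }
    }

    public static void main(String[] args) {
        Conf.addDefaultResource("core-default.xml");
        Conf.addDefaultResource("core-site.xml");

        Conf a = new Conf(true);
        a.addResource("mapred-site.xml"); // visible to 'a' only

        Conf b = new Conf(true);   // gets the registered defaults
        Conf c = new Conf(false);  // loadDefaults=false: starts empty

        System.out.println("a: " + a.getResources());
        System.out.println("b: " + b.getResources());
        System.out.println("c: " + c.getResources());
    }
}
```

In this model, the resource added through the instance method never shows up in the second instance, while anything registered through the static method appears in every later instance built with loadDefaults=true.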


Re: Creating a new configuration

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Nov 13, 2009 at 8:08 PM, Jim Twensky <ji...@gmail.com> wrote:
> The documentation on configuration states:
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unless explicitly turned off, Hadoop by default specifies two
> resources, loaded in-order from the classpath:
>
>   1. core-default.xml : Read-only defaults for hadoop.
>   2. core-site.xml: Site-specific configuration for a given hadoop
> installation.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Does this mean when creating a configuration, we have to explicitly
> add the resources like:
>
>                conf.addResource("hdfs-default.xml");
>                conf.addResource("hdfs-site.xml");
>                conf.addResource("mapred-default.xml");
>                conf.addResource("mapred-site.xml");
>
> to make those settings visible to the client program? Skimming through
> the source code, I got the impression that hdfs and mapred settings
> are not loaded by default and I keep adding the above four lines to every
> client I implement.
>
> Speaking of which, what's the difference between addResource and
> addDefaultResource methods? Should I use the addDefaultResource method to
> add the above resources instead? The documentation is not clear to me
> as to which one is appropriate.
>
> Thanks,
> Jim
>

If you launch components through bin/hadoop, the resources are picked
up. However, if you are creating a stripped-down client started outside
of bin/hadoop, i.e. you want to use Hadoop from inside Tomcat etc., you
likely need conf.addResource().
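
One way to find out whether a client started outside bin/hadoop needs explicit conf.addResource() calls is to check which configuration files its class loader can see. The sketch below uses only the JDK. The ClasspathResourceCheck class and its findMissing helper are hypothetical names introduced for illustration, and it assumes the files would be looked up by name on the classpath, as with the string form of addResource.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClasspathResourceCheck {

    /** Returns the names that the given class loader cannot resolve. */
    static List<String> findMissing(List<String> names, ClassLoader loader) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            URL url = loader.getResource(name);
            if (url == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Inside a container such as Tomcat, the context class loader is
        // usually the one that matters; fall back to the system loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }

        List<String> names = Arrays.asList(
                "core-site.xml", "hdfs-site.xml", "mapred-site.xml");

        for (String name : findMissing(names, loader)) {
            System.out.println("not on classpath: " + name);
        }
    }
}
```

Any file reported as missing would have to be put on the classpath or added to the Configuration instance explicitly, for example through the addResource overloads that take a Path or a URL.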