Posted to mapreduce-user@hadoop.apache.org by Sid Kumar <sq...@gmail.com> on 2012/06/07 00:41:02 UTC

override mapred-site.xml from command line

Hi,
I am trying to override mapred-site.xml properties (more specifically
mapred.compress.map.output and mapred.output.compression.codec) from the
command line when I execute the jar.
I have been using hadoop jar <jarname> <class>
-Dmapred.compress.map.output=true and
-Dmapred.output.compression.codec=org.apache.hadoop.io.SnappyCodec

The above doesn't work: the job.xml for the job still uses the default
properties and not the ones I specify here. Is there a different approach to
override these properties? I am submitting jobs from a client machine that
has the same version of the configuration files as my cluster.

Thanks

Sid

Re: override mapred-site.xml from command line

Posted by Sid Kumar <sq...@gmail.com>.
Thanks Marcos and Zhu. I set them explicitly in my program for now. The
config approach suggested by Zhu makes sense and I will try that too.

Sid

On Thu, Jun 7, 2012 at 7:07 AM, GUOJUN Zhu <gu...@freddiemac.com> wrote:

> You can also define a new config file ("mymapred.xml") and put all the
> properties you want to change there. Then you can run "hadoop --config
> mymapred.xml jar {yourjar} {classname}". Also, I add the extra properties
> with a space between -D and the property name: "hadoop --config mymapred.xml
> jar {yourjar} {classname} -D mapred.create.symlink=yes .... {your
> program arguments}". The hadoop program is somewhat picky when interpreting
> its arguments. In your runner, try printing out all the arguments; if the
> "-D..." options show up there, you know that Hadoop did not pick up those
> property overrides.

Re: override mapred-site.xml from command line

Posted by GUOJUN Zhu <gu...@freddiemac.com>.
You can also define a new config file ("mymapred.xml") and put all the
properties you want to change there. Then you can run "hadoop --config
mymapred.xml jar {yourjar} {classname}". Also, I add the extra properties
with a space between -D and the property name: "hadoop --config mymapred.xml
jar {yourjar} {classname} -D mapred.create.symlink=yes .... {your
program arguments}". The hadoop program is somewhat picky when interpreting
its arguments. In your runner, try printing out all the arguments; if the
"-D..." options show up there, you know that Hadoop did not pick up those
property overrides.

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_zhu@freddiemac.com
Financial Engineering
Freddie Mac
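
The diagnostic suggested above (print the arguments your runner receives and look for leftover "-D..." tokens) can be sketched in plain Java. This is only an illustration using the standard library; the class name ArgsProbe and the method leftoverOverrides are hypothetical and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

final class ArgsProbe {
    // Returns the arguments that still look like -D overrides, i.e. tokens
    // that were not consumed before the arguments reached the program.
    static List<String> leftoverOverrides(String[] args) {
        List<String> leftovers = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("-D") && i + 1 < args.length) {
                // Spaced form: "-D key=value" arrives as two tokens.
                leftovers.add(arg + " " + args[i + 1]);
                i++;
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                // Joined form: "-Dkey=value" arrives as one token.
                leftovers.add(arg);
            }
        }
        return leftovers;
    }

    public static void main(String[] args) {
        System.out.println("arguments seen by the program:");
        for (String arg : args) {
            System.out.println("  " + arg);
        }
        List<String> leftovers = leftoverOverrides(args);
        if (!leftovers.isEmpty()) {
            System.out.println("not applied as configuration overrides: " + leftovers);
        }
    }
}
```

As far as I know, Hadoop only strips the generic options (-D, -conf and so on) when the driver goes through ToolRunner or GenericOptionsParser, and they have to come before the program's own arguments, so a check like this should come back empty in such a driver. Also note that in stock Hadoop "--config" names a configuration directory; a single extra file is normally passed with the generic option "-conf <file>".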



On 06/07/2012 09:16 AM, Marcos Ortiz <ml...@uci.cu> wrote:

> On Wed, Jun 6, 2012 at 4:06 PM, Sid Kumar <sq...@gmail.com> wrote:
>> Mayank,
>> I don't have a final tag set for that property. I looked at the
>> mapred-default.xml in the src/mapred folder and that doesn't have a final
>> tag either. Should I set it explicitly to false?
>
> You should do it explicitly.
> You should read the excellent blog post from Lars Francke, where he does a
> great job of explaining, parameter by parameter, why it is advisable to
> set them to final.
> http://gbif.blogspot.com/2011/01/setting-up-hadoop-cluster-part-1-manual.html




Re: override mapred-site.xml from command line

Posted by Marcos Ortiz <ml...@uci.cu>.

On 06/06/2012 07:44 PM, Sid Kumar wrote:
> I am able to set it via the API:
> Configuration.setBoolean("mapred.output.compress", true). This works!
>
> But the -D from the command line still doesn't work. Any idea what I 
> may be missing here?
>
> Some additional info - Also when I try running the -D on command line 
> on a local cluster (pseudo distributed mode) it works, but when I try 
> it on a fully distributed cluster running jobs from a client machine 
> it doesn't work. Is there a different way for setting it in this case 
> - in hadoop-env perhaps?
>
> Thanks
> Sid
>
> On Wed, Jun 6, 2012 at 4:06 PM, Sid Kumar <sq...@gmail.com> wrote:
>
>     Mayank,
>     I don't have a final tag set for that property. I looked at the
>     mapred-default.xml in the src/mapred folder and that doesn't have
>     a final tag either. Should I set it explicitly to false?
>
You should do it explicitly.
You should read the excellent blog post from Lars Francke, where he does a
great job of explaining, parameter by parameter, why it is advisable to
set them to final.
http://gbif.blogspot.com/2011/01/setting-up-hadoop-cluster-part-1-manual.html

Regards

-- 
Marcos Luis Ortíz Valmaseda
  Data Engineer && Sr. System Administrator at UCI
  http://marcosluis2186.posterous.com
  http://www.linkedin.com/in/marcosluis2186
  Twitter: @marcosluis2186




Re: override mapred-site.xml from command line

Posted by Sid Kumar <sq...@gmail.com>.
I am able to set it via the API:
Configuration.setBoolean("mapred.output.compress", true). This works!

But the -D from the command line still doesn't work. Any idea what I may be
missing here?

Some additional info - Also when I try running the -D on command line on a
local cluster (pseudo distributed mode) it works, but when I try it on a
fully distributed cluster running jobs from a client machine it doesn't
work. Is there a different way for setting it in this case - in hadoop-env
perhaps?

Thanks
Sid


Re: override mapred-site.xml from command line

Posted by Sid Kumar <sq...@gmail.com>.
Mayank,
I don't have a final tag set for that property. I looked at the
mapred-default.xml in the src/mapred folder and that doesn't have a final
tag either. Should I set it explicitly to false?

Sid

On Wed, Jun 6, 2012 at 3:50 PM, Mayank Bansal <ma...@apache.org> wrote:

> Check whether these parameters have <final>true</final> in your
> mapred-site.xml.
>
> Setting final to false should solve your problem.

Re: override mapred-site.xml from command line

Posted by Mayank Bansal <ma...@apache.org>.
Check whether these parameters have <final>true</final> in your
mapred-site.xml.

Setting final to false should solve your problem.
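
The effect of <final> can be illustrated with a small stand-in written in plain Java. It is a simplified model of the rule documented for Hadoop's Configuration (once a resource marks a property final, resources loaded later cannot change it), not Hadoop's own implementation; the class FinalAwareConfig and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class FinalAwareConfig {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finalKeys = new HashSet<>();

    // Loads one property from a resource. Returns false when the value is
    // ignored because an earlier resource declared the property final.
    boolean load(String name, String value, boolean isFinal) {
        if (finalKeys.contains(name)) {
            return false;
        }
        values.put(name, value);
        if (isFinal) {
            finalKeys.add(name);
        }
        return true;
    }

    String get(String name) {
        return values.get(name);
    }

    public static void main(String[] args) {
        FinalAwareConfig conf = new FinalAwareConfig();
        // Site file loaded first, with the property marked final.
        conf.load("mapred.compress.map.output", "false", true);
        // A later resource (for example the job's own settings) tries to change it.
        boolean applied = conf.load("mapred.compress.map.output", "true", false);
        System.out.println("override applied: " + applied);
        System.out.println("effective value: " + conf.get("mapred.compress.map.output"));
    }
}
```

If this model matches your cluster, a property marked final in a node's mapred-site.xml would win over the value submitted with the job, which is why checking for the tag is a reasonable first step.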
