Posted to user@pig.apache.org by Yang <te...@gmail.com> on 2012/07/17 02:15:24 UTC

who is resetting my mapred.map.tasks?

I have the following Pig script.
At the beginning, I set mapred.map.tasks to 1020.

But when the job is launched, the job.xml shown in the JobTracker UI has
mapred.map.tasks set to "50", a value I never used anywhere.
What is resetting this value?

This is pig-0.10.0.

Thanks
Yang


################################

SET mapred.min.split.size  10000;
SET pig.noSplitCombination true;

-- SET mapred.map.tasks.speculative.execution false;
-- SET mapred.reduce.tasks.speculative.execution false;
SET mapred.map.tasks 1020;


set default_parallel 42;

verdict = LOAD '$input' AS ( partition_key:chararray);



xx = FILTER verdict by partition_key == '';

dump xx;

Re: who is resetting my mapred.map.tasks?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
mapred.map.tasks does not actually control the number of map tasks;
it's only a default (which I don't think I've ever actually seen kick in).

Assuming you are working with HDFS files and your input format is some
FileInputFormat variant (PigStorage uses that), you probably want to
change mapred.max.split.size if you want to get more mappers than you
are getting now.

D
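
For illustration, here is a minimal sketch of the suggestion above applied to the original script. The 16 MB value is an assumption chosen only as an example, not a recommendation; the right number depends on the input size and the HDFS block size. As far as I understand it, FileInputFormat picks each split size as max(min split size, min(max split size, block size)), so lowering the maximum below the block size should produce more, smaller splits and therefore more mappers.

```pig
-- Sketch only: 16777216 bytes (16 MB) is an assumed example value.
-- A max split size smaller than the HDFS block size should yield more mappers.
SET mapred.max.split.size 16777216;

-- Kept from the original script so that Pig does not merge small splits again.
SET pig.noSplitCombination true;

-- mapred.map.tasks is left out: it does not control the number of map tasks.

verdict = LOAD '$input' AS (partition_key:chararray);
xx = FILTER verdict BY partition_key == '';
DUMP xx;
```

The number of map tasks actually launched can then be checked in the JobTracker UI for the job, rather than by reading mapred.map.tasks in job.xml.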
