Posted to common-user@hadoop.apache.org by Vadim Zaliva <kr...@gmail.com> on 2009/02/25 06:04:31 UTC
Re: Skip Reduce Phase
On Thu, Feb 7, 2008 at 10:07, Owen O'Malley <oo...@yahoo-inc.com> wrote:
> Setting it to 0 skips all of the buffering, sorting, merging, and shuffling.
> It passes the objects straight from the mapper to the output format, which
> writes it straight to hdfs.
I just tried setting the number of reduce tasks to 0, but the JobTracker
still shows a reduce task running in the "reduce > sort" phase. I have a
big data set and this takes a while. It would be good to find a way to skip it.
Vadim
Re: Skip Reduce Phase
Posted by Vadim Zaliva <kr...@gmail.com>.
I am sorry, it was my fault: I had not updated the JAR.
Now it seems to be working as expected. Thanks!
Vadim
On Tue, Feb 24, 2009 at 21:23, Jothi Padmanabhan <jo...@yahoo-inc.com> wrote:
> If you had set the number of reduce tasks to 0, you should not see the
> reduce>sort. How did you set the number of reducers?
> You could do that by doing
>
> job.setNumReduceTasks(0);
>
> Jothi
Re: Skip Reduce Phase
Posted by Jothi Padmanabhan <jo...@yahoo-inc.com>.
Sorry, this mail was intended for somebody else. Please disregard.
On 2/25/09 2:33 PM, "Jothi Padmanabhan" <jo...@yahoo-inc.com> wrote:
> Just to clarify -- setting test.build.data on the command line to point to
> some arbitrary directory in /tmp should work
>
> ant -Dtestcase=TestMapReduceLocal -Dtest.output=yes -Dtest.build.data=/tmp/foo test-core
>
> Jothi
Re: Skip Reduce Phase
Posted by Jothi Padmanabhan <jo...@yahoo-inc.com>.
Just to clarify -- setting test.build.data on the command line to point to
some arbitrary directory in /tmp should work
ant -Dtestcase=TestMapReduceLocal -Dtest.output=yes -Dtest.build.data=/tmp/foo test-core
Jothi
On 2/25/09 10:53 AM, "Jothi Padmanabhan" <jo...@yahoo-inc.com> wrote:
> If you had set the number of reduce tasks to 0, you should not see the
> reduce>sort. How did you set the number of reducers?
> You could do that by doing
>
> job.setNumReduceTasks(0);
>
> Jothi
Re: Skip Reduce Phase
Posted by Jothi Padmanabhan <jo...@yahoo-inc.com>.
If you had set the number of reduce tasks to 0, you should not see the
reduce>sort. How did you set the number of reducers?
You could do that by doing
job.setNumReduceTasks(0);
Jothi
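
For readers who want to see the difference in miniature, here is a toy Java
model. It is not Hadoop code: the class MapOnlySketch and its map and run
methods are hypothetical names made up for illustration. It only mimics the
behaviour described in this thread. With zero reduce tasks the mapped records
go straight to the output in input order. With any positive count, a sort step
(standing in for the shuffle, sort, and merge) runs before a pass-through reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only; none of this is Hadoop's API or implementation. */
public class MapOnlySketch {

    /** Hypothetical stand-in for a mapper: upper-cases each record. */
    public static String map(String record) {
        return record.toUpperCase();
    }

    /**
     * Runs the toy job over the input records.
     * numReduceTasks == 0: mapped records are returned in input order, unsorted.
     * numReduceTasks > 0: mapped records are sorted first, then passed
     * through an identity "reduce".
     */
    public static List<String> run(List<String> input, int numReduceTasks) {
        List<String> mapped = new ArrayList<>();
        for (String record : input) {
            mapped.add(map(record));
        }
        if (numReduceTasks == 0) {
            // Map-only: output goes straight to the "output format".
            return mapped;
        }
        // Stand-in for the sort/merge that precedes the reduce phase.
        Collections.sort(mapped);
        return mapped;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b 2", "a 1", "c 3");
        System.out.println(run(input, 0)); // prints [B 2, A 1, C 3]
        System.out.println(run(input, 1)); // prints [A 1, B 2, C 3]
    }
}
```

In a real job driver the only change needed is the job.setNumReduceTasks(0)
call quoted in this thread. As the follow-up shows, the rebuilt JAR also has
to be the one actually submitted, otherwise the old reducer count still applies.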