You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-user@hadoop.apache.org by Raimon Bosch <ra...@gmail.com> on 2011/10/23 03:01:38 UTC

why one of the reducers it's always slower?

Hi all,

I'm executing one job to convert logs into hive tables. The times are very
good once we have added a proper number of nodes but the reduce phase spends
always more time in one of the machines.

task_201110211442_0086_r_000000<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
100.00%
reduce > reduce
23-Oct-2011 00:26:42
23-Oct-2011 00:28:09 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
task_201110211442_0086_r_000001<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
100.00%
reduce > reduce
23-Oct-2011 00:26:42
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
task_201110211442_0086_r_000002<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
100.00%
reduce > reduce
23-Oct-2011 00:26:43
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
task_201110211442_0086_r_000003<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
100.00%
reduce > reduce
23-Oct-2011 00:26:43
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
task_201110211442_0086_r_000004<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
100.00%
reduce > reduce
23-Oct-2011 00:26:44
23-Oct-2011 00:35:56 (9mins, 11sec)

10<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
task_201110211442_0086_r_000005<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>
100.00%
reduce > reduce
23-Oct-2011 00:26:44
23-Oct-2011 00:28:09 (1mins, 24sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>

As you can see in the statistics from 6 reduce executions one is spending 9
minutes while the rest is spending 1 minute. I think that it is because one
of the reducers has to spend time sorting the results from the rest of
nodes.

There is a way to reduce this time?

Thanks in advance,
Raimon Bosch

Re: why one of the reducers it's always slower?

Posted by Harsh J <ha...@cloudera.com>.

Raimon,

Does the 9-min reducer contain the same amount of input records as the
others? You can check and compare its reported counters with the other
tasks. Could just be your key distribution.

On Sun, Oct 23, 2011 at 6:31 AM, Raimon Bosch <ra...@gmail.com> wrote:
> Hi all,
>
> I'm executing one job to convert logs into hive tables. The times are very
> good once we have added a proper number of nodes but the reduce phase spends
> always more time in one of the machines.
>
> task_201110211442_0086_r_000000<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:42
> 23-Oct-2011 00:28:09 (1mins, 27sec)
>
> 9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
> task_201110211442_0086_r_000001<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:42
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
> task_201110211442_0086_r_000002<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:43
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
> task_201110211442_0086_r_000003<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:43
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
> task_201110211442_0086_r_000004<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:44
> 23-Oct-2011 00:35:56 (9mins, 11sec)
>
> 10<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
> task_201110211442_0086_r_000005<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:44
> 23-Oct-2011 00:28:09 (1mins, 24sec)
>
> 9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>
>
> As you can see in the statistics from 6 reduce executions one is spending 9
> minutes while the rest is spending 1 minute. I think that it is because one
> of the reducers has to spend time sorting the results from the rest of
> nodes.
>
> There is a way to reduce this time?
>
> Thanks in advance,
> Raimon Bosch
>



-- 
Harsh J

Re: why one of the reducers it's always slower?

Posted by Raimon Bosch <ra...@gmail.com>.

The answer is yes. I have checked my code and I was generating one map key
for each table when I didn't  need to do it.

Now, I'm generating keys that are including the name of the table and a
unique id. That information is used during the MultiOutput to generate
proper outputs for each table.

2011/10/23 Raimon Bosch <ra...@gmail.com>

> Thanks for your help,
>
> In fact, I'm using MultipleOutputFormat to generate one file for each hive
> table and in this case I'm generating only one of the possible hive tables.
> Can I use MultipleOutputFormat and still distribute my keys over all the
> cluster?
>
> 2011/10/23 Ayon Sinha <ay...@yahoo.com>
>
>> Looks like that is the reducer who is actually doing the work with 14M
>> input records.
>>
>>
>>  Reduce input groups 1
>>  Combine output records 0
>>  Reduce shuffle bytes 5,135,004,496
>>  Reduce output records 14,232,592
>>  Spilled Records 14,232,592
>>  Combine input records 0
>>  Reduce input records 14,232,592
>>
>>
>>
>> Other reducers have this:
>> Reduce output records0
>> Spilled Records0
>> Combine input records0
>> Reduce input records0
>>
>> -Ayon
>> See My Photos on Flickr
>> Also check out my Blog for answers to commonly asked questions.
>>
>>
>>
>> ________________________________
>> From: Raimon Bosch <ra...@gmail.com>
>> To: common-user@hadoop.apache.org
>> Sent: Saturday, October 22, 2011 6:01 PM
>> Subject: why one of the reducers it's always slower?
>>
>> Hi all,
>>
>> I'm executing one job to convert logs into hive tables. The times are very
>> good once we have added a proper number of nodes but the reduce phase
>> spends
>> always more time in one of the machines.
>>
>> task_201110211442_0086_r_000000<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:42
>> 23-Oct-2011 00:28:09 (1mins, 27sec)
>>
>> 9<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000
>> >
>> task_201110211442_0086_r_000001<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:42
>> 23-Oct-2011 00:28:10 (1mins, 27sec)
>>
>> 9<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001
>> >
>> task_201110211442_0086_r_000002<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:43
>> 23-Oct-2011 00:28:10 (1mins, 27sec)
>>
>> 9<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002
>> >
>> task_201110211442_0086_r_000003<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:43
>> 23-Oct-2011 00:28:10 (1mins, 27sec)
>>
>> 9<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003
>> >
>> task_201110211442_0086_r_000004<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:44
>> 23-Oct-2011 00:35:56 (9mins, 11sec)
>>
>> 10<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004
>> >
>> task_201110211442_0086_r_000005<
>> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005
>> >
>> 100.00%
>> reduce > reduce
>> 23-Oct-2011 00:26:44
>> 23-Oct-2011 00:28:09 (1mins, 24sec)
>>
>> 9<
>> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005
>> >
>>
>> As you can see in the statistics from 6 reduce executions one is spending
>> 9
>> minutes while the rest is spending 1 minute. I think that it is because
>> one
>> of the reducers has to spend time sorting the results from the rest of
>> nodes.
>>
>> There is a way to reduce this time?
>>
>> Thanks in advance,
>> Raimon Bosch
>>
>
>

Re: why one of the reducers it's always slower?

Posted by Raimon Bosch <ra...@gmail.com>.

Thanks for your help,

In fact, I'm using MultipleOutputFormat to generate one file for each hive
table and in this case I'm generating only one of the possible hive tables.
Can I use MultipleOutputFormat and still distribute my keys over all the
cluster?

2011/10/23 Ayon Sinha <ay...@yahoo.com>

> Looks like that is the reducer who is actually doing the work with 14M
> input records.
>
>
>  Reduce input groups 1
>  Combine output records 0
>  Reduce shuffle bytes 5,135,004,496
>  Reduce output records 14,232,592
>  Spilled Records 14,232,592
>  Combine input records 0
>  Reduce input records 14,232,592
>
>
>
> Other reducers have this:
> Reduce output records0
> Spilled Records0
> Combine input records0
> Reduce input records0
>
> -Ayon
> See My Photos on Flickr
> Also check out my Blog for answers to commonly asked questions.
>
>
>
> ________________________________
> From: Raimon Bosch <ra...@gmail.com>
> To: common-user@hadoop.apache.org
> Sent: Saturday, October 22, 2011 6:01 PM
> Subject: why one of the reducers it's always slower?
>
> Hi all,
>
> I'm executing one job to convert logs into hive tables. The times are very
> good once we have added a proper number of nodes but the reduce phase
> spends
> always more time in one of the machines.
>
> task_201110211442_0086_r_000000<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:42
> 23-Oct-2011 00:28:09 (1mins, 27sec)
>
> 9<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000
> >
> task_201110211442_0086_r_000001<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:42
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001
> >
> task_201110211442_0086_r_000002<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:43
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002
> >
> task_201110211442_0086_r_000003<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:43
> 23-Oct-2011 00:28:10 (1mins, 27sec)
>
> 9<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003
> >
> task_201110211442_0086_r_000004<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:44
> 23-Oct-2011 00:35:56 (9mins, 11sec)
>
> 10<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004
> >
> task_201110211442_0086_r_000005<
> http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005
> >
> 100.00%
> reduce > reduce
> 23-Oct-2011 00:26:44
> 23-Oct-2011 00:28:09 (1mins, 24sec)
>
> 9<
> http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005
> >
>
> As you can see in the statistics from 6 reduce executions one is spending 9
> minutes while the rest is spending 1 minute. I think that it is because one
> of the reducers has to spend time sorting the results from the rest of
> nodes.
>
> There is a way to reduce this time?
>
> Thanks in advance,
> Raimon Bosch
>

Re: why one of the reducers it's always slower?

Posted by Ayon Sinha <ay...@yahoo.com>.

Looks like that is the reducer who is actually doing the work with 14M input records. 


 Reduce input groups 1 
 Combine output records 0 
 Reduce shuffle bytes 5,135,004,496 
 Reduce output records 14,232,592 
 Spilled Records 14,232,592 
 Combine input records 0 
 Reduce input records 14,232,592


 
Other reducers have this:
Reduce output records0
Spilled Records0
Combine input records0
Reduce input records0
 
-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.



________________________________
From: Raimon Bosch <ra...@gmail.com>
To: common-user@hadoop.apache.org
Sent: Saturday, October 22, 2011 6:01 PM
Subject: why one of the reducers it's always slower?

Hi all,

I'm executing one job to convert logs into hive tables. The times are very
good once we have added a proper number of nodes but the reduce phase spends
always more time in one of the machines.

task_201110211442_0086_r_000000<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
100.00%
reduce > reduce
23-Oct-2011 00:26:42
23-Oct-2011 00:28:09 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000000>
task_201110211442_0086_r_000001<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
100.00%
reduce > reduce
23-Oct-2011 00:26:42
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000001>
task_201110211442_0086_r_000002<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
100.00%
reduce > reduce
23-Oct-2011 00:26:43
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000002>
task_201110211442_0086_r_000003<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
100.00%
reduce > reduce
23-Oct-2011 00:26:43
23-Oct-2011 00:28:10 (1mins, 27sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000003>
task_201110211442_0086_r_000004<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
100.00%
reduce > reduce
23-Oct-2011 00:26:44
23-Oct-2011 00:35:56 (9mins, 11sec)

10<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000004>
task_201110211442_0086_r_000005<http://204.236.208.103:50030/taskdetails.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>
100.00%
reduce > reduce
23-Oct-2011 00:26:44
23-Oct-2011 00:28:09 (1mins, 24sec)

9<http://204.236.208.103:50030/taskstats.jsp?jobid=job_201110211442_0086&tipid=task_201110211442_0086_r_000005>

As you can see in the statistics from 6 reduce executions one is spending 9
minutes while the rest is spending 1 minute. I think that it is because one
of the reducers has to spend time sorting the results from the rest of
nodes.

There is a way to reduce this time?

Thanks in advance,
Raimon Bosch