Posted to user@hive.apache.org by Erwan MAS <er...@mas.nom.fr> on 2015/04/02 18:01:53 UTC
Hive and engine performance tez vs mr
Hello,
I have an issue with Hive on the Tez engine. When I execute a query with the
Tez engine, it is about 9 times slower than with map/reduce.
The query is a left outer join on two tables stored as ORC.
With map/reduce I get:
Job 0 : Map 27 Reduce 256
Job 1 : Map 27 Reduce 256
Time taken 110 sec
With Tez I get:
Map 1 : 1/1 Map 4 : 3/3 Reducer 2: 256/256 Reducer 3: 256/256
Time taken 930 sec
With my configuration, Tez uses only one mapper for part of the query.
How can I increase the number of mappers?
Which Hive variable must I set to change this behavior?
My context:
Hive 0.13 on Hortonworks 2.1
--
____________________________________________________________
/ Erwan MAS /\
| mailto:erwan@mas.nom.fr |_/
___|________________________________________________________ |
\___________________________________________________________\__/
Re: Hive and engine performance tez vs mr
Posted by Carter Shanklin <ca...@hortonworks.com>.
Erwan,
Faced with a similar situation last week, I found that decreasing
mapred.max.split.size
increased my parallelism by 6x. Yes, mapred, even though it was a Tez job. I
reduced it to 10 MB from 256 MB, which I believe is the default.
The other variables to try are:
tez.grouping.min-size (make it smaller)
tez.grouping.max-size (smaller as well)
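As a rough sketch, the settings above could be applied in a Hive session as follows. The 10 MB split size comes from this message; the two tez.grouping values are illustrative placeholders only (assumptions, not recommendations or defaults) and should be tuned to the size of the input. All sizes are given in bytes.

```sql
-- Run the query on Tez.
set hive.execution.engine=tez;
-- 10 MB split size, as described above (10 * 1024 * 1024 bytes).
set mapred.max.split.size=10485760;
-- Placeholder values (assumptions): smaller grouping sizes let Tez
-- create more map tasks for the same input.
set tez.grouping.min-size=1048576;
set tez.grouping.max-size=10485760;
```

After changing these, rerunning the query and comparing the "Map 1" task count in the Tez progress output should show whether the parallelism actually changed.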
Good luck.
On 4/6/15, 2:57 PM, "Erwan MAS" <er...@mas.nom.fr> wrote:
>On Mon, Apr 06, 2015 at 12:15:05PM -0500, max scalf wrote:
>> Try setting the below in Hive and see what happens. Btw, what are your
>> configs in Hive, if any?
>>
>> set mapred.map.tasks = 20;
>>
>
>Does not change the behavior :(
Re: Hive and engine performance tez vs mr
Posted by Erwan MAS <er...@mas.nom.fr>.
On Mon, Apr 06, 2015 at 12:15:05PM -0500, max scalf wrote:
> Try setting the below in Hive and see what happens. Btw, what are your
> configs in Hive, if any?
>
> set mapred.map.tasks = 20;
>
Does not change the behavior :(
Re: Hive and engine performance tez vs mr
Posted by max scalf <or...@gmail.com>.
Try setting the below in Hive and see what happens. Btw, what are your
configs in Hive, if any?
set mapred.map.tasks = 20;