Posted to user@hive.apache.org by "sreebalineni ." <sr...@gmail.com> on 2016/03/15 17:32:32 UTC

Re: Issue with Star schema

You can consider map joins. If the cluster uses the default configuration,
they should already be happening; check the query profile to confirm.
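
For reference, the session settings that usually control automatic map-join
conversion look roughly like the sketch below. The property names are
standard Hive settings; the values shown are only illustrative (they match
the commonly documented defaults) and are an assumption about your cluster,
so check what your version actually uses.

```sql
-- Let Hive convert joins against small tables into map joins automatically.
SET hive.auto.convert.join=true;

-- Size threshold in bytes below which a table is treated as "small".
-- Illustrative value; tune it for your cluster.
SET hive.mapjoin.smalltable.filesize=25000000;

-- Allow several map joins to be merged into one map-side task when the
-- combined size of the small tables stays under the limit below (bytes).
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=10000000;
```

With tables of only 10 to 20 rows, each small table should fit well under
these thresholds, provided Hive has usable size information for them.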

On Tue, 15 Mar 2016 21:12 Himabindu sanka, <hi...@gmail.com>
wrote:

> Hi Team,
>
> I have a query where I am joining with 10 other entities, like:
>
> Select  a.col1,b1.col1,b2.col1 from
> A a
> Left outer join b1 on
> Left outer join b2 on
> Left outer join b3….
>
> A is my driver entity, which has 20 million rows of data. Most of the other
> entities are small, with just 10 to 20 rows of data.
>
> In that scenario, my Hive query is taking hours to join and fetch
> the data. Please suggest an optimization technique.
>
> This is choking the query performance.
>
> *Regards,*
>
> *Himabindu Sanka*
>

Re: Issue with Star schema

Posted by Thejas Nair <th...@gmail.com>.

As suggested, looking at the explain plan should tell you whether a map join
is being used.
Using a recent version with Hive on Tez should also give you a further
speedup, as map joins are optimized further there.
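
A minimal sketch of that check, assuming Tez is installed on the cluster.
The original query elides its join conditions, so the key columns here
(b1_id, b2_id, id) are hypothetical placeholders, not the real schema.

```sql
-- Switch the session to Tez (only works if Tez is available on the cluster).
SET hive.execution.engine=tez;

-- Print the plan without running the query.
-- The ON columns are hypothetical; substitute the real join keys.
EXPLAIN
SELECT a.col1, b1.col1, b2.col1
FROM A a
LEFT OUTER JOIN b1 ON a.b1_id = b1.id
LEFT OUTER JOIN b2 ON a.b2_id = b2.id;
```

In the output, a "Map Join Operator" for each small table indicates the
join is done map-side; a plain "Join Operator" fed by reduce stages
suggests a shuffle join is still being used.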


