
Posted to user@hive.apache.org by Santosh Achhra <sa...@gmail.com> on 2013/01/08 09:04:51 UTC

Map Reduce Local Task

Hello,

I was reading an article on the web about MapReduce local tasks and the
use of hash table files and conditional tasks to improve the performance
of Hive queries.

Any idea how to implement this? I am aware of map joins, but I am not sure
how to implement MapReduce local tasks with hash tables.

Good wishes, always!
Santosh

Re: Map Reduce Local Task

Posted by be...@yahoo.com.

Hi Santosh,

As long as the smaller table's size is in the range of a few MBs, it is a good candidate for a map join.

If the smaller table is larger than that, you can take a look at bucketed map joins.
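
For illustration, here is a minimal HiveQL sketch of both options. The table and column names (big_table, small_table, big_bucketed, small_bucketed, id, label) are hypothetical, and whether the MAPJOIN hint is honored depends on the Hive version and its configuration, so treat this as a starting point rather than a tested recipe.

```sql
-- Explicit map join hint: ask Hive to hold the small table (alias s)
-- in memory on each mapper instead of shuffling both tables to reducers.
SELECT /*+ MAPJOIN(s) */ b.id, s.label
FROM big_table b
JOIN small_table s ON b.id = s.id;

-- Bucketed map join: both tables must be bucketed on the join key, and
-- the bucket count of one table must be a multiple of the other's.
SET hive.optimize.bucketmapjoin=true;

SELECT /*+ MAPJOIN(s) */ b.id, s.label
FROM big_bucketed b
JOIN small_bucketed s ON b.id = s.id;
```

With the bucketed variant, each mapper only needs to load the matching buckets of the smaller table rather than the whole table, which is why it can help when the small side is too big for a plain map join.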

Regards 
Bejoy KS

Re: Map Reduce Local Task

Posted by Santosh Achhra <sa...@gmail.com>.

Thank you Dean,

One of our tables is very small: it has only 16,000 rows, while the big
table has more than 45 million records. Won't running a local task help in
this case?

Good wishes, always!
Santosh

On Tue, Jan 8, 2013 at 11:59 PM, Dean Wampler <
dean.wampler@thinkbiganalytics.com> wrote:

> That setting will make Hive more aggressive about trying to convert a join
> to a local task, where it bypasses the job tracker. When you're
> experimenting with queries on a small data set, it can make things much
> faster, but won't be useful for large data sets where you need the cluster.
>

Re: Map Reduce Local Task

Posted by Dean Wampler <de...@thinkbiganalytics.com>.

That setting will make Hive more aggressive about trying to convert a join
to a local task, where it bypasses the job tracker. When you're
experimenting with queries on a small data set, it can make things much
faster, but won't be useful for large data sets where you need the cluster.
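
As a rough sketch of how the setting discussed here (hive.auto.convert.join) might be tried out: the table and column names below are hypothetical, and the threshold value shown is the documented default for hive.mapjoin.smalltable.filesize, not a recommendation.

```sql
-- Let Hive try to convert eligible joins into map joins automatically.
SET hive.auto.convert.join=true;

-- Size threshold in bytes below which a table is treated as "small".
SET hive.mapjoin.smalltable.filesize=25000000;

-- Hypothetical tables: big_table (tens of millions of rows) and
-- small_table (a few thousand rows). EXPLAIN prints the plan only.
EXPLAIN
SELECT b.id, s.label
FROM big_table b
JOIN small_table s ON b.id = s.id;
```

If the conversion applies, the EXPLAIN output should mention local work that builds a hash table from the small table, followed by a map join over the big table. If it does not, the small table is probably above the size threshold.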

dean

On Tue, Jan 8, 2013 at 9:11 AM, Santosh Achhra <sa...@gmail.com> wrote:

>
>
> Will setting hive.auto.convert.join to true help set up the MapReduce
> local task and conditional task?
>
> Good wishes, always!
> Santosh
>

-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330

Re: Map Reduce Local Task

Posted by Santosh Achhra <sa...@gmail.com>.

Will setting hive.auto.convert.join to true help set up the MapReduce local
task and conditional task?

Good wishes, always!
Santosh
