Posted to user@hive.apache.org by Santosh Achhra <sa...@gmail.com> on 2013/01/08 09:04:51 UTC
Map Reduce Local Task
Hello,
I was reading an article on the web about MapReduce local tasks and the
use of hash table files and conditional tasks to improve the performance of
Hive queries.
Any idea how to implement this? I am aware of map joins, but I am not sure how
to implement MapReduce local tasks with hash tables.
Good wishes, always!
Santosh
Re: Map Reduce Local Task
Posted by be...@yahoo.com.
Hi Santosh
As long as the smaller table is in the range of a few MB, it is a good candidate for a map join.
If the smaller table is larger than that, you can take a look at bucketed map joins.
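For reference, a minimal sketch of both options in HiveQL. The table names (big_fact, small_dim), the column names, and the bucketing of both tables on the join key are assumptions made for illustration. The MAPJOIN hint and the hive.optimize.bucketmapjoin property are standard Hive features, but whether a bucketed map join is actually chosen depends on how the tables were created and on the Hive version.

```sql
-- Option 1: explicit map join via hint. small_dim (hypothetical) is loaded
-- into an in-memory hash table; big_fact (hypothetical) is streamed by mappers.
SELECT /*+ MAPJOIN(d) */ f.id, f.amount, d.label
FROM big_fact f
JOIN small_dim d ON (f.dim_id = d.dim_id);

-- Option 2: bucketed map join. Assumes both tables were created with
-- CLUSTERED BY (dim_id) and that one table's bucket count is a multiple
-- of the other's.
SET hive.optimize.bucketmapjoin = true;
SELECT /*+ MAPJOIN(d) */ f.id, f.amount, d.label
FROM big_fact f
JOIN small_dim d ON (f.dim_id = d.dim_id);
```

With bucketing, each mapper only needs the matching bucket of the smaller table in memory rather than the whole table, which is why it can help when the small side is too large for a plain map join.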
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-----Original Message-----
From: Santosh Achhra <sa...@gmail.com>
Date: Wed, 9 Jan 2013 00:11:37
To: <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: Map Reduce Local Task
Thank you, Dean,
One of our tables is very small; it has only 16,000 rows, and the other, big
table has more than 45 million records. Wouldn't a local task help in this case?
Good wishes, always!
Santosh
On Tue, Jan 8, 2013 at 11:59 PM, Dean Wampler <
dean.wampler@thinkbiganalytics.com> wrote:
> That setting will make Hive
> more aggressive about trying to convert a join to a local task, where it
> bypasses the job tracker. When you're experimenting with queries on a small
> data set, it can make things much faster, but won't be useful for large
> data sets where you need the cluster.
>
Re: Map Reduce Local Task
Posted by Santosh Achhra <sa...@gmail.com>.
Thank you, Dean,
One of our tables is very small; it has only 16,000 rows, and the other, big
table has more than 45 million records. Wouldn't a local task help in this case?
Good wishes, always!
Santosh
On Tue, Jan 8, 2013 at 11:59 PM, Dean Wampler <
dean.wampler@thinkbiganalytics.com> wrote:
> That setting will make Hive
> more aggressive about trying to convert a join to a local task, where it
> bypasses the job tracker. When you're experimenting with queries on a small
> data set, it can make things much faster, but won't be useful for large
> data sets where you need the cluster.
>
Re: Map Reduce Local Task
Posted by Dean Wampler <de...@thinkbiganalytics.com>.
That setting will make Hive more aggressive about trying to convert a join
to a local task, where it bypasses the job tracker. When you're
experimenting with queries on a small data set, it can make things much
faster, but won't be useful for large data sets where you need the cluster.
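As a rough sketch of what this looks like in a Hive session: with automatic conversion enabled, Hive can plan the join as a conditional task and, if the small table is under the size threshold, run a local task that builds the hash table file before launching a map-only join. The table and column names below are hypothetical, and the threshold value shown is believed to be the documented default, so it should be checked against your Hive release.

```sql
-- Let Hive decide whether a join can be converted to a map join.
SET hive.auto.convert.join = true;

-- Size threshold (in bytes) below which a table counts as "small".
-- 25000000 is believed to be the default; verify for your Hive version.
SET hive.mapjoin.smalltable.filesize = 25000000;

-- No hint needed: big_fact and small_dim are hypothetical tables.
SELECT f.id, f.amount, d.label
FROM big_fact f
JOIN small_dim d ON (f.dim_id = d.dim_id);
```

Running EXPLAIN on the query is one way to check whether a local-work stage actually appears in the plan.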
dean
On Tue, Jan 8, 2013 at 9:11 AM, Santosh Achhra <sa...@gmail.com> wrote:
>
>
> Will setting hive.auto.convert.join to true help set up the MapReduce
> local task and conditional task?
>
> Good wishes, always!
> Santosh
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330
Re: Map Reduce Local Task
Posted by Santosh Achhra <sa...@gmail.com>.
Will setting hive.auto.convert.join to true help set up the MapReduce local
task and conditional task?
Good wishes, always!
Santosh