Posted to user@hive.apache.org by hadoop hive <ha...@gmail.com> on 2012/01/12 07:43:41 UTC

Proper utilization of Map and reduce

Hey all,

I have a cluster of 13 nodes, on which I configured map = 16 and reduce = 10,
mapred.reduce.tasks=120, and replication factor = 2.

But I am still not able to fully utilize map and reduce, as shown in the pic.

Thanks in advance. Please tell me how I can make it use the full map and
reduce capacity.

Regards
Vikas Srivastava
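
A rough capacity check for the configuration in the post may help frame the
question. This is only a sketch: it assumes that "map = 16" and "reduce = 10"
are per-node task slot limits (in Hadoop of that era these would typically be
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum, but the post does not name them). The
function and variable names are illustrative.

```python
# Rough slot-capacity check for the cluster described in the post.
# Assumption: 16 map slots and 10 reduce slots are configured PER NODE.
NODES = 13
MAP_SLOTS_PER_NODE = 16
REDUCE_SLOTS_PER_NODE = 10
REQUESTED_REDUCE_TASKS = 120  # mapred.reduce.tasks from the post


def cluster_slots(nodes: int, slots_per_node: int) -> int:
    """Total task slots available across the cluster."""
    return nodes * slots_per_node


total_map_slots = cluster_slots(NODES, MAP_SLOTS_PER_NODE)
total_reduce_slots = cluster_slots(NODES, REDUCE_SLOTS_PER_NODE)

print("map slots:", total_map_slots)        # 208
print("reduce slots:", total_reduce_slots)  # 130
# True means the requested reducers would fit in a single wave.
print("reducers fit in one wave:", REQUESTED_REDUCE_TASKS <= total_reduce_slots)
```

Under that assumption, a job showing far fewer running tasks than these
totals is probably limited by something other than the slot settings, for
example the number of input splits or the scheduler.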

Re: Proper utilization of Map and reduce

Posted by Aniket Mokashi <an...@gmail.com>.

Hi,

Could this be because of the scheduler you are using? I see there are 3
queues. Check the scheduler configuration; that might give you some hints.

Thanks,
Aniket
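
To illustrate why the scheduler matters, here is a minimal sketch of how a
per-queue capacity share could cap the slots available to one job. It assumes
a capacity-style scheduler with three queues; the thread does not say which
scheduler is in use, and the queue names and percentages below are
hypothetical.

```python
# Sketch: how a per-queue capacity share caps the slots one job can use.
# Assumptions: a capacity-style scheduler, 16 map slots on each of 13 nodes,
# and three queues whose names and percentages are invented for illustration.
TOTAL_MAP_SLOTS = 13 * 16

queue_capacity_percent = {"queue_a": 50, "queue_b": 30, "queue_c": 20}


def guaranteed_slots(total_slots: int, percent: int) -> int:
    """Slots guaranteed to a queue holding `percent` of cluster capacity."""
    return total_slots * percent // 100


for name, percent in queue_capacity_percent.items():
    print(name, guaranteed_slots(TOTAL_MAP_SLOTS, percent))
```

With shares like these, a job submitted to the smallest queue would be
guaranteed only a fraction of the cluster, which could look like
underutilization on the JobTracker page.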



-- 
"...:::Aniket:::... Quetzalco@tl"

Re: Proper utilization of Map and reduce

Posted by hadoop hive <ha...@gmail.com>.

Thanks for your reply, Bejoy.

Actually there is no rack awareness configured (all nodes are on the default
rack). Sometimes there are also some reducers, when the query requires them.
I am running the jobs through the Hive CLI.

Could you suggest some things to investigate so that I can check what is
actually going on?

Regards
Vikas Srivastava



Re: Proper utilization of Map and reduce

Posted by Bejoy Ks <be...@yahoo.com>.

Hi Vikas,

From the JobTracker web UI, it looks like your job is map-only, i.e. there
are no reduce tasks for your job. If it is a Hive job and your query requires
no reduce tasks, Hive sets the number of reduce tasks to zero at the code
level. Parameters set using -D at run time on the CLI can be overridden at the
code level, and I believe that is what is happening here.

On the map side, constraints such as data locality (rack locality etc.) come
into play. Maybe your data is not uniformly distributed across the cluster.
This needs detailed investigation; I can't say at a glance.

Regards
Bejoy.K.S
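
The two points in this reply can be put into a toy model. This is a sketch
under stated assumptions, not Hive's or Hadoop's actual code: it assumes one
map task per block-sized split with a 64 MB block size, ignores any split
combining, and uses hypothetical input file sizes. The function names are
illustrative.

```python
import math

BLOCK_SIZE_MB = 64         # assumed HDFS block size
TOTAL_MAP_SLOTS = 13 * 16  # assumes 16 map slots on each of 13 nodes


def estimated_map_tasks(file_sizes_mb, split_size_mb=BLOCK_SIZE_MB):
    """Estimate map tasks as one per split, splitting each file separately."""
    return sum(math.ceil(size / split_size_mb) for size in file_sizes_mb)


def effective_reduce_tasks(query_needs_reduce, configured_reduce_tasks):
    """Model of the reply: a map-only query runs with zero reducers,
    whatever value mapred.reduce.tasks was given."""
    return configured_reduce_tasks if query_needs_reduce else 0


input_files_mb = [512, 256, 100]  # hypothetical table files
maps = estimated_map_tasks(input_files_mb)

print("estimated map tasks:", maps)  # 14
print("map slots available:", TOTAL_MAP_SLOTS)
print("reducers for a map-only query:", effective_reduce_tasks(False, 120))
```

In this model a small input produces far fewer map tasks than there are
slots, so most slots stay idle no matter how the cluster is configured, and
the configured 120 reducers never appear for a map-only query.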


