Posted to mapreduce-user@hadoop.apache.org by Rahul Bhattacharjee <ra...@gmail.com> on 2013/05/06 16:45:06 UTC

Uber Job!

Hi,

I was going through the definition of an uber job in Hadoop.

A job is considered uber when it has 10 or fewer maps, one reducer, and the
complete input data is smaller than one DFS block.

I have some doubts here:

Splits are created according to the DFS block size. Creating 10 mappers from
one block of data is possible with a settings change (lowering the maximum
split size). But I am trying to understand why a job would need to run
around 10 maps for 64 MB of data.
One possibility is that the job is immensely CPU intensive. Would that be a
correct assumption, or is there another reason?

Thanks,
Rahul
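[For reference: in Hadoop 2 (YARN), the uber-task thresholds the original post paraphrases are controlled by job properties. A sketch of a mapred-site.xml fragment follows; the property names are from the MapReduce defaults as I recall them, so verify the exact names and default values against mapred-default.xml for your version.]

```xml
<!-- Let the MRAppMaster run sufficiently small jobs inside its own JVM -->
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>

<!-- Threshold on the number of maps (default: 9) -->
<property>
  <name>mapreduce.job.ubertask.maxmaps</name>
  <value>9</value>
</property>

<!-- Threshold on the number of reduces (default: 1) -->
<property>
  <name>mapreduce.job.ubertask.maxreduces</name>
  <value>1</value>
</property>

<!-- Threshold on total input bytes (defaults to one dfs.blocksize) -->
<property>
  <name>mapreduce.job.ubertask.maxbytes</name>
  <value>67108864</value>
</property>
```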

Re: Uber Job!

Posted by Rahul Bhattacharjee <ra...@gmail.com>.
Excellent point, sir.

On Monday, May 6, 2013, yypvsxf19870706 wrote:

> Hi
>
>     Suppose that your input is 10 files with a total size of 64 MB; I think
> you will get 10 maps.
>
>     By the way, uber mode is only for YARN. Suppose you have just one map:
> YARN will normally create at least two containers, one for the application
> master and the other for the map. If uber mode is enabled, YARN will create
> only one container, running both the application master and the map.
>
>
> Sent from my iPhone
>
> On 2013-5-6 at 22:45, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:
>
> Hi,
>
> I was going through the definition of an uber job in Hadoop.
>
> A job is considered uber when it has 10 or fewer maps, one reducer, and the
> complete input data is smaller than one DFS block.
>
> I have some doubts here:
>
> Splits are created according to the DFS block size. Creating 10 mappers from
> one block of data is possible with a settings change (lowering the maximum
> split size). But I am trying to understand why a job would need to run
> around 10 maps for 64 MB of data.
> One possibility is that the job is immensely CPU intensive. Would that be a
> correct assumption, or is there another reason?
>
> Thanks,
> Rahul
>
>
>

-- 
Sent from Gmail Mobile

Re: Uber Job!

Posted by yypvsxf19870706 <yy...@gmail.com>.
Hi

    Suppose that your input is 10 files with a total size of 64 MB; I think you will get 10 maps.

    By the way, uber mode is only for YARN. Suppose you have just one map: YARN will normally create at least two containers, one for the application master and the other for the map. If uber mode is enabled, YARN will create only one container, running both the application master and the map.

Sent from my iPhone

On 2013-5-6 at 22:45, Rahul Bhattacharjee <ra...@gmail.com> wrote:

> Hi,
> 
> I was going through the definition of an uber job in Hadoop.
> 
> A job is considered uber when it has 10 or fewer maps, one reducer, and the complete input data is smaller than one DFS block.
> 
> I have some doubts here:
> 
> Splits are created according to the DFS block size. Creating 10 mappers from one block of data is possible with a settings change (lowering the maximum split size). But I am trying to understand why a job would need to run around 10 maps for 64 MB of data.
> One possibility is that the job is immensely CPU intensive. Would that be a correct assumption, or is there another reason?
> 
> Thanks,
> Rahul
> 
> 
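[The one-map-per-file point above can be sketched in a few lines. This is a simplified model of FileInputFormat-style split planning, not Hadoop's actual API; the function names are illustrative, and the real code also allows roughly 10% slop on the last split. The key rule is that splits never cross file boundaries, so ten small files yield ten map tasks even when their total fits in one 64 MB block.]

```python
# Simplified sketch of FileInputFormat-style split planning.
def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    # Hadoop's formula: max(minSize, min(maxSize, blockSize))
    return max(min_size, min(max_size, block_size))

def plan_splits(file_sizes, block_size=64 * 1024 * 1024):
    """Each non-empty file contributes at least one split (and therefore
    one map task), because splits never span two files."""
    split_size = compute_split_size(block_size)
    splits = []
    for size in file_sizes:
        offset = 0
        while offset < size:
            length = min(split_size, size - offset)
            splits.append((offset, length))
            offset += length
    return splits

# Ten files totalling 64 MB -> ten splits -> ten map tasks.
ten_small_files = [64 * 1024 * 1024 // 10] * 10
print(len(plan_splits(ten_small_files)))  # 10
```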

Re: Uber Job!

Posted by Rahul Bhattacharjee <ra...@gmail.com>.
Thanks. I was more interested in knowing about the use case of an uber task.

On Monday, May 6, 2013, Mohammad Tariq wrote:

> Split creation is primarily the InputFormat's responsibility, IMHO. Splits
> often align with blocks, but that is not always the case.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Mon, May 6, 2013 at 8:15 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:
>
>> Hi,
>>
>> I was going through the definition of an uber job in Hadoop.
>>
>> A job is considered uber when it has 10 or fewer maps, one reducer, and
>> the complete input data is smaller than one DFS block.
>>
>> I have some doubts here:
>>
>> Splits are created according to the DFS block size. Creating 10 mappers from
>> one block of data is possible with a settings change (lowering the maximum
>> split size). But I am trying to understand why a job would need to run
>> around 10 maps for 64 MB of data.
>> One possibility is that the job is immensely CPU intensive. Would that be a
>> correct assumption, or is there another reason?
>>
>> Thanks,
>> Rahul
>>
>>
>>
>

-- 
Sent from Gmail Mobile

Re: Uber Job!

Posted by Mohammad Tariq <do...@gmail.com>.
Split creation is primarily the InputFormat's responsibility, IMHO. Splits
often align with blocks, but that is not always the case.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Mon, May 6, 2013 at 8:15 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:

> Hi,
>
> I was going through the definition of an uber job in Hadoop.
>
> A job is considered uber when it has 10 or fewer maps, one reducer, and the
> complete input data is smaller than one DFS block.
>
> I have some doubts here:
>
> Splits are created according to the DFS block size. Creating 10 mappers from
> one block of data is possible with a settings change (lowering the maximum
> split size). But I am trying to understand why a job would need to run
> around 10 maps for 64 MB of data.
> One possibility is that the job is immensely CPU intensive. Would that be a
> correct assumption, or is there another reason?
>
> Thanks,
> Rahul
>
>
>
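[To make the point about lowering the maximum split size concrete, here is the split-size rule FileInputFormat applies, as a simplified sketch. The property that caps the split size is spelled mapreduce.input.fileinputformat.split.maxsize in Hadoop 2.x, but check your version's documentation.]

```python
import math

# Hadoop's rule (simplified): splitSize = max(minSize, min(maxSize, blockSize))
def split_size(block_size, min_size, max_size):
    return max(min_size, min(max_size, block_size))

block = 64 * 1024 * 1024  # one 64 MB block

# Defaults: maxSize is effectively unbounded, so one split per block.
print(math.ceil(block / split_size(block, 1, 2**63 - 1)))        # 1

# Capping maxSize at 7 MB carves the same block into ten splits,
# i.e. ten map tasks from 64 MB of data.
print(math.ceil(block / split_size(block, 1, 7 * 1024 * 1024)))  # 10
```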
