Posted to user@hbase.apache.org by Rajeshkumar J <ra...@gmail.com> on 2017/05/15 08:48:18 UTC

region size for a mapper

Hi,

  When we run MapReduce over HBase, each region is taken as the input for one
mapper. I have set the max region size to 10 GB. If I have about 5 GB of data,
will a mapper take all 5 GB as its input?

Thanks

Re: region size for a mapper

Posted by Rajeshkumar J <ra...@gmail.com>.
Hi,

 Thanks, Ted. We are using the default split policy and our flush size is 64 MB.
The split size is calculated with the formula

 Math.min(getDesiredMaxFileSize(),
          initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount);

 If the calculated value exceeds the max region size (10 GB), the max region
size is used; otherwise the calculated value is used. For example, if the
calculation returns 11 GB, then 10 GB is used, and a region of that size will
be sent to a mapper. The default mapper setting mapred.map.child.java.opts is
1 GB. My doubt is: if 10 GB of data is sent to a mapper with a 1 GB heap, will
there be any issue?
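
To make the formula above concrete, here is a minimal sketch of the calculation with our settings (64 MB flush size, 10 GB max region size). It assumes initialSize is twice the flush size, which I believe is the default when hbase.increasing.policy.initial.size is not set. The class name SplitSizeSketch and the method sizeToCheck are made up for illustration and are not HBase API.

```java
// Illustrative sketch only: class and method names are hypothetical, not HBase API.
class SplitSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /**
     * Applies the formula quoted above:
     * min(desiredMaxFileSize, initialSize * tableRegionsCount^3).
     * Assumption: initialSize = 2 * flushSize.
     */
    static long sizeToCheck(int tableRegionsCount, long flushSize, long desiredMaxFileSize) {
        long initialSize = 2L * flushSize; // assumed default, see note above
        long n = tableRegionsCount;
        return Math.min(desiredMaxFileSize, initialSize * n * n * n);
    }

    public static void main(String[] args) {
        long flushSize = 64 * MB;    // our flush size
        long maxFileSize = 10 * GB;  // our max region size

        // Print the split threshold for a growing number of regions.
        for (int regions = 1; regions <= 6; regions++) {
            long size = sizeToCheck(regions, flushSize, maxFileSize);
            System.out.println(regions + " region(s): split threshold = " + (size / MB) + " MB");
        }
    }
}
```

Under these assumptions the threshold is 128 MB for one region, 8 GB for four regions, and it is capped at 10 GB once tableRegionsCount reaches 5.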

Thanks

On Mon, May 15, 2017 at 2:22 PM, Ted Yu <yu...@gmail.com> wrote:

> Split policy may play a role here.
>
> Please take a look at:
> http://hbase.apache.org/book.html#_custom_split_policies
>
> On Mon, May 15, 2017 at 1:48 AM, Rajeshkumar J <
> rajeshkumarit8292@gmail.com>
> wrote:
>
> > Hi,
> >
> >   As we run mapreduce over hbase it will take each region as input for
> each
> > mapper. I have given region max size as 10GB. If i have about 5 gb will
> it
> > take 5 gb of data as input of mappers??
> >
> > Thanks
> >
>

Re: region size for a mapper

Posted by Ted Yu <yu...@gmail.com>.
Split policy may play a role here.

Please take a look at:
http://hbase.apache.org/book.html#_custom_split_policies

On Mon, May 15, 2017 at 1:48 AM, Rajeshkumar J <ra...@gmail.com>
wrote:

> Hi,
>
>   As we run mapreduce over hbase it will take each region as input for each
> mapper. I have given region max size as 10GB. If i have about 5 gb will it
> take 5 gb of data as input of mappers??
>
> Thanks
>