Posted to user@hbase.apache.org by Vimal Jain <vk...@gmail.com> on 2014/03/06 13:55:45 UTC

Region splits even when max Hstorefile is less than default value

Hi,
I am running a 2-node HBase cluster atop HDFS.
I was simulating a heavy write load on HBase for performance analysis.
I found that my HstoreFile has not reached the "hbase.hregion.max.filesize"
property (which is 10G by default), and yet my region has been split into 2
regions.

*du -sh HadoopData/*
*3.2G HadoopData/*

where HadoopData is my hadoop.tmp.dir (the place where Hadoop stores data).

Please help me understand this.
-- 
Thanks and Regards,
Vimal Jain

Re: Region splits even when max Hstorefile is less than default value

Posted by Vimal Jain <vk...@gmail.com>.
Thanks Jean.


-- 
Thanks and Regards,
Vimal Jain

Re: Region splits even when max Hstorefile is less than default value

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
The default split policy for 0.94.17 is
IncreasingToUpperBoundRegionSplitPolicy. Since you did not set a specific
policy in your config file, that's the one being used.

Comments from the class:
 * Split size is the number of regions that are on this server that all are
 * of the same table, cubed, times 2x the region flush size OR the maximum
 * region split size, whichever is smaller.  For example, if the flush size
 * is 128M, then after two flushes (256MB) we will split which will make
 * two regions that will split when their size is 2^3 * 128M*2 = 2048M.
 * If one of these regions splits, then there are three regions and now the
 * split size is 3^3 * 128M*2 = 6912M, and so on until we reach the configured
 * maximum filesize and then from there on out, we'll use that.

JM
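
The arithmetic in the class comment above can be sketched in a few lines of
Python. This is only an illustration of the rule as the comment states it, not
the HBase implementation: the function name split_size_bytes is hypothetical,
the 128M flush size is taken from the comment's example, and the 10G maximum is
the default mentioned in the original question.

```python
MB = 1024 * 1024

def split_size_bytes(region_count, flush_size=128 * MB, max_file_size=10 * 1024 * MB):
    """Split threshold as described in the class comment (illustrative only).

    region_count: regions of the same table on this region server (assumed >= 1).
    flush_size / max_file_size: assumed values, see the note above.
    """
    # (number of regions)^3 * 2 * flush size, capped at the maximum file size
    return min(max_file_size, region_count ** 3 * 2 * flush_size)

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, "region(s):", split_size_bytes(n) // MB, "MB")
```

With these assumed values the threshold is 256M for one region, 2048M for two,
6912M for three, and it is capped at 10240M (10G) from four regions onwards. So
a table with a single region splits long before it reaches 10G.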


Re: Region splits even when max Hstorefile is less than default value

Posted by Vimal Jain <vk...@gmail.com>.
Hi,
I am using 0.94.17, and my config file is:

                <property>
                  <name>hbase.rootdir</name>
                  <value>hdfs://10.14.24.19:9000/hbase</value>
                </property>
                <property>
                  <name>hbase.cluster.distributed</name>
                  <value>true</value>
                </property>
                <property>
                  <name>hbase.zookeeper.quorum</name>
                  <value>10.14.24.19</value>
                </property>
                <property>
                  <name>dfs.replication</name>
                  <value>2</value>
                </property>
                <property>
                  <name>hbase.zookeeper.property.clientPort</name>
                  <value>2181</value>
                </property>
                <property>
                  <name>hbase.zookeeper.property.dataDir</name>
                  <value>/home/hadoop/HbaseData/zookeeper</value>
                </property>
                <property>
                  <name>hbase.tmp.dir</name>
                  <value>/home/hadoop/Hbasetemp</value>
                </property>



-- 
Thanks and Regards,
Vimal Jain

Re: Region splits even when max Hstorefile is less than default value

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Vimal,

Which version of HBase are you using? Can you also share your config file?

JM


Re: Region splits even when max Hstorefile is less than default value

Posted by Vimal Jain <vk...@gmail.com>.
Hi Jean,
I am not sure about this.
What's the default policy?


-- 
Thanks and Regards,
Vimal Jain

Re: Region splits even when max Hstorefile is less than default value

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Vimal,

On its own, hbase.hregion.max.filesize determines the split size only with
ConstantSizeRegionSplitPolicy. Which split policy do you have configured in
your setup?

JM

