Posted to user@hbase.apache.org by Amit Sela <am...@infolinks.com> on 2014/01/22 18:25:41 UTC

max HStoreFile size

Hi all, I'm using HBase 0.94.12 and in some tables I'm managing splitting
and compactions manually.

I was wondering whether hbase.hregion.max.filesize refers to the compressed
or the uncompressed file size.
I'm using compression, and when the file size is < hbase.hregion.max.filesize
but the uncompressed size is bigger, the region splits when I execute a major
compaction on it.

Should it be like that? More importantly, is the recommendation of 1GB
regions for compressed or uncompressed StoreFile size?

Since I'm using bulk load, I get about 3 StoreFiles loaded into each CF of
every new region. I ran a region compaction to merge them into 1 file (and
then got the unwanted splits). If I never update this data, do I gain
anything from merging the files?
Could I manage ~500MB of compressed data (GZ; decompresses to about 7.5GB)
with RegionServers that have 10GB of RAM?

Thanks,

Amit.

Re: max HStoreFile size

Posted by Amit Sela <am...@infolinks.com>.
I set the split policy to constant size, and still, when compacting
bulk-loaded regions, the regions split (each region's size is much
smaller than the max file size, but I do use compression...)
On Jan 23, 2014 12:11 PM, "Samir Ahmic" <ah...@gmail.com> wrote:

> Hi Amit,
>
> Yes. You can set the split policy per table. Here is the relevant part of
> the HBase book:
>
> http://hbase.apache.org/book/regions.arch.html
>
> The policy can be set globally through the HBaseConfiguration used, or on a
> per-table basis:
>
> HTableDescriptor myHtd = ...;
> myHtd.setValue(HTableDescriptor.SPLIT_POLICY,
> MyCustomSplitPolicy.class.getName());
>
> Cheers

Re: max HStoreFile size

Posted by Samir Ahmic <ah...@gmail.com>.
Hi Amit,

Yes. You can set the split policy per table. Here is the relevant part of
the HBase book:

http://hbase.apache.org/book/regions.arch.html

The policy can be set globally through the HBaseConfiguration used, or on a
per-table basis:

HTableDescriptor myHtd = ...;
myHtd.setValue(HTableDescriptor.SPLIT_POLICY,
    MyCustomSplitPolicy.class.getName());

Cheers




Re: max HStoreFile size

Posted by Amit Sela <am...@infolinks.com>.
So I think my problem is that, as of 0.94, the default split policy is
IncreasingToUpperBoundRegionSplitPolicy and not
ConstantSizeRegionSplitPolicy.

Can I set the split policy per table?

I still don't know whether hbase.hregion.max.filesize refers to the
compressed or the uncompressed size.
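
A note on the policy named above: the sketch below illustrates why a region
far smaller than hbase.hregion.max.filesize can still split under
IncreasingToUpperBoundRegionSplitPolicy. It assumes the 0.94-line rule, as
best recalled, that the split threshold is the smaller of the max file size
and the memstore flush size multiplied by the square of the number of regions
of that table on the same RegionServer. The class name, method name, and the
two default values (128 MB flush size, 10 GB max file size) are assumptions
for illustration, not taken from this thread, and later HBase versions use a
different formula.

```java
// Sketch only: mirrors the threshold rule as recalled for the 0.94 line of
// IncreasingToUpperBoundRegionSplitPolicy. This is not the HBase source.
class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Assumed defaults (hypothetical constants, check your own configuration).
    static final long ASSUMED_FLUSH_SIZE = 128 * MB;           // hbase.hregion.memstore.flush.size
    static final long ASSUMED_MAX_FILE_SIZE = 10L * 1024 * MB; // hbase.hregion.max.filesize

    // Returns the store size above which a region would be considered for a split,
    // given how many regions of the same table live on this RegionServer.
    static long sizeToCheck(int regionsOfTableOnServer, long flushSize, long maxFileSize) {
        if (regionsOfTableOnServer == 0) {
            return maxFileSize;
        }
        long r = regionsOfTableOnServer;
        return Math.min(maxFileSize, flushSize * r * r);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            long threshold = sizeToCheck(n, ASSUMED_FLUSH_SIZE, ASSUMED_MAX_FILE_SIZE);
            System.out.println(n + " region(s) on server: split above " + (threshold / MB) + " MB");
        }
    }
}
```

Under these assumptions, a table with a single region on a server has a
threshold equal to the flush size, so a freshly bulk-loaded region of a few
hundred MB could split after compaction even though it is well under the
configured max file size. ConstantSizeRegionSplitPolicy, by contrast, is
understood to compare against the max file size only.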

