Posted to user@hbase.apache.org by Mehdi Ben Haj Abbes <me...@gmail.com> on 2016/01/05 15:52:31 UTC

When compactions become major ones

Hi folks,

I'm using HBase 0.98 with a write-heavy workload. I'm writing to one
table with one CF compressed with GZ. My table is pre-split into 27
regions. As soon as I start writing to this table I see HFiles of
2-4 MB across the regions. I have the default HBase configuration for the
compaction properties. Compactions start as soon as I start writing to
HBase, but many of these compactions are major ones. I can see this in the
HBase master UI on the table details view. So I would like to understand
when a compaction becomes major.

Another question: if I'm not wrong, we have a memstore per region, so when a
memstore is flushed I would expect an HFile of 128 MB, but I only see files
of 42 MB without compression (2.5 MB when compressed with GZ).

Any explanation?

Thanks in advance.
-- 
Mehdi BEN HAJ ABBES

Re: When compactions become major ones

Posted by Ted Yu <yu...@gmail.com>.
hbase.regionserver.maxlogs is not listed in hbase-default.xml.

From hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
(of branch-1.1):
    this.maxLogs = conf.getInt("hbase.regionserver.maxlogs", 32);

You don't need to adjust its value given the information below.

FYI
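
A minimal sketch of what that line implies, using only the standard library: java.util.Properties stands in for Hadoop's Configuration here, and the class name MaxLogsDefault and the override value 64 are made up for illustration. When the key is absent from the loaded configuration, the lookup falls back to the hard-coded default of 32. An override would presumably go in hbase-site.xml like any other HBase property.

```java
import java.util.Properties;

public class MaxLogsDefault {
    // Mimics a "value if set, otherwise the hard-coded default" lookup.
    // This is a stand-in, not Hadoop's Configuration class.
    public static int getInt(Properties conf, String name, int defaultValue) {
        String raw = conf.getProperty(name);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing set: the default passed by the caller is returned.
        System.out.println(getInt(conf, "hbase.regionserver.maxlogs", 32));
        // Hypothetical override, as if read from a site configuration file.
        conf.setProperty("hbase.regionserver.maxlogs", "64");
        System.out.println(getInt(conf, "hbase.regionserver.maxlogs", 32));
    }
}
```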

On Wed, Jan 6, 2016 at 2:22 AM, Mehdi Ben Haj Abbes <me...@gmail.com>
wrote:

> Hi,
> I couldn't find the hbase.regionserver.maxlogs property in
> hbase-default.xml, neither in the 0.98 nor the 0.94 nor the 1.x
> documentation. So how can I know for sure what the default value of this
> property is, and how can I change it? Is it a property to put in
> hbase-site.xml?
>
> And if the default for this property is 32, I'm good according to the guide
> you posted, Ted.
> I have my Xmx set to 6144 MB, my upper limit set to 0.4, and the DFS block
> size set to 128 MB.
> So I'm good, since 6144 * 0.4 = 2457.6 < 128 * 32 * 0.95 = 3891.2
>
> Best regards,
> Mehdi
>
> --
> Mehdi BEN HAJ ABBES
>

Re: When compactions become major ones

Posted by Mehdi Ben Haj Abbes <me...@gmail.com>.
Hi,
I couldn't find the hbase.regionserver.maxlogs property in
hbase-default.xml, neither in the 0.98 nor the 0.94 nor the 1.x
documentation. So how can I know for sure what the default value of this
property is, and how can I change it? Is it a property to put in
hbase-site.xml?

And if the default for this property is 32, I'm good according to the guide
you posted, Ted.
I have my Xmx set to 6144 MB, my upper limit set to 0.4, and the DFS block
size set to 128 MB.
So I'm good, since 6144 * 0.4 = 2457.6 < 128 * 32 * 0.95 = 3891.2

Best regards,
Mehdi

On Tue, Jan 5, 2016 at 9:36 PM, Mehdi Ben Haj Abbes <me...@gmail.com>
wrote:

> Thanks guys for your feedback. I will check the WAL property, run some more
> tests, and let you know.
> Best regards,
> Mehdi


-- 
Mehdi BEN HAJ ABBES

Re: When compactions become major ones

Posted by Mehdi Ben Haj Abbes <me...@gmail.com>.
Thanks guys for your feedback. I will check the WAL property, run some more
tests, and let you know.
Best regards,
Mehdi
On Jan 5, 2016 9:09 PM, "Vladimir Rodionov" <vl...@gmail.com> wrote:

> >>And I still don't understand why the store files resulting from memstore
> >>flushes have a size of 40 MB. Does it have something to do with the memstore
> >>upper limit, and are these 42 MB files the result of forcing the memstore to
> >>be flushed? The problem is that all the store files newly added to HDFS
> >>start with this size (42 MB). I did not mention that my CF is in-memory.
>
> It's due to Java object overhead, so 3x is normal (128 MB in memory -> 42 MB
> on disk).
> Another aspect to take into account: a flush can happen not only when we
> reach the memstore size limit;
> there are other triggers as well:
>
> 1. maximum number of WAL files reached (hbase.regionserver.maxlogs)
> 2. the periodic memstore flusher (once every hour) can trigger flushes as well
>
> -Vlad

Re: When compactions become major ones

Posted by Vladimir Rodionov <vl...@gmail.com>.
>>And I still don't understand why the store files resulting from memstore
>>flushes have a size of 40 MB. Does it have something to do with the memstore
>>upper limit, and are these 42 MB files the result of forcing the memstore to
>>be flushed? The problem is that all the store files newly added to HDFS
>>start with this size (42 MB). I did not mention that my CF is in-memory.

It's due to Java object overhead, so 3x is normal (128 MB in memory -> 42 MB
on disk).
Another aspect to take into account: a flush can happen not only when we
reach the memstore size limit;
there are other triggers as well:

1. maximum number of WAL files reached (hbase.regionserver.maxlogs)
2. the periodic memstore flusher (once every hour) can trigger flushes as well
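
The three triggers can be pictured with a deliberately simplified model. This is not HBase's actual code: the class FlushTriggers, the method flushReason and the Reason enum are hypothetical names, the 128 MB flush size is the figure mentioned in this thread, and the real region server decides which regions to flush in a more involved way.

```java
public class FlushTriggers {
    public enum Reason { NONE, MEMSTORE_SIZE, MAX_WAL_FILES, PERIODIC }

    // Simplified decision: returns the first trigger that applies, if any.
    public static Reason flushReason(long memstoreBytes, long flushSizeBytes,
                                     int walFiles, int maxLogs,
                                     long millisSinceLastFlush, long periodMillis) {
        if (memstoreBytes >= flushSizeBytes) {
            return Reason.MEMSTORE_SIZE;   // memstore size limit reached
        }
        if (walFiles > maxLogs) {
            return Reason.MAX_WAL_FILES;   // too many WAL files (hbase.regionserver.maxlogs)
        }
        if (millisSinceLastFlush >= periodMillis) {
            return Reason.PERIODIC;        // periodic flusher, once an hour
        }
        return Reason.NONE;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long oneHour = 60L * 60L * 1000L;
        // A 42 MB memstore flushed only because the WAL count exceeded the cap.
        System.out.println(flushReason(42 * mb, 128 * mb, 33, 32, 0L, oneHour));
    }
}
```

Under this model, a flush caused by the second or third trigger can produce a store file smaller than the memstore limit alone would suggest.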

-Vlad



On Tue, Jan 5, 2016 at 9:37 AM, Ted Yu <yu...@gmail.com> wrote:

> For #1,
> bq. would this minor which becomes major take care of deleted rows
>
> Yes.
>
> For #2, please consider the following guide:
>
> dfs.blocksize (value: ${propdata["dfs.blocksize"]}) * 0.95 *
> hbase.regionserver.maxlogs (value:
> ${propdata["hbase.regionserver.maxlogs"]}) should be greater than
> hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE (the value
> for -Xmx)
>
> Cheers

Re: When compactions become major ones

Posted by Ted Yu <yu...@gmail.com>.
For #1,
bq. would this minor which becomes major take care of deleted rows

Yes.

For #2, please consider the following guide:

dfs.blocksize (value: ${propdata["dfs.blocksize"]}) * 0.95 *
hbase.regionserver.maxlogs (value:
${propdata["hbase.regionserver.maxlogs"]}) should be greater than
hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE (the value
for -Xmx)
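
As a rough sketch, the guide can be checked with plain arithmetic. The class and method names below are made up for illustration; the inputs in main are the values reported elsewhere in this thread (128 MB DFS blocks, the code default of 32 WAL files, an upper limit of 0.4 and a 6144 MB heap), all expressed in MB.

```java
public class WalSizingCheck {
    // WAL capacity in MB: dfs.blocksize * 0.95 * hbase.regionserver.maxlogs.
    public static double walCapacityMb(double dfsBlockSizeMb, int maxLogs) {
        return dfsBlockSizeMb * 0.95 * maxLogs;
    }

    // Global memstore budget in MB: upper-limit fraction * heap size (-Xmx).
    public static double memstoreBudgetMb(double upperLimit, double heapMb) {
        return upperLimit * heapMb;
    }

    // The guide holds when the WAL capacity exceeds the memstore budget.
    public static boolean satisfiesGuide(double dfsBlockSizeMb, int maxLogs,
                                         double upperLimit, double heapMb) {
        return walCapacityMb(dfsBlockSizeMb, maxLogs) > memstoreBudgetMb(upperLimit, heapMb);
    }

    public static void main(String[] args) {
        double wal = walCapacityMb(128, 32);         // 128 * 0.95 * 32
        double memstore = memstoreBudgetMb(0.4, 6144); // 0.4 * 6144
        System.out.println("WAL capacity (MB):    " + wal);
        System.out.println("Memstore budget (MB): " + memstore);
        System.out.println("Guide satisfied:      " + satisfiesGuide(128, 32, 0.4, 6144));
    }
}
```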

Cheers

On Tue, Jan 5, 2016 at 8:39 AM, Mehdi Ben Haj Abbes <me...@gmail.com>
wrote:

> Thanks Ted for the clarification about major compactions. So if I
> understand correctly, when a minor compaction is triggered and the policy
> selects all the store files, this compaction becomes a major one. But would
> this minor-turned-major compaction take care of deleted rows as a major one
> would, or in the end is it just a minor compaction that happened to select
> all the store files?
>
> As for disabling splitting, I already have hbase.hregion.max.filesize set to
> 10 GB, and I pre-split my table.
>
> And I still don't understand why the store files resulting from memstore
> flushes have a size of 40 MB. Does it have something to do with the memstore
> upper limit, and are these 42 MB files the result of forcing the memstore to
> be flushed? The problem is that all the store files newly added to HDFS
> start with this size (42 MB). I did not mention that my CF is in-memory.
>
> Best regards,
>
>
> --
> Mehdi BEN HAJ ABBES
>

Re: When compactions become major ones

Posted by Mehdi Ben Haj Abbes <me...@gmail.com>.
Thanks Ted for the clarification about major compactions. So if I
understand correctly, when a minor compaction is triggered and the policy
selects all the store files, this compaction becomes a major one. But would
this minor-turned-major compaction take care of deleted rows as a major one
would, or in the end is it just a minor compaction that happened to select
all the store files?

As for disabling splitting, I already have hbase.hregion.max.filesize set to
10 GB, and I pre-split my table.

And I still don't understand why the store files resulting from memstore
flushes have a size of 40 MB. Does it have something to do with the memstore
upper limit, and are these 42 MB files the result of forcing the memstore to
be flushed? The problem is that all the store files newly added to HDFS
start with this size (42 MB). I did not mention that my CF is in-memory.

Best regards,

On Tue, Jan 5, 2016 at 4:04 PM, Ted Yu <yu...@gmail.com> wrote:

> For #1, when all store files are selected for compaction, the compaction
> becomes major
>
> see 'Determine the Optimal Number of Pre-Split Regions' under:
> http://hbase.apache.org/book.html#disable.splitting
>
> See also http://hbase.apache.org/book.html#managed.compactions
>
> Cheers



-- 
Mehdi BEN HAJ ABBES

Re: When compactions become major ones

Posted by Ted Yu <yu...@gmail.com>.
For #1, when all store files of a store are selected for compaction, the
compaction becomes a major one.
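
The rule can be sketched as a simple set comparison. This is an illustration only, not HBase's compaction policy code: the class CompactionKind, the method isMajor and the file names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class CompactionKind {
    // A compaction counts as major when the selection covers every store file.
    public static boolean isMajor(List<String> selected, List<String> allStoreFiles) {
        return !allStoreFiles.isEmpty()
            && new HashSet<>(selected).equals(new HashSet<>(allStoreFiles));
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("hfile-1", "hfile-2", "hfile-3");
        // Only part of the store is selected: a minor compaction.
        System.out.println(isMajor(Arrays.asList("hfile-1", "hfile-2"), all));
        // Every file is selected: the compaction is treated as major.
        System.out.println(isMajor(all, all));
    }
}
```

With only a handful of small files per store early in a heavy write load, a selection can easily cover all of them, which would fit with seeing many major compactions right after writes begin.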

see 'Determine the Optimal Number of Pre-Split Regions' under:
http://hbase.apache.org/book.html#disable.splitting

See also http://hbase.apache.org/book.html#managed.compactions

Cheers

On Tue, Jan 5, 2016 at 6:52 AM, Mehdi Ben Haj Abbes <me...@gmail.com>
wrote:

> Hi folks,
>
> I'm using HBase 0.98 with a write-heavy workload. I'm writing to one
> table with one CF compressed with GZ. My table is pre-split into 27
> regions. As soon as I start writing to this table I see HFiles of
> 2-4 MB across the regions. I have the default HBase configuration for the
> compaction properties. Compactions start as soon as I start writing to
> HBase, but many of these compactions are major ones. I can see this in the
> HBase master UI on the table details view. So I would like to understand
> when a compaction becomes major.
>
> Another question: if I'm not wrong, we have a memstore per region, so when a
> memstore is flushed I would expect an HFile of 128 MB, but I only see files
> of 42 MB without compression (2.5 MB when compressed with GZ).
>
> Any explanation?
>
> Thanks in advance.
> --
> Mehdi BEN HAJ ABBES
>