Posted to dev@hbase.apache.org by Schubert Zhang <zs...@gmail.com> on 2011/05/31 08:27:08 UTC

Split issue for multiple ColumnFamilies

We found an issue with forceSplit.

1. My schema:
    We have a table with multiple column families: A, B, C, D.
    After putting a lot of data, some stores may still have no data (no
storeFile).
    For example: stores A and B are empty, but C and D now hold a lot of data.

2. Then I requested a split from the hbase shell or the web GUI, but nothing
happened.

3. Then we checked the code; it may be a bug with multiple CFs.
    (1) Split request.
         RegionServer:   region.shouldSplit(true)
         and enqueue this request
    (2) HRegion.compactStores(false)
         for (Store store: stores.values()) {
            final Store.StoreSize ss = store.compact(majorCompaction);
            lastCompactSize += store.getLastCompactSize();
            if (ss != null && ss.getSize() > maxSize) {
              maxSize = ss.getSize();
              splitRow = ss.getSplitRow();
            }
          }
         But for store A and store B, store.compact(false) returns null.
     (3) Store.compact(false)
          boolean forceSplit = this.region.shouldSplit(false);
          For the first store A and the second store B, this call sets
this.splitRequest=false. This is a bug.
     (4) So even for stores C and D, which hold a lot of data, forceSplit is
always false.
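
The steps above can be modeled with a small standalone sketch. This is not the HBase source: the Region class, storesOfferingSplit() and the store sizes are hypothetical, and the read-and-overwrite behavior of shouldSplit(boolean) is assumed from the description in (3). Only the names shouldSplit and shouldForceSplit come from this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class ForceSplitSketch {

    /** Hypothetical stand-in for a region; models only the split-request flag. */
    static class Region {
        private boolean splitRequest;

        /** Assumed behavior: returns the old flag and overwrites it with b. */
        boolean shouldSplit(boolean b) {
            boolean old = splitRequest;
            splitRequest = b;
            return old;
        }

        /** Read-only check that leaves the flag untouched. */
        boolean shouldForceSplit() {
            return splitRequest;
        }
    }

    /**
     * Returns the names of the stores that would offer a split row.
     * A size of 0 models an empty store (no storeFile).
     */
    static List<String> storesOfferingSplit(String[] names, long[] sizes,
                                            boolean readOnlyCheck) {
        Region region = new Region();
        region.shouldSplit(true); // the user's split request
        List<String> offering = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            boolean forceSplit = readOnlyCheck
                ? region.shouldForceSplit()
                : region.shouldSplit(false); // the first caller consumes the flag
            if (forceSplit && sizes[i] > 0) {
                offering.add(names[i]);
            }
        }
        return offering;
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C", "D"};
        long[] sizes = {0, 0, 100, 200};
        System.out.println(storesOfferingSplit(names, sizes, false)); // prints []
        System.out.println(storesOfferingSplit(names, sizes, true));  // prints [C, D]
    }
}
```

With the read-and-overwrite call, the empty store A consumes the request and no store offers a split row; with a read-only check, C and D both see the request.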


btw: the current compaction/splitting code seems quite disordered and
confusing.

Schubert

Re: Split issue for multiple ColumnFamilies

Posted by Schubert Zhang <zs...@gmail.com>.
Yes, I just upgraded to 0.90.3 this morning, retested the split, and also
checked the code.
The split works fine.



On Wed, Jun 1, 2011 at 12:10 PM, Ted Yu <yu...@gmail.com> wrote:

> Schubert:
> Are you able to upgrade to 0.90.3 ?

Re: Split issue for multiple ColumnFamilies

Posted by Ted Yu <yu...@gmail.com>.
Schubert:
Are you able to upgrade to 0.90.3 ?


Re: Split issue for multiple ColumnFamilies

Posted by Schubert Zhang <zs...@gmail.com>.
Another issue:

In Store.compact():

      if (!majorcompaction && !references &&
          (forceSplit || (filesToCompact.size() < compactionThreshold))) {
        return checkSplit(forceSplit);
      }

In some cases filesToCompact.size() >= compactionThreshold, but the
following compaction logic still cannot reduce the number of storefiles,
since the compaction file-selection policy is very complicated.
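
To make the guard easier to reason about, here is a minimal sketch that lifts the quoted condition into a standalone predicate. The class name, method name and the threshold value of 3 are assumptions for illustration only. What happens after falling through is not modeled, since the file-selection policy is not shown in this thread.

```java
public class EarlyReturnSketch {

    /**
     * Mirrors the quoted condition: true means compaction is skipped and
     * control goes straight to checkSplit(forceSplit).
     */
    static boolean skipsCompaction(boolean majorCompaction, boolean references,
                                   boolean forceSplit, int fileCount,
                                   int compactionThreshold) {
        return !majorCompaction && !references
            && (forceSplit || fileCount < compactionThreshold);
    }

    public static void main(String[] args) {
        int threshold = 3; // assumed value, for illustration only
        // Below the threshold: early return through checkSplit.
        System.out.println(skipsCompaction(false, false, false, 2, threshold)); // true
        // At or above the threshold without forceSplit: falls through to
        // the file-selection logic, which may or may not reduce the file count.
        System.out.println(skipsCompaction(false, false, false, 3, threshold)); // false
        // forceSplit short-circuits the file-count test.
        System.out.println(skipsCompaction(false, false, true, 5, threshold));  // true
    }
}
```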




Re: Split issue for multiple ColumnFamilies

Posted by Schubert Zhang <zs...@gmail.com>.
Thanks Ted.

We are using 0.90.1 now.

It's good news that the new 0.90.3 and trunk have resolved this issue.




Re: Split issue for multiple ColumnFamilies

Posted by Ted Yu <yu...@gmail.com>.
Which version of HBase are you using ?
In 0.90.3, I see this in Store.compact():
    boolean forceSplit = this.region.shouldForceSplit();
For trunk, the code has been rewritten.

Cheers
