Posted to dev@hbase.apache.org by Schubert Zhang <zs...@gmail.com> on 2011/05/31 08:27:08 UTC
Split issue for multiple ColumnFamilies
We found an issue with forceSplit.
1. My schema:
We have a table with multiple column families: A, B, C, D.
After many puts, some stores may still hold no data (no storeFile).
For example: stores A and B are empty, but C and D now hold a lot of data.
2. I then request a split from the hbase shell or the web GUI, but nothing
happens.
3. We then checked the code; it may be a bug with multiple CFs.
(1) Split request.
    RegionServer: region.shouldSplit(true)
    and the request is enqueued.
(2) HRegion.compactStores(false)
    for (Store store: stores.values()) {
      final Store.StoreSize ss = store.compact(majorCompaction);
      lastCompactSize += store.getLastCompactSize();
      if (ss != null && ss.getSize() > maxSize) {
        maxSize = ss.getSize();
        splitRow = ss.getSplitRow();
      }
    }
    But for store A and store B, store.compact(false) returns null.
(3) Store.compact(false)
    boolean forceSplit = this.region.shouldSplit(false);
    For the first store A and the second store B, this call sets
    this.splitRequest=false, so the request is lost before the stores
    with data are reached. This is a bug.
(4) So even for stores C and D, which hold a lot of data, forceSplit is
    always false.
btw: the current compaction/splitting code seems quite disorganized and
confusing.
Schubert
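The sequence described in steps (1) to (4) can be sketched as a small standalone simulation. This is not the HBase source: the Region class and the compactStores helper below are hypothetical stand-ins, and the sketch assumes that shouldSplit(boolean) stores its argument and returns the previous value, as the report implies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ForceSplitBugSketch {
    /** Hypothetical stand-in for the region; holds the one-shot split request flag. */
    static class Region {
        private boolean splitRequest;

        /** Assumed behaviour: store the new value and return the previous one. */
        boolean shouldSplit(boolean b) {
            boolean old = this.splitRequest;
            this.splitRequest = b;
            return old;
        }
    }

    /** Returns, per store and in iteration order, the forceSplit value that store observed. */
    static Map<String, Boolean> compactStores(Region region, List<String> storeNames) {
        Map<String, Boolean> seen = new LinkedHashMap<>();
        for (String name : storeNames) {
            // Every store reads the flag, and the read itself resets it to false.
            seen.put(name, region.shouldSplit(false));
        }
        return seen;
    }

    public static void main(String[] args) {
        Region region = new Region();
        region.shouldSplit(true); // the manual split request from the shell or GUI
        // Prints {A=true, B=false, C=false, D=false}
        System.out.println(compactStores(region, Arrays.asList("A", "B", "C", "D")));
    }
}
```

Only the first store visited sees the request. If that store is empty and yields no split row, the stores that do hold data never see forceSplit set.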
Re: Split issue for multiple ColumnFamilies
Posted by Schubert Zhang <zs...@gmail.com>.
Yes, I upgraded to 0.90.3 this morning, retested the split, and also
checked the code.
The split works fine.
On Wed, Jun 1, 2011 at 12:10 PM, Ted Yu <yu...@gmail.com> wrote:
> Schubert:
> Are you able to upgrade to 0.90.3 ?
Re: Split issue for multiple ColumnFamilies
Posted by Ted Yu <yu...@gmail.com>.
Schubert:
Are you able to upgrade to 0.90.3?
On Tue, May 31, 2011 at 7:06 PM, Schubert Zhang <zs...@gmail.com> wrote:
> Another issue:
>
> In Store.compact():
>
> if (!majorcompaction && !references &&
>     (forceSplit || (filesToCompact.size() < compactionThreshold))) {
>   return checkSplit(forceSplit);
> }
>
> In some cases filesToCompact.size() >= compactionThreshold, but the
> compaction logic that follows still cannot reduce the number of
> storefiles, since the compaction file-selection policy is very complicated.
Re: Split issue for multiple ColumnFamilies
Posted by Schubert Zhang <zs...@gmail.com>.
Another issue:
In Store.compact():
if (!majorcompaction && !references &&
    (forceSplit || (filesToCompact.size() < compactionThreshold))) {
  return checkSplit(forceSplit);
}
In some cases filesToCompact.size() >= compactionThreshold, but the
compaction logic that follows still cannot reduce the number of storefiles,
since the compaction file-selection policy is very complicated.
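To make the early-return condition easier to read, here is a minimal standalone sketch of the same boolean expression. The class and method names are hypothetical, and the threshold of 3 is only an illustrative value; nothing here is taken from HBase beyond the condition quoted above.

```java
public class EarlyReturnSketch {
    /**
     * Mirrors the quoted condition: true means compact() would skip the
     * compaction and return checkSplit(forceSplit) right away.
     */
    static boolean skipsCompaction(boolean majorCompaction, boolean references,
                                   boolean forceSplit, int fileCount,
                                   int compactionThreshold) {
        return !majorCompaction && !references
            && (forceSplit || fileCount < compactionThreshold);
    }

    public static void main(String[] args) {
        int threshold = 3; // illustrative value only
        // Forced split: compaction is skipped however many files there are.
        System.out.println(skipsCompaction(false, false, true, 5, threshold));  // true
        // Too few files: compaction is skipped.
        System.out.println(skipsCompaction(false, false, false, 2, threshold)); // true
        // Enough files and no forced split: falls through to file selection.
        System.out.println(skipsCompaction(false, false, false, 5, threshold)); // false
        // A major compaction never takes the early return.
        System.out.println(skipsCompaction(true, false, true, 5, threshold));   // false
    }
}
```

The third case is the one described above: the check passes the store on to file selection, which may still choose nothing to compact.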
On Wed, Jun 1, 2011 at 9:42 AM, Schubert Zhang <zs...@gmail.com> wrote:
> Thanks Ted.
>
> We are using 0.90.1 now.
>
> It's good news that new 0.90.3 and trunk resolved this issue.
Re: Split issue for multiple ColumnFamilies
Posted by Schubert Zhang <zs...@gmail.com>.
Thanks Ted.
We are using 0.90.1 now.
It's good news that the new 0.90.3 release and trunk resolve this issue.
On Wed, Jun 1, 2011 at 1:35 AM, Ted Yu <yu...@gmail.com> wrote:
> Which version of HBase are you using ?
> In 0.90.3, I see this in Store.compact():
> boolean forceSplit = this.region.shouldForceSplit();
> For trunk, the code has been rewritten.
>
> Cheers
Re: Split issue for multiple ColumnFamilies
Posted by Ted Yu <yu...@gmail.com>.
Which version of HBase are you using?
In 0.90.3, I see this in Store.compact():
boolean forceSplit = this.region.shouldForceSplit();
For trunk, the code has been rewritten.
Cheers
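A minimal sketch of why a read-only accessor avoids the problem reported in this thread. Only the name shouldForceSplit() comes from the 0.90.3 line quoted above; the Region class, requestSplit(), clearSplitRequest(), and the point at which the flag is cleared are assumptions made for illustration, not the actual 0.90.3 code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ForceSplitFixSketch {
    /** Hypothetical stand-in for the region. */
    static class Region {
        private boolean splitRequest;

        /** Hypothetical: records a manual split request. */
        void requestSplit() {
            this.splitRequest = true;
        }

        /** Read-only accessor: reading the flag does not change it. */
        boolean shouldForceSplit() {
            return this.splitRequest;
        }

        /** Hypothetical: clears the request once all stores have been visited. */
        void clearSplitRequest() {
            this.splitRequest = false;
        }
    }

    /** Returns, per store and in iteration order, the forceSplit value that store observed. */
    static Map<String, Boolean> compactStores(Region region, List<String> storeNames) {
        Map<String, Boolean> seen = new LinkedHashMap<>();
        try {
            for (String name : storeNames) {
                seen.put(name, region.shouldForceSplit());
            }
        } finally {
            // Assumed design: reset the flag a single time, after the loop.
            region.clearSplitRequest();
        }
        return seen;
    }

    public static void main(String[] args) {
        Region region = new Region();
        region.requestSplit();
        // Prints {A=true, B=true, C=true, D=true}
        System.out.println(compactStores(region, Arrays.asList("A", "B", "C", "D")));
    }
}
```

With a getter that has no side effect, empty stores such as A and B no longer consume the request before C and D are examined.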
On Mon, May 30, 2011 at 11:27 PM, Schubert Zhang <zs...@gmail.com> wrote:
> We found an issue with forceSplit.