You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2013/08/10 01:00:15 UTC
IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
Hi,
In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
and we look at 2 things.
First, if the store can't be split, then we return immediatly with false
without checking the other stores. Then if the store can be split and is
big enought, we return immediatly (break) with true.
The logic seems to be wrong to be. In the 2nd case, should we not continue
to loop over the other stores and see if one can't be split? Else we will
trigger a split on a region where some stores can't be split. I will sply
remove the "break".
Does anyone have another opinion on that? Else I will open a JIRA and send
a patch...
JM
@Override
protected boolean shouldSplit() {
if (region.shouldForceSplit()) return true;
boolean foundABigStore = false;
// Get count of regions that have the same common table as this.region
int tableRegionsCount = getCountOfCommonTableRegions();
// Get size to check
long sizeToCheck = getSizeToCheck(tableRegionsCount);
LOG.debug("sizeToCheck = " + sizeToCheck);
for (Store store : region.getStores().values()) {
// If any of the stores is unable to split (eg they contain reference
files)
// then don't split
if ((!store.canSplit())) {
return false;
}
// Mark if any store is big enough
long size = store.getSize();
if (size > sizeToCheck) {
LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
" size=" + size + ", sizeToCheck=" + sizeToCheck +
", regionsWithCommonTable=" + tableRegionsCount);
foundABigStore = true;
break;
}
}
return foundABigStore;
}
Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
Posted by Ted Yu <yu...@gmail.com>.
Your reasoning looks good.
On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:
> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
> @Override
> protected boolean shouldSplit() {
> if (region.shouldForceSplit()) return true;
> boolean foundABigStore = false;
> // Get count of regions that have the same common table as this.region
> int tableRegionsCount = getCountOfCommonTableRegions();
> // Get size to check
> long sizeToCheck = getSizeToCheck(tableRegionsCount);
> LOG.debug("sizeToCheck = " + sizeToCheck);
>
> for (Store store : region.getStores().values()) {
> // If any of the stores is unable to split (eg they contain reference
> files)
> // then don't split
> if ((!store.canSplit())) {
> return false;
> }
>
> // Mark if any store is big enough
> long size = store.getSize();
> if (size > sizeToCheck) {
> LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
> " size=" + size + ", sizeToCheck=" + sizeToCheck +
> ", regionsWithCommonTable=" + tableRegionsCount);
> foundABigStore = true;
> break;
> }
> }
>
> return foundABigStore;
> }
>
Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Thanks all for checking.
Opened HBASE-9189. I will do the modifications on trunk and 0.94, run the
tests and submit the patch over the day.
I'm also instigating another split related. I have big >10GB regions not
splitting even after days. More to come.
JM
2013/8/11 lars hofhansl <la...@apache.org>
> File a jira, JMS?
> Looks like to simple fix, can pull into 0.94.11 (which I'll spin early
> next week, promised).
>
>
> -- Lars
>
>
>
> ________________________________
> From: Stack <st...@duboce.net>
> To: HBase Dev List <de...@hbase.apache.org>
> Sent: Saturday, August 10, 2013 10:36 PM
> Subject: Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic
> question
>
>
> In Store.canSplit, we check for references. If references, we should not
> split a region.
>
> public boolean canSplit() {
>
> this.lock.readLock().lock();
>
> try {
>
> // Not split-able if we find a reference store file present in the
> store.
>
> boolean result = !hasReferences();
>
> if (!result && LOG.isDebugEnabled()) {
>
> LOG.debug("Cannot split region due to reference files being
> there");
>
> }
>
> return result;
>
> } finally {
>
> this.lock.readLock().unlock();
>
> }
>
> }
>
>
> So, yeah, we should keep going through stores in case one is unsplittlable.
>
> Good on you JMS,
>
> St.Ack
>
>
> On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org
> > wrote:
>
> > Hi,
> >
> > In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> > and we look at 2 things.
> >
> > First, if the store can't be split, then we return immediatly with false
> > without checking the other stores. Then if the store can be split and is
> > big enought, we return immediatly (break) with true.
> >
> > The logic seems to be wrong to be. In the 2nd case, should we not
> continue
> > to loop over the other stores and see if one can't be split? Else we will
> > trigger a split on a region where some stores can't be split. I will sply
> > remove the "break".
> >
> > Does anyone have another opinion on that? Else I will open a JIRA and
> send
> > a patch...
> >
> > JM
> >
> > @Override
> > protected boolean shouldSplit() {
> > if (region.shouldForceSplit()) return true;
> > boolean foundABigStore = false;
> > // Get count of regions that have the same common table as
> this.region
> > int tableRegionsCount = getCountOfCommonTableRegions();
> > // Get size to check
> > long sizeToCheck = getSizeToCheck(tableRegionsCount);
> > LOG.debug("sizeToCheck = " + sizeToCheck);
> >
> > for (Store store : region.getStores().values()) {
> > // If any of the stores is unable to split (eg they contain
> reference
> > files)
> > // then don't split
> > if ((!store.canSplit())) {
> > return false;
> > }
> >
> > // Mark if any store is big enough
> > long size = store.getSize();
> > if (size > sizeToCheck) {
> > LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
> > " size=" + size + ", sizeToCheck=" + sizeToCheck +
> > ", regionsWithCommonTable=" + tableRegionsCount);
> > foundABigStore = true;
> > break;
> > }
> > }
> >
> > return foundABigStore;
> > }
> >
>
Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
Posted by lars hofhansl <la...@apache.org>.
File a jira, JMS?
Looks like to simple fix, can pull into 0.94.11 (which I'll spin early next week, promised).
-- Lars
________________________________
From: Stack <st...@duboce.net>
To: HBase Dev List <de...@hbase.apache.org>
Sent: Saturday, August 10, 2013 10:36 PM
Subject: Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
In Store.canSplit, we check for references. If references, we should not
split a region.
public boolean canSplit() {
this.lock.readLock().lock();
try {
// Not split-able if we find a reference store file present in the
store.
boolean result = !hasReferences();
if (!result && LOG.isDebugEnabled()) {
LOG.debug("Cannot split region due to reference files being there");
}
return result;
} finally {
this.lock.readLock().unlock();
}
}
So, yeah, we should keep going through stores in case one is unsplittlable.
Good on you JMS,
St.Ack
On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:
> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
> @Override
> protected boolean shouldSplit() {
> if (region.shouldForceSplit()) return true;
> boolean foundABigStore = false;
> // Get count of regions that have the same common table as this.region
> int tableRegionsCount = getCountOfCommonTableRegions();
> // Get size to check
> long sizeToCheck = getSizeToCheck(tableRegionsCount);
> LOG.debug("sizeToCheck = " + sizeToCheck);
>
> for (Store store : region.getStores().values()) {
> // If any of the stores is unable to split (eg they contain reference
> files)
> // then don't split
> if ((!store.canSplit())) {
> return false;
> }
>
> // Mark if any store is big enough
> long size = store.getSize();
> if (size > sizeToCheck) {
> LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
> " size=" + size + ", sizeToCheck=" + sizeToCheck +
> ", regionsWithCommonTable=" + tableRegionsCount);
> foundABigStore = true;
> break;
> }
> }
>
> return foundABigStore;
> }
>
Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
Posted by Stack <st...@duboce.net>.
In Store.canSplit, we check for references. If references, we should not
split a region.
public boolean canSplit() {
this.lock.readLock().lock();
try {
// Not split-able if we find a reference store file present in the
store.
boolean result = !hasReferences();
if (!result && LOG.isDebugEnabled()) {
LOG.debug("Cannot split region due to reference files being there");
}
return result;
} finally {
this.lock.readLock().unlock();
}
}
So, yeah, we should keep going through stores in case one is unsplittlable.
Good on you JMS,
St.Ack
On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:
> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
> @Override
> protected boolean shouldSplit() {
> if (region.shouldForceSplit()) return true;
> boolean foundABigStore = false;
> // Get count of regions that have the same common table as this.region
> int tableRegionsCount = getCountOfCommonTableRegions();
> // Get size to check
> long sizeToCheck = getSizeToCheck(tableRegionsCount);
> LOG.debug("sizeToCheck = " + sizeToCheck);
>
> for (Store store : region.getStores().values()) {
> // If any of the stores is unable to split (eg they contain reference
> files)
> // then don't split
> if ((!store.canSplit())) {
> return false;
> }
>
> // Mark if any store is big enough
> long size = store.getSize();
> if (size > sizeToCheck) {
> LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
> " size=" + size + ", sizeToCheck=" + sizeToCheck +
> ", regionsWithCommonTable=" + tableRegionsCount);
> foundABigStore = true;
> break;
> }
> }
>
> return foundABigStore;
> }
>