You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2013/08/10 01:00:15 UTC

IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question

Hi,

In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
and we look at 2 things.

First, if the store can't be split, then we return immediatly with false
without checking the other stores. Then if the store can be split and is
big enought, we return immediatly (break) with true.

The logic seems to be wrong to be. In the 2nd case, should we not continue
to loop over the other stores and see if one can't be split? Else we will
trigger a split on a region where some stores can't be split. I will sply
remove the "break".

Does anyone have another opinion on that? Else I will open a JIRA and send
a patch...

JM

  @Override
  protected boolean shouldSplit() {
    if (region.shouldForceSplit()) return true;
    boolean foundABigStore = false;
    // Get count of regions that have the same common table as this.region
    int tableRegionsCount = getCountOfCommonTableRegions();
    // Get size to check
    long sizeToCheck = getSizeToCheck(tableRegionsCount);
    LOG.debug("sizeToCheck = " + sizeToCheck);

    for (Store store : region.getStores().values()) {
      // If any of the stores is unable to split (eg they contain reference
files)
      // then don't split
      if ((!store.canSplit())) {
        return false;
      }

      // Mark if any store is big enough
      long size = store.getSize();
      if (size > sizeToCheck) {
        LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
          " size=" + size + ", sizeToCheck=" + sizeToCheck +
          ", regionsWithCommonTable=" + tableRegionsCount);
        foundABigStore = true;
        break;
      }
    }

    return foundABigStore;
  }

Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question

Posted by Ted Yu <yu...@gmail.com>.
Your reasoning looks good.

On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:

> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
>   @Override
>   protected boolean shouldSplit() {
>     if (region.shouldForceSplit()) return true;
>     boolean foundABigStore = false;
>     // Get count of regions that have the same common table as this.region
>     int tableRegionsCount = getCountOfCommonTableRegions();
>     // Get size to check
>     long sizeToCheck = getSizeToCheck(tableRegionsCount);
>     LOG.debug("sizeToCheck = " + sizeToCheck);
>
>     for (Store store : region.getStores().values()) {
>       // If any of the stores is unable to split (eg they contain reference
> files)
>       // then don't split
>       if ((!store.canSplit())) {
>         return false;
>       }
>
>       // Mark if any store is big enough
>       long size = store.getSize();
>       if (size > sizeToCheck) {
>         LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
>           " size=" + size + ", sizeToCheck=" + sizeToCheck +
>           ", regionsWithCommonTable=" + tableRegionsCount);
>         foundABigStore = true;
>         break;
>       }
>     }
>
>     return foundABigStore;
>   }
>

Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Thanks all for checking.

Opened HBASE-9189. I will do the modifications on trunk and 0.94, run the
tests and submit the patch over the day.

I'm also instigating another split related. I have big >10GB regions not
splitting even after days. More to come.

JM

2013/8/11 lars hofhansl <la...@apache.org>

> File a jira, JMS?
> Looks like to simple fix, can pull into 0.94.11 (which I'll spin early
> next week, promised).
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Stack <st...@duboce.net>
> To: HBase Dev List <de...@hbase.apache.org>
> Sent: Saturday, August 10, 2013 10:36 PM
> Subject: Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic
> question
>
>
> In Store.canSplit, we check for references.  If references, we should not
> split a region.
>
>   public boolean canSplit() {
>
>     this.lock.readLock().lock();
>
>     try {
>
>       // Not split-able if we find a reference store file present in the
> store.
>
>       boolean result = !hasReferences();
>
>       if (!result && LOG.isDebugEnabled()) {
>
>         LOG.debug("Cannot split region due to reference files being
> there");
>
>       }
>
>       return result;
>
>     } finally {
>
>       this.lock.readLock().unlock();
>
>     }
>
>   }
>
>
> So, yeah, we should keep going through stores in case one is unsplittlable.
>
> Good on you JMS,
>
> St.Ack
>
>
> On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org
> > wrote:
>
> > Hi,
> >
> > In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> > and we look at 2 things.
> >
> > First, if the store can't be split, then we return immediatly with false
> > without checking the other stores. Then if the store can be split and is
> > big enought, we return immediatly (break) with true.
> >
> > The logic seems to be wrong to be. In the 2nd case, should we not
> continue
> > to loop over the other stores and see if one can't be split? Else we will
> > trigger a split on a region where some stores can't be split. I will sply
> > remove the "break".
> >
> > Does anyone have another opinion on that? Else I will open a JIRA and
> send
> > a patch...
> >
> > JM
> >
> >   @Override
> >   protected boolean shouldSplit() {
> >     if (region.shouldForceSplit()) return true;
> >     boolean foundABigStore = false;
> >     // Get count of regions that have the same common table as
> this.region
> >     int tableRegionsCount = getCountOfCommonTableRegions();
> >     // Get size to check
> >     long sizeToCheck = getSizeToCheck(tableRegionsCount);
> >     LOG.debug("sizeToCheck = " + sizeToCheck);
> >
> >     for (Store store : region.getStores().values()) {
> >       // If any of the stores is unable to split (eg they contain
> reference
> > files)
> >       // then don't split
> >       if ((!store.canSplit())) {
> >         return false;
> >       }
> >
> >       // Mark if any store is big enough
> >       long size = store.getSize();
> >       if (size > sizeToCheck) {
> >         LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
> >           " size=" + size + ", sizeToCheck=" + sizeToCheck +
> >           ", regionsWithCommonTable=" + tableRegionsCount);
> >         foundABigStore = true;
> >         break;
> >       }
> >     }
> >
> >     return foundABigStore;
> >   }
> >
>

Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question

Posted by lars hofhansl <la...@apache.org>.
File a jira, JMS?
Looks like to simple fix, can pull into 0.94.11 (which I'll spin early next week, promised).


-- Lars



________________________________
 From: Stack <st...@duboce.net>
To: HBase Dev List <de...@hbase.apache.org> 
Sent: Saturday, August 10, 2013 10:36 PM
Subject: Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
 

In Store.canSplit, we check for references.  If references, we should not
split a region.

  public boolean canSplit() {

    this.lock.readLock().lock();

    try {

      // Not split-able if we find a reference store file present in the
store.

      boolean result = !hasReferences();

      if (!result && LOG.isDebugEnabled()) {

        LOG.debug("Cannot split region due to reference files being there");

      }

      return result;

    } finally {

      this.lock.readLock().unlock();

    }

  }


So, yeah, we should keep going through stores in case one is unsplittlable.

Good on you JMS,

St.Ack


On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:

> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
>   @Override
>   protected boolean shouldSplit() {
>     if (region.shouldForceSplit()) return true;
>     boolean foundABigStore = false;
>     // Get count of regions that have the same common table as this.region
>     int tableRegionsCount = getCountOfCommonTableRegions();
>     // Get size to check
>     long sizeToCheck = getSizeToCheck(tableRegionsCount);
>     LOG.debug("sizeToCheck = " + sizeToCheck);
>
>     for (Store store : region.getStores().values()) {
>       // If any of the stores is unable to split (eg they contain reference
> files)
>       // then don't split
>       if ((!store.canSplit())) {
>         return false;
>       }
>
>       // Mark if any store is big enough
>       long size = store.getSize();
>       if (size > sizeToCheck) {
>         LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
>           " size=" + size + ", sizeToCheck=" + sizeToCheck +
>           ", regionsWithCommonTable=" + tableRegionsCount);
>         foundABigStore = true;
>         break;
>       }
>     }
>
>     return foundABigStore;
>   }
>

Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question

Posted by Stack <st...@duboce.net>.
In Store.canSplit, we check for references.  If references, we should not
split a region.

  public boolean canSplit() {

    this.lock.readLock().lock();

    try {

      // Not split-able if we find a reference store file present in the
store.

      boolean result = !hasReferences();

      if (!result && LOG.isDebugEnabled()) {

        LOG.debug("Cannot split region due to reference files being there");

      }

      return result;

    } finally {

      this.lock.readLock().unlock();

    }

  }


So, yeah, we should keep going through stores in case one is unsplittlable.

Good on you JMS,

St.Ack


On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org
> wrote:

> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
>   @Override
>   protected boolean shouldSplit() {
>     if (region.shouldForceSplit()) return true;
>     boolean foundABigStore = false;
>     // Get count of regions that have the same common table as this.region
>     int tableRegionsCount = getCountOfCommonTableRegions();
>     // Get size to check
>     long sizeToCheck = getSizeToCheck(tableRegionsCount);
>     LOG.debug("sizeToCheck = " + sizeToCheck);
>
>     for (Store store : region.getStores().values()) {
>       // If any of the stores is unable to split (eg they contain reference
> files)
>       // then don't split
>       if ((!store.canSplit())) {
>         return false;
>       }
>
>       // Mark if any store is big enough
>       long size = store.getSize();
>       if (size > sizeToCheck) {
>         LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
>           " size=" + size + ", sizeToCheck=" + sizeToCheck +
>           ", regionsWithCommonTable=" + tableRegionsCount);
>         foundABigStore = true;
>         break;
>       }
>     }
>
>     return foundABigStore;
>   }
>