Posted to dev@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2010/12/08 01:38:42 UTC

Review Request: SplitTransaction.splitStoreFiles slows splits a lot

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

Patch that parallelizes the splitting of the files using ThreadPoolExecutor and Futures. The code is a bit ugly, but it does the job really well, as shown during cluster testing (which also uncovered HBASE-3318).

One new behavior this patch adds is that a split can now be rolled back because splitting the files took too long. I tested with a timeout of 5 seconds on my cluster; even though each machine did a few rollbacks, the import went fine. The default is 30 seconds and isn't in hbase-default.xml, as I don't think anyone would really want to change it.
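
For readers without the diff at hand, here is a minimal sketch of the general pattern described above: submit one task per file to an executor, then wait on the Futures against a single overall deadline, treating a timeout or a task failure as the signal to roll back. This is not the code from the patch. The class ParallelSplitSketch, the methods splitTask and splitAll, and the sleep standing in for the file work are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParallelSplitSketch {

    /** Hypothetical stand-in for splitting one store file; it only sleeps. */
    public static Callable<String> splitTask(final String file, final long workMillis) {
        return () -> {
            Thread.sleep(workMillis);
            return file + ".split";
        };
    }

    /**
     * Runs every task in parallel and waits at most timeoutMillis in total.
     * Returns true if all tasks finished in time; false means the caller
     * should roll the split back.
     */
    public static boolean splitAll(List<Callable<String>> tasks, long timeoutMillis) {
        if (tasks.isEmpty()) {
            return true;
        }
        // One thread per file, as in the unbounded variant discussed in the review.
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Callable<String> task : tasks) {
                futures.add(pool.submit(task));
            }
            // A single deadline shared by all futures, not a fresh timeout per file.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Future<String> future : futures) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    future.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    return false; // too slow or failed: signal a rollback
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        } finally {
            // Interrupt anything still running so a rollback does not leave threads behind.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(splitTask("storefile-a", 20));
        tasks.add(splitTask("storefile-b", 20));
        System.out.println("completed in time: " + splitAll(tasks, 30_000));
    }
}
```

Sharing one deadline across all futures keeps the total wait bounded by the configured timeout regardless of how many files there are, which matches the "took too long to split the files" wording above.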


This addresses bug HBASE-3308.
    http://issues.apache.org/jira/browse/HBASE-3308


Diffs
-----

  /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 1043188 

Diff: http://review.cloudera.org/r/1273/diff


Testing
-------


Thanks,

Jean-Daniel


Re: Review Request: SplitTransaction.splitStoreFiles slows splits a lot

Posted by st...@duboce.net.

> On 2010-12-07 17:02:49, stack wrote:
> > /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400
> > <http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400>
> >
> >     Why not have an upper bound?  With 100 files, that's 100 threads doing FS operations.  I bet that with an upper bound of 10 on the ExecutorService, it would complete faster than with an unbounded one.
> 
> Jean-Daniel Cryans wrote:
>     I think we are already bounded by hbase.hstore.blockingStoreFiles

That'll do.  +1 on commit.


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/#review2043
-----------------------------------------------------------


Re: Review Request: SplitTransaction.splitStoreFiles slows splits a lot

Posted by Jean-Daniel Cryans <jd...@apache.org>.

> On 2010-12-07 17:02:49, stack wrote:
> > /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400
> > <http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400>
> >
> >     Why not have an upper bound?  With 100 files, that's 100 threads doing FS operations.  I bet that with an upper bound of 10 on the ExecutorService, it would complete faster than with an unbounded one.

I think we are already bounded by hbase.hstore.blockingStoreFiles
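
As a point of comparison for the exchange above, an explicit cap could be sketched roughly as follows: size the pool as the smaller of the file count and a fixed maximum, so extra files queue instead of each getting a thread. This is only an illustration of the reviewer's suggestion, not what the patch does. The class BoundedSplitPool, the methods poolSize and peakConcurrency, and the maxThreads parameter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSplitPool {

    /** Pool size: one thread per file, but never more than maxThreads (hypothetical cap). */
    public static int poolSize(int fileCount, int maxThreads) {
        return Math.max(1, Math.min(fileCount, maxThreads));
    }

    /**
     * Runs fileCount dummy tasks on a capped pool and reports the highest
     * number of tasks that were observed running at the same time.
     */
    public static int peakConcurrency(int fileCount, int maxThreads) {
        final AtomicInteger active = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int i = 0; i < fileCount; i++) {
            tasks.add(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for the per-file FS work
                } finally {
                    active.decrementAndGet();
                }
                return null;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(poolSize(fileCount, maxThreads));
        try {
            pool.invokeAll(tasks); // blocks until every task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("pool size for 100 files, cap 10: " + poolSize(100, 10));
        System.out.println("peak concurrency: " + peakConcurrency(20, 4));
    }
}
```

If the number of store files is already limited elsewhere, as the reply suggests, the min() simply resolves to the file count and the cap never comes into play.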


- Jean-Daniel


Re: Review Request: SplitTransaction.splitStoreFiles slows splits a lot

Posted by st...@duboce.net.

Ship it!


+1  Minor comment below.


/branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
<http://review.cloudera.org/r/1273/#comment6447>

    Why not have an upper bound?  With 100 files, that's 100 threads doing FS operations.  I bet that with an upper bound of 10 on the ExecutorService, it would complete faster than with an unbounded one.


- stack