You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/04/18 23:01:05 UTC

[jira] [Created] (HBASE-3797) StoreFile Level Compaction Locking

StoreFile Level Compaction Locking
----------------------------------

                 Key: HBASE-3797
                 URL: https://issues.apache.org/jira/browse/HBASE-3797
             Project: HBase
          Issue Type: Improvement
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg
            Priority: Minor


Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg resolved HBASE-3797.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0

Added with Stack's requests.  Compaction throttling is turned off by default.  I agree that it should be enabled by default, but decided to save that for another JIRA so we can discuss how we arrive at the default.

> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028604#comment-13028604 ] 

Nicolas Spiegelberg edited comment on HBASE-3797 at 5/4/11 5:01 AM:
--------------------------------------------------------------------

Combined patch for HBASE-3797 & HBASE-1476.  I did not split them up because >50% of HBASE-1476 is refactoring code written for HBASE-3797.  Most of the comments I got internally about HBASE-3797 were pertaining to those deprecated code segments, so I wanted to save myself some grief on the external review :)  Note that I am having problems creating a Review Board account on review.apache.org, or I would post this up (2000+ line patch).

      was (Author: nspiegelberg):
    Combined patch for HBASE-3767 & HBASE-1476.  I did not split them up because >50% of HBASE-1476 is refactoring code written for HBASE-3767.  Most of the comments I got internally about HBASE-3767 were pertaining to those deprecated code segments, so I wanted to save myself some grief on the external review :)  Note that I am having problems creating a Review Board account on review.apache.org, or I would post this up (2000+ line patch).
  
> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028792#comment-13028792 ] 

jiraposter@reviews.apache.org commented on HBASE-3797:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/693/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

hbase-3797 StoreFile Level Compaction Locking

I posted this here for nicolas


This addresses bug hbase-3797.
    https://issues.apache.org/jira/browse/hbase-3797


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 24450ae 
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactionRequestor.java fdbf659 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 11fd50e 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java b910254 
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java f66a7cd 
  src/main/java/org/apache/hadoop/hbase/regionserver/PriorityCompactionQueue.java 13bcb3f 
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java e2295c2 
  src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java 98f962b 
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java 345e393 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java 6517ba7 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestPriorityCompactionQueue.java c5d876d 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java adfe1f8 

Diff: https://reviews.apache.org/r/693/diff


Testing
-------


Thanks,

Michael



> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-3797:
---------------------------------------

    Attachment: HBASE-3797+1476.patch

Combined patch for HBASE-3767 & HBASE-1476.  I did not split them up because >50% of HBASE-1476 is refactoring code written for HBASE-3767.  Most of the comments I got internally about HBASE-3767 were pertaining to those deprecated code segments, so I wanted to save myself some grief on the external review :)  Note that I am having problems creating a Review Board account on review.apache.org, or I would post this up (2000+ line patch).

> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029528#comment-13029528 ] 

stack commented on HBASE-3797:
------------------------------

@N

That sounds like some good testing done already.  Agree should be good for TRUNK.  Yeah, on commit, lets change default. I'm ok if its sub-optimal on commit and we tune it later rather than other way around because then folks will notice that this fancy new stuff exists.

CompactionRequestor.java#34: I'm ok on limiting refactoring.

Would suggest you remove setServer method for SplitRequest.java#34 on commit.

I'm good on the rest.  You want to make a new patch or you just want me to commit this (with amendments above)?





> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028824#comment-13028824 ] 

jiraposter@reviews.apache.org commented on HBASE-3797:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/693/#review646
-----------------------------------------------------------

This looks great.  Some comments in below.  Have you tried it Nicolas?  Does it work for you?  I'm game for getting this into TRUNK sooner rather than earlier but was thinking the default settings conservative.  Should we have at least one small compactions thread running (Default is zero).

src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
<https://reviews.apache.org/r/693/#comment1293>

    Funny. What you doing at walmart?  Oh yeah, you are from the south!

src/main/java/org/apache/hadoop/hbase/regionserver/CompactionRequestor.java
<https://reviews.apache.org/r/693/#comment1294>

    Deprecate this in favor of the new API?  Or just remove this one since its an internal-only API?

src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
<https://reviews.apache.org/r/693/#comment1304>

    How we ensure a compaction and split of same region do not clash?  Or is that not a prob. in this redo?

src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
<https://reviews.apache.org/r/693/#comment1300>

    Missing license

src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
<https://reviews.apache.org/r/693/#comment1301>

    Is this good name for this?  Would 'Split' be a better name?  Or 'Splitter'

src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
<https://reviews.apache.org/r/693/#comment1302>

    Can you pass this on construction?  Seems like something that should not be changed post-creation

src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
<https://reviews.apache.org/r/693/#comment1303>

    This is a nice refactoring.

src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<https://reviews.apache.org/r/693/#comment1299>

    Did we remove this.  Is it not needed any more?

src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
<https://reviews.apache.org/r/693/#comment1296>

    The class comment is off now, is that right?  Now it does more than hold the details.  Now it actually runs the compaction? If so, should we rename the class?

src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
<https://reviews.apache.org/r/693/#comment1297>

    You have white space throughout your patch.  No biggie but you might want to fix on commit.

src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
<https://reviews.apache.org/r/693/#comment1298>

    Do we check that the files that make up the compaction are still around when we go to run?  Is it possible they might have changed between queu'ing and running?

src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
<https://reviews.apache.org/r/693/#comment1295>

    Good

- Michael

On 2011-05-04 15:48:58, Michael Stack wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/693/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-05-04 15:48:58)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  hbase-3797 StoreFile Level Compaction Locking
bq.  
bq.  I posted this here for nicolas
bq.  
bq.  
bq.  This addresses bug hbase-3797.
bq.      https://issues.apache.org/jira/browse/hbase-3797
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 24450ae 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/CompactionRequestor.java fdbf659 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 11fd50e 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java b910254 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java f66a7cd 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/PriorityCompactionQueue.java 13bcb3f 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/Store.java e2295c2 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java 98f962b 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java 345e393 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java 6517ba7 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestPriorityCompactionQueue.java c5d876d 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java adfe1f8 
bq.  
bq.  Diff: https://reviews.apache.org/r/693/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Michael
bq.  
bq.

> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028612#comment-13028612 ] 

stack commented on HBASE-3797:
------------------------------

I got 'Something broke! (Error 500)' when I tried uploading it.  Will try again in morning and then ask INFRA if it persists.

> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3797) StoreFile Level Compaction Locking

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029076#comment-13029076 ] 

Nicolas Spiegelberg commented on HBASE-3797:
--------------------------------------------

@Stack: review board comments.  Note that this is being used in our 0.89 branch, but the 0.90 port hasn't had any cluster testing, just unit testing.  The main differences were: (1) coprocessors, (2) have to pass the server to CompactRequest/SplitRequest classes, & (3) Split code has changed.  All 3 differences were pretty trivial, so it should work out of the box and we plan to use it for our 0.90 build as well.

I wanted to keep the defaults the same as the old algorithm, but I certainly agree that we should think of some reasonable default for the throttle size and default to 1 large + 1 small compaction.  It's just that our use case has 10GB StoreFiles, and I don't think that's normal.  Maybe after some auto-split workload uses/tunes this feature?  Also, note that 2 compactions/server means that up to 6 hard disks/server could be busy handling compactions, so there's diminished benefits on increasing the number of compaction threads past your HD/server count.  However, throttle partitioning will probably always be useful for congestion scenarios even if HD/server is low.

CompactionRequestor.java#34 : we can remove the Region-level compaction API; but user-requested compactions and unit tests still use it for simplicity, so I was going to limit my refactoring.

MemStoreFlusher.java#257 : Concurrent Splits & Compactions are fine.  The first thing a split request does is region.close().  This waits for region.writestate.compacting == 0, or all the compactions to finish.  Additionally, compactions prematurely interrupt on a  close request, so the split won't be stalled for long.

SplitRequest.java#34 : The whole setServer() stuff was hacked together [you can call region.getHRegionServer() in 0.89].  I tried to be consistent with CompactionRequest, but constructor is fine as well.

Store.java#632 : Note that the old store.compact() code was moved to store.requestCompaction().

CompactionRequest.java#48 : Feel free to change the comments for the class header.  You don't technically have to call CompactionRequest.run() to execute a compaction [see HRegion.compactStores()]; however you do need a CompactionRequest object for the Store to get the details about a compaction.

CompactionRequest.java#177 : The Store.filesCompacting variable and associated locking ensures that 2 compactions for the same Store will not have overlapping StoreFiles.  See the Collections.disjoint() check at Store.java#881. Currently, a user can independently add StoreFiles [e.g. bulk import], but not remove StoreFiles.  We would have to check some code here if the user was allowed to arbitrarily remove StoreFiles outside of a custom compaction algorithm.

> StoreFile Level Compaction Locking
> ----------------------------------
>
>                 Key: HBASE-3797
>                 URL: https://issues.apache.org/jira/browse/HBASE-3797
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Minor
>         Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major compactions clogging high-priority minor compactions.  However, there is still a problem here.  Since compactions are store-level, the store undergoing major compaction will have it's storefile count increase during the major.  We really need a way to allow multiple outstanding compactions per store.  compactSelection() should lock/reserve the files being used for compaction.  This will also allow us to know what we're going to compact when inserting into the CompactSplitThread and make more informed priority queueing decisions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira