Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/03/23 02:18:05 UTC

[jira] [Created] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Option to Exclude Bulk Import Files from Minor Compaction
---------------------------------------------------------

                 Key: HBASE-3690
                 URL: https://issues.apache.org/jira/browse/HBASE-3690
             Project: HBase
          Issue Type: Bug
          Components: regionserver
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg
            Priority: Minor
             Fix For: 0.92.0


We ran an incremental scrape with HFileOutputFormat and encountered major compaction storms.  This is caused by the bug in HBASE-3404.  The permanent fix is a little tricky without HBASE-2856.  We realized that a quicker solution for avoiding these compaction storms is to simply exclude bulk import files from minor compactions and let them only be handled by time-based major compactions.  Add this functionality along with a config option to enable it.
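
To make the proposal concrete, here is a minimal sketch of the selection rule it describes. It is not taken from the attached patch: the class, the FileInfo type, and the method name are hypothetical stand-ins, and real store-file selection in HBase also weighs file sizes and counts, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; these types are hypothetical and are not HBase classes.
final class CompactionSelectionSketch {

    /** Minimal stand-in for a store file: a name plus the exclusion flag. */
    static final class FileInfo {
        final String name;
        final boolean excludeFromMinorCompaction;

        FileInfo(String name, boolean excludeFromMinorCompaction) {
            this.name = name;
            this.excludeFromMinorCompaction = excludeFromMinorCompaction;
        }
    }

    /**
     * Returns the names of the files eligible for a compaction.
     * A major compaction takes every file; a minor compaction skips flagged ones.
     */
    static List<String> candidates(List<FileInfo> files, boolean majorCompaction) {
        List<String> selected = new ArrayList<>();
        for (FileInfo f : files) {
            if (majorCompaction || !f.excludeFromMinorCompaction) {
                selected.add(f.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("flush-1", false));
        files.add(new FileInfo("bulk-1", true));   // bulk import file, flagged
        files.add(new FileInfo("flush-2", false));

        System.out.println(candidates(files, false)); // [flush-1, flush-2]
        System.out.println(candidates(files, true));  // [flush-1, bulk-1, flush-2]
    }
}
```

In this model the flagged bulk import file is only rewritten when a major compaction runs, which is the behavior the description asks for.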

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148031#comment-13148031 ] 

Phabricator commented on HBASE-3690:
------------------------------------

nspiegelberg has commented on the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

  @lhofhansl that's roughly how I changed it.  On a per-job basis, you specify the "hbase.mapreduce.hfileoutputformat.compaction.exclude" parameter.  The reduce phase then embeds in each HFile produced by that job a flag saying whether to exclude it from minor compaction.  I put it in the HFile generation stage instead of the import stage so I wouldn't have to deal with a second file to persist this metadata.
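
  A minimal sketch of the flow described above, under stated assumptions: plain maps stand in for the job configuration and for the HFile's metadata, the metadata key name is hypothetical (the thread does not show the name the patch uses), and the default of "false" when the parameter is unset is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: plain maps stand in for the job configuration and
// for an HFile's metadata. Only the configuration key comes from the comment above.
final class BulkLoadFlagSketch {

    static final String EXCLUDE_CONF_KEY =
        "hbase.mapreduce.hfileoutputformat.compaction.exclude";

    // Hypothetical metadata key; the name used by the actual patch is not shown here.
    static final String EXCLUDE_META_KEY = "EXCLUDE_FROM_MINOR_COMPACTION";

    /** Write side: copy the per-job setting into the metadata of a generated file. */
    static Map<String, String> buildFileMetadata(Map<String, String> jobConf) {
        // Assumption: an unset parameter means "do not exclude".
        boolean exclude = Boolean.parseBoolean(
            jobConf.getOrDefault(EXCLUDE_CONF_KEY, "false"));
        Map<String, String> meta = new HashMap<>();
        meta.put(EXCLUDE_META_KEY, Boolean.toString(exclude));
        return meta;
    }

    /** Read side: decide from the file's own metadata, with no global setting involved. */
    static boolean excludedFromMinorCompaction(Map<String, String> fileMeta) {
        return Boolean.parseBoolean(fileMeta.getOrDefault(EXCLUDE_META_KEY, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put(EXCLUDE_CONF_KEY, "true");

        Map<String, String> meta = buildFileMetadata(jobConf);
        System.out.println(excludedFromMinorCompaction(meta)); // true

        Map<String, String> defaultMeta = buildFileMetadata(new HashMap<>());
        System.out.println(excludedFromMinorCompaction(defaultMeta)); // false
    }
}
```

  Because the flag travels inside each file, two jobs loading into the same cluster can choose differently, and nothing besides the HFile itself has to be persisted.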

REVISION DETAIL
  https://reviews.facebook.net/D357

                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148061#comment-13148061 ] 

stack commented on HBASE-3690:
------------------------------

+1
                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147524#comment-13147524 ] 

Phabricator commented on HBASE-3690:
------------------------------------

dhruba has commented on the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

  +1, code looks good to me.

REVISION DETAIL
  https://reviews.facebook.net/D357

                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148039#comment-13148039 ] 

Phabricator commented on HBASE-3690:
------------------------------------

lhofhansl has commented on the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

  Ah yes, never mind me. The conf is the job config (not global HBase config).

REVISION DETAIL
  https://reviews.facebook.net/D357

                

[jira] [Updated] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Nicolas Spiegelberg (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-3690:
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.94.0
           Status: Resolved  (was: Patch Available)
    

[jira] [Updated] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-3690:
---------------------------------------

    Attachment: HBASE-3690.patch


[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009966#comment-13009966 ] 

Todd Lipcon commented on HBASE-3690:
------------------------------------

One thought - could this be done on a per-bulkload basis? i.e., when you write out the files from HFileOutputFormat (HFOF), you specify some kind of HFile meta tag saying whether to include them or not?

A cluster-wide setting doesn't seem like a good fit for mixed-use clusters, where some tables might get big bulk loads and others small increments.


[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148228#comment-13148228 ] 

Hudson commented on HBASE-3690:
-------------------------------

Integrated in HBase-TRUNK #2427 (See [https://builds.apache.org/job/HBase-TRUNK/2427/])
    HBASE-3690 Option to Exclude Bulk Import Files from Minor Compaction

Summary:
We ran an incremental scrape with HFileOutputFormat and
encountered major compaction storms. This is caused by the bug in
HBASE-3404. The permanent fix is a little tricky without HBASE-2856. We
realized that a quicker solution for avoiding these compaction storms is
to simply exclude bulk import files from minor compactions and let them
only be handled by time-based major compactions. Add this functionality
along with a config option to enable it.

Rewrote this feature to be done on a per-bulkload basis.

Test Plan:
 - mvn test -Dtest=TestHFileOutputFormat

DiffCamp Revision:

Reviewers: stack, Kannan, JIRA, dhruba

Reviewed By: stack

CC: dhruba, lhofhansl, nspiegelberg, stack

Differential Revision: 357

nspiegelberg : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java

                

[jira] [Updated] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-3690:
---------------------------------------

    Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148093#comment-13148093 ] 

Phabricator commented on HBASE-3690:
------------------------------------

stack has accepted the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

  +1

REVISION DETAIL
  https://reviews.facebook.net/D357

                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148115#comment-13148115 ] 

Phabricator commented on HBASE-3690:
------------------------------------

nspiegelberg has committed the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

REVISION DETAIL
  https://reviews.facebook.net/D357

COMMIT
  https://reviews.facebook.net/rHBASE1200621

                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147566#comment-13147566 ] 

Hadoop QA commented on HBASE-3690:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503168/D357.1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -164 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 52 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/222//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/222//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/222//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147976#comment-13147976 ] 

Phabricator commented on HBASE-3690:
------------------------------------

lhofhansl has commented on the revision "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".

  +1
  Looks good to me.

  What about Todd's suggestion about making this an option per bulk import?

REVISION DETAIL
  https://reviews.facebook.net/D357

                

[jira] [Updated] (HBASE-3690) Option to Exclude Bulk Import Files from Minor Compaction

Posted by "Phabricator (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-3690:
-------------------------------

    Attachment: D357.1.patch

nspiegelberg requested code review of "[jira] [HBASE-3690] Option to Exclude Bulk Import Files from Minor Compaction".
Reviewers: stack, Kannan, JIRA

  We ran an incremental scrape with HFileOutputFormat and
  encountered major compaction storms. This is caused by the bug in
  HBASE-3404. The permanent fix is a little tricky without HBASE-2856. We
  realized that a quicker solution for avoiding these compaction storms is
  to simply exclude bulk import files from minor compactions and let them
  only be handled by time-based major compactions. Add this functionality
  along with a config option to enable it.

  Rewrote this feature to be done on a per-bulkload basis.

TEST PLAN
   - mvn test -Dtest=TestHFileOutputFormat

  DiffCamp Revision:

REVISION DETAIL
  https://reviews.facebook.net/D357

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java

                