You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2011/07/09 01:36:16 UTC

[jira] [Created] (HBASE-4081) Issues with HRegion.compactStores methods

Issues with HRegion.compactStores methods
-----------------------------------------

                 Key: HBASE-4081
                 URL: https://issues.apache.org/jira/browse/HBASE-4081
             Project: HBase
          Issue Type: Bug
          Components: regionserver
            Reporter: Ming Ma
            Assignee: Ming Ma


HRegion.java,

  byte [] compactStores(final boolean majorCompaction)
  throws IOException {
    if (majorCompaction) {
      this.triggerMajorCompaction();
    }
    return compactStores();
  }

  /**
   * Compact all the stores and return the split key of the first store that needs
   * to be split.
   */
  public byte[] compactStores() throws IOException {
    for(Store s : getStores().values()) {
      CompactionRequest cr = s.requestCompaction();
      if(cr != null) {
        try {
          compact(cr);
        } finally {
          s.finishRequest(cr);
        }
      }
      byte[] splitRow = s.checkSplit();
      if (splitRow != null) {
        return splitRow;
      }
    }
    return null;
  }

1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.

      // Because we compacted the source regions we should have no more than two
      // HStoreFiles per family and there will be no reference store
      if (srcFiles.size() == 2)


So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066308#comment-13066308 ] 

stack commented on HBASE-4081:
------------------------------

Your breakdown of the evolution of that section of cod sounds right to me Ming Ma.  Go for it.

> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HBASE-4081-trunk.patch
>
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma resolved HBASE-4081.
----------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
     Release Note: The fix has been checked in.
     Hadoop Flags: [Reviewed]

> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4081-trunk.patch
>
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062312#comment-13062312 ] 

stack commented on HBASE-4081:
------------------------------

@Ming Ma I see Ted's point (as long as the javadoc is updated) but I could also go your way.  As is the method is 'messy' and its method name misleads -- and if no one is using the split return anyways, then, it would make sense to clean it up (This code smells of an incomplete refactoring).

> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062255#comment-13062255 ] 

Ted Yu commented on HBASE-4081:
-------------------------------

It doesn't hurt to keep the return value.

According to javadoc:
{code}
return the split key of the first store that needs to be split
{code}
I think call to s.CheckSplit() should be kept. Instead of returning immediately, we can continue iterating Store's. If we want to honor the above contract, splitRow should be declared outside for loop and used to remember the split key of the first store.
Since no one is using the return value, we can return the split key of the last store and modify javadoc accordingly.


> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HBASE-4081:
---------------------------

    Attachment: HBASE-4081-trunk.patch

Thanks, Stack, Ted.

1. It looks like there was some code rewrite for compaction and split if we compare trunk and 0.90.2.
2. compactStores method used to the main method for compaction in 0.90.2. It returns split key to so that split can happen if necessary.
3. In trunk, it has been rewritten to use thread pool to do compaction and split. There is no need to return split key in normal compaction process. compactStores are there still to support synchronous compaction need by utilities and testing.

Given these, I basically go with the original proposals and fix it so that test code can get split key with a separate method. Separating compaction and split on method level seems to be cleaner.

> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HBASE-4081-trunk.patch
>
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4081) Issues with HRegion.compactStores methods

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066354#comment-13066354 ] 

Hudson commented on HBASE-4081:
-------------------------------

Integrated in HBase-TRUNK #2035 (See [https://builds.apache.org/job/HBase-TRUNK/2035/])
    HBASE-4081 Issues with HRegion.compactStores methods

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


> Issues with HRegion.compactStores methods
> -----------------------------------------
>
>                 Key: HBASE-4081
>                 URL: https://issues.apache.org/jira/browse/HBASE-4081
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HBASE-4081-trunk.patch
>
>
> HRegion.java,
>   byte [] compactStores(final boolean majorCompaction)
>   throws IOException {
>     if (majorCompaction) {
>       this.triggerMajorCompaction();
>     }
>     return compactStores();
>   }
>   /**
>    * Compact all the stores and return the split key of the first store that needs
>    * to be split.
>    */
>   public byte[] compactStores() throws IOException {
>     for(Store s : getStores().values()) {
>       CompactionRequest cr = s.requestCompaction();
>       if(cr != null) {
>         try {
>           compact(cr);
>         } finally {
>           s.finishRequest(cr);
>         }
>       }
>       byte[] splitRow = s.checkSplit();
>       if (splitRow != null) {
>         return splitRow;
>       }
>     }
>     return null;
>   }
> 1. It seems the second method's intention is to compact all the stores. However, if a store requires split, the process will stop.
> 2. Only MetaUtils, HRegion.merge, HRegion.processTable use these two methods. No caller uses the return value.
> 3. HRegion.merge expects major compaction for each store after the call and has code like below to check error condition.
>       // Because we compacted the source regions we should have no more than two
>       // HStoreFiles per family and there will be no reference store
>       if (srcFiles.size() == 2)
> So it seems like the fixes are: a) take out s.CheckSplit() call inside compactStores. b) make the return type "void" for these two compactStores functions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira