Posted to issues@hbase.apache.org by "Zhihong Ted Yu (JIRA)" <ji...@apache.org> on 2012/06/27 23:13:44 UTC

[jira] [Created] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Zhihong Ted Yu created HBASE-6284:
-------------------------------------

             Summary: Introduce HRegion#doMiniBatchDelete()
                 Key: HBASE-6284
                 URL: https://issues.apache.org/jira/browse/HBASE-6284
             Project: HBase
          Issue Type: Bug
            Reporter: Zhihong Ted Yu


From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':

HTable#delete(List<Delete>) groups the Deletes destined for the same RS and makes only one network call. Within the RS, however, there are still N delete calls on the region, one by one, which means N HLog writes and syncs. If these can also be grouped, we should get better performance for multi-row deletes.

I have added a new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete.
So far I have tested only on a one-node cluster. Even there I am getting a very promising performance boost.
Test setup: only one CF and qualifier.
10K rows deleted in total, in batches of 100 deletes. Only one thread was issuing deletes on the table.
With the new way the net time taken is reduced by more than 1/10.
I will also test on a 4-node cluster. I think this change is worth doing.
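
To make the expected saving concrete, here is a minimal sketch in plain Java, not HBase code. `ToyLog` is a hypothetical stand-in for the HLog that only counts syncs, and the method names are invented for illustration. It assumes the batched path appends every delete in a mini-batch and then syncs once, which is the grouping described above; the real patch may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy comparison of per-delete syncs versus one sync per mini-batch. */
public class MiniBatchDeleteSketch {

    /** Hypothetical stand-in for the HLog; it only records appends and counts syncs. */
    static final class ToyLog {
        final List<String> entries = new ArrayList<>();
        int syncCount = 0;

        void append(String rowKey) {
            entries.add("DELETE " + rowKey);
        }

        void sync() {
            syncCount++;
        }
    }

    /** Row-by-row path: one log append and one sync for every delete. */
    static int deleteOneByOne(List<String> rows) {
        ToyLog log = new ToyLog();
        for (String row : rows) {
            log.append(row);
            log.sync();
        }
        return log.syncCount;
    }

    /** Batched path: append a whole mini-batch, then sync once for that batch. */
    static int deleteInMiniBatches(List<String> rows, int batchSize) {
        ToyLog log = new ToyLog();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            for (String row : rows.subList(start, end)) {
                log.append(row);
            }
            log.sync();
        }
        return log.syncCount;
    }

    public static void main(String[] args) {
        // Same shape as the test described above: 10K rows, batches of 100.
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add("row-" + i);
        }
        System.out.println("one by one:     " + deleteOneByOne(rows) + " syncs");
        System.out.println("batches of 100: " + deleteInMiniBatches(rows, 100) + " syncs");
    }
}
```

The sketch counts syncs only; it says nothing about wall-clock time, which also depends on locking, memstore updates and the underlying file system.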

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403687#comment-13403687 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

https://reviews.apache.org/r/5654/
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405640#comment-13405640 ] 

Hudson commented on HBASE-6284:
-------------------------------

Integrated in HBase-TRUNK #3092 (See [https://builds.apache.org/job/HBase-TRUNK/3092/])
    HBASE-6284 HRegion#doMiniBatchMutation() (Anoop) (Revision 1356566)

     Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/OperationMetrics.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Status: Open  (was: Patch Available)
    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404698#comment-13404698 ] 

Hadoop QA commented on HBASE-6284:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12534145/HBASE-6284_Trunk-V2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.TestZooKeeper

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2306//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2306//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2306//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2306//console

This message is automatically generated.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428710#comment-13428710 ] 

Hudson commented on HBASE-6284:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #6 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/6/])
    HBASE-6284 Introduce HRegion#doMiniBatchMutation() (Anoop) (Revision 1360020)

     Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/OperationMetrics.java

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405487#comment-13405487 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

@Ted
bq.Can we buffer Appends and Increments so that mutates List contains certain amount of Mutate's ?

This would actually delay the calls for Appends and Increments, right? Suppose the List comes in as Put(r1), Put(r2), Increment(r1), Put(r1) [I don't know whether anyone uses it this way].
With the current Trunk code, since we try to maintain the sequence, the value for r1 (assuming only one CF and qualifier) will be the one from the last Put. But if we buffer the Increment and let the Puts happen first, the final value may come out different!
Basically we might lose the sequence. I am not sure whether this is the reason the code in trunk was changed, or which issue changed it; I am just thinking this may be the reason. What do you say? Please correct me if my understanding is wrong.
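
The ordering concern can be checked with a small plain-Java sketch. Nothing here is HBase API: `Op`, `Kind` and the cell values (10, 20, +1, 30) are hypothetical, and a Put is modelled as overwriting a single long value per row while an Increment adds to it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Shows that deferring Increments behind Puts can change the final cell value. */
public class MutationOrderSketch {

    enum Kind { PUT, INCREMENT }

    /** Hypothetical simplified mutation: one row, one long value. */
    static final class Op {
        final Kind kind;
        final String row;
        final long value;

        Op(Kind kind, String row, long value) {
            this.kind = kind;
            this.row = row;
            this.value = value;
        }
    }

    /** A Put overwrites the cell; an Increment adds to whatever is there. */
    static void apply(Map<String, Long> cells, Op op) {
        if (op.kind == Kind.PUT) {
            cells.put(op.row, op.value);
        } else {
            cells.merge(op.row, op.value, Long::sum);
        }
    }

    /** Applies the mutations strictly in the order the client sent them. */
    static Map<String, Long> applyInOrder(List<Op> ops) {
        Map<String, Long> cells = new HashMap<>();
        for (Op op : ops) {
            apply(cells, op);
        }
        return cells;
    }

    /** Applies all Puts first and the buffered Increments afterwards. */
    static Map<String, Long> applyPutsThenIncrements(List<Op> ops) {
        Map<String, Long> cells = new HashMap<>();
        for (Op op : ops) {
            if (op.kind == Kind.PUT) {
                apply(cells, op);
            }
        }
        for (Op op : ops) {
            if (op.kind == Kind.INCREMENT) {
                apply(cells, op);
            }
        }
        return cells;
    }

    /** The sequence from the comment: Put(r1), Put(r2), Increment(r1), Put(r1). */
    static List<Op> example() {
        return Arrays.asList(
            new Op(Kind.PUT, "r1", 10),
            new Op(Kind.PUT, "r2", 20),
            new Op(Kind.INCREMENT, "r1", 1),
            new Op(Kind.PUT, "r1", 30));
    }

    public static void main(String[] args) {
        System.out.println("in order: r1 = " + applyInOrder(example()).get("r1"));
        System.out.println("buffered: r1 = " + applyPutsThenIncrements(example()).get("r1"));
    }
}
```

In this toy model the in-order run leaves r1 at the value of the last Put (30), while the buffered run ends at 31 because the Increment lands after it.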
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284-trunk-suggest.txt, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412476#comment-13412476 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

I mean that public methods in HRegion can be called from coprocessors on the RS side.

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405491#comment-13405491 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

Looks like patch v3 keeps the existing semantics in a simple way.
Optimization should group operations by row and column family. Let's leave that to a future JIRA.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404892#comment-13404892 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

bq.Consider maintaining two variables: one for puts, one for deletions.

You mean there should be two time variables, and within the loop we update the appropriate one in different places depending on the type of Mutation?
Like below:
{code}
if (isPutMutation) {
  // Check the families in the put. If bad, skip this one.
  checkFamilies(familyMap.keySet());
  checkTimestamps(mutation.getFamilyMap(), now);
  // update the put net time
} else {
  prepareDelete((Delete) mutation);
  // update delete net time
}
{code}
Similarly
{code}
if (mutation instanceof Put) {
  updateKVTimestamps(familyMaps[i].values(), byteNow);
  noOfPuts++;
} else {
  prepareDeleteTimestamps(familyMaps[i], byteNow);
  noOfDeletes++;
}
{code}
Further down in this method we would need to differentiate the type of mutation where we apply KVs to the memstore and write to the WAL. Also, the WAL sync is a single operation, and that sync may cover some Puts and some Deletes. How do we get exact numbers here? I felt the code would become more complex. Do you have any suggestions?
I think I understood your comment correctly. Please correct me if I am wrong.
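
One possible way to handle the shared sync, offered only as an illustration and not something this thread settles on: time the batch once and apportion the elapsed time between puts and deletes in proportion to the `noOfPuts` and `noOfDeletes` counters. The class and method names below are hypothetical, and the split is an approximation, not an exact per-type measurement.

```java
/** Splits one measured batch time between puts and deletes by their counts. */
public class BatchTimingSketch {

    /**
     * Returns {putShare, deleteShare} in nanoseconds. The two shares always
     * add up to elapsedNanos; an empty batch yields {0, 0}.
     */
    static long[] splitByCount(long elapsedNanos, int noOfPuts, int noOfDeletes) {
        int total = noOfPuts + noOfDeletes;
        if (total == 0) {
            return new long[] {0L, 0L};
        }
        long putShare = elapsedNanos * noOfPuts / total;
        // Give the remainder to deletes so no time is lost to integer rounding.
        return new long[] {putShare, elapsedNanos - putShare};
    }

    public static void main(String[] args) {
        // Hypothetical batch: 1 ms total, 75 puts and 25 deletes.
        long[] shares = splitByCount(1_000_000L, 75, 25);
        System.out.println("put time (ns):    " + shares[0]);
        System.out.println("delete time (ns): " + shares[1]);
    }
}
```

This keeps the mini-batch loop simple at the cost of accuracy: a batch whose deletes are much cheaper than its puts would still be charged by count.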
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch

        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Attachment: HBASE-6284_Trunk.patch

Patch on Trunk for review.
Will give exact performance test result soon.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>         Attachments: HBASE-6284_Trunk.patch

        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404232#comment-13404232 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

I put a few comments on review board.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch

        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Attachment:     (was: 6284-trunk-suggest.txt)
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch

        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Component/s: regionserver
                 performance
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Summary: Introduce HRegion#doMiniBatchMutation()  (was: Introduce HRegion#doMiniBatchDelete())
    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404908#comment-13404908 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

I am fine with the approximation of metric computation in the current patch.
Other people's comments are welcome.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402808#comment-13402808 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

I will come up with a patch based on the 0.94 version soon.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Jeff Whiting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402656#comment-13402656 ] 

Jeff Whiting commented on HBASE-6284:
-------------------------------------

While adding this it is probably worth looking at HBASE-5941 and incorporating its performance improvements as well.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406256#comment-13406256 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

Addendum integrated to trunk.

Thanks Anoop.
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403683#comment-13403683 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

The patch was generated by svn, so choose hbase and specify / as Base Directory.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406252#comment-13406252 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

Attached the patch for the 0.94 version. Please review.

@Ted - There is a minor correction needed in the Javadoc for the trunk patch. I have attached an addendum for that. Can you take a look and integrate it, please?
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Hadoop Flags: Reviewed
          Status: Patch Available  (was: Open)
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6284:
---------------------------------

    Fix Version/s:     (was: 0.94.1)
                   0.94.2

Moving to 0.94.2.
Would also consider removing from 0.94 altogether. Thoughts?
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404694#comment-13404694 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

Okay to move to 0.94.2.
But with 0.94 being a performance release, I think it is better to have this change in 0.94.2 at least.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Assigned] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John reassigned HBASE-6284:
-------------------------------------

    Assignee: Anoop Sam John
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405788#comment-13405788 ] 

Hudson commented on HBASE-6284:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #78 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/78/])
    HBASE-6284 HRegion#doMiniBatchMutation() (Anoop) (Revision 1356566)

     Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/OperationMetrics.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404886#comment-13404886 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

I haven't tracked the source of the above code.
We should pay attention to the effect it may have on batching Puts/Deletes.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and make one n/w call only. But within the RS, there will be N number of delete calls on the region one by one. This will include N number of HLog write and sync. If this also can be grouped can we get better performance for the multi row delete.
> I have made the new miniBatchDelete () and made the HTable#delete(List<Delete>) to call this new batch delete.
> Just tested initially with the one node cluster.  In that itself I am getting a performance boost which is very much promising.
> Only one CF and qualifier.
> 10K total rows delete with a batch of 100 deletes. Only deletes happening on the table from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will worth doing this change.


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Attachment: HBASE-6284_94.patch
    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Closed] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl closed HBASE-6284.
--------------------------------

    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance, regionserver
>            Reporter: Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.94.1, 0.96.0
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404760#comment-13404760 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

The handling of:
{code}
  public MultiResponse multi(final RpcController controller,
      final MultiRequest request) throws ServiceException {
{code}
is correct. However, the actions in a MultiAction might be interleaved such that the Puts and Deletes are separated by Appends and Increments.
The following code may then result in frequent calls to doBatchOp(), each with only a few mutations:
{code}
              if (type != MutateType.PUT && type != MutateType.DELETE) {
                if (!mutates.isEmpty()) {
                  doBatchOp(builder, region, mutates);
                  mutates.clear();
{code}
Can we buffer the Appends and Increments so that the mutates List accumulates a larger number of Mutates before doBatchOp() is called?
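
To illustrate the concern, here is a minimal, self-contained sketch (not the actual patch) of how flushing at every Append/Increment boundary fragments the batches. MutateType mirrors the enum named in the snippet above; BatchFragmentation and countBatchCalls are hypothetical names made up for this illustration, and the counter merely stands in for a call to doBatchOp().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchFragmentation {
  // Hypothetical stand-in for the MutateType enum referenced in the thread.
  enum MutateType { PUT, DELETE, APPEND, INCREMENT }

  // Counts how many times a doBatchOp()-style flush would run when the
  // pending Put/Delete list is flushed at every Append/Increment boundary.
  static int countBatchCalls(List<MutateType> actions) {
    List<MutateType> mutates = new ArrayList<>();
    int batchCalls = 0;
    for (MutateType type : actions) {
      if (type != MutateType.PUT && type != MutateType.DELETE) {
        if (!mutates.isEmpty()) {
          batchCalls++;      // stands in for doBatchOp(builder, region, mutates)
          mutates.clear();
        }
        // The Append/Increment itself would be executed individually here.
      } else {
        mutates.add(type);
      }
    }
    if (!mutates.isEmpty()) {
      batchCalls++;          // flush whatever is left at the end
    }
    return batchCalls;
  }

  public static void main(String[] args) {
    List<MutateType> grouped = Arrays.asList(
        MutateType.PUT, MutateType.PUT, MutateType.DELETE, MutateType.DELETE,
        MutateType.APPEND, MutateType.INCREMENT);
    List<MutateType> interleaved = Arrays.asList(
        MutateType.PUT, MutateType.APPEND, MutateType.PUT, MutateType.INCREMENT,
        MutateType.DELETE, MutateType.APPEND, MutateType.DELETE);
    System.out.println("grouped: " + countBatchCalls(grouped));
    System.out.println("interleaved: " + countBatchCalls(interleaved));
  }
}
```

With the same four Puts/Deletes, the grouped order needs one batch call while the interleaved order needs four, which is the overhead that buffering the Appends and Increments would try to avoid.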
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Attachment: HBASE-6284_Trunk-V2.patch
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402809#comment-13402809 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

Please work out a trunk patch first.
Thanks
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Status: Patch Available  (was: Open)
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404750#comment-13404750 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

The patch produces the new API:
{code}
  public OperationStatus[] batchMutate(
      Pair<Mutation, Integer>[] mutationsAndLocks) throws IOException {
{code}
whose predecessor wasn't designed to handle Append and Increment.
{code}
  private long doMiniBatchMutation(
    BatchOperationInProgress<Pair<Mutation, Integer>> batchOp) throws IOException {
{code}
Please add javadoc above, explaining that only Put and Delete are currently handled.
An assertion that the Mutation is not an Append should also be added.
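
A guard of the kind requested here might look like the following sketch. The Mutation, Put, Delete and Append classes below are empty stand-ins defined only so that the example compiles on its own; they are not the HBase classes, and MiniBatchGuard and checkSupported are hypothetical names.

```java
class MiniBatchGuard {
  // Empty stand-ins for the HBase client types, for illustration only.
  static class Mutation {}
  static class Put extends Mutation {}
  static class Delete extends Mutation {}
  static class Append extends Mutation {}

  /**
   * Rejects any mutation other than Put and Delete, the only types the
   * mini-batch path is said to handle in this thread.
   */
  static void checkSupported(Mutation m) {
    if (!(m instanceof Put) && !(m instanceof Delete)) {
      throw new IllegalArgumentException(
          "Mini-batch supports only Put and Delete, got " + m.getClass().getSimpleName());
    }
  }

  public static void main(String[] args) {
    checkSupported(new Put());
    checkSupported(new Delete());
    try {
      checkSupported(new Append());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

An explicit exception is used here instead of a Java assert statement because asserts are disabled unless the JVM runs with -ea; which of the two fits better is a judgment call for the patch author.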
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Status: Open  (was: Patch Available)
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411318#comment-13411318 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

HRegion#put(Pair<Put, Integer>[] putsAndLocks)
This method was removed in trunk.
In the 0.94 patch I kept that method but marked it deprecated, since HRegion is now exposed to coprocessors (CPs).
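
As a rough illustration of that compatibility approach, the sketch below keeps an old Put-only entry point as a deprecated wrapper around a newer batch method. It is a simplified assumption, not the 0.94 patch: RegionSketch, its nested Mutation and Put types, the List-based signatures and the String statuses are all hypothetical, whereas the real methods take Pair arrays and return OperationStatus[].

```java
import java.util.ArrayList;
import java.util.List;

class RegionSketch {
  // Minimal stand-ins for the HBase mutation types, for illustration only.
  static class Mutation {
    final String row;
    Mutation(String row) { this.row = row; }
  }
  static class Put extends Mutation {
    Put(String row) { super(row); }
  }

  // Hypothetical simplified form of the new batch entry point.
  List<String> batchMutate(List<? extends Mutation> mutations) {
    List<String> statuses = new ArrayList<>();
    for (Mutation m : mutations) {
      statuses.add("SUCCESS:" + m.row);
    }
    return statuses;
  }

  /**
   * Old Put-only entry point, kept so that existing callers still compile.
   * @deprecated use {@link #batchMutate(List)} instead.
   */
  @Deprecated
  List<String> put(List<Put> puts) {
    return batchMutate(puts);
  }

  public static void main(String[] args) {
    RegionSketch region = new RegionSketch();
    List<Put> puts = new ArrayList<>();
    puts.add(new Put("row1"));
    puts.add(new Put("row2"));
    System.out.println(region.put(puts));
  }
}
```

Delegating rather than duplicating the logic means callers of the old method get the same behavior as callers of the new one until they migrate.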
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405594#comment-13405594 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

TestZooKeeper passed locally.

Patch v3 integrated to trunk.

Thanks for the patch, Anoop.

Please prepare a patch for 0.94.
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403535#comment-13403535 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

I went over the patch once.
It looks good.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403678#comment-13403678 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

Ted
To put the patch on Review Board, should I select the repository as hbase or hbase-git?

                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Attachment: 6284_Trunk-V3.patch
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403691#comment-13403691 ] 

Ted Yu commented on HBASE-6284:
-------------------------------

Do you have performance numbers?
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411251#comment-13411251 ] 

Lars Hofhansl commented on HBASE-6284:
--------------------------------------

Looks good to me.
In HBASE-5203 I made sure that other parts of HBase (Replication, Import/Export, etc.) can deal with multiple Puts/Deletes grouped in a single WAL entry, so this should be fine from that side.
Feel free to integrate into 0.94 now if you like (then change the fix version to 0.94.1).

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Jieshan Bean (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402846#comment-13402846 ] 

Jieshan Bean commented on HBASE-6284:
-------------------------------------

Sounds like a good improvement :)
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402873#comment-13402873 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

HBASE-5941 does almost the same thing I did in my testing: it batches the deletes at the region level, grabs as many row locks as possible, and performs those operations in one shot.
The write and sync to the WAL then happen just once for that shot.

0.94 and trunk already have mutateRowsWithLocks(), which does almost the same. The only difference is that it needs the locks on all the rows at once and performs all of those ops together. [Yes, it needs to.]
Our new method will instead work in mini-batches. At least one row lock (a batch of one Put/Delete) is needed.
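
A minimal sketch of the mini-batch idea described above, not the actual HRegion code. The names MiniBatchSketch, miniBatchSizes and contendedRows are hypothetical. The sketch assumes that the first row of each mini-batch is always taken (the caller would wait for its lock), that further rows are added only while their locks are free, and that each mini-batch costs one WAL write and sync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HBase.
class MiniBatchSketch {

    /**
     * Splits the rows of one batch request into mini-batches and returns the
     * size of each mini-batch. Under the assumption stated above, the length
     * of the returned list equals the number of WAL write+sync rounds.
     */
    static List<Integer> miniBatchSizes(List<String> rows, Set<String> contendedRows) {
        List<Integer> sizes = new ArrayList<>();
        int next = 0;
        while (next < rows.size()) {
            // The first row of a mini-batch is always taken: we would block for its lock.
            int acquired = 1;
            next++;
            // Keep grabbing locks while they are free; stop at the first contended row.
            while (next < rows.size() && !contendedRows.contains(rows.get(next))) {
                acquired++;
                next++;
            }
            sizes.add(acquired);
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        Set<String> contended = new HashSet<>(Collections.singletonList("r3"));
        List<Integer> sizes = miniBatchSizes(rows, contended);
        System.out.println("mini-batches: " + sizes + ", WAL syncs: " + sizes.size());
    }
}
```

With no contention the whole request forms a single mini-batch, while in the worst case every row becomes its own mini-batch, which matches the "at least one row lock" remark above.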
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403384#comment-13403384 ] 

Hadoop QA commented on HBASE-6284:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533845/HBASE-6284_Trunk.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//console

This message is automatically generated.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413415#comment-13413415 ] 

Hudson commented on HBASE-6284:
-------------------------------

Integrated in HBase-0.94-security #40 (See [https://builds.apache.org/job/HBase-0.94-security/40/])
    HBASE-6284 Introduce HRegion#doMiniBatchMutation() (Anoop) (Revision 1360020)

     Result = SUCCESS
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/OperationMetrics.java

                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404881#comment-13404881 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

@Ted
Do you know why the above code was added in trunk? It is not there in 0.94. When any other mutation type comes in the sequence, it does a batch op for the Puts (and now Deletes) collected so far.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404080#comment-13404080 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

Ran a performance comparison on a single-node cluster.
Table with a single region and 1000 rows in it.
Deleted all 1000 rows in batches of 100 rows at a time.
Test results are as follows:
Without patch : 240104 ms
With patch    : 2230 ms

Sorry, I could not run on a bigger cluster with more data, as some other NFT testing is underway. Will try to provide results by Monday.
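
To put the two timings above in proportion, here is a small arithmetic sketch. The class and method names are hypothetical; only the two millisecond figures and the row count come from the test described above.

```java
// Hypothetical helper for reading the benchmark numbers; not part of the patch.
class DeleteBenchmarkMath {

    /** Ratio of the old elapsed time to the new elapsed time. */
    static double speedup(long beforeMs, long afterMs) {
        return (double) beforeMs / afterMs;
    }

    /** Average elapsed time per deleted row, in milliseconds. */
    static double perRowMs(long totalMs, int rows) {
        return (double) totalMs / rows;
    }

    public static void main(String[] args) {
        long withoutPatchMs = 240104L; // reported time without the patch
        long withPatchMs = 2230L;      // reported time with the patch
        int rows = 1000;               // rows deleted in the test

        System.out.printf("speedup: %.1fx%n", speedup(withoutPatchMs, withPatchMs));
        System.out.printf("per row: %.2f ms -> %.2f ms%n",
                perRowMs(withoutPatchMs, rows), perRowMs(withPatchMs, rows));
    }
}
```

On these figures the patched run is roughly a hundred times faster, going from about 240 ms to about 2 ms per deleted row on this single-node setup.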


                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Attachment: 6284_Trunk-Addendum.patch
    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6284:
------------------------------------------

    Fix Version/s: 0.94.1
                   0.96.0
    
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411879#comment-13411879 ] 

Lars Hofhansl commented on HBASE-6284:
--------------------------------------

That's fine. What do you mean by "HRegion is exposed to CPs now"?
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Resolved] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-6284.
----------------------------------

    Resolution: Fixed
    
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404740#comment-13404740 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

w.r.t. computation of timeTakenForPuts:
bq. Consider maintaining two variables: one for puts, one for deletions.
Please address the above in v3.
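
The suggestion above could look roughly like the following sketch. It is not the code from the patch: the class MutationTimers, the Kind enum and the field timeTakenForDeletesMs are hypothetical names, and only timeTakenForPuts is mentioned in this thread.

```java
// Hypothetical sketch of keeping separate elapsed-time totals per mutation kind.
class MutationTimers {

    enum Kind { PUT, DELETE }

    private long timeTakenForPutsMs;    // accumulated time spent on puts
    private long timeTakenForDeletesMs; // accumulated time spent on deletes

    /** Adds the elapsed time of one batch to the total for its kind. */
    void record(Kind kind, long elapsedMs) {
        if (kind == Kind.PUT) {
            timeTakenForPutsMs += elapsedMs;
        } else {
            timeTakenForDeletesMs += elapsedMs;
        }
    }

    long putsMs() {
        return timeTakenForPutsMs;
    }

    long deletesMs() {
        return timeTakenForDeletesMs;
    }

    public static void main(String[] args) {
        MutationTimers timers = new MutationTimers();
        timers.record(Kind.PUT, 5);
        timers.record(Kind.DELETE, 7);
        timers.record(Kind.PUT, 3);
        System.out.println("puts=" + timers.putsMs() + " ms, deletes=" + timers.deletesMs() + " ms");
    }
}
```

Keeping the two totals apart means delete time is not folded into a put metric once both kinds go through the same batch path.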
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403279#comment-13403279 ] 

Zhihong Ted Yu commented on HBASE-6284:
---------------------------------------

Please put the patch on Review Board; it is of decent size.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: HBASE-6284_Trunk.patch


[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Attachment: 6284-trunk-suggest.txt

@Anoop:
What do you think of my suggested change to multi()?

I looked at the code in 0.94 and arrived at the suggested form.
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284-trunk-suggest.txt, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405191#comment-13405191 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

@Ted
{code}
if (type != MutateType.PUT) {
  if (!mutates.isEmpty()) {
    doBatchOp(builder, region, mutates);
    mutates.clear();
{code}
This is existing code, and it might result in many calls to doMiniBatchPut(). Can we take up the discussion on this point, and the fix, in another JIRA?
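
The concern about many batch calls can be illustrated with a small sketch. It is a simplified model, not the multi() code quoted above: BatchFlushSketch, countBatchCalls and the batchable set are hypothetical, and the model assumes that pending mutations are flushed as one batch whenever a non-batchable type arrives, plus once at the end.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "flush the pending batch when another type shows up".
class BatchFlushSketch {

    // Illustrative enum; only PUT is named in the snippet quoted above.
    enum MutateType { PUT, DELETE, APPEND }

    /** Counts how many batch calls a sequence of mutations would trigger. */
    static int countBatchCalls(List<MutateType> sequence, Set<MutateType> batchable) {
        int calls = 0;
        int pending = 0;
        for (MutateType type : sequence) {
            if (batchable.contains(type)) {
                pending++; // collect into the current batch
            } else if (pending > 0) {
                calls++;   // a different type arrived: flush what was collected
                pending = 0;
            }
        }
        if (pending > 0) {
            calls++;       // flush the tail of the sequence
        }
        return calls;
    }

    public static void main(String[] args) {
        List<MutateType> mixed = Arrays.asList(
                MutateType.PUT, MutateType.DELETE, MutateType.PUT,
                MutateType.DELETE, MutateType.PUT);
        System.out.println("only puts batchable:      "
                + countBatchCalls(mixed, EnumSet.of(MutateType.PUT)));
        System.out.println("puts + deletes batchable: "
                + countBatchCalls(mixed, EnumSet.of(MutateType.PUT, MutateType.DELETE)));
    }
}
```

In this model an alternating Put/Delete sequence degenerates into one batch call per Put when only Puts are batchable, and collapses to a single call once Deletes can join the same batch.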
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch


[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405601#comment-13405601 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

{quote}
Lars Hofhansl added a comment - 30/Jun/12 16:37
Moving to 0.94.2.
{quote}
Do we need to wait for Ted for the 0.94 commit? Anyway, I will prepare the patch and attach it to the JIRA.
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284_Trunk-V3.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>


        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6284:
----------------------------------

    Attachment: HBASE-6284_Trunk-V3.patch

Addressed Ted's comment
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>


        

[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6284:
----------------------------------

    Fix Version/s:     (was: 0.94.2)
                   0.94.1

Integrated to 0.94 branch as well.

Thanks for the review, Lars.
                
> Introduce HRegion#doMiniBatchMutation()
> ---------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.1
>
>         Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>


        

[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()

Posted by "Anoop Sam John (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402810#comment-13402810 ] 

Anoop Sam John commented on HBASE-6284:
---------------------------------------

Ok Ted.. I will do that... :)
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
