Posted to dev@hbase.apache.org by "Erik Holstad (JIRA)" <ji...@apache.org> on 2009/09/16 17:51:57 UTC

[jira] Created: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut
--------------------------------------------------------------------

                 Key: HBASE-1845
                 URL: https://issues.apache.org/jira/browse/HBASE-1845
             Project: Hadoop HBase
          Issue Type: New Feature
            Reporter: Erik Holstad
             Fix For: 0.21.0


I've started to create a general interface for doing these batch/multi calls and would like to get some input and thoughts about how we should handle this and what the protocol should look like.

First naive patch, coming soon.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Marc Limotte (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832619#action_12832619 ] 

Marc Limotte commented on HBASE-1845:
-------------------------------------

Ryan, 

Good point.  Maybe there is time to get it into 0.21?


> Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut
> --------------------------------------------------------------------
>
>                 Key: HBASE-1845
>                 URL: https://issues.apache.org/jira/browse/HBASE-1845
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Erik Holstad
>             Fix For: 0.21.0
>
>         Attachments: batch.patch, hbase-1845_0.20.3.patch, multi-v1.patch
>
>
> I've started to create a general interface for doing these batch/multi calls and would like to get some input and thoughts about how we should handle this and what the protocol should
> look like. 
> First naive patch, coming soon.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757546#action_12757546 ] 

ryan rawson commented on HBASE-1845:
------------------------------------

great, i haven't been looking at patches yet, too busy. much better approach, i look forward to seeing it speed up some of my jobs.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757528#action_12757528 ] 

ryan rawson commented on HBASE-1845:
------------------------------------

when we batch, the old code used to batch by region, which is not the most performant; instead we should batch by regionserver. the recovery might be a little more complex, since we have to work out whether we failed due to a split or due to regions moving.
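
As a rough illustration of batching by regionserver rather than by region, here is a minimal, self-contained Java sketch. The class name, the locate function and the server names are all hypothetical stand-ins; nothing here is taken from the patch or from the HBase client API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: groups row keys by the server that currently hosts them.
class GroupByServerSketch {

    // locate stands in for a region lookup: row key -> server name (an assumption).
    static Map<String, List<String>> groupByServer(List<String> rows,
                                                   Function<String, String> locate) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        for (String row : rows) {
            byServer.computeIfAbsent(locate.apply(row), s -> new ArrayList<>()).add(row);
        }
        return byServer;
    }

    public static void main(String[] args) {
        // Toy layout: rows starting with a..m live on rs1, everything else on rs2.
        Function<String, String> locate = row -> row.charAt(0) <= 'm' ? "rs1" : "rs2";
        System.out.println(groupByServer(Arrays.asList("apple", "zebra", "mango", "pear"), locate));
    }
}
```

Each list could then go out as one call per server instead of one call per region. Working out why a call failed (split versus region move) is not covered by this sketch.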



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784631#action_12784631 ] 

Erik Holstad commented on HBASE-1845:
-------------------------------------

@ Ryan!
Sorry that I haven't been too active on this, but I have had some other stuff to work on.
But it would be great if you would test it out so I could get some feedback.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832103#action_12832103 ] 

Erik Holstad commented on HBASE-1845:
-------------------------------------

Marc, have at it, good luck and let me know if you have any questions.

Erik



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757558#action_12757558 ] 

Erik Holstad commented on HBASE-1845:
-------------------------------------

Will try to move the sorting into the server instead of the client so it will be done on smaller lists and in parallel on the individual servers.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757539#action_12757539 ] 

Jonathan Gray commented on HBASE-1845:
--------------------------------------

@ryan this groups by regionserver, and sends out to each in threads.  it definitely adds complexity to recovery, but is solvable and at least partially solved in the current patch, though not well tested yet.
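
A sketch of the "send out to each in threads" part, assuming the actions have already been grouped per regionserver. The send function is a hypothetical stand-in for the per-server RPC, and the error handling simply gives up; the recovery logic discussed here is deliberately left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiFunction;

// Hypothetical sketch: one task per server, results collected per server.
class ParallelDispatchSketch {

    // send stands in for the per-server call: (server, rows) -> rows accepted.
    static Map<String, Integer> dispatch(Map<String, List<String>> byServer,
                                         BiFunction<String, List<String>, Integer> send) {
        Map<String, Integer> accepted = new LinkedHashMap<>();
        if (byServer.isEmpty()) {
            return accepted; // a fixed pool cannot be created with zero threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(byServer.size());
        try {
            // Submit everything first so the calls overlap, then wait for each.
            Map<String, Future<Integer>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : byServer.entrySet()) {
                Callable<Integer> task = () -> send.apply(e.getKey(), e.getValue());
                pending.put(e.getKey(), pool.submit(task));
            }
            for (Map.Entry<String, Future<Integer>> p : pending.entrySet()) {
                accepted.put(p.getKey(), p.getValue().get());
            }
            return accepted;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a server", ex);
        } catch (ExecutionException ex) {
            // A real client would decide here what to retry; this sketch gives up.
            throw new IllegalStateException("a per-server call failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        byServer.put("rs1", Arrays.asList("apple", "mango"));
        byServer.put("rs2", Arrays.asList("zebra"));
        System.out.println(dispatch(byServer, (server, rows) -> rows.size()));
    }
}
```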



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774990#action_12774990 ] 

Andrew Purtell commented on HBASE-1845:
---------------------------------------

I assume that write buffering of single operations will wrap the new multi calls. Currently there is a wart with respect to the write buffer.

From a Trend dev team:

{quote}
When inserting rows into a table by calling the method public synchronized void put(final Put put), if the column family of one row does not exist, the insert operation will fail and throw NoSuchColumnFamilyException. We observed that all subsequent insert operations also fail even though all of them have valid column families. That is, one exception from an insert operation can cause all subsequent insert operations to fail.
{quote}

Their further analysis explains the scenario in detail, which I will summarize here:

1) An invalid put is added to the writeBuffer by put(Put put). It will trigger a NoSuchColumnFamilyException once it goes to the region server.

2) At some point, the buffer is flushed.

3) When the invalid put is processed, an exception is thrown. The finally clause of flushCommits() removes all successful puts from the write buffer list, but the failed put remains at the top.

4) Subsequent puts will add more entries to the write buffer, but the first entry on the list is invalid, so eventually every Put will throw an exception once the buffer limit is reached.

I don't see how the patch on this issue handles this. The invalid entries will be retried here over and over as well. 

A workaround with the current write buffering in HTable is for the client to call getWriteBuffer() and remove the entry at the head of the list manually. 
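
To make the scenario in steps 1-4 concrete, here is a toy model in Java. It is not HTable's code: each buffered put is reduced to its column family name, and the two flush methods are hypothetical. The first mimics a flush that leaves the failed put at the head of the buffer; the second shows one possible alternative policy, where entries that can never succeed are handed back to the caller instead of being retried.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a client-side write buffer; each entry is just a family name.
class WriteBufferSketch {

    // Sends entries in order and stops at the first invalid one, leaving it
    // (and everything queued behind it) in the buffer.
    static int flushStoppingAtFailure(List<String> buffer, Set<String> knownFamilies) {
        int sent = 0;
        while (!buffer.isEmpty() && knownFamilies.contains(buffer.get(0))) {
            buffer.remove(0);
            sent++;
        }
        return sent;
    }

    // Alternative policy (an assumption, not the patch): invalid entries are
    // removed from the buffer and returned rather than retried forever.
    static List<String> flushDroppingInvalid(List<String> buffer, Set<String> knownFamilies) {
        List<String> rejected = new ArrayList<>();
        for (String family : buffer) {
            if (!knownFamilies.contains(family)) {
                rejected.add(family);
            }
        }
        buffer.clear(); // valid entries count as sent, invalid ones are returned
        return rejected;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("cf"));
        List<String> buffer = new ArrayList<>(Arrays.asList("bogus", "cf", "cf"));
        System.out.println(flushStoppingAtFailure(buffer, known) + " sent, " + buffer.size() + " stuck");
        System.out.println("rejected: " + flushDroppingInvalid(buffer, known));
    }
}
```

With the first policy the valid puts queued behind the bad one never leave the buffer, which is the behaviour the Trend team reported.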



[jira] Updated: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Holstad updated HBASE-1845:
--------------------------------

    Attachment: batch.patch

Fixed some issues after comments from Jonathan, and moved most of the code into HConnectionManager so as not to pollute HTable too much. Renamed the call from multi to batch and fixed some bugs related to the grouping into HRS maps.



[jira] Updated: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Holstad updated HBASE-1845:
--------------------------------

    Attachment:     (was: batch.patch)



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775021#action_12775021 ] 

Erik Holstad commented on HBASE-1845:
-------------------------------------

@Andrew
I haven't worked on this for a little while, but the way I was thinking about it is that the failed inserts get returned separately from the successful ones and that only the failed ones are retried.
If the code that you have looked at doesn't do that, it is wrong. 

About the write buffer, I'm not sure what we want to do, since I think that we are going to be able to mix calls (gets, puts and deletes) in a single multi call. So we have to decide if this is something that makes sense and, in that case, maybe not use the write buffer. But like I said earlier, I haven't looked at the code recently, so I don't remember exactly.

Erik



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784027#action_12784027 ] 

ryan rawson commented on HBASE-1845:
------------------------------------

Hey guys any update on this patch?  Should I beta test it and see if I can shake out and hopefully fix some bugs?



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832114#action_12832114 ] 

ryan rawson commented on HBASE-1845:
------------------------------------

this won't be able to go into 0.20 for the same reason 2066 can't - it requires a version bump in the RPC. Even adding methods changes the on-wire protocol, alas.



[jira] Updated: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Holstad updated HBASE-1845:
--------------------------------

    Attachment: multi-v1.patch

Added a first draft of something that is a way to deal with batching of all types of actions, Deletes, Gets and Puts.

The patch includes a couple of different classes to deal with sending and receiving these multi calls. The calls to the different Region servers
are threaded on the client, but the instantiation is happening in the call at the moment; it will be moved so we don't have to do it every time.
The calls multi(List<Delete>), multi(List<Get>) and multi(List<Put>) all return the same type, Result[], of the same length
as the request to the specific RS. For all elements where Result[i] == null, the client has to retry the action. This common return type
adds a little bit of overhead for returns of Puts and Deletes but is needed to keep it simple and generic.
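
The null-slot convention can be sketched in a few lines of Java. This is only an illustration under stated assumptions: actions are plain strings, results are plain objects, and the helper name actionsToRetry is hypothetical rather than something from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: picks out the actions whose result slot came back null.
class RetryNullResultsSketch {

    static <A> List<A> actionsToRetry(List<A> actions, Object[] results) {
        if (actions.size() != results.length) {
            throw new IllegalArgumentException("results must line up with actions");
        }
        List<A> retry = new ArrayList<>();
        for (int i = 0; i < results.length; i++) {
            if (results[i] == null) {
                retry.add(actions.get(i)); // null slot: this action did not go through
            }
        }
        return retry;
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("put-a", "put-b", "put-c");
        Object[] results = {"ok", null, "ok"};
        System.out.println(actionsToRetry(actions, results));
    }
}
```

Because the result array has the same length as the request, the index alone is enough to map a failure back to its action.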

Changed the Row interface a little to be WritableComparable<Row>, needed for the new methods.

I have only done sanity check tests so far and am going to add more tests to cover corner cases, but just wanted
to get the first draft out there to get feedback.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757553#action_12757553 ] 

Erik Holstad commented on HBASE-1845:
-------------------------------------

@Stack 

Deprecating sounds good to me, will have the old calls use this new one behind the scenes if we decide to use this.

I would say that checking if the list is sorted will probably go pretty fast, usually, since it breaks at the first occurrence of an unsorted element, but I'm open to removing it. I would like to do
some timing tests to see if it is worth it or not before making the decision. 

Will add the check for one element
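
For reference, the two checks being discussed (stop at the first out-of-order element, and skip everything for lists of fewer than two elements) might look like the following Java sketch. The class and method names are hypothetical, and whether the pre-check pays off is exactly what the timing tests would need to show.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a sortedness pre-check with a small-list shortcut.
class SortCheckSketch {

    // Returns false as soon as one adjacent pair is out of order.
    static <T extends Comparable<? super T>> boolean isSorted(List<T> items) {
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i - 1).compareTo(items.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    // Returns the input itself when no work is needed, otherwise a sorted copy.
    static <T extends Comparable<? super T>> List<T> sortedIfNeeded(List<T> items) {
        if (items.size() < 2 || isSorted(items)) {
            return items;
        }
        List<T> copy = new ArrayList<>(items);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedIfNeeded(Arrays.asList(3, 1, 2)));
    }
}
```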

Like the name Results, but not really liking Rows but it is not a big deal to me either. Think that batch is kinda ok.

The reason that Row needs to be Writable is the same as for Filter. Will make it WritableComparable, so that will lead to Gets, Puts and Deletes just implementing Row. Not sure that this is good though, to hide those inside Row?

As the code is written now, you can mix and match all of the types into one batch, but I need to do more testing on this and we might need to extend the compare code to include something more than row.
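
One way "something more than row" could look is a comparator that orders by row key first and falls back to the action type for equal rows. The Java sketch below is purely illustrative: the Kind enum, the Action class and the DELETE < GET < PUT tie-break are all assumptions, not anything decided in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: ordering a mixed batch by row, then by action type.
class MixedBatchOrderSketch {

    enum Kind { DELETE, GET, PUT } // arbitrary tie-break order, chosen for the example

    static final class Action {
        final byte[] row;
        final Kind kind;

        Action(byte[] row, Kind kind) {
            this.row = row;
            this.kind = kind;
        }
    }

    // Lexicographic comparison of row keys, treating bytes as unsigned.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static final Comparator<Action> ORDER = (x, y) -> {
        int c = compareRows(x.row, y.row);
        return c != 0 ? c : x.kind.compareTo(y.kind);
    };

    public static void main(String[] args) {
        List<Action> batch = new ArrayList<>(Arrays.asList(
                new Action(new byte[] {2}, Kind.GET),
                new Action(new byte[] {1}, Kind.PUT),
                new Action(new byte[] {1}, Kind.DELETE)));
        batch.sort(ORDER);
        for (Action a : batch) {
            System.out.println(a.row[0] + " " + a.kind);
        }
    }
}
```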

Yeah, I guess that is what that means. Not really sure what would be the best alternative here, and whether we are looking to make HTable thread safe?

I have done some tests for recovery that are not in the patch, since they required modifying the actual code, but will try to figure something out so we can put it in a unit test.





[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832620#action_12832620 ] 

stack commented on HBASE-1845:
------------------------------

Marc: I'd aim for 0.21, yes.  Also, see hbase-2209 for more on what Kay Kay is on about.



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757577#action_12757577 ] 

stack commented on HBASE-1845:
------------------------------

HTable is not thread-safe, for updates anyways. Let's go w/ that.  Just make sure that changes down in HCM are.

Would be sweet if we didn't have to do all the testing by spinning up a minicluster.



[jira] Updated: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Marc Limotte (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marc Limotte updated HBASE-1845:
--------------------------------

    Attachment: hbase-1845_0.20.3.patch

After some discussion with Jonathan, it seems that Erik is no longer actively working on this patch.  I'd like to help out.

So far, I've updated the patch, so that it applies to hbase-0.20.3, and I've expanded some of the unit tests.  I have not yet implemented the other comments in this ticket and there's still some other clean up to do.

Also, I haven't looked in detail at HBASE-2066, but need to think about how it impacts this issue.  I'm not sure, yet, but I don't think it's a replacement of the MultiPut functionality here.  





[jira] Updated: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Erik Holstad (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Holstad updated HBASE-1845:
--------------------------------

    Attachment: batch.patch

Small changes



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757497#action_12757497 ] 

stack commented on HBASE-1845:
------------------------------

I think this patch looks great Holstad.

Here are some initial comments.

Deprecate the old means of batching in favor of your new method?  You can have the old methods call into your new method?

The check a list is sorted probably takes as long as actual sort?

If only one item in list, avoid sort?  Its an optimization we have elsehwere.

Multi should be renamed.  Name it Rows?  MultiResults should be renamed Results?  Says in class comment that Multi is about Gets but its more general than this?

Row has to be a Writable?  Is it OK making it WritableComparable?  I was going to make Row implement Comparable but thought that it'd narrow our ability to mix it in?  If Row is WritableComparable, then should change how Get, Put, etc. implement removing the Comparable and Writable and just use Row?

Can we batch Puts, Deletes, and Gets?  Or does the batch have to be pure -- all of one type?

What's the threading story?  HTable has a pool of executors used for batching.  So, this would be a continuation of HTable not being thread-safe.  It looks like HCM should be fine if used by many threads?

Testing of failure mid-put, with verification that we recover properly, would be nice, especially if it could be done in a unit test?



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839548#action_12839548 ] 

ryan rawson commented on HBASE-1845:
------------------------------------

You should work this against the branch for 0.20.4 - we will introduce HBASE-2219 to the branch and allow the addition of methods in patches from now on.  So aim for the branch and we can get it in for 0.20.4 - sounds good?



[jira] Commented: (HBASE-1845) Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut

Posted by "Marc Limotte (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839480#action_12839480 ] 

Marc Limotte commented on HBASE-1845:
-------------------------------------

Stack, thanks for the pointer on 2209.

To shoot for 0.21, I should work against the trunk?  Or is there a separate branch for what will be 0.21?


