Posted to dev@hbase.apache.org by Lars George <la...@worldlingo.com> on 2009/09/29 12:36:19 UTC

Use Delete with write-buffer in HTable

Hi Guys,

Is there a reason why Deletes are not also put on the write buffer like 
Puts and then flushed out? That way we would get an implicit batch delete 
(using the new batch delete calls internally, just like the batch put 
does), and MR jobs that output Puts or Deletes would automatically use 
the buffer for much better performance.

I recently had an MR job with 18K deletes, and since batch delete was 
not yet available it crashed constantly because of the huge number of 
RPC calls.
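
To make the RPC argument concrete, here is a minimal sketch of a client-side buffer that accepts both kinds of operation. It is not the HBase client API: every name in it (WriteBufferSketch, Mutation, flushThreshold, rpcCalls) is a hypothetical stand-in, the flush is triggered by an operation count rather than by a byte size, and each flush is assumed to cost exactly one round trip to a single server.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; this is NOT the HBase client API.
class WriteBufferSketch {
    /** Common type for anything the buffer can hold (assumption). */
    interface Mutation { String row(); }

    static final class Put implements Mutation {
        private final String row;
        Put(String row) { this.row = row; }
        public String row() { return row; }
    }

    static final class Delete implements Mutation {
        private final String row;
        Delete(String row) { this.row = row; }
        public String row() { return row; }
    }

    private final List<Mutation> writeBuffer = new ArrayList<>();
    private final int flushThreshold; // count-based for simplicity
    private int rpcCalls = 0;         // simulated round trips

    WriteBufferSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(Put p) { add(p); }

    void delete(Delete d) { add(d); }

    // Buffer the operation and flush once the threshold is reached.
    private void add(Mutation m) {
        writeBuffer.add(m);
        if (writeBuffer.size() >= flushThreshold) {
            flushCommits();
        }
    }

    // Send everything buffered so far as one simulated batch call.
    void flushCommits() {
        if (writeBuffer.isEmpty()) {
            return;
        }
        rpcCalls++;
        writeBuffer.clear();
    }

    int rpcCalls() { return rpcCalls; }

    public static void main(String[] args) {
        WriteBufferSketch table = new WriteBufferSketch(1000);
        for (int i = 0; i < 18_000; i++) {
            table.delete(new Delete("row-" + i));
        }
        table.flushCommits(); // flush whatever is left
        System.out.println("simulated RPC calls: " + table.rpcCalls());
    }
}
```

Under these assumptions the 18K deletes cost 18 simulated calls instead of 18,000. The real saving would depend on how the buffer is sized and on how many region servers the rows are spread across.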

Opinions?

Lars

Re: Use Delete with write-buffer in HTable

Posted by Lars George <la...@worldlingo.com>.
Hey Michael,

Sure, it would have to change to take Writables, but is that a big deal? 
The problem is that the batch delete is an explicit call. Using TOF 
(TableOutputFormat) does not make use of it, because TOF relies on the 
table to commit and flush. Hence my suggestion to do the same for 
Deletes, especially seeing how slow single deletes are when doing 
thousands of them.

You did not say yea or nay yet, if I read correctly?

Lars

stack wrote:
> The write buffer looks like this currently:
>
>   private final ArrayList<Put> writeBuffer = new ArrayList<Put>();
>
> ... so that would have to change.
>
> A batch delete was added to TRUNK and to head of the 0.20 branch recently
> FYI.
>
> HBASE-1845 is about cleaning up our batching and adding batch Get to the
> mix.
>
> St.Ack

Re: Use Delete with write-buffer in HTable

Posted by stack <st...@duboce.net>.
The write buffer looks like this currently:

  private final ArrayList<Put> writeBuffer = new ArrayList<Put>();

... so that would have to change.
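
One way the change could look, as a sketch only: the buffer holds a common operation type, and the flush splits it into consecutive runs of the same kind so that each run can go to the matching batch call. The names here (MixedBufferFlushSketch, Op, Kind, toBatches) are hypothetical and not part of HBase, and the sketch assumes that the relative order of Puts and Deletes in the buffer should be preserved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types exist in HBase.
class MixedBufferFlushSketch {
    enum Kind { PUT, DELETE }

    /** One buffered operation, tagged with its kind. */
    static final class Op {
        final Kind kind;
        final String row;
        Op(Kind kind, String row) {
            this.kind = kind;
            this.row = row;
        }
    }

    /**
     * Splits the buffer into consecutive runs of the same kind,
     * keeping the original order of operations.
     */
    static List<List<Op>> toBatches(List<Op> buffer) {
        List<List<Op>> batches = new ArrayList<>();
        List<Op> current = null;
        for (Op op : buffer) {
            // Start a new run whenever the kind changes.
            if (current == null || current.get(0).kind != op.kind) {
                current = new ArrayList<>();
                batches.add(current);
            }
            current.add(op);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Op> writeBuffer = new ArrayList<>();
        writeBuffer.add(new Op(Kind.PUT, "a"));
        writeBuffer.add(new Op(Kind.PUT, "b"));
        writeBuffer.add(new Op(Kind.DELETE, "a"));
        writeBuffer.add(new Op(Kind.PUT, "c"));
        for (List<Op> batch : toBatches(writeBuffer)) {
            System.out.println(batch.get(0).kind + " x" + batch.size());
        }
    }
}
```

If ordering between the two kinds did not matter, a simpler flush could collect all Puts into one batch and all Deletes into another, which would mean fewer calls for a heavily interleaved buffer.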

A batch delete was added to TRUNK and to head of the 0.20 branch recently
FYI.

HBASE-1845 is about cleaning up our batching and adding batch Get to the
mix.

St.Ack
