Posted to dev@hbase.apache.org by Lars George <la...@worldlingo.com> on 2009/09/29 12:36:19 UTC
Use Delete with write-buffer in HTable
Hi Guys,
Is there a reason why Deletes are not also put on the write buffer like
Puts and then flushed out? That way we would get an implicit batch delete
(using the new batch delete calls internally, just like the batch put
does), and MR jobs that output Put or Delete would automatically use the
buffer for much better performance.
I recently had an MR job with 18K deletes, and since batch delete was not
yet available, it crashed constantly because of the huge number of RPC
calls.
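To put rough numbers on the RPC problem, here is a toy sketch. It does not use the HBase client; CountingTable, RpcCountDemo, and the batch size of 1,000 are all made-up stand-ins, and the "one round trip per batch" rule is a simplification (a real client may split a batch across servers).

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a table client; NOT the HBase API. It only counts round trips.
class CountingTable {
    int rpcCalls = 0;

    // One round trip per row, as with a single-delete call.
    void delete(String row) {
        rpcCalls++;
    }

    // One round trip per non-empty batch (assumption made for illustration).
    void delete(List<String> rows) {
        if (!rows.isEmpty()) {
            rpcCalls++;
        }
    }
}

class RpcCountDemo {
    // Issue every delete as its own call and report the round trips used.
    static int deleteOneByOne(List<String> rows) {
        CountingTable table = new CountingTable();
        for (String row : rows) {
            table.delete(row);
        }
        return table.rpcCalls;
    }

    // Send the deletes in chunks of batchSize, one call per chunk.
    static int deleteInBatches(List<String> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        CountingTable table = new CountingTable();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            table.delete(rows.subList(start, end));
        }
        return table.rpcCalls;
    }

    // Helper that builds n hypothetical row keys.
    static List<String> rows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add("row-" + i);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = rows(18_000);
        System.out.println("single deletes:  " + deleteOneByOne(rows) + " calls");
        System.out.println("batched deletes: " + deleteInBatches(rows, 1_000) + " calls");
    }
}
```

Under these assumptions the call count drops from one per delete to one per chunk, which is the saving a buffered Delete would give an MR job without the job having to call the batch API itself.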
Opinions?
Lars
Re: Use Delete with write-buffer in HTable
Posted by Lars George <la...@worldlingo.com>.
Hey Michael,
Sure, it would change to take Writables, but is that a big deal? The
problem is that the batch delete is an explicit call. Using TOF
(TableOutputFormat) does not make use of it, because TOF relies on the
table to commit and flush. Hence my suggestion to do the same for Deletes,
especially seeing how slow single deletes are when doing thousands of them.
You did not say yea or nay yet, if I read correctly?
Lars
stack wrote:
> The write buffer looks like this currently:
>
> private final ArrayList<Put> writeBuffer = new ArrayList<Put>();
>
> ... so that would have to change.
>
> A batch delete was added to TRUNK and to head of the 0.20 branch recently
> FYI.
>
> HBASE-1845 is about cleaning up our batching and adding batch Get to the
> mix.
>
> St.Ack
Re: Use Delete with write-buffer in HTable
Posted by stack <st...@duboce.net>.
The write buffer looks like this currently:
private final ArrayList<Put> writeBuffer = new ArrayList<Put>();
... so that would have to change.
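As a rough sketch of what that change could look like: the buffer is typed to a common supertype so it can hold both kinds of operation, and the flush sends each run of consecutive same-kind operations as one batch. Everything here (Op, PutOp, DeleteOp, BufferedTable, the count-based flush threshold, the sentBatches log) is a hypothetical stand-in for illustration, not HTable code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common supertype for buffered operations (not an HBase type).
interface Op {
    String row();
}

final class PutOp implements Op {
    private final String row;
    PutOp(String row) { this.row = row; }
    public String row() { return row; }
}

final class DeleteOp implements Op {
    private final String row;
    DeleteOp(String row) { this.row = row; }
    public String row() { return row; }
}

class BufferedTable {
    // The buffer now holds any Op instead of only Puts.
    private final List<Op> writeBuffer = new ArrayList<>();
    private final int flushThreshold;

    // Log of the batch calls a flush would make; stands in for real RPCs.
    final List<String> sentBatches = new ArrayList<>();

    BufferedTable(int flushThreshold) {
        if (flushThreshold <= 0) {
            throw new IllegalArgumentException("flushThreshold must be positive");
        }
        this.flushThreshold = flushThreshold;
    }

    void put(String row) { add(new PutOp(row)); }

    void delete(String row) { add(new DeleteOp(row)); }

    private void add(Op op) {
        writeBuffer.add(op);
        // Count-based threshold is a simplification chosen for this sketch.
        if (writeBuffer.size() >= flushThreshold) {
            flushCommits();
        }
    }

    // Send each run of consecutive same-kind ops as one batch, keeping the
    // original order between a Put and a later Delete.
    void flushCommits() {
        int start = 0;
        while (start < writeBuffer.size()) {
            boolean isDelete = writeBuffer.get(start) instanceof DeleteOp;
            int end = start + 1;
            while (end < writeBuffer.size()
                    && (writeBuffer.get(end) instanceof DeleteOp) == isDelete) {
                end++;
            }
            sentBatches.add((isDelete ? "delete x" : "put x") + (end - start));
            start = end;
        }
        writeBuffer.clear();
    }
}
```

Splitting on runs rather than sorting all Puts and all Deletes into two batches is a deliberate choice in this sketch: reordering could change the outcome when a Put and a Delete touch the same row.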
A batch delete was added to TRUNK and to head of the 0.20 branch recently
FYI.
HBASE-1845 is about cleaning up our batching and adding batch Get to the
mix.
St.Ack