Posted to user@hbase.apache.org by Steinmaurer Thomas <Th...@scch.at> on 2011/08/11 08:35:12 UTC

Using HTable.batch - Still room for performance improvements?

Hello,

 

Our test data generator client uses HTable.batch to transfer puts to the
server. Is there still room for performance improvements on the client
side? For example, I assume table-level auto-flush is off when running
batches? Are there any other client-API-side optimizations when working
with HTable.batch? We have already disabled writing to the WAL.

 

Thanks,

Thomas

 


Re: Using HTable.batch - Still room for performance improvements?

Posted by Doug Meil <do...@explorysmedical.com>.
One thing you might want to look at is HTableUtil.  It's on trunk, but you
can look at the source and port it to whatever version you are using.

We've found that region-sorting helps a lot by minimizing the number of
RegionServer (RS) calls in any given flush.
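
To illustrate the region-sorting idea, here is a minimal, self-contained
sketch. It is not the HTableUtil source. The class name, the
start-key-to-server map, and the use of String keys are all assumptions made
for the example (HBase itself compares row keys as raw bytes). It only shows
the grouping step: puts are bucketed by the server hosting their region, so
that each bucket can be sent together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of region-sorting; not part of the HBase API.
public class RegionBucketSketch {

    // Assumed stand-in for region metadata: region start key -> hosting server.
    // The first region starts at the empty key.
    private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

    public void addRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // A row belongs to the region with the greatest start key <= the row key.
    public String serverFor(String rowKey) {
        Map.Entry<String, String> region = startKeyToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row " + rowKey);
        }
        return region.getValue();
    }

    // Group row keys by hosting server, keeping arrival order inside each bucket.
    public Map<String, List<String>> bucketByServer(List<String> rowKeys) {
        Map<String, List<String>> buckets = new LinkedHashMap<>();
        for (String rowKey : rowKeys) {
            buckets.computeIfAbsent(serverFor(rowKey), s -> new ArrayList<>()).add(rowKey);
        }
        return buckets;
    }

    public static void main(String[] args) {
        RegionBucketSketch sketch = new RegionBucketSketch();
        sketch.addRegion("", "rs1");
        sketch.addRegion("h", "rs2");
        sketch.addRegion("p", "rs1");
        List<String> rows = Arrays.asList("zebra", "apple", "kiwi", "banana", "mango");
        // Each bucket would then be written in one go instead of interleaving servers.
        System.out.println(sketch.bucketByServer(rows));
    }
}
```

In a real client each bucket would be handed to the table as one list of
puts, so a flush touches one server at a time rather than all of them.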





On 8/11/11 5:57 PM, "Jean-Daniel Cryans" <jd...@apache.org> wrote:

>I'm sure you've already seen this but just to be sure, do read
>http://hbase.apache.org/book/perf.writing.html
>
>Auto-flush is always on unless you turn it off yourself. Using
>HTable.batch still respects this except that it flushes all the rows
>at the same time.
>
>If you have fat values to insert and you can compress them, I would
>recommend doing so client-side.
>
>J-D
>


Re: Using HTable.batch - Still room for performance improvements?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I'm sure you've already seen this but just to be sure, do read
http://hbase.apache.org/book/perf.writing.html

Auto-flush is always on unless you turn it off yourself. Using
HTable.batch still respects this except that it flushes all the rows
at the same time.
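
As a rough illustration of why this matters, the toy model below counts
flushes (round trips) under the three modes described. It is not HBase code:
the method names loosely mirror HTable, but the class, the put-count buffer
limit, and the "one round trip per flush" accounting are simplifying
assumptions. A real cluster issues at least one call per region server
involved, and the real write buffer is sized in bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of client-side write buffering; not the HBase client.
public class WriteBufferSketch {
    private boolean autoFlush = true;   // on unless the caller turns it off
    private final int bufferLimit;      // assumed limit, counted in puts rather than bytes
    private final List<String> buffer = new ArrayList<>();
    private int roundTrips = 0;

    public WriteBufferSketch(int bufferLimit) {
        this.bufferLimit = bufferLimit;
    }

    public void setAutoFlush(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    // With auto-flush on, every put is sent immediately; otherwise it is buffered
    // until the buffer is full or flushCommits() is called.
    public void put(String row) {
        buffer.add(row);
        if (autoFlush || buffer.size() >= bufferLimit) {
            flushCommits();
        }
    }

    // A batch sends all of its rows together, modelled here as one round trip.
    public void batch(List<String> rows) {
        if (!rows.isEmpty()) {
            roundTrips++;
        }
    }

    // Send whatever is buffered; an empty buffer costs nothing.
    public void flushCommits() {
        if (!buffer.isEmpty()) {
            roundTrips++;
            buffer.clear();
        }
    }

    public int getRoundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        WriteBufferSketch buffered = new WriteBufferSketch(100);
        buffered.setAutoFlush(false);
        for (int i = 0; i < 1000; i++) {
            buffered.put("row" + i);
        }
        buffered.flushCommits(); // push any remainder before finishing
        System.out.println("round trips with auto-flush off: " + buffered.getRoundTrips());
    }
}
```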

If you have fat values to insert and you can compress them, I would
recommend doing so client-side.
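
A minimal sketch of what client-side compression could look like, using only
java.util.zip. The class and method names are made up for the example, and
GZIP is just one possible codec. The compressed bytes would be what goes into
the put, and readers would have to decompress after the get.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for compressing cell values before they are written.
public class ValueCompressionSketch {

    // Compress a value on the client so fewer bytes cross the wire.
    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reverse step for the reading side.
    public static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A deliberately repetitive "fat" value; real data may compress less well.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("sensor=42;status=OK;");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println(raw.length + " bytes raw, " + packed.length + " bytes compressed");
    }
}
```

Whether this pays off depends on the data: small or already-compressed values
can end up larger after GZIP framing, so it is worth measuring first.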

J-D
