Posted to dev@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2011/08/11 20:30:02 UTC

About the new HBA.flush behavior in 0.90

Hey devs,

I'd like your opinion on the new way HBA.flush works. It used to
contact the master, which issued the flush calls to every RS, where
they were all queued. Now HBA calls every RS for every region (so if
you have 2000k regions in a table, it's that many RPCs) and the
flushing is done inline, meaning that in situations like mine the call
has now been running for more than an hour.

While it's nice to be able to tell if everything is truly flushed, the
current function doesn't give feedback on its progress so you don't
even know how far you are.

(oh my flush is done, took 1h25min)

So what do people think? Should we have both a flush and a flushAsync
command, like we have for creating tables? The latter would ideally
queue all the flushes instead of doing them inline, which would also
require a new public HRS method.
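
To make the flush / flushAsync split concrete, here is a minimal sketch in
plain Java. It is not HBase code: FlushQueue, its methods, and the flushOne
callback are hypothetical names used only for illustration, and the queue is
modelled with a single-threaded executor rather than a real region server.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical sketch, not an HBase class.
final class FlushQueue implements AutoCloseable {
    // One worker thread: queued flushes run one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Stand-in for whatever actually flushes a single region.
    private final Consumer<String> flushOne;

    FlushQueue(Consumer<String> flushOne) {
        this.flushOne = flushOne;
    }

    /** Queues a flush for every region and returns without waiting for them. */
    CompletableFuture<Void> flushAsync(List<String> regionNames) {
        CompletableFuture<?>[] pending = regionNames.stream()
                .map(r -> CompletableFuture.runAsync(() -> flushOne.accept(r), worker))
                .toArray(CompletableFuture[]::new);
        // Completes once every queued flush has completed.
        return CompletableFuture.allOf(pending);
    }

    /** Blocking variant: returns only after every region has been flushed. */
    void flush(List<String> regionNames) {
        flushAsync(regionNames).join();
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

In this sketch flush is just flushAsync followed by a wait, so both calls
share one code path and the caller chooses whether to block.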

Also we could optimize how it works right now by adding some
parallelization while keeping the current guarantees.
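
A rough sketch of that parallelization idea, in plain Java and under stated
assumptions: ServerFlusher and ParallelFlush are hypothetical names, not HBase
APIs, and the per-server call stands in for whatever RPC the client would
make. Each server is flushed on its own worker thread, and the method returns
only after all of them finish, so "everything is flushed on return" still
holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical per-server flush call; not an HBase interface.
interface ServerFlusher {
    void flushRegions(String serverName, List<String> regionNames) throws Exception;
}

final class ParallelFlush {
    private ParallelFlush() {}

    /**
     * Flushes every server's regions concurrently and returns the number of
     * regions flushed. Blocks until all servers are done; a failed flush
     * surfaces as an ExecutionException.
     */
    static int flushAll(Map<String, List<String>> regionsByServer,
                        ServerFlusher flusher,
                        int threads) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : regionsByServer.entrySet()) {
                tasks.add(() -> {
                    flusher.flushRegions(e.getKey(), e.getValue());
                    return e.getValue().size();
                });
            }
            int flushed = 0;
            // invokeAll waits for every task to complete, normally or not.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                flushed += f.get(); // rethrows a task failure
            }
            return flushed;
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned count is one cheap way to report how much work was done; a
per-server callback would be needed for live progress.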

J-D

Re: About the new HBA.flush behavior in 0.90

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Thanks Ted, I created https://issues.apache.org/jira/browse/HBASE-4198

J-D

On Thu, Aug 11, 2011 at 11:50 AM, Ted Yu <yu...@gmail.com> wrote:
> I think we can
> 1. introduce flushRegions() for region server
> 2. batch HRegionInfo's per server in HBA.flush() to call the above new API
>
> An asynchronous flushAsync() may be useful as well.
>

Re: About the new HBA.flush behavior in 0.90

Posted by Ted Yu <yu...@gmail.com>.
I think we can
1. introduce flushRegions() for region server
2. batch HRegionInfo's per server in HBA.flush() to call the above new API

An asynchronous flushAsync() may be useful as well.
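
The batching in step 2 could look roughly like the sketch below. It is plain
Java with hypothetical stand-in types (RegionLocation and FlushBatching are
not HBase classes): regions are grouped by hosting server, so the client
could make one flushRegions() call per server instead of one RPC per region.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a region plus the server hosting it.
final class RegionLocation {
    final String regionName;
    final String serverName;

    RegionLocation(String regionName, String serverName) {
        this.regionName = regionName;
        this.serverName = serverName;
    }
}

final class FlushBatching {
    private FlushBatching() {}

    /** Groups region names by hosting server, keeping first-seen order. */
    static Map<String, List<String>> groupByServer(List<RegionLocation> locations) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (RegionLocation loc : locations) {
            batches.computeIfAbsent(loc.serverName, k -> new ArrayList<>())
                   .add(loc.regionName);
        }
        return batches;
    }
}
```

With this grouping, the number of client calls scales with the number of
region servers rather than the number of regions in the table.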
