Posted to dev@hbase.apache.org by fazool mein <fa...@gmail.com> on 2011/05/19 17:02:09 UTC

Reading from client buffer

Hi,

I am going through the HBase code to understand its properties better.

There is something called a 'write buffer' on the client side. Say it is
enabled. Now, assume a client puts value v under key k, and immediately
reads k.

As I understand from the code, the put will be stored in the client-side
write buffer, while the read will go to the region server, returning an older
value instead of v.
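
The scenario above can be reproduced with a small toy model. This is only a
sketch under stated assumptions: ToyRegionServer and ToyBufferedClient are
hypothetical names invented for illustration and are not the HBase client
API. The model assumes a put only appends to an in-memory list and that a
read always goes to the server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT the HBase client API.
class ToyRegionServer {
    private final Map<String, String> store = new HashMap<>();

    void apply(String key, String value) { store.put(key, value); }

    String read(String key) { return store.get(key); }
}

class ToyBufferedClient {
    private final ToyRegionServer server;
    // Pending puts that have not been sent to the server yet.
    private final List<String[]> writeBuffer = new ArrayList<>();

    ToyBufferedClient(ToyRegionServer server) { this.server = server; }

    // Returns immediately; the edit only sits in client memory.
    void put(String key, String value) { writeBuffer.add(new String[] {key, value}); }

    // Reads bypass the buffer and go straight to the server.
    String get(String key) { return server.read(key); }

    // Sends all pending puts, in order, and empties the buffer.
    void flush() {
        for (String[] edit : writeBuffer) {
            server.apply(edit[0], edit[1]);
        }
        writeBuffer.clear();
    }
}
```

In this model, if the server already holds an older value for k, then
put("k", "v") followed by get("k") returns the older value, and get("k")
returns v only after flush() has been called.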

Doesn't this violate the ACID semantics (visibility in particular) of HBase
given at: http://hbase.apache.org/acid-semantics.html

<quote>

When a client receives a "success" response for any mutation, that mutation
is immediately visible to both that client and any client with whom it later
communicates through side channels.

</quote>

Thanks.

Regards

Re: Reading from client buffer

Posted by Stack <st...@duboce.net>.
On Thu, May 19, 2011 at 8:02 AM, fazool mein <fa...@gmail.com> wrote:
> As I understand from the code, the put will be stored in the client-side
> write buffer, while the read will go to the region server, returning an older
> value instead of v.
>

That is right.


> Doesn't this violate the ACID semantics (visibility in particular) of HBase
> given at: http://hbase.apache.org/acid-semantics.html
>
> <quote>
>
> When a client receives a "success" response for any mutation, that mutation
> is immediately visible to both that client and any client with whom it later
> communicates through side channels.
>
> </quote>
>

Please file an issue.

Thanks for noticing this.

St.Ack

Re: Reading from client buffer

Posted by tsuna <ts...@gmail.com>.
On Thu, May 19, 2011 at 8:02 AM, fazool mein <fa...@gmail.com> wrote:
> Doesn't this violate the ACID semantics (visibility in particular) of HBase
> given at: http://hbase.apache.org/acid-semantics.html
>
> <quote>
>
> When a client receives a "success" response for any mutation, that mutation
> is immediately visible to both that client and any client with whom it later
> communicates through side channels.
>
> </quote>

That's why asynchbase doesn't return a "success" response to the
client until the write has actually gone through and persisted
successfully.  But asynchbase can do this because it's asynchronous.
You can't have a useful write buffer that's synchronous, because it
would mean that as soon as you call put(), you get blocked until the
buffer is flushed.  Well you can always compensate by creating a
shitload of threads, but I personally don't wanna do that.
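
The idea of acknowledging a write only after it has gone through can be
sketched as follows. This is a toy model, not asynchbase: the class
ToyAsyncBufferedClient is a hypothetical name, and it uses
java.util.concurrent.CompletableFuture rather than asynchbase's own types.
The point it illustrates is that put() never blocks, yet the caller sees
"success" only once the buffered edit has been flushed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: hypothetical names, NOT the asynchbase API.
class ToyAsyncBufferedClient {
    // Stands in for the region server's state.
    private final Map<String, String> server = new ConcurrentHashMap<>();
    private final List<PendingPut> buffer = new ArrayList<>();

    private static final class PendingPut {
        final String key;
        final String value;
        // Completed only when the edit has been applied on the "server".
        final CompletableFuture<Void> ack = new CompletableFuture<>();

        PendingPut(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Never blocks: buffers the edit and returns a future that is not yet complete.
    synchronized CompletableFuture<Void> put(String key, String value) {
        PendingPut pending = new PendingPut(key, value);
        buffer.add(pending);
        return pending.ack;
    }

    // Applies each buffered edit, then signals "success" for that edit.
    synchronized void flush() {
        for (PendingPut pending : buffer) {
            server.put(pending.key, pending.value);
            pending.ack.complete(null);
        }
        buffer.clear();
    }

    String get(String key) { return server.get(key); }
}
```

A caller that needs read-your-writes can chain its read onto the returned
future (for example with thenRun) instead of reading right after put()
returns.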

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: Reading from client buffer

Posted by Fazool <fa...@gmail.com>.
Sure, will do.

On Fri, May 20, 2011 at 12:06 AM, Stack <st...@duboce.net> wrote:

> Mind filing an issue referring to this conversation Fazool?  Thanks.
>  St.Ack

Re: Reading from client buffer

Posted by Stack <st...@duboce.net>.
Mind filing an issue referring to this conversation Fazool?  Thanks.  St.Ack

On Thu, May 19, 2011 at 2:49 PM, Fazool <fa...@gmail.com> wrote:
> Agreed.

Re: Reading from client buffer

Posted by Fazool <fa...@gmail.com>.
Agreed.

On Thu, May 19, 2011 at 11:15 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> You would still have "confirmed" writes that may never get to the
> server, which comes back to my point that the buffer shouldn't be used
> in this case.
>
> J-D

Re: Reading from client buffer

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You would still have "confirmed" writes that may never get to the
server, which comes back to my point that the buffer shouldn't be used
in this case.

J-D

On Thu, May 19, 2011 at 1:56 PM, Fazool <fa...@gmail.com> wrote:
> Another way would be that when you read, check the write buffer. If it is a
> hit, flush the buffer, and then return the read.
>
> This way, bulk loads will still work, and occasionally, we might have a
> slower read.

Re: Reading from client buffer

Posted by Fazool <fa...@gmail.com>.
Another way would be that when you read, check the write buffer. If it is a
hit, flush the buffer, and then return the read.

This way, bulk loads will still work, and occasionally, we might have a
slower read.
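
The proposal above can be sketched as a toy model. The class
ToyFlushOnHitClient is a hypothetical name and is not the HBase client API;
the sketch also assumes, for simplicity, that the buffer keeps only the
latest pending value per key.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical names, NOT the HBase client API.
class ToyFlushOnHitClient {
    // Stands in for the region server's state.
    private final Map<String, String> server = new HashMap<>();
    // Pending puts; simplification: only the latest value per key is kept.
    private final Map<String, String> writeBuffer = new LinkedHashMap<>();
    // Counts flushes so the cost of a "hit" read is observable.
    int flushCount = 0;

    // Buffers the edit without contacting the server.
    void put(String key, String value) { writeBuffer.put(key, value); }

    // On a buffer hit, flush first so the read observes the pending put.
    String get(String key) {
        if (writeBuffer.containsKey(key)) {
            flush();
        }
        return server.get(key);
    }

    // Sends every pending put to the server and empties the buffer.
    void flush() {
        server.putAll(writeBuffer);
        writeBuffer.clear();
        flushCount++;
    }
}
```

Only reads that hit the buffer pay for a flush; other reads and all bulk
puts behave as before. In this model a put that is never read back still
sits in the buffer until some flush happens, so it is lost if the client
dies first.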


On Thu, May 19, 2011 at 9:14 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> The write buffer is a hack for faster write performance during bulk
> loads; no one should use it in a situation like the one you described.
>
> Even if the client were able to read from its own buffer, the edits
> would not have made it to the region server, so other clients wouldn't
> be able to see the new data either. Now suppose the client died before
> flushing: you would have been serving data that never actually
> existed!
>
> I think we should just fix the documentation.
>
> J-D

Re: Reading from client buffer

Posted by Jean-Daniel Cryans <jd...@apache.org>.
The write buffer is a hack for faster write performance during bulk
loads; no one should use it in a situation like the one you described.

Even if the client were able to read from its own buffer, the edits
would not have made it to the region server, so other clients wouldn't
be able to see the new data either. Now suppose the client died before
flushing: you would have been serving data that never actually
existed!

I think we should just fix the documentation.

J-D

On Thu, May 19, 2011 at 8:02 AM, fazool mein <fa...@gmail.com> wrote:
> Hi,
>
> I am going through the HBase code to understand its properties better.
>
> There is something called 'write buffer' on the client side. Say it is
> enabled. Now, assume a client puts value v under key k, and immediately
> reads k.
>
> As I understand from the code, the put will be stored in the client-side
> write buffer, while the read will go to the region server, returning an older
> value instead of v.
>
> Doesn't this violate the ACID semantics (visibility in particular) of HBase
> given at: http://hbase.apache.org/acid-semantics.html
>
> <quote>
>
> When a client receives a "success" response for any mutation, that mutation
> is immediately visible to both that client and any client with whom it later
> communicates through side channels.
>
> </quote>
>
> Thanks.
>
> Regards
>