Posted to user@hbase.apache.org by Bruno Dumon <br...@outerthought.org> on 2010/01/26 10:42:33 UTC

Atomic update of a single row

Hi,

At various places I have read that row writes are atomic.

However, from a curious look at the code of the put method in
HRegion.java, it seems that the updates of a put operation are written
to the WAL one column family at a time. Is this understanding
correct? If so, would it be more accurate to say that writes are
actually atomic per column family within a row?

On a related note, it would be nice if one could do both put and
delete operations on one row in an atomic manner.

Thanks,

Bruno

-- 
Bruno Dumon
Outerthought
http://outerthought.org/

Re: Atomic update of a single row

Posted by Ram Kulbak <ra...@gmail.com>.
I think that the scanning logic was fixed in 0.20.3 (the memstore is now cloned).
It's actually GETs that are still not atomic; try running
TestHRegion.testWritesWhileGetting with numQualifiers increased to
1000.

Regards,
Yoram
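
To make "GETs are not atomic" concrete, here is a minimal, deterministic sketch of what a torn read looks like. It assumes a check in the spirit of the test named above: every qualifier in a put carries the same value, and a get is consistent only if it never returns a mix of old and new values. That description of the test is an assumption, not taken from the HBase source, and TornReadSketch, snapshotMidPut and isConsistent are hypothetical names.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a get that lands in the middle of a put; not HBase code. */
class TornReadSketch {

    /**
     * Builds the row a reader would see if it ran after only the first
     * cellsWritten of numQualifiers cells had been updated to newValue.
     */
    static Map<String, Integer> snapshotMidPut(int numQualifiers, int oldValue,
                                               int newValue, int cellsWritten) {
        Map<String, Integer> row = new TreeMap<>();
        for (int q = 0; q < numQualifiers; q++) {
            row.put("q" + q, q < cellsWritten ? newValue : oldValue);
        }
        return row;
    }

    /** A read is consistent only if every qualifier carries the same value. */
    static boolean isConsistent(Map<String, Integer> snapshot) {
        return new HashSet<>(snapshot.values()).size() <= 1;
    }
}
```

With 1000 qualifiers there are 999 intermediate states in which a reader could observe a mix, which is one plausible reason a larger numQualifiers makes the problem easier to reproduce.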

On Wed, Jan 27, 2010 at 8:48 AM, Ryan Rawson <ry...@gmail.com> wrote:
> Under scanners and log recovery there is no guarantee to row
> atomicity.  This is to be fixed in 0.21 when log recovery is now a
> real possibility (thanks to HDFS-0.21) and scanners need to be fixed
> since the current get code might be replaced with a 1 row scan call.
>
> -ryan
>
> On Tue, Jan 26, 2010 at 12:53 PM, Bruno Dumon <br...@outerthought.org> wrote:
>> The lock will in any case cause that writes don't happen concurrently.
>>
>> But if a region server were to die between the updates to two column
>> families of one row (that are done in one Put operation), would the
>> update then be partially applied?
>>
>> And that makes me also wonder: do these locks also apply to reads?
>> Thus, will all the updates to one row that are part of one Put
>> operation become visible 'atomically' to readers?
>>
>> Thanks for any clarification.
>>
>> Bruno.
>>
>> On Tue, Jan 26, 2010 at 8:02 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>>> In get and put inside HRegion we call that line
>>>
>>> Integer lid = getLock(lockid, row);
>>>
>>> Even if you don't provide a row lock, it will create one for you and
>>> do the locking stuff. That happens before everything else, so is it
>>> fair to say that row reads are atomic?
>>>
>>> J-D
>>>
>>> On Tue, Jan 26, 2010 at 1:42 AM, Bruno Dumon <br...@outerthought.org> wrote:
>>>> Hi,
>>>>
>>>> At various places I have read that row writes are atomic.
>>>>
>>>> However, from a curious look at the code of the put method in
>>>> HRegion.java, it seems like the updates of a put operation are written
>>>> to the WAL only for one column family at a time. Is this understanding
>>>> correct, so would it be more correct to say that the writes are
>>>> actually atomic per column family within a row?
>>>>
>>>> On a related note, it would be nice if one could do both put and
>>>> delete operations on one row in an atomic manner.
>>>>
>>>> Thanks,
>>>>
>>>> Bruno
>>>>
>>
>

Re: Atomic update of a single row

Posted by Ryan Rawson <ry...@gmail.com>.
With scanners and during log recovery there is no guarantee of row
atomicity.  This is to be fixed in 0.21, where log recovery becomes a
real possibility (thanks to HDFS-0.21); scanners also need to be fixed,
since the current get code might be replaced with a one-row scan call.

-ryan

On Tue, Jan 26, 2010 at 12:53 PM, Bruno Dumon <br...@outerthought.org> wrote:
> The lock will in any case cause that writes don't happen concurrently.
>
> But if a region server were to die between the updates to two column
> families of one row (that are done in one Put operation), would the
> update then be partially applied?
>
> And that makes me also wonder: do these locks also apply to reads?
> Thus, will all the updates to one row that are part of one Put
> operation become visible 'atomically' to readers?
>
> Thanks for any clarification.
>
> Bruno.
>
> On Tue, Jan 26, 2010 at 8:02 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>> In get and put inside HRegion we call that line
>>
>> Integer lid = getLock(lockid, row);
>>
>> Even if you don't provide a row lock, it will create one for you and
>> do the locking stuff. That happens before everything else, so is it
>> fair to say that row reads are atomic?
>>
>> J-D
>>
>> On Tue, Jan 26, 2010 at 1:42 AM, Bruno Dumon <br...@outerthought.org> wrote:
>>> Hi,
>>>
>>> At various places I have read that row writes are atomic.
>>>
>>> However, from a curious look at the code of the put method in
>>> HRegion.java, it seems like the updates of a put operation are written
>>> to the WAL only for one column family at a time. Is this understanding
>>> correct, so would it be more correct to say that the writes are
>>> actually atomic per column family within a row?
>>>
>>> On a related note, it would be nice if one could do both put and
>>> delete operations on one row in an atomic manner.
>>>
>>> Thanks,
>>>
>>> Bruno
>>>
>

Re: Atomic update of a single row

Posted by Bruno Dumon <br...@outerthought.org>.
In any case, the lock ensures that writes to the same row don't happen concurrently.

But if a region server were to die between the updates to two column
families of one row (that are done in one Put operation), would the
update then be partially applied?

And that makes me also wonder: do these locks also apply to reads?
Thus, will all the updates to one row that are part of one Put
operation become visible 'atomically' to readers?

Thanks for any clarification.

Bruno.
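
The crash scenario in the question can be illustrated with a toy write-ahead log. This is only a sketch of the failure mode under the question's own premise (one log record per column family); it does not reproduce HBase's actual WAL format or recovery code, and WalSketch, logPerFamily, logWholeRow and replay are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy write-ahead log; illustrates the failure mode only, not HBase's WAL. */
class WalSketch {

    /** One durable log record holding cells keyed by "family:qualifier". */
    static final class Entry {
        final Map<String, String> cells;
        Entry(Map<String, String> cells) { this.cells = new LinkedHashMap<>(cells); }
    }

    /** Logs one record per family; the server "dies" once crashAfter records are written. */
    static List<Entry> logPerFamily(Map<String, Map<String, String>> putByFamily, int crashAfter) {
        List<Entry> wal = new ArrayList<>();
        for (Map<String, String> familyCells : putByFamily.values()) {
            if (wal.size() == crashAfter) break; // simulated crash between families
            wal.add(new Entry(familyCells));
        }
        return wal;
    }

    /** Logs the whole put as a single record: it is either in the log or it is not. */
    static List<Entry> logWholeRow(Map<String, Map<String, String>> putByFamily, boolean crashBeforeAppend) {
        List<Entry> wal = new ArrayList<>();
        if (crashBeforeAppend) return wal;
        Map<String, String> all = new LinkedHashMap<>();
        for (Map<String, String> familyCells : putByFamily.values()) all.putAll(familyCells);
        wal.add(new Entry(all));
        return wal;
    }

    /** Replays the surviving records to rebuild the row after a restart. */
    static Map<String, String> replay(List<Entry> wal) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Entry e : wal) row.putAll(e.cells);
        return row;
    }
}
```

For a put touching families cf1 and cf2, replay(logPerFamily(put, 1)) rebuilds a row containing only the cf1 cells, i.e. a partially applied update, whereas logWholeRow can only yield the full put or nothing.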

On Tue, Jan 26, 2010 at 8:02 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> In get and put inside HRegion we call that line
>
> Integer lid = getLock(lockid, row);
>
> Even if you don't provide a row lock, it will create one for you and
> do the locking stuff. That happens before everything else, so is it
> fair to say that row reads are atomic?
>
> J-D
>
> On Tue, Jan 26, 2010 at 1:42 AM, Bruno Dumon <br...@outerthought.org> wrote:
>> Hi,
>>
>> At various places I have read that row writes are atomic.
>>
>> However, from a curious look at the code of the put method in
>> HRegion.java, it seems like the updates of a put operation are written
>> to the WAL only for one column family at a time. Is this understanding
>> correct, so would it be more correct to say that the writes are
>> actually atomic per column family within a row?
>>
>> On a related note, it would be nice if one could do both put and
>> delete operations on one row in an atomic manner.
>>
>> Thanks,
>>
>> Bruno
>>

Re: Atomic update of a single row

Posted by Jean-Daniel Cryans <jd...@apache.org>.
In both get and put inside HRegion, we call this line:

Integer lid = getLock(lockid, row);

Even if you don't provide a row lock, it will create one for you and
do the locking stuff. That happens before everything else, so is it
fair to say that row reads are atomic?

J-D
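
The pattern described above (take a per-row lock before doing anything else, creating it if the caller did not supply one) can be sketched roughly as follows. This is a simplified stand-in using java.util.concurrent, not the HRegion implementation; RowLockSketch and its methods are hypothetical names. In this toy, reads also take the lock, which is the very property the other replies in this thread call into question for HBase itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of implicit row locking; not HBase code. */
class RowLockSketch {
    // One lock per row key, created lazily the first time the row is touched.
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    // row key -> ("family:qualifier" -> value)
    private final Map<String, Map<String, String>> rows = new ConcurrentHashMap<>();

    /** Applies every column of one put while holding that row's lock. */
    void put(String row, Map<String, String> columns) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock(); // acquired before any cell is modified
        try {
            rows.computeIfAbsent(row, r -> new HashMap<>()).putAll(columns);
        } finally {
            lock.unlock();
        }
    }

    /** Copies the row under the same lock, so this toy never returns half a put. */
    Map<String, String> get(String row) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            Map<String, String> cells = rows.get(row);
            return cells == null ? Collections.<String, String>emptyMap() : new HashMap<>(cells);
        } finally {
            lock.unlock();
        }
    }
}
```

Note that a lock like this only serializes access in memory; it says nothing about what survives a crash between two log writes.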

On Tue, Jan 26, 2010 at 1:42 AM, Bruno Dumon <br...@outerthought.org> wrote:
> Hi,
>
> At various places I have read that row writes are atomic.
>
> However, from a curious look at the code of the put method in
> HRegion.java, it seems like the updates of a put operation are written
> to the WAL only for one column family at a time. Is this understanding
> correct, so would it be more correct to say that the writes are
> actually atomic per column family within a row?
>
> On a related note, it would be nice if one could do both put and
> delete operations on one row in an atomic manner.
>
> Thanks,
>
> Bruno
>
> --
> Bruno Dumon
> Outerthought
> http://outerthought.org/
>