Posted to user@hbase.apache.org by Aaron Zimmerman <az...@sproutsocial.com> on 2013/04/25 22:09:15 UTC

writing and reading from a region at once

Hi,

  If a region is being written to, and a scanner takes a lease out on the region, what will happen to the writes?  Is there a concept of "Transaction Isolation Levels"?   

  I don't see errors in Puts while the tables are being scanned, but it seems that I'm losing writes somewhere. Is it possible the writes could fail silently?
  
thanks,

Aaron Zimmerman


Re: writing and reading from a region at once

Posted by Anoop John <an...@gmail.com>.
> But it seems that I'm losing writes somewhere, is it possible the writes
> could fail silently

Which version are you using?  What makes you say the writes were lost
silently?  Did the read that was in progress not return the row you had
just written?  Or did you create a new scan afterwards, and the written
data is missing from that one as well?

-Anoop-


Re: writing and reading from a region at once

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Perfect - The readPoint / WriteNumber stuff looks like what I was looking for. 

thanks!

AZ



Re: writing and reading from a region at once

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Check this out
http://www.slideshare.net/cloudera/3-learning-h-base-internals-lars-hofhansl-salesforce-final

Regards
Ram



Re: writing and reading from a region at once

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Thanks J-D.  I'm interested in learning how HBase handles MVCC - do you know of any resources explaining it?  I have data streaming into HBase at around 6 million records a day, and I'm scanning the tables pretty much constantly with MapReduce jobs that roll up and store the data elsewhere.  So I want to understand better exactly which rows are present in a given scan.

If I create a scan that reads an entire region, I would expect a "read committed" level to lock that region during the read (which might take a few minutes).  So what happens to the rows that are inserted during the scan?

AZ




Re: writing and reading from a region at once

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Inline.

J-D


On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman <
azimmerman@sproutsocial.com> wrote:

> Hi,
>
>   If a region is being written to, and a scanner takes a lease out on the
> region, what will happen to the writes?  Is there a concept of "Transaction
> Isolation Levels"?
>

There's MVCC, so reads can happen while someone else is writing. What you
should expect from HBase is "read committed".
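
To make the MVCC idea concrete, here is a minimal toy sketch in Java. It is
not HBase's implementation; the class MvccSketch and all of its methods are
hypothetical names made up for illustration, and only the terms "read point"
and "write number" come from this thread. The assumption is that every write
is tagged with an increasing write number, and that a scanner only sees cells
whose write number is at or below the read point it captured when it was
opened. Writers are never blocked by the scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model of MVCC visibility. Hypothetical; NOT HBase's actual code. */
class MvccSketch {
    /** One stored cell, tagged with the write number that created it. */
    static final class Cell {
        final String row;
        final String value;
        final long writeNumber;

        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private final TreeSet<Long> pending = new TreeSet<>(); // writes begun but not completed
    private long nextWriteNumber = 1;
    private long readPoint = 0; // highest write number visible to new scanners

    /** Begin a write: the cell is stored but stays invisible until completed. */
    synchronized long beginWrite(String row, String value) {
        long writeNumber = nextWriteNumber++;
        pending.add(writeNumber);
        cells.add(new Cell(row, value, writeNumber));
        return writeNumber;
    }

    /** Complete a write; the read point only advances past fully completed writes. */
    synchronized void completeWrite(long writeNumber) {
        pending.remove(writeNumber);
        readPoint = pending.isEmpty() ? nextWriteNumber - 1 : pending.first() - 1;
    }

    /** Opening a scanner captures the current read point; it takes no lock. */
    synchronized long openScanner() {
        return readPoint;
    }

    /** Return only the cells that were committed at or before the scanner's read point. */
    synchronized List<String> scan(long scannerReadPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (c.writeNumber <= scannerReadPoint) {
                visible.add(c.row + "=" + c.value);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        MvccSketch region = new MvccSketch();
        region.completeWrite(region.beginWrite("row1", "a"));

        long scanner = region.openScanner();          // captures read point 1
        long w2 = region.beginWrite("row2", "b");     // write proceeds during the scan
        region.completeWrite(w2);

        System.out.println(region.scan(scanner));              // prints [row1=a]
        System.out.println(region.scan(region.openScanner())); // prints [row1=a, row2=b]
    }
}
```

In this model a write that lands during a scan is not lost; it is simply not
visible to the scanner that was already open, and a scanner opened afterwards
sees it.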


>
>   I don't see errors in Puts while the tables are being scanned, but it
> seems that I'm losing writes somewhere. Is it possible the writes could
> fail silently?
>

Is it temporary while you're scanning, or is data really missing at
the end of the day? The former might happen on some older HBase versions,
while the latter should never happen unless you lower the durability level
yourself and have machine failures.
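
The durability point can be sketched with a toy write-ahead log in Java. This
is an illustration only, under the assumption that "lowering the durability
level" means a write is acknowledged without first being recorded in a
durable log. WalSketch, its skipWal flag, and crashAndRecover are hypothetical
names, not HBase APIs, and the model ignores flushes of in-memory data to
disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a write-ahead log. Hypothetical; NOT HBase's actual write path. */
class WalSketch {
    private final List<String[]> wal = new ArrayList<>();   // stands in for the durable log
    private Map<String, String> memstore = new HashMap<>(); // in memory only, lost on a crash

    /** Apply a write. With skipWal=true the call still succeeds, but nothing is logged. */
    void put(String row, String value, boolean skipWal) {
        if (!skipWal) {
            wal.add(new String[] {row, value});
        }
        memstore.put(row, value);
    }

    /** Simulate a machine failure and recovery: memory is lost, then the log is replayed. */
    void crashAndRecover() {
        memstore = new HashMap<>();
        for (String[] entry : wal) {
            memstore.put(entry[0], entry[1]);
        }
    }

    String get(String row) {
        return memstore.get(row);
    }

    public static void main(String[] args) {
        WalSketch region = new WalSketch();
        region.put("row1", "a", false); // logged, then applied
        region.put("row2", "b", true);  // applied without logging; no error is raised

        region.crashAndRecover();

        System.out.println(region.get("row1")); // prints a
        System.out.println(region.get("row2")); // prints null
    }
}
```

In this model the only way a successful write disappears is the combination
named above: the log was skipped and the machine failed before the in-memory
data was persisted.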

J-D