Posted to user@hbase.apache.org by JQ Hadoop <jq...@gmail.com> on 2009/12/01 03:17:53 UTC

Re: Durability of HBase

Could you point me to that message, or at least its title? I can't
seem to find it ...

Thanks,
-JQ

On Mon, Nov 30, 2009 at 4:36 PM, Berk D. Demir <bd...@mindcast.org> wrote:
> Short answer: yes, you can still lose data.
> It's an HDFS limitation, and HDFS-265 will solve it.
> A similar question came up on the list a couple of days ago;
> here's Ryan's answer:
> http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200911.mbox/browser
>
> On Mon, Nov 30, 2009 at 00:18, JQ Hadoop <jq...@gmail.com> wrote:
>> I have a question regarding the durability of HBase. After I have put
>> a record into HBase and have received confirmation of the put from the
>> region server (assuming both autoFlush and writeToWAL are set to
>> true), can I be sure that my record will be there no matter what? I've
>> checked the code and found that the region server first appends a
>> log record to the WAL (a SequenceFile) and then may call the sync()
>> function; however, it appears to me that this does not guarantee the
>> log record is durable in HDFS even after sync() has been called. That
>> is to say, in the case of a RegionServer crash, records already put
>> into HBase may be lost; am I missing anything?
>>
>> Thanks,
>> -JQ
>>
>
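
For reference, the client-side settings discussed in the quoted
question look roughly like this against the 0.20-era HBase client
API. This is a minimal sketch, not the actual HBase internals; the
table, column family, row, and value names are made-up examples.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class DurablePut {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "mytable");
    // Send each Put to the region server immediately instead of
    // buffering it in the client.
    table.setAutoFlush(true);

    Put put = new Put(Bytes.toBytes("row1"));
    // Append the edit to the write-ahead log before it is acked.
    put.setWriteToWAL(true);
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("qual"),
        Bytes.toBytes("value"));

    // Returns once the region server has acknowledged the edit.
    table.put(put);
  }
}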

Re: Durability of HBase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I think it was "newbie question on stopping Hbase".

So in 0.20 and prior, a WAL file was only really useful for recovery
once it had been closed (that's why we keep the files small, around
64MB each).
Data loss in the face of machine failure is real.

Hadoop 0.21, which includes the popular HDFS-265, gives us the
"hflush" feature: once hflush is called, the appended edits are
guaranteed to have been sent to all 3 replicas, so the data is
durably persisted (unless all 3 nodes die at around the same time).
Check out the latest HBase trunk with the 0.21 Hadoop branch to test
it, but I can already tell you that it works very, very well.
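
For anyone curious what this looks like at the HDFS level, the
write-and-flush pattern is roughly the following. This is a sketch
against the Hadoop 0.21 FileSystem API, not HBase's actual WAL code;
the file path and payload are made up.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HflushSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out =
        fs.create(new Path("/hbase/.logs/example-wal"));

    out.write("some edit".getBytes("UTF-8"));
    // hflush() returns only after the written bytes have reached
    // every datanode in the write pipeline, so the edit survives a
    // crash of the writer (here, the region server).
    out.hflush();

    out.close();
  }
}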

J-D

On Mon, Nov 30, 2009 at 6:17 PM, JQ Hadoop <jq...@gmail.com> wrote:
> Could you point me to that message, or at least its title? I can't
> seem to find it ...
>
> Thanks,
> -JQ
>
> On Mon, Nov 30, 2009 at 4:36 PM, Berk D. Demir <bd...@mindcast.org> wrote:
>> Short answer: yes, you can still lose data.
>> It's an HDFS limitation, and HDFS-265 will solve it.
>> A similar question came up on the list a couple of days ago;
>> here's Ryan's answer:
>> http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200911.mbox/browser
>>
>> On Mon, Nov 30, 2009 at 00:18, JQ Hadoop <jq...@gmail.com> wrote:
>>> I have a question regarding the durability of HBase. After I have put
>>> a record into HBase and have received confirmation of the put from the
>>> region server (assuming both autoFlush and writeToWAL are set to
>>> true), can I be sure that my record will be there no matter what? I've
>>> checked the code and found that the region server first appends a
>>> log record to the WAL (a SequenceFile) and then may call the sync()
>>> function; however, it appears to me that this does not guarantee the
>>> log record is durable in HDFS even after sync() has been called. That
>>> is to say, in the case of a RegionServer crash, records already put
>>> into HBase may be lost; am I missing anything?
>>>
>>> Thanks,
>>> -JQ
>>>
>>
>