Posted to user@hbase.apache.org by Rod Cope <ro...@openlogic.com> on 2010/01/26 18:36:24 UTC
Problem with flushing and identical timestamps
Hi,

I'm seeing behavior on 0.20.2 and 0.20.3 that doesn't seem quite right and
would like to know if this is by design, a bug, or something I'm doing
wrong.

Background:

When I do a put that includes a timestamp like this (conceptually; I know
this is not the actual API), it works just fine.
put "table", "family", "column", "bbb", 12345

Then, if I do another put in the same client code using the same timestamp
like this...
put "table", "family", "column", "aaa", 12345

...and I create a scanner, grab a Result, and iterate over all values using
list(), I get this...
"table", "family", "column", "aaa", 12345

So far, so good. Now, if I truncate the table from the shell and run a new
program that does a flush() on the table between the two puts, but does it
in the same client program back-to-back, I also get the same results from
list().

-----

Problem:

Here's where the trouble starts. I truncate the table and run a new program
that puts "bbb", flushes the table, and quits. Here's what I get from
list():
"table", "family", "column", "bbb", 12345

Then I run another program that puts "aaa", flushes, and quits. Here's what
I get from list():
"table", "family", "column", "aaa", 12345
"table", "family", "column", "bbb", 12345

And if I then run a third program that puts "ccc", flushes, and quits, I get
this from list():
"table", "family", "column", "ccc", 12345
"table", "family", "column", "bbb", 12345
"table", "family", "column", "aaa", 12345

I'm getting three different values for identical
table/family/qualifier/timestamp tuples. Does this seem right? There also
doesn't seem to be a defined sort order, probably because the timestamps are
identical.

Also, if instead of using list(), I use getMap(), then I always only get a
single result. The single result is always the last item in the lists above
(i.e., "bbb" then "bbb" then "aaa"). I get identical results from using
getNoVersionMap().
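
The "last item wins" pattern is what one would expect from any map keyed by timestamp. The following is a minimal sketch of that effect in plain Java, not the HBase implementation: `Cell`, `TimestampCollapseDemo`, and `toVersionMap` are hypothetical names, and the assumption is that getMap() folds the cells returned by list() into a timestamp-keyed map in list order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified stand-in for one stored version of a column (not HBase's KeyValue).
final class Cell {
    final long timestamp;
    final String value;

    Cell(long timestamp, String value) {
        this.timestamp = timestamp;
        this.value = value;
    }
}

class TimestampCollapseDemo {
    // Hypothetical helper: fold the versions of one column into a map keyed
    // by timestamp, visiting the cells in the order they were listed.
    static NavigableMap<Long, String> toVersionMap(List<Cell> cells) {
        NavigableMap<Long, String> versions = new TreeMap<>();
        for (Cell c : cells) {
            // A map holds one value per key, so a later cell with the same
            // timestamp replaces the earlier one.
            versions.put(c.timestamp, c.value);
        }
        return versions;
    }

    public static void main(String[] args) {
        // Same order as the third list() output in the report above.
        List<Cell> listed = Arrays.asList(
                new Cell(12345L, "ccc"),
                new Cell(12345L, "bbb"),
                new Cell(12345L, "aaa"));
        NavigableMap<Long, String> versions = toVersionMap(listed);
        // Prints: 1 entry, value = aaa
        System.out.println(versions.size() + " entry, value = " + versions.get(12345L));
    }
}
```

Under that assumption, the value that survives depends only on the order of the listed cells, not on which put happened most recently.
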

I suspect that this same behavior could occur when HBase decides to flush on
its own, but I could be wrong. As you can imagine, this can cause problems
because clients can't know from the results of calling list() which value is
"right" or "newest". They also can't rely on getMap() or getNoVersionMap()
because the single result that gets returned is not necessarily "right" or
"newest".

I've reproduced everything above in a stand-alone installation and also with
a 7 regionserver cluster with the final 0.20.3. I started down this
debugging path originally because I ran into this problem on the 7
regionserver cluster with one table of 100+ regions. I was flushing
programmatically at the end of some large imports because I'm doing
setWriteToWAL(false) for load performance.

Am I doing something wrong? Did I miss an HBase assumption about flushing
and/or identical timestamps?

Any help would be much appreciated.

Thanks,
Rod
--
Rod Cope
CTO & Founder
OpenLogic, Inc.
Re: Problem with flushing and identical timestamps
Posted by Rod Cope <ro...@openlogic.com>.
Thanks for the extra details, St.Ack.
I mean quit my client. Let me know if any additional testing would help
narrow it down.
Rod
On 1/26/10 Tuesday, January 26, 2010 5:48 PM, "Stack" <st...@duboce.net>
wrote:
> Here's a bit more on the below:
> https://issues.apache.org/jira/browse/HBASE-1485.
>
> When you say "...puts "aaa", flushes, and quits", you mean quit your
> client or stop/start hbase?
>
> The getMap behavior seems off (Here's where you'd slot in
> hbase-1485?). I've added the below to hbase-1485 since it has nice
> detail.
>
> St.Ack
Re: Problem with flushing and identical timestamps
Posted by Stack <st...@duboce.net>.
Here's a bit more on the below:
https://issues.apache.org/jira/browse/HBASE-1485.
When you say "...puts "aaa", flushes, and quits", you mean quit your
client or stop/start hbase?
The getMap behavior seems off (Here's where you'd slot in
hbase-1485?). I've added the below to hbase-1485 since it has nice
detail.
St.Ack
Re: Problem with flushing and identical timestamps
Posted by Rod Cope <ro...@openlogic.com>.
Thanks, J-D. Especially for the incredibly quick response!
Rod
On 1/26/10 Tuesday, January 26, 2010 11:32 AM, "Jean-Daniel Cryans"
<jd...@apache.org> wrote:
> Rod,
>
> This is a known issue. It relates to HBASE-29 but IIRC there was a
> more specific jira about it.
>
> Basically the workaround is to not set the timestamp in a way that the
> same one could come twice like you described.
>
> J-D
Re: Problem with flushing and identical timestamps
Posted by Jean-Daniel Cryans <jd...@apache.org>.
Rod,
This is a known issue. It relates to HBASE-29 but IIRC there was a
more specific jira about it.
Basically the workaround is to not set the timestamp in a way that the
same one could come twice like you described.
J-D
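
The workaround amounts to making sure that no two puts to the same cell carry the same explicit timestamp. A minimal sketch of one way to do that on the client side, assuming a single writer process; `UniqueTimestamps` and its methods are hypothetical names and are not part of HBase:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hands out strictly increasing timestamps, even when the wall clock
// repeats a millisecond or moves backwards.
class UniqueTimestamps {
    private final AtomicLong last;

    // floor: the largest timestamp already used (0 if none is known).
    UniqueTimestamps(long floor) {
        this.last = new AtomicLong(floor);
    }

    // Returns the larger of the supplied clock reading and (previous + 1).
    long next(long wallClockMillis) {
        return last.updateAndGet(prev -> Math.max(prev + 1, wallClockMillis));
    }

    // Convenience overload that reads the system clock.
    long next() {
        return next(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        UniqueTimestamps ts = new UniqueTimestamps(0L);
        System.out.println(ts.next(12345L)); // 12345
        System.out.println(ts.next(12345L)); // 12346, same clock reading
        System.out.println(ts.next(12000L)); // 12347, clock went backwards
    }
}
```

This only guarantees uniqueness within one JVM. Separate programs run one after another, as in the original report, would have to persist or otherwise share the last value issued. The alternative is to leave the timestamp unset so that the server assigns it, which makes collisions less likely but does not rule them out within the same millisecond.
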
On Tue, Jan 26, 2010 at 9:36 AM, Rod Cope <ro...@openlogic.com> wrote:
> Hi,
>
> I¹m seeing behavior on 0.20.2 and 0.20.3 that doesn¹t seem quite right and
> would like to know if this is by design, a bug, or something I¹m doing
> wrong.
>
> Background:
>
> When I do a put that includes a timestamp like this (conceptually I know
> this is not the actual API), it works just fine.
> put ³table², ³family², ³column², ³bbb², 12345
>
> Then, if I do another put in the same client code using the same timestamp
> like this...
> put ³table², ³family², ³column², ³aaa², 12345
>
> ...and I create a scanner, grab a Result, and iterate over all values using
> list(), I get this...
> ³table², ³family², ³column², ³aaa², 12345
>
> So far, so good. Now, if I truncate the table from the shell and run a new
> program that does a flush() on the table between the two put¹s, but does it
> in the same client program back-to-back, I also get the same results from
> list().
>
> -----
>
> Problem:
>
> Here¹s where the trouble starts. I truncate the table and run a new program
> that puts ³bbb², flushes the table, and quits. Here¹s what I get from
> list():
> ³table², ³family², ³column², ³bbb², 12345
>
> Then I run another program that puts ³aaa², flushes, and quits. Here¹s what
> I get from list():
> ³table², ³family², ³column², ³aaa², 12345
> ³table², ³family², ³column², ³bbb², 12345
>
> And if I then run a third program that puts ³ccc², flushes, and quits, I get
> this from list():
> ³table², ³family², ³column², ³ccc², 12345
> ³table², ³family², ³column², ³bbb², 12345
> ³table², ³family², ³column², ³aaa², 12345
>
> I¹m getting three different values for identical
> table/family/qualifier/timestamp tuples. Does this seem right? There also
> doesn¹t seem to be a defined sort order, probably because the timestamps are
> identical.
>
> Also, if instead of using list(), I use getMap(), then I always only get a
> single result. The single result is always the last item in the lists above
> (i.e., ³bbb² then ³bbb² then ³aaa²). I get identical results from using
> getNoVersionMap().
>
> I suspect that this same behavior could occur when HBase decides to flush on
> its own, but I could be wrong. As you can imagine, this can cause problems
> because clients can¹t know from the results of calling list() which value is
> ³right² or ³newest². They also can¹t rely on getMap() or getNoVersionMap()
> because the single result that gets returned is not necessarily ³right² or
> ³newest².
>
> I¹ve reproduced everything above in a stand-alone installation and also with
> a 7 regionserver cluster with the final 0.20.3. I started down this
> debugging path originally because I ran into this problem on the 7
> regionserver cluster with one table of 100+ regions. I was flushing
> programmatically at the end of some large imports because I'm doing
> setWriteToWAL(false) for load performance.
>
> Am I doing something wrong? Did I miss an HBase assumption about flushing
> and/or identical timestamps?
>
> Any help would be much appreciated.
>
> Thanks,
> Rod
>
> --
>
> Rod Cope
> CTO & Founder
> OpenLogic, Inc.
>
>