Posted to user@hbase.apache.org by Toby White <to...@googlemail.com> on 2008/10/20 17:11:20 UTC

duplicated hbase timestamps

I'm seeing a strange effect on my hbase instance. Sometimes, on  
requesting the full history of a column, I get back individual cells  
several times over.

That is, I'm getting results like this:

base(main):006:0* get 'my_table', 'scw9npU7Q4ma_khXqlDGXg', {COLUMN =>  
'value:', VERSIONS=>4000}
timestamp=1224504133000, value=1013.0
timestamp=1224502749000, value=1012.0
timestamp=1224502749000, value=1012.0
timestamp=1224499880000, value=1011.0
timestamp=1224499880000, value=1011.0
timestamp=1224499880000, value=1011.0
timestamp=1224415961000, value=1010.0
timestamp=1224415961000, value=1010.0
timestamp=1224415961000, value=1010.0
timestamp=1224415701000, value=1009.0
timestamp=1224415701000, value=1009.0
timestamp=1224415701000, value=1009.0
timestamp=1224414200000, value=1008.0
timestamp=1224414200000, value=1008.0
timestamp=1224414200000, value=1008.0

This happens both through the hbase shell as shown here, and when  
communicating with the server via thrift.
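
For what it's worth, this is roughly how we read the column history over
Thrift (a minimal sketch only, assuming a Python client generated from the
0.18-era Hbase.thrift and the Thrift server on its default port; the
generated module names and the exact return type of getVer may differ from
what's shown here):

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase   # generated module; name depends on how the IDL was compiled

transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

# Ask for up to 4000 versions of 'value:' for this row, as in the shell example above.
for cell in client.getVer('my_table', 'scw9npU7Q4ma_khXqlDGXg', 'value:', 4000):
    # Depending on the IDL version this is either a raw value or a struct
    # carrying value and timestamp; either way the duplicates show up here too.
    print(cell)

transport.close()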

In either case, the cells are reported either as shown above, that is,
with each cell simply repeated several times (in this case, three), or
sometimes with the whole series repeated, something like this:

base(main):006:0* get 'golddigger', 'scw9npU7Q4ma_khXqlDGXg', {COLUMN  
=> 'value:', VERSIONS=>4000}
timestamp=1224504133000, value=1013.0
timestamp=1224502749000, value=1012.0
timestamp=1224499880000, value=1011.0
timestamp=1224415961000, value=1010.0
timestamp=1224415701000, value=1009.0
timestamp=1224414200000, value=1008.0
timestamp=1224504133000, value=1013.0
timestamp=1224502749000, value=1012.0
timestamp=1224499880000, value=1011.0
timestamp=1224415961000, value=1010.0
timestamp=1224415701000, value=1009.0
timestamp=1224414200000, value=1008.0

or sometimes a combination of both, i.e. an entire series, each cell
repeated a couple of times, and then the whole lot repeated again.

This doesn't happen with all rows, only some of them, apparently at
random. Sometimes, restarting HBase & the underlying HDFS makes the
problem go away; sometimes it doesn't, and the issue persists.

This is with hbase 0.18.0 on hadoop 0.18.1

Is this a known issue?

Re: duplicated hbase timestamps

Posted by Toby White <to...@googlemail.com>.
On 14 Dec 2008, at 22:41, stack wrote:

> The below looks like the known issue, HBASE-29 'HStore#get and  
> HStore#getFull may not return expected values by timestamp when  
> there is more than one MapFile'.  What do you think Toby?   
> Basically, if updates do not go in in chronological order, you'll  
> get unexpected results.  We need to fix this but first need to get  
> ourselves set up with some smarter internals before we can address  
> it (Though, that said, I'd think your particular case shouldn't be  
> that hard to make work).

Reviving this old thread - this issue is becoming a blocker for us; we
need the ability to wipe all existing cells associated with an
existing key, and rewrite a whole new set of cells at varying timestamps
under the same key. At the moment this isn't working reliably, and old
cells that should have been deleted are still cropping up, as discussed
previously.

You say it shouldn't be that hard to make work - how difficult is "not  
that hard"? Is this on anyone else's roadmap? Could you point me at  
the bits of code which are involved?

We could work around this by changing our schema; the schemas outlined  
in:

<http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200902.mbox/%3c810021d31dcf8151.499c6185@ina.fr%3e>

are relevant. We're not storing URLs, but our timestamp handling is  
basically case 2; we could use case 1 which ought not to suffer from  
the bug, since we can simply overwrite cells with new versions, albeit  
at the cost of rewriting a fair bit of our code. It seems a shame to  
throw away HBase's native timestamp handling, though.
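
To make that concrete, the sort of change it would mean looks roughly like
this (sketched against the same kind of Python Thrift client as before;
this is only my reading of "case 1", and the column naming is illustrative
rather than anything taken from that thread, so exact signatures may differ):

from hbase.ttypes import Mutation   # generated module; name may differ

def write_reading_now(client, table, row, ts, value):
    # What we do today (essentially "case 2"): one column, many versions,
    # with the timestamps supplied by us.
    client.mutateRowTs(table, row,
                       [Mutation(column='value:', value=str(value))], ts)

def write_reading_case1(client, table, row, ts, value):
    # The workaround: move the time dimension into the column qualifier, so a
    # write only ever overwrites the single latest version of its cell and we
    # never lean on HBase's own version history.
    client.mutateRow(table, row,
                     [Mutation(column='value:%d' % ts, value=str(value))])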

Toby

>
>
> St.Ack


Re: duplicated hbase timestamps

Posted by stack <st...@duboce.net>.
Toby White wrote:
> Thanks - yes the symptoms do look very similar to HBASE-29. However, 
> by my reading, that problem ought to go away after compaction; but in 
> this case it doesn't, after compaction the duplicate cells are all 
> still there. (I *think* the order in which they are reported sometimes 
> changes after compaction, but I can't tell reliably.) I might be 
> misunderstanding though.

On 'major' compactions (these happen every 24 hours by default and are 
different from 'minor' compactions, which are triggered by the count of 
files in the filesystem), we'll clear any cells beyond the designated 
MAX_VERSIONS and anything older than the designated TTL.  For the latter, 
it's the cell's timestamp/version that we use.  Another tangle is that 
'our' keys are made of row/column/timestamp and edits first go into a 
sorted Map.  If two edits have the same r/c/t, then the later one will 
overwrite the older (unless there have been flushes between the inserts 
-- have there been in your case?).
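
A toy illustration of that last point (plain Python, nothing to do with the
actual HStore code -- just a model of what I described above): if both edits
stay in the in-memory map the second one wins, but if a flush lands between
them, both survive in separate files and a naive reader that merges the
files hands back both:

memstore = {}        # in-memory edits, keyed by (row, column, timestamp)
flushed_files = []   # each flush writes the current map out as a new "MapFile"

def put(row, column, ts, value):
    memstore[(row, column, ts)] = value   # same r/c/t: the later edit overwrites the earlier

def flush():
    flushed_files.append(dict(memstore))
    memstore.clear()

def read_versions(row, column):
    # Naive reader: just collect what every flushed file plus memory holds.
    cells = []
    for store in flushed_files + [memstore]:
        cells += [(ts, v) for (r, c, ts), v in store.items()
                  if r == row and c == column]
    return sorted(cells, reverse=True)

put('rowA', 'value:', 1224414200000, '1008.0')
flush()                                          # first edit now lives in a file on disk
put('rowA', 'value:', 1224414200000, '1008.0')   # same r/c/t again, after the flush
print(read_versions('rowA', 'value:'))
# [(1224414200000, '1008.0'), (1224414200000, '1008.0')] -- the cell comes back twice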

I'm thinking each edit needs to carry two timestamps to solve HBASE-29: 
the user-designated one, and another that records the actual insert time.  
The latter would be used during major compactions when figuring cell TTL, 
and for distinguishing two edits of the same r/c/t so we don't overwrite 
older-vintage edits.
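
Very roughly, and only as a sketch of the idea rather than a design, the
key would grow an extra component, something like:

import itertools

insert_seq = itertools.count()   # stand-in for "actual insert time"
edits = {}                       # keyed by (row, column, user_ts, insert_time)

def put(row, column, user_ts, value):
    # The key carries the user-supplied timestamp *and* an insert-time
    # component, so re-inserting the same row/column/timestamp becomes a new
    # entry instead of an overwrite; compaction can then tell vintages apart.
    edits[(row, column, user_ts, next(insert_seq))] = value

put('rowA', 'value:', 1224414200000, '1008.0')
put('rowA', 'value:', 1224414200000, '1008.5')   # same r/c/t, later vintage, kept separately
print(sorted(edits))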

St.Ack

>
> Toby
>


Re: duplicated hbase timestamps

Posted by Toby White <to...@googlemail.com>.
Thanks - yes, the symptoms do look very similar to HBASE-29. However,  
by my reading, that problem ought to go away after compaction; but in  
this case it doesn't: after compaction the duplicate cells are all  
still there. (I *think* the order in which they are reported sometimes  
changes after compaction, but I can't tell reliably.) I might be  
misunderstanding, though.

Toby

On 14 Dec 2008, at 22:41, stack wrote:

> The below looks like the known issue, HBASE-29 'HStore#get and  
> HStore#getFull may not return expected values by timestamp when  
> there is more than one MapFile'.  What do you think Toby?   
> Basically, if updates do not go in in chronological order, you'll  
> get unexpected results.  We need to fix this but first need to get  
> ourselves set up with some smarter internals before we can address  
> it (Though, that said, I'd think your particular case shouldn't be  
> that hard to make work).
>
> St.Ack
>


Re: duplicated hbase timestamps

Posted by stack <st...@duboce.net>.
The below looks like the known issue, HBASE-29 'HStore#get and 
HStore#getFull may not return expected values by timestamp when there is 
more than one MapFile'.  What do you think Toby?  Basically, if updates 
do not go in in chronological order, you'll get unexpected results.  We 
need to fix this but first need to get ourselves set up with some 
smarter internals before we can address it (Though, that said, I'd think 
your particular case shouldn't be that hard to make work).

St.Ack

Toby White wrote:
> Sorry for the very slow response - local priorities changed and I 
> didn't have a chance to respond properly before.
>
> The issue described previously is still occurring (brief recap - hbase 
> is reporting cells with duplicate timestamps, see the quoted output 
> below.)
>
> I originally saw this with 0.18.0 - I've now checked, and I still see 
> it with 0.18.1 (both on hadoop 0.18.1) and current trunk:r725828 (with 
> hdfs upgraded to run on hadoop 0.19.0)
>
> This is running in pseudo-distributed mode.
>
> I've been able to narrow down the trigger a bit. I can't cause it to 
> happen entirely reproducibly, but it seems
> to occur only when I've done the following:
>
> * Create a row;
> * Add lots of data at different timestamps into one column (thrift 
> mutateRowTs or shell put)
> * Delete all data in that column, or indeed the entire row (thrift 
> deleteAll or deleteAllRow or shell deleteall)
> * at this point, hbase reports that the row has indeed been removed.
> * Recreate the row, and put data back into the same column, at the 
> same timestamps, but with potentially different values (thrift 
> mutateRowTs / shell put)
> * On reading the row, Hbase seems to see and report back both the 
> newly-added data, and the data previously deleted (thrift getVer / 
> shell get)
>
> On a row where this has happened once, it seems to happen almost 
> always thereafter, each time appending a whole new set of data. So, if 
> you're adding/removing 100 cells at a time, then the total number of 
> cells hbase reports back will grow by 100 every time you repeat the 
> cycle.
>
> On a row where it hasn't happened yet, the delete behaviour seems 
> usually correct.
>
> The problem is observable working either through the Python thrift 
> interface, or directly through the Hbase JRuby shell.
>
> The HDFS filesystem on which I'm observing this is fairly small - 
> under 100Mb compressed - I can forward it for debugging off list if 
> that's helpful. I'd be grateful for any help sorting this out.
>
> Toby
>


Re: duplicated hbase timestamps

Posted by Toby White <to...@googlemail.com>.
Sorry for the very slow response - local priorities changed and I  
didn't have a chance to respond properly before.

The issue described previously is still occurring (brief recap - hbase  
is reporting cells with duplicate timestamps, see the quoted output  
below.)

I originally saw this with 0.18.0 - I've now checked, and I still see  
it with 0.18.1 (both on hadoop 0.18.1) and current trunk:r725828 (with  
hdfs upgraded to run on hadoop 0.19.0)

This is running in pseudo-distributed mode.

I've been able to narrow down the trigger a bit. I can't cause it to  
happen entirely reproducibly, but it seems to occur only when I've done  
the following (a rough Thrift sketch of these steps follows the list):

* Create a row;
* Add lots of data at different timestamps into one column (thrift  
mutateRowTs or shell put);
* Delete all data in that column, or indeed the entire row (thrift  
deleteAll or deleteAllRow or shell deleteall);
* At this point, HBase reports that the row has indeed been removed;
* Recreate the row, and put data back into the same column, at the  
same timestamps, but with potentially different values (thrift  
mutateRowTs / shell put);
* On reading the row, HBase seems to see and report back both the  
newly-added data and the data previously deleted (thrift getVer /  
shell get).
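
As mentioned above, here is roughly what that cycle looks like through the
Python Thrift interface (a sketch only: it assumes the Thrift server on its
default port 9090 and a client generated from the 0.18-era Hbase.thrift, so
module names and exact signatures may not match yours):

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase                 # generated module; name may differ
from hbase.ttypes import Mutation

transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

TABLE, ROW, COL = 'my_table', 'scw9npU7Q4ma_khXqlDGXg', 'value:'
timestamps = [1224414200000, 1224415701000, 1224415961000]

# Create the row and add data at explicit timestamps.
for i, ts in enumerate(timestamps):
    client.mutateRowTs(TABLE, ROW, [Mutation(column=COL, value=str(1008 + i))], ts)

# Delete the whole row (deleteAll(TABLE, ROW, COL) on just the column behaves the same way).
client.deleteAllRow(TABLE, ROW)
# At this point a read reports the row as gone.

# Put data back at the *same* timestamps, possibly with different values.
for i, ts in enumerate(timestamps):
    client.mutateRowTs(TABLE, ROW, [Mutation(column=COL, value=str(2008 + i))], ts)

# Read the history back: on an affected row, both the new cells and the
# supposedly-deleted ones come back.
print(client.getVer(TABLE, ROW, COL, 4000))

transport.close()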

On a row where this has happened once, it seems to happen almost  
always thereafter, each time appending a whole new set of data. So, if  
you're adding/removing 100 cells at a time, then the total number of  
cells hbase reports back will grow by 100 every time you repeat the  
cycle.

On a row where it hasn't happened yet, the delete behaviour seems  
usually correct.

The problem is observable either through the Python Thrift interface, or  
directly through the HBase JRuby shell.

The HDFS filesystem on which I'm observing this is fairly small -  
under 100Mb compressed - I can forward it for debugging off list if  
that's helpful. I'd be grateful for any help sorting this out.

Toby

On 20 Oct 2008, at 17:51, Jean-Daniel Cryans wrote:

> Toby,
>
> Can you tell us more about your setup? Numbers of machines, if NTP is
> installed and running, number of regions in your table and other  
> useful
> stuff.
>
> Thx,
>
> J-D
>


Re: duplicated hbase timestamps

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Toby,

Can you tell us more about your setup? Numbers of machines, if NTP is
installed and running, number of regions in your table and other useful
stuff.

Thx,

J-D

On Mon, Oct 20, 2008 at 11:11 AM, Toby White
<to...@googlemail.com>wrote:

> I'm seeing a strange effect on my hbase instance. Sometimes, on requesting
> the full history of a column, I get back individual cells several times
> over.
>
> That is, I'm getting results like this:
>
> base(main):006:0* get 'my_table', 'scw9npU7Q4ma_khXqlDGXg', {COLUMN =>
> 'value:', VERSIONS=>4000}
> timestamp=1224504133000, value=1013.0
> timestamp=1224502749000, value=1012.0
> timestamp=1224502749000, value=1012.0
> timestamp=1224499880000, value=1011.0
> timestamp=1224499880000, value=1011.0
> timestamp=1224499880000, value=1011.0
> timestamp=1224415961000, value=1010.0
> timestamp=1224415961000, value=1010.0
> timestamp=1224415961000, value=1010.0
> timestamp=1224415701000, value=1009.0
> timestamp=1224415701000, value=1009.0
> timestamp=1224415701000, value=1009.0
> timestamp=1224414200000, value=1008.0
> timestamp=1224414200000, value=1008.0
> timestamp=1224414200000, value=1008.0
>
> This happens both through the hbase shell as shown here, and when
> communicating with the server via thrift.
>
> In either case, the cells are reported either as shown above; that is, with
> each cell simply repeated several times (in this case, 3) or sometimes with
> the series repeated; something like this:
>
> base(main):006:0* get 'golddigger', 'scw9npU7Q4ma_khXqlDGXg', {COLUMN =>
> 'value:', VERSIONS=>4000}
> timestamp=1224504133000, value=1013.0
> timestamp=1224502749000, value=1012.0
> timestamp=1224499880000, value=1011.0
> timestamp=1224415961000, value=1010.0
> timestamp=1224415701000, value=1009.0
> timestamp=1224414200000, value=1008.0
> timestamp=1224504133000, value=1013.0
> timestamp=1224502749000, value=1012.0
> timestamp=1224499880000, value=1011.0
> timestamp=1224415961000, value=1010.0
> timestamp=1224415701000, value=1009.0
> timestamp=1224414200000, value=1008.0
>
> or sometimes a combination of both ie an entire series, each cell repeated
> a couple of times, and then the whole lot repeated again.
>
> This doesn't happen with all rows, only some of them, apparently at random.
> Sometimes, restarting hbase & the underlying hdf makes the problem go away;
> sometimes, it doesn't, and the issue persists.
>
> This is with hbase 0.18.0 on hadoop 0.18.1
>
> Is this a known issue?
>