Posted to user@hbase.apache.org by kishore g <g....@gmail.com> on 2009/12/11 19:47:48 UTC

any problems in overriding timestamp

Hi,

We have the following requirement:

We receive messages and insert them into HBase, and a query must return the
most recent insert. This works perfectly when messages arrive in order. But
sometimes messages arrive out of order, which can make the query return an
older record. We want records ordered by the message-generated timestamp
rather than the insert timestamp.

What is the best solution? I can think of only one:

   1. Provide the timestamp while inserting instead of letting HBase use
      its default

Are there any other solutions, and will overriding the timestamp cause any
problems in the future? Also, what happens if two inserts happen at the
same time?

thanks,
Kishore G

Re: any problems in overriding timestamp

Posted by Andrew Purtell <ap...@apache.org>.
Hi Kishore,

> Provide stamp while inserting instead of hbase using its default

Yes.
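
To make the idea concrete, here is a minimal sketch of why supplying the
message's own timestamp fixes out-of-order arrival. It is a toy in-memory
model of a single cell, not the HBase client API; the class and method
names (VersionedCellSketch, put, getLatest) are made up for illustration.
With the real Java client, the equivalent step would be passing an explicit
timestamp when building the Put.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of one cell's versions; NOT the HBase client API.
class VersionedCellSketch {
    // Versions keyed by timestamp, newest first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.reverseOrder());

    // Store a value under a caller-supplied timestamp (the message's own time).
    void put(long messageTimestamp, String value) {
        versions.put(messageTimestamp, value);
    }

    // A single-value read returns the version with the highest timestamp.
    String getLatest() {
        Map.Entry<Long, String> newest = versions.firstEntry();
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(2000L, "second message"); // generated later, arrives first
        cell.put(1000L, "first message");  // generated earlier, arrives late
        System.out.println(cell.getLatest()); // prints "second message"
    }
}
```

Because the ordering key is the message time rather than the arrival time,
the late arrival lands behind the newer message instead of in front of it.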

> will there be any problems in future overriding timestamp

No. 

However, on the more general notion of problems which may occur because
of overriding timestamp: One issue that came up recently is this: If you
insert into the future, but delete using the present, then the delete
marker will not subsume the future value(s) and the delete effectively
won't happen. See? This can cause confusion. 
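
A small sketch of that pitfall, again as a toy in-memory model and not the
HBase API. It assumes, as described above, that a delete issued at time T
masks only versions with a timestamp at or below T. The names
(DeleteMarkerSketch, deleteUpTo) are hypothetical.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a delete marker; NOT the HBase client API.
class DeleteMarkerSketch {
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.reverseOrder());
    private long deleteMarker = Long.MIN_VALUE; // no delete issued yet

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Assumption: a delete at 'timestamp' masks versions at or below it.
    void deleteUpTo(long timestamp) {
        deleteMarker = Math.max(deleteMarker, timestamp);
    }

    // Returns the newest version that is not masked by the delete marker.
    String getLatest() {
        Map.Entry<Long, String> newest = versions.firstEntry();
        if (newest == null || newest.getKey() <= deleteMarker) {
            return null; // nothing stored, or everything is masked
        }
        return newest.getValue();
    }

    public static void main(String[] args) {
        long now = 1_000L;
        DeleteMarkerSketch cell = new DeleteMarkerSketch();
        cell.put(now + 5_000L, "written into the future");
        cell.deleteUpTo(now); // delete issued "in the present"
        System.out.println(cell.getLatest()); // prints "written into the future"
    }
}
```

The value survives because its timestamp is ahead of the delete marker. In
this model, deleting with a timestamp at or beyond the future write would
be needed to mask it.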

> Also what happens if two inserts happen at the same time?

Depending on the number of versions you have configured on the column
family (default is 3), they will both be stored. However, the one
returned first in a multiversion query is the first inserted, and the
same holds for a single-value query (most recent only), assuming that
the timestamp in question represents "most recent". In an environment
with many clients contributing values, I'm not sure you will be able to
know which was inserted first, so it will amount to a coin toss.
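
The "coin toss" can be sketched as follows. This toy model takes the
tie-break rule from the description above (on equal timestamps, the first
inserted is returned) as an assumption; the actual rule may differ between
HBase versions. The names (SameTimestampSketch, Version) are hypothetical
and this is not the HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of two writers using the same timestamp; NOT the HBase client API.
class SameTimestampSketch {
    private static final class Version {
        final long timestamp;
        final String value;

        Version(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Versions kept in arrival order.
    private final List<Version> versions = new ArrayList<>();

    void put(long timestamp, String value) {
        versions.add(new Version(timestamp, value));
    }

    // Highest timestamp wins; on a tie the earlier arrival is kept
    // (assumed rule, taken from the description above).
    String getLatest() {
        Version best = null;
        for (Version v : versions) {
            if (best == null || v.timestamp > best.timestamp) {
                best = v;
            }
        }
        return best == null ? null : best.value;
    }

    public static void main(String[] args) {
        SameTimestampSketch cell = new SameTimestampSketch();
        cell.put(1000L, "from client A");
        cell.put(1000L, "from client B"); // same timestamp, arrives second
        System.out.println(cell.getLatest()); // prints "from client A"
    }
}
```

Swap the two put calls and the answer flips to client B, even though both
values carry the same timestamp. The result depends on arrival order, which
the clients do not control.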

Timestamps have millisecond resolution, which helps avoid this, and when
HBase manages the timestamp such conflicts do not happen.

   - Andy



