Posted to user@hbase.apache.org by Nikhil Gupta <gu...@gmail.com> on 2009/08/28 22:59:18 UTC

Getting all the different timestamp values for a column

Hi,

I am new to HBase, so please forgive me if this question is too basic.

We are building a system that stores time series data for different terms in
column qualifiers.
So, we plan to store data in the form:
KeyID -> "ID1:ID2"
Family:Qualifier -> "family_name:qualifier_term"
Values -> Traffic for that term with different timestamps

We will pull this data to display a trend-style chart on the front end.
If I want to get all the values at different timestamps for a
particular column while scanning, how can I do that via a scanner in the
most elegant manner?

[assuming that I know exact values for id1:id2 but not for qualifier_term.]

Thanks
-nikhil
http://stanford.edu/~nikgupta
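
For readers picturing the layout described in the question, the logical model can be thought of as nested sorted maps: row key, then "family:qualifier", then timestamp, then value. The sketch below is a plain-Java model of that idea only, not the HBase client API. The class name CellVersionsModel, its methods, the term_a/term_b qualifiers, and the traffic numbers are all hypothetical.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of the logical layout (NOT the HBase client API):
// row key -> "family:qualifier" -> timestamp (newest first) -> value.
class CellVersionsModel {
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, Long>>> rows =
            new TreeMap<String, NavigableMap<String, NavigableMap<Long, Long>>>();

    // Record one traffic value for a row and column at the given timestamp.
    void put(String rowKey, String column, long timestamp, long traffic) {
        NavigableMap<String, NavigableMap<Long, Long>> columns = rows.get(rowKey);
        if (columns == null) {
            columns = new TreeMap<String, NavigableMap<Long, Long>>();
            rows.put(rowKey, columns);
        }
        NavigableMap<Long, Long> versions = columns.get(column);
        if (versions == null) {
            // Reverse order so iteration starts at the newest timestamp.
            versions = new TreeMap<Long, Long>(Comparator.<Long>reverseOrder());
            columns.put(column, versions);
        }
        versions.put(timestamp, traffic);
    }

    // Every column stored under a row, for when the qualifier is not known up front.
    NavigableMap<String, NavigableMap<Long, Long>> columns(String rowKey) {
        NavigableMap<String, NavigableMap<Long, Long>> columns = rows.get(rowKey);
        if (columns == null) {
            return Collections.<String, NavigableMap<Long, Long>>emptyNavigableMap();
        }
        return columns;
    }

    public static void main(String[] args) {
        CellVersionsModel model = new CellVersionsModel();
        model.put("ID1:ID2", "family_name:term_a", 100L, 5L);
        model.put("ID1:ID2", "family_name:term_a", 200L, 7L);
        model.put("ID1:ID2", "family_name:term_b", 150L, 3L);
        // Prints each column with its versions, newest timestamp first.
        System.out.println(model.columns("ID1:ID2"));
    }
}
```

In this model, knowing the row key but not the qualifier means walking every column under the row, and each column already yields its (timestamp, value) pairs in the order a trend chart would consume them after reversing.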

Re: Getting all the different timestamp values for a column

Posted by stack <st...@duboce.net>.
On Wed, Sep 2, 2009 at 11:44 AM, Nikhil Gupta <gu...@gmail.com> wrote:

>
> However, this returns a cell with latest timestamp even when no value for
> the requested timestamp exists in the table. Is this expected behavior?
>
>
Yes.  That's how it used to 'work'.



> The HBase architecture wiki doc example data model says that "...Thus a
> request for the value of the *"contents:"* column at time stamp t8 would
> return no value..."
>
>
This doc is wrong with regard to how 0.19 worked.

0.20.0 works more like you'd expect.  If you provide an explicit ts and no
such value exists, you will get a null response.  Use TimeRange if you want
the behavior you see in 0.19.

St.Ack
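
The difference between the two lookups described above can be sketched in plain Java. This is a model of the semantics as stated in this reply, not the HBase client API: an exact-timestamp lookup yields null when no version has that timestamp, while a range lookup yields the newest version inside the range. The class TimestampLookupModel, its method names, and the sample timestamps 5, 6 and 8 are hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of the two lookups (NOT the HBase client API).
class TimestampLookupModel {
    // Versions of a single cell, keyed by timestamp in ascending order.
    private final NavigableMap<Long, String> versions = new TreeMap<Long, String>();

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Exact-timestamp lookup: null when no version carries that timestamp.
    String getExact(long timestamp) {
        return versions.get(timestamp);
    }

    // Range lookup: newest version with minStamp <= timestamp < maxStamp, else null.
    String getNewestInRange(long minStamp, long maxStamp) {
        if (minStamp >= maxStamp) {
            return null; // empty range
        }
        Map.Entry<Long, String> newest =
                versions.subMap(minStamp, true, maxStamp, false).lastEntry();
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        TimestampLookupModel cell = new TimestampLookupModel();
        cell.put(5L, "value at 5");
        cell.put(6L, "value at 6");
        // No version is stamped 8, so the exact lookup finds nothing.
        System.out.println(cell.getExact(8L));
        // The range [0, 9) contains both versions; the newest one wins.
        System.out.println(cell.getNewestInRange(0L, 9L));
    }
}
```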

Re: Getting all the different timestamp values for a column

Posted by stack <st...@duboce.net>.
On Wed, Sep 2, 2009 at 11:44 AM, Nikhil Gupta <gu...@gmail.com> wrote:

>
> However, we are not moving to newer HBase till there are final and
> compatible releases of Pig/Hadoop too (no patching needed).


Hadoop is already at 0.20.0.  I'm not sure of the state of Pig and whether it
works on 0.20.0.  What do you mean by no patching needed?
St.Ack

Re: Getting all the different timestamp values for a column

Posted by Nikhil Gupta <gu...@gmail.com>.
Thanks for the help Stack & Jonathan.

Later I realized that I should also have looked up
http://svn.apache.org/repos/asf/hadoop/hbase/trunk/src/test/org/apache/hadoop/hbase/TestScanMultipleVersions.java
before asking on the list.

However, we are not moving to a newer HBase until there are final and
compatible releases of Pig/Hadoop too (no patching needed). So for now, I
tried achieving the task by iterating over a number of known timestamps like
this:
cell = table.get(rowID, colName, timestamp, 1);

However, this returns a cell with latest timestamp even when no value for
the requested timestamp exists in the table. Is this expected behavior?

The HBase architecture wiki doc example data model says that "...Thus a
request for the value of the *"contents:"* column at time stamp t8 would
return no value..."

Thanks!
-nikhil
http://sahyog.org/fun
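
One possible way to cope with this without upgrading is to compare the timestamp of the returned cell with the one requested and discard mismatches. The sketch below models that check in plain Java. It assumes, based on the behavior described in this message, that the old lookup returns the newest version at or before the requested timestamp. StoredCell, lookupAtOrBefore and exactOnly are hypothetical names, not HBase classes or methods, and the sample timestamps are made up.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of a client-side exact-timestamp check (NOT the HBase API).
class ExactTimestampGuard {
    // Hypothetical stand-in for a returned cell: a value plus its timestamp.
    static final class StoredCell {
        final long timestamp;
        final String value;

        StoredCell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // ASSUMPTION: models the older lookup as "newest version at or before ts".
    static StoredCell lookupAtOrBefore(NavigableMap<Long, String> versions, long ts) {
        Map.Entry<Long, String> entry = versions.floorEntry(ts);
        return entry == null ? null : new StoredCell(entry.getKey(), entry.getValue());
    }

    // Keep the cell only if its timestamp matches the requested one exactly.
    static StoredCell exactOnly(StoredCell cell, long requestedTs) {
        return (cell != null && cell.timestamp == requestedTs) ? cell : null;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> versions = new TreeMap<Long, String>();
        versions.put(5L, "value at 5");
        versions.put(6L, "value at 6");

        // Asking for 8 falls back to the version stamped 6 in this model.
        StoredCell raw = lookupAtOrBefore(versions, 8L);
        System.out.println(raw.timestamp);
        // The guard rejects it because 6 is not the requested timestamp.
        System.out.println(exactOnly(raw, 8L) == null);
    }
}
```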

On Mon, Aug 31, 2009 at 8:57 AM, Jonathan Gray <jl...@streamy.com> wrote:

> One small correction...
>
> Scan.setTimeStamp(long) specifies:
>
> * Get versions of columns with the specified timestamp.
>
> That is, you will only get results with that exact timestamp.
>
> If you want to get all stamps:
>
> Scan.setTimeRange(long, long) specifies:
>
> * Get versions of columns only within the specified timestamp range,
> [minStamp, maxStamp).
>
> So to get all versions:
>
> scan.setTimeRange(0, Long.MAX_VALUE)
>

Re: Getting all the different timestamp values for a column

Posted by Jonathan Gray <jl...@streamy.com>.
One small correction...

Scan.setTimeStamp(long) specifies:

* Get versions of columns with the specified timestamp.

That is, you will only get results with that exact timestamp.

If you want to get all stamps:

Scan.setTimeRange(long, long) specifies:

* Get versions of columns only within the specified timestamp range, 
[minStamp, maxStamp).

So to get all versions:

scan.setTimeRange(0, Long.MAX_VALUE)
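
The half-open range [minStamp, maxStamp) described above can be illustrated with a small plain-Java model. This is only a sketch of the range rule as stated in this message, not the HBase client API; TimeRangeModel, inRange and timestampsInRange are hypothetical names, and the sample timestamps are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of a half-open timestamp range (NOT the HBase client API).
class TimeRangeModel {
    // True when minStamp <= ts < maxStamp, i.e. ts lies in [minStamp, maxStamp).
    static boolean inRange(long ts, long minStamp, long maxStamp) {
        return ts >= minStamp && ts < maxStamp;
    }

    // Timestamps of all versions inside the range, newest first.
    static List<Long> timestampsInRange(NavigableMap<Long, String> versions,
                                        long minStamp, long maxStamp) {
        List<Long> selected = new ArrayList<Long>();
        for (Long ts : versions.descendingKeySet()) {
            if (inRange(ts, minStamp, maxStamp)) {
                selected.add(ts);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> versions = new TreeMap<Long, String>();
        versions.put(100L, "a");
        versions.put(200L, "b");
        versions.put(300L, "c");
        // The widest range selects every version in this sample.
        System.out.println(timestampsInRange(versions, 0L, Long.MAX_VALUE));
        // The upper bound is exclusive, so 300 is left out here.
        System.out.println(timestampsInRange(versions, 100L, 300L));
    }
}
```

Two caveats follow from the half-open rule in this model: a version stamped exactly Long.MAX_VALUE would fall outside (0, Long.MAX_VALUE), and the lower bound is inclusive. As far as I know, a 0.20 Scan also returns only the newest version per column unless setMaxVersions is called, so the range alone may not be enough to see every version.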

Re: Getting all the different timestamp values for a column

Posted by stack <st...@duboce.net>.
You have to be using 0.20.0 RC2 to do the below.  The below also presumes
that each value has a different timestamp; i.e. you don't expect to add
multiple values against same timestamp.

// Presumes you already have a table instance named 'table'.
Scan scan = new Scan(Bytes.toBytes("ID1:ID2"));
scan.addColumn(Bytes.toBytes("family_name"), Bytes.toBytes("qualifier_term"));
// This gets all from the timestamp and older.  Use setTimeRange if you
// want to set a range.
scan.setTimeStamp(ts);
// Optionally, if you want to cap versions returned, see setMaxVersions.
ResultScanner scanner = table.getScanner(scan);
.....

The above is how I'd do it.  Elegance is a tag I'm rarely associated with so
there may be a better way...

Hopefully this helps
St.Ack

