Posted to user@hbase.apache.org by Ioakim Perros <im...@gmail.com> on 2012/08/28 02:01:21 UTC

Retrieving 2 separate timestamps' values

Hi,

Is there any way of retrieving two values with totally different 
timestamps from a table?

I am using timestamps as iteration counts. At each iteration I would like 
to read, besides the previous iteration's results, some pre-computed values 
that I store in certain columns with timestamp 0, without paying the cost 
of retrieving every version in the table.

The only way I have come up with is to save the pre-computed values 
redundantly at every timestamp up to the maximum possible.

Does anyone have an idea for a more efficient way of dealing with this?

Thanks and regards,
IP

Re: Retrieving 2 separate timestamps' values

Posted by Ioakim Perros <im...@gmail.com>.
Unfortunately, the way I read and write parts of my table is incompatible with this solution. 

In any case, thank you very much for your time.

On Aug 28, 2012, at 4:10, Mohit Anchlia <mo...@gmail.com> wrote:

> Have you thought of making your row key as key+timestamp? And then you can
> do scan on the columns itself?

Re: Retrieving 2 separate timestamps' values

Posted by Mohit Anchlia <mo...@gmail.com>.
Have you thought of using key+timestamp as your row key? Then you could
do the scan on the columns themselves.
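
A minimal sketch of what a key+timestamp row key could look like, assuming the iteration number is appended as a fixed-width big-endian long so that rows for the same key sort by iteration. The class name, the "row1" key, and the helper methods are hypothetical, and the code only models byte ordering in plain Java; it does not call the HBase client.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative model of a composite row key (not HBase client code).
class CompositeRowKeySketch {

    // Row key = key bytes followed by the iteration as an 8-byte big-endian long.
    // Assumes non-negative iterations; negative values would sort after positive ones.
    static byte[] rowKey(String key, long iteration) {
        byte[] prefix = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .putLong(iteration)
                .array();
    }

    // Unsigned lexicographic comparison, the order a byte-sorted store keeps rows in.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] constants = rowKey("row1", 0);   // pre-computed values
        byte[] previous = rowKey("row1", 29);   // previous iteration
        byte[] current = rowKey("row1", 30);    // current iteration
        System.out.println(compareRows(constants, previous) < 0);
        System.out.println(compareRows(previous, current) < 0);
    }
}
```

With keys laid out this way, the constants and each iteration's results live in separate rows, so each can be fetched by its exact row key. The fixed-width suffix is what keeps iteration 29 ahead of iteration 30; a decimal string suffix would not sort numerically unless it were zero-padded.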

On Mon, Aug 27, 2012 at 5:53 PM, Ioakim Perros <im...@gmail.com> wrote:

> Of course, thank you for responding.
>
> I have an iterative procedure where I get and put data from/to an HBase
> table, and I am setting at each Put the timestamp equal to each iteration's
> number, as it is efficient to check for convergence in this way (by just
> retrieving the 2 last versions of my columns).
>
> Some amounts of my equations are the same through iterations, and I save
> them (serialized) at two specific columns of my table with timestamp equal
> to zero. The rest of my table's columns contain the (serialized)
> alternating results of my iterations.
>
> The thing is that the cached amounts are necessary to be read at each and
> every iteration, but it would not be efficient to scan all versions of all
> columns of my table, just to retrieve the previous iteration's results plus
> the initially saved cached amounts.
>
> For example, being at iteration 30 I would like to retrieve only columns 3
> and 4 with timestamp 29 and columns 0 and 1 with timestamp 0.
>
> With the current HBase's API, I am not sure if this is possible and the
> solution I described at my previous message (by storing columns 0 and 1 at
> all timestamps up to 40 for example) seems inefficient.
>
> Any ideas?
>
> Thanks and regards,
> IP

Re: Retrieving 2 separate timestamps' values

Posted by Ioakim Perros <im...@gmail.com>.
Of course, thank you for responding.

I have an iterative procedure that reads from and writes to an HBase 
table. On each Put I set the timestamp to the iteration number, because 
that makes the convergence check efficient (I just retrieve the last 2 
versions of my columns).
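
The convergence check described above can be sketched roughly as follows. This is a plain-Java model, not HBase client code: the versions of one column are held in a map from timestamp to value, and the values are assumed to deserialize to double[] (the thread does not say what the serialized form is). The class and method names are hypothetical. With the HBase client, limiting a Get to two versions via setMaxVersions(2) would presumably play the role of the two-newest lookup here.

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model of comparing the two newest versions of a column.
class ConvergenceCheckSketch {

    // Largest absolute change between the two newest versions;
    // infinity when fewer than two versions exist yet.
    static double lastChange(NavigableMap<Long, double[]> versions) {
        Iterator<double[]> newestFirst = versions.descendingMap().values().iterator();
        if (!newestFirst.hasNext()) {
            return Double.POSITIVE_INFINITY;
        }
        double[] latest = newestFirst.next();
        if (!newestFirst.hasNext()) {
            return Double.POSITIVE_INFINITY;
        }
        double[] previous = newestFirst.next();
        double max = 0.0;
        for (int i = 0; i < Math.min(latest.length, previous.length); i++) {
            max = Math.max(max, Math.abs(latest[i] - previous[i]));
        }
        return max;
    }

    static boolean converged(NavigableMap<Long, double[]> versions, double tolerance) {
        return lastChange(versions) < tolerance;
    }

    public static void main(String[] args) {
        NavigableMap<Long, double[]> versions = new TreeMap<>();
        versions.put(28L, new double[] {1.25, 2.0});
        versions.put(29L, new double[] {1.5, 2.0});
        System.out.println(converged(versions, 0.5));
    }
}
```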

Some terms of my equations stay the same across iterations, and I save 
them (serialized) in two specific columns of my table with timestamp 
zero. The rest of my table's columns contain the (serialized) 
alternating results of my iterations.

The cached values must be read at every iteration, but it would be 
inefficient to scan all versions of all columns of my table just to 
retrieve the previous iteration's results plus the cached values saved 
initially.

For example, at iteration 30 I would like to retrieve only columns 
3 and 4 with timestamp 29 and columns 0 and 1 with timestamp 0.

With the current HBase API, I am not sure whether this is possible, and 
the solution I described in my previous message (storing columns 0 and 1 
at all timestamps up to 40, for example) seems inefficient.

Any ideas?

Thanks and regards,
IP
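
One way to picture the read pattern asked about here is as two exact-timestamp lookups whose results are merged, instead of one read over all versions. The sketch below is a toy in-memory model in plain Java, not the HBase client API; the class, the method names, and the column names "col0" to "col4" are hypothetical. With the HBase client, the closest equivalent would presumably be two Get objects, each restricted to its columns with addColumn and to one version with setTimeStamp, sent together through the table's batched get; that mapping is an assumption to check against the client version in use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy stand-in for one row of a versioned table (NOT the HBase client API).
// Each column maps timestamp -> value, mirroring how a cell keeps several versions.
class TwoTimestampReadSketch {
    private final Map<String, NavigableMap<Long, String>> columns = new HashMap<>();

    // Store a value for a column at an explicit timestamp (here: the iteration number).
    void put(String column, long timestamp, String value) {
        columns.computeIfAbsent(column, c -> new TreeMap<>()).put(timestamp, value);
    }

    // Read the given columns at exactly one timestamp; other versions are not visited.
    Map<String, String> getAt(long timestamp, String... wanted) {
        Map<String, String> result = new TreeMap<>();
        for (String column : wanted) {
            NavigableMap<Long, String> versions = columns.get(column);
            if (versions != null && versions.containsKey(timestamp)) {
                result.put(column, versions.get(timestamp));
            }
        }
        return result;
    }

    // What a given iteration needs: the constants written once at timestamp 0,
    // plus the results written by the previous iteration.
    Map<String, String> readForIteration(long iteration) {
        Map<String, String> merged = new TreeMap<>(getAt(0L, "col0", "col1"));
        merged.putAll(getAt(iteration - 1, "col3", "col4"));
        return merged;
    }

    public static void main(String[] args) {
        TwoTimestampReadSketch row = new TwoTimestampReadSketch();
        row.put("col0", 0L, "const-a");
        row.put("col1", 0L, "const-b");
        for (long it = 1; it <= 29; it++) {
            row.put("col3", it, "x" + it);
            row.put("col4", it, "y" + it);
        }
        System.out.println(row.readForIteration(30));
    }
}
```

The point of the model is that the constants are written once and never duplicated across timestamps; each iteration pays for two narrow lookups rather than a scan over every stored version.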

On 08/28/2012 03:33 AM, Mohit Anchlia wrote:
> You timestamp as in version? Can you describe your scenario with more
> concrete example?


Re: Retrieving 2 separate timestamps' values

Posted by Mohit Anchlia <mo...@gmail.com>.
Do you mean timestamp as in version? Can you describe your scenario with a
more concrete example?

On Mon, Aug 27, 2012 at 5:01 PM, Ioakim Perros <im...@gmail.com> wrote:

> Hi,
>
> Is there any way of retrieving two values with totally different
> timestamps from a table?
>
> I am using timestamps as iteration counts, and I would like to be able to
> get at each iteration (besides the previous iteration results from table)
> some pre-computed amounts I save at some columns with timestamp 0, avoiding
> the cost of retrieving all table's versions.
>
> The only way I have come up with is to save the pre-computed amounts
> redundantly at all timestamps up to the maximum possible.
>
> Does anyone have an idea on a more efficient way of dealing with this?
>
> Thanks and regards,
> IP
>