Posted to user@phoenix.apache.org by John Lilley <jo...@redpoint.net> on 2015/12/18 01:50:57 UTC

Questions: history of deleted records, controlling timestamps

Greetings,

I've been reading about Phoenix with an eye toward implementing a "versioned database" on Hadoop.  It looks pretty slick, especially the ability to query at a past timestamp.  But I can't figure out what happens with deleted records.  Are all versions deleted, or can I still go back in time and see the versions before the delete?

Also, I would like to be able to make a set of changes "at the same timestamp" to get a changeset-like ability similar to a VCS.  It looks like the APIs allow setting the effective timestamp for all change operations; is that true?

Thanks
John Lilley


Re: Questions: history of deleted records, controlling timestamps

Posted by Thomas D'Silva <td...@salesforce.com>.
John,

You can use a connection with an SCN to ensure that all changes are
written with the specified timestamp:
https://phoenix.apache.org/faq.html#Can_phoenix_work_on_tables_with_arbitrary_timestamp_as_flexible_as_HBase_API
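
As a rough sketch of what that FAQ entry describes (not part of the
original message): the timestamp is passed as the "CurrentSCN" connection
property, and every UPSERT on that connection is then stamped with it.
The table CUSTOMER, its columns, and the class and method names below are
hypothetical; only the property name comes from the FAQ.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

final class ScnSketch {
    // Builds connection properties that pin the connection to one timestamp.
    // "CurrentSCN" is the property name given in the Phoenix FAQ.
    static Properties scnProperties(long timestampMillis) {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(timestampMillis));
        return props;
    }

    // Writes two rows through one SCN connection so both carry the same timestamp.
    // CUSTOMER (ID, NAME) is a made-up table used only for illustration.
    static void writeAt(String jdbcUrl, long timestampMillis) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, scnProperties(timestampMillis));
             PreparedStatement ps = conn.prepareStatement(
                 "UPSERT INTO CUSTOMER (ID, NAME) VALUES (?, ?)")) {
            ps.setLong(1, 1L);
            ps.setString(2, "Alice");
            ps.executeUpdate();
            ps.setLong(1, 2L);
            ps.setString(2, "Bob");
            ps.executeUpdate();
            conn.commit();
        }
    }
}
```

A call such as ScnSketch.writeAt("jdbc:phoenix:localhost", ts) would need
the Phoenix JDBC driver on the classpath and a reachable cluster.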

We are also working on transaction support using Tephra for the
upcoming 4.7 release. All changes within a transaction will either
complete successfully together or fail together. However, we do not
support adding additional metadata to a transaction. Transactions also
cannot be used with the SCN feature.

Thanks,
Thomas

On Fri, Dec 18, 2015 at 6:56 AM, John Lilley <jo...@redpoint.net> wrote:
> Thanks Thomas!
>
> What I'm trying to accomplish with "a set of changes at the same timestamp" is several things.  Basically I'm trying to implement a "versioned database" in which a set of changes to tables is grouped into a "changeset" that we can tag with additional information:
> -- Associate changes across multiple tables (I know this could be done by adding an additional index to the tables, but using timestamp for this purpose would kill two birds with one stone)
> -- Have a clear "transaction-like" event on which we can hang additional metadata
> -- Make the set of changes all appear to have happened "at the same time"
> -- Hopefully, be able to undo all of the changes of a changeset.
>
> Thanks
> John

RE: Questions: history of deleted records, controlling timestamps

Posted by John Lilley <jo...@redpoint.net>.
Thanks Thomas!

What I'm trying to accomplish with "a set of changes at the same timestamp" is several things.  Basically I'm trying to implement a "versioned database" in which a set of changes to tables is grouped into a "changeset" that we can tag with additional information:
-- Associate changes across multiple tables (I know this could be done by adding an additional index to the tables, but using timestamp for this purpose would kill two birds with one stone)
-- Have a clear "transaction-like" event on which we can hang additional metadata
-- Make the set of changes all appear to have happened "at the same time"
-- Hopefully, be able to undo all of the changes of a changeset.
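
One way the list above might be sketched, purely as an illustration: apply all statements of a changeset through a single connection pinned to one timestamp, and record the extra information as a row in a separate table keyed by that timestamp. The CHANGESET table, its columns, and the class and method names are hypothetical, and this sketch does not cover atomicity or undo.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class ChangesetSketch {
    // Returns the caller's statements plus one metadata row for the changeset.
    // CHANGESET (TS, NOTE) is a made-up table; single quotes in the note are doubled.
    static List<String> statementsFor(long ts, String note, List<String> changes) {
        List<String> all = new ArrayList<>(changes);
        all.add("UPSERT INTO CHANGESET (TS, NOTE) VALUES ("
            + ts + ", '" + note.replace("'", "''") + "')");
        return all;
    }

    // Runs every statement on one connection whose "CurrentSCN" property is ts,
    // so all writes, across tables, are stamped with the same timestamp.
    static void apply(String jdbcUrl, long ts, String note, List<String> changes)
            throws SQLException {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(ts));
        try (Connection conn = DriverManager.getConnection(jdbcUrl, props);
             Statement st = conn.createStatement()) {
            for (String sql : statementsFor(ts, note, changes)) {
                st.executeUpdate(sql);
            }
            conn.commit();
        }
    }
}
```

The timestamp doubles as the changeset identifier here, which matches the "two birds with one stone" idea; whether that is unique enough for a real system is an open design question.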

Thanks
John

-----Original Message-----
From: Thomas D'Silva [mailto:tdsilva@salesforce.com] 
Sent: Thursday, December 17, 2015 7:56 PM
To: user@phoenix.apache.org
Subject: Re: Questions: history of deleted records, controlling timestamps

John,

If you enable KEEP_DELETED_CELLS on the underlying HBase table you will be able to see deleted data (See http://hbase.apache.org/0.94/book/cf.keep.deleted.html ) Could you describe what you mean by making a set of changes at the same timestamp?

Thanks,
Thomas

Re: Questions: history of deleted records, controlling timestamps

Posted by Thomas D'Silva <td...@salesforce.com>.
John,

If you enable KEEP_DELETED_CELLS on the underlying HBase table you
will be able to see deleted data (See
http://hbase.apache.org/0.94/book/cf.keep.deleted.html )
Could you describe what you mean by making a set of changes at the
same timestamp?
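
For the KEEP_DELETED_CELLS suggestion above, a minimal sketch of setting
it from the Phoenix side. It assumes that Phoenix passes HBase column
family properties such as VERSIONS and KEEP_DELETED_CELLS through its
CREATE TABLE options; the table layout and the class and method names
are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

final class KeepDeletedSketch {
    // Builds DDL for a made-up two-column table that keeps several versions
    // per cell and retains deleted cells (assumed pass-through to HBase).
    static String createTableDdl(String table, int versions) {
        return "CREATE TABLE IF NOT EXISTS " + table
            + " (ID BIGINT NOT NULL PRIMARY KEY, NAME VARCHAR)"
            + " VERSIONS=" + versions + ", KEEP_DELETED_CELLS=true";
    }

    // Executes the DDL over a regular (non-SCN) Phoenix connection.
    static void createTable(String jdbcUrl, String table, int versions) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement st = conn.createStatement()) {
            st.execute(createTableDdl(table, versions));
        }
    }
}
```

With a table created this way, a query on a connection whose SCN
predates the delete should, under the same assumption, still return the
older versions.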

Thanks,
Thomas
