Posted to dev@omid.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2019/05/24 21:14:00 UTC

[jira] [Comment Edited] (OMID-147) Discuss better/faster ways of garbage collection during HBase major compactions

    [ https://issues.apache.org/jira/browse/OMID-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847832#comment-16847832 ] 

Lars Hofhansl edited comment on OMID-147 at 5/24/19 9:13 PM:
-------------------------------------------------------------

[~ohads] and I had a brief discussion about this.

Ohad explained to me that the problem is an earlier transaction that might still be reading a snapshot taken before the latest transaction committed the delete. Hence we need to make sure that all prior transactions have finished, and the only guarantee we currently have for that is a transaction timestamp older than the low watermark.
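
To make that condition concrete, here is a minimal sketch in Python. It is not Omid's actual compaction code; it assumes timestamps are plain comparable integers, and the function name can_physically_remove is hypothetical.

```python
def can_physically_remove(delete_commit_ts: int, low_watermark: int) -> bool:
    """Return True if a delete marker may be dropped during major compaction.

    Assumption (from the discussion above): every transaction with a
    timestamp below the low watermark has finished, so no live snapshot can
    still need the data hidden by a delete committed before that point.
    """
    return delete_commit_ts < low_watermark
```

A delete committed at or after the low watermark is kept, because an older transaction may still be reading a snapshot that predates it.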


In order to help with that, we could introduce the notion of a maximum runtime for a transaction. That is reasonable anyway; no transaction is expected to run forever.
This could be configurable and default to (say) 1h.

The low watermark then is the timestamp of the last transaction evicted from the conflictMap or 1h, whichever comes first.
Any attempt to act on a transaction more than 1h after its start should then fail with an exception, and a transaction that has not committed within 1h should be considered failed.
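
A rough sketch of this proposal in Python follows. It is only one reading of the rule, not an existing Omid API. It assumes timestamps are wall-clock milliseconds (Omid's real timestamps may not be), and it reads "whichever comes first" as taking the larger of the two bounds. All names (MAX_TX_RUNTIME_MS, effective_low_watermark, check_not_expired, TransactionExpiredError) are hypothetical.

```python
# Hypothetical configurable maximum transaction runtime; 1h as suggested above.
MAX_TX_RUNTIME_MS = 60 * 60 * 1000


class TransactionExpiredError(Exception):
    """Raised when a transaction is used after its maximum runtime."""


def effective_low_watermark(last_evicted_ts: int, now_ms: int,
                            max_runtime_ms: int = MAX_TX_RUNTIME_MS) -> int:
    # Assumption: timestamps are wall-clock milliseconds. The watermark
    # advances as soon as either bound allows it, so take the larger one.
    return max(last_evicted_ts, now_ms - max_runtime_ms)


def check_not_expired(start_ts: int, now_ms: int,
                      max_runtime_ms: int = MAX_TX_RUNTIME_MS) -> None:
    # A transaction older than the maximum runtime is considered failed,
    # so any further operation on it is rejected.
    if now_ms - start_ts > max_runtime_ms:
        raise TransactionExpiredError(
            f"transaction started at {start_ts} exceeded {max_runtime_ms} ms")
```

With this reading, a quiet conflictMap no longer holds the low watermark back indefinitely: after at most the maximum runtime, deleted data becomes eligible for removal.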

 


was (Author: lhofhansl):
[~ohads] and I had a brief discussion about this.
We could introduce the notion of the maximum runtime of a transaction. That is reasonable anyway; no transaction is expected to run forever.
This could be configurable and default to (say) 1h.

The low watermark is the timestamp of the last transaction evicted from the conflictMap or 1h, whichever comes first.
Any attempt to act on a transaction more than 1h after its start should then fail with an exception, and a transaction that is not committed after 1h should be considered failed.


> Discuss better/faster ways of garbage collection during HBase major compactions
> -------------------------------------------------------------------------------
>
>                 Key: OMID-147
>                 URL: https://issues.apache.org/jira/browse/OMID-147
>             Project: Apache Omid
>          Issue Type: Wish
>    Affects Versions: 1.0.1
>            Reporter: Lars Hofhansl
>            Priority: Major
>
> *Not for 1.0.1*
> In our use of HBase/Phoenix we very frequently need to delete a lot of data (customers leave, we have GDPR requests and various other reasons).
> We need to be able to ensure that data marked for deletion in HBase is removed no later than some specific point in time. Currently this is hard to achieve - see OMID-142.
> So let's have a discussion here about how this could happen: either by some manual step or, preferably, by decoupling the conflictMap from the point at which a transaction's deleted data becomes eligible for physical removal.


