Posted to issues@ignite.apache.org by "Roman Puchkovskiy (Jira)" <ji...@apache.org> on 2023/04/18 12:24:00 UTC

[jira] [Commented] (IGNITE-19267) Implement local Low Watermark propagation

    [ https://issues.apache.org/jira/browse/IGNITE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713572#comment-17713572 ] 

Roman Puchkovskiy commented on IGNITE-19267:
--------------------------------------------

The patch looks good to me

> Implement local Low Watermark propagation
> -----------------------------------------
>
>                 Key: IGNITE-19267
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19267
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> According to IEP-91, we can delete old data once it becomes older than a certain threshold. At the moment, we can consider this threshold to be shared between different tables, but unique to each individual node. It's called the Low Watermark (LW).
> The way the value is chosen is the following:
>  * There's the _data availability time_, which can be configured by the user. This is a cluster configuration. It has a value of, for example, 45 minutes. Valid values: {{[0, +INF)}}.
>  * There's a _GC frequency_, which is also a cluster configuration. For example, 5 minutes. The range of valid values should be stricter.
>  * Last LW value is persisted in Vault.
>  * Every 5 minutes, we assign a new {{lwCandidate = now() - 45min - maxClockSkew}}.
>  ** If there are no running transactions with a timestamp below {{lwCandidate}}, we promote the candidate to a real LW value.
>  ** Otherwise, we trigger GC with the timestamp of the oldest running transaction (promoting LW to that timestamp), and raise the bar every time* such a transaction completes. Eventually, we will reach the point where there are no running transactions with a timestamp below {{lwCandidate}}.
>  * It's not necessary to do it every time. But once the timestamp of the oldest RO transaction is greater than or equal to {{lwCandidate}}, we must guarantee its promotion. Everything else is an optimization.
>  * If there's a new RO transaction with a timestamp below {{lwCandidate}}, we fail it.
>  * When it comes to running GC: we never delete anything if LW < safeTime. This does not prevent us from propagating the new LW value locally; we just don't drop the data until we have the right to do so.
> A promoted LW value can never become smaller. All data below LW is considered invalid, possibly broken, and completely invisible to the user.
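
For illustration, the candidate and promotion rules from the description above could be sketched roughly as follows. This is not the patch under review and not Ignite's API: the class LowWatermarkSketch, its method names, and the use of plain long timestamps (rather than hybrid timestamps) are all assumptions made for the example. Persisting the LW in Vault and the safeTime check before actually dropping data are left out.

```java
import java.util.TreeMap;

/** Hypothetical sketch of local Low Watermark (LW) bookkeeping; not Ignite's actual API. */
final class LowWatermarkSketch {
    private final long dataAvailabilityTime; // e.g. 45 minutes, in some time unit
    private final long maxClockSkew;

    /** Running RO transactions: timestamp -> number of transactions with that timestamp. */
    private final TreeMap<Long, Integer> activeRoTxs = new TreeMap<>();

    private long lwCandidate;
    private long lowWatermark; // last promoted value (would be persisted in a real system)

    LowWatermarkSketch(long dataAvailabilityTime, long maxClockSkew, long persistedLw) {
        this.dataAvailabilityTime = dataAvailabilityTime;
        this.maxClockSkew = maxClockSkew;
        this.lwCandidate = persistedLw;
        this.lowWatermark = persistedLw;
    }

    /** Called once per GC period: computes a new candidate and tries to promote it. */
    synchronized void onGcTick(long now) {
        lwCandidate = Math.max(lwCandidate, now - dataAvailabilityTime - maxClockSkew);
        promote();
    }

    /** Registers a new RO transaction, failing it if its timestamp is below the candidate. */
    synchronized void beginRoTx(long txTimestamp) {
        if (txTimestamp < lwCandidate) {
            throw new IllegalStateException("RO transaction timestamp is below lwCandidate");
        }
        activeRoTxs.merge(txTimestamp, 1, Integer::sum);
    }

    /** Unregisters a finished RO transaction and raises the LW if that is now possible. */
    synchronized void finishRoTx(long txTimestamp) {
        activeRoTxs.computeIfPresent(txTimestamp, (ts, count) -> count == 1 ? null : count - 1);
        promote();
    }

    synchronized long lowWatermark() {
        return lowWatermark;
    }

    /** Promotes LW to the candidate, or to the oldest running transaction if that is lower. */
    private void promote() {
        long target = lwCandidate;
        if (!activeRoTxs.isEmpty()) {
            target = Math.min(target, activeRoTxs.firstKey());
        }
        // A promoted LW value never becomes smaller.
        lowWatermark = Math.max(lowWatermark, target);
    }
}
```

In this sketch promotion is attempted on every transaction completion, which the description marks as optional; the only hard requirement it states is that the candidate is promoted once no running RO transaction has a timestamp below it.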



--
This message was sent by Atlassian Jira
(v8.20.10#820010)