Posted to issues@flink.apache.org by "Yuan Mei (Jira)" <ji...@apache.org> on 2021/10/10 14:21:00 UTC

[jira] [Assigned] (FLINK-23170) Write metadata after materialization

     [ https://issues.apache.org/jira/browse/FLINK-23170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuan Mei reassigned FLINK-23170:
--------------------------------

    Assignee: Yuan Mei

> Write metadata after materialization
> ------------------------------------
>
>                 Key: FLINK-23170
>                 URL: https://issues.apache.org/jira/browse/FLINK-23170
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Roman Khachatryan
>            Assignee: Yuan Mei
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Currently, the changelog state backend writes state metadata to the changelog on first state access.
>  On materialization, the changelog can be truncated, so the metadata needs to be written again.
>  
> Below is a proposed solution using the existing metadataWritten flag.
> An alternative would be to write metadata at the end of the materialized stream.
>  Yet another approach is to write metadata to a separate file (however, it seems less optimal than writing at the end of the materialized stream and not as easy as writing it again).
> There are several questions to answer:
>  - *When to mark* the metadata as not written (i.e. reset the metadataWritten flag)?
>  ** After starting the materialization - so that any subsequent data is preceded by metadata
>  - *When to request* the write (i.e. call append)
>      At any point (mat. start / mat. end / checkpoint start). It doesn't matter for correctness - see the next points.
>  Scheduling append earlier means:
>  -- including metadata in the changelog twice unnecessarily (won't hurt correctness)
>  -- writing for nothing if materialization fails
>  Scheduling append later means slowing down the checkpoint.
>  So materialization end seems to be the better tradeoff.
>  - *What* metadata to write?
>       Only for data which were changed after materialization started (so the flag is enough)
>  - *Where* in the changelog to write it?
>      No choice but the end of the changelog. Because the SQN is updated, the metadata will appear at the beginning of the state object returned by persist(sqn) called after materialization completes.
>  - *How to wait for write completion* (before completing checkpoint)?
>  Once appended, the future returned from the persist() call should already include it.
>   
> So to achieve this, it's enough to call appendMetadata() for each changed state upon materialization start, materialization finish, or the first checkpoint after it.
> —
>  Another related change is to skip writing metadata on recovery (only if state was read from the changelog). 
>  This can be achieved by setting the flag when requesting the state from ChangeLogApplier.
>  *Please create a separate ticket for that if not implementing in this one.*
> —
>  Note: with TM-side state ownership, actual log truncation may be delayed after materialization (until all the checkpoints using the log are subsumed). This should not affect the above logic.
>   
>  
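The flag handling described in the ticket can be illustrated with a small, self-contained sketch. It is not Flink code: the classes Changelog, StateEntry and Backend, and the methods update, onMaterializationStart and registerRecovered, are hypothetical names made up for illustration, and the changelog is modelled as an in-memory list. The sketch also assumes one possible reading of the proposal, namely that the metadataWritten flag is checked before every change, so that resetting it at materialization start is enough to have metadata precede any later data. It additionally shows the related idea of setting the flag for state recovered from the changelog so that its metadata is not written again.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types are Flink classes. */
final class MetadataRewriteSketch {

    /** In-memory stand-in for the changelog: an append-only list of records. */
    static final class Changelog {
        final List<String> records = new ArrayList<>();

        void append(String record) {
            records.add(record);
        }

        /** Models truncation after materialization: drops the first n records. */
        void truncate(int n) {
            records.subList(0, n).clear();
        }
    }

    /** Per-state bookkeeping holding the metadataWritten flag. */
    static final class StateEntry {
        final String name;
        boolean metadataWritten;

        StateEntry(String name) {
            this.name = name;
        }
    }

    static final class Backend {
        final Changelog log = new Changelog();
        final Map<String, StateEntry> states = new LinkedHashMap<>();

        /** Records a state change; metadata is appended first if the flag is not set. */
        void update(String stateName, String value) {
            StateEntry entry = states.computeIfAbsent(stateName, StateEntry::new);
            appendMetadata(entry);
            log.append("data:" + stateName + "=" + value);
        }

        /** Appends metadata once per flag reset. */
        void appendMetadata(StateEntry entry) {
            if (!entry.metadataWritten) {
                log.append("meta:" + entry.name);
                entry.metadataWritten = true;
            }
        }

        /**
         * Materialization start: reset all flags so that any later data is
         * preceded by metadata. Returns the current log position, i.e. the
         * point up to which the log may be truncated afterwards.
         */
        int onMaterializationStart() {
            states.values().forEach(entry -> entry.metadataWritten = false);
            return log.records.size();
        }

        /** Recovery from the changelog: metadata is already there, so set the flag. */
        void registerRecovered(String stateName) {
            states.computeIfAbsent(stateName, StateEntry::new).metadataWritten = true;
        }
    }
}
```

In this sketch, truncating the log at the position returned by onMaterializationStart() leaves a log whose first record for each changed state is its metadata, which is the property the ticket asks for. Whether the append is triggered lazily as above or explicitly at materialization end is a design choice the ticket leaves open.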



--
This message was sent by Atlassian Jira
(v8.3.4#803005)