Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/10/28 09:17:39 UTC

[GitHub] [flink] tillrohrmann commented on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery

tillrohrmann commented on pull request #13648:
URL: https://github.com/apache/flink/pull/13648#issuecomment-717803324


   Sorry for joining the discussion so late, but a couple of questions came up when discussing scheduler changes with Yuan offline. I wanted to ask why we need a special `ResultPartitionType` for approximate local recovery. Shouldn't it be conceptually possible to support the normal and the approximate recovery behaviour with the same pipelined partitions? If we say that we can reconnect to every pipelined result partition (including dropping partially consumed results), then it can be the responsibility of the scheduler to make sure that producers are restarted as well in order to ensure exactly-once/at-least-once processing guarantees. If the producers are not restarted, then we would simply continue consuming from where we left off.
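
   To make the scheduler-side idea concrete, here is a minimal sketch of such a restart decision. Everything in it is hypothetical: `RestartDecisionSketch`, `Guarantee` and `tasksToRestart` are illustration-only names, not Flink classes, and tasks are identified by plain strings. It only follows upstream producers and ignores everything else a real failover strategy has to consider.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch, not Flink code: which tasks to restart after a consumer fails. */
final class RestartDecisionSketch {

    /** Assumed guarantee levels, named for illustration only. */
    enum Guarantee { EXACTLY_ONCE, AT_LEAST_ONCE, APPROXIMATE }

    /**
     * @param failedTask  id of the failed consumer task
     * @param producersOf maps a task id to the ids of its direct pipelined producers
     * @param guarantee   the processing guarantee the job asks for
     */
    static Set<String> tasksToRestart(
            String failedTask, Map<String, List<String>> producersOf, Guarantee guarantee) {
        Set<String> restart = new LinkedHashSet<>();
        restart.add(failedTask);

        // Approximate recovery: producers keep running and the restarted
        // consumer reconnects to the existing pipelined partitions.
        if (guarantee == Guarantee.APPROXIMATE) {
            return restart;
        }

        // Stronger guarantees: also restart all transitive upstream producers.
        Deque<String> pending = new ArrayDeque<>();
        pending.add(failedTask);
        while (!pending.isEmpty()) {
            String task = pending.poll();
            for (String producer : producersOf.getOrDefault(task, Collections.emptyList())) {
                if (restart.add(producer)) {
                    pending.add(producer);
                }
            }
        }
        return restart;
    }
}
```

   With a chain `source -> map -> sink` and a failed `sink`, this sketch restarts only `sink` under `APPROXIMATE` and all three tasks otherwise; the partition type is the same in both cases.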
   
   As far as I understand, the existing `ResultPartitionType.PIPELINED(_BOUNDED)` cannot be used because we release the result partition if the downstream consumer disconnects. I believe that this is not a strict contract of pipelined result partitions but more of an implementation artefact. Couldn't we solve the problem of disappearing pipelined result partitions by binding the lifecycle of a pipelined result partition to the lifecycle of a `Task`? We could say that a `Task` can only terminate once the pipelined result partition has been consumed. Moreover, a `Task` will clean up the result partition if it fails or gets canceled. That way, we have a clearly defined lifecycle and make sure that these results get cleaned up (iff the `Task` reaches a terminal state).
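
   The proposed lifecycle could be sketched roughly as follows. This is an illustration under assumptions, not the existing implementation: `PartitionLifecycleSketch`, `SketchTask` and `SketchPartition` are made-up stand-ins for `Task` and a pipelined result partition, and the real classes have far more states and responsibilities.

```java
/** Hypothetical sketch, not Flink code: partition lifecycle bound to its producing task. */
final class PartitionLifecycleSketch {

    enum TaskState { RUNNING, FINISHED, FAILED, CANCELED }

    /** Stand-in for a pipelined result partition. */
    static final class SketchPartition {
        private boolean fullyConsumed;
        private boolean released;

        /** A consumer disconnect no longer releases the partition. */
        void consumerDisconnected() {
            // Intentionally empty: a restarted consumer may reconnect later.
        }

        void markFullyConsumed() { fullyConsumed = true; }
        void release() { released = true; }
        boolean isFullyConsumed() { return fullyConsumed; }
        boolean isReleased() { return released; }
    }

    /** Stand-in for the producing task that owns the partition. */
    static final class SketchTask {
        private final SketchPartition partition = new SketchPartition();
        private TaskState state = TaskState.RUNNING;

        SketchPartition partition() { return partition; }
        TaskState state() { return state; }

        /** The task may only finish once its partition has been consumed. */
        boolean tryFinish() {
            if (state != TaskState.RUNNING || !partition.isFullyConsumed()) {
                return false;
            }
            transitionTo(TaskState.FINISHED);
            return true;
        }

        void fail() {
            if (state == TaskState.RUNNING) {
                transitionTo(TaskState.FAILED);
            }
        }

        void cancel() {
            if (state == TaskState.RUNNING) {
                transitionTo(TaskState.CANCELED);
            }
        }

        /** Every terminal state releases the partition; nothing else does. */
        private void transitionTo(TaskState terminal) {
            state = terminal;
            partition.release();
        }
    }
}
```

   In this sketch the partition is released if and only if the owning task reaches a terminal state, which matches the "iff" in the paragraph above.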
   
   I would love to hear your feedback @pnowojski, @zhijiangW and @rkhachatryan, and also learn more about your reasoning for introducing a new result partition type.

