Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/12/29 07:23:48 UTC

[GitHub] [airflow] jaketf edited a comment on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator

jaketf edited a comment on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator
URL: https://github.com/apache/airflow/pull/6210#issuecomment-569481697
 
 
   I think the conversation here has been educational, and thanks to all for chiming in.
   Apologies for being MIA on this for so long.
   During the time away I've reflected on what exactly our problem statement is. Originally I set out to: "Provide a primitive to construct operators that allow retry / failing of starting a long running external job without blocking a worker for the entire duration of that long running job". For this, we might consider a very different approach:
   Instead of thinking about this as a rescheduling-pokes problem, think of it as a special kind of SubDag pattern that we want to support better: achieve the desired behavior with a start task and a rescheduling sensor (for completion) inside a SubDag. (This might be DOA / short-sighted, as I think the SubDagOperator task itself blocks a worker while it monitors the SubDag for completion.) But could we perhaps refocus this effort on improving how SubDags monitor for completion?
   
   That said, there was some discussion that adding support for stateful tasks in Airflow would have a broader impact than just this rescheduling case.
   
   From my read through all these threads, the key open questions for a rescheduling approach similar to this PR seem to be:
   - Per @JonnyIncognito, we need strong consensus on the scope of idempotency. Does each (rescheduled) task instance have to be idempotent, or does a task need to be idempotent before succeeding or failing?
   - Where should the rescheduling logic live in the class structure? (It seems like it should be moved to BaseOperator.)
   - Is a TaskState model/table preferable to the changes it would take for XCom? And what other use cases should be considered in its schema?
   - Should we explore persisting information in the context object?
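   
   To make the idempotency and state questions a bit more concrete, here is a rough plain-Python sketch of the behavior under discussion: the first execution starts the external job and persists its id, and every rescheduled execution only reads that id back and pokes. Nothing here is an Airflow API. `FakeExternalService`, `run_once`, `run_until_done` and the dict used as `state` are all hypothetical stand-ins (the dict plays the role of whatever we end up with, be it a TaskState table or XCom).

```python
# Hypothetical sketch only: none of these names are Airflow APIs.


class FakeExternalService:
    """Stand-in for a long-running external job API."""

    def __init__(self, polls_until_done):
        self.polls_until_done = polls_until_done
        self.submitted = 0  # how many jobs were started in total
        self._polls = {}    # job_id -> number of status checks so far

    def submit(self):
        self.submitted += 1
        job_id = "job-{}".format(self.submitted)
        self._polls[job_id] = 0
        return job_id

    def is_done(self, job_id):
        self._polls[job_id] += 1
        return self._polls[job_id] >= self.polls_until_done


def run_once(state, key, service):
    """One execution of the task; returns 'reschedule' or 'success'.

    `state` is an assumed persistent store keyed by task instance.
    """
    job_id = state.get(key)
    if job_id is None:
        # First execution: start the job and persist its id, then
        # give the worker slot back instead of blocking on the job.
        state[key] = service.submit()
        return "reschedule"
    if service.is_done(job_id):
        return "success"
    # Job still running: free the worker again and poke later.
    return "reschedule"


def run_until_done(state, key, service, max_runs=10):
    """Simulate the scheduler re-running the task until it succeeds."""
    outcomes = []
    for _ in range(max_runs):
        outcome = run_once(state, key, service)
        outcomes.append(outcome)
        if outcome == "success":
            break
    return outcomes


service = FakeExternalService(polls_until_done=2)
state = {}
key = ("example_dag", "example_task", "2019-12-29")
outcomes = run_until_done(state, key, service)
```

   In this sketch no single execution is idempotent on its own (the first one submits, the later ones do not), but the sequence as a whole starts the job exactly once. That only holds because the job id survives between executions, which is the role the persisted state would play.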
   
   I'm not that familiar with the AIP process: is it something a small group of people from the community gets aligned on in a meeting / on Slack before filing, or does an individual just propose one?
   @Fokko @JonnyIncognito @dstandish @mik-laj you all seem to have a vested interest in this. Would you be open to scheduling a meeting, or some time to discuss synchronously over Slack?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services