Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/26 16:21:08 UTC

[GitHub] [airflow] potiuk commented on issue #9555: AIP-5 Remote DagFetcher

potiuk commented on issue #9555:
URL: https://github.com/apache/airflow/issues/9555#issuecomment-751371128


   I believe whether and how to implement this should be discussed on the devlist. This is a big change to Airflow, and the current consensus is that fetching DAGs is "external" to Airflow: there are multiple solutions for fetching the DAGs (Git Sync, GCS/S3 sync, shared volumes, sync sidecars in Kubernetes, etc.). I think we are rather far from reaching an understanding on:
   
   * whether we should do anything at all in Airflow
   * whether it should handle pull, push, or both
   * whether it should be part of the scheduler or an external entity doing the sync
   * whether we should have an API for that (if the push model is to be supported)
   * how to deal with "code packages" (i.e. how to ensure atomicity of several DAGs + dependencies)
   
   And last but not least - how it plays together with [DAG Versioning](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-36+DAG+Versioning).
   
   DAG Versioning is another AIP, one that is much closer to being fully fleshed out and to reaching consensus. It was dropped from the 2.0 release only because we wanted to make sure we deliver 2.0 this year. Some of the questions there (especially atomicity of changes across several dependent files) are common to DAGFetcher and DAG Versioning and, I believe, need to be answered together.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org