Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/09/20 23:16:22 UTC

[GitHub] [airflow] kimyen edited a comment on issue #18317: Better Backfill User Experience

kimyen edited a comment on issue #18317:
URL: https://github.com/apache/airflow/issues/18317#issuecomment-923415688


   We have implemented 2 versions of backfill tools for our users. I want to share them here and suggest another version that may be more general to add to Airflow.
   
   ### Backfill Queue
   
   In this version, we were using CeleryExecutor.
   
   #### User experience
   
   Users go to a UI page to trigger a backfill. They submit a form specifying the DAG, the list of tasks, the date range, and some flags (run backwards, mark success, ...).
   
   After the submission is accepted, the user goes to another page to watch the status of the backfill (queued, running with its success percentage, and a list of backfills "aborted" due to unrecoverable errors or a backfill deadlock). From this page, the author of a backfill can also abort it while it is queued or running, or retry it from the point where it failed.
   
   #### Extra components added
   
   - Redis queues to hold the queued backfill submissions, running backfill chunks, and aborted backfills.
   - Backfill UI plugins, which render the form and the status page.
   - Backfill API, which handles reads and writes to the Redis queues and is used by the UI component.
   - Backfill "worker", which dequeues and runs the backfill (via Python) with timeout and retry.
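
   The worker behaviour described above (dequeue, run with a timeout, retry, and park failures in an "aborted" list) can be sketched roughly as follows. This is a minimal illustration and not our actual implementation: in-memory `deque` objects stand in for the Redis queues, and `Submission`, `run_with_timeout`, `drain`, and the `max_attempts` and `timeout_s` parameters are hypothetical names introduced for this example.

```python
import subprocess
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Submission:
    """One queued backfill request (hypothetical shape)."""
    dag_id: str
    command: List[str]  # full command line that performs the backfill
    attempts: int = 0


def run_with_timeout(command: List[str], timeout_s: float) -> bool:
    """Run the command; a non-zero exit, a timeout, or a launch error is a failure."""
    try:
        return subprocess.run(command, timeout=timeout_s).returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False


def drain(
    queued: Deque[Submission],
    aborted: Deque[Submission],
    runner: Callable[[List[str], float], bool] = run_with_timeout,
    max_attempts: int = 3,
    timeout_s: float = 3600.0,
) -> List[Submission]:
    """Process submissions one at a time, retrying each up to max_attempts.

    Submissions that never succeed are moved to the aborted queue.
    Returns the submissions that succeeded, in processing order.
    """
    succeeded: List[Submission] = []
    while queued:
        sub = queued.popleft()
        ok = False
        while sub.attempts < max_attempts and not ok:
            sub.attempts += 1
            ok = runner(sub.command, timeout_s)
        (succeeded if ok else aborted).append(sub)
    return succeeded
```

   Because the loop handles a single submission at a time, it also shows why later submissions wait in the queue while an earlier one is still running or retrying.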
   
   #### Disadvantages
   
   - Only 1 backfill runs at a time, so many backfills get stuck in the queue.
   - When a backfill is aborted, all running DAG runs remain in the `running` state and still require manual action.
   
   ### Backfill on demand
   
   In this version, we use the KubernetesExecutor.
   
   #### User experience
   
   The user issues a backfill. In our case, this takes the form of a chatops command, which is an HTTP request to the chatops server. The chatops server has access to our K8s cluster and brings up a pod that runs the backfill.
   
   The user can get logs and abort the backfill via other chatops commands (chatops server APIs).
   
   #### Extra components added
   
   - K8s Backfill template wrapped by `if backfill enabled` (defaults to `False`)
   - chatops server (which we already have for deploying DAG changes)
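
   To make the template idea concrete, here is a rough Python sketch of a pod manifest that is only produced when backfill is enabled. It assumes the pod runs the Airflow 2.x `airflow dags backfill` CLI, which is one possible way to run the backfill and not necessarily how our template does it. The function names, the `component: backfill` label, and the image and pod names in the usage note are hypothetical.

```python
from typing import Any, Dict, List, Optional


def backfill_args(
    dag_id: str,
    start_date: str,
    end_date: str,
    task_regex: Optional[str] = None,
    mark_success: bool = False,
    run_backwards: bool = False,
) -> List[str]:
    """Build an `airflow dags backfill` command line (Airflow 2.x CLI assumed)."""
    args = ["airflow", "dags", "backfill",
            "--start-date", start_date, "--end-date", end_date]
    if task_regex:
        args += ["--task-regex", task_regex]
    if mark_success:
        args.append("--mark-success")
    if run_backwards:
        args.append("--run-backwards")
    args.append(dag_id)
    return args


def backfill_pod(
    name: str,
    image: str,
    args: List[str],
    backfill_enabled: bool = False,
) -> Optional[Dict[str, Any]]:
    """Return a Pod manifest for the backfill, or None when the feature is off."""
    if not backfill_enabled:  # mirrors the `if backfill enabled` wrapper
        return None
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"component": "backfill"}},
        "spec": {
            # A backfill is a one-off job, so the pod should not be restarted.
            "restartPolicy": "Never",
            "containers": [{"name": "backfill", "image": image, "command": args}],
        },
    }
```

   For example, `backfill_pod("backfill-example", "my-airflow-image", backfill_args("example_dag", "2021-09-01", "2021-09-07"), backfill_enabled=True)` returns a manifest that a server with cluster access could submit, while the default call returns `None`.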
   
   #### Disadvantages
   
   - When a backfill is aborted, all ongoing DAG runs remain in the `running` state and still require manual action.
   
   ### Proposed solution
   
   #### User experience
   
   The user goes to the DAG page and triggers a backfill with a date range and other backfill flags.
   
   #### Extra components
   
   - K8s Backfill template as described above. The UI would use this template to bring up a new pod that runs the backfill based on the user's input.
   - UI pages that allow the user to select the date range and flags.
   - UI pages that show the status of the backfill or error messages from the backfill pod.
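
   As a rough sketch of what the two UI pieces might need behind them, the snippet below validates the submitted range and flags and maps a Kubernetes pod phase to a status label for the status page. Everything here is an assumption for illustration: `BackfillForm`, `status_for`, and the status labels are hypothetical, and only the pod phase names come from Kubernetes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class BackfillForm:
    """User input from the (hypothetical) backfill form."""
    dag_id: str
    start: date
    end: date
    run_backwards: bool = False
    mark_success: bool = False

    def errors(self) -> List[str]:
        """Return human-readable validation problems; empty means valid."""
        problems: List[str] = []
        if not self.dag_id:
            problems.append("dag_id is required")
        if self.start > self.end:
            problems.append("start date must not be after end date")
        return problems


# Kubernetes pod phases mapped to labels for the status page (labels are hypothetical).
PHASE_TO_STATUS = {
    "Pending": "queued",
    "Running": "running",
    "Succeeded": "success",
    "Failed": "failed",
}


def status_for(phase: str) -> str:
    """Translate a pod phase into a status label; anything else is 'unknown'."""
    return PHASE_TO_STATUS.get(phase, "unknown")
```

   A failed pod would still need its logs surfaced to show the actual error message, which this sketch does not cover.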
   
   
   **_I wanted to share our ideas and hope to converge with Airflow's implementation so that we do not need to manage our own patch. We are happy to see this issue being assigned. Please let me know if there is more information I can share to help push this further. Otherwise, we will be happy with whichever implementation Airflow chooses to support a better backfill experience._**

