Posted to jobs@airflow.apache.org by GitBox <gi...@apache.org> on 2023/01/10 08:08:21 UTC
[GitHub] [airflow]: Workflow run "Build images for Disable GitSync + Persistence combo in the Helm Chart https://github.com/apache/airflow/pull/28822" failed!
The GitHub Actions job "Build images for Disable GitSync + Persistence combo in the Helm Chart https://github.com/apache/airflow/pull/28822" on airflow.git has failed.
Run started by GitHub user potiuk (triggered by potiuk).
Head commit for run:
0caed7a3ba09e1691b9009ad2bd728d5ea2a9d87 / Jarek Potiuk <ja...@potiuk.com>
Disable GitSync + Persistence combo in the Helm Chart
Git Sync and Persistence for DAGs make very little sense together,
and the combination largely misleads our users about what it does.
Git Sync provides atomicity of DAG folder synchronisation by
checking out a complete copy of the DAGs folder and swapping
the symbolic link that points to it. It does not play well with
networked persistence.
The combination makes it very easy for users unaware of how git-sync
and persistence work under the hood to walk into several traps:
* git-sync on persistent remote volumes such as EFS generates a LOT
of extra traffic because of the way git-sync works (it creates a
second working folder for DAGs and replaces the symbolic link to the
folder), which effectively forces a full sync of the whole DAG folder
for all involved instances with every commit
* because that sync is distributed over multiple clients of the
persistent volume, it loses the atomicity property of git-sync;
in the case above, where there are bursts of synchronisation between
multiple nodes, it is very likely to trigger inconsistent DAG parsing
* the problem is amplified when the network volumes are distributed among
multiple nodes and there are networking limits (for example
when IOPS are not provisioned in EFS). The amount of traffic generated
at sync might cause even more inconsistencies, solvable only by paying
for extra IOPS (which would not normally be needed)
* users might be tricked into trying to use gitSync and also update
DAGs using persistence (basically combining the development-friendly
DAG distribution over persistent volumes with production-ready
git-sync) without being aware that git-sync will override the
manually synced DAGs when swapping the symbolic links
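The checkout-and-swap pattern described above can be sketched roughly as
follows. This is a minimal illustration only, not git-sync's actual code:
the function name publish_checkout, the link name "current" and the
revision folder names are all hypothetical, and the atomicity of the
rename is assumed for a local POSIX filesystem (the traps listed above
are about what happens when that assumption stops holding on a networked
volume).

```python
import os
import tempfile
from pathlib import Path


def publish_checkout(root: Path, checkout_name: str, link_name: str = "current") -> Path:
    """Point root/link_name at root/checkout_name by renaming a temporary symlink.

    Hypothetical helper for illustration; not part of git-sync or Airflow.
    """
    target = root / checkout_name
    if not target.is_dir():
        raise FileNotFoundError(target)

    link = root / link_name
    tmp_link = root / f".{link_name}.tmp"

    # Clean up a leftover temporary link from an interrupted earlier run.
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()

    # Create the new link under a temporary name, using a relative target.
    os.symlink(checkout_name, tmp_link)

    # Rename over the old link. Readers on a local POSIX filesystem see
    # either the old checkout or the new one, never a half-updated folder.
    os.replace(tmp_link, link)
    return link


def demo() -> tuple:
    """Publish two complete checkouts in turn and read through the link."""
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        for rev, body in (("rev-a", "v1"), ("rev-b", "v2")):
            (root / rev).mkdir()
            (root / rev / "dag.py").write_text(body)

        publish_checkout(root, "rev-a")
        first = (root / "current" / "dag.py").read_text()

        publish_checkout(root, "rev-b")
        second = (root / "current" / "dag.py").read_text()
        return first, second
```

Note that every publish step needs a complete second copy of the folder,
and anything written by hand into the previously linked folder is no
longer visible once the link moves, which matches the last trap above.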
Closes: #27545
Closes: #27476
Closes: #27080
Related: #27124
Report URL: https://github.com/apache/airflow/actions/runs/3881381091
With regards,
GitHub Actions via GitBox