Posted to jobs@airflow.apache.org by GitBox <gi...@apache.org> on 2023/01/10 08:08:21 UTC

[GitHub] [airflow]: Workflow run "Build images for Disable GitSync + Persistence combo in the Helm Chart https://github.com/apache/airflow/pull/28822 " failed!

The GitHub Actions job "Build images for Disable GitSync + Persistence combo in the Helm Chart https://github.com/apache/airflow/pull/28822" on airflow.git has failed.
Run started by GitHub user potiuk (triggered by potiuk).

Head commit for run:
0caed7a3ba09e1691b9009ad2bd728d5ea2a9d87 / Jarek Potiuk <ja...@potiuk.com>
Disable GitSync + Persistence combo in the Helm Chart

Git Sync and Persistence for DAGs make very little sense together,
and the combination largely misleads our users about what it does.

Git Sync provides atomicity of DAG folder synchronisation by
checking out a complete copy of the DAGs folder and swapping
the symbolic link that points to it. It does not play well with
networked persistence.
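
The checkout-and-swap idea described above can be sketched as follows.
This is only an illustration of an atomic symbolic link swap on a local
POSIX filesystem, not git-sync's actual code; the function name
publish_checkout and the link name "current" are hypothetical.

```python
import os


def publish_checkout(root: str, new_checkout: str, link_name: str = "current") -> str:
    """Repoint root/link_name at new_checkout in one atomic step.

    Hypothetical helper for illustration only; not taken from git-sync.
    """
    link_path = os.path.join(root, link_name)
    tmp_link = os.path.join(root, f".{link_name}.tmp-{os.getpid()}")

    # Clean up a leftover temporary link from an earlier, interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)

    # Create the new link under a temporary name first ...
    os.symlink(new_checkout, tmp_link)
    # ... then rename it over the old link. On a local POSIX filesystem the
    # rename is atomic, so readers see either the old or the new checkout.
    os.replace(tmp_link, link_path)
    return os.path.realpath(link_path)
```

The guarantee comes from the single rename on one local filesystem. The
traps listed below describe what happens once several clients share the
folder over a networked volume.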

It makes it very easy for users unaware of how git-sync and
persistence work under the hood to walk into several traps:

* git sync on persistent remote volumes such as EFS generates a LOT
  of extra traffic due to the way git sync works (it creates a
  second working folder for DAGs and replaces the symbolic link to it),
  which effectively forces a full sync of the whole DAG folder for all
  involved instances with every commit
* because that sync is distributed over multiple clients of the
  persistent volume, it loses the atomicity property of git sync, and
  in the above case, where there are bursts of synchronisation between
  multiple nodes, it is very likely to trigger inconsistent DAG parsing
* the problem amplifies when the network volumes are distributed among
  multiple nodes and there are some networking limits (for example
  non-provisioned IOPS in EFS). The amount of traffic generated at
  sync might cause even more inconsistencies - only solvable by paying
  for extra IOPS (which would not normally be needed)
* users might be tricked into trying to use gitSync and also update
  DAGs using persistence (basically combining the development-friendly
  DAG distribution over persistent volumes with production-ready
  git-sync) without being aware that git-sync will override the
  manually synced DAGs when swapping the symbolic links
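
As a rough illustration of rejecting the combination up front, the
sketch below checks a parsed values mapping and refuses it when both
options are on. It assumes the chart exposes the two switches as
dags.gitSync.enabled and dags.persistence.enabled; the function
check_dag_values is hypothetical and is not how the chart itself
implements the change.

```python
from typing import Any, Mapping


def check_dag_values(values: Mapping[str, Any]) -> None:
    """Raise ValueError if git-sync and DAG persistence are both enabled.

    Hypothetical pre-install check; the key names are assumptions.
    """
    dags = values.get("dags") or {}
    git_sync = bool((dags.get("gitSync") or {}).get("enabled", False))
    persistence = bool((dags.get("persistence") or {}).get("enabled", False))

    if git_sync and persistence:
        raise ValueError(
            "dags.gitSync.enabled and dags.persistence.enabled "
            "should not be enabled together"
        )
```

Calling check_dag_values on a mapping with only one of the two options
enabled returns silently; with both enabled it raises ValueError.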

Closes: #27545
Closes: #27476
Closes: #27080

Related: #27124

Report URL: https://github.com/apache/airflow/actions/runs/3881381091

With regards,
GitHub Actions via GitBox

