Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/26 17:11:19 UTC
[GitHub] [airflow] CSammy opened a new issue #19844: catchup=False is ignored to some extent, backfill happens
CSammy opened a new issue #19844:
URL: https://github.com/apache/airflow/issues/19844
### Apache Airflow version
2.2.2 (latest released)
### Operating System
Debian GNU/Linux 10 (buster) / official Airflow Docker image
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.4.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==2.1.0
apache-airflow-providers-docker==2.3.0
apache-airflow-providers-elasticsearch==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.1.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.3.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.3.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.2.0
apache-airflow-providers-slack==4.1.0
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.3.0
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
Deployment via Helm chart on GKE. Helm chart v1.3.0, with the Docker tag pinned to `2.2.2-python3.9`. Isolated namespace on Kubernetes 1.16.
Customization:
- git-sync activated
I can provide the full output of `airflow info` if desired.
Since the question came up in a previous conversation: the executor is `CeleryExecutor`.
### What happened
In a DAG with KubernetesPodOperators, the following settings were used:
```python
schedule_interval="0 0 * * 6",
start_date=datetime.datetime(2021, 11, 1),
catchup=False,
```
When running the DAG via the Airflow UI, backfill jobs for the dates `2021-11-13` and `2021-11-20` are created and run.
### What you expected to happen
I expected a single job for today to be created and run, with no backfill jobs.
### How to reproduce
Complete DAG file:
```python
import datetime
import os

from airflow import DAG
# Deprecated contrib import path, kept as in the original report; the provider path is
# airflow.providers.cncf.kubernetes.operators.kubernetes_pod.
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Not shown in the original report; an empty placeholder so the file parses.
default_args = {}

with DAG(
    dag_id="debug_dag",
    # Saturday midnight
    schedule_interval="0 0 * * 6",
    start_date=datetime.datetime(2021, 11, 1),
    catchup=False,
    tags=["debug dag for catchup tests"],
    default_args=default_args,
) as dag:
    gcp_test_task = KubernetesPodOperator(
        # Task name in Airflow 2 UI
        task_id="gcp-test-task",
        # Pod name
        name="task-gcp-test-task",
        image="google/cloud-sdk:slim",
        cmds=["sleep", "300"],
        namespace=os.environ["K8S_NAMESPACE"],
        # K8s service account linked to the GCP service account
        service_account_name="airflow2-dag-default",
        image_pull_policy="Always",
        get_logs=True,
    )

    gcp_test_task
```
Click on the "Run" button to see backfill jobs being created.
### Anything else
This behaviour has been reproducible with multiple DAGs having this `schedule_interval` and `start_date`.
However, it is not reproducible in the same way with `schedule_interval="10 3 * * *", start_date=datetime.datetime(2021, 11, 1), catchup=False`. For that DAG, the UI shows "Next Run: 2021-11-25 03:10:00" (which is still not what I expected, but it is not backfilling the entire month).
Possibly this is a misunderstanding about scheduling and/or backfill on my part.
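
For reference, here is a minimal sketch of the interval arithmetic that may be behind the dates above. It assumes Airflow's documented behaviour that a run is labelled with the start of its data interval, and that `catchup=False` schedules only the most recent complete interval. The helper `latest_complete_interval` is hypothetical (it is not an Airflow API), it models each cron expression as a fixed-period schedule, and the observation time is assumed to be the timestamp of this report. It is not a reproduction of the scheduler and does not by itself explain a second run.

```python
import datetime


def latest_complete_interval(first_tick, period, now):
    """Return (start, end) of the most recent interval fully elapsed by `now`.

    Hypothetical helper, not an Airflow API: it models a cron schedule with a
    fixed period as ticks at first_tick + k * period.
    """
    if now < first_tick + period:
        return None  # no interval has completed yet
    completed = (now - first_tick) // period  # whole periods elapsed so far
    end = first_tick + completed * period
    return end - period, end


# Assumed observation time: the timestamp of this report (UTC).
now = datetime.datetime(2021, 11, 26, 17, 11, 19)

# "0 0 * * 6": first Saturday midnight on or after start_date 2021-11-01.
weekly = latest_complete_interval(
    datetime.datetime(2021, 11, 6), datetime.timedelta(weeks=1), now
)

# "10 3 * * *": first 03:10 on or after start_date 2021-11-01.
daily = latest_complete_interval(
    datetime.datetime(2021, 11, 1, 3, 10), datetime.timedelta(days=1), now
)

print("weekly:", weekly)
print("daily: ", daily)
```

Under these assumptions, the weekly schedule's latest complete interval runs from 2021-11-13 to 2021-11-20, and the daily schedule's from 2021-11-25 03:10 to 2021-11-26 03:10. The interval starts match the `2021-11-13` run and the "Next Run: 2021-11-25 03:10:00" label reported above.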
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #19844: catchup=False is ignored to some extent, backfill happens
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #19844:
URL: https://github.com/apache/airflow/issues/19844#issuecomment-980163993
Thanks for opening your first issue here! Be sure to follow the issue template!