Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/05/27 07:39:23 UTC

[GitHub] [airflow] hafid-d opened a new issue #16107: Airflow backfilling can't be disabled

hafid-d opened a new issue #16107:
URL: https://github.com/apache/airflow/issues/16107


   **Apache Airflow version**: 2.0.2
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   - **Cloud provider or hardware configuration**:
   - **OS** :  Ubuntu 18.04.3 
   - **Install tools**: celery = 4.4.7, redis = 3.5.3
   
   **What happened**:
   I noticed odd behavior with Airflow backfilling: my previous DAG runs are still being queued and run even after doing the following:
   
   - Setting catchup_by_default=False in airflow.cfg
   - Setting catchup=False in the DAG definition
   - Using LatestOnlyOperator
   
   **What you expected to happen**:
   I expect the old DAGs not to run again.
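
One way to double-check the first item in the list above is to confirm which value the scheduler would actually pick up. The sketch below is a simplified stand-in, not Airflow's own configuration loader: it assumes the option lives in the `[scheduler]` section of `airflow.cfg` and that an `AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT` environment variable, if set, overrides the file. The helper name `effective_catchup_default` and the boolean parsing are hypothetical.

```python
import configparser
import os


def effective_catchup_default(cfg_text, environ=None):
    """Return the catchup default implied by cfg_text and environ, or None if unset.

    Hypothetical helper: a simplified model of config precedence, not Airflow code.
    """
    environ = os.environ if environ is None else environ
    # Assumption: AIRFLOW__<SECTION>__<KEY> variables take precedence over the file.
    override = environ.get("AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT")
    if override is not None:
        # Simplified boolean parsing for this sketch.
        return override.strip().lower() in ("true", "t", "1")
    # interpolation=None avoids tripping over '%' characters in other options.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.getboolean("scheduler", "catchup_by_default", fallback=None)


cfg = "[scheduler]\ncatchup_by_default = False\n"
print(effective_catchup_default(cfg, environ={}))
```

If this returns `None` for the real file, the option is probably missing or sits under a different section.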
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] hafid-d edited a comment on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
hafid-d edited a comment on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-851827881


   @motherhubbard I tried 2.1.0 but still have the issue :-/ Did you update anything else?





[GitHub] [airflow] GergelyKalmar commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
GergelyKalmar commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871418376


   It works when enabling the DAG, but when the next scheduled run comes you should see a lot of instances being created (I think one for every second in the given minute). I know it is not exactly backfilling, but it kind of looked like it at first sight (just because of the many instances).
   
   I've checked your job and I could reproduce the weird behavior with https://github.com/aws/aws-mwaa-local-runner.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal edited a comment on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871295711


   > That is in line with what I observed! What is weird though is that Airflow would schedule a separate job for every second even if backfilling is disabled.
   
   So yes, it's a bug in croniter. Your cron expression is a valid one; it just doesn't mean what it should :)
   I suggest opening an issue at https://github.com/taichino/croniter/issues/





[GitHub] [airflow] GergelyKalmar edited a comment on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
GergelyKalmar edited a comment on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871418376


   It works as expected when enabling the DAG, but when the next scheduled run comes you should see a lot of instances being created (I think one for every second in the given minute). I know it is not exactly backfilling, but it kind of looked like it at first sight (just because of the many instances).
   
   I've checked your job and I could reproduce the weird behavior with https://github.com/aws/aws-mwaa-local-runner.
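
To make the "one for every second in the given minute" observation concrete, here is a rough count of matching instants per hour. It is only a toy model: it assumes the minute field `5/10` expands to 5, 15, ..., 55 and that the trailing `*` is read as seconds, as discussed elsewhere in this thread. The helper `expand_step` is hypothetical and is not croniter code.

```python
def expand_step(start, step, upper):
    """Expand a 'start/step' field into start, start+step, ... up to upper."""
    return list(range(start, upper + 1, step))


minutes = expand_step(5, 10, 59)  # minute field "5/10"
seconds = list(range(60))         # trailing "*" read as seconds (assumption)

fire_times_per_hour = len(minutes) * len(seconds)
print(minutes)              # [5, 15, 25, 35, 45, 55]
print(fire_times_per_hour)  # 360
```

Under the five-field reading the same schedule would fire only six times per hour, which would explain why the extra runs look like a backfill.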





[GitHub] [airflow] github-actions[bot] commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-894720854


   This issue has been closed because it has not received a response from the issue author.





[GitHub] [airflow] github-actions[bot] commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-890259879


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] eladkal commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871328572


   @GergelyKalmar can you be more specific about what the bug is?
   
   ```python
   from datetime import datetime
   
   from airflow.models import DAG
   from airflow.operators.bash import BashOperator
   
   with DAG(
       dag_id="16107",
       # Six-field expression: the trailing field is the one under discussion in this thread.
       schedule_interval='5/10 * * * * *',
       start_date=datetime(2018, 1, 1),
       catchup=False,  # do not create runs for past intervals
   ) as dag:
   
       BashOperator(task_id='try', bash_command='echo {{ ds }}')
   ```
   
   It behaves as expected:
   ![Screen Shot 2021-06-30 at 14 38 07](https://user-images.githubusercontent.com/45845474/123954221-d2db7000-d9b0-11eb-8a6b-67e74933c7b1.png)
   
   





[GitHub] [airflow] hafid-d commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
hafid-d commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-851827881


   @motherhubbard I tried 2.1.0 but still have the issue :-/





[GitHub] [airflow] GergelyKalmar commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
GergelyKalmar commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871287227


   That is in line with what I observed! What is weird though is that Airflow would schedule a separate job for every second even if backfilling is disabled.





[GitHub] [airflow] hafid-d commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
hafid-d commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-896185090


   Upgraded to 2.1.2 and still facing the issue even with `catchup_by_default = False` 





[GitHub] [airflow] eladkal commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-855593361


   Can you please add a reproducible example?





[GitHub] [airflow] uranusjr commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-869835607


   > the cron expressions were like `5/10 * * * * *`, notice it has one too many stars in it
   
   Could you open an issue for that? I think this is an invalid expression, and IMO we should prevent it from even being accepted. (I think this is a bug in `croniter`, but we should be able to perform additional validation in Airflow if they don’t want to fix it.)
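
As a rough illustration of the kind of additional validation suggested above, the check below rejects anything that is not a plain five-field expression. This is a minimal sketch, not Airflow's or croniter's actual behavior; the function name `check_cron_field_count` is hypothetical, and presets such as `@daily` are deliberately out of scope.

```python
def check_cron_field_count(expression):
    """Raise ValueError unless expression has exactly five whitespace-separated fields."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError(
            f"expected 5 cron fields, got {len(fields)}: {expression!r}"
        )
    return fields


check_cron_field_count("5/10 * * * *")        # accepted
try:
    check_cron_field_count("5/10 * * * * *")  # one star too many
except ValueError as err:
    print(err)
```

A stricter version would also validate each field's range, but counting fields alone would have caught the mistyped schedule in this thread.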





[GitHub] [airflow] GergelyKalmar commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
GergelyKalmar commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-869559200


   I had a similar issue – in my case it was a mistyped cron that seemingly caused backfills. The trigger seemed to work fine but the intervals were wrong (the cron expressions were like `5/10 * * * * *`, notice it has one too many stars in it).





[GitHub] [airflow] motherhubbard commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
motherhubbard commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-850353835


   I also had this issue in 2.0.2. I've just bumped to 2.1.0 and it seems to be fixed there.





[GitHub] [airflow] github-actions[bot] closed issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #16107:
URL: https://github.com/apache/airflow/issues/16107


   








[GitHub] [airflow] eladkal commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871295711


   > That is in line with what I observed! What is weird though is that Airflow would schedule a separate job for every second even if backfilling is disabled.
   
   So yes, it's a bug in croniter. Your cron expression is a valid one; it just doesn't mean what it should :)





[GitHub] [airflow] eladkal commented on issue #16107: Airflow backfilling can't be disabled

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #16107:
URL: https://github.com/apache/airflow/issues/16107#issuecomment-871215170


   > the cron expressions were like 5/10 * * * * *, notice it has one too many stars in it
   
   This is a valid cron expression. The 6th parameter means [year](http://www.nncron.ru/help/EN/working/cron-format.htm).
   However, there seems to be a bug in croniter: it interprets the 6th parameter as **seconds**: https://github.com/taichino/croniter/issues/76
   ```
   >>> from croniter import croniter
   >>> croniter('5/10 * * * * *')
   <croniter.croniter.croniter object at 0x10b07c190>
   >>> croniter('5/10 * * * * 1')
   <croniter.croniter.croniter object at 0x10af92750>
   >>> croniter('5/10 * * * * 59')
   <croniter.croniter.croniter object at 0x10b07c190>
   >>> croniter('5/10 * * * * 60')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 154, in __init__
       self.expanded, self.nth_weekday_of_month = self.expand(expr_format, hash_id=hash_id)
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 759, in expand
       return cls._expand(expr_format, hash_id=hash_id)
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 725, in _expand
       expr_format))
   croniter.croniter.CroniterBadCronError: [5/10 * * * * 60] is not acceptable, out of range
   >>> croniter('5/10 * * * * 2000')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 154, in __init__
       self.expanded, self.nth_weekday_of_month = self.expand(expr_format, hash_id=hash_id)
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 759, in expand
       return cls._expand(expr_format, hash_id=hash_id)
     File "/usr/local/lib/python3.7/site-packages/croniter/croniter.py", line 725, in _expand
       expr_format))
   croniter.croniter.CroniterBadCronError: [5/10 * * * * 2000] is not acceptable, out of range
   ```
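
The accept/reject pattern in the session above is what a 0-59 range check on the last field would produce. The snippet below is a toy model of that reading, not croniter's implementation; `fits_seconds_range` is a hypothetical helper that handles only `*` and plain integers.

```python
def fits_seconds_range(field):
    """Return True if a '*' or plain-integer field fits the 0-59 range of seconds."""
    if field == "*":
        return True
    return field.isdigit() and 0 <= int(field) <= 59


for last_field in ["*", "1", "59", "60", "2000"]:
    print(last_field, fits_seconds_range(last_field))
```

A year field would be expected to accept `2000`, so its rejection alongside `60` points to the seconds interpretation.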

