Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/25 19:34:55 UTC

[GitHub] [airflow] IanDoarn opened a new issue #16667: Scheduled runs on Monday UTC marked success but no tasks are run

IanDoarn opened a new issue #16667:
URL: https://github.com/apache/airflow/issues/16667


   **Apache Airflow version**: 2.0.1
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):  
   ```
   Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2018-10-10T16:38:01Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"darwin/amd64"}
   Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.16", GitCommit:"7a98bb2b7c9112935387825f2fce1b7d40b76236", GitTreeState:"clean", BuildDate:"2021-02-17T11:52:32Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
   ```
   
   **Environment**: 
   
   - **Cloud provider or hardware configuration**: Cloud
   - **OS** (e.g. from /etc/os-release): Debian (Docker image)
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**: Docker python:3.8-buster, deployed to internal k8s clusters
   
   **What happened**:
   DagRuns scheduled to run on Monday UTC time via a cron schedule are marked as success but the tasks are never run.
   
   ![image](https://user-images.githubusercontent.com/22099881/123475237-b90ce800-d5c0-11eb-939d-4f4dc0cb7d19.png)
   ![image](https://user-images.githubusercontent.com/22099881/123475413-fc675680-d5c0-11eb-92ac-16b64623c797.png)
   
   
   Relevant `airflow.cfg` value:
   
   `default_timezone = utc`
   
   
   
   
   **What you expected to happen**:
   
   The DagRuns scheduled for Monday should execute their tasks, but they do not. Below is an excerpt from one of the DAGs we are running.
   
   ```python
   from airflow import DAG
   from airflow.operators.bash import BashOperator
   from airflow.utils.dates import days_ago

   # CONFIG is not shown in this excerpt.

   default_args = {
       'owner': 'prepayde',
       'email': ['foobar@baz.com'],
       'start_date': days_ago(1),  # recomputed every time the file is parsed
       'retries': 0
   }

   with DAG(
       dag_id="DUMMY_DAG",
       default_args=default_args,
       catchup=False,
       max_active_runs=1,
       schedule_interval='30 1 * * 2-6',
       tags=[CONFIG['client_name'].lower()],
   ) as dag:
       dag.doc_md = __doc__

       start_task = BashOperator(
           task_id="start_task",
           bash_command="echo start",
       )

       end_task = BashOperator(
           task_id="end_task",
           bash_command="echo end",
       )

       start_task >> end_task
   ```
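
For reference, here is a minimal standard-library sketch of which weekdays the schedule `30 1 * * 2-6` selects. It assumes standard cron day-of-week numbering (0 = Sunday), so `2-6` means Tuesday through Saturday. The helper names `matches` and `ticks` are hypothetical and are not part of Airflow. It only illustrates the cron expression and is not a diagnosis of the scheduler.

```python
from datetime import datetime, timedelta, timezone

CRON_DOW = range(2, 7)  # "2-6" in standard cron numbering, where 0 = Sunday


def matches(ts: datetime) -> bool:
    """Return True if ts falls on a '30 1 * * 2-6' tick."""
    cron_dow = ts.isoweekday() % 7  # Sunday -> 0, Monday -> 1, ..., Saturday -> 6
    return ts.hour == 1 and ts.minute == 30 and cron_dow in CRON_DOW


def ticks(start: datetime, days: int) -> list:
    """Collect the 01:30 ticks over `days` consecutive days beginning at `start`."""
    first = start.replace(hour=1, minute=30, second=0, microsecond=0)
    candidates = (first + timedelta(days=i) for i in range(days))
    return [c for c in candidates if matches(c)]


# One full week, Sunday 2021-06-20 through Saturday 2021-06-26 (UTC).
week = ticks(datetime(2021, 6, 20, tzinfo=timezone.utc), 7)
fired_on = [t.isoweekday() for t in week]  # ISO numbering: Monday = 1 ... Sunday = 7
print(fired_on)
```

Under that numbering there is no tick on Sunday or Monday, so the first tick after Saturday 01:30 UTC is Tuesday 01:30 UTC.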
   
   **How to reproduce it**:
   
   
   **Anything else we need to know**:
   Our scheduler config:
   
   ```
   [scheduler]
   # Task instances listen for external kill signal (when you clear tasks
   # from the CLI or the UI), this defines the frequency at which they should
   # listen (in seconds).
   job_heartbeat_sec = 5
   
   # How often (in seconds) to check and tidy up 'running' TaskInstances
   # that no longer have a matching DagRun
   clean_tis_without_dagrun_interval = 15.0
   
   # The scheduler constantly tries to trigger new tasks (look at the
   # scheduler section in the docs for more information). This defines
   # how often the scheduler should run (in seconds).
   scheduler_heartbeat_sec = 5
   
   # The number of times to try to schedule each DAG file
   # -1 indicates unlimited number
   num_runs = -1
   
   # The number of seconds to wait between consecutive DAG file processing
   processor_poll_interval = 1
   
   # After how much time (in seconds) new DAGs should be picked up from the filesystem
   min_file_process_interval = 0
   
   # How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
   dag_dir_list_interval = 300
   
   # How often should stats be printed to the logs. Setting to 0 will disable printing stats
   print_stats_interval = 30
   
   # How often (in seconds) should pool usage stats be sent to statsd (if statsd_on is enabled)
   pool_metrics_interval = 5.0
   
   # If the last scheduler heartbeat happened more than scheduler_health_check_threshold
   # ago (in seconds), scheduler is considered unhealthy.
   # This is used by the health check in the "/health" endpoint
   scheduler_health_check_threshold = 30
   
   # How often (in seconds) should the scheduler check for orphaned tasks and SchedulerJobs
   orphaned_tasks_check_interval = 300.0
   child_process_log_directory = $AIRFLOW_HOME/logs/scheduler
   
   # Local task jobs periodically heartbeat to the DB. If the job has
   # not heartbeat in this many seconds, the scheduler will mark the
   # associated task instance as failed and will re-schedule the task.
   scheduler_zombie_task_threshold = 300
   
   # Turn off scheduler catchup by setting this to ``False``.
   # Default behavior is unchanged and
   # Command Line Backfills still work, but the scheduler
   # will not do scheduler catchup if this is ``False``,
   # however it can be set on a per DAG basis in the
   # DAG definition (catchup)
   catchup_by_default = True
   
   # This changes the batch size of queries in the scheduling main loop.
   # If this is too high, SQL query performance may be impacted by one
   # or more of the following:
   # - reversion to full table scan
   # - complexity of query predicate
   # - excessive locking
   # Additionally, you may hit the maximum allowable query length for your db.
   # Set this to 0 for no limit (not advised)
   max_tis_per_query = 512
   
   # Should the scheduler issue ``SELECT ... FOR UPDATE`` in relevant queries.
   # If this is set to False then you should not run more than a single
   # scheduler at once
   use_row_level_locking = True
   
   # Max number of DAGs to create DagRuns for per scheduler loop
   #
   # Default: 10
   # max_dagruns_to_create_per_loop =
   
   # How many DagRuns should a scheduler examine (and lock) when scheduling
   # and queuing tasks.
   #
   # Default: 20
   # max_dagruns_per_loop_to_schedule =
   
   # Should the Task supervisor process perform a "mini scheduler" to attempt to schedule more tasks of the
   # same DAG. Leaving this on will mean tasks in the same DAG execute quicker, but might starve out other
   # dags in some circumstances
   #
   # Default: True
   # schedule_after_task_execution =
   
   # The scheduler can run multiple processes in parallel to parse dags.
   # This defines how many processes will run.
   parsing_processes = 2
   
   # Turn off scheduler use of cron intervals by setting this to False.
   # DAGs submitted manually in the web UI or with trigger_dag will still run.
   use_job_schedule = True
   
   # Allow externally triggered DagRuns for Execution Dates in the future
   # Only has effect if schedule_interval is set to None in DAG
   allow_trigger_in_future = False
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] willsims14 removed a comment on issue #16667: Scheduled DagRun runs on Monday UTC marked success but no tasks are run

Posted by GitBox <gi...@apache.org>.
willsims14 removed a comment on issue #16667:
URL: https://github.com/apache/airflow/issues/16667#issuecomment-868791188


   Works on my machine
   ![index](https://user-images.githubusercontent.com/20601403/123476908-efe3fd80-d5c2-11eb-8af9-a2d98170a2af.jpg)
   





[GitHub] [airflow] IanDoarn closed issue #16667: Scheduled DagRun runs on Monday UTC marked success but no tasks are run

Posted by GitBox <gi...@apache.org>.
IanDoarn closed issue #16667:
URL: https://github.com/apache/airflow/issues/16667


   





[GitHub] [airflow] IanDoarn commented on issue #16667: Scheduled DagRun runs on Monday UTC marked success but no tasks are run

Posted by GitBox <gi...@apache.org>.
IanDoarn commented on issue #16667:
URL: https://github.com/apache/airflow/issues/16667#issuecomment-869735181


   The issue was solved by setting `start_date` to a fixed `datetime()` well before the current date.
   
   https://github.com/apache/airflow/issues/16694
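
One plausible reading of why the fixed date helps, sketched with the standard library only. It rests on two assumptions that should be checked against the Airflow version in use: a `days_ago(1)`-style value evaluates to midnight UTC of the current day minus one day, and a task is only included in a run whose execution date is not earlier than the task's `start_date`. The helpers `rolling_start_date` and `task_is_in_run` are hypothetical stand-ins, not Airflow APIs, and the fixed date is an arbitrary example.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def rolling_start_date(now: datetime, n_days: int = 1) -> datetime:
    """Stand-in for a days_ago(n)-style value: midnight UTC of `now`, minus n days."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n_days)


def task_is_in_run(start_date: datetime, execution_date: datetime) -> bool:
    """Assumed rule: a task joins a run only if start_date <= execution_date."""
    return start_date <= execution_date


# Tuesday 2021-06-22 01:30 UTC: the run created at this tick covers the
# interval that began at the previous tick, Saturday 2021-06-19 01:30 UTC.
now = datetime(2021, 6, 22, 1, 30, tzinfo=UTC)
execution_date = datetime(2021, 6, 19, 1, 30, tzinfo=UTC)

rolling = rolling_start_date(now)         # Monday 2021-06-21 00:00 UTC
fixed = datetime(2021, 1, 1, tzinfo=UTC)  # hypothetical fixed start date

print(task_is_in_run(rolling, execution_date))  # False: the run would have no tasks
print(task_is_in_run(fixed, execution_date))    # True: the tasks are part of the run
```

If those assumptions hold, the run that spans the weekend gap ends up with no task instances and can be marked successful without anything executing. In the DAG file, the equivalent change is to replace `days_ago(1)` in `default_args` with a literal such as `datetime(2021, 1, 1)`.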








[GitHub] [airflow] willsims14 commented on issue #16667: Scheduled DagRun runs on Monday UTC marked success but no tasks are run

Posted by GitBox <gi...@apache.org>.
willsims14 commented on issue #16667:
URL: https://github.com/apache/airflow/issues/16667#issuecomment-868791188


   Works on my machine
   ![index](https://user-images.githubusercontent.com/20601403/123476908-efe3fd80-d5c2-11eb-8af9-a2d98170a2af.jpg)
   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #16667: Scheduled runs on Monday UTC marked success but no tasks are run

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #16667:
URL: https://github.com/apache/airflow/issues/16667#issuecomment-868789545


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

