Posted to commits@airflow.apache.org by "dungnguyen (JIRA)" <ji...@apache.org> on 2019/07/09 18:53:00 UTC

[jira] [Created] (AIRFLOW-4921) scheduler stuck with schedule is 2.30am and timezone is daylight saving time

dungnguyen created AIRFLOW-4921:
-----------------------------------

             Summary: scheduler stuck with schedule is 2.30am and timezone is daylight saving time
                 Key: AIRFLOW-4921
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4921
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.10.3
            Reporter: dungnguyen


Please correct me if I'm wrong, but I can still reproduce the issue with the code below:

 
{code:python}
from datetime import datetime

import pendulum
from airflow import DAG
from airflow.jobs import SchedulerJob

tz = pendulum.timezone('America/New_York')

# Daily DAG scheduled at 02:30 local time, a wall-clock time that is
# skipped on the day daylight saving time starts.
test_dag = DAG(
    dag_id='foo',
    start_date=datetime(2017, 3, 6, tzinfo=tz),
    schedule_interval='30 02 * * *',
    catchup=True,
)

# Manually call create_dag_run repeatedly to create successive DAG runs.
s = SchedulerJob()
for _ in range(465):
    dag_run = s.create_dag_run(test_dag)
    print(dag_run, datetime.now())
{code}
 

The output stops advancing at the date of the daylight saving time change:

 
{code:java}
[2019-07-09 14:46:38,014] {__init__.py:51} INFO - Using executor SequentialExecutor
<DagRun foo @ 2017-03-06 07:30:00+00:00: scheduled__2017-03-06T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:38.268086
<DagRun foo @ 2017-03-07 07:30:00+00:00: scheduled__2017-03-07T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:38.460748
<DagRun foo @ 2017-03-08 07:30:00+00:00: scheduled__2017-03-08T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:38.649766
<DagRun foo @ 2017-03-09 07:30:00+00:00: scheduled__2017-03-09T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:38.838063
<DagRun foo @ 2017-03-10 07:30:00+00:00: scheduled__2017-03-10T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:39.030536
<DagRun foo @ 2017-03-11 07:30:00+00:00: scheduled__2017-03-11T07:30:00+00:00, externally triggered: False> 2019-07-09 14:46:39.219713
<DagRun foo @ 2017-03-12 06:30:00+00:00: scheduled__2017-03-12T06:30:00+00:00, externally triggered: False> 2019-07-09 14:46:39.408905

{code}
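For context, 02:30 does not exist as a local time in America/New_York on 2017-03-12, because clocks jump from 02:00 to 03:00 that morning. The sketch below illustrates this with the standard-library zoneinfo module (Python 3.9+ with a time zone database available) rather than pendulum; this is my assumption for illustration only, and it does not reproduce Airflow's own date handling.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

ny = ZoneInfo("America/New_York")

# 02:30 on 2017-03-12 falls in the gap created when clocks jump from 02:00 to 03:00.
wall = datetime(2017, 3, 12, 2, 30, tzinfo=ny)

# Convert to UTC and back; a wall-clock time that really exists would survive this.
as_utc = wall.astimezone(timezone.utc)
round_trip = as_utc.astimezone(ny)

print("requested :", wall.strftime("%Y-%m-%d %H:%M"))
print("as UTC    :", as_utc.strftime("%Y-%m-%d %H:%M"))
print("round trip:", round_trip.strftime("%Y-%m-%d %H:%M %Z"))
```

The round trip does not return the requested wall-clock time, which is the kind of ambiguity a daily 02:30 schedule runs into on that date.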
 

I suspect the cause is a loop in airflow/jobs/scheduler_job.py. The loop never terminates when dag.following_schedule(next_run_date) returns its argument unchanged, because next_run_date then never advances past last_run.execution_date:

 
{code:python}
# make sure backfills are also considered
last_run = dag.get_last_dagrun(session=session)
if last_run and next_run_date:
  while next_run_date <= last_run.execution_date:
    next_run_date = dag.following_schedule(next_run_date)
{code}
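One possible way to make such a loop fail loudly instead of spinning is to check that each step actually moves forward. The following is only a sketch under stated assumptions, not a proposed Airflow patch: advance_past, fake_following_schedule and STUCK_AT are hypothetical names, and the stub merely imitates a following_schedule that returns its input unchanged at one date.

```python
from datetime import datetime, timedelta


def advance_past(next_run_date, last_execution_date, following_schedule):
    """Advance next_run_date until it is later than last_execution_date.

    Raises RuntimeError instead of looping forever when following_schedule
    stops making progress.
    """
    while next_run_date <= last_execution_date:
        candidate = following_schedule(next_run_date)
        if candidate is None or candidate <= next_run_date:
            raise RuntimeError(
                "following_schedule made no progress at %s" % next_run_date
            )
        next_run_date = candidate
    return next_run_date


# Hypothetical stand-in for dag.following_schedule: it steps one day at a
# time but returns its input unchanged at one date, imitating the report.
STUCK_AT = datetime(2017, 3, 12, 6, 30)


def fake_following_schedule(dt):
    if dt == STUCK_AT:
        return dt
    return dt + timedelta(days=1)
```

With this guard, calling advance_past(STUCK_AT, STUCK_AT, fake_following_schedule) raises RuntimeError rather than hanging, while dates that do advance are handled as in the original loop.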
 

Please let me know if I need to provide more information.

 


