Posted to users@airflow.apache.org by Sandeep S <30...@gmail.com> on 2020/10/02 12:02:04 UTC

Re: Issues running only one active instance in a DAG

Hi All,

I am having a production issue running only one instance of a DAG at a time.
While the DAG has one instance running, a 2nd instance does not kick off. But
if any task fails in the active DAG instance, the DAG gets marked failed and
a 2nd instance kicks off after 5 mins (the DAG's scheduled interval is 5
mins).

Please help.

Regards
Sandeep

On Mon, Sep 28, 2020 at 1:18 PM Tavares Forby <tf...@qti.qualcomm.com>
wrote:

>
> Hi All,
>
> I am having a few issues with Airflow and task instance counts greater
> than 750. I am getting one consistent error and one error that happens
> randomly (understood, it's technically not random).
>
> Consistent error:
>
> [2020-09-25 12:28:01,703] {scheduler_job.py:237} WARNING - Killing PID 119970
> [2020-09-25 12:29:17,110] {scheduler_job.py:237} WARNING - Killing PID 121013
> [2020-09-25 12:29:17,110] {scheduler_job.py:237} WARNING - Killing PID 121013
> [2020-09-25 12:30:12,171] {scheduler_job.py:237} WARNING - Killing PID 123243
>
> Random error:
>
> [2020-09-27 19:37:25,127] {scheduler_job.py:771} INFO - Examining DAG run
> <DagRun tutorial_large_design_debug7 @ 2020-09-28 02:37:24+00:00:
> manual__2020-09-28T02:37:24+00:00, externally triggered: True>
> [2020-09-27 19:37:26,749] {logging_mixin.py:112} INFO - [2020-09-27
> 19:37:26,749] {dagrun.py:408} INFO - (MySQLdb._exceptions.IntegrityError)
> (1062, "Duplicate entry 'echo__a-tutorial_large_design_debug7-2020-09-28
> 02:37:24.000000' for key 'PRIMARY'")
> [SQL: INSERT INTO task_instance (task_id, dag_id, execution_date,
> start_date, end_date, duration, state, try_number, max_tries, hostname,
> unixname, job_id, pool, pool_slots, queue, priority_weight, operator,
> queued_dttm, pid, executor_config) VALUES (%s, %s, %s, %s, %s, %s, %s, %s,
> %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
>
> Please help! Thanks!
>
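The "Duplicate entry ... for key 'PRIMARY'" error in the quoted log points at
two task instances sharing the same (task_id, dag_id, execution_date) key,
which in a DAG that generates 750+ tasks usually means two tasks were created
with the same task_id. A quick sanity check over the generated ids can rule
that out; this is a generic sketch (the id-generation loop below is
hypothetical, not taken from Tavares's DAG):

```python
from collections import Counter

def find_duplicate_task_ids(task_ids):
    """Return, sorted, the task ids that occur more than once."""
    counts = Counter(task_ids)
    return sorted(tid for tid, n in counts.items() if n > 1)

# Example: a loop that accidentally reuses an id suffix.
generated = [f"echo__{suffix}" for suffix in ["a", "b", "c", "a"]]
print(find_duplicate_task_ids(generated))  # ['echo__a']
```

Running such a check at DAG-build time (and failing fast on a non-empty
result) surfaces the collision before the scheduler hits the database
constraint.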

Re: Issues running only one active instance in a DAG

Posted by Sunil Khaire <su...@gmail.com>.
Hi Sandeep ,

Looks good, you can skip max_active_runs.

Thanks ,
Sunil K

On Fri, 2 Oct 2020 at 7:20 PM Sandeep Shetty <sh...@gmail.com> wrote:

> Hi Sunil,
>
> Can you please confirm whether the below parameters should go in the
> default arguments or at the DAG level:
>
> max_active_runs=1
> 'depends_on_past': True
> 'wait_for_downstream': True
>
> Regards
> Sandeep

Re: Issues running only one active instance in a DAG

Posted by Sunil Khaire <su...@gmail.com>.
Please use wait_for_downstream=True.

This should fix the issue.

Thanks ,
Sunil K

On Fri, 2 Oct 2020 at 6:58 PM Sandeep Shetty <sh...@gmail.com> wrote:

> Hi Sunil,
>
> Let me add more details:
> Use Case: The DAG has multiple tasks and is scheduled to run every 5 mins.
> Actual result: The DAG kicks off a 2nd run every time there is a failure in
> the 1st run: the status of the 1st run is Failed, but a 2nd run kicks off
> after 5 mins.
> Expected result: The DAG should not kick off a 2nd run unless the first run
> completes successfully.
>
> DAG Code:
>
> default_args = {
>     'owner': 'xxx',
>     'depends_on_past': True,
>     'start_date': datetime(2020, 6, 15, tzinfo=local_tz),
>     'email': NOTIFY_EMAIL,
>     'email_on_failure': True,
> #    'email_on_retry': True,
> #    'retries': 1,
>     'domain': 'Mediasupplychain'
> #    'retry_delay': timedelta(minutes=30)
> }
>
> dag = DAG(DAG_NAME,
>           default_args=default_args,
>           schedule_interval='0 */3 * * *',
>           catchup=False,
>           max_active_runs=1)
>
>
> Airflow screenshot:
> [image: image.png]
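Sandeep's follow-up question about parameter placement comes down to this:
max_active_runs is an argument to the DAG constructor, while depends_on_past
and wait_for_downstream are task-level settings, so they belong in
default_args (or on individual operators). A minimal sketch, assuming the
Airflow 1.10-era API used in this thread; the DAG id and 5-minute schedule
are illustrative, and the NOTIFY_EMAIL/local_tz names from the original post
are omitted to keep it self-contained:

```python
from datetime import datetime
from airflow import DAG

# Task-level settings go in default_args: every task in the DAG inherits them.
default_args = {
    'owner': 'xxx',
    'depends_on_past': True,      # each task waits for its own previous-run instance to succeed
    'wait_for_downstream': True,  # ...and for that instance's immediate downstream tasks
    'start_date': datetime(2020, 6, 15),
}

# DAG-level settings go on the DAG constructor itself.
dag = DAG(
    'example_single_active_run',
    default_args=default_args,
    schedule_interval='*/5 * * * *',  # every 5 minutes, per the stated use case
    catchup=False,
    max_active_runs=1,                # at most one run in flight at a time
)
```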

Re: Issues running only one active instance in a DAG

Posted by Sunil Khaire <su...@gmail.com>.
Hi Sandeep,

It's not quite clear what you want, but if I understood correctly, you can
try depends_on_past=True or max_active_runs at the DAG level.


Thanks ,
Sunil Khaire

On Fri, 2 Oct 2020 at 5:32 PM Sandeep S <30...@gmail.com> wrote:

> Hi All,
>
> I am having a production issue running only one instance of a DAG at a
> time. While the DAG has one instance running, a 2nd instance does not kick
> off. But if any task fails in the active DAG instance, the DAG gets marked
> failed and a 2nd instance kicks off after 5 mins (the DAG's scheduled
> interval is 5 mins).
>
> Please help.
>
> Regards
> Sandeep
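It may help to spell out why max_active_runs=1 alone does not give the
behaviour Sandeep expects: it caps the number of concurrently *running*
DagRuns, and a run that has already been marked failed is no longer running,
so the scheduler is free to create the next run on schedule. depends_on_past
(and wait_for_downstream) close that gap by holding the new run's task
instances until the corresponding tasks in the prior run succeeded. A toy
model of that gating logic, purely for illustration (this is not Airflow's
actual scheduler code):

```python
def may_create_new_run(previous_run_states, max_active_runs=1):
    """Toy model: a new DagRun may be created when the number of
    currently running runs is below max_active_runs."""
    running = sum(1 for s in previous_run_states if s == "running")
    return running < max_active_runs

def tasks_may_start(previous_run_states, depends_on_past=True):
    """Toy model: with depends_on_past, tasks in the new run are
    held unless the most recent run succeeded."""
    if not depends_on_past or not previous_run_states:
        return True
    return previous_run_states[-1] == "success"

# A failed run no longer counts as active, so a new run is still created...
print(may_create_new_run(["failed"]))  # True
# ...but with depends_on_past its tasks are held, which is the desired gate.
print(tasks_may_start(["failed"]))     # False
```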

Re: Issues running only one active instance in a DAG

Posted by Sandeep S <30...@gmail.com>.
Hi All,

I am having a production issue running only one instance of a DAG at a time.
While the DAG has one instance running, a 2nd instance does not kick off. But
if any task fails in the active DAG instance, the DAG gets marked failed and
a 2nd instance kicks off after 5 mins (the DAG's scheduled interval is 5
mins).

Please help.

Regards
Sandeep