Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/04/05 08:00:53 UTC

[jira] [Commented] (AIRFLOW-111) DAG concurrency is not honored

    [ https://issues.apache.org/jira/browse/AIRFLOW-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956480#comment-15956480 ] 

ASF subversion and git services commented on AIRFLOW-111:
---------------------------------------------------------

Commit 9070a82775691e08fb1b95c28fbc2cc5ee7b843d in incubator-airflow's branch refs/heads/v1-8-test from [~saguziel]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=9070a82 ]

[AIRFLOW-111] Include queued tasks in scheduler concurrency check

The concurrency argument in dags appears to not be obeyed because the
scheduler does not check the concurrency properly when checking tasks.
The tasks do not run, but this leads to a lot of scheduler churn.

Closes #2214 from saguziel/aguziel-fix-concurrency

(cherry picked from commit 3ff5abee3f9d29e545e021c2c060e9c9f3045236)
Signed-off-by: Bolke de Bruin <bo...@xs4all.nl>
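
A minimal sketch of the check the commit message describes, assuming a
simplified model in which each task instance is just a state string. The
function name open_slots, the count_queued flag, and the state constants are
hypothetical and are not taken from the Airflow source; the sketch only
illustrates why leaving queued tasks out of the count makes a DAG look as if
it still has free slots.

```python
from collections import Counter

# Hypothetical state names for this sketch (not imported from Airflow).
RUNNING = "running"
QUEUED = "queued"
SCHEDULED = "scheduled"


def open_slots(task_states, concurrency, count_queued=True):
    """Return how many more task instances the DAG may start.

    With count_queued=False only running tasks occupy a slot, which is the
    behaviour the commit message describes as the bug. With
    count_queued=True queued tasks occupy a slot as well.
    """
    counts = Counter(task_states)
    occupied = counts[RUNNING]
    if count_queued:
        occupied += counts[QUEUED]
    return max(concurrency - occupied, 0)


# 3 running, 5 queued, 17 still waiting, with a DAG concurrency of 8.
states = [RUNNING] * 3 + [QUEUED] * 5 + [SCHEDULED] * 17

print(open_slots(states, concurrency=8, count_queued=False))  # 5
print(open_slots(states, concurrency=8, count_queued=True))   # 0
```

When only running tasks are counted, the scheduler in this model believes five
slots are free even though eight task instances are already running or queued.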


> DAG concurrency is not honored
> ------------------------------
>
>                 Key: AIRFLOW-111
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-111
>             Project: Apache Airflow
>          Issue Type: Sub-task
>          Components: celery, scheduler
>    Affects Versions: Airflow 1.6.2, Airflow 1.7.1.2
>         Environment: Version of Airflow: 1.6.2
> Airflow configuration: Running a Scheduler with LocalExecutor
> Operating System: 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Python Version: 2.7.6
> Screen shots of your DAG's status:
>            Reporter: Shenghu Yang
>             Fix For: 1.8.1
>
>
> Description of Issue
> In airflow.cfg, we set: max_active_runs_per_dag = 1
> In our DAG, we set dag_args['concurrency'] = 8. However, when the scheduler starts to run, this concurrency is not honored: the airflow scheduler runs up to 'parallelism' (which we set to 25) task instances for the ONE dag_run.
> What did you expect to happen?
> dag_args['concurrency'] = 8 is honored, i.e. at most 8 task instances run concurrently.
> What happened instead?
> When the DAG starts to run, the concurrency is not honored: the airflow scheduler/celery worker runs up to 'parallelism' (which we set to 25) task instances.
> Here is how you can reproduce this issue on your machine:
> Create a DAG that contains nothing but 25 parallel tasks.
> Set dag_args['concurrency'] = 8 on the DAG.
> Set the airflow parallelism = 25 and max_active_runs_per_dag = 1.
> Then run: airflow scheduler
> You will see all 25 task instances scheduled to run, not 8.
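
The reproduction steps above can be approximated with a toy model. This is
only a sketch under stated assumptions: the simulate function and all of its
parameters are hypothetical, no task is picked up by a worker during the
simulated passes, and the real scheduler loop is considerably more involved.
It shows how repeated scheduler passes can queue all 25 tasks when queued
tasks are not counted against the DAG concurrency of 8.

```python
def simulate(n_tasks, concurrency, parallelism, passes, count_queued):
    """Toy scheduler: each pass queues tasks while the DAG seems to have room.

    Returns the number of task instances queued after the given passes.
    """
    queued = 0
    running = 0  # assumption: workers have not started anything yet
    pending = n_tasks
    for _ in range(passes):
        # Slots the DAG-level check believes are taken.
        occupied = running + (queued if count_queued else 0)
        dag_room = max(concurrency - occupied, 0)
        # The global 'parallelism' limit always sees queued tasks here.
        global_room = max(parallelism - running - queued, 0)
        to_queue = min(pending, dag_room, global_room)
        queued += to_queue
        pending -= to_queue
    return queued


# Values from the report: 25 tasks, concurrency 8, parallelism 25.
print(simulate(25, 8, 25, passes=4, count_queued=False))  # 25
print(simulate(25, 8, 25, passes=4, count_queued=True))   # 8
```

In this model the limit of 8 holds only when queued tasks are included in the
DAG-level count, which matches the direction of the fix in the commit above.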



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)