Posted to commits@airflow.apache.org by "Kamil Bregula (Jira)" <ji...@apache.org> on 2020/03/01 10:37:00 UTC

[jira] [Resolved] (AIRFLOW-6532) Fetch celery states using batch method instead Pool

     [ https://issues.apache.org/jira/browse/AIRFLOW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kamil Bregula resolved AIRFLOW-6532.
------------------------------------
    Resolution: Duplicate

> Fetch celery states using batch method instead Pool
> ---------------------------------------------------
>
>                 Key: AIRFLOW-6532
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6532
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: executors
>    Affects Versions: 1.10.7
>            Reporter: Kamil Bregula
>            Priority: Major
>
> One aspect that is worth checking is how much time Celery takes to receive task statuses.
> https://github.com/apache/airflow/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/executors/celery_executor.py#L246-L259
> My clients use MySQL as the result backend, so Celery sends one query per task to the database, i.e. 100 queries for 100 tasks.
> https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/backends/database/__init__.py#L149-L164
> In my opinion, this could be sped up by replacing our code with a call to Celery's own method, celery.backends.base:BaseKeyValueStoreBackend.get_many
> https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/celery/backends/base.py#L711-L747
> Unfortunately, this method works only with Redis, so we will have to extend the mget / get_many method in the DatabaseBackend class for it to work properly.
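
To illustrate the difference described above, here is a minimal sketch, not the Airflow or Celery implementation. It uses an in-memory SQLite database as a stand-in for the MySQL result backend; the table name `task_results` and the helper functions are hypothetical and exist only for this example. It compares fetching states with one query per task against fetching them with a single `IN (...)` query per chunk.

```python
import sqlite3


def make_backend(n_tasks):
    """Create a hypothetical result table holding one row per task."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_results (task_id TEXT PRIMARY KEY, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_results VALUES (?, ?)",
        [
            (f"task-{i}", "SUCCESS" if i % 2 == 0 else "STARTED")
            for i in range(n_tasks)
        ],
    )
    return conn


def fetch_states_one_by_one(conn, task_ids):
    """One SELECT per task: N tasks cost N round trips."""
    states = {}
    queries = 0
    for task_id in task_ids:
        row = conn.execute(
            "SELECT status FROM task_results WHERE task_id = ?", (task_id,)
        ).fetchone()
        queries += 1
        # Tasks with no stored result are treated as PENDING.
        states[task_id] = row[0] if row else "PENDING"
    return states, queries


def fetch_states_batched(conn, task_ids, chunk_size=500):
    """One SELECT per chunk of task ids, using an IN (...) clause."""
    states = {task_id: "PENDING" for task_id in task_ids}
    queries = 0
    for start in range(0, len(task_ids), chunk_size):
        chunk = task_ids[start:start + chunk_size]
        placeholders = ", ".join("?" for _ in chunk)
        rows = conn.execute(
            "SELECT task_id, status FROM task_results "
            f"WHERE task_id IN ({placeholders})",
            chunk,
        ).fetchall()
        queries += 1
        states.update(rows)  # rows are (task_id, status) pairs
    return states, queries


if __name__ == "__main__":
    conn = make_backend(100)
    ids = [f"task-{i}" for i in range(100)]
    _, per_task_queries = fetch_states_one_by_one(conn, ids)
    _, batched_queries = fetch_states_batched(conn, ids)
    print(per_task_queries, batched_queries)
```

Both helpers return the same mapping of task id to state; only the number of queries differs. The chunk size is an arbitrary choice here, made to stay below typical limits on the number of bound parameters in one statement.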



--
This message was sent by Atlassian Jira
(v8.3.4#803005)