Posted to commits@airflow.apache.org by "Kamil Bregula (Jira)" <ji...@apache.org> on 2020/03/01 10:37:00 UTC
[jira] [Resolved] (AIRFLOW-6532) Fetch celery states using batch method instead Pool
[ https://issues.apache.org/jira/browse/AIRFLOW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kamil Bregula resolved AIRFLOW-6532.
------------------------------------
Resolution: Duplicate
> Fetch celery states using batch method instead Pool
> ---------------------------------------------------
>
> Key: AIRFLOW-6532
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6532
> Project: Apache Airflow
> Issue Type: Improvement
> Components: executors
> Affects Versions: 1.10.7
> Reporter: Kamil Bregula
> Priority: Major
>
> One aspect that is worth checking is how much time Celery takes to receive task statuses.
> https://github.com/apache/airflow/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/executors/celery_executor.py#L246-L259
> My clients use MySQL as the result backend, so Celery sends 100 queries to the database for 100 tasks, i.e. one query per task.
> https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/backends/database/__init__.py#L149-L164
> In my opinion, this could be sped up if we replaced our code with a call to the Celery method celery.backends.base:BaseKeyValueStoreBackend.get_many.
> https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/celery/backends/base.py#L711-L747
> Unfortunately, this method works only with Redis, so we will have to extend the mget / get_many method in the DatabaseBackend class for it to work properly.
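
The difference between one query per task and a single batched query can be sketched as follows. This is a minimal, hypothetical illustration built on the standard-library sqlite3 module, not Celery or Airflow code: the table name `task_results`, its schema, and the helpers `make_demo_db`, `fetch_states_one_by_one` and `fetch_states_batched` are all assumptions made up for this sketch.

```python
import sqlite3


def make_demo_db(n_tasks=100):
    """Create an in-memory table standing in for a result backend (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_results (task_id TEXT PRIMARY KEY, status TEXT)")
    rows = [(f"task-{i}", "SUCCESS" if i % 2 == 0 else "PENDING") for i in range(n_tasks)]
    conn.executemany("INSERT INTO task_results VALUES (?, ?)", rows)
    conn.commit()
    return conn


def fetch_states_one_by_one(conn, task_ids):
    """Issue one SELECT per task, so N tasks cost N queries."""
    states = {}
    queries = 0
    for task_id in task_ids:
        row = conn.execute(
            "SELECT status FROM task_results WHERE task_id = ?", (task_id,)
        ).fetchone()
        queries += 1
        if row is not None:
            states[task_id] = row[0]
    return states, queries


def fetch_states_batched(conn, task_ids, chunk_size=500):
    """Issue one SELECT ... IN (...) per chunk of task IDs."""
    states = {}
    queries = 0
    task_ids = list(task_ids)
    for start in range(0, len(task_ids), chunk_size):
        chunk = task_ids[start:start + chunk_size]
        # Only "?" placeholders are interpolated; the IDs are bound as parameters.
        placeholders = ", ".join("?" for _ in chunk)
        cursor = conn.execute(
            f"SELECT task_id, status FROM task_results WHERE task_id IN ({placeholders})",
            chunk,
        )
        queries += 1
        states.update(cursor.fetchall())
    return states, queries


if __name__ == "__main__":
    conn = make_demo_db(100)
    ids = [f"task-{i}" for i in range(100)]
    slow, slow_queries = fetch_states_one_by_one(conn, ids)
    fast, fast_queries = fetch_states_batched(conn, ids)
    print(slow_queries, fast_queries, slow == fast)
```

With 100 task IDs and the default chunk size, the first helper issues 100 queries and the second issues one, while both return the same mapping of task ID to status. The chunking is there only to keep the number of bound parameters per statement bounded; a real database backend would need its own limit.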