Posted to commits@airflow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/07/04 13:56:02 UTC

[jira] [Commented] (AIRFLOW-4478) Operators instantiate many duplicate objects

    [ https://issues.apache.org/jira/browse/AIRFLOW-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878676#comment-16878676 ] 

ASF GitHub Bot commented on AIRFLOW-4478:
-----------------------------------------

ashb commented on pull request #5259: [AIRFLOW-4478] Lazily instantiate default `Resources` object
URL: https://github.com/apache/airflow/pull/5259
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Operators instantiate many duplicate objects
> --------------------------------------------
>
>                 Key: AIRFLOW-4478
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4478
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Josh Carp
>            Assignee: Huihua Zhang
>            Priority: Trivial
>
> `BaseOperator` creates a `Resources` instance, which in turn creates four `Resource` instances. Object instantiation in Python isn't free; creating `Resources` and its four child `Resource` objects takes ~5μs out of a total of ~20μs to instantiate a `BaseOperator` on my system. This time adds up when creating tens of thousands of operators, especially in environments like GCP Cloud Composer that are very sensitive to DAG parse time.
> Assuming that most users don't actually configure task resources, since they're only respected by the non-default `CgroupTaskRunner`, we can save time by creating a single `Resources` instance and sharing it across tasks that don't set `resources`. We could do even better by allowing users to pass a `Resources` instance to `BaseOperator` rather than passing a `dict` that's used to instantiate `Resources`, but that would be a breaking change.
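
The sharing idea described in the issue can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `Resource`, `Resources`, and `Operator` here are simplified stand-ins rather than Airflow's real classes, and `default_resources` is a hypothetical helper. The sketch assumes the shared default is treated as read-only, since mutating it would affect every task that uses it.

```python
class Resource:
    """Simplified stand-in for one resource setting (not Airflow's class)."""

    def __init__(self, name, units, qty):
        self.name = name
        self.units = units
        self.qty = qty


class Resources:
    """Simplified stand-in container that builds four Resource objects."""

    def __init__(self, cpus=1, ram=512, disk=512, gpus=0):
        self.cpus = Resource("CPU", "core(s)", cpus)
        self.ram = Resource("RAM", "MB", ram)
        self.disk = Resource("Disk", "MB", disk)
        self.gpus = Resource("GPU", "gpu(s)", gpus)


# Shared default, created only the first time it is needed.
_DEFAULT_RESOURCES = None


def default_resources():
    """Hypothetical helper: return the single shared default Resources."""
    global _DEFAULT_RESOURCES
    if _DEFAULT_RESOURCES is None:
        _DEFAULT_RESOURCES = Resources()
    return _DEFAULT_RESOURCES


class Operator:
    """Simplified stand-in for an operator that accepts a resources dict."""

    def __init__(self, task_id, resources=None):
        self.task_id = task_id
        if resources is None:
            # No per-task configuration: reuse the shared instance
            # instead of building five new objects per operator.
            self.resources = default_resources()
        else:
            # Per-task configuration is still passed as a dict.
            self.resources = Resources(**resources)


a = Operator("a")
b = Operator("b")
c = Operator("c", resources={"cpus": 4})
```

In this sketch `a` and `b` share one `Resources` object, while `c` gets its own because it passes a `resources` dict. Keeping the `dict` argument preserves the existing calling convention, which is the non-breaking option the issue describes.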



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)