You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "uranusjr (via GitHub)" <gi...@apache.org> on 2023/03/07 05:27:06 UTC

[GitHub] [airflow] uranusjr opened a new pull request, #29946: Delay ConnectionModelView init until it's accessed

uranusjr opened a new pull request, #29946:
URL: https://github.com/apache/airflow/pull/29946

   This removed the "lazy" initialization of ConnectionModelView (and ConnectionForm used by it) that happens when the webserver first launches, and moves the corresponding logic to until the view is actually being accessed.
   
   This further delays the provider manager from loading connection form widgets until the last possible moment and speeds up the webserver's startup sequence.
   
   While this does make the connection form view a bit slower (only the first time since the result is cached), the slowdown should not be as noticeable since the provider manager should generally be partially loaded into memory at that point.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #29946:
URL: https://github.com/apache/airflow/pull/29946#discussion_r1127363134


##########
airflow/www/views.py:
##########
@@ -4512,17 +4507,14 @@ def process_form(self, form, is_created):
                 del form.extra
         del extra_json
 
-        for key in self.extra_fields:
+        for key, field_name in form.iter_extra_field_names():

Review Comment:
   `extra_fields` is removed since it is simply `extra_field_name_mapping.keys()`, but since Python 3.7 the dict maintains ordering on its own the extra list is not needed anymore.
   
   `extra_field_name_mapping` is delibrately moved to ConnectionForm since Flask-Appbuilder has some annoying introspecting mechanism that tries to resolve values on the view class too eagerly. Moving the property to the form class ensures it is only calculated when needed.
   
   Note that the form does not even cache the value; the providers manager already does the caching, and an additional cache layer provides minimal benefit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ashb commented on a diff in pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "ashb (via GitHub)" <gi...@apache.org>.
ashb commented on code in PR #29946:
URL: https://github.com/apache/airflow/pull/29946#discussion_r1127720560


##########
airflow/www/views.py:
##########
@@ -4397,7 +4376,42 @@ class ConnectionModelView(AirflowModelView):
 
     base_order = ("conn_id", "asc")
 
-    extra_field_name_mapping: dict[str, str] = {}
+    def _iter_extra_field_names(self) -> Iterator[tuple[str, str]]:
+        """Iterate through provider-backed connection fields.
+
+        Note that this cannot be a property (including a cached property)
+        because Flask-Appbuilder attempts to access all members on startup, and
+        using a property would initialize the providers manager too eagerly.
+        """
+        return ((k, v.field_name) for k, v in ProvidersManager().connection_form_widgets.items())
+
+    @property
+    def add_columns(self) -> list[str]:
+        """A list of columns to show in the Add form.
+
+        This dynamically calculates additional fields from providers and add
+        them to the backing list. This calculation is done exactly once (by
+        checking we're referencing the class-level variable instead of the
+        instance-level), and only after we enter the request context (to skip
+        superfuluous checks done by Flask-Appbuilder on startup).
+        """
+        if self._add_columns is type(self)._add_columns and has_request_context():
+            self._add_columns = [*self._add_columns, *(k for k, _ in self._iter_extra_field_names())]
+        return self._add_columns
+
+    @property
+    def edit_columns(self) -> list[str]:
+        """A list of columns to show in the Edit form.
+
+        This dynamically calculates additional fields from providers and add
+        them to the backing list. This calculation is done exactly once (by
+        checking we're referencing the class-level variable instead of the
+        instance-level), and only after we enter the request context (to skip
+        superfuluous checks done by Flask-Appbuilder on startup).
+        """
+        if self._edit_columns is type(self)._edit_columns and has_request_context():
+            self._edit_columns = [*self._edit_columns, *(k for k, _ in self._iter_extra_field_names())]
+        return self._edit_columns

Review Comment:
   Do add and edit columns differ? Could this be simply be `return self.add_columns`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #29946:
URL: https://github.com/apache/airflow/pull/29946#issuecomment-1459054096

   I also tested it manually and it seems to work fine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on PR #29946:
URL: https://github.com/apache/airflow/pull/29946#issuecomment-1457561427

   I think a few tests will break. Submitting first to get a summary from CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #29946:
URL: https://github.com/apache/airflow/pull/29946#discussion_r1127363134


##########
airflow/www/views.py:
##########
@@ -4512,17 +4507,14 @@ def process_form(self, form, is_created):
                 del form.extra
         del extra_json
 
-        for key in self.extra_fields:
+        for key, field_name in form.iter_extra_field_names():

Review Comment:
   `extra_fields` is removed since it is simply `extra_field_name_mapping.keys()`, but since Python 3.7 the dict maintains ordering on its own the extra list is not needed anymore.
   
   `extra_field_name_mapping` is delibrately moved to ConnectionForm since Flask-Appbuilder has some annoying introspecting mechanism that tries to resolve values on the view class too eagerly. Moving the property to the form class ensures it is only calculated when needed.
   
   Note that the form does not even cache the value; the providers manager already does the caching, and an additional cache layer providers minimal benefit.



##########
airflow/www/views.py:
##########
@@ -4512,17 +4507,14 @@ def process_form(self, form, is_created):
                 del form.extra
         del extra_json
 
-        for key in self.extra_fields:
+        for key, field_name in form.iter_extra_field_names():

Review Comment:
   `extra_fields` is removed since it is simply `extra_field_name_mapping.keys()`, but since Python 3.7 the dict maintains ordering on its own the extra list is not needed anymore.
   
   `extra_field_name_mapping` is delibrately moved to ConnectionForm since Flask-Appbuilder has some annoying introspecting mechanism that tries to resolve values on the view class too eagerly. Moving the property to the form class ensures it is only calculated when needed.
   
   Note that the form does not even cache the value; the providers manager already does the caching, and an additional cache layer provides minimal benefit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk merged PR #29946:
URL: https://github.com/apache/airflow/pull/29946


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on PR #29946:
URL: https://github.com/apache/airflow/pull/29946#issuecomment-1458035586

   The form class works (those tests I changed cover it); whether the UI continues to work is more suspective. I did test it by hand but it’s difficult to automate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #29946:
URL: https://github.com/apache/airflow/pull/29946#discussion_r1127360432


##########
airflow/www/views.py:
##########
@@ -4512,17 +4507,14 @@ def process_form(self, form, is_created):
                 del form.extra
         del extra_json
 
-        for key in self.extra_fields:
+        for key, field_name in form.extra_field_name_mapping.items():

Review Comment:
   `extra_fields` is removed since it is simply `extra_field_name_mapping.keys()`, but since Python 3.7 the dict maintains ordering on its own the extra list is not needed anymore.
   
   `extra_field_name_mapping` is delibrately moved to ConnectionForm since Flask-Appbuilder has some annoying introspecting mechanism that tries to resolve values on the view class too eagerly. Moving the property to the form class ensures it is only calculated when needed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr commented on code in PR #29946:
URL: https://github.com/apache/airflow/pull/29946#discussion_r1127722841


##########
airflow/www/views.py:
##########
@@ -4397,7 +4376,42 @@ class ConnectionModelView(AirflowModelView):
 
     base_order = ("conn_id", "asc")
 
-    extra_field_name_mapping: dict[str, str] = {}
+    def _iter_extra_field_names(self) -> Iterator[tuple[str, str]]:
+        """Iterate through provider-backed connection fields.
+
+        Note that this cannot be a property (including a cached property)
+        because Flask-Appbuilder attempts to access all members on startup, and
+        using a property would initialize the providers manager too eagerly.
+        """
+        return ((k, v.field_name) for k, v in ProvidersManager().connection_form_widgets.items())
+
+    @property
+    def add_columns(self) -> list[str]:
+        """A list of columns to show in the Add form.
+
+        This dynamically calculates additional fields from providers and add
+        them to the backing list. This calculation is done exactly once (by
+        checking we're referencing the class-level variable instead of the
+        instance-level), and only after we enter the request context (to skip
+        superfuluous checks done by Flask-Appbuilder on startup).
+        """
+        if self._add_columns is type(self)._add_columns and has_request_context():
+            self._add_columns = [*self._add_columns, *(k for k, _ in self._iter_extra_field_names())]
+        return self._add_columns
+
+    @property
+    def edit_columns(self) -> list[str]:
+        """A list of columns to show in the Edit form.
+
+        This dynamically calculates additional fields from providers and add
+        them to the backing list. This calculation is done exactly once (by
+        checking we're referencing the class-level variable instead of the
+        instance-level), and only after we enter the request context (to skip
+        superfuluous checks done by Flask-Appbuilder on startup).
+        """
+        if self._edit_columns is type(self)._edit_columns and has_request_context():
+            self._edit_columns = [*self._edit_columns, *(k for k, _ in self._iter_extra_field_names())]
+        return self._edit_columns

Review Comment:
   It can, I’m mostly just not sure whether we’d want them to be different at some point, and separating the two doesn’t feel wrong (since they are logically different).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #29946: Delay ConnectionModelView init until it's accessed

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #29946:
URL: https://github.com/apache/airflow/pull/29946#issuecomment-1459054629

   Nice one BTW.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org