Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/22 18:13:59 UTC

[GitHub] [airflow] eskarimov commented on a change in pull request #19736: Add Databricks Deferrable Operators

eskarimov commented on a change in pull request #19736:
URL: https://github.com/apache/airflow/pull/19736#discussion_r754525681



##########
File path: airflow/providers/databricks/hooks/databricks.py
##########
@@ -493,3 +504,120 @@ def __init__(self, token: str) -> None:
     def __call__(self, r: PreparedRequest) -> PreparedRequest:
         r.headers['Authorization'] = 'Bearer ' + self.token
         return r
+
+
+class DatabricksAsyncHook(DatabricksHook):
+    """
+    Async version of the ``DatabricksHook``
+    Implements only necessary methods used further in Databricks Triggers.
+    """
+
+    def __init__(self, *args: Any, **kwargs: Any) -> None:
+        super().__init__(*args, **kwargs)
+
+    async def __aenter__(self):
+        self._session = aiohttp.ClientSession()

Review comment:
       Thank you both so much, those are very good points.
   
   Trying to summarise:
   - Ideally, all tasks executed by each `triggerer` should share the same session. However, I'm not sure that's possible without touching the core Airflow `Triggerer` functionality.
   - >Note that hooks are not necessarily run in the same process, so if you want to share sessions among them, you must move the abstraction to the trigger instead.
     
     Does this mean moving the `ClientSession` initialisation to the trigger, i.e. making it a property of `DatabricksExecutionTrigger`?
     My understanding is that when there are multiple `triggerer` processes, each trigger is executed as a separate async task with its own `ClientSession` instance. So even if we move it to `DatabricksExecutionTrigger`, a session would still be created for each trigger run.
   - Refactoring the `Databricks` hook: I agree that keeping everything inside a single class would be the ideal solution. [The comment](https://github.com/apache/airflow/pull/19736#discussion_r754267058) by @alexott is further motivation to do it now and avoid duplicate work later. I'll try to refactor it to minimize repeated code and isolate the IO operations.
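
   To make the second point concrete, here is a minimal sketch of the two options (not the code in this PR). `FakeSession` is a hypothetical stand-in for `aiohttp.ClientSession` that only counts how many instances get created, and the function names are made up for illustration. It assumes each trigger run is an independent coroutine, as described above.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession; counts instances."""

    created = 0

    def __init__(self) -> None:
        FakeSession.created += 1
        self.closed = False

    async def __aenter__(self) -> "FakeSession":
        return self

    async def __aexit__(self, *exc_info) -> None:
        self.closed = True


async def run_trigger_own_session(run_id: int) -> int:
    # Session owned by the trigger: one session per trigger run.
    async with FakeSession():
        await asyncio.sleep(0)  # placeholder for polling the run state
        return run_id


async def run_trigger_shared_session(session: FakeSession, run_id: int) -> int:
    # Session owned by the caller and reused by every trigger run.
    await asyncio.sleep(0)  # placeholder for polling the run state
    return run_id


async def count_sessions(n_triggers: int, shared: bool) -> int:
    """Run n_triggers concurrently and return how many sessions were created."""
    FakeSession.created = 0
    if shared:
        async with FakeSession() as session:
            await asyncio.gather(
                *(run_trigger_shared_session(session, i) for i in range(n_triggers))
            )
    else:
        await asyncio.gather(
            *(run_trigger_own_session(i) for i in range(n_triggers))
        )
    return FakeSession.created


per_trigger = asyncio.run(count_sessions(3, shared=False))
shared = asyncio.run(count_sessions(3, shared=True))
print(per_trigger, shared)
```

   In this sketch the session count only drops when something above the individual trigger owns the session, which is why sharing it seems to require changes outside `DatabricksExecutionTrigger`.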



