Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/30 00:03:40 UTC

[GitHub] [airflow] blag commented on a diff in pull request #25972: Support Airflow connections in Datasets

blag commented on code in PR #25972:
URL: https://github.com/apache/airflow/pull/25972#discussion_r957889183


##########
airflow/models/dataset.py:
##########
@@ -100,6 +98,64 @@ def __hash__(self):
     def __repr__(self):
         return f"{self.__class__.__name__}(uri={self.uri!r}, extra={self.extra!r})"
 
+    @property
+    def canonical_uri(self):
+        """
+        Resolve the canonical uri for a dataset.
+
+        If the uri doesn't have an `airflow` scheme, return it as-is.
+
+        If it does have an `airflow` scheme, it takes the connection id from
+        the username in userinfo. It then will combine the connection uri and
+        dataset uri to form the canonical uri. It does this by:
+
+        * Using the scheme from the connection, unless an override is provided
+          in the dataset scheme (e.g. airflow+override://)
+        * Determine the hostname and port, where the dataset values take precedence
+        * Combine the path, connection first followed by the dataset path
+        * Merge the query args
+
+        # airflow://conn_id/...
+        # airflow+override://conn_id/...
+        # airflow://conn_id/some_extra_path?query
+        """
+        parsed = urlparse(self.uri)
+
+        if not parsed.scheme.startswith("airflow"):
+            return self.uri
+
+        conn_id = parsed.username
+        conn = urlparse(Connection.get_connection_from_secrets(conn_id).get_uri())
+
+        # Take the scheme from the connection, unless it is overridden in the dataset
+        scheme = conn.scheme
+        split_scheme = parsed.scheme.split("+")
+        if len(split_scheme) == 2:
+            scheme = split_scheme[1]
+
+        # Strip userinfo from the uri
+        # Allow hostname/port override
+        hostname = parsed.hostname or conn.hostname
+        port = parsed.port or conn.port
+        netloc = hostname
+        if port:
+            netloc = f"{hostname}:{port}"

Review Comment:
   ```suggestion
           netloc = f"{hostname}:{port}" if port else hostname
   ```
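
For context, here is a minimal standalone sketch of the hostname/port override with the one-line form applied. The helper name `build_netloc` and the example URIs are hypothetical and not part of the PR; only the `urlparse` fallback logic is taken from the diff above.

```python
from urllib.parse import urlparse


def build_netloc(dataset_uri: str, conn_uri: str) -> str:
    """Hypothetical helper mirroring the hostname/port override in the diff."""
    parsed = urlparse(dataset_uri)
    conn = urlparse(conn_uri)

    # Dataset values take precedence; fall back to the connection's values.
    hostname = parsed.hostname or conn.hostname
    port = parsed.port or conn.port

    # Single conditional expression instead of assign-then-overwrite.
    return f"{hostname}:{port}" if port else hostname


# Example URIs are made up for illustration.
print(build_netloc("airflow://my_conn@/some_path", "postgres://db.internal:5432/base"))
print(build_netloc("airflow://my_conn@example.com:5433/x", "postgres://db.internal:5432/base"))
```

The behavior matches the original three-line version; the conditional expression only removes the intermediate reassignment of `netloc`.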


