You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/03 01:42:30 UTC

[GitHub] [airflow] gmcrocetti opened a new pull request, #25494: `AwsBaseHook` not using Connection's host

gmcrocetti opened a new pull request, #25494:
URL: https://github.com/apache/airflow/pull/25494

   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   Deprecate extra's `host` argument to favor connection's host.
   closes: #17833
   ---
   
   
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940181581


##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
 
         self.endpoint_url = extra.get("host")
 
+        if self.endpoint_url:
+            warnings.warn(
+                "extra['host'] is deprecated and will be removed in a future release."

Review Comment:
   @potuik If the [Amazon provider is going to be a major version bump](https://lists.apache.org/thread/c9254qscbf6yjpqx6q4n3o798pxxyfwk) this month, is the deprecation message even required? The provider could just stop supporting the `host` connection field.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1204433467

   > 
   
   Hi @Taragolis . I do agree that `endpoint_url` is way more intuitive than `host`, as thus we should deprecated both `host` and `extra["host"]` toward a more lean alternative. The problem is this is my biased developer POV and I don't know how airflow's community will react to that (they're used with the host field after all).
   I'll mark @potiuk to provide some guidance in here. I'm either ok with closing this PR (and the related issue) or working in the `endpoint_url` suggestion


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1203611035

   My personal thought both of `Connection.host` and `Connection.extra['host']` not a good candidate for `endpoint_url`.
   
   I would rather deprecate  `Connection.extra['host']` by `Connection.extra['endpoint_url']`:
   1. In this case it would be easier implement #25326 
   2. keep same name as it use in [boto3.client](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.client) and [boto3.resource](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.resource)
   
   Please also note that connection which read from URI (Environment Variables, SSM backend and might be other sources) would [always contain non-None](https://github.com/apache/airflow/blob/298be502c35006b7c3f011b676dbb4db0633bc74/airflow/models/connection.py#L50-L60) `host`. In other hand It might be an issue if we pass `endpoint_url=''` rather than `endpoint_url=None` to boto3
   
   ```python
   from airflow.models.connection import Connection
   
   conn = Connection(uri="aws://")
   assert conn.host is None, f"Expected None but got {conn.host!r}"
   ```
   
   Also feel free to join discussion in this PR #25416 which basically hide unused connections fields from UI, since it not used in `AwsConnectionWrapper` and we do not provide `host` attribute as compatibility for Connection object.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207373160

   > I just thought that we could inform users if they somehow use `Connection.host` (might be defined in current version of provider) that this option won't work. Same as we inform that if they use `profile` which related to deprecated value, and do not related to actual `profile_name` but it just an idea.
   
   Yeah. Might be a good idea to add warning if host is set ("Host {} specifed in connection.host is not used - the AWS  host should be bassed by "endpoint_url" extra... Or smth.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205817276

   BTW, I think specify `endpoint_url` into `boto3.client` and `boto3.resource` only required for MinIO, Aliyun OSS users and might be for LocalStack and `moto` in standalone mode.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940197227


##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
 
         self.endpoint_url = extra.get("host")
 
+        if self.endpoint_url:
+            warnings.warn(
+                "extra['host'] is deprecated and will be removed in a future release."

Review Comment:
   Yeah. That's a bit aggressive, but we could  raise plain error here. I do not see this as being a major problem. But removing it altogether is not a good idea because it would likely stop doing that it was doing before without even a warning in some cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940221080


##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
 
         self.endpoint_url = extra.get("host")
 
+        if self.endpoint_url:
+            warnings.warn(
+                "extra['host'] is deprecated and will be removed in a future release."

Review Comment:
   Understood. So more of a release flow of `feature` -> `deprecate/message about not using feature` -> `remove feature` regardless of major version bump along the way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208472051

   Yeah I think there was some broken image generation for breeze. Will fix it in a moment.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #25494:
URL: https://github.com/apache/airflow/pull/25494


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207793324

   I want to make provider's release today - woudl be great to get this in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207280912

   @gmcrocetti 
   
   Since https://github.com/apache/airflow/pull/25416 merged would be nice also update placeholder for UI:
   https://github.com/apache/airflow/blob/6657684ae0b558906e0da0693e6644511a419e0d/airflow/providers/amazon/aws/hooks/base_aws.py#L626-L628
   
   > but after discussions it occurred me host is not semantically correct for the proposed use case.
   
   I just thought that we could inform users if they somehow use `Connection.host` (might be defined in current version of provider) that this option won't work. Same as we inform that if they use `profile` which related to deprecated value, and do not related to actual `profile_name` but it just an idea.
   
   https://github.com/apache/airflow/blob/2e2e86d9e43989ed039afc07fa8efe29bf5d170c/airflow/providers/amazon/aws/utils/connection_wrapper.py#L138-L147


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207265249

   The original goal of this PR was to allow one using `Connection.host`, as described in #17833, but after discussions it occurred me `host` is not semantically correct for the proposed use case.
   As we all know, `host` is just one part of a [URL](https://www.rfc-editor.org/rfc/rfc1738#section-3.3) in the HTTP scheme. As boto expects a complete HTTP url, `host` is not the most correct description for this parameter and thus, that's why I'm proposing renaming it to `endpoint_url`.
   Please, let me know if that doesn't make any sense.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1204477761

   You've never know end users reaction. Someone frustrate when they see Deprecation and User Warnings and tried to fix asap, some other still use operators/sensors/hooks from `airflow.contrib` 😃 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205813393

   I'm still worry of fact that 'host' has additional logic on URI parse which could add additional complexity - in the fact it only happen if user do not follow documentation and do not URL-encoded their values
   
   https://github.com/apache/airflow/blob/298be502c35006b7c3f011b676dbb4db0633bc74/airflow/models/connection.py#L50-L60
   
   And a little clarification just for sure that we on the same page. Not just renaming, we deprecate value just for sure that users have a time to change their connections before actually this parameter would be removed. Something like that
   
   ```python
   self.endpoint_url = extra.get("endpoint_url")
   if not self.endpoint_url and "host" in extra:
       self.endpoint_url = conn.host
       warnings.warn(
           "some deprecation warning",
           DeprecationWarning,
           stacklevel=2,
       )
   ```
   
   And also it would be nice to add this [in to the documentation](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html#configuring-the-connection)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host

Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205779583

   > You've never know end users reaction. Someone frustrate when they see Deprecation and User Warnings and tried to fix asap, some other still use operators/sensors/hooks from `airflow.contrib` smiley
   
   I think that renaming `host` to `endpoint_url` and `extra[host]` to `extra[endpoint_url]` covers all cases, WDYT @Taragolis ?
   as suggested by @josh-fell 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] gmcrocetti commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208110326

   Just sent a patch addressing both @Taragolis  and @potiuk  comments


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] gmcrocetti commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection

Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208251871

   The very same check that failed here is failing on `main`. Can we move forward with this one or fix main before ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org