You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/03 01:42:30 UTC
[GitHub] [airflow] gmcrocetti opened a new pull request, #25494: `AwsBaseHook` not using Connection's host
gmcrocetti opened a new pull request, #25494:
URL: https://github.com/apache/airflow/pull/25494
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of an existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
Deprecate extra's `host` argument to favor connection's host.
closes: #17833
---
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] josh-fell commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940181581
##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
self.endpoint_url = extra.get("host")
+ if self.endpoint_url:
+ warnings.warn(
+ "extra['host'] is deprecated and will be removed in a future release."
Review Comment:
@potuik If the [Amazon provider is going to be a major version bump](https://lists.apache.org/thread/c9254qscbf6yjpqx6q4n3o798pxxyfwk) this month, is the deprecation message even required? The provider could just stop supporting the `host` connection field.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1204433467
>
Hi @Taragolis . I do agree that `endpoint_url` is way more intuitive than `host`, as thus we should deprecated both `host` and `extra["host"]` toward a more lean alternative. The problem is this is my biased developer POV and I don't know how airflow's community will react to that (they're used with the host field after all).
I'll mark @potiuk to provide some guidance in here. I'm either ok with closing this PR (and the related issue) or working in the `endpoint_url` suggestion
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1203611035
My personal thought both of `Connection.host` and `Connection.extra['host']` not a good candidate for `endpoint_url`.
I would rather deprecate `Connection.extra['host']` by `Connection.extra['endpoint_url']`:
1. In this case it would be easier implement #25326
2. keep same name as it use in [boto3.client](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.client) and [boto3.resource](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.resource)
Please also note that connection which read from URI (Environment Variables, SSM backend and might be other sources) would [always contain non-None](https://github.com/apache/airflow/blob/298be502c35006b7c3f011b676dbb4db0633bc74/airflow/models/connection.py#L50-L60) `host`. In other hand It might be an issue if we pass `endpoint_url=''` rather than `endpoint_url=None` to boto3
```python
from airflow.models.connection import Connection
conn = Connection(uri="aws://")
assert conn.host is None, f"Expected None but got {conn.host!r}"
```
Also feel free to join discussion in this PR #25416 which basically hide unused connections fields from UI, since it not used in `AwsConnectionWrapper` and we do not provide `host` attribute as compatibility for Connection object.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207373160
> I just thought that we could inform users if they somehow use `Connection.host` (might be defined in current version of provider) that this option won't work. Same as we inform that if they use `profile` which related to deprecated value, and do not related to actual `profile_name` but it just an idea.
Yeah. Might be a good idea to add warning if host is set ("Host {} specifed in connection.host is not used - the AWS host should be bassed by "endpoint_url" extra... Or smth.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205817276
BTW, I think specify `endpoint_url` into `boto3.client` and `boto3.resource` only required for MinIO, Aliyun OSS users and might be for LocalStack and `moto` in standalone mode.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940197227
##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
self.endpoint_url = extra.get("host")
+ if self.endpoint_url:
+ warnings.warn(
+ "extra['host'] is deprecated and will be removed in a future release."
Review Comment:
Yeah. That's a bit aggressive, but we could raise plain error here. I do not see this as being a major problem. But removing it altogether is not a good idea because it would likely stop doing that it was doing before without even a warning in some cases.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] josh-fell commented on a diff in pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #25494:
URL: https://github.com/apache/airflow/pull/25494#discussion_r940221080
##########
airflow/providers/amazon/aws/utils/connection_wrapper.py:
##########
@@ -154,6 +154,16 @@ def __post_init__(self, conn: "Connection"):
self.endpoint_url = extra.get("host")
+ if self.endpoint_url:
+ warnings.warn(
+ "extra['host'] is deprecated and will be removed in a future release."
Review Comment:
Understood. So more of a release flow of `feature` -> `deprecate/message about not using feature` -> `remove feature` regardless of major version bump along the way.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208472051
Yeah I think there was some broken image generation for breeze. Will fix it in a moment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
potiuk merged PR #25494:
URL: https://github.com/apache/airflow/pull/25494
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207793324
I want to make provider's release today - woudl be great to get this in.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] Taragolis commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207280912
@gmcrocetti
Since https://github.com/apache/airflow/pull/25416 merged would be nice also update placeholder for UI:
https://github.com/apache/airflow/blob/6657684ae0b558906e0da0693e6644511a419e0d/airflow/providers/amazon/aws/hooks/base_aws.py#L626-L628
> but after discussions it occurred me host is not semantically correct for the proposed use case.
I just thought that we could inform users if they somehow use `Connection.host` (might be defined in current version of provider) that this option won't work. Same as we inform that if they use `profile` which related to deprecated value, and do not related to actual `profile_name` but it just an idea.
https://github.com/apache/airflow/blob/2e2e86d9e43989ed039afc07fa8efe29bf5d170c/airflow/providers/amazon/aws/utils/connection_wrapper.py#L138-L147
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1207265249
The original goal of this PR was to allow one using `Connection.host`, as described in #17833, but after discussions it occurred me `host` is not semantically correct for the proposed use case.
As we all know, `host` is just one part of a [URL](https://www.rfc-editor.org/rfc/rfc1738#section-3.3) in the HTTP scheme. As boto expects a complete HTTP url, `host` is not the most correct description for this parameter and thus, that's why I'm proposing renaming it to `endpoint_url`.
Please, let me know if that doesn't make any sense.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1204477761
You've never know end users reaction. Someone frustrate when they see Deprecation and User Warnings and tried to fix asap, some other still use operators/sensors/hooks from `airflow.contrib` 😃
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] Taragolis commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205813393
I'm still worry of fact that 'host' has additional logic on URI parse which could add additional complexity - in the fact it only happen if user do not follow documentation and do not URL-encoded their values
https://github.com/apache/airflow/blob/298be502c35006b7c3f011b676dbb4db0633bc74/airflow/models/connection.py#L50-L60
And a little clarification just for sure that we on the same page. Not just renaming, we deprecate value just for sure that users have a time to change their connections before actually this parameter would be removed. Something like that
```python
self.endpoint_url = extra.get("endpoint_url")
if not self.endpoint_url and "host" in extra:
self.endpoint_url = conn.host
warnings.warn(
"some deprecation warning",
DeprecationWarning,
stacklevel=2,
)
```
And also it would be nice to add this [in to the documentation](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html#configuring-the-connection)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] gmcrocetti commented on pull request #25494: `AwsBaseHook` not using Connection's host
Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1205779583
> You've never know end users reaction. Someone frustrate when they see Deprecation and User Warnings and tried to fix asap, some other still use operators/sensors/hooks from `airflow.contrib` smiley
I think that renaming `host` to `endpoint_url` and `extra[host]` to `extra[endpoint_url]` covers all cases, WDYT @Taragolis ?
as suggested by @josh-fell
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] gmcrocetti commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208110326
Just sent a patch addressing both @Taragolis and @potiuk comments
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] gmcrocetti commented on pull request #25494: Deprecate usage of `extra[host]` in AWS's connection
Posted by GitBox <gi...@apache.org>.
gmcrocetti commented on PR #25494:
URL: https://github.com/apache/airflow/pull/25494#issuecomment-1208251871
The very same check that failed here is failing on `main`. Can we move forward with this one or fix main before ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org