You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/18 01:27:58 UTC

[GitHub] [arrow] sahil1105 opened a new pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

sahil1105 opened a new pull request #10088:
URL: https://github.com/apache/arrow/pull/10088


   Added basic support for explicitly specifying use of web identity credentials.
   Also refactored some existing code to make serialization and deserialization in Python more structured, but happy to revert/modify those changes if they seem overly complex.
   Also adds some missing error-checking and tests in the Python API.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r621109555



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       Hmm... I think it would be better to avoid adding boolean arguments for each different kind of authentication.
   I'm not sure what a good API would be, but perhaps we could add a generic `auth` parameter that would accept the following values:
   * `auth='default'`
   * `auth='anonymous'`
   * `auth={'access_key': '...', 'secret_key': '...', 'session_token': '...'}`
   * `auth={'role_arn': '...', 'session_name': '...', 'external_id': '...', 'load_frequency': 123}`
   * `auth='web_identity'`
   
   @jorisvandenbossche What do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 removed a comment on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 removed a comment on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-845036834


   @pitrou @jorisvandenbossche I think this is ready. Let me know if any other changes need to be made. Once this is merged I can start on a PR to add the `auth` keyword argument we discussed (and deprecate the others).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r629214182



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       The Python changes in this PR are not needed for backwards compatibility? (since those are new options) 
   So assuming we refactor the python options to use a generic `auth` parameter, then this PR with new functionality should directly use that new parameter, and not introduce a new `use_web_identity` keyword that would then later need to be deprecated. 
   So for this PR, I think either remove any Python bindings in this PR (and keep this PR to only add the C++ functionality, and later do a follow-up PR to add the python bindings) or either first do a precursor PR to refactor the options to use `auth` (and then afterwards this PR can be updated to add the web identity functionality into an `auth` keyword).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r625175844



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       @sahil1105 you could also do a pre-cursor PR to refactor the existing options like discussed here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10088:
URL: https://github.com/apache/arrow/pull/10088


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r628782124



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       Wouldn't we still need the python changes in this PR even if we move to using a generic `auth` parameter (since we need to keep the keywords around for now for backward compatibility)? Or am I missing something?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r629309951



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       My bad, should've been more explicit. I agree about removing the `use_web_identity` keyword arg. I meant all the other Python changes. Those are actually just simple error-checking changes that we should keep in my opinion.
   Related Q: Should I remove `S3CredentialsKind` and related changes? I think it's good to have them, but I'll defer to you regarding whether to keep them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-821913775


   https://issues.apache.org/jira/browse/ARROW-10675


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-821918854


   @pitrou Could you review when you get a chance? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r621130953



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       Without knowing too much about S3 / AWS auth, your proposal feels quite natural for expressing the different options, and could be a nice alternative to a bunch of partially exclusive keyword arguments each for a different way of authorization. We would (initially) need to keep the keywords for backward compatibility.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rngtng commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
rngtng commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-1003124405


   Thanks for this feature! I require this now for ruby, but can't get it working. Any hints how to activate it?
   
    So far I follow  tis https://github.com/apache/arrow/tree/master/ruby#from-s3  and ensure to have this ENV set:
   `AWS_DEFAULT_REGION, AWS_ROLE_ARN, AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME`. But I guess I explicit have to set `credentials_kind`? how can I do this in ruby? 
   
   Thanks, Tobi
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-1003139202


   @kou  Would know more about the Ruby bindings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r621308430



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       Whatever you decide, could you please document and write up a JIRA for doing the same in the R package (assuming you're not going to go all the way and update the R bindings in this PR, though you're more than welcome to :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r616703739



##########
File path: cpp/src/arrow/filesystem/s3fs.h
##########
@@ -91,6 +94,12 @@ struct ARROW_EXPORT S3Options {
   /// Whether OutputStream writes will be issued in the background, without blocking.
   bool background_writes = true;
 
+  /// True when using SimpleAWSCredentialsProvider, else false.
+  bool creds_provided = false;
+
+  /// True when using web identity token to assume role
+  bool use_web_identity = false;

Review comment:
       Yes, that'a good idea. Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kou commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
kou commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-1003181195


   @rngtng It seems that we need to implement bindings of this explicitly for Ruby. We can't use this with environment variables.
   Could you open an issue on JIRA for implementing bindings of this for Ruby?
   https://issues.apache.org/jira/browse/ARROW


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r624508726



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       @pitrou That makes sense to me as well.
   I think it might be better to do that in a separate PR though (to keep the scope of this one limited).
   Let me know.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r615705707



##########
File path: cpp/src/arrow/filesystem/s3fs.h
##########
@@ -91,6 +94,12 @@ struct ARROW_EXPORT S3Options {
   /// Whether OutputStream writes will be issued in the background, without blocking.
   bool background_writes = true;
 
+  /// True when using SimpleAWSCredentialsProvider, else false.
+  bool creds_provided = false;
+
+  /// True when using web identity token to assume role
+  bool use_web_identity = false;

Review comment:
       Aren't `anonymous`, `creds_provided` and `use_web_identity` all exclusive? If so, how about replacing them with an enum, e.g.:
   ```c++
   struct S3Options {
     enum CredentialsKind {
       /// Anonymous access (no credentials used)
       Anonymous,
       /// Use default AWS credentials, configured through environment variables
       Default,
       /// Use explicitly-provided username and password
       Explicit,
       /// Use web identity token to assume role, configured through environment variables
       WebIdentity
     };
   
     CredentialsKind credentials_kind = Default;
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rngtng edited a comment on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
rngtng edited a comment on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-1003124405


   Thanks for this feature @sahil1105 ! I require this now for ruby, but can't get it working. Any hints how to activate it?
   
    So far I follow  tis https://github.com/apache/arrow/tree/master/ruby#from-s3  and ensure to have this ENV set:
   `AWS_DEFAULT_REGION, AWS_ROLE_ARN, AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME`. But I guess I explicit have to set `credentials_kind`? how can I do this in ruby? 
   
   Thanks, Tobi
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r625173390



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       I would be ok with doing the Python changes in a separate PR, but then you should remove them from this PR ;-)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r629309951



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       I agree about removing the `use_web_identity` keyword arg. I meant all the other Python changes. Those are actually just simple error-checking changes that we should keep in my opinion.
   Related Q: Should I remove `S3CredentialsKind` and related changes? I think it's good to have them, but I'll defer to you regarding whether to keep them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r632948223



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       Done. Sorry for the delay.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-845036834


   @pitrou @jorisvandenbossche I think this is ready. Let me know if any other changes need to be made. Once this is merged I can start on a PR to add the `auth` keyword argument we discussed (and deprecate the others).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r619572891



##########
File path: cpp/src/arrow/filesystem/s3fs.h
##########
@@ -91,6 +94,12 @@ struct ARROW_EXPORT S3Options {
   /// Whether OutputStream writes will be issued in the background, without blocking.
   bool background_writes = true;
 
+  /// True when using SimpleAWSCredentialsProvider, else false.
+  bool creds_provided = false;
+
+  /// True when using web identity token to assume role
+  bool use_web_identity = false;

Review comment:
       @pitrou Done. Apologies for the delay.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sahil1105 commented on pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
sahil1105 commented on pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#issuecomment-849954235


   @pitrou @jorisvandenbossche I think this is ready. Let me know if any other changes need to be made. Once this is merged I can start on a PR to add the auth keyword argument we discussed (and deprecate the others).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10088: ARROW-10675 [C++][Python] Support AWS S3 Web identity credentials

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10088:
URL: https://github.com/apache/arrow/pull/10088#discussion_r629312734



##########
File path: python/pyarrow/_s3fs.pyx
##########
@@ -74,6 +74,13 @@ cdef class S3FileSystem(FileSystem):
         Whether to connect anonymously if access_key and secret_key are None.
         If true, will not attempt to look up credentials using standard AWS
         configuration methods.
+    use_web_identity: boolean, default False
+        Whether to connect using an assumed role authenticated using
+        a web identity token. The required settings are derived from
+        environment variables such as AWS_ROLE_ARN,
+        AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_SESSION_NAME.
+        If true, will not attempt to look up credentials using other
+        AWS configuration methods.

Review comment:
       I agree that it's ok to keep the additional error-checking changes, as well as the `S3CredentialsKind` declaration.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org