You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/06/24 08:54:47 UTC

[GitHub] [flink] gaborgsomogyi opened a new pull request, #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

gaborgsomogyi opened a new pull request, #20066:
URL: https://github.com/apache/flink/pull/20066

   ## What is the purpose of the change
   
   At the moment the new delegation token framework doesn't have Hadoop FS delegation token provider. In this PR I've added `HadoopFSDelegationTokenProvider`. With that new provider now it's possible to obtain tokens for Hadoop based file systems.
   
   ## Brief change log
   
   * Added exception to `DelegationTokenProvider` API
   * Add `HadoopFSDelegationTokenProvider`
   * Added `META-INF` service registration
   * Added new unit tests
   
   ## Verifying this change
   
   * Additional unit tests
   * Manually on YARN and K8S (please see an example [here](https://gist.github.com/gaborgsomogyi/ac4f71ead8494da2f5c35265bcb1e885))
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
     - If yes, how is the feature documented? All documentation is intended to be added in [FLINK-25911](https://issues.apache.org/jira/browse/FLINK-25911) when everything works as a whole
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165364762

   cc @gyfora @mbalassi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] flinkbot commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165360589

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "070a8552afa26b8399154ad5e71ff1e97bfd0ae3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "070a8552afa26b8399154ad5e71ff1e97bfd0ae3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 070a8552afa26b8399154ad5e71ff1e97bfd0ae3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r906643914


##########
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java:
##########
@@ -1294,7 +1294,7 @@ private void removeLocalhostBindHostSetting(
     private void setTokensFor(ContainerLaunchContext containerLaunchContext) throws Exception {
         LOG.info("Adding delegation tokens to the AM container.");
 
-        Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
+        Credentials credentials = new Credentials();

Review Comment:
   I've modified the code not to change the current user's credentials (since the framework is partially done).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165518581

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] mbalassi merged pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
mbalassi merged PR #20066:
URL: https://github.com/apache/flink/pull/20066


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r906720670


##########
flink-runtime/src/test/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManagerITCase.java:
##########
@@ -79,9 +80,8 @@ public void configurationIsNullMustFailFast() {
     }
 
     @Test
-    public void oneProviderThrowsExceptionMustFailFast() {
-        assertThrows(
-                Exception.class,
+    public void oneProviderThrowsExceptionMustFailSilently() {
+        assertDoesNotThrow(

Review Comment:
   This change is questionable but after in-depth consideration this makes sense from my perspective. Several tests were failing because the new provider was not able to be loaded (`hadoop-hdfs-client` was not on classpath).
   
   I've considered and rejected the following alternatives:
   * Make `security.kerberos.fetch.delegation-token` to false by default -> This would be a breaking change even if it is small. If we would like to choose an alternative then I would vote on this.
   * Set `security.kerberos.fetch.delegation-token` to false at all the related tests -> Huge amount of unnecessary test code modifications for nothing.
   * At all the related tests `hadoop-hdfs-client` dependency can be added -> This would be an overkill since these tests are not using DT framework at all.
   * Start the DT framework only if `hadoop-hdfs-client` is on classpath -> This would generate pain for users. A good example is when the Hadoop FS provider is turned off but the Flink is not starting until it's not on classpath.
   
   With this change the end result is that Flink job will start even if one provider is not loaded, only a log error message will appear. What we would gain is that in tests we don't need to suffer from not properly configured DT framework exceptions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r907197501


##########
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java:
##########
@@ -1294,7 +1294,7 @@ private void removeLocalhostBindHostSetting(
     private void setTokensFor(ContainerLaunchContext containerLaunchContext) throws Exception {
         LOG.info("Adding delegation tokens to the AM container.");
 
-        Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
+        Credentials credentials = new Credentials();

Review Comment:
   Yes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r906643863


##########
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java:
##########
@@ -1294,7 +1294,7 @@ private void removeLocalhostBindHostSetting(
     private void setTokensFor(ContainerLaunchContext containerLaunchContext) throws Exception {
         LOG.info("Adding delegation tokens to the AM container.");
 
-        Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();

Review Comment:
   I've modified the code not to change the current user's credentials (since the framework is partially done).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] mbalassi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
mbalassi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1174944521

   Great work, @gaborgsomogyi. I especially appreciate the [guide](https://gist.github.com/gaborgsomogyi/ac4f71ead8494da2f5c35265bcb1e885) you have put together to help local testing. On top of the examples that you specify there I managed to verify your change via a Kubernetes Application mode job. So far so good, looking into the implementation in detail now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165359161

   Here is an example run log from K8S. Please be aware that for testing purposes every second token obtain throws an expected exception in order to test retry scenario:
   [hdfs.log](https://github.com/apache/flink/files/8974714/hdfs.log)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] JackWangCS commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
JackWangCS commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r907144799


##########
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java:
##########
@@ -1294,7 +1294,7 @@ private void removeLocalhostBindHostSetting(
     private void setTokensFor(ContainerLaunchContext containerLaunchContext) throws Exception {
         LOG.info("Adding delegation tokens to the AM container.");
 
-        Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
+        Credentials credentials = new Credentials();

Review Comment:
   Got it, then the HBase delegation token provider could also move forward.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r906721436


##########
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##########
@@ -130,12 +131,14 @@ private Map<String, DelegationTokenProvider> loadProviders() {
                             "Delegation token provider {} is disabled so not loaded",
                             provider.serviceName());
                 }
-            } catch (Exception e) {
+            } catch (Exception | NoClassDefFoundError e) {
                 LOG.error(
-                        "Failed to initialize delegation token provider {}.",
+                        "Failed to initialize delegation token provider {}",
                         provider.serviceName(),
                         e);
-                throw e;
+                if (!(e instanceof NoClassDefFoundError)) {
+                    throw new FlinkRuntimeException(e);
+                }

Review Comment:
   This change is questionable but after in-depth consideration this makes sense from my perspective. Several tests were failing because the new provider was not able to be loaded (`hadoop-hdfs-client` was not on classpath).
   
   I've considered and rejected the following alternatives:
   * Make `security.kerberos.fetch.delegation-token` to false by default -> This would be a breaking change even if it is small. If we would like to choose an alternative then I would vote on this.
   * Set `security.kerberos.fetch.delegation-token` to false at all the related tests -> Huge amount of unnecessary test code modifications for nothing.
   * At all the related tests `hadoop-hdfs-client` dependency can be added -> This would be an overkill since these tests are not using DT framework at all.
   * Start the DT framework only if `hadoop-hdfs-client` is on classpath -> This would generate pain for users. A good example is when the Hadoop FS provider is turned off but the Flink is not starting until it's not on classpath.
   
   With this change the end result is that Flink job will start even if one provider is not loaded, only a log error message will appear. What we would gain is that in tests we don't need to suffer from not properly configured DT framework exceptions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165484892

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165434252

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165710811

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on code in PR #20066:
URL: https://github.com/apache/flink/pull/20066#discussion_r906720670


##########
flink-runtime/src/test/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManagerITCase.java:
##########
@@ -79,9 +80,8 @@ public void configurationIsNullMustFailFast() {
     }
 
     @Test
-    public void oneProviderThrowsExceptionMustFailFast() {
-        assertThrows(
-                Exception.class,
+    public void oneProviderThrowsExceptionMustFailSilently() {
+        assertDoesNotThrow(

Review Comment:
   This change is questionable but after in-depth consideration this makes sense from my perspective. Several tests were failing because the new provider was not able to be loaded (`hadoop-hdfs-client` was not on classpath).
   
   I've considered and rejected the following alternatives:
   * Make `security.kerberos.fetch.delegation-token` to false by default -> This would be a breaking change even if it is small. If we would like to choose an alternative then I would vote on this.
   * Set `security.kerberos.fetch.delegation-token` to false at all the related tests -> Huge amount of unnecessary test code modifications for nothing.
   * At all the related tests `hadoop-hdfs-client` dependency can be added -> This would be an overkill since these tests are not using DT framework at all.
   * Start the DT framework only if `hadoop-hdfs-client` is on classpath -> This would generate pain for users. A good example is when the Hadoop FS provider is turned off but the Flink is not starting until it's not on classpath.
   
   With this change the end result is that Flink job will start even if one provider is not loaded, only a log error message will appear. What we would gain is that in tests we don't need to suffer from not properly configured DT framework exceptions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] mbalassi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
mbalassi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1174956266

   Merging with the note that a separate [ticket](https://issues.apache.org/jira/browse/FLINK-25911) exists for the holistic documentation of this broader feature, which we most provide on finishing the implementation.
   
   Given the timeline I propose that we only target the whole delegation token framework story for 1.17, but we can keep merging portions of it that are switched off by default.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165517194

   Unrelated error appeared:
   `E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.9_amd64.deb given on commandline`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1165536968

   @JackWangCS not sure where are you w/ the HBase part but plz have a look how I've changed the API and the manager.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] JackWangCS commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
JackWangCS commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1167079827

   > @JackWangCS not sure where are you w/ the HBase part but plz have a look how I've changed the API and the manager.
   
   Thanks for the reminder, I think I could wait the Hadoop FS part merged first.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] gaborgsomogyi commented on pull request #20066: [FLINK-25908][runtime][security] Add HadoopFSDelegationTokenProvider

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on PR #20066:
URL: https://github.com/apache/flink/pull/20066#issuecomment-1175270701

   @JackWangCS green light.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org