You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "akatona84 (via GitHub)" <gi...@apache.org> on 2023/05/19 12:39:18 UTC

[GitHub] [kafka] akatona84 opened a new pull request, #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

akatona84 opened a new pull request, #13733:
URL: https://github.com/apache/kafka/pull/13733

   
   org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls scans a path and collects plugin candidates from there. However, if a directory is not readable, it fails with AccessDeniedException instead of skipping it. This commit fixes this minor problem with the change of plugin path filter used during scanning.
   
   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 closed pull request #13733: KAFKA-13337: Plugin loader error handling is done separately, per plugin

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 closed pull request #13733: KAFKA-13337: Plugin loader error handling is done separately, per plugin
URL: https://github.com/apache/kafka/pull/13733


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] vamossagar12 commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1555458221

   Thanks @akatona84 initially I had the same question as @divijvaidya but your explanation cleared up that doubt to some extent. If I understood the problem correctly that you described, the MM2 user failed because of the presence of a directory(.pki) which only a kafka user has access to. With this PR, MM2 user should be able to start MM2 cluster even though there's a directory to which it doesn't have access to.
   If that is the case, IMO the current behaviour is better that is we let things fail upfront and let the admins figure out what user groups should have access to which directories etc. Atleast with this, we won't have a case where the cluster can be brought in in an unexpected way. This is based on my understanding your explanation and there's a chance I might have missed something. Let me know what you think.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1558629671

   > Is there any way to decide if a specific file/dir is meant to be a plugin?
   currently the code is only checking whether it is a dir or the extension is zip, jar or class. Yet for an unreadable one, you can't decide anything, even these. Actual plugin scanning is done bit later
   
   Issues can be:
   io exceptions - reading problems
   url/path problems - bad paths were given (don't see other possible cases)
   
   It would be a breaking change to prevent Connect to start in case of errors like these.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] urbandan commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "urbandan (via GitHub)" <gi...@apache.org>.
urbandan commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1557329915

   @akatona84 Is there any way to decide if a specific file/dir is meant to be a plugin?
   For the scenario you described, I understand that ".pki" was not an actual plugin, and in that case, it is desirable to just skip the dir. But in any other case, I agree that Connect should fail instead - if the intention of the admin was to add a plugin, but Connect cannot pick it up, that is a critical issue.
   If we have any method to decide if a file is meant to be a plugin or not, we can decide to fail or just skip. If we cannot decide if the file is a plugin or not, failing is the safer choice.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 commented on a diff in pull request #13733: KAFKA-13337: Plugin loader error handling is done separately, per plugin

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 commented on code in PR #13733:
URL: https://github.com/apache/kafka/pull/13733#discussion_r1203763926


##########
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoaderTest.java:
##########
@@ -22,6 +22,7 @@
 import org.junit.rules.TemporaryFolder;
 
 import java.io.File;
+import java.io.IOException;

Review Comment:
   yeah, won't hurt :)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] gharris1727 commented on a diff in pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "gharris1727 (via GitHub)" <gi...@apache.org>.
gharris1727 commented on code in PR #13733:
URL: https://github.com/apache/kafka/pull/13733#discussion_r1202806994


##########
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoaderTest.java:
##########
@@ -22,6 +22,7 @@
 import org.junit.rules.TemporaryFolder;
 
 import java.io.File;
+import java.io.IOException;

Review Comment:
   There's one more leaked classloader, in testLoadingMixOfValidAndInvalidPlugins if you want to fix that as well.
   I don't see these warnings as relevant, since this is test code. But I won't stop you from adding the try-with-resources.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1554580666

   Without filtering out non-readable ones it fails later but getting ignored, it won't load any plugins, not just the problematic one skipped.
   around here:
   https://github.com/apache/kafka/blob/3109e9c843e33057dd5d823c50c41fb91dc1a8fc/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java#L269
   
   So we should at least load the plugins what we have access to.
   
   This is the scenario:
   
   - plugins are located in /var/lib/kafka
   - also this happens to be the kafka user's home, yet it's world-readable
   - an only kafka readable directory was put there (.pki)
   - mirrormaker2 uses /var/lib/kafka too but mm2 is executed by another user than kafka
   - mm2 failed to load plugins (any) because of this unreadable .pki dir, and fails to start because its config has entries which would need (the currently not-loaded) config-providers to resolve
   
   This was the motivation to do the ticket and the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] divijvaidya commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "divijvaidya (via GitHub)" <gi...@apache.org>.
divijvaidya commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1554545574

   It would be helpful if you could also explain why skipping the plugin (instead of failing hard) is ok? Would silently starting up with some missing plugins be ok?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 commented on pull request #13733: KAFKA-13337: Plugin loader error handling is done separately, per plugin

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1568053208

   PR #13334 addresses my concerns, if I'm not mistaking. It was refactored in a way that now it's loading other plugin dirs in case of errors popping up at one. Closing this PR.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] machi1990 commented on a diff in pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "machi1990 (via GitHub)" <gi...@apache.org>.
machi1990 commented on code in PR #13733:
URL: https://github.com/apache/kafka/pull/13733#discussion_r1198923768


##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/PluginUtils.java:
##########
@@ -142,8 +142,12 @@ public class PluginUtils {
             + "|common\\.config\\.provider\\.(?!ConfigProvider$).*"
             + ")$");
 
-    private static final DirectoryStream.Filter<Path> PLUGIN_PATH_FILTER = path ->
-        Files.isDirectory(path) || isArchive(path) || isClassFile(path);
+    private static final DirectoryStream.Filter<Path> PLUGIN_PATH_FILTER = path -> {
+      if (!Files.isReadable(path)) {
+        log.warn("Non readable plugin path found, skipping: '{}'", path);
+      }
+      return Files.isReadable(path) && (Files.isDirectory(path) || isArchive(path) || isClassFile(path));

Review Comment:
   ```suggestion
         boolean isReadable = Files.isReadable(path);
         if (!isReadable) {
           log.warn("Non readable plugin path found, skipping: '{}'", path);
         }
         return isReadable && (Files.isDirectory(path) || isArchive(path) || isClassFile(path));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] akatona84 commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "akatona84 (via GitHub)" <gi...@apache.org>.
akatona84 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1556885401

   I just wrote down my motivation to exclude unreadable ones from plugin path. This specific case can be easily solved with better dir organizing. But you made me think what I really wanted. :)
   
   IMO the issue is around skipping ALL plugins (or at least the following ones after the faulty) to be loaded whenever there's a bad one. And not failing the startup, just ignoring them all. (in my case mm2 relied on a config provider plugin hence it failed to start)
   
   With the current code if an admin is adding a plugin wrong the connect cluster can startup but the connectors/config providers will fail. So either 
   - we should fail the herder startup entirely OR
   - do the error handling properly, per plugin.
   
   I'm overwriting the PR aiming this behavior, and curious what you guys think about it and my argument.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] vamossagar12 commented on pull request #13733: KAFKA-13337: fix of possible java.nio.file.AccessDeniedException during Connect plugin directory scan

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13733:
URL: https://github.com/apache/kafka/pull/13733#issuecomment-1560458136

   Thanks @akatona84 for the clarification. If I go by what @gharris1727 is saying above, i.e 
   
   > When a directory is not accessible, only plugins within that directory should be missing. Plugins from other directories should still be visible if possible.
   
   then the changes that you have made in the PR become relevant. The changes look fine to me 👍 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org