You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/04/05 08:47:56 UTC

[GitHub] [kafka] ncliang opened a new pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

ncliang opened a new pull request #10475:
URL: https://github.com/apache/kafka/pull/10475


   The default implementation, which is inheritted from ClassLoader, searches the
   classloader tree from parent first. This causes issues when the resource is
   available both on the classpath and the plugin path. Instead of attempting to
   load the resource from plugin path first, the system classloader is consulted
   first and loads the resource from classpath.
   
   A testcase is added in PluginsTest to verify this behavior is fixed by this
   commit.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-814500889


   Thanks for review @mageshn and @gharris1727 . I think I've addressed all comments. Please take another look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] gharris1727 commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
gharris1727 commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r624048885



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -143,16 +176,35 @@ public static void assertAvailable() throws AssertionError {
     public static List<String> pluginPath() {

Review comment:
       Yes, but plugins are now no longer unique by name. The `test.plugins.ReadVersionFromResource` plugin class is now provided by two different plugin jars, and when the DelegatingClassLoader attempts to load it, it is arbitrary which one is chosen.
   
   For example, if you set up Plugins with this path, and load the `test.plugins.ReadVersionFromResource`, will you get v1 or v2? I believe it depends on iteration order in the hashmap, and so happens to be deterministic here but not in general.
   
   Currently, Connect does not have first class support for plugin paths with duplicate plugins due to this ambiguity, so I don't think this method should return a plugin path which is unsupported.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mageshn commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
mageshn commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r608248447



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/Plugins.java
##########
@@ -61,12 +61,24 @@ public Plugins(Map<String, String> props) {
         delegatingLoader.initLoaders();
     }
 
+    public Plugins(Map<String, String> props, ClassLoader parent) {

Review comment:
       I assume then it doesn't have to be `public`, right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-827234589


   Ping @mageshn @gharris1727 , the failed tests don't seem to be related to this change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] gharris1727 commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
gharris1727 commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r607216362



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/PluginsTest.java
##########
@@ -342,6 +343,47 @@ public void newPluginsShouldConfigureWithPluginClassLoader() {
         assertPluginClassLoaderAlwaysActive(samples);
     }
 
+    @Test
+    public void pluginClassLoaderReadVersionFromResource() {
+        TestPlugins.assertAvailable();
+
+        Map<String, String> pluginProps = new HashMap<>();
+        pluginProps.put(WorkerConfig.PLUGIN_PATH_CONFIG,
+                TestPlugins.pluginPath().stream()
+                        .filter(s -> s.contains("read-version-from-resource-v1"))
+                        .collect(Collectors.joining()));
+        plugins = new Plugins(pluginProps);
+
+        Converter converter = plugins.newPlugin(
+                TestPlugins.READ_VERSION_FROM_RESOURCE,
+                new AbstractConfig(new ConfigDef(), Collections.emptyMap()),
+                Converter.class
+        );
+        assertEquals("1.0.0",
+                new String(converter.fromConnectData(null, null, null)));
+        PluginClassLoader pluginClassLoader = plugins.delegatingLoader()
+                .pluginClassLoader(TestPlugins.READ_VERSION_FROM_RESOURCE);
+        assertNotNull(pluginClassLoader);
+
+
+        // Re-initialize Plugins object with plugin class loader in the class loader tree. This is
+        // to simulate the situation where jars exist on both system classpath and plugin path.
+        pluginProps.put(WorkerConfig.PLUGIN_PATH_CONFIG,
+                TestPlugins.pluginPath().stream()
+                        .filter(s -> s.contains("read-version-from-resource-v2"))
+                        .collect(Collectors.joining()));
+        plugins = new Plugins(pluginProps, pluginClassLoader);

Review comment:
       While I understand that the motivation here for explicitly providing the parent is control the behavior of the parent ClassLoader, I think this test is for a pretty extreme case. This is testing a:
   
   1. PluginClassLoader whose parent is a
   2. DelegatingClassLoader whose parent is a
   3. PluginClassLoader whose parent is a
   4. Delegating ClassLoader whose parent is the
   5. App ClassLoader
   
   Does this test still succeed if we use a different ClassLoader that's not part of a chain like this?
   We could construct a plain `URLClassLoader` instance as the parent, but that would still require the new Plugins constructor.
   If we add a test resource file to the App ClassLoader we could skip the additional constructor. We could use multiple different resource file names in the different test plugins to test the three cases:
   
   1. parent has resource but plugin does not
   2. plugin has resource but parent does not
   3. parent and plugin both have the resource

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/PluginsTest.java
##########
@@ -342,6 +343,47 @@ public void newPluginsShouldConfigureWithPluginClassLoader() {
         assertPluginClassLoaderAlwaysActive(samples);
     }
 
+    @Test
+    public void pluginClassLoaderReadVersionFromResource() {
+        TestPlugins.assertAvailable();
+
+        Map<String, String> pluginProps = new HashMap<>();
+        pluginProps.put(WorkerConfig.PLUGIN_PATH_CONFIG,
+                TestPlugins.pluginPath().stream()

Review comment:
       nit: TestPlugins has a Map from class name to File, we can add a method for getting the path to a single plugin jar to TestPlugins itself, so we don't have to iterate here.

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -114,6 +118,9 @@
             pluginJars.put(SAMPLING_HEADER_CONVERTER, createPluginJar("sampling-header-converter"));
             pluginJars.put(SAMPLING_CONFIG_PROVIDER, createPluginJar("sampling-config-provider"));
             pluginJars.put(SERVICE_LOADER, createPluginJar("service-loader"));
+            // Create two versions of the same plugin reading version string from a resource
+            pluginJars.put(READ_VERSION_FROM_RESOURCE + ".v1", createPluginJar("read-version-from-resource-v1"));

Review comment:
       nit: make each of these a separate constant




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-819353968


   Ping @mageshn @gharris1727 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] gharris1727 commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
gharris1727 commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r621586391



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -98,22 +97,56 @@
      * to load internal classes, and samples information about their initialization.
      */
     public static final String SERVICE_LOADER = "test.plugins.ServiceLoaderPlugin";
+    /**
+     * Class name of a plugin which reads a version string from resource.
+     */
+    public static final String READ_VERSION_FROM_RESOURCE = "test.plugins.ReadVersionFromResource";
 
     private static final Logger log = LoggerFactory.getLogger(TestPlugins.class);
-    private static final Map<String, File> PLUGIN_JARS;
+    private static final Map<String, PluginJar> PLUGIN_JARS;
     private static final Throwable INITIALIZATION_EXCEPTION;
 
+    private static final class PluginJar {
+        String className;
+        File jarFile;
+
+        public PluginJar(String className, File jarFile) {
+            this.className = className;
+            this.jarFile = jarFile;
+        }
+
+        public String getClassName() {
+            return className;
+        }
+
+        public File getJarFile() {
+            return jarFile;
+        }
+    }
+
     static {
         Throwable err = null;
-        HashMap<String, File> pluginJars = new HashMap<>();
+        Map<String, PluginJar> pluginJars = new HashMap<>();
         try {
-            pluginJars.put(ALWAYS_THROW_EXCEPTION, createPluginJar("always-throw-exception"));
-            pluginJars.put(ALIASED_STATIC_FIELD, createPluginJar("aliased-static-field"));
-            pluginJars.put(SAMPLING_CONVERTER, createPluginJar("sampling-converter"));
-            pluginJars.put(SAMPLING_CONFIGURABLE, createPluginJar("sampling-configurable"));
-            pluginJars.put(SAMPLING_HEADER_CONVERTER, createPluginJar("sampling-header-converter"));
-            pluginJars.put(SAMPLING_CONFIG_PROVIDER, createPluginJar("sampling-config-provider"));
-            pluginJars.put(SERVICE_LOADER, createPluginJar("service-loader"));
+            pluginJars.put("always-throw-exception",
+                    new PluginJar(ALWAYS_THROW_EXCEPTION, createPluginJar("always-throw-exception")));
+            pluginJars.put("aliased-static-field",
+                    new PluginJar(ALIASED_STATIC_FIELD, createPluginJar("aliased-static-field")));
+            pluginJars.put("sampling-converter",
+                    new PluginJar(SAMPLING_CONVERTER, createPluginJar("sampling-converter")));
+            pluginJars.put("sampling-configurable",
+                    new PluginJar(SAMPLING_CONFIGURABLE, createPluginJar("sampling-configurable")));
+            pluginJars.put("sampling-header-converter",
+                    new PluginJar(SAMPLING_HEADER_CONVERTER, createPluginJar("sampling-header-converter")));
+            pluginJars.put("sampling-config-provider",
+                    new PluginJar(SAMPLING_CONFIG_PROVIDER, createPluginJar("sampling-config-provider")));
+            pluginJars.put("service-loader",
+                    new PluginJar(SERVICE_LOADER, createPluginJar("service-loader")));
+            // Create two versions of the same plugin reading version string from a resource
+            pluginJars.put("read-version-from-resource-v1",

Review comment:
       Since these string literals are now relevant elsewhere, we should make them reusable constants. Perhaps they should be enums? I realize now that perhaps the class names should have also been enums but 🤷.

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -143,16 +176,35 @@ public static void assertAvailable() throws AssertionError {
     public static List<String> pluginPath() {

Review comment:
       The existing structure of this class bakes in the assumption that each plugin only appears once on the plugin path, and that the common use-case plugin path (returned by this method) is then valid. This change would make this method return an invalid plugin path, with flakey behavior when loading the duplicated plugin (which plugin gets loaded would be determined by iteration order over the PLUGIN_JARS hashmap).
   
   There are a couple of ways out:
   1. avoid tackling this now, and separate the two plugins with different class names
   2. have this method pick one of the duplicates to drop deterministically, so that the Plugins class doesn't have undefined loading behavior.
   3. allow/deny some of these plugins from being included in this default plugin path, and keep some plugins back for the more specific tests
   4. remove this method, and have PluginsTest explicitly include the needed plugins in each test, and/or a default list to include if none are specifically requested.

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -98,22 +97,56 @@
      * to load internal classes, and samples information about their initialization.
      */
     public static final String SERVICE_LOADER = "test.plugins.ServiceLoaderPlugin";
+    /**
+     * Class name of a plugin which reads a version string from resource.
+     */
+    public static final String READ_VERSION_FROM_RESOURCE = "test.plugins.ReadVersionFromResource";
 
     private static final Logger log = LoggerFactory.getLogger(TestPlugins.class);
-    private static final Map<String, File> PLUGIN_JARS;
+    private static final Map<String, PluginJar> PLUGIN_JARS;
     private static final Throwable INITIALIZATION_EXCEPTION;
 
+    private static final class PluginJar {
+        String className;
+        File jarFile;
+
+        public PluginJar(String className, File jarFile) {

Review comment:
       Perhaps another thing that should be representable by the data structure in this class is a jar file which contains multiple plugins. This is a completely supported use-case, but we don't have any tests related to it. If we're adding the ability to represent an unsupported configuration, maybe we should make sure it's able to represent other supported use-cases too :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r608245568



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/Plugins.java
##########
@@ -61,12 +61,24 @@ public Plugins(Map<String, String> props) {
         delegatingLoader.initLoaders();
     }
 
+    public Plugins(Map<String, String> props, ClassLoader parent) {

Review comment:
       Yep. Added the `VisibleForTesting` comment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-861856520


   Thanks for review @gharris1727 . @kkonstantine , @rhauch , could one of you take a look when you get a chance?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mageshn commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
mageshn commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r607285210



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/Plugins.java
##########
@@ -61,12 +61,24 @@ public Plugins(Map<String, String> props) {
         delegatingLoader.initLoaders();
     }
 
+    public Plugins(Map<String, String> props, ClassLoader parent) {

Review comment:
       Is this primarily added for testing purpose?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r607232615



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/PluginsTest.java
##########
@@ -342,6 +343,47 @@ public void newPluginsShouldConfigureWithPluginClassLoader() {
         assertPluginClassLoaderAlwaysActive(samples);
     }
 
+    @Test
+    public void pluginClassLoaderReadVersionFromResource() {
+        TestPlugins.assertAvailable();
+
+        Map<String, String> pluginProps = new HashMap<>();
+        pluginProps.put(WorkerConfig.PLUGIN_PATH_CONFIG,
+                TestPlugins.pluginPath().stream()
+                        .filter(s -> s.contains("read-version-from-resource-v1"))
+                        .collect(Collectors.joining()));
+        plugins = new Plugins(pluginProps);
+
+        Converter converter = plugins.newPlugin(
+                TestPlugins.READ_VERSION_FROM_RESOURCE,
+                new AbstractConfig(new ConfigDef(), Collections.emptyMap()),
+                Converter.class
+        );
+        assertEquals("1.0.0",
+                new String(converter.fromConnectData(null, null, null)));
+        PluginClassLoader pluginClassLoader = plugins.delegatingLoader()
+                .pluginClassLoader(TestPlugins.READ_VERSION_FROM_RESOURCE);
+        assertNotNull(pluginClassLoader);
+
+
+        // Re-initialize Plugins object with plugin class loader in the class loader tree. This is
+        // to simulate the situation where jars exist on both system classpath and plugin path.
+        pluginProps.put(WorkerConfig.PLUGIN_PATH_CONFIG,
+                TestPlugins.pluginPath().stream()
+                        .filter(s -> s.contains("read-version-from-resource-v2"))
+                        .collect(Collectors.joining()));
+        plugins = new Plugins(pluginProps, pluginClassLoader);

Review comment:
       I agree that having the additional testcases would be valuable. However, I do think that adding the resource directly to the app classloader makes the test much less readable. I like the idea of constructing and using a plain `URLClassLoader` as the parent. While it still requires the additional constructor to be exposed on `Plugins`, we do already expose the parent parameter on `DelegatingClassLoader` for specifically the flexibility of controlling the parent loader.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r608251616



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/Plugins.java
##########
@@ -61,12 +61,24 @@ public Plugins(Map<String, String> props) {
         delegatingLoader.initLoaders();
     }
 
+    public Plugins(Map<String, String> props, ClassLoader parent) {

Review comment:
       Good point. Made the constructor package private.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-844392655


   Thanks for review @gharris1727 . I think I address the concern about undeterministic behavior when multiple jars on the plugin path provide the same plugin by introducing a `hideFromPluginPath` boolean on the enum. With this boolean, we can hide the duplicate jar from the plugin path returned in `pluginPath` and avoid the undeterministic behavior. PTAL?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on a change in pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on a change in pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#discussion_r623622054



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -98,22 +97,56 @@
      * to load internal classes, and samples information about their initialization.
      */
     public static final String SERVICE_LOADER = "test.plugins.ServiceLoaderPlugin";
+    /**
+     * Class name of a plugin which reads a version string from resource.
+     */
+    public static final String READ_VERSION_FROM_RESOURCE = "test.plugins.ReadVersionFromResource";
 
     private static final Logger log = LoggerFactory.getLogger(TestPlugins.class);
-    private static final Map<String, File> PLUGIN_JARS;
+    private static final Map<String, PluginJar> PLUGIN_JARS;
     private static final Throwable INITIALIZATION_EXCEPTION;
 
+    private static final class PluginJar {
+        String className;
+        File jarFile;
+
+        public PluginJar(String className, File jarFile) {

Review comment:
       I added the test resource `multiple-plugins-in-jar` to this class so that it can be picked up by `DelegatingClassLoaderTest::testLoadingPluginClass` to verify both plugins within the jar loaded.

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java
##########
@@ -143,16 +176,35 @@ public static void assertAvailable() throws AssertionError {
     public static List<String> pluginPath() {

Review comment:
       I'm not sure I understand the concern. The change here is just to have the PLUGIN_JARS list be changed to a map with keys being a the resource directory. The jar itself is still created in the same, unique, resource directory by `createPluginJar` . This list of jar files is also still what is returned here in pluginPath .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ncliang commented on pull request #10475: KAFKA-12610: Implement PluginClassLoader::getResource

Posted by GitBox <gi...@apache.org>.
ncliang commented on pull request #10475:
URL: https://github.com/apache/kafka/pull/10475#issuecomment-813282995


   @mageshn @C0urante @gharris1727 , please review when you get a chance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org