You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/02/23 17:16:19 UTC

[GitHub] [flink] StephanEwen opened a new pull request #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

StephanEwen opened a new pull request #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196
 
 
   ## What is the purpose of the change
   
   The AWS SDK always registers a MetricAdminMBean at the JMX MBean Server.
   This allows users to turn on / off more detailed metrics via the JMX admin interface.
   
   However, this registered bean keeps the user code classloader alive and thus causes a class leak.
   
   Excluding the `com.amazonaws.jmx.SdkMBeanRegistrySupport` class from the connectors disables the default registration thus preventing the class leak. It should only disable that JMX metric admin interface and still allow users to manually configure/activate metrics for the AWS SDK.
   
   ## Brief change log
   
     - Exclude `com.amazonaws.jmx.SdkMBeanRegistrySupport` from Kinesis fat jar
     - Add a "shading check" in the build scripts to check whether the class is present in the shaded jar.
   
   ## Verifying this change
   
   This change has self-contained tests.
   
   To test the effectiveness of this, users would need to run a Flink session cluster on AWS and repeatedly submit/cancel a job that uses the Kinesis connector.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **no**
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
     - The serializers: **no**
     - The runtime per-record code paths (performance sensitive): **no**
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **no**
     - The S3 file system connector: **no**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **no**
     - If yes, how is the feature documented? **not applicable**
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-592035356
 
 
   @StephanEwen The solution seems clean to me.
   Still a connector may not clean something maintained in a library it depends on (e.g. KPL) if cleanup API is not provided, like `FileAgeManager`, but anyway it needs your proposed API to register cleanup hook and once it's there, we can suggest how some library like KPL can be more Flink-friendly (like don't assume sharing lifecycle with JVM - but yeah "library" actually shouldn't have assumed it in the first place...).
   
   Re: multiple jobs and plugin dependency, I had the same understanding as Stephan, specifically about the issues I am aware of, but I was not sure about MBean issue, but based on what Stephan said, it seems like MBean is per classloader. So, that'd be right that cleanup hook is a solution as long as user leak happens per user classloader not leaking into global JVM structure. I see several user leak is such a case per classloader (but not necessarily all).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590489247
 
 
   How about providing a no-op replacement for `SdkMBeanRegistrySupport` that behaves like `SdkMBeanRegistry#NONE`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] tweise commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
tweise commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-597230875
 
 
   @StephanEwen `registerCleanupHook` is a nice idea that would be super useful for applications and frameworks on top of Flink. The Beam Flink runner could probably make use of it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590093632
 
 
   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "status" : "PENDING",
       "url" : "https://travis-ci.com/flink-ci/flink/builds/150208060",
       "triggerID" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 768bc3cfaf831d11952cb525d973d3844815f97f Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150208060) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591390483
 
 
   > This problem would still exist with the cleanup hook implementation though, no?
   
   I think not, because the cleanup hook is not run when the task exits, but when the classloader is disposed. Which at the moment is once the last task exits, in the future probably once the slot is released.
   
   The point when the classloader is disposed is exactly the point when we want to make sure that all other "side references" are also removed. This seems fairly clean to me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-593276364
 
 
   @guoweiM I was just trying to figure out whether I understood you correctly.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590130156
 
 
   > I am wondering if then it would not be better to never register it.
   That's a fair point and I agree.
   
   There is a direct reference to the removed class in `com.amazonaws.jmx.JmxInfoProvider`; any idea how troublesome that might be? (I couldn't find any usages of that class quickly).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591995340
 
 
   > If you have 2 separate jobs using the kinesis connector running on a single TM can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?
   
   In the general case, depends how these "leak objects" are scoped.
   In this case I think it might work. The two jobs would be in different class loaders, load different copies of the AdminMBean class, and register multiple beans (it looks like there is specific code for collision of names and appending a number suffix in that case).  Deregistration would then deregister only the one from that classloader.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-592587293
 
 
   @guoweiM Singletons are usually scoped to the Class, not the JVM. If a class is loaded two times (by two different classloaders) then you also have two different singletons. Only if you can guarantee that the class comes from the application class loader, then you have truely only once instance of the singleton.
   
   In the Flink case, this is actually somewhat desirable, in my understanding.
   Different jobs that might load different jars (and different versions of a library) have different "singletons", so they do not conflict with each other.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591995340
 
 
   > If you have 2 separate jobs using the kinesis connector running on a single TM can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?
   
   In the general case, depends how these "leak objects" are scoped.
   In this case I think it might work. The two jobs would be in different class loaders, load different copies of the AdminMBean class, and register multiple beans (it looks like there is specific code for collision of names and appending a number suffix in that case).  Deregistration would then deregister only the one from that classloader.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590751044
 
 
   > How about providing a no-op replacement for SdkMBeanRegistrySupport that behaves like SdkMBeanRegistry#NONE?
   
   That can definitely work, and sounds good to me. It would be effectively the exclusion defined here, but an extra class in the connector project.
   
   I am wondering if we can make this "generically" accessible also to other projects. The user that ran into this issue also had a self-built SQS connector that also triggered the problem.
   I guess this could be solved by making the AWS SDK dependency in his SQS connector "provided" and make sure our adjusted one from the Kinesis SDK is pulled.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590125983
 
 
   I thought about that a bit.
   
   Given that this code is shared between multiple sources / tasks in the same classloader, unregisterin in "close()" means when the first task is canceled or finished, the bean goes away. I am wondering if then it would not be better to never register it. Unless we do some tracking and remove the bean when the last strong reference to a AWS connector is released (some Phantom Reference magic?).
   
   That's why I first proposed this, it seemed simpler.
   If someone would be eager to invest time into a more elaborate solution, like an AWS metrics cleaner util, I'd be +1 for that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591999051
 
 
   > If you have 2 separate jobs using the kinesis connector running on a single TM can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?
   
   In the general case, depends how these "leak objects" are scoped.
   In this case I think it might work. The two jobs would be in different class loaders, load different copies of the AdminMBean class, and register multiple beans (it looks like there is specific code for collision of names and appending a number suffix in that case).  Deregistration would then deregister only the one from that classloader.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-592548023
 
 
   In response to this problem, I think there are two ideas. One is the idea of cleaning up. Everyone has said it in more detail, and I have n’t added anything. There is another idea is to keep the singleton. Some third-party libraries have the design philosophy. There must be individual singleton objects, and these objects are consistent with the life cycle of the JVM. In this case, if we can guarantee that these singleton objects in a TM do not increase with the increase of Task, is it acceptable? In this way, we can also avoid case-by-case resolution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590431825
 
 
   > There is a direct reference to the removed class in com.amazonaws.jmx.JmxInfoProvider; any idea how troublesome that might be? (I couldn't find any usages of that class quickly).
   
   You are right, if we exclude `com.amazonaws.jmx.SdkMBeanRegistrySupport`, we should exclude `com.amazonaws.jmx.JmxInfoProviderSupport` as well.
   
   But some PhantomReference-based "AWS metrics cleaner util" or so sounds cleaner in the end.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] guoweiM edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
guoweiM edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-592548023
 
 
   In response to this problem, I think there are two ideas. One is the idea of cleaning up. Everyone has said it in more detail, and I have n’t added anything. There is another idea is to keep the singleton. Some third-party libraries have the design philosophy. There must be individual singleton objects, and these objects are consistent with the life cycle of the JVM. In this case, if we can guarantee that these singleton objects in a TM do not increase with the increase of Task, is it acceptable? In this way, we can also avoid case-by-case resolution.  @StephanEwen What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590435418
 
 
   Maybe someone using Kinesis could chime in here. @jgrier @tweise @mxm or @hwanju ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-592934606
 
 
   @StephanEwen I agree that the singleton is scoped to the Class and the Class could be loaded multi-times by the different classloaders. This also leads to “many” singleton objects in the JVM scope. In Flink, these may be the normal case because Flink uses the FlinkUserClassLoader.
   
   However, I think if we could provide some mechanism that guarantees that the singleton library could be only loaded once in a JVM. This could lead to only one singleton object. There might be different versions of the library but the number might very limited compared to the failover and many new jobs. 
   
   The con of this option is how to get this target without many efforts from the user in all the scenarios. But in my experience, the connector normally has very heavy dependence libraries and normal leads to leak something. So in the connector scenarios, we could provide some mechanisms such as plugin to reach this goal. 
   
   The pro of this option is that this is a general method. The user does not pay much effort to resolve this problem. Even if we provide some method for user to clean up the resource, the user should pay some efforts to clean up theses resource. Sometimes it is possible. Some time it is very difficult for the user to do it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590431825
 
 
   > There is a direct reference to the removed class in com.amazonaws.jmx.JmxInfoProvider; any idea how troublesome that might be? (I couldn't find any usages of that class quickly).
   
   Some PhantomReference-based "AWS metrics cleaner util" or so sounds cleaner in the end.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591347938
 
 
   @hwanju You just gave me an idea here: What if we expose something like 
   `getUserCodeClassLoader().registerCleanupHook(Runnable)`?
   
   Then one could register an MBean unregister action, a thread shutdown action, etc. whatever helps.
   Prerequisites for that would be:
   
     - Changing the type of the user code class loader to `FlinkDynamicClassLoader extends ClassLoader`.
     - Adjusting methods like `RuntimeContext.getUserCodeClassLoader();` to return that type.
     - Adding a `registerCleanupHook(Runnable)` method on the `FlinkDynamicClassLoader` class and calls these hooks in the `close()` method.
     - This also should go together well with [FLINK-16245](https://issues.apache.org/jira/browse/FLINK-16245)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen removed a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591994660
 
 
   > If you have 2 separate jobs using the kinesis connector running on a single TM can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?
   
   In the general case, depends how these "leak objects" are scoped.
   In this case I think it might work. The two jobs would be in different class loaders, load different copies of the AdminMBean class, and register multiple beans (it looks like there is specific code for collision of names and appending a number suffix in that case).  Deregistration would then deregister only the one from that classloader.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590751044
 
 
   > How about providing a no-op replacement for SdkMBeanRegistrySupport that behaves like SdkMBeanRegistry#NONE?
   
   That can definitely work, and sounds good to me. It would be effectively the exclusion defined here, but an extra class in the connector project.
   
   I am wondering if we can make this "generically" accessible also to other projects. The user that ran into this issue also had a self-built SQS connector that ran into a similar issue.
   I guess this could be solves by making the AWS SDK dependency in his SQS connector "provided" and make sure our adjusted one from the Kinesis SDK is pulled.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
flinkbot commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590093632
 
 
   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 768bc3cfaf831d11952cb525d973d3844815f97f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590130156
 
 
   > I am wondering if then it would not be better to never register it.
   
   That's a fair point and I agree.
   
   There is a direct reference to the removed class in `com.amazonaws.jmx.JmxInfoProvider`; any idea how troublesome that might be? (I couldn't find any usages of that class quickly).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590431825
 
 
   > There is a direct reference to the removed class in com.amazonaws.jmx.JmxInfoProvider; any idea how troublesome that might be? (I couldn't find any usages of that class quickly).
   
   You are right, if we exclude `com.amazonaws.jmx.SdkMBeanRegistrySupport`, we should exclude `com.amazonaws.jmx.JmxInfoProviderSupport` as well.
   
   But some PhantomReference-based "AWS metrics cleaner util" or so sounds cleaner in the end. (No pun intended).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591253938
 
 
   Although I am also not sure what would be the ramification of excluding`SdkMBeanRegistrySupport` to other depending classes, it seems like no-op replacement seems safe and clever.
   
   Related to `FlinkKinesisProducer#close()` solution, would it be viable to include `AwsSdkMetrics#unregisterMetricAdminMbean()` inside `BlobLibraryCacheManager#releaseClassLoader` to get that unregistered once all the tasks on the job are closed? But it seems too specific call.
   
   At any rate, we may not make sure that removing this can resolve the leak, right? (if I am reading FLINK-16142 correctly). AFAIR, KPL also has statically launched daemon thread (file age manager), which is not shut down on task close and holds on to the class loaded by user class loader, leading to leak. In this case, not sure how just removing this MBean registry would get things better. We've seen OOM killed with unlimited metaspace, but it seems like it hits metaspace OOM before eating up all the memory as of 1.10 (which we have not run yet).
   
   Curious if you have measured number of class loaders before/after the fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590093632
 
 
   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/flink-ci/flink/builds/150208060",
       "triggerID" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5488",
       "triggerID" : "768bc3cfaf831d11952cb525d973d3844815f97f",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 768bc3cfaf831d11952cb525d973d3844815f97f Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150208060) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5488) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591380175
 
 
   > Given that this code is shared between multiple sources / tasks in the same classloader, unregisterin in "close()" means when the first task is canceled or finished, the bean goes away. I am wondering if then it would not be better to never register it.
   
   This problem would still exist with the cleanup hook implementation though, no?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
flinkbot commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590092938
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit db39bd196a25cd3a79d50b3ec97c4332b0716788 (Sun Feb 23 17:19:16 UTC 2020)
   
   **Warnings:**
    * **1 pom.xml files were touched**: Check for build and licensing issues.
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
guoweiM commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-593172858
 
 
   There is no addition. Thanks for your patient explanation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591392713
 
 
   > I think now, because the cleanup hook is not run when the task exits, but when the classloader is disposed. Which at the moment is once the last task exits [...]
   
   If you have 2 separate jobs using the kinesis connector can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-590125367
 
 
   Have you thought about calling `AwsSdkMetrics#unregisterMetricAdminMbean()` in `FlinkKinesisProducer#close()`? While your approach works and is safe, it seems like a fairly nuclear option.
   
   What about the S3 hadoop/presto filesystems? Are they not affected, or are we not worried since they are loaded only once?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-593107095
 
 
   @guoweiM What you describe sounds like the often recommended solution:
   
     - For sessions, if the dependency is "too heavy or leaky for dynamic loading", put it in "/lib" and done. Then you exactly get the dependency in the AppClassLoader, get "one singleton", avoid reloading unloading.
     - For single jobs, use the "Application" (or "single-job") mode to avoid dynamic class loading.
   
   Being able to load connectors from "plugins" (via some factories) would help solve dependency conflicts.
   
   Are you suggesting something in addition to that?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
StephanEwen commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591390483
 
 
   > This problem would still exist with the cleanup hook implementation though, no?
   
   I think now, because the cleanup hook is not run when the task exits, but when the classloader is disposed. Which at the moment is once the last task exits, in the future probably once the slot is released.
   
   The point when the classloader is disposed is exactly the point when we want to make sure that all other "side references" are also removed. This seems fairly clean to me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zentol edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build

Posted by GitBox <gi...@apache.org>.
zentol edited a comment on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
URL: https://github.com/apache/flink/pull/11196#issuecomment-591392713
 
 
   > I think now, because the cleanup hook is not run when the task exits, but when the classloader is disposed. Which at the moment is once the last task exits [...]
   
   If you have 2 separate jobs using the kinesis connector running on a single TM can it not still occur? Similarly if a plugin were to activate the sdk metrics (the S3 plugins have a dependency on aws-sdk-core and may be doing the same)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services