Posted to issues@flink.apache.org by mxm <gi...@git.apache.org> on 2016/09/15 11:04:17 UTC

[GitHub] flink pull request #2499: [FLINK-4485] close and remove user class loader af...

GitHub user mxm opened a pull request:

    https://github.com/apache/flink/pull/2499

    [FLINK-4485] close and remove user class loader after job completion

    Keeping the user class loader around after job completion may lead to
    excessive temp space usage because all user jars are kept until the
    class loader is garbage collected. Tests showed that garbage collection
    can be delayed for a long time after the class loader is not referenced
    anymore. Note that for the class loader to not be referenced anymore,
    its job has to be removed from the archive.
    
    The fastest way to minimize temp space usage is to close and remove the
    URLClassLoader after job completion. This requires us to keep a
    serializable copy of all data which needs the user class loader after
    job completion, e.g. to display data on the web interface.
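
As an illustration of the idea described above, here is a minimal sketch, not the code from this pull request. It assumes the user jar is backed by a plain `java.net.URLClassLoader`; the class name `UserClassLoaderCleanup` and the method `closeAndDelete` are hypothetical and exist only for this example. Closing the loader releases its open jar handles right away instead of waiting for garbage collection, after which the temp file can be deleted.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Flink.
public class UserClassLoaderCleanup {

    /**
     * Closes the loader if it is a URLClassLoader, then deletes the jar file
     * that backed it. Returns true if the file existed and was deleted.
     */
    public static boolean closeAndDelete(ClassLoader loader, Path jar) throws IOException {
        if (loader instanceof URLClassLoader) {
            // URLClassLoader implements Closeable (Java 7+); close() releases
            // any jar files the loader has opened.
            ((URLClassLoader) loader).close();
        }
        return Files.deleteIfExists(jar);
    }

    public static void main(String[] args) throws IOException {
        // An empty temp file stands in for an uploaded user jar.
        Path jar = Files.createTempFile("blob_", ".jar");
        URLClassLoader loader = new URLClassLoader(new URL[] {jar.toUri().toURL()});
        System.out.println("deleted: " + closeAndDelete(loader, jar));
    }
}
```

Once the loader is closed it can no longer load new classes, which is why anything needed afterwards (for example data shown on the web interface) has to be copied into a form that does not depend on it beforehand.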

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/flink FLINK-4485

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2499
    
----
commit 6ed17b9f5b9c13c80200ccf3db82bbfe727830bb
Author: Maximilian Michels <mx...@apache.org>
Date:   2016-09-15T09:00:58Z

    [FLINK-4485] close and remove user class loader after job completion

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    Thanks! Just a few words to @nielsbasjes, who reported the issue. I've tested the fix using the test instructions you provided. Even before this fix, I could get rid of the temp files by forcing a manual garbage collection on the JVM using `jcmd <pid> GC.run`. However, that only worked once the job metadata had been removed from the archive, i.e. once the job no longer showed up in the web interface. With this fix, the class loader is cleared upon job completion and the files are removed immediately. Afterwards, `lsof | fgrep blob_` no longer showed any of these files.
    
    Note that we don't perform any cleanup on the TaskManager side. There we also wind up with some leftover files, but they don't seem to pile up. Presumably the garbage collector gets to clean up much earlier there. Plus, we don't keep a reference to old Task instances like we do for the web interface on the JobManager side.
    
    @StephanEwen I'm thinking about adding a similar fix for the TaskManager side. What do you think?


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    @StephanEwen Yes, it is simple. I just pushed a commit. This now releases all temp files after job completion.


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    Looks good to me.
    
    +1 to merge when tests pass


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    Looks good to me.
    
    +1 to merge


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    Merging after tests pass.


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    This needed another fix because in some tests we use the system class loader instead of a class loader instantiated by the BlobLibraryCacheManager. If we close that one, we cause tests to fail. The solution is to close only `FlinkUserCodeClassLoader`s.
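
The guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual change: `UserCodeClassLoader` is a hypothetical stand-in for `FlinkUserCodeClassLoader`, and `closeIfUserCode` is an invented helper name. The point is that a type check keeps shared loaders, such as the system class loader used in some tests, from being closed.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Illustration only; the names below are hypothetical and not Flink's API.
public class SelectiveClose {

    /** Stand-in for a loader that the library cache creates per job. */
    public static final class UserCodeClassLoader extends URLClassLoader {
        public UserCodeClassLoader(URL[] urls, ClassLoader parent) {
            super(urls, parent);
        }
    }

    /**
     * Closes the loader only if it is the dedicated user-code type.
     * Returns true if the loader was closed, false if it was left untouched.
     */
    public static boolean closeIfUserCode(ClassLoader loader) throws IOException {
        if (loader instanceof UserCodeClassLoader) {
            ((UserCodeClassLoader) loader).close();
            return true;
        }
        // Any other loader (e.g. the system class loader) may be shared,
        // so it is deliberately not closed here.
        return false;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader userCode = new UserCodeClassLoader(new URL[0], system);
        System.out.println("user code loader closed: " + closeIfUserCode(userCode));
        System.out.println("system loader closed: " + closeIfUserCode(system));
    }
}
```

Checking for the specific subclass rather than for `URLClassLoader` in general matters because, depending on the JVM version, the system class loader can itself be a `URLClassLoader`.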


[GitHub] flink pull request #2499: [FLINK-4485] close and remove user class loader af...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2499


[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2499
  
    What would the fix for the TaskManager look like? Would explicitly closing the UserCodeClassloader be enough, or does it need more?

