Posted to issues@flink.apache.org by mxm <gi...@git.apache.org> on 2016/09/15 11:04:17 UTC
[GitHub] flink pull request #2499: [FLINK-4485] close and remove user class loader af...
GitHub user mxm opened a pull request:
https://github.com/apache/flink/pull/2499
[FLINK-4485] close and remove user class loader after job completion
Keeping the user class loader around after job completion may lead to
excessive temp space usage because all user jars are kept until the
class loader is garbage collected. Tests showed that garbage collection
can be delayed for a long time after the class loader is not referenced
anymore. Note that for the class loader to not be referenced anymore,
its job has to be removed from the archive.
The fastest way to minimize temp space usage is to close and remove the
URLClassLoader after job completion. This requires us to keep a
serializable copy of all data which needs the user class loader after
job completion, e.g. to display data on the web interface.
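As a rough illustration of the idea above (not the code in this pull request), here is a minimal sketch: values that would otherwise need the user class loader are converted to a plain serializable form while the loader is still open, and the loader is then closed so the user code files can be deleted right away. The names `UserLoaderCleanupSketch`, `ArchivedValue`, and `closeAndDelete` are hypothetical, and an empty temp directory stands in for the user jars.

```java
import java.io.IOException;
import java.io.Serializable;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class UserLoaderCleanupSketch {

    /** Hypothetical serializable snapshot of what the web interface still needs. */
    static final class ArchivedValue implements Serializable {
        private static final long serialVersionUID = 1L;
        final String display;

        ArchivedValue(Object userObject) {
            // Convert eagerly, while the user class loader is still open.
            this.display = String.valueOf(userObject);
        }
    }

    /** Closes the loader, then deletes the (empty) directory it was reading from. */
    static boolean closeAndDelete(URLClassLoader loader, Path userCodeDir) throws IOException {
        // URLClassLoader.close() releases the files the loader has opened.
        loader.close();
        return Files.deleteIfExists(userCodeDir);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an empty temp directory stands in for the user jars.
        Path dir = Files.createTempDirectory("blob_");
        URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()}, null);

        ArchivedValue archived = new ArchivedValue(42);
        boolean deleted = closeAndDelete(loader, dir);

        System.out.println(archived.display + " " + deleted);
    }
}
```

The point of the snapshot is that nothing kept after job completion holds a reference to classes from the user class loader, so closing the loader cannot break later reads.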
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mxm/flink FLINK-4485
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2499.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2499
----
commit 6ed17b9f5b9c13c80200ccf3db82bbfe727830bb
Author: Maximilian Michels <mx...@apache.org>
Date: 2016-09-15T09:00:58Z
[FLINK-4485] close and remove user class loader after job completion
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2499
Thanks! Just a few words to @nielsbasjes, who reported the issue. I've tested the fix using the test instructions you provided. Even before this fix, I could get rid of the temp files by forcing a manual garbage collection on the JVM, using `jcmd <pid> GC.run`. However, that only worked once the job metadata had been removed from the archive, i.e. once the job no longer showed up in the web interface. With this fix, the class loader is closed and removed upon job completion and the files are removed immediately. `lsof | fgrep blob_` didn't show any of these files anymore.
Note that we don't perform any cleanup on the TaskManager side. There we also wind up with some leftover files, but they don't seem to pile up. Presumably the garbage collector can clean up much earlier there. Plus, we don't keep a reference to old Task instances like we do for the web interface on the JobManager side.
@StephanEwen I'm thinking about adding a similar fix for the TaskManager side. What do you think?
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2499
@StephanEwen Yes, it is simple. I just pushed a commit. This now releases all temp files after job completion.
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/2499
Looks good to me.
+1 to merge when tests pass
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/2499
Looks good to me.
+1 to merge
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2499
Merging after tests pass.
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2499
This needed another fix because in some tests we use the system class loader instead of a class loader instantiated by the BlobLibraryCacheManager. If we close that one, we cause tests to fail. The solution is to close only `FlinkUserCodeClassLoader`s.
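To illustrate the guard described above, here is a minimal sketch that closes a loader only when it is of the dedicated user code type and leaves any other loader (such as the system class loader) untouched. `SelectiveCloseSketch`, `UserCodeLoader`, and `closeIfUserCode` are hypothetical names for illustration, not Flink's actual classes or methods.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

public class SelectiveCloseSketch {

    /** Hypothetical stand-in for a loader created per job for user code. */
    static final class UserCodeLoader extends URLClassLoader {
        UserCodeLoader(URL[] urls, ClassLoader parent) {
            super(urls, parent);
        }
    }

    /** Closes the loader only if it is a UserCodeLoader; returns whether it was closed. */
    static boolean closeIfUserCode(ClassLoader loader) throws IOException {
        if (loader instanceof UserCodeLoader) {
            ((UserCodeLoader) loader).close();
            return true;
        }
        // Any other loader, e.g. the system class loader, is left alone.
        return false;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader user = new UserCodeLoader(new URL[0], system);

        System.out.println(closeIfUserCode(system));
        System.out.println(closeIfUserCode(user));
    }
}
```

Checking the concrete type rather than `URLClassLoader` in general matters because, on some JVM versions, the system class loader is itself a `URLClassLoader`.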
---
[GitHub] flink pull request #2499: [FLINK-4485] close and remove user class loader af...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/2499
---
[GitHub] flink issue #2499: [FLINK-4485] close and remove user class loader after job...
Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/2499
What would the fix for the TaskManager look like? Simply explicitly closing the UserCodeClassloader, or does it need more?
---