Posted to droids-dev@incubator.apache.org by "Mingfai Ma (JIRA)" <ji...@apache.org> on 2009/04/15 13:59:15 UTC

[jira] Created: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

MultiThreadedTaskMaster (WorkRunner) memory leak
------------------------------------------------

                 Key: DROIDS-46
                 URL: https://issues.apache.org/jira/browse/DROIDS-46
             Project: Droids
          Issue Type: Bug
          Components: core
    Affects Versions: 0.01
            Reporter: Mingfai Ma
            Priority: Blocker


A Droids job that has been running for around 6 hours eats a lot of memory that cannot be freed by the GC. With "jmap -histo", the report shows about 14.9 million instances each of MultiThreadedTaskMaster$WorkerRunner and LinkedBlockingQueue$Node, which consume about 595 MB and 476 MB of memory respectively. The instances cannot be freed by any full GC (and they reside in the tenured generation of the heap).

 num     #instances         #bytes  class name
----------------------------------------------
   1:        957740     1007648216  [C
   2:      14874175      594967000  org.apache.droids.impl.MultiThreadedTaskMaster$WorkerRunner
   3:      14873977      475967264  java.util.concurrent.LinkedBlockingQueue$Node

For #1, I'm not sure what that is, but it may be a problem with my own program. But #2 and #3 must come from Droids. I haven't checked the source of MultiThreadedTaskMaster yet, and it would be great if the original developer could take a quick look to see if there is any chance that the WorkerRunner instances are still being referenced.

Besides, there is another issue related to MultiThreadedTaskMaster: DROIDS-43.
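
The instance counts for WorkerRunner and LinkedBlockingQueue$Node in the histogram differ by only 198, which would be consistent with each WorkerRunner being held by a queue node. As a minimal sketch of how such a backlog can build up, assuming an executor backed by an unbounded LinkedBlockingQueue (this is not confirmed here as the cause in Droids), the Java below submits tasks that the workers cannot finish and reports how many stay queued. UnboundedQueueDemo and backlogAfterSubmitting are hypothetical names, not Droids code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Droids.
class UnboundedQueueDemo {

    /**
     * Submits `tasks` blocked runnables to a pool of `threads` workers and
     * returns how many are still waiting in the executor's queue.
     */
    static int backlogAfterSubmitting(int threads, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>()); // unbounded work queue
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for a slow task
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Every queued runnable is kept alive by one queue node.
            return pool.getQueue().size();
        } finally {
            release.countDown(); // let the workers drain the queue
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Two tasks go straight to the two workers; the rest wait in the queue.
        System.out.println(backlogAfterSubmitting(2, 1000)); // prints 998
    }
}
```

Nothing in this sketch limits how far the queue can grow, so a producer that outpaces the workers keeps every pending runnable strongly reachable.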

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701654#action_12701654 ] 

Ryan McKinley commented on DROIDS-46:
-------------------------------------

This patch merges the SimpleTaskMaster with the MultiThreadedTaskMaster.

The big change is that the loop in MultiThreadedTaskMaster#processAllTasks() happens in its own thread.  This gives the same behavior as the existing implementation.


Re: [jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by Mingfai <mi...@gmail.com>.

On Fri, Apr 17, 2009 at 9:06 PM, Thorsten Scherler <
thorsten.scherler.ext@juntadeandalucia.es> wrote:

> On Fri, 2009-04-17 at 00:00 -0700, Mingfai Ma (JIRA) wrote:
> > [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700040#action_12700040 ]
> >
> > Mingfai Ma commented on DROIDS-46:
> > ----------------------------------
> >
> > After studying the MT TaskMaster, I've implemented my own TaskMaster to get rid of the memory-leak problem.
>
> can you contribute this TaskMaster?
>

sure. I'm working on tuning the whole thing and will attach my TaskMaster to
DROIDS-46 later.

regards,
mingfai

Re: [jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.

On Fri, 2009-04-17 at 00:00 -0700, Mingfai Ma (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700040#action_12700040 ] 
> 
> Mingfai Ma commented on DROIDS-46:
> ----------------------------------
> 
> After studying the MT TaskMaster, I've implemented my own TaskMaster to get rid of the memory-leak problem.

can you contribute this TaskMaster?

salu2
-- 
Thorsten Scherler <thorsten.at.apache.org>
Open Source Java <consulting, training and solutions>

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)

[jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Mingfai Ma (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700040#action_12700040 ] 

Mingfai Ma commented on DROIDS-46:
----------------------------------

After studying the MT TaskMaster, I've implemented my own TaskMaster to get rid of the memory-leak problem. The MultiThreadedTaskMaster is a bit complex; it would be good to refactor it, add docs for readability, and add some test cases. Besides, the use of ThreadPoolExecutor actually introduces another queue, and the tasks are transferred from one queue to the other. I think the 2-queue design should be reviewed. (My own TaskMaster also uses 2 queues, so I don't have a suggestion yet.)

For the TaskMaster design as a whole:
 - The sequential task master is conceptually the same as a multi-threaded task master with a single thread. I suggest there should be one SimpleTaskMaster that can spawn multiple threads to do the job; people who need single-threaded/sequential usage could specify 1 as maxThreads (which could be the default if no parameter is provided). I think most people won't use a single thread anyway.

 - The pausable implementation seems to make things more complex. I suggest making it extend a SimpleTaskMaster (if possible) and implement the additional interface.
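
The "one task master, maxThreads defaults to 1" idea can be sketched roughly as follows. This is only an illustration under assumptions: SketchTaskMaster and its processAllTasks signature are hypothetical and are not the Droids TaskMaster API or the attached SimpleTaskMaster.java, and the caller is assumed to pass a thread-safe queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; class and method names are not the Droids API.
class SketchTaskMaster {
    private final int maxThreads;

    SketchTaskMaster() {
        this(1); // sequential behaviour is the default
    }

    SketchTaskMaster(int maxThreads) {
        if (maxThreads < 1) {
            throw new IllegalArgumentException("maxThreads must be >= 1");
        }
        this.maxThreads = maxThreads;
    }

    /** Runs every task in the (thread-safe) queue and returns how many completed. */
    int processAllTasks(Queue<Runnable> tasks) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        for (int i = 0; i < maxThreads; i++) {
            pool.execute(() -> {
                Runnable task;
                // Each worker pulls directly from the shared queue until it is empty.
                while ((task = tasks.poll()) != null) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 10; i++) {
            int n = i;
            tasks.add(() -> System.out.println("task " + n));
        }
        System.out.println(new SketchTaskMaster(4).processAllTasks(tasks) + " tasks completed");
    }
}
```

With the no-argument constructor the same code path runs the tasks one after another, so a separate sequential implementation is not needed in this sketch.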




[jira] Resolved: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Thorsten Scherler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thorsten Scherler resolved DROIDS-46.
-------------------------------------

    Resolution: Fixed

Committed revision 772441.
Applying patch DROIDS-46-MultiThreadedTaskMaster.patch

I did not close this issue since there are still a couple of todos in the patch. If you think we can close it please do so.


[jira] Updated: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Mingfai Ma (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingfai Ma updated DROIDS-46:
-----------------------------

    Attachment: DROIDS-46-SimpleTaskMaster-GC.png
                SimpleTaskMaster.java

SimpleTaskMaster
 - There is lots of room for improvement, and I've added a lot of TODOs to it. Most of the features in the original multi-threaded task master are "discarded". This kind of program also needs a longer time to test, so don't assume it is production ready. I'll keep updating it over time.
 - An important thing to note is that this TaskMaster uses a bounded (fixed-size) queue for the threads, and each thread polls a task every time it run()s. This design is different from the original MT TaskMaster, which has an unbounded queue and possibly copies every task from the TaskQueue to the thread queue.
 - There are some last-minute changes that are not tested. I suppose they won't cause problems.
 - No test case. I do have an internal test case for testing the behavior of the underlying ThreadExecutor.

Also attached is a GC chart for reference. The job with 200 threads has been running for 12 hours and the completedCount is around 120k. There are 300k seed tasks, and new tasks keep being added to the queue (so after 12 hours, the queue size is still around 300k). The log, which is not attached, shows that the 200 threads are almost always used up (occasionally dropping to 199), and a completed thread starts a new job after a short interval.
(However, something isn't done very well, and in some samples the threads run() for 3-4 minutes. so the 200 threads)




[jira] Assigned: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Thorsten Scherler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thorsten Scherler reassigned DROIDS-46:
---------------------------------------

    Assignee: Thorsten Scherler


[jira] Updated: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated DROIDS-46:
--------------------------------

    Attachment: DROIDS-46-MultiThreadedTaskMaster.patch

trying again (JIRA was down on the last post)


[jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Mingfai Ma (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699215#action_12699215 ] 

Mingfai Ma commented on DROIDS-46:
----------------------------------

After some investigation and research, it seems the problem comes from LinkedBlockingQueue. We are not alone in getting a memory leak when using LinkedBlockingQueue:
http://cs.oswego.edu/pipermail/concurrency-interest/2005-January/001319.html
http://cs.oswego.edu/pipermail/concurrency-interest/2009-February/005829.html | http://thread.gmane.org/gmane.comp.java.jsr.166-concurrency/5758

There is at least one open bug:
http://bugs.sun.com/view_bug.do?bug_id=6805775

My case seems to be different from the JDK bug above. I found that the WorkerRunner instances residing in the tenured generation cannot be cleared even with a full GC.


[jira] Issue Comment Edited: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Mingfai Ma (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700576#action_12700576 ] 

Mingfai Ma edited comment on DROIDS-46 at 4/19/09 4:03 AM:
-----------------------------------------------------------

SimpleTaskMaster
 - This implementation has lots of room for improvement, and you will find many TODO comments. It's intended to be a simple implementation, and any feature in the original MT TaskMaster beyond the minimum set of features is not included. I don't consider this production ready, and we need more time to test.
 - Notice that the TaskMaster uses a bounded (fixed-size) queue for the ThreadPoolExecutor. Every time a thread run()s, it polls an item from the TaskQueue. This design is different from the original MT TaskMaster, which has an unbounded queue and possibly copies every task from the TaskQueue to the thread queue.
 - There are some last-minute changes and refactoring that are not tested. I suppose they won't cause problems.
 - There is no test case, but I do have an internal test case for testing the behavior of the underlying ThreadExecutor to ensure it behaves the way I expect. The terminate feature is manually tested.
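
The bounded-queue design described in the list above can be sketched roughly as follows. This is an illustration under assumptions and not the attached SimpleTaskMaster.java: BoundedPollingDemo is a hypothetical name, a ConcurrentLinkedQueue stands in for the TaskQueue, and CallerRunsPolicy is one possible way to apply back-pressure when the executor's queue is full.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not the SimpleTaskMaster attached to the issue.
class BoundedPollingDemo {

    /** Returns { completed task count, largest executor backlog observed }. */
    static int[] run(int threads, int capacity, int taskCount) {
        // Stand-in for the TaskQueue: all pending work lives here only.
        Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            taskQueue.add(completed::incrementAndGet);
        }

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(capacity),   // bounded executor queue
                new ThreadPoolExecutor.CallerRunsPolicy());   // back-pressure when full

        int maxBacklog = 0;
        while (!taskQueue.isEmpty()) {
            // The executor only ever holds small "poll one task" runners,
            // never a copy of the whole task queue.
            pool.execute(() -> {
                Runnable task = taskQueue.poll();
                if (task != null) {
                    task.run();
                }
            });
            maxBacklog = Math.max(maxBacklog, pool.getQueue().size());
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { completed.get(), maxBacklog };
    }

    public static void main(String[] args) {
        int[] result = run(4, 8, 1000);
        System.out.println("completed=" + result[0] + ", max backlog=" + result[1]);
    }
}
```

However many tasks are pending, the executor's own queue never holds more than `capacity` runners in this sketch, so its memory use does not grow with the size of the task queue.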

Also attached is a GC chart for reference. The job with 200 threads has been running for 12 hours and the completedCount is around 120k. There are 300k seed tasks, and new tasks keep being added to the queue (so after 12 hours, the queue size is still around 300k). The log, which is not attached, shows that the 200 threads are almost always used up (occasionally dropping to 199), and a completed thread starts a new job after a short interval.
(However, something isn't working very well: in some samples, the threads run() for 3-4 minutes. So the whole job is not a busy one.)




[jira] Commented: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701596#action_12701596 ] 

Ryan McKinley commented on DROIDS-46:
-------------------------------------

This implementation looks good -- I think it should replace the existing MultiThreadedTaskMaster.  

The remaining issues seem to be:
 * wire in the TaskExceptionHandler
 * PausableTaskMaster?
 * terminate cleanup -- "//TODO it isn't a very good idea to check the activeCount"

When I added the PausableTaskMaster, I wanted a way to pause operations and continue. Since the 'pause' does not actually pause the workers, and the application state is held in the TaskQueue, I'm not sure how valuable it is. Shutting down and restarting would have the same effect as a pause.

I think we can safely remove the PausableTaskMaster interface.

---------

Any thoughts on the TaskMaster hanging on to all running FutureTasks? This could potentially give us the ability to interrogate currently running tasks and perhaps kill them with Future#cancel(boolean mayInterruptIfRunning).
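
A minimal sketch of what holding on to in-flight FutureTasks might look like, assuming tasks are identified by a string id. RunningTaskRegistry and its methods are hypothetical names used only for illustration; nothing here is existing Droids code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;

// Hypothetical registry of queued and running tasks; not Droids code.
class RunningTaskRegistry {
    private final ExecutorService pool;
    private final Map<String, Future<?>> inFlight = new ConcurrentHashMap<>();

    RunningTaskRegistry(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    void submit(String id, Runnable task) {
        FutureTask<Void> future = new FutureTask<>(() -> {
            try {
                task.run();
            } finally {
                inFlight.remove(id); // forget the task once it has finished
            }
        }, null);
        inFlight.put(id, future); // register before the task can start
        pool.execute(future);
    }

    /** Cancels the task, interrupting its thread if it is already running. */
    boolean cancel(String id) {
        Future<?> future = inFlight.remove(id);
        return future != null && future.cancel(true);
    }

    int inFlightCount() {
        return inFlight.size();
    }

    void shutdown() {
        pool.shutdownNow();
    }

    /** Starts a slow task, cancels it, and reports whether it was interrupted. */
    static boolean demo() {
        RunningTaskRegistry registry = new RunningTaskRegistry(2);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);
        registry.submit("slow", () -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // stands in for a long-running fetch
            } catch (InterruptedException e) {
                interrupted.countDown();
            }
        });
        try {
            started.await();
            boolean cancelled = registry.cancel("slow");
            return cancelled
                    && interrupted.await(5, TimeUnit.SECONDS)
                    && registry.inFlightCount() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            registry.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled and interrupted: " + demo());
    }
}
```

Removing each entry on completion or cancellation keeps the map from becoming another place where finished tasks stay referenced.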


