Posted to droids-dev@incubator.apache.org by "Mingfai Ma (JIRA)" <ji...@apache.org> on 2009/04/19 10:56:47 UTC

[jira] Updated: (DROIDS-46) MultiThreadedTaskMaster (WorkRunner) memory leak

     [ https://issues.apache.org/jira/browse/DROIDS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingfai Ma updated DROIDS-46:
-----------------------------

    Attachment: DROIDS-46-SimpleTaskMaster-GC.png
                SimpleTaskMaster.java

SimpleTaskMaster
 - There is a lot of room for improvement, and I've added many TODOs to it. Most of the features of the original multi-threaded task master have been dropped. This kind of program also needs a longer time to test, so don't assume it is production ready. I'll keep updating it over time.
 - One important point: this TaskMaster uses a bounded (fixed-size) queue for the threads, and each thread polls one task every time it run()s. This design differs from the original MT Task Master, which has an unbounded queue and possibly copies every task from the TaskQueue to the thread queue.
 - Some last-minute changes are untested. I expect they won't cause problems.
 - There is no test case yet. I do have an internal test case for the behavior of the underlying ThreadExecutor.
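
The bounded-queue design described above can be sketched roughly as follows. This is only a minimal illustration and not the attached SimpleTaskMaster.java: the class name BoundedPollingSketch, the method runAll, and the choice of ArrayBlockingQueue are assumptions made for the example. Each worker polls exactly one task per run() and then resubmits itself, so the executor never holds more than poolSize pending entries, however long the task queue is.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and structure are assumptions, not the Droids API.
class BoundedPollingSketch {

    /** Drains taskQueue with poolSize workers; returns the number of tasks completed. */
    static int runAll(final Queue<Runnable> taskQueue, int poolSize) {
        final AtomicInteger completed = new AtomicInteger();
        final CountDownLatch idle = new CountDownLatch(poolSize);
        // Bounded executor queue: it only ever holds worker objects, never the tasks.
        final ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(poolSize));

        for (int i = 0; i < poolSize; i++) {
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    Runnable task = taskQueue.poll(); // take one task per run()
                    if (task == null) {               // nothing left: this worker stops
                        idle.countDown();
                        return;
                    }
                    try {
                        task.run();
                        completed.incrementAndGet();
                    } finally {
                        pool.execute(this);           // reuse the same worker object
                    }
                }
            });
        }

        try {
            idle.await();                             // all workers saw an empty queue
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return completed.get();
    }
}
```

Because only poolSize worker objects ever exist, the executor's queue cannot grow with the number of tasks. One simplification to be aware of: a worker stops as soon as it sees an empty task queue, so a real crawler whose tasks add new tasks would need a different termination check.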

Also attached is a GC chart for reference. The job, with 200 threads, had been running for 12 hours, and the completedCount was around 120k. There are 300k seed tasks, and new tasks keep being added to the queue (so after 12 hours the queue size is still around 300k). The log, which is not attached, shows that the 200 threads are almost always all in use (occasionally dropping to 199), and that a thread starts a new job shortly after completing one.
(However, something isn't working very well: in some samples, a thread's run() takes 3-4 minutes. so the 200 threads)



> MultiThreadedTaskMaster (WorkRunner) memory leak
> ------------------------------------------------
>
>                 Key: DROIDS-46
>                 URL: https://issues.apache.org/jira/browse/DROIDS-46
>             Project: Droids
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.01
>            Reporter: Mingfai Ma
>            Priority: Blocker
>         Attachments: DROIDS-46-SimpleTaskMaster-GC.png, SimpleTaskMaster.java
>
>
> In a Droids job that has been running for around 6 hours, a lot of memory is consumed that the GC cannot free. The "jmap -histo" report shows about 14.9m instances each of MultiThreadedTaskMaster$WorkerRunner and LinkedBlockingQueue$Node, consuming about 595 MB and 476 MB respectively. The instances are not freed by any full GC (they reside in the tenured generation of the heap):
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:        957740     1007648216  [C
>    2:      14874175      594967000  org.apache.droids.impl.MultiThreadedTaskMaster$WorkerRunner
>    3:      14873977      475967264  java.util.concurrent.LinkedBlockingQueue$Node
> For #1, I'm not sure what that is, but it may be a problem in my own program. #2 and #3, however, should come from Droids. I haven't checked the source of MultiThreadedTaskMaster yet; it would be great if the original developer could take a quick look to see whether there is any chance that the WorkerRunner instances are still being referenced.
> Separately, there is another issue related to MultiThreadedTaskMaster: DROIDS-43.
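
The near-equal counts of WorkerRunner and LinkedBlockingQueue$Node in the histogram are what one would expect if one runner per task were sitting in an unbounded executor queue. The sketch below only illustrates that general mechanism with the standard java.util.concurrent classes. It is not a diagnosis of MultiThreadedTaskMaster, and the class and method names are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not code from Droids.
class UnboundedBacklogSketch {

    /** Submits `tasks` slow runnables to `threads` workers and returns the queued backlog. */
    static int backlogAfterSubmitting(int tasks, int threads) {
        final CountDownLatch release = new CountDownLatch(1);
        // An unbounded LinkedBlockingQueue never rejects, so it can grow without limit.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());

        for (int i = 0; i < tasks; i++) {
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        release.await();   // stands in for a slow crawl task
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }

        // Every runnable not yet picked up is held by one queue node.
        int backlog = pool.getQueue().size();
        release.countDown();               // let the workers drain the queue
        pool.shutdown();
        return backlog;
    }
}
```

With 4 threads and 10,000 submissions, the first 4 runnables go straight to worker threads and the rest wait in the queue. Each waiting runnable stays strongly reachable until it is dequeued, which is consistent with objects that survive full GCs in the tenured generation.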
