Posted to droids-dev@incubator.apache.org by "Mingfai Ma (JIRA)" <ji...@apache.org> on 2009/04/19 11:32:49 UTC

[jira] Updated: (DROIDS-48) Support prioritizing in the TaskQueue

     [ https://issues.apache.org/jira/browse/DROIDS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingfai Ma updated DROIDS-48:
-----------------------------

    Attachment: DROIDS-48.patch

The patch changes quite a number of files, but it all comes down to the following:
- Added int getWeight() to Task.
   Remarks: LinkTask consumes 72 bytes per instance in a sample test. If the servers do not handle links fast enough, LinkTask instances will keep accumulating in memory. As a quick calculation (which may be wrong), 1.5 GB / 72 bytes is roughly 20M LinkTask instances. It is preferable to minimize the number of fields in a LinkTask and to use the smallest field types (int instead of long).

- Changed SimpleTaskQueue to use PriorityBlockingQueue instead of ConcurrentLinkedQueue by default. Note that there is a constructor that lets the user provide a Queue, so it is not necessary to add more configuration options such as a way to supply a comparator (though there would be no harm in doing so).

- Note that the method is not implemented for FileTask. I am not sure whether a FileTask needs a weight.

How it works:
  - When a task is added to the queue, its weight is checked to decide whether the task should be positioned at the top.
  - If two tasks have the same weight, the older one goes first.
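
To illustrate the two rules above, here is a minimal, self-contained sketch. It is not the code in DROIDS-48.patch. The Entry wrapper, the sequence counter, SimpleTask, and the "higher weight goes first" direction are all assumptions made for the example; only getWeight() and the use of PriorityBlockingQueue come from the description. The sequence number is there because PriorityBlockingQueue by itself does not guarantee any order among elements that compare as equal.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class WeightedQueueSketch {

    // Stand-in for the Task interface with the proposed getWeight() method.
    interface Task {
        String getId();
        int getWeight();
    }

    // Hypothetical task type used only for this demo.
    static final class SimpleTask implements Task {
        private final String id;
        private final int weight;
        SimpleTask(String id, int weight) { this.id = id; this.weight = weight; }
        public String getId() { return id; }
        public int getWeight() { return weight; }
    }

    // Hypothetical wrapper: remembers insertion order so ties can stay FIFO.
    static final class Entry {
        private static final AtomicLong COUNTER = new AtomicLong();
        final Task task;
        final long seq = COUNTER.getAndIncrement();
        Entry(Task task) { this.task = task; }
    }

    // Assumption: higher weight first; on equal weight, the older entry first.
    static final Comparator<Entry> ORDER = (a, b) -> {
        int byWeight = Integer.compare(b.task.getWeight(), a.task.getWeight());
        return byWeight != 0 ? byWeight : Long.compare(a.seq, b.seq);
    };

    // Adds four tasks and returns their ids in the order they are polled.
    static List<String> demo() {
        PriorityBlockingQueue<Entry> queue = new PriorityBlockingQueue<>(11, ORDER);
        queue.add(new Entry(new SimpleTask("dmoz-1", 0)));
        queue.add(new Entry(new SimpleTask("ext-1", 10)));
        queue.add(new Entry(new SimpleTask("dmoz-2", 0)));
        queue.add(new Entry(new SimpleTask("ext-2", 10)));

        List<String> order = new ArrayList<>();
        Entry e;
        while ((e = queue.poll()) != null) {
            order.add(e.task.getId());
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the two weight-10 tasks are polled before the two weight-0 tasks, and within each weight the task that was added earlier comes out first. The wrapper adds a long per queued entry, which works against the memory concern raised above, so a real implementation may prefer a different tie-breaker.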




> Support prioritizing in the TaskQueue
> -------------------------------------
>
>                 Key: DROIDS-48
>                 URL: https://issues.apache.org/jira/browse/DROIDS-48
>             Project: Droids
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 0.01
>            Reporter: Mingfai Ma
>         Attachments: DROIDS-48.patch
>
>
> Use case:
>  - When looping over a directory (imagine someone who does not know that the dmoz database can be downloaded and tries to crawl it with Droids), we collect a lot of links that will be handled later. Assume the requirement is to fetch the dmoz directory +1 link outside dmoz.org. In the original mechanism, it will keep adding new links to the TaskQueue. Ideally, there should be a mechanism to give a higher priority to the non-dmoz.org links, so that when non-dmoz links are added, they are processed first and removed from the TaskQueue as soon as possible.
> With the patch in DROIDS-47, a constructor is added to the SimpleTaskQueue to support a custom Queue. This issue suggests changing the SimpleTaskQueue to use a PriorityBlockingQueue by default, and adding a getWeight method to the Task interface.
> I'm also thinking about a more complex TaskQueue, to be discussed on the mailing list later.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.