Posted to dev@nutch.apache.org by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/06/23 12:17:24 UTC

[jira] [Assigned] (NUTCH-1687) Pick queue in Round Robin

     [ https://issues.apache.org/jira/browse/NUTCH-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche reassigned NUTCH-1687:
------------------------------------

    Assignee: Julien Nioche

> Pick queue in Round Robin
> -------------------------
>
>                 Key: NUTCH-1687
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1687
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Tien Nguyen Manh
>            Assignee: Julien Nioche
>            Priority: Minor
>             Fix For: 1.9
>
>         Attachments: NUTCH-1687.patch, NUTCH-1687.tejasp.v1.patch
>
>
> Currently we always pick the queue to take a URL from by scanning from the start of the queue list, so queues near the start of the list have a better chance of being picked first. This can cause a long-tail problem, where only a few queues are left at the end and they still hold many URLs.
> public synchronized FetchItem getFetchItem() {
>       final Iterator<Map.Entry<String, FetchItemQueue>> it =
>         queues.entrySet().iterator(); // ==> the iterator is always reset, so the search restarts from the first queue
>       while (it.hasNext()) {
> ....
> I think it is better to pick queues in round-robin order. That can reduce the time needed to find an available queue and ensures that every queue is picked in turn, and if we use TopN in the generator there will be no long-tail queues at the end.
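
For illustration, here is a minimal sketch of the round-robin picking proposed above. It is not the attached patch: the class name RoundRobinQueues, its fields, and the use of plain strings in place of FetchItem are assumptions made for this example, and politeness delays and in-progress tracking are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for the fetcher's queue container. */
class RoundRobinQueues {
  // One FIFO of URLs per queue id (for example, per host).
  private final Map<String, Deque<String>> queues = new HashMap<>();
  // Queue ids in the order they were first seen.
  private final List<String> order = new ArrayList<>();
  // Index of the queue to try next; kept between calls instead of being reset.
  private int cursor = 0;

  public synchronized void addFetchItem(String queueId, String url) {
    Deque<String> q = queues.get(queueId);
    if (q == null) {
      q = new ArrayDeque<>();
      queues.put(queueId, q);
      order.add(queueId);
    }
    q.add(url);
  }

  /** Returns the next URL in round-robin order, or null if all queues are empty. */
  public synchronized String getFetchItem() {
    while (!order.isEmpty()) {
      if (cursor >= order.size()) {
        cursor = 0; // wrap around to the first queue
      }
      String id = order.get(cursor);
      Deque<String> q = queues.get(id);
      if (q.isEmpty()) {
        // Drop the exhausted queue; the cursor now points at its successor.
        order.remove(cursor);
        queues.remove(id);
        continue;
      }
      cursor++; // the next call starts at the following queue
      return q.poll();
    }
    return null;
  }
}
```

With queues a = [a1, a2, a3], b = [b1] and c = [c1, c2], this sketch hands out a1, b1, c1, a2, c2, a3. In the same simplified model, restarting the scan from the first queue on every call would drain queue a completely before touching b or c.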


