Posted to dev@nutch.apache.org by "MilleBii (JIRA)" <ji...@apache.org> on 2009/12/17 09:06:18 UTC

[jira] Created: (NUTCH-776) Configurable queue depth

Configurable queue depth
------------------------

                 Key: NUTCH-776
                 URL: https://issues.apache.org/jira/browse/NUTCH-776
             Project: Nutch
          Issue Type: Improvement
          Components: fetcher
    Affects Versions: 1.1
            Reporter: MilleBii
            Priority: Minor
             Fix For: 1.1


I propose that we create a configurable item for the queuedepth in Fetcher.java instead of the hard-coded value of 50.

key name : fetcher.queues.depth

Default value : remains 50 (of course)
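
A minimal sketch of the proposed lookup, for illustration only. It uses java.util.Properties as a stand-in for the job configuration so that it runs on its own; in Fetcher.java the equivalent would presumably be a getInt(name, defaultValue) call on the Hadoop Configuration. The class name QueueDepthConfig and the helper queueDepth are hypothetical and are not taken from the Nutch source.

```java
import java.util.Properties;

public class QueueDepthConfig {
    // Key name proposed in this issue.
    public static final String QUEUE_DEPTH_KEY = "fetcher.queues.depth";
    // The value currently hard-coded in Fetcher.java, kept as the default.
    public static final int DEFAULT_QUEUE_DEPTH = 50;

    // Hypothetical helper: returns the configured depth, or the default
    // when the key is missing or blank.
    public static int queueDepth(Properties conf) {
        String raw = conf.getProperty(QUEUE_DEPTH_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_QUEUE_DEPTH;
        }
        int depth = Integer.parseInt(raw.trim());
        if (depth < 1) {
            throw new IllegalArgumentException(
                QUEUE_DEPTH_KEY + " must be >= 1, got " + depth);
        }
        return depth;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(queueDepth(conf)); // key not set: prints 50
        conf.setProperty(QUEUE_DEPTH_KEY, "1");
        System.out.println(queueDepth(conf)); // key set: prints 1
    }
}
```

With the key left unset, behaviour stays the same as with the hard-coded value, which is the point of keeping 50 as the default.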

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (NUTCH-776) Configurable queue depth

Posted by MilleBii <mi...@gmail.com>.
Actually I created a key so I could set it properly... The best results came
with a depth of 1 and a large number of threads (I use 1800) ?!?
That is because I have numerous sites (like blogs) that have different
domain names but share a single IP... This is a result of topical focused
crawling.


Since that was still not fast enough, I decided to allow fetching from the
same site every second... It works fine, although it does not follow
netiquette.

We should still create this key because it is very handy when trying
to optimize.

In terms of resources we should be explicit that it consumes #Threads x
#depth ... On my 4GB machine it was saturating at around 4000 total queue size.
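
To make the #Threads x #depth arithmetic concrete, here is a small sketch. The class QueueSizeEstimate and its methods are hypothetical, and the budget of 4000 is only the saturation point observed on this one setup, not a general limit.

```java
public class QueueSizeEstimate {
    // Total queue size as described above: threads multiplied by queue depth.
    public static long totalQueueSize(int threads, int depth) {
        return (long) threads * depth;
    }

    // True when the total queue size stays within the given budget.
    public static boolean fitsBudget(int threads, int depth, long budget) {
        return totalQueueSize(threads, depth) <= budget;
    }

    public static void main(String[] args) {
        long budget = 4000; // saturation point observed on this setup (assumption)
        // 1800 threads with depth 1: 1800 entries, within the budget.
        System.out.println(totalQueueSize(1800, 1) + " " + fitsBudget(1800, 1, budget));
        // 1800 threads with the default depth of 50: 90000 entries, far over it.
        System.out.println(totalQueueSize(1800, 50) + " " + fitsBudget(1800, 50, budget));
    }
}
```

This is why a very high thread count only works here together with a small depth.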

2010/1/7, Julien Nioche (JIRA) <ji...@apache.org>:
>
>     [
> https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797653#action_12797653
> ]
>
> Julien Nioche commented on NUTCH-776:
> -------------------------------------
>
> Did you notice any improvement in the fetch rate after I suggested on the
> mailing list to use a value larger than 50? Does the memory consumption
> remain reasonable?
>
>> Configurable queue depth
>> ------------------------
>>
>>                 Key: NUTCH-776
>>                 URL: https://issues.apache.org/jira/browse/NUTCH-776
>>             Project: Nutch
>>          Issue Type: Improvement
>>          Components: fetcher
>>    Affects Versions: 1.1
>>            Reporter: MilleBii
>>            Priority: Minor
>>             Fix For: 1.1
>>
>>
>> I propose that we create a configurable item for the queuedepth in
>> Fetcher.java instead of the hard-coded value of 50.
>> key name : fetcher.queues.depth
>> Default value : remains 50 (of course)
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>


-- 
-MilleBii-

[jira] Updated: (NUTCH-776) Configurable queue depth

Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche updated NUTCH-776:
--------------------------------

    Fix Version/s:     (was: 1.1)

Moving this issue to post-1.1.
It needs a patch file, a description of the parameter in nutch-default.xml and, more importantly, some experimentation to see how it affects fetch performance.

> Configurable queue depth
> ------------------------
>
>                 Key: NUTCH-776
>                 URL: https://issues.apache.org/jira/browse/NUTCH-776
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.1
>            Reporter: MilleBii
>            Priority: Minor
>
> I propose that we create a configurable item for the queuedepth in Fetcher.java instead of the hard-coded value of 50.
> key name : fetcher.queues.depth
> Default value : remains 50 (of course)



[jira] Commented: (NUTCH-776) Configurable queue depth

Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797653#action_12797653 ] 

Julien Nioche commented on NUTCH-776:
-------------------------------------

Did you notice any improvement in the fetch rate after I suggested on the mailing list to use a value larger than 50? Does the memory consumption remain reasonable?  

