Posted to dev@nutch.apache.org by "MilleBii (JIRA)" <ji...@apache.org> on 2009/12/17 09:06:18 UTC
[jira] Created: (NUTCH-776) Configurable queue depth
Configurable queue depth
------------------------
Key: NUTCH-776
URL: https://issues.apache.org/jira/browse/NUTCH-776
Project: Nutch
Issue Type: Improvement
Components: fetcher
Affects Versions: 1.1
Reporter: MilleBii
Priority: Minor
Fix For: 1.1
I propose that we create a configurable item for the queuedepth in Fetcher.java instead of the hard-coded value of 50.
key name : fetcher.queues.depth
Default value : remains 50 (of course)
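For illustration only, here is a minimal sketch of how such a key could be read with a fallback to the current value. It is not the actual Fetcher.java code: the class QueueDepthConfig and the method queueDepth are hypothetical, and java.util.Properties stands in for the Hadoop Configuration object that Nutch uses (where a call such as conf.getInt("fetcher.queues.depth", 50) would play the same role).

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of Nutch. */
class QueueDepthConfig {
    /** Key name proposed in this issue. */
    static final String KEY = "fetcher.queues.depth";
    /** The value that is hard-coded today, according to the issue text. */
    static final int DEFAULT_DEPTH = 50;

    /** Returns the configured depth, or the default if the key is missing or invalid. */
    static int queueDepth(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_DEPTH;
        }
        try {
            int depth = Integer.parseInt(raw.trim());
            // Assumption: a non-positive depth makes no sense, so fall back.
            return depth > 0 ? depth : DEFAULT_DEPTH;
        } catch (NumberFormatException e) {
            return DEFAULT_DEPTH;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(queueDepth(conf)); // key not set: prints 50
        conf.setProperty(KEY, "1");
        System.out.println(queueDepth(conf)); // prints 1
    }
}
```

Keeping 50 as the fallback means existing installations behave exactly as before unless the key is set.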
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (NUTCH-776) Configurable queue depth
Posted by MilleBii <mi...@gmail.com>.
Actually, I created a key so I could tune it properly... The best results came
with a depth of 1 and a large number of threads (I use 1800) ?!?
That is because I have numerous sites (like blogs) that have different
domain names but share a single IP... This is a result of topic-focused
crawling.
Since that was still not fast enough, I decided to allow fetching from the
same site every second... It works fine, although it does not follow
netiquette.
We should still create this key because it is very handy when trying
to optimize.
In terms of resources, we should state explicitly that it consumes #Threads x
#depth ... On my 4GB machine it was saturating at around 4000 total queue size.
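
To make the #Threads x #depth estimate concrete, here is a rough sketch of the arithmetic. The class and method names are hypothetical, and the ceiling of 4000 is only the figure observed on the 4GB machine mentioned above, not a general limit.

```java
/** Hypothetical helper for illustration; not part of Nutch. */
class QueueSizeEstimate {
    /** Total queue size as described in the message: threads times depth. */
    static long totalQueueSize(int threads, int depth) {
        return (long) threads * depth;
    }

    /** Largest depth that keeps threads x depth at or below an observed ceiling. */
    static int maxDepthFor(int threads, int observedCeiling) {
        if (threads <= 0) {
            throw new IllegalArgumentException("threads must be positive");
        }
        return observedCeiling / threads;
    }

    public static void main(String[] args) {
        System.out.println(totalQueueSize(1800, 1));  // prints 1800
        System.out.println(totalQueueSize(1800, 50)); // prints 90000
        System.out.println(maxDepthFor(1800, 4000));  // prints 2
    }
}
```

With 1800 threads, the default depth of 50 would give a total of 90000, far above the roughly 4000 seen here, which fits with a depth of 1 working best in this setup.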
2010/1/7, Julien Nioche (JIRA) <ji...@apache.org>:
>
> [
> https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797653#action_12797653
> ]
>
> Julien Nioche commented on NUTCH-776:
> -------------------------------------
>
> Did you notice any improvement in the fetch rate after I suggested on the
> mailing list to use a value larger than 50? Does the memory consumption
> remain reasonable?
>
--
-MilleBii-
[jira] Updated: (NUTCH-776) Configurable queue depth
Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche updated NUTCH-776:
--------------------------------
Fix Version/s: (was: 1.1)
Moving this issue post 1.1
Needs a patch file, a description of the parameter in nutch-default.xml and, more importantly, some experimentation to see how it affects fetching performance.
[jira] Commented: (NUTCH-776) Configurable queue depth
Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797653#action_12797653 ]
Julien Nioche commented on NUTCH-776:
-------------------------------------
Did you notice any improvement in the fetch rate after I suggested on the mailing list to use a value larger than 50? Does the memory consumption remain reasonable?