Posted to dev@nutch.apache.org by cihat güzel <c....@gmail.com> on 2013/08/26 15:04:16 UTC
NUTCH-1317 patch
I have made a patch for that purpose:
https://issues.apache.org/jira/browse/NUTCH-1317?focusedCommentId=13749989&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13749989
If you would like, you can set a content limit per MIME type for nutch-2.1 in
nutch-site.xml as follows:
Default limit property:
<property>
<name>http.content.limit</name>
<value>65536</value>
</property>
For example, for application/pdf:
<property>
<name>http.content.limit.application.pdf</name>
<value>1000</value>
</property>
For example, for text/plain:
<property>
<name>http.content.limit.text.plain</name>
<value>1000</value>
</property>
...
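As a rough sketch, the properties above might sit together in nutch-site.xml like this. It assumes the usual Hadoop-style <configuration> root element, and that the patch builds the property name by replacing the slash in the MIME type with a dot, as the two examples suggest. The values are simply the ones used above, not recommendations.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Default limit, used when no per-MIME-type property matches -->
  <property>
    <name>http.content.limit</name>
    <value>65536</value>
  </property>
  <!-- Per-type override for application/pdf
       (name pattern assumed: http.content.limit.<type>.<subtype>) -->
  <property>
    <name>http.content.limit.application.pdf</name>
    <value>1000</value>
  </property>
  <!-- Per-type override for text/plain -->
  <property>
    <name>http.content.limit.text.plain</name>
    <value>1000</value>
  </property>
</configuration>
```

With a setup like this, MIME types without their own property would presumably fall back to http.content.limit.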
Re: NUTCH-1317 patch
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Cihat,
Great. Let's move the conversation to the issue.
Best
Lewis
--
*Lewis*