Posted to dev@nutch.apache.org by cihat güzel <c....@gmail.com> on 2013/08/26 15:04:16 UTC

NUTCH-1317 patch

I have made a patch for that purpose (see my comment on NUTCH-1317:
https://issues.apache.org/jira/browse/NUTCH-1317?focusedCommentId=13749989&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13749989
).
If you like, you can set a content limit per MIME type for Nutch 2.1 in
nutch-site.xml as follows:

Default limit property:

<property>
  <name>http.content.limit</name>
  <value>65536</value>
</property>

For example, for application/pdf:

<property>
  <name>http.content.limit.application.pdf</name>
  <value>1000</value>
</property>

For example, for text/plain:

<property>
  <name>http.content.limit.text.plain</name>
  <value>1000</value>
</property>

...
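
The property names above appear to follow a simple rule: take the MIME type, replace "/" with ".", and append it to http.content.limit. A minimal sketch of such a lookup with fallback to the default limit could look like the following. It is not taken from the patch itself: the class ContentLimits, the helpers propertyName and limitFor, and the use of a plain Map in place of the Nutch configuration object are all hypothetical, and the hard-coded fallback of 65536 simply mirrors the value shown above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the code from the NUTCH-1317 patch.
public class ContentLimits {
    static final String DEFAULT_KEY = "http.content.limit";

    // Assumed naming rule, inferred from the examples:
    // "application/pdf" -> "http.content.limit.application.pdf"
    static String propertyName(String mimeType) {
        String normalized = mimeType.trim().toLowerCase(Locale.ROOT);
        return DEFAULT_KEY + "." + normalized.replace('/', '.');
    }

    // Returns the limit for the MIME type, or the default limit
    // when no type-specific property is set.
    static int limitFor(Map<String, String> conf, String mimeType) {
        // 65536 is an assumed fallback matching the example value above.
        String fallback = conf.getOrDefault(DEFAULT_KEY, "65536");
        if (mimeType == null || mimeType.trim().isEmpty()) {
            return Integer.parseInt(fallback);
        }
        return Integer.parseInt(conf.getOrDefault(propertyName(mimeType), fallback));
    }

    public static void main(String[] args) {
        // A plain Map stands in for the values read from nutch-site.xml.
        Map<String, String> conf = new HashMap<>();
        conf.put("http.content.limit", "65536");
        conf.put("http.content.limit.application.pdf", "1000");

        System.out.println(limitFor(conf, "application/pdf")); // type-specific limit
        System.out.println(limitFor(conf, "text/html"));       // falls back to default
    }
}
```

The sketch does not strip MIME type parameters such as "; charset=utf-8"; a real implementation would probably need to do that before building the property name.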

Re: NUTCH-1317 patch

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Cihat,
Great. Let's move the conversation to the issue.
Best
Lewis





-- 
*Lewis*