Posted to user@nutch.apache.org by Alaak <al...@gmx.de> on 2012/08/27 16:36:43 UTC

Content of size X was truncated to Y

Hi

I receive messages similar to "Content of size 109690 was truncated 
to 64937".

I fear this results in incomplete content in my index. So my 
question is: what causes this truncation, and is there a way to 
disable it?

Regards

Re: Content of size X was truncated to Y

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Further to Markus' comments, please also see the parser.skip.truncated property:

<property>
  <name>parser.skip.truncated</name>
  <value>true</value>
  <description>Boolean value for whether we should skip parsing for
  truncated documents. By default this property is activated due to
  extremely high levels of CPU which parsing can sometimes take.
  </description>
</property>

On Mon, Aug 27, 2012 at 3:46 PM, Markus Jelsma
<ma...@openindex.io> wrote:
> please see the http.content.limit parameter.

-- 
Lewis

RE: Content of size X was truncated to Y

Posted by Markus Jelsma <ma...@openindex.io>.
Please see the http.content.limit parameter.
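
As a rough sketch of how this limit could be raised or disabled, the override below assumes local settings go inside the <configuration> element of conf/nutch-site.xml. Per the property's description in nutch-default.xml, the value is in bytes and a negative value turns truncation off. The -1 here is only an illustrative choice.

```xml
<!-- conf/nutch-site.xml (assumed location for local overrides) -->
<property>
  <name>http.content.limit</name>
  <!-- Assumption: a negative value disables truncation;
       a larger positive byte count would raise the limit instead -->
  <value>-1</value>
</property>
```

With the limit disabled, very large documents are fetched in full, so a larger positive value may be the safer choice on a broad crawl.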
 
 